Databricks Issue - "The specified schema does not match the existing schema"
Hey guys, I wanted to talk about Delta Lake in Databricks and the ADLS ecosystem.
I found an issue while changing the schema of Delta tables.
Problem Statement - Let's say you have a Delta table, Table1, with columns column1 and column2, and the requirement is to get rid of column2 from Table1.
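To make the before and after concrete, here is a minimal sketch that builds the two table definitions as Spark SQL strings. The mount path `/mnt/adls/Table1`, the column types (INT, STRING), and the helper name `create_table_sql` are assumptions for illustration; the post does not specify any of them.

```python
# Hypothetical ADLS mount point for the table's Delta folder (assumption).
TABLE_PATH = "/mnt/adls/Table1"


def create_table_sql(columns, path=TABLE_PATH):
    """Build a CREATE TABLE statement for Table1 from (name, type) pairs."""
    column_list = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return f"CREATE TABLE Table1 ({column_list}) USING DELTA LOCATION '{path}'"


# Current definition: two columns (the types are assumed, not from the post).
current_ddl = create_table_sql([("column1", "INT"), ("column2", "STRING")])

# Desired definition: column2 is gone.
desired_ddl = create_table_sql([("column1", "INT")])

print(current_ddl)
print(desired_ddl)
```

As far as I can tell, running the second statement while the folder still holds the old two-column Delta table is what produces the "specified schema does not match the existing schema" error in the title.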
Using T-SQL, we could easily achieve this with the ALTER TABLE ... DROP COLUMN command.
However, in the Spark SQL and Databricks realm, I couldn't find any such feature. Per https://docs.databricks.com/delta/delta-batch.html, there are options to reorder and add columns, but not to drop them.
Solution - It seems the only workaround is to delete the Delta table's folder in ADLS.
Re-running the Spark SQL query that creates the table then recreates the table and its folder with the new schema.
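The workaround can be sketched roughly as follows, assuming a Databricks notebook where `spark` and `dbutils` are available. The function name `recreate_table` and its parameters are hypothetical, and the sketch deletes every file under the given path, so it only makes sense when the data can be reloaded from elsewhere.

```python
def recreate_table(spark, dbutils, path, ddl, table="Table1"):
    """Drop the table, delete its Delta folder, and recreate it from `ddl`.

    WARNING: this removes all data stored under `path`.
    `spark` and `dbutils` are the objects a Databricks notebook provides.
    """
    # Remove the metastore entry so the name can be reused.
    spark.sql(f"DROP TABLE IF EXISTS {table}")
    # Delete the folder recursively (data files and _delta_log),
    # so no trace of the old schema remains at this location.
    dbutils.fs.rm(path, True)
    # Recreate the table with the new column list.
    spark.sql(ddl)
```

A call might look like `recreate_table(spark, dbutils, "/mnt/adls/Table1", "CREATE TABLE Table1 (column1 INT) USING DELTA LOCATION '/mnt/adls/Table1'")`, where both the path and the column type are placeholders.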
Does anyone have a proper solution to this? This looks like a legit limitation, as changing schemas is very common in real-time projects.