Databricks Issue - "The specified schema does not match the existing schema"


Hey guys, I wanted to talk about Delta Lake in Databricks and the ADLS ecosystem.
I ran into an issue while changing the schema of Delta tables.

Problem Statement - Let's say you have a Delta table, Table1, with columns column1 and column2, and the requirement is to get rid of column2 from Table1.






In T-SQL we could easily achieve this with an ALTER TABLE ... DROP COLUMN command.

However, in the Spark SQL and Databricks realm, I couldn't find any such feature. There are options to re-order and add columns, but not to drop them, as per https://docs.databricks.com/delta/delta-batch.html

Solution - It seems the only workaround is to delete the Delta table's folder in ADLS and then recreate the table with the new schema.







Running the Spark SQL query above recreates the table and its folder (screenshot).
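As a rough sketch of the drop-and-recreate steps described here (not the exact query from the post), something like the following could be run from a Databricks notebook. The function name, the location argument, and the STRING column type in the usage line are assumptions made for illustration; `spark` and `dbutils` are the objects a Databricks notebook provides. My understanding is that the "specified schema does not match the existing schema" error shows up when the old `_delta_log` folder is still present at the location, which is why the folder is deleted before the table is created again. Note that this throws away all existing data in the table.

```python
def recreate_delta_table(spark, dbutils, table, location, columns_ddl):
    """Drop a Delta table, delete its folder, and recreate it with a new schema.

    WARNING: this deletes every row currently stored in the table.
    All parameter names here are hypothetical, chosen for this sketch.
    """
    # Remove the table from the metastore. For an external table this
    # leaves the data files and the _delta_log folder behind in storage.
    spark.sql(f"DROP TABLE IF EXISTS {table}")

    # Delete the folder recursively so no old schema remains at the location.
    dbutils.fs.rm(location, True)

    # Recreate the table at the same location with the reduced column list.
    spark.sql(
        f"CREATE TABLE {table} ({columns_ddl}) USING DELTA LOCATION '{location}'"
    )
```

In a notebook this might be called as `recreate_delta_table(spark, dbutils, "Table1", table_folder, "column1 STRING")`, where `table_folder` is the ADLS path of the table and `STRING` is a guess at the column type.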

Does anyone have a proper solution to this? It looks like a genuine limitation, since schema changes are very common in real-world projects.
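One alternative that is not from this post, but that is commonly used with Delta tables, is to rewrite the table without the unwanted column and let Delta replace the schema through the `overwriteSchema` write option. The sketch below assumes `spark` is the notebook's SparkSession; the function name is made up for illustration. Unlike deleting the folder, this keeps the data in the remaining columns, at the cost of rewriting the whole table.

```python
def drop_column_by_overwrite(spark, table, column):
    """Rewrite a Delta table without `column`, replacing its schema.

    Hypothetical helper for illustration; it rewrites the full table.
    """
    df = spark.table(table)

    # Fail early instead of silently rewriting the table unchanged.
    if column not in df.columns:
        raise ValueError(f"{column} is not a column of {table}")

    # Overwrite the table with the reduced DataFrame; overwriteSchema
    # tells Delta to accept the new schema rather than reject the write.
    (
        df.drop(column)
        .write.format("delta")
        .mode("overwrite")
        .option("overwriteSchema", "true")
        .saveAsTable(table)
    )
```

For the example in this post the call would be `drop_column_by_overwrite(spark, "Table1", "column2")`.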
