Databricks | No Need To Skip Rows Before Header Row while reading a CSV File
Man! The past couple of weeks have been really tough. Hardcore development on Azure Data Factory and Azure Databricks, as we are up against a tight deadline (again :-) ). Loads of different scenarios and loads of new learnings. Sharing one below, keep reading.

We are receiving a source file (let's call it Test.csv) which has a blank row before the header row:

```
1:
2: "colname1", "colname2"
3: "value1","value2"
```

We are using spark.read.format to load this into a data frame. Looking at the file contents, one would assume that you need to somehow skip the first blank row, so I began researching it. I found that spark.read.format does not provide any such property. After spending a couple of hours with no major breakthrough, I thought of testing the code as it is (the load path below is illustrative):

```
val rawdataframe = spark.read.format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .option("delimiter", ",")
  .load("/path/to/Test.csv")
```

And it worked: as the title says, there is no need to skip the blank row. Spark's CSV reader ignores leading blank lines and picks up the header row correctly on its own.
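Why does this work? By default, Spark's CSV parser simply discards blank lines before it looks for the header. The behaviour can be sketched in plain Scala, no Spark required; the object and method names here are made up for illustration, not part of any Spark API:

```scala
// Minimal sketch of what the CSV reader effectively does with our file:
// blank lines are dropped first, then the first remaining line becomes the header.
object BlankRowDemo {
  def parse(raw: String): (Seq[String], Seq[Seq[String]]) = {
    // Drop empty / whitespace-only lines, as the CSV reader does by default
    val lines = raw.split("\n").toSeq.filter(_.trim.nonEmpty)
    // Strip surrounding quotes from each field
    def fields(line: String): Seq[String] =
      line.split(",").map(_.trim.stripPrefix("\"").stripSuffix("\"")).toSeq
    (fields(lines.head), lines.tail.map(fields))
  }

  def main(args: Array[String]): Unit = {
    // Same shape as Test.csv: a blank row, then the header, then a data row
    val raw = "\n\"colname1\", \"colname2\"\n\"value1\",\"value2\""
    val (header, rows) = parse(raw)
    println(header) // the header row, blank line skipped
    println(rows)
  }
}
```

Running this shows the header coming out as `colname1, colname2` with the data row intact, which matches what the data frame looked like in Databricks.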