My question is really simple.
I'm using PySpark to export a Hive table to SQL Server.
I found that the column names were exported as data rows into the SQL Server table.
I just want to export the data without those column-name rows.
I don't want these header rows in my tables...
Here is my PySpark code:
df.write.jdbc(
    "jdbc:sqlserver://10.8.12.10;instanceName=sql1",
    "table_name",
    mode="overwrite",
    properties={"user": "user_name", "password": "111111", "database": "Finance"},
)
Is there an option to skip column names?
I don't think the JDBC connector is actually what adds those header rows. The header is already present in your DataFrame; it's a known problem when reading data from a Hive table.
If you're using SQL to load the data from Hive, you can try filtering out the header row with a condition like col != 'col':
# adapt the condition after verifying what df.show() displays
df = spark.sql("select * from my_table where sold_to_party != 'Sold-To Party'")
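To make the idea concrete, here is the same filter sketched in plain Python (the rows, labels, and values below are made up for illustration): the stray header row is the one whose field holds the column's display label instead of real data.

```python
# Hypothetical rows as they might come back from the Hive table,
# including the stray header row that Spark read as ordinary data.
rows = [
    ("Sold-To Party", "Material"),  # header row mistakenly read as data
    ("ACME Corp", "Widget"),        # a real data row
]

# Keep only rows whose first field is not the header label,
# mirroring the WHERE clause in the Spark SQL above.
clean = [r for r in rows if r[0] != "Sold-To Party"]
print(clean)  # [('ACME Corp', 'Widget')]
```

The same pattern extends to the DataFrame API, e.g. df.filter(df.sold_to_party != 'Sold-To Party'), if you prefer it over SQL.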