I'm using the printSchema function to infer the schema of a JSON file. I want to save the result of this function call in a variable and parse it line by line, so that I can extract the structure of the schema and convert it into a DDL schema for creating a table in Hive.
How can this be done?
If you inspect the source code for printSchema(), you will see that this function just does the following:
print(self._jdf.schema().treeString())
Therefore, you can save the output as follows:
printSchemaString = df._jdf.schema().treeString()
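Once the tree string is in a variable, parsing it line by line could look roughly like the sketch below. It is plain Python and does not call Spark. SAMPLE_TREE is a hand-written stand-in for what treeString() returns, and the helper names (parse_tree_string, flat_hive_ddl) and the small HIVE_TYPES mapping are made up for illustration, not part of any library. It only emits top-level primitive columns; nested struct and array fields are parsed but skipped in the DDL.

```python
import re

# Hand-written stand-in for the string returned by treeString();
# in practice this would be printSchemaString from above.
SAMPLE_TREE = """root
 |-- id: long (nullable = true)
 |-- name: string (nullable = true)
 |-- address: struct (nullable = true)
 |    |-- city: string (nullable = true)
"""

# Each field line is " |" + zero or more "    |" (one per extra nesting
# level) + "-- <name>: <type> (...)".
FIELD_RE = re.compile(r"^ \|((?:    \|)*)-- ([^:]+): (\S+)")

# Illustrative, incomplete mapping from Spark type names to Hive types.
HIVE_TYPES = {
    "long": "BIGINT",
    "integer": "INT",
    "string": "STRING",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
}


def parse_tree_string(tree):
    """Return a list of (depth, name, spark_type) tuples, one per field."""
    fields = []
    for line in tree.splitlines():
        m = FIELD_RE.match(line)
        if m is None:
            continue  # skips the "root" line and blank lines
        depth = 1 + len(m.group(1)) // 5
        fields.append((depth, m.group(2), m.group(3)))
    return fields


def flat_hive_ddl(table, tree):
    """Build a CREATE TABLE statement from top-level primitive fields only."""
    cols = []
    for depth, name, spark_type in parse_tree_string(tree):
        if depth == 1 and spark_type in HIVE_TYPES:
            cols.append(f"`{name}` {HIVE_TYPES[spark_type]}")
    return f"CREATE TABLE {table} ({', '.join(cols)})"


print(flat_hive_ddl("people", SAMPLE_TREE))
```

If the goal is only the DDL, it may be simpler to skip the text parsing and iterate over df.schema.fields instead, since each field's dataType.simpleString() already gives a type string such as struct<city:string>; whether that string is accepted as-is by your Hive version is something to check.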