google-bigquery google-cloud-data-fusion

Using runtime arguments in a BigQuery schema for Google Cloud Data Fusion


I am trying to use runtime arguments in the BigQuery schema definition of the BigQuery sink plugin. The schema has just two columns and is defined in argument setter.json:

{
  "arguments": [
    {"name": "bq.config.table", "value": "activity_category"},
    {
      "name": "sqloutput_schema",
      "type": "schema",
      "value": [
        {"name": "activity_category_id", "type": "string", "nullable": true},
        {"name": "activity_category_description", "type": "string"}
      ]
    }
  ]
}

The issue is with 'sqloutput_schema', which fails at runtime (a screenshot of the plugin configuration was attached).

Error received:

Spark program 'phase-2' failed with error: Argument 'sqloutput_schema' is not defined. Please check the system logs for more details.
io.cdap.cdap.api.macro.InvalidMacroException: Argument 'sqloutput_schema' is not defined.

I cannot work out why this is failing.


Solution

  • The problem is your schema definition. I had the same use case: my argument was of type string, and its value had the following format:

    "{\"name\":\"etlSchemaBody\",\"type\":\"record\",\"fields\":
    [
    {\"name\":\"Id\",\"type\":\"int\"},
    {\"name\":\"name\",\"type\":\"string\"}
    ]}"
    

    So change the argument's type to string, and rewrite the schema JSON as an escaped record definition in the format above.
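
    Escaping the schema by hand is easy to get wrong, so here is a rough Python sketch that generates the response from the two columns in the question. The helper name `build_schema_argument` is made up for illustration, and expressing the nullable column as a union with "null" is my assumption about how the schema should encode it, so check it against your own pipeline before relying on it.

```python
import json


def build_schema_argument(name, columns, record_name="etlSchemaBody"):
    """Build one argument entry whose value is the schema as a JSON string.

    Hypothetical helper, not part of any Data Fusion or CDAP library.
    """
    fields = []
    for col in columns:
        col_type = col["type"]
        # Assumption: a nullable column is written as a union with "null".
        if col.get("nullable"):
            col_type = [col_type, "null"]
        fields.append({"name": col["name"], "type": col_type})

    schema = {"name": record_name, "type": "record", "fields": fields}
    # The argument is of type string, so the schema is serialized to text here.
    # Its quotes get escaped when the whole response is serialized below.
    return {
        "name": name,
        "type": "string",
        "value": json.dumps(schema, separators=(",", ":")),
    }


columns = [
    {"name": "activity_category_id", "type": "string", "nullable": True},
    {"name": "activity_category_description", "type": "string"},
]

response = {
    "arguments": [
        {"name": "bq.config.table", "value": "activity_category"},
        build_schema_argument("sqloutput_schema", columns),
    ]
}

response_json = json.dumps(response, indent=2)
print(response_json)
```

    The printed JSON has the same shape as the format above: `sqloutput_schema` is a string argument whose value is the escaped record schema.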