I have a Scala job that needs to insert nested JSON data into BigQuery. The way to do that is to create a BQ table whose nested fields have the type Record.
I wrote a case class that looks like this:
case class AvailabilityRecord(
  nestedField: NestedRecord,
  timezone: String,
) {
  def toMap(): java.util.Map[String, Any] = {
    val map = new java.util.HashMap[String, Any]
    map.put("nestedField", nestedField)
    map.put("timezone", timezone)
    map
  }
}
case class NestedRecord(
  from: String,
  to: String
)
I'm using the Java dependency "com.google.cloud" % "google-cloud-bigquery" % "2.11.0" in my program.
When I parse the JSON value into the case class and insert it into BQ, the timezone field of type String is inserted correctly, but the nested field of type Record is inserted as null.
For the insertion, I'm using the following code:
def insertData(records: Seq[AvailabilityRecord], gcpService: GcpServiceImpl): Task[Unit] = Task.defer {
  val recordsToInsert = records.map(record => InsertBigQueryRecord("XY", record.toMap()))
  gcpService.insertIntoBq(recordsToInsert, TableId.of("dataset", "table"))
}
override def insertIntoBq(records: Iterable[InsertBigQueryRecord],
                          tableId: TableId): Task[Unit] = Task {
  val builder = InsertAllRequest.newBuilder(tableId)
  records.foreach(record => builder.addRow(record.key, record.record))
  bqContext.insertAll(builder.build)
}
Why might fields of type Record be inserted as null?
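The issue was that I needed to map the nested case class too, because the Java API does not know how to handle a Scala case class object. The nested value has to be passed as a java.util.Map as well.
Adding a toMap method to the nested case class solved the issue: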
case class NestedRecord(
  from: String,
  to: String
) {
  def toMap(): java.util.Map[String, String] = {
    // The map's value type must match the declared return type (String, not Any)
    val map = new java.util.HashMap[String, String]
    map.put("from", from)
    map.put("to", to)
    map
  }
}
In the parent case class, the change goes in the toMap method:
  map.put("nestedField", nestedField.toMap())
  map.put("timezone", timezone)
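If there are many nested case classes, writing a toMap method for each one gets repetitive. As an alternative, here is a minimal sketch of a generic converter that walks a case class recursively and turns every nested case class into a java.util.Map. This is not the approach from the answer above, just a variation on it. RowConverter, toJava and toRow are hypothetical names, not part of google-cloud-bigquery, and the sketch assumes Scala 2.13 or later (it relies on productElementNames). Turning collections into java.util.List is my assumption about how repeated fields would be passed; there are none in the example above.

```scala
import java.util.{ArrayList => JArrayList, HashMap => JHashMap, Map => JMap}

// Case classes from the question, without hand-written toMap methods.
case class NestedRecord(from: String, to: String)
case class AvailabilityRecord(nestedField: NestedRecord, timezone: String)

// Hypothetical helper, not part of the BigQuery client library.
object RowConverter {

  // Recursively converts Scala values into plain Java values.
  def toJava(value: Any): Any = value match {
    // None becomes null, Some(x) becomes the converted x.
    case opt: Option[_] => opt.map(toJava).getOrElse(null)
    // Collections become a java.util.List (checked before Product,
    // because List is itself a Product).
    case items: Iterable[_] =>
      val list = new JArrayList[Any]()
      items.foreach(item => list.add(toJava(item)))
      list
    // Case classes become a java.util.Map keyed by field name.
    case p: Product =>
      val map = new JHashMap[String, Any]()
      p.productElementNames.zip(p.productIterator).foreach {
        case (name, field) => map.put(name, toJava(field))
      }
      map
    // Strings, numbers, booleans etc. are passed through unchanged.
    case other => other
  }

  // Entry point for a top-level case class instance.
  def toRow(record: Product): JMap[String, Any] =
    toJava(record).asInstanceOf[JMap[String, Any]]
}
```

With this in place, record.toMap() in insertData could be replaced by RowConverter.toRow(record), and the value stored under "nestedField" would be a java.util.Map rather than a NestedRecord instance. Note that the map keys are the case class field names, so they have to match the BigQuery column names exactly.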