scalaparquetakka-streamalpakka

How to map parquet record (json) to case class using alpakka


I have the data that saved in parquet file as json e.g {"name":"john", "age":23.5} I want to convert it to

case class Person(name: String, age: Double) so I can use pattern matching in my actor. This is what I got so far:

val reader: ParquetReader[GenericRecord] =  AvroParquetReader.builder[GenericRecord](filePath).withConf(conf).build()

val source: Source[GenericRecord, NotUsed] = AvroParquetSource(reader)
  source.ask[WorkerAck](28)(workerActor)

I tried to replace the GenericRecord with Person but I got the following error:

inferred type arguments [com.common.Person] do not conform to method apply's type parameter bounds [T <: org.apache.avro.generic.GenericRecord]
  val source: Source[Person, NotUsed] = AvroParquetSource(reader)

Solution

  • I think you have two options

    1. Use Avro code generator to generate the code for the Person DTO class. This will create a Person class that inherits from generic record. See this tutorial

    2. Add an actor that converts GenericRecord instances to Person:

    Person((String)record.get("name"), (Double)record.get("age"))