
How do I locally host an Apache Arrow Flight server using Go and retrieve the data in JavaScript?


I'm using Go for my backend server and JavaScript for my frontend. I've previously used HTTP successfully: I served some simple data in Arrow IPC format from Go on localhost, then fetched it from the localhost URL in JavaScript and loaded it into my table. Now I'm having trouble setting up a way to serve and retrieve data using Flight.

So the main part of my question is: how do you set up an Apache Arrow Flight server locally on a Go backend (with my arrow.Record as the data being sent) and then retrieve this data in a JavaScript frontend?

Code from my main.go file setting up the data:

import (
    "fmt"
    "github.com/apache/arrow/go/v16/arrow"
    "github.com/apache/arrow/go/v16/arrow/array"
    "github.com/apache/arrow/go/v16/arrow/flight"
    "github.com/apache/arrow/go/v16/arrow/memory"
    "log"
    "net/http"
)

var metadata = arrow.NewMetadata(
    []string{"type", "round", "date"},
    []string{"int", "1dp", "2024/03/30"},
)

var schema = arrow.NewSchema([]arrow.Field{
    {Name: "X", Type: arrow.PrimitiveTypes.Int32},
    {Name: "X + 5", Type: arrow.PrimitiveTypes.Int32},
}, &metadata)

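// GetPutData builds a single record with two Int32 columns matching schema.
// The caller owns the returned record and should call Release on it when done.
func GetPutData() arrow.Record {

    pool := memory.NewGoAllocator()

    recordBuilder := array.NewRecordBuilder(pool, schema)
    // release the builder's resources once the record has been created
    defer recordBuilder.Release()

    recordBuilder.Field(0).(*array.Int32Builder).AppendValues([]int32{1, 2, 3, 4, 5}, nil)
    recordBuilder.Field(1).(*array.Int32Builder).AppendValues([]int32{6, 7, 8, 9, 10}, nil)

    record := recordBuilder.NewRecord()

    return record
}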

And I imagine my main method to be something like:

func main() {
    // Create a Flight server locally with data from GetPutData
    rec := GetPutData()
    
    // Start flight server

    fmt.Println("Flight serving on port...")
}

Then on the frontend I imagine it would be some code like this:

import * as Arrow from 'apache-arrow';
import * as Flight from 'apache-arrow/flight';

async function runExample(url) {

    const table = // Fetch data from flight server

    console.table(table.toArray());
    console.log(table.schema)
    console.log(table.data)
}

runExample(url);

I've tried following earlier examples, but they appear to be outdated. I have also looked through the documentation here, but I find it very hard to work out what the old methods have been replaced with. Also, I'm not currently concerned with adding any authentication.


Solution

  • I can only really speak to the Go server side of this question, because the biggest problem you're going to run into on the JavaScript side is having to access gRPC. Browsers cannot make native gRPC calls directly, so you would typically need something like a gRPC-Web proxy in between. There's an issue tracking examples for Flight in ArrowJS which hasn't yet been resolved.

    That said, for the server side there's an example server in the documentation.

    For your particular example, the minimum would be to implement one method on your server, DoGet:

    // in addition to your existing imports, this needs
    // "github.com/apache/arrow/go/v16/arrow/ipc"
    type server struct {
      // BaseFlightServer provides default implementations of the
      // methods we don't define ourselves
      flight.BaseFlightServer
    }
    
    func (s *server) DoGet(_ *flight.Ticket, svc flight.FlightService_DoGetServer) error {
      // if your server can return more than one stream of data,
      // the ticket is how you would determine which to send.
      // for your example, we're going to ignore the ticket.  
      rec := GetPutData()
      defer rec.Release()
      // create a record stream writer
      wr := flight.NewRecordWriter(svc, ipc.WithSchema(rec.Schema()))
      defer wr.Close()
      // write the record
      return wr.Write(rec)
    }
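
The comment about tickets in DoGet can be made a bit more concrete. The following is only a sketch of the lookup step, written without the Arrow packages so it runs on its own. The names producers and resolveTicket, the ticket strings, and the []int32 return type are all placeholders I made up for illustration; in a real handler the bytes would come from the Ticket field of the *flight.Ticket argument and each function would build an arrow.Record instead.

```go
package main

import "fmt"

// producers maps ticket contents to a function that builds the data for that
// stream. The keys and the []int32 return type are placeholders; in a real
// server each function would build and return an arrow.Record.
var producers = map[string]func() []int32{
	"x":        func() []int32 { return []int32{1, 2, 3, 4, 5} },
	"x_plus_5": func() []int32 { return []int32{6, 7, 8, 9, 10} },
}

// resolveTicket looks up the producer for the raw ticket bytes, which is what
// a DoGet handler would read from the Ticket field of *flight.Ticket.
func resolveTicket(ticket []byte) (func() []int32, error) {
	produce, ok := producers[string(ticket)]
	if !ok {
		return nil, fmt.Errorf("unknown ticket %q", ticket)
	}
	return produce, nil
}

func main() {
	produce, err := resolveTicket([]byte("x_plus_5"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(produce()) // prints [6 7 8 9 10]
}
```

Returning an error for an unknown ticket, rather than falling back to a default stream, lets the client see that it asked for something the server doesn't have.
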
    

    Your main function could then look something like this:

    func main() {
      // besides your existing imports, this needs "os" and "syscall"
      srv := flight.NewFlightServer()
      // replace "localhost:0" with the hosting addr and port
      // using 0 for the port tells it to pick any open port
      // rather than specifying one for it.
      if err := srv.Init("localhost:0"); err != nil {
        log.Fatal(err)
      }
    
      // register the flight server object we defined above
      srv.RegisterFlightService(&server{})
      // you can tell it to automatically shutdown on SIGTERM if you like
      srv.SetShutdownOnSignals(syscall.SIGTERM, os.Interrupt)
      fmt.Printf("Server listening on %s...\n", srv.Addr())
      // Serve blocks until the server is shut down
      if err := srv.Serve(); err != nil {
        log.Fatal(err)
      }
    }
    

    When you start making the server a bit more complicated you can look into defining the GetFlightInfo method along with the other methods.

    Hope this helps! Feel free to ask any further questions you have.