
Firebase Collection Stream to BigQuery Transform Function


The Firebase extension "Stream Collections to BigQuery" allows configuring a Transform Function that converts the Firestore data JSON into explicit BigQuery table fields: https://firebase.google.com/products/extensions/firebase-firestore-bigquery-export

Can anyone point me to an example function or detailed docs for such functions?

Thanks, Ben


Solution

  • The Transform Function should be an HTTP Cloud Function with the following logic: get the input object from the request, transform it, and send it back in the response. The Cloud Function skeleton below shows this:

    const functions = require("firebase-functions");

    exports.bqTransform = functions.https.onRequest(async (req, res) => {

        const inputPayload = req.body; // JS object
        // ...
        // Transform the object
        // ...
        const outputPayload = inputPayload; // JS object: replace with your transformed payload

        res.send(outputPayload);
    });
    

    As explained in the doc, the inputPayload object (i.e. req.body) has a data property, an array whose elements each hold a representation of a Firestore document, as shown below:

    { 
      data: [{
        insertId: int;
        json: {
          timestamp: int;
          event_id: int;
          document_name: string;
          document_id: int;
          operation: ChangeType;
          data: string;  // <= String containing the stringified object representing the Firestore document data
        },
      }]
    }
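
    Because json.data arrives as a string, a transform usually has to parse it before it can touch individual document fields. A minimal sketch, assuming the payload shape above; readDocument is a hypothetical helper name and the sample row is made-up data, neither comes from the extension:

    ```javascript
    // Hypothetical helper (not part of the extension): returns the Firestore
    // document fields of one row by parsing the stringified json.data property.
    function readDocument(row) {
        return JSON.parse(row.json.data);
    }

    // Made-up sample row; the other json fields are omitted for brevity.
    const sampleRow = {
        insertId: 1,
        json: {
            data: JSON.stringify({ name: "Ada", tags: ["a1", "a2"] }),
        },
    };

    const doc = readDocument(sampleRow);
    console.log(doc.name); // prints "Ada"
    ```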
    

    The transformation implemented in your code must produce an object with the same structure (outputPayload in the skeleton above) in which the data[0].json property is adapted to your transformation requirements.


    Here is a very simple example in which we replace the entire content of the Firestore record with static data.

    const functions = require("firebase-functions");

    exports.bqTransform = functions.https.onRequest(async (req, res) => {

        const inputPayload = req.body;
        const inputData = inputPayload.data[0];

        // Keep the row metadata as-is and replace only the stringified document data.
        const outputPayload = [{
            insertId: inputData.insertId,
            json: {
                timestamp: inputData.json.timestamp,
                event_id: inputData.json.event_id,
                document_name: inputData.json.document_name,
                document_id: inputData.json.document_id,
                operation: inputData.json.operation,
                data: JSON.stringify({ createdOn: { _seconds: 1664983515, _nanoseconds: 745000000 }, array: ["a1", "a2"], name: "Transformed Name" })
            },
        }];

        res.send({ data: outputPayload });
    });
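
    The static example can be generalized so that the new data string is derived from the incoming document. The following is only a sketch under assumptions, not the extension's reference implementation: transformPayload and addFullName are hypothetical names, the firstName and lastName fields are invented sample data, and mapping over every element of data (rather than only data[0]) is a defensive choice in case a request carries more than one row.

    ```javascript
    // Hypothetical per-document transformation: adds a derived field.
    function addFullName(doc) {
        return { ...doc, fullName: `${doc.firstName} ${doc.lastName}` };
    }

    // Applies transformDoc to the parsed document of each row and re-stringifies
    // the result. insertId and all other json fields are passed through unchanged.
    function transformPayload(payload, transformDoc) {
        const rows = payload.data.map((row) => ({
            insertId: row.insertId,
            json: {
                ...row.json,
                data: JSON.stringify(transformDoc(JSON.parse(row.json.data))),
            },
        }));
        return { data: rows };
    }

    // Made-up input following the payload structure described above.
    const input = {
        data: [{
            insertId: 1,
            json: {
                event_id: 1,
                data: JSON.stringify({ firstName: "Ada", lastName: "Lovelace" }),
            },
        }],
    };

    const output = transformPayload(input, addFullName);
    console.log(JSON.parse(output.data[0].json.data).fullName); // prints "Ada Lovelace"
    ```

    Inside the Cloud Function, the handler body would then reduce to res.send(transformPayload(req.body, addFullName)).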