I am trying to ingest 1 million FHIR JSON files (each file only a few bytes in size) into a FHIR store in a Google Cloud Healthcare dataset. The ingest is taking a long time (more than an hour). Is there any way to optimize the speed of the Healthcare API?
Note: I want to ingest, de-identify, and export to BigQuery as well, so the entire process is taking more than 3 hours.
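For reference, here is roughly how I am driving the pipeline with the Python API discovery client (all project, bucket, dataset, and store names below are placeholders):

```python
import time

from googleapiclient import discovery

client = discovery.build("healthcare", "v1")

# Placeholder resource names.
project = "my-project"
location = "us-central1"
dataset = f"projects/{project}/locations/{location}/datasets/my-dataset"
fhir_store = f"{dataset}/fhirStores/my-fhir-store"
deid_dataset = f"projects/{project}/locations/{location}/datasets/my-deid-dataset"


def wait(op):
    """Poll a long-running operation until it completes."""
    ops = client.projects().locations().datasets().operations()
    while not op.get("done"):
        time.sleep(30)
        op = ops.get(name=op["name"]).execute()
    return op


# Step 1: bulk import the JSON files from GCS (one FHIR resource per file).
wait(
    client.projects().locations().datasets().fhirStores()
    .import_(  # trailing underscore: "import" is a Python keyword
        name=fhir_store,
        body={
            "contentStructure": "RESOURCE",
            "gcsSource": {"uri": "gs://my-bucket/fhir/**.json"},
        },
    )
    .execute()
)

# Step 2: de-identify the dataset into a destination dataset.
wait(
    client.projects().locations().datasets()
    .deidentify(
        sourceDataset=dataset,
        body={
            "destinationDataset": deid_dataset,
            "config": {},  # my real job passes a de-identification config here
        },
    )
    .execute()
)

# Step 3: export the de-identified FHIR store to BigQuery.
wait(
    client.projects().locations().datasets().fhirStores()
    .export(
        name=f"{deid_dataset}/fhirStores/my-fhir-store",
        body={
            "bigqueryDestination": {
                "datasetUri": f"bq://{project}.my_bq_dataset",
                "schemaConfig": {"schemaType": "ANALYTICS"},
            }
        },
    )
    .execute()
)
```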
Thanks in advance
Some performance tips for bulk FHIR import in the Google Cloud Healthcare API:
- Make sure your input GCS bucket is in the same region as the healthcare dataset; cross-region imports will be slower (see the sketch after this list for a quick co-location check).
- Check your project quota. The relevant quota for bulk imports is "FHIR storage ingress in bytes per minute". You can request a quota increase if this becomes the limiting factor.
- Performance may vary depending on the overall load in the region you are using. us-central1 is a very popular region because it's referenced in the codelab; you might achieve higher throughput elsewhere (see https://cloud.google.com/healthcare/docs/concepts/regions for available regions).
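As a quick sanity check for the first tip, here is a small sketch (bucket and dataset names are placeholders) that compares the GCS bucket's location against the region embedded in the Healthcare dataset's resource name before kicking off the import:

```python
from google.cloud import storage

# Placeholders; substitute your own names.
bucket_name = "my-bucket"
dataset_name = "projects/my-project/locations/us-central1/datasets/my-dataset"

# The dataset's region is the "locations/<region>" segment of its resource name.
dataset_region = dataset_name.split("/locations/")[1].split("/")[0]

# Bucket locations are reported uppercase (e.g. "US-CENTRAL1"); a regional
# bucket should match the dataset's region exactly. Multi-region buckets
# (e.g. "US") will not match and may also slow down the import.
bucket = storage.Client().get_bucket(bucket_name)
if bucket.location.lower() != dataset_region:
    print(
        f"Warning: bucket is in {bucket.location}, dataset is in "
        f"{dataset_region}; the import will copy data across regions."
    )
```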