
Ingest Yahoo Finance data with Azure Data Factory


I'm trying to ingest data from the open-source, public Yahoo Finance API using Azure Data Factory. The endpoint I'm testing is https://query2.finance.yahoo.com/v8/finance/chart/GOLD.

I can ingest the data, but I'm running into an issue when transforming it in a data flow. I'm trying to flatten the resulting JSON, which is a series of nested arrays with this structure:

[Image: JSON structure]

To produce a table in the below format:

timestamp | volume | open | low | high | close
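To make the target shape concrete, here is a rough Python sketch of the same flattening outside ADF. It assumes the chart response keeps `timestamp` as one array under `chart.result[]` and the price fields as parallel arrays under `indicators.quote[0]`. That layout is an assumption on my part, since the screenshot isn't reproduced here, and the sample values are made up for illustration.

```python
from __future__ import annotations

from typing import Any

# Columns of the target table, in the order listed above.
FIELDS = ("volume", "open", "low", "high", "close")


def flatten_chart(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn the parallel arrays into one row per timestamp.

    Assumes payload["chart"]["result"][n] holds "timestamp" and
    "indicators" -> "quote" -> [0] with one array per field.
    """
    rows = []
    for result in payload["chart"]["result"]:
        timestamps = result.get("timestamp") or []
        quote = result["indicators"]["quote"][0]
        for i, ts in enumerate(timestamps):
            row = {"timestamp": ts}
            for field in FIELDS:
                row[field] = quote[field][i]
            rows.append(row)
    return rows


# Illustrative payload only; the numbers are invented.
sample = {
    "chart": {
        "result": [
            {
                "timestamp": [1700000000, 1700086400],
                "indicators": {
                    "quote": [
                        {
                            "volume": [100, 200],
                            "open": [1.0, 2.0],
                            "low": [0.5, 1.5],
                            "high": [1.5, 2.5],
                            "close": [1.2, 2.2],
                        }
                    ]
                },
            }
        ]
    }
}

rows = flatten_chart(sample)
for row in rows:
    print(row)
```

The point is that the arrays are index-aligned, so the rows come from pairing element `i` of every array rather than from unrolling each array independently.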

The setup of my flatten activity is as follows:

[Image: Flatten settings]

The Partition option I'm using is Use current partitioning. This is what it looks like under the Inspect tab:

[Image: Inspect tab]

However, when I try to preview the data, nothing comes up and the notifications in ADF show this error:

Could not fetch statistics due to operation timeout.

In the source, I've tried sampling the data down to only 10 rows and I get the same error, so I don't think the row count is the issue. I have also tried the same endpoint with a different ticker (MSFT) and I get the same error there as well.

Any ideas appreciated!

Thanks,

Carolina


Solution

  • Figured it out! The request was trying to ingest too much data. I set the query parameters as below to cut it down, and I'm now getting data through:

    [Image: query parameter settings]
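    The screenshot isn't reproduced here, so the exact parameters used are unknown. As a hedged illustration only, this Python sketch builds a chart URL with `range` and `interval` query parameters. Those names and the values `5d` / `1d` are assumptions for the example, not necessarily what was set in ADF. It only builds the URL string and makes no request.

```python
from urllib.parse import urlencode

# Base endpoint taken from the question.
BASE_URL = "https://query2.finance.yahoo.com/v8/finance/chart/"


def build_chart_url(symbol, **params):
    """Append query parameters to the chart endpoint for a ticker."""
    query = urlencode(params)
    if query:
        return f"{BASE_URL}{symbol}?{query}"
    return f"{BASE_URL}{symbol}"


# Assumed parameter names and values, for illustration only.
url = build_chart_url("GOLD", range="5d", interval="1d")
print(url)
```

    In ADF the same key/value pairs would go wherever the source's query parameters are configured, so the response stays small enough for the data flow preview.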