I am trying to download daily Bitcoin miner fees using the blockchain.com API. However, even though the chart on the website (https://www.blockchain.com/explorer/charts/transaction-fees) shows daily data, the API only returns a data point every 4 days.
import requests
import pandas as pd
url = "https://api.blockchain.info/charts/transaction-fees?timespan=all&format=json"
response = requests.get(url)
data = response.json()
df = pd.DataFrame(data['values'])
# Convert the timestamp to a readable format
df['x'] = pd.to_datetime(df['x'], unit='s') # 'x' is the Unix timestamp
df.set_index('x', inplace=True) # Set the datetime as the index
df.rename(columns={'y': 'fees'}, inplace=True) # Rename 'y' to 'fees'
print(df.head())
I tried adapting the code to a rolling 4-day window and interpolating, but this introduces a fairly large error: I don't get the actual value for each day, only an estimate based on the surrounding days:
import requests
import pandas as pd
url = "https://api.blockchain.info/charts/transaction-fees?timespan=all&rollingAverage=4days&format=json"
response = requests.get(url)
data = response.json()
df = pd.DataFrame(data['values'])
# Convert the timestamp to a readable format
df['x'] = pd.to_datetime(df['x'], unit='s')
df.set_index('x', inplace=True)
df.rename(columns={'y': 'fees'}, inplace=True) # Rename 'y' to 'fees'
df_daily = df.resample('D').interpolate(method='linear')
print(df_daily.head())
-- Update
I assume this is an API limitation, as the raw data also skips 3 days each time: https://api.blockchain.info/charts/transaction-fees
-- Update 2
I have added 'sampled=false' to the API request, and now I get data every 15 minutes, which is too much. I am only looking for daily data, but the API docs are not very clear:
As stated in the docs:
$sampled - Boolean set to 'true' or 'false' (default 'true'). If true, limits the number of datapoints returned to ~1.5k for performance reasons. (Optional)
Apparently, the API automatically adjusts (part of) the timedelta to stay within the (implied) limit of ~1.5k datapoints. It's not immediately clear where exactly it draws the line, but here are 2 workarounds:
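Rough arithmetic makes the 4-day spacing plausible (a sketch; the genesis-block start date of 2009-01-03 and the end date are assumptions, while the ~1.5k cap comes from the docs quoted above):

```python
from datetime import date

# Daily datapoints needed for 'timespan=all', assuming the series starts at
# the genesis block (2009-01-03) and the chart ends around 2024-10-13:
days = (date(2024, 10, 13) - date(2009, 1, 3)).days
print(days)         # 5762 -> far more than the ~1.5k sampled cap
print(days / 1500)  # ~3.84 -> consistent with a 4-day timedelta
```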
Option 1
Use 'sampled': 'false':
import requests
import pandas as pd
url = 'https://api.blockchain.info/charts/transaction-fees'
params = {'start': '2009-01-01',
          'timespan': 'all',
          'format': 'json',
          'sampled': 'false'}
response = requests.get(url, params=params)
data = response.json()
df = pd.DataFrame(data['values'])
df['x'] = pd.to_datetime(df['x'], unit='s')
df.set_index('x', inplace=True)
df.rename(columns={'y': 'fees'}, inplace=True)
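If 'sampled': 'false' returns sub-daily datapoints (the 15-minute granularity mentioned in the question), they can be collapsed to one value per day. A minimal sketch with a stand-in frame shaped like the one Option 1 builds:

```python
import pandas as pd

# Stand-in for the frame built in Option 1 (DatetimeIndex 'x', 'fees' column);
# swap in the real frame retrieved from the API.
df = pd.DataFrame(
    {'fees': [1.0, 2.0, 3.0, 4.0]},
    index=pd.to_datetime(['2024-01-01 00:00', '2024-01-01 12:00',
                          '2024-01-02 00:00', '2024-01-02 12:00']),
)
df.index.name = 'x'

# Assumption: the chart shows total fees per day, so sub-daily points are
# summed here; use .mean() instead if the API reports per-interval averages.
df_daily = df.resample('D').sum()
print(df_daily)  # 2024-01-01 -> 3.0, 2024-01-02 -> 7.0
```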
Option 2
Use a for loop and request smaller chunks:
import requests
import pandas as pd
url = 'https://api.blockchain.info/charts/transaction-fees'
timespan = 6  # up to 6 (years) still gets you daily results;
              # from 7 the API switches to a '2d' timedelta prior to 2023, and coarser from there
# set params for get request
params = {'start': '2009-01-01',
          'timespan': f'{timespan}years',
          'format': 'json'}
# initialize list to collect values per request
values = []
for year in range(2009, 2024, timespan):
    request_params = params.copy()
    request_params['start'] = f'{year}-01-01'
    response = requests.get(url, params=request_params)
    data = response.json()
    # append this chunk's values to `values`
    values.extend(data['values'])
    print(f'{year} added')
df = pd.DataFrame(values)
df['x'] = pd.to_datetime(df['x'], unit='s')
df.set_index('x', inplace=True)
df.rename(columns={'y': 'fees'}, inplace=True)
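One caveat with the chunked requests: the 6-year ranges starting in 2009, 2015 and 2021 may overlap at the boundaries, so the same day can appear twice in `values`. Dropping duplicate timestamps guards against that (a sketch with a stand-in frame):

```python
import pandas as pd

# Stand-in for the concatenated chunks; the real frame comes from the loop above.
df = pd.DataFrame(
    {'fees': [10.0, 20.0, 20.0, 30.0]},
    index=pd.to_datetime(['2014-12-31', '2015-01-01', '2015-01-01', '2015-01-02']),
)
df.index.name = 'x'

# Keep the first row per timestamp, then make sure the index is sorted.
df = df[~df.index.duplicated(keep='first')].sort_index()
print(len(df))  # 3 -> the duplicated boundary day was dropped
```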
Output:
import numpy as np
df.iloc[np.r_[0:2, -2:0]]
                 fees
x
2009-01-17   0.000000
2009-01-18   0.000000
2024-10-12  44.805855
2024-10-13  24.719396
Plot:
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_theme(style='ticks', rc={'figure.figsize': (20, 10)})  # one call; a second sns.set() would reset the figsize rc
sns.lineplot(data=df['fees'])
plt.grid()
plt.show()
Compare the chart on the website:
Actually, the chart on the website itself skips days if you set the timespan to 'all'. E.g.:
Compare that to the raw data retrieved:
df.loc['2017-12-20':'2017-12-23']
fees
x
2017-12-20   911.591743
2017-12-21  1128.761670  # matches the value shown on the website (1,128.762)
2017-12-22  1495.946477  # peak, not shown on the website
2017-12-23  1065.615595
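To confirm that the retrieved frame really is daily with no skipped days (unlike the website chart), you can diff its index against a full daily range. A sketch using the four rows above as a stand-in:

```python
import pandas as pd

# Stand-in daily frame; swap in the full frame from Option 2.
idx = pd.date_range('2017-12-20', '2017-12-23', freq='D')
df = pd.DataFrame({'fees': [911.591743, 1128.761670, 1495.946477, 1065.615595]},
                  index=idx)
df.index.name = 'x'

# Dates missing between the first and last timestamp:
expected = pd.date_range(df.index.min(), df.index.max(), freq='D')
missing = expected.difference(df.index)
print(len(missing))  # 0 -> no gaps, truly daily data
```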