How to scrape the historical snapshot table from CoinMarketCap using BeautifulSoup


I am trying to scrape historical snapshot data from CoinMarketCap using Python:

https://coinmarketcap.com/historical/20201227/

I've tried using BeautifulSoup. It works fine up to row 20, but after that the returned rows look very different.

import pandas as pd
import requests
from bs4 import BeautifulSoup

date = '20211219/'    # snapshot date in the page's YYYYMMDD format
URL = 'https://coinmarketcap.com/historical/' + date
webpage = requests.get(URL)
soup = BeautifulSoup(webpage.text, 'lxml')    # or 'html.parser'
# Collect every table row on the page
tr = soup.find_all('tr', attrs={'class': 'cmc-table-row'})

The first twenty elements of tr contain all the columns from the webpage.

Starting with the 21st element, the rows look quite different and don't include what is actually shown in the table on the webpage.

So I can't scrape the data after the 20th row. How can I access this part of the table?
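
One way to make the difference between the two groups of rows visible is to count the non-empty cells in each row. The sketch below runs on a small made-up HTML fragment rather than the live page: the markup, and the helper name cells_per_row, are assumptions for illustration and are not copied from the site. Only the cmc-table-row class comes from the code above.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the downloaded page: one fully rendered row and one
# placeholder row. The real markup on the site may differ.
SAMPLE_HTML = """
<table>
  <tr class="cmc-table-row"><td>1</td><td>Bitcoin</td><td>BTC</td><td>26272.3</td></tr>
  <tr class="cmc-table-row"><td>21</td><td></td></tr>
</table>
"""


def cells_per_row(html):
    """Return the number of non-empty <td> cells in each cmc-table-row."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = soup.find_all('tr', attrs={'class': 'cmc-table-row'})
    return [
        sum(1 for td in row.find_all('td') if td.get_text(strip=True))
        for row in rows
    ]


print(cells_per_row(SAMPLE_HTML))
```

Against the real page you would pass webpage.text instead of SAMPLE_HTML. A sudden drop in the count after the first rows would suggest that the remaining rows are filled in by JavaScript after the page loads, which requests does not execute.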


Solution

  • In case you haven't found a solution yet: that page pulls its data from an API, and the following code will get you the data you're after:

    import pandas as pd
    import requests
    
    my_date = '2020-12-27'
    
    r = requests.get(f'https://web-api.coinmarketcap.com/v1/cryptocurrency/listings/historical?convert=USD,USD,BTC&date={my_date}&limit=5000&start=1')
    # json_normalize flattens the nested 'quote' object into quote.USD.* / quote.BTC.* columns
    df = pd.json_normalize(r.json()['data'])
    print(df)
    

    This returns a rather large DataFrame [4048 rows x 33 columns]:

    id name symbol slug num_market_pairs date_added tags max_supply circulating_supply total_supply platform cmc_rank self_reported_circulating_supply self_reported_market_cap tvl_ratio last_updated quote.BTC.price quote.BTC.volume_24h quote.BTC.percent_change_1h quote.BTC.percent_change_24h quote.BTC.percent_change_7d quote.BTC.market_cap quote.BTC.fully_diluted_market_cap quote.BTC.tvl quote.BTC.last_updated quote.USD.price quote.USD.volume_24h quote.USD.percent_change_1h quote.USD.percent_change_24h quote.USD.percent_change_7d quote.USD.market_cap quote.USD.tvl quote.USD.last_updated
    0 1 Bitcoin BTC bitcoin 9712 2013-04-28T00:00:00.000Z ['mineable', 'pow', 'sha-256', 'store-of-value', 'state-channel', 'coinbase-ventures-portfolio', 'three-arrows-capital-portfolio', 'polychain-capital-portfolio', 'binance-labs-portfolio', 'blockchain-capital-portfolio', 'boostvc-portfolio', 'cms-holdings-portfolio', 'dcg-portfolio', 'dragonfly-capital-portfolio', 'electric-capital-portfolio', 'fabric-ventures-portfolio', 'framework-ventures-portfolio', 'galaxy-digital-portfolio', 'huobi-capital-portfolio', 'alameda-research-portfolio', 'a16z-portfolio', '1confirmation-portfolio', 'winklevoss-capital-portfolio', 'usv-portfolio', 'placeholder-ventures-portfolio', 'pantera-capital-portfolio', 'multicoin-capital-portfolio', 'paradigm-portfolio'] 2.1e+07 1.85828e+07 1.85828e+07 1 2020-12-27T23:00:00.000Z 1 2.53042e+06 0 0 0 1.85828e+07 2020-12-27T23:59:41.000Z 26272.3 6.64799e+10 -0.910864 -0.623152 11.9051 4.88213e+11 2020-12-27T23:00:00.000Z
    1 1027 Ethereum ETH ethereum 5916 2015-08-07T00:00:00.000Z ['mineable', 'pow', 'smart-contracts', 'ethereum-ecosystem', 'coinbase-ventures-portfolio', 'three-arrows-capital-portfolio', 'polychain-capital-portfolio', 'binance-labs-portfolio', 'blockchain-capital-portfolio', 'boostvc-portfolio', 'cms-holdings-portfolio', 'dcg-portfolio', 'dragonfly-capital-portfolio', 'electric-capital-portfolio', 'fabric-ventures-portfolio', 'framework-ventures-portfolio', 'hashkey-capital-portfolio', 'kenetic-capital-portfolio', 'huobi-capital-portfolio', 'alameda-research-portfolio', 'a16z-portfolio', '1confirmation-portfolio', 'winklevoss-capital-portfolio', 'usv-portfolio', 'placeholder-ventures-portfolio', 'pantera-capital-portfolio', 'multicoin-capital-portfolio', 'paradigm-portfolio', 'injective-ecosystem', 'bnb-chain'] nan 1.1401e+08 1.1401e+08 2 2020-12-27T23:00:00.000Z 0.0259834 993197 -0.514148 7.36142 6.94848 2.96236e+06 2020-12-27T23:59:41.000Z 682.642 2.60936e+10 -0.514148 7.36142 6.94848 7.78281e+10 2020-12-27T23:00:00.000Z
    2 825 Tether USDT tether 9666 2015-02-25T00:00:00.000Z ['payments', 'stablecoin', 'asset-backed-stablecoin', 'avalanche-ecosystem', 'solana-ecosystem', 'arbitrum-ecosytem', 'moonriver-ecosystem', 'injective-ecosystem', 'bnb-chain', 'usd-stablecoin'] nan 2.07532e+10 2.12833e+10 3 2020-12-27T23:00:00.000Z 3.80193e-05 3.62606e+06 -0.00446154 0.0374141 -0.0789107 789021 2020-12-27T23:59:41.000Z 0.998854 9.52649e+10 -0.00446154 0.0374141 -0.0789107 2.07294e+10 2020-12-27T23:00:00.000Z
    3 52 XRP XRP xrp 683 2013-08-04T00:00:00.000Z ['medium-of-exchange', 'enterprise-solutions', 'binance-chain', 'arrington-xrp-capital-portfolio', 'galaxy-digital-portfolio', 'a16z-portfolio', 'pantera-capital-portfolio'] 1e+11 4.5404e+10 9.99908e+10 4 2020-12-27T23:00:00.000Z 1.07733e-05 352094 -1.1233 -3.96119 -49.0989 489151 2020-12-27T23:59:41.000Z 0.283039 9.25033e+09 -1.1233 -3.96119 -49.0989 1.28511e+10 2020-12-27T23:00:00.000Z
    4 2 Litecoin LTC litecoin 747 2013-04-28T00:00:00.000Z ['mineable', 'pow', 'scrypt', 'medium-of-exchange', 'binance-chain', 'bnb-chain'] 8.4e+07 6.61837e+07 6.61837e+07 5 2020-12-27T23:00:00.000Z 0.00485367 536813 -0.325724 -1.50027 11.2073 321234 2020-12-27T23:59:41.000Z 127.517 1.41033e+10 -0.325724 -1.50027 11.2073 8.43955e+09 2020-12-27T23:00:00.000Z

    [...]
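
Two details of the answer are easy to trip over: the page URL uses dates like 20201227 while the API call uses 2020-12-27, and the dotted column names (quote.USD.price and so on) come from flattening the nested quote object in the JSON. The following is a minimal sketch of both steps using only the standard library. The helper names to_api_date, build_url and flatten are hypothetical, the endpoint and query parameters are simply copied from the request above (the endpoint may change without notice), and flatten only mimics for nested dicts what pd.json_normalize does by default.

```python
from datetime import datetime
from urllib.parse import urlencode

# Endpoint copied from the answer above; it may change without notice.
API_URL = 'https://web-api.coinmarketcap.com/v1/cryptocurrency/listings/historical'


def to_api_date(page_date):
    """Convert a page-style date ('20201227' or '20201227/') to '2020-12-27'."""
    return datetime.strptime(page_date.strip('/'), '%Y%m%d').strftime('%Y-%m-%d')


def build_url(page_date, start=1, limit=5000, convert='USD,USD,BTC'):
    """Build the request URL; the defaults mirror the query string used above."""
    query = urlencode(
        {'convert': convert, 'date': to_api_date(page_date),
         'limit': limit, 'start': start},
        safe=',',  # keep the commas in 'convert' unescaped
    )
    return f'{API_URL}?{query}'


def flatten(record, prefix=''):
    """Flatten nested dicts into dotted keys, e.g. 'quote.USD.price'."""
    flat = {}
    for key, value in record.items():
        name = f'{prefix}{key}'
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f'{name}.'))
        else:
            flat[name] = value  # lists such as 'tags' are kept as-is
    return flat


print(build_url('20201227'))
print(flatten({'id': 1, 'symbol': 'BTC', 'quote': {'USD': {'price': 26272.3}}}))
```

The URL from build_url can be passed straight to requests.get, and the flattened records (or the raw list via pd.json_normalize) can be turned into a DataFrame as in the answer.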