How to Extract CoinMarketCap Data Using BeautifulSoup and Python?

X-Byte Enterprise Crawling
6 min read · Dec 15, 2021


The enormous amount of cryptocurrency data available on the internet is a great resource for cryptocurrency investment and research. Being able to manage and leverage this data gives us more control over our crypto investments.

Web scraping is the process of downloading pages from websites and extracting the important data from them. For our purposes, we are interested in extracting cryptocurrency data.

With so many websites providing free tools, why would anybody want to gather their own data? Most users rely on websites like CoinGecko or CoinMarketCap to look up data and build a watch list. Isn't that more convenient?

We suggest using both approaches: start as a newbie with the standard features of the usual crypto websites, then progress to your own data analysis by scraping the data and building your own, smarter dataset.

In our experience, we have found the following benefits:

Maintain Focus and Control: We are more focused and in control, knowing that the list and analysis we build in our own spreadsheet is the working version for our investment decisions. We do not need to depend on someone else's data, and jumping from one website to another distracts us from the main job.

Filling the Gaps: Not all coins are available on the main websites; you will always find gaps and inconsistencies in any coin list. If we own the data, we can manage it.

Advanced Analysis: With the data in a spreadsheet, you can run advanced analysis and filters to find niche coins that the websites may not surface.

Personal Comments and Notes: You can add columns to your spreadsheet for extra comments and investment insights. We also note which exchange we plan to use and how much capital we are allocating to each coin.

For example, with the data in a spreadsheet, we could search for coins that are tagged both gaming and Solana:
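To make this concrete, here is a hedged sketch of such a two-level filter using pandas; the tag names and sample coins are illustrative assumptions, not scraped data, and in practice the rows would come from your own spreadsheet or scrape.

# Illustrative sketch of a two-level filter (gaming AND Solana) on a small
# hand-made sample; in practice the rows would come from your scraped data.
import pandas as pd

coins = pd.DataFrame([
    {'name': 'Star Atlas', 'tags': ['gaming', 'solana-ecosystem']},
    {'name': 'Axie Infinity', 'tags': ['gaming', 'ethereum-ecosystem']},
])

gaming_and_solana = coins[coins['tags'].apply(
    lambda tags: 'gaming' in tags and 'solana-ecosystem' in tags)]
print(gaming_and_solana['name'].tolist())  # ['Star Atlas']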

By comparison, most websites support only one level of filtering. For instance, CoinMarketCap can list all coins in the Polkadot ecosystem:

CoinMarketCap can also list all the gaming tokens, but not coins that are both gaming and Polkadot.

In general, these websites simply cannot handle two- or three-level filtering, e.g. listing all the gaming coins associated with Polkadot.

Advanced filtering might not look like a big problem on the surface. However, with thousands of coins on the market, being able to automate things for your investment goals and maintain focus is key to success.

We will use two Python libraries:

BeautifulSoup is a Python library for pulling data out of HTML, XML, and other markup languages.

Requests is used for fetching the HTML of a web page. If you already have the data in an HTML file, you won't need Requests.

We use a Jupyter Notebook on Google Cloud Platform, but the Python code given here runs on any platform.

Depending on your Python setup, you may need to pip install beautifulsoup4.
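A quick way to confirm the setup is to try the imports; this short snippet only checks that both libraries are installed.

# Sanity check: if these imports succeed, both libraries are installed
# and ready for the examples below.
from bs4 import BeautifulSoup
import requests

print("beautifulsoup4 and requests are ready")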

We will start with the 'Hello World' of web scraping: extracting the introductory "What Is Binance Coin (BNB)?" text shown in the green box on the coin's page.

Visit the BNB coin page in your Chrome browser, right-click on the page, and click Inspect to examine the elements:

Then click the element-picker arrow in the Inspect panel and click on the corresponding web elements, as shown below.

After inspecting, we observe that the web elements are structured as follows:

Everything sits under the div with class sc-2qtjgt-0 eApVPN.

The title uses an h2 tag.

Subtitles use h3 tags.

All remaining text sits under p tags.

Now go through the scraping code, which is quite simple:

from bs4 import BeautifulSoup
import requests

# retrieve the web page and parse the contents
mainpage = requests.get('https://coinmarketcap.com/currencies/binance-coin/')
soup = BeautifulSoup(mainpage.content, 'html.parser')
whatis = soup.find_all("div", {"class": "sc-2qtjgt-0 eApVPN"})

# extract elements from the contents
title = whatis[0].find_all("h2")
print(title[0].text.strip() + "\n")
for p in whatis[0].find_all('p'):
    print(p.text.strip() + "\n")
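When run, this prints the section heading (for example "What Is Binance Coin (BNB)?") followed by each introductory paragraph from that block.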

In the next example, we will extract Binance Coin (BNB)'s key statistics: Market Cap, Fully Diluted Market Cap, Volume (24h), Volume / Market Cap, and Circulating Supply.

On the same BNB coin page, go back to the top of the page and inspect the statistics block. Notice that the whole block uses the class hide statsContainer:

Therefore, we first find that container and then pull out each statsValue inside it:

statsContainer = soup.find_all("div", {"class": "hide statsContainer"})
statsValues = statsContainer[0].find_all("div", {"class": "statsValue"})

statsValue_marketcap = statsValues[0].text.strip()
print(statsValue_marketcap)

statsValue_fully_diluted_marketcap = statsValues[1].text.strip()
print(statsValue_fully_diluted_marketcap)

statsValue_volume = statsValues[2].text.strip()
print(statsValue_volume)

statsValue_volume_per_marketcap = statsValues[3].text.strip()
print(statsValue_volume_per_marketcap)

statsValue_circulating_supply = statsValues[4].text.strip()
print(statsValue_circulating_supply)

which gives the following output (your results will differ, since prices change constantly):

$104,432,294,030
$104,432,294,030
$3,550,594,245
0.034
166,801,148.00 BNB
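Since the scraped values are formatted strings, a small helper (a sketch, not part of the original code) can convert them to numbers before they go into a spreadsheet or further analysis.

# Optional helper (a sketch): strip the '$', commas, and any trailing unit
# from the scraped strings before converting them to numbers.
def to_number(text):
    cleaned = text.replace('$', '').replace(',', '').split()[0]
    return float(cleaned)

print(to_number('$104,432,294,030'))      # 104432294030.0
print(to_number('166,801,148.00 BNB'))    # 166801148.0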

As an exercise, use what you learned in the previous two examples and check whether you can extract the Max Supply and Total Supply for ADA (Cardano) and BNB.
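If you get stuck, one possible starting point is sketched below: locate the label text and walk to the neighbouring value. The Cardano URL pattern and the label-to-value navigation are assumptions about the page layout, so confirm them with Inspect before relying on them.

# Hedged hint for the exercise: find the "Max Supply" label by its text and
# read the value in the next element. The surrounding markup is an
# assumption -- confirm the navigation with Inspect on the live page.
from bs4 import BeautifulSoup
import requests

page = requests.get('https://coinmarketcap.com/currencies/cardano/')
soup = BeautifulSoup(page.content, 'html.parser')

max_supply_label = soup.find(string="Max Supply")
if max_supply_label is not None:
    print("Max Supply:", max_supply_label.find_next().text.strip())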

Other options include Selenium and Scrapy. We will cover these topics when we have time.

Selenium and Scrapy have a steeper learning curve than Requests (used to fetch the HTML) and BeautifulSoup (used to parse it).

Scrapy is a complete web scraping framework that takes care of everything from fetching the HTML to processing the data.

Selenium is a browser automation tool that can, for example, let you navigate between different pages.
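For orientation, here is a minimal Selenium sketch; it assumes a local Chrome install (recent Selenium versions manage the driver automatically) and simply hands the rendered HTML of the same BNB page to BeautifulSoup as before.

# Minimal Selenium sketch: load the BNB page in a real browser, then parse
# the rendered page source with BeautifulSoup exactly as in the earlier code.
from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
driver.get('https://coinmarketcap.com/currencies/binance-coin/')
soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()

print(soup.title.text)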

Web Scraping Challenges: Durability

The key challenge with web scraping is the durability of the code. The web developers at CoinMarketCap continually update the website, so old code may stop working after a while.
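One way to soften the blow (a sketch, not part of the original code) is to fail loudly when the expected elements are missing, instead of crashing with an obscure IndexError.

# Defensive parsing sketch: check that the expected block exists before
# indexing into it, so a layout change fails with a clear message.
from bs4 import BeautifulSoup
import requests

mainpage = requests.get('https://coinmarketcap.com/currencies/binance-coin/')
soup = BeautifulSoup(mainpage.content, 'html.parser')

whatis = soup.find_all("div", {"class": "sc-2qtjgt-0 eApVPN"})
if not whatis:
    raise RuntimeError("'What is BNB' block not found -- "
                       "CoinMarketCap may have changed its class names.")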

A more durable solution is to use the Application Programming Interfaces (APIs) offered by the various platforms and websites, although the free tiers of these APIs are limited. The data format is also different: APIs typically return JSON or XML, whereas with normal web scraping you mostly deal with HTML.
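As an illustration, here is a hedged sketch of the API route. The endpoint, parameters, and response structure follow CoinMarketCap's documented v1 API and may change over time, and you need your own API key from the CoinMarketCap developer portal.

# Hedged API sketch: fetch BNB quotes as JSON instead of scraping HTML.
# Endpoint, parameters, and response structure are based on CoinMarketCap's
# documented v1 API and may change; replace YOUR_API_KEY with a real key.
import requests

url = 'https://pro-api.coinmarketcap.com/v1/cryptocurrency/quotes/latest'
headers = {'X-CMC_PRO_API_KEY': 'YOUR_API_KEY'}
params = {'symbol': 'BNB', 'convert': 'USD'}

response = requests.get(url, headers=headers, params=params)
data = response.json()
print(data['data']['BNB']['quote']['USD']['market_cap'])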

If you want to know more about scraping CoinMarketCap data, contact X-Byte Enterprise Crawling or ask for a free quote!

Originally published at https://www.xbyte.io.
