
Python Beautiful Soup split data


I am attempting to create a stock data analysis program in Python, scraping the data from Yahoo Finance. My only issue seems to be 'splitting' the data. For example, I have been trying to get the 'total revenue' figure, but what comes back is more than just the value in that table row on the Yahoo Finance site, and I'm not sure how to use .split in this scenario to get only the total revenue as a string. Here is my code:

from bs4 import BeautifulSoup
import requests



def get_fundamentals(ticker):
    # Function to grab fundamental stock data from Yahoo Finance

    html = requests.get("https://finance.yahoo.com/quote/" + ticker.upper() + "/financials?p=" + ticker.upper())
    # Parser options: lxml, html5lib, or html.parser
    soup = BeautifulSoup(html.text, 'html.parser')
    stock_total_revenue = soup.find('td', {'class': 'Fz(s) Ta(end) Pstart(10px)'})
    print(stock_total_revenue)

ticker = input("Please enter a stock ticker: ")

get_fundamentals(ticker)

I was able to identify the table cell I wanted and its class, which gets me to the proper row for the stock's revenue. However, the result also comes with a lot of additional data, and this is where I am having trouble figuring out how to 'split' it so that only the revenue is returned. Here is my output when I run the program:

Please enter a stock ticker: dxtr
<td class="Fz(s) Ta(end) Pstart(10px)" data-reactid="39"><span data-reactid="40">3,423</span></td>

I have been attempting to split the data so that it simply prints the total revenue (in this case 3,423 for this stock). I will be doing this for any stock the user enters, but as you can see, I get additional markup that I am not sure how to split out.


Solution

  • You already have the element, so no splitting is needed. Printing an element outputs its markup along with its contents. If you only want the text of the element, use the element's .text attribute.

    >>> print(stock_total_revenue.text)
    3,423