What is yahoo_fin?
Yahoo_fin is a Python 3 package I wrote to scrape historical stock price data, as well as to provide current information on market caps, dividend yields, and which stocks comprise the major exchanges. The most recent version includes additional functionality for scraping income statements, balance sheets, cash flows, holder information, and analyst data.
Installation
Yahoo_fin can be installed using pip:
pip install yahoo_fin
Requirements
Yahoo_fin requires the following packages to be installed:
ftplib
io
pandas
requests
ftplib and io are part of the Python standard library. pandas and requests come pre-installed with Anaconda.
Methods
The yahoo_fin package has one module, named stock_info. This module provides the following primary methods:
get_analysts_info
get_balance_sheet
get_cash_flow
get_data
get_holders
get_income_statement
get_quote_table
get_stats
tickers_dow
tickers_nasdaq
tickers_other
tickers_sp500
Any method can be imported by running the following line, with get_analysts_info replaced by the method of choice.
from yahoo_fin.stock_info import get_analysts_info
Alternatively, all methods can be imported at once like so:
from yahoo_fin.stock_info import *
get_analysts_info(ticker)
Scrapes data from the Analysts page for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/analysts?p=NFLX). This includes information on earnings estimates, EPS trends, EPS revisions, etc.
Returns a dictionary containing the tables visible on the ‘Analysts’ page.
Possible parameters
ticker Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_analysts_info('nflx')
get_balance_sheet(ticker)
Scrapes the balance sheet for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/balance-sheet?p=NFLX).
Possible parameters
ticker Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_balance_sheet('nflx')
get_cash_flow(ticker)
Scrapes the cash flow statement for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/cash-flow?p=NFLX).
Possible parameters
ticker Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_cash_flow('nflx')
get_data(ticker, start_date = None, end_date = None, index_as_date = True)
Downloads historical price data of a stock into a pandas data frame.
Possible parameters
ticker Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. This is the only required argument.
start_date The date the price history should begin.
end_date The date the price history should end.
index_as_date Default is True. If index_as_date = True, then the index of the returned data frame is the date associated with each record. Otherwise, the date is returned as its own column.
msft_data = get_data('msft')
If you want to filter by a date range, you can just add a value for the start_date and / or end_date parameters, like below:
from1999 = get_data('msft', start_date = '01/01/1999')
few_days = get_data('msft', start_date = '01/01/1999', end_date = '01/10/1999')
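To make the effect of start_date and end_date concrete, here is a minimal sketch that filters a few made-up daily records by date range using only the standard library. The rows, the helper name filter_by_date, the MM/DD/YYYY date format, and the choice to treat both bounds as inclusive are all assumptions for illustration; this is not how get_data is implemented.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # assumed format, matching '01/01/1999' above


def filter_by_date(rows, start_date=None, end_date=None):
    """Keep rows whose 'date' falls within [start_date, end_date].

    Hypothetical helper for illustration only. Either bound may be None,
    meaning "no limit" on that side.
    """
    start = datetime.strptime(start_date, DATE_FORMAT) if start_date else None
    end = datetime.strptime(end_date, DATE_FORMAT) if end_date else None
    kept = []
    for row in rows:
        day = datetime.strptime(row["date"], DATE_FORMAT)
        if start is not None and day < start:
            continue  # before the requested range
        if end is not None and day > end:
            continue  # after the requested range
        kept.append(row)
    return kept


# Made-up records, not real prices.
sample_rows = [
    {"date": "12/31/1998", "close": 1.0},
    {"date": "01/04/1999", "close": 2.0},
    {"date": "01/08/1999", "close": 3.0},
    {"date": "01/11/1999", "close": 4.0},
]

from_1999 = filter_by_date(sample_rows, start_date="01/01/1999")
few_days = filter_by_date(
    sample_rows, start_date="01/01/1999", end_date="01/10/1999"
)
```

Leaving out a bound simply removes the limit on that side, which mirrors how start_date and end_date are both optional in get_data.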
get_holders(ticker)
Scrapes data from the Holders tab on Yahoo Finance for an input ticker (e.g. https://finance.yahoo.com/quote/NFLX/holders?p=NFLX).
Possible parameters
ticker Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_holders('nflx')
get_income_statement(ticker)
Scrapes the income statement for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/financials?p=NFLX).
Possible parameters
ticker Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_income_statement('nflx')
get_quote_table(ticker , dict_result = True)
Scrapes the primary table found on the quote page of an input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/AAPL?p=AAPL).
Possible parameters
ticker Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
dict_result Default is True. If True, the function returns the results in a dict format. Otherwise, the results are returned in a data frame.
The following fields with their corresponding values are returned:
1y Target Est 52 Week Range Ask Volume Beta Bid Days Range Dividend & Yield EPS (TTM) Earnings Date Ex-Dividend Date Market Cap Open PE Ratio (TTM) Previous Close Volume
get_quote_table('aapl')
get_stats(ticker)
Scrapes data from the statistics page for the input ticker, which includes information on Price / Sales, P/E, and moving averages (e.g. https://finance.yahoo.com/quote/NFLX/key-statistics?p=NFLX).
Possible parameters
ticker Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_stats('nflx')
tickers_dow()
Returns a list of tickers currently listed on the Dow Jones. No parameters need to be passed. The tickers are scraped from Yahoo Finance (see https://finance.yahoo.com/quote/%5EDJI/components?p=%5EDJI).
tickers = tickers_dow()
tickers_nasdaq()
Returns a list of tickers currently listed on the NASDAQ. No parameters need to be passed. This method, along with tickers_other, works by scraping text files from ftp://ftp.nasdaqtrader.com/SymbolDirectory/.
tickers_nasdaq scrapes the nasdaqlisted.txt file from the link above, while tickers_other scrapes the otherlisted.txt file.
tickers = tickers_nasdaq()
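As a rough illustration of what extracting tickers from a symbol file involves, the sketch below pulls the ticker column out of a small pipe-delimited sample. The sample text, its column layout, the trailing "File Creation Time" line, and the helper name parse_symbols are assumptions for illustration. This is neither the package's actual code nor a copy of the real file.

```python
import csv
import io

# Made-up sample in an assumed pipe-delimited layout.
SAMPLE_TEXT = """Symbol|Security Name|Test Issue
AAPL|Apple Inc. - Common Stock|N
MSFT|Microsoft Corporation - Common Stock|N
ZZZT|Example Test Security|Y
File Creation Time: 0101202012:00||
"""


def parse_symbols(text, skip_test_issues=True):
    """Return the values of the 'Symbol' column as a list.

    Hypothetical helper: skips the footer line and, optionally, rows
    flagged as test issues.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    symbols = []
    for record in reader:
        symbol = (record.get("Symbol") or "").strip()
        if not symbol or symbol.startswith("File Creation Time"):
            continue  # blank row or footer, not a ticker
        if skip_test_issues and record.get("Test Issue") == "Y":
            continue
        symbols.append(symbol)
    return symbols


sample_tickers = parse_symbols(SAMPLE_TEXT)
```

Here sample_tickers holds only the two non-test symbols from the sample. Passing skip_test_issues=False keeps the test row as well.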
tickers_other()
See above description for tickers_nasdaq.
tickers = tickers_other()
tickers_sp500()
Returns a list of tickers currently listed in the S&P 500. The data for this is scraped from Wikipedia:
https://en.wikipedia.org/wiki/List_of_S%26P_500_companies
tickers = tickers_sp500()
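The ticker-list methods pair naturally with the per-ticker methods. Below is a minimal sketch of looping over a list of tickers and collecting one result per ticker, recording failures instead of stopping at the first one. The helper name collect_for_tickers and the stand-in fake_fetch function are hypothetical. In real use you would pass a method such as get_data or get_stats as fetch.

```python
def collect_for_tickers(tickers, fetch):
    """Call fetch(ticker) for each ticker.

    Returns two dicts: successful results keyed by ticker, and error
    messages keyed by ticker for the calls that raised.
    """
    results = {}
    failures = {}
    for ticker in tickers:
        try:
            results[ticker] = fetch(ticker)
        except Exception as error:  # a single ticker may fail to scrape
            failures[ticker] = str(error)
    return results, failures


def fake_fetch(ticker):
    """Stand-in for a real scraping method so the example runs offline."""
    if ticker == "BAD":
        raise ValueError("no data for " + ticker)
    return {"ticker": ticker.upper()}


results, failures = collect_for_tickers(["msft", "BAD", "amzn"], fake_fetch)
```

With the package installed, a call along the lines of collect_for_tickers(tickers_dow(), get_data) would apply the same loop to live data. Catching errors per ticker is a design choice that keeps one failed page from discarding everything already downloaded.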