What is yahoo_fin?
Yahoo_fin is a Python 3 package designed to scrape historical stock price data, as well as to provide current information on market caps, dividend yields, and which stocks comprise the major exchanges. Additional functionality includes scraping income statements, balance sheets, cash flows, holder information, and analyst data. The package includes the ability to scrape live (real-time) stock prices, capture cryptocurrency data, and get the most actively traded stocks on a current trading day. Yahoo_fin also contains a module for retrieving option prices and expiration dates.
The latest version of yahoo_fin can also scrape earnings calendar history and has an additional module for scraping financial news RSS feeds.
Updates
Update: July 9th, 2021
yahoo_fin 0.8.9.1 is the latest version of yahoo_fin. This includes a second collection of patches due to recent changes in Yahoo Finance’s website, which were affecting get_data, get_live_price, and several other methods. Please update to 0.8.9.1 if you are using an older version. Additionally, there are two new functions, get_company_info and get_company_officers, for scraping company-related data.
Update: July 2021
yahoo_fin 0.8.9 was released in July 2021. This release includes a patch fixing get request issues due to recent changes on Yahoo Finance. These updates affect several functions, including scraping options data, get_quote_table, and scraping financials information. If you are using an older version, please update to 0.8.9.
Update: March 2021
yahoo_fin 0.8.8 was released in March 2021. This release contains a patch for the tickers_dow method.
Update: Feb. 2021
yahoo_fin 0.8.7 was released in Feb. 2021. This version adds a collection of new features.
Update: July 11, 2020
Version 0.8.6 of yahoo_fin made the following changes:
Update: April 24, 2020
Version 0.8.5 of yahoo_fin was released on April 24, 2020. It updated the get_stats function and added the get_stats_valuation function. Follow the guidance in the installation section below to upgrade yahoo_fin to the latest version.
Update: December 15, 2019
An update to this package was pushed on December 15, 2019. This update fixes the issues caused by a recent change in Yahoo Finance’s website. If you have a previously installed version of yahoo_fin, please follow the guidance below to upgrade your installation using pip.
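To see whether an installed copy predates the 0.8.9.1 release mentioned above, the installed version can be compared with that number. The following is only a sketch: the helper names are hypothetical, it assumes purely numeric version strings, and it relies on importlib.metadata, which is part of the standard library from Python 3.8 onward.

```python
from importlib import metadata  # standard library in Python 3.8+


def version_tuple(version):
    """Turn a dotted version string such as '0.8.9.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(installed, minimum="0.8.9.1"):
    """Return True if the installed version is older than the minimum."""
    return version_tuple(installed) < version_tuple(minimum)


def installed_yahoo_fin_version():
    """Return the installed yahoo_fin version, or None if it is not installed."""
    try:
        return metadata.version("yahoo_fin")
    except metadata.PackageNotFoundError:
        return None


current = installed_yahoo_fin_version()
if current is None:
    print("yahoo_fin is not installed")
elif needs_upgrade(current):
    print(f"yahoo_fin {current} is older than 0.8.9.1; consider upgrading")
else:
    print(f"yahoo_fin {current} is up to date")
```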
Recommended Python Version
A few methods in yahoo_fin require a package called requests_html as a dependency. Since requests_html requires Python 3.6+, you’ll need Python 3.6+ when installing yahoo_fin.
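A script that depends on yahoo_fin could guard against older interpreters before importing the package. A minimal sketch, where python_is_supported is a hypothetical helper and not part of yahoo_fin:

```python
import sys

MIN_PYTHON = (3, 6)  # minimum version stated above


def python_is_supported(version_info=None):
    """Return True if the given (or running) interpreter is at least 3.6."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    raise SystemExit("yahoo_fin needs Python 3.6 or newer")
```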
yahoo_fin Installation
Yahoo_fin can be installed using pip:
pip install yahoo_fin
If you have a previously installed version, you can upgrade like this:
pip install yahoo_fin --upgrade
Requirements
Yahoo_fin requires the following packages to be installed:
datetime, feedparser, ftplib, io, json, pandas, requests, requests_html
With the exception of requests_html, these dependencies come pre-installed with Anaconda. requests_html requires Python 3.6+ and is needed for several of the functions in yahoo_fin, as described above. To install requests_html, you can use pip:
pip install requests_html
However, the latest versions of yahoo_fin should automatically install the dependencies when using pip, so you shouldn’t have to manually install these other packages.
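If an import error does appear, one way to see which of the listed requirements are missing is to probe for each package without importing it. This is a rough sketch; missing_packages is a hypothetical helper, and the names are simply copied from the requirements list above.

```python
from importlib.util import find_spec

# Package names copied from the requirements list above.
REQUIRED = ["datetime", "feedparser", "ftplib", "io", "json",
            "pandas", "requests", "requests_html"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All yahoo_fin requirements are importable")
```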
Methods
The yahoo_fin package has three modules: stock_info, options, and news. The primary methods of stock_info are listed below.
get_analysts_info, get_balance_sheet, get_cash_flow, get_company_info, get_company_officers, get_currencies, get_data, get_day_gainers, get_day_losers, get_day_most_active, get_dividends, get_earnings, get_earnings_for_date, get_earnings_in_date_range, get_earnings_history, get_financials, get_futures, get_holders, get_income_statement, get_live_price, get_market_status, get_next_earnings_date, get_premarket_price, get_postmarket_price, get_quote_data, get_quote_table, get_top_crypto, get_splits, get_stats, get_stats_valuation, get_undervalued_large_caps, tickers_dow, tickers_ftse100, tickers_ftse250, tickers_ibovespa, tickers_nasdaq, tickers_nifty50, tickers_niftybank, tickers_other, tickers_sp500
The methods for options are listed below:
get_calls, get_expiration_dates, get_options_chain, get_puts
The news module currently contains one method:
get_yf_rss
stock_info module
Any method from yahoo_fin’s stock_info module can be imported by running the following line, with get_analysts_info replaced by the method of choice.
from yahoo_fin.stock_info import get_analysts_info
Alternatively, all methods can be imported at once like so:
from yahoo_fin.stock_info import *
# or...
import yahoo_fin.stock_info as si
get_analysts_info(ticker)
Scrapes data from the Analysts page for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/analysts?p=NFLX). This includes information on earnings estimates, EPS trends / revisions, etc.
Returns a dictionary containing the tables visible on the ‘Analysts’ page.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_analysts_info('nflx')
get_balance_sheet(ticker, yearly = True)
Scrapes the balance sheet for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/balance-sheet?p=NFLX).
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
yearly: Default is True. By default, get_balance_sheet will download yearly data. To get quarterly data, set yearly = False.
# get yearly data
get_balance_sheet('nflx')
# get quarterly data
get_balance_sheet('nflx', yearly = False)
get_cash_flow(ticker, yearly = True)
Scrapes the cash flow statement for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/cash-flow?p=NFLX).
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
yearly: Default is True. By default, get_cash_flow will download yearly data. To get quarterly data, set yearly = False.
# get yearly data
get_cash_flow('nflx')
# get quarterly data
get_cash_flow('nflx', yearly = False)
get_company_info(ticker)
Scrapes company information for ticker from Yahoo Finance: https://finance.yahoo.com/quote/aapl/profile?p=aapl
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_company_info("aapl")
get_company_officers(ticker)
Scrapes company officers for ticker from Yahoo Finance: https://finance.yahoo.com/quote/aapl/profile?p=aapl
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_company_officers("aapl")
get_currencies()
Scrapes the currencies table from Yahoo Finance: https://finance.yahoo.com/currencies
get_currencies()
get_data(ticker, start_date = None, end_date = None, index_as_date = True, interval = "1d")
Downloads historical price data of a stock into a pandas data frame. Offers the functionality to pull daily, weekly, or monthly data.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. This is the only required argument.
start_date: The date the price history should begin.
end_date: The date the price history should end.
index_as_date: Default is True. If index_as_date = True, then the index of the returned data frame is the date associated with each record. Otherwise, the date is returned as its own column.
interval: Default is "1d", or daily. This parameter specifies the interval at which to return the data. Use "1wk" for weekly data, or "1mo" for monthly data. Any other input for the interval parameter will result in an error.
msft_data = get_data('msft')
If you want to filter by a date range, you can just add a value for the start_date and / or end_date parameters, like below:
from1999 = get_data('msft', start_date = '01/01/1999')
few_days = get_data('msft', start_date = '01/01/1999', end_date = '01/10/1999')
Get weekly or monthly historical price data:
weekly_data = get_data("msft", interval = "1wk")
monthly_data = get_data("msft", interval = "1mo")
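Because any interval other than "1d", "1wk", or "1mo" results in an error, it can be useful to validate the value before a request is made. The sketch below assumes two hypothetical helpers, check_interval and fetch_prices, neither of which is part of yahoo_fin; the fetch argument exists only so the logic can be tried without network access.

```python
VALID_INTERVALS = {"1d": "daily", "1wk": "weekly", "1mo": "monthly"}


def check_interval(interval="1d"):
    """Return the interval unchanged if get_data documents it, else raise."""
    if interval not in VALID_INTERVALS:
        allowed = ", ".join(sorted(VALID_INTERVALS))
        raise ValueError(f"interval must be one of {allowed}; got {interval!r}")
    return interval


def fetch_prices(ticker, interval="1d", fetch=None, **kwargs):
    """Validate the interval, then delegate to get_data (or a stand-in)."""
    check_interval(interval)
    if fetch is None:
        # Imported lazily so the check above works without yahoo_fin installed.
        from yahoo_fin.stock_info import get_data as fetch
    return fetch(ticker, interval=interval, **kwargs)


# Example (needs yahoo_fin and network access):
# weekly = fetch_prices("msft", interval="1wk", start_date="01/01/2020")
```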
get_day_gainers()
Scrapes the top 100 (at most) stocks with the largest gains (on the given trading day) from Yahoo Finance (see https://finance.yahoo.com/gainers).
get_day_gainers()
get_day_losers()
Scrapes the top 100 (at most) worst performing stocks (on the given trading day) from Yahoo Finance (see https://finance.yahoo.com/losers).
get_day_losers()
get_day_most_active()
Scrapes the top 100 most active stocks (on the given trading day) from Yahoo Finance (see https://finance.yahoo.com/most-active).
get_day_most_active()
get_dividends(ticker, start_date = None, end_date = None, index_as_date = True)
Downloads historical dividend data of a stock into a pandas data frame.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. This is the only required argument.
start_date: The date the dividend history should begin.
end_date: The date the dividend history should end.
index_as_date: Default is True. If index_as_date = True, then the index of the returned data frame is the date associated with each record. Otherwise, the date is returned as its own column.
# get all historical dividend data
get_dividends("msft")
# dividends from 2010 onward
get_dividends("msft", "01-01-2010")
get_earnings(ticker)
Scrapes earnings information from Yahoo Finance’s financials page for a given ticker (see https://finance.yahoo.com/quote/NFLX/financials?p=NFLX). Returns a dictionary with quarterly actual vs. estimated earnings per share, quarterly revenue / earnings data, and yearly revenue / earnings data.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_earnings('nflx')
get_earnings_for_date(date)
Returns a list of dictionaries. Each dictionary contains a ticker, its corresponding EPS estimate, and the time of the earnings release.
Possible parameters
date: Date of interest. Required as input.
get_earnings_for_date('02/08/2021')
get_earnings_history(ticker)
Scrapes earnings history information from Yahoo Finance’s financials page for a given ticker. Returns a list of dictionaries with quarterly actual vs. estimated earnings per share along with dates of previous earnings releases. Currently, this method can pull back data for over 20 years.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
hist = get_earnings_history('msft')
hist[0]
get_earnings_in_date_range(start_date, end_date)
Returns a list of dictionaries. Each dictionary contains a ticker, its corresponding EPS estimate, and the time of the earnings release. The data is returned based upon what earnings occur in the input date range. The date range is inclusive of the start_date and end_date inputs.
Possible parameters
start_date: Starting date of interest. Required as input.
end_date: Ending date of interest. Required as input.
get_earnings_in_date_range('02/08/2021', '02/12/2021')
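The example above covers one Monday-to-Friday week. A pair of date strings like that can be generated from any date with the standard library. This is only a sketch: trading_week_bounds is a hypothetical helper, and the 'MM/DD/YYYY' format is inferred from the examples on this page rather than stated by the package.

```python
from datetime import date, timedelta

DATE_FORMAT = "%m/%d/%Y"  # matches the 'MM/DD/YYYY' strings used in the examples


def trading_week_bounds(day):
    """Return (Monday, Friday) of the week containing `day` as date strings."""
    monday = day - timedelta(days=day.weekday())
    friday = monday + timedelta(days=4)
    return monday.strftime(DATE_FORMAT), friday.strftime(DATE_FORMAT)


start, end = trading_week_bounds(date(2021, 2, 10))
print(start, end)  # 02/08/2021 02/12/2021
# The pair can then be passed on, e.g. get_earnings_in_date_range(start, end)
```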
get_financials(ticker, yearly = True, quarterly = True)
Efficient method to scrape balance sheets, cash flow statements, and income statements in a single call from Yahoo Finance’s financials page for a given ticker (see https://finance.yahoo.com/quote/NFLX/financials?p=NFLX).
If you’re looking to get all of this information for a given ticker, or set of tickers, this function will be 3x faster than running get_balance_sheet, get_cash_flow, and get_income_statement separately. Yearly, quarterly, or both time-periods can be pulled.
Returns a dictionary with the following keys:
If yearly = True:
If quarterly = True:
If yearly and quarterly are both set to be True, all six key-value pairs are returned.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
yearly: Boolean. If True (default), yearly data will be returned.
quarterly: Boolean. If True (default), quarterly data will be returned.
Note: If both yearly and quarterly are set to True, then both yearly and quarterly data will be returned.
# get both yearly and quarterly info
get_financials('nflx')
# get only yearly data
get_financials('nflx', yearly = True, quarterly = False)
# get only quarterly data
get_financials('nflx', yearly = False, quarterly = True)
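For a set of tickers, the single-call approach described above can be wrapped in a loop that also records failures, since scraping may not succeed for every symbol. The following is a sketch under stated assumptions: collect_financials is a hypothetical helper, not part of yahoo_fin, and the fetch argument exists only so the loop can be tried with a stand-in function.

```python
def collect_financials(tickers, fetch=None, **kwargs):
    """Call get_financials once per ticker and gather results and failures."""
    if fetch is None:
        # Imported lazily; requires yahoo_fin and network access.
        from yahoo_fin.stock_info import get_financials as fetch
    results, errors = {}, {}
    for ticker in tickers:
        try:
            results[ticker] = fetch(ticker, **kwargs)
        except Exception as exc:  # scraping can fail for individual tickers
            errors[ticker] = str(exc)
    return results, errors


# Example (needs yahoo_fin and network access):
# results, errors = collect_financials(["nflx", "msft"], yearly=True, quarterly=False)
```

Keeping the errors in a separate dictionary means one bad ticker does not interrupt the rest of the batch.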
get_futures()
Returns the table of futures prices from Yahoo Finance here: https://finance.yahoo.com/commodities
get_futures()
get_holders(ticker)
Scrapes data from the Holders tab on Yahoo Finance for an input ticker (e.g. https://finance.yahoo.com/quote/NFLX/holders?p=NFLX).
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_holders('nflx')
get_income_statement(ticker, yearly = True)
Scrapes the income statement for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/financials?p=NFLX).
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
yearly: Default is True. By default, get_income_statement will download yearly data. To get quarterly data, set yearly = False.
# get yearly data
get_income_statement('nflx')
# get quarterly data
get_income_statement('nflx', yearly = False)
get_live_price(ticker)
Scrapes the live quote price for the input ticker.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_live_price('nflx')
get_market_status()
Returns a status specifying whether the market is currently pre-market (“PRE”), open (“OPEN”), post-market (“POST”), or closed (“CLOSED”).
get_market_status()
get_next_earnings_date(ticker)
Returns the next upcoming earnings date for a given ticker.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_next_earnings_date('nflx')
get_premarket_price(ticker)
Returns the premarket price for a given ticker if available / applicable.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_premarket_price('nflx')
get_postmarket_price(ticker)
Returns the postmarket price for a given ticker if available / applicable.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_postmarket_price('nflx')
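get_market_status can be combined with the three price functions above to pick whichever price fits the current session. The sketch below is one possible arrangement, not something the package prescribes: the status-to-function mapping and the helper names are assumptions, and the module argument exists only so the logic can be exercised with a stand-in object.

```python
# Status strings as documented for get_market_status; the mapping to price
# functions is this sketch's own choice, not something the package prescribes.
PRICE_FUNCTION_BY_STATUS = {
    "PRE": "get_premarket_price",
    "OPEN": "get_live_price",
    "POST": "get_postmarket_price",
    "CLOSED": "get_live_price",
}


def price_function_name(status):
    """Pick the stock_info function name to use for a market status."""
    return PRICE_FUNCTION_BY_STATUS.get(status, "get_live_price")


def current_price(ticker, module=None):
    """Fetch the price that matches the current market session."""
    if module is None:
        import yahoo_fin.stock_info as module  # lazy import; needs network
    name = price_function_name(module.get_market_status())
    return getattr(module, name)(ticker)


# Example (needs yahoo_fin and network access):
# print(current_price("nflx"))
```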
get_quote_data(ticker)
Scrapes a collection of over 70 data points for an input ticker from Yahoo Finance (e.g. https://query1.finance.yahoo.com/v7/finance/quote?symbols=NFLX), including current real-time price, company name, book value, 50-day average, 200-day average, pre-market price / post-market price (if available), shares outstanding, and more. The results are returned as a dictionary.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_quote_data('nflx')
get_quote_table(ticker, dict_result = True)
Scrapes the primary table found on the quote page of an input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/AAPL?p=AAPL)
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
dict_result: Default is True. If True, the function returns the results in a dict format. Otherwise, the results are returned in a data frame.
The following fields with their corresponding values are returned:
1y Target Est, 52 Week Range, Ask, Volume, Beta, Bid, Day's Range, Dividend & Yield, EPS (TTM), Earnings Date, Ex-Dividend Date, Market Cap, Open, PE Ratio (TTM), Previous Close, Quote Price, Volume
get_quote_table('aapl')
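When the result is returned as a dict, a few of the fields listed above can be pulled out by name. A minimal sketch, where pick_fields is a hypothetical helper and the sample values are made up purely for illustration:

```python
def pick_fields(quote, fields=("Quote Price", "Market Cap", "PE Ratio (TTM)")):
    """Return only the requested fields; absent ones map to None."""
    return {field: quote.get(field) for field in fields}


# Stand-in for the dict that get_quote_table('aapl') returns; values are made up.
sample_quote = {"Quote Price": 100.0, "Market Cap": "1.5T", "Volume": 123456}
print(pick_fields(sample_quote))
```

Using dict.get means a field that is missing for a particular ticker shows up as None instead of raising a KeyError.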
get_splits(ticker, start_date = None, end_date = None, index_as_date = True)
Downloads historical stock splits data of a stock into a pandas data frame.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. This is the only required argument.
start_date: The date the split history should begin.
end_date: The date the split history should end.
index_as_date: Default is True. If index_as_date = True, then the index of the returned data frame is the date associated with each record. Otherwise, the date is returned as its own column.
get_splits("msft")
get_stats(ticker)
Scrapes data off the statistics page for the input ticker, which includes information on moving averages, return on equity, shares outstanding, etc. (e.g. https://finance.yahoo.com/quote/NFLX/key-statistics?p=NFLX).
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_stats('nflx')
get_stats_valuation(ticker)
Scrapes the “Valuation Measures” data off the statistics page for the input ticker, which includes information on Price / Sales, P/E, and market cap (e.g. https://finance.yahoo.com/quote/NFLX/key-statistics?p=NFLX).
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_stats_valuation('nflx')
get_top_crypto()
Scrapes data for top 100 cryptocurrencies by market cap (see https://finance.yahoo.com/cryptocurrencies). No parameters need to be passed.
get_top_crypto()
get_undervalued_large_caps()
Returns the table of the top 100 undervalued large caps from Yahoo Finance here: https://finance.yahoo.com/screener/predefined/undervalued_large_caps?offset=0&count=100
get_undervalued_large_caps()
tickers_dow(include_company_data = False)
If no parameters are passed, returns a list of tickers currently listed on the Dow Jones. The tickers are scraped from Wikipedia (see https://en.wikipedia.org/wiki/Dow_Jones_Industrial_Average). If you set include_company_data = True, it will return the full table on this webpage.
tickers = tickers_dow()
dow_table = tickers_dow(True)
tickers_ftse100(include_company_data = False)
If no parameters are passed, returns a list of tickers currently listed on the FTSE 100 index. Otherwise, setting include_company_data = True will return a table with ticker, sector, and company name. The tickers are scraped from here: https://en.wikipedia.org/wiki/FTSE_100_Index.
tickers = tickers_ftse100()
tickers_ftse100(True)
tickers_ftse250(include_company_data = False)
If no parameters are passed, returns a list of tickers currently listed on the FTSE 250 index. Otherwise, setting include_company_data = True will return a table with ticker and company name. The tickers are scraped from here: https://en.wikipedia.org/wiki/FTSE_250_Index.
tickers = tickers_ftse250()
tickers_ftse250(True)
tickers_nasdaq(include_company_data = False)
Returns a list of tickers currently listed on the NASDAQ. If you specify include_company_data = True, it will return a table containing the tickers, their corresponding company names, and several other attributes. This method, along with tickers_other, works by scraping text files from ftp://ftp.nasdaqtrader.com/SymbolDirectory/.
tickers_nasdaq scrapes the nasdaqlisted.txt file from the link above, while tickers_other scrapes the otherlisted.txt file.
tickers = tickers_nasdaq()
tickers_nasdaq(True)
tickers_nifty50(include_company_data = False)
Returns a list of tickers currently listed on the NIFTY50. This method scrapes the tickers from here: https://en.wikipedia.org/wiki/NIFTY_50. If include_company_data is set to True, a table containing the tickers and company names is returned.
tickers = tickers_nifty50()
tickers_nifty50(True)
tickers_niftybank()
Returns a list of tickers currently listed on the NIFTYBANK. No parameters need to be passed.
tickers = tickers_niftybank()
tickers_other(include_company_data = False)
See above description for tickers_nasdaq.
tickers = tickers_other()
tickers_other(True)
tickers_sp500(include_company_data = False)
Returns a list of tickers currently listed in the S&P 500. The data for this is scraped from Wikipedia:
https://en.wikipedia.org/wiki/List_of_S%26P_500_companies
If include_company_data is set to True, the tickers, company names, and sector information are returned as a data frame.
tickers = tickers_sp500()
tickers_sp500(True)
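Since each of the tickers_* functions returns a plain list by default, several of them can be merged into one de-duplicated universe. The following is a sketch only; merge_tickers is a hypothetical helper and not part of yahoo_fin.

```python
def merge_tickers(*ticker_lists):
    """Combine several ticker lists into one sorted list without duplicates."""
    unique = set()
    for tickers in ticker_lists:
        # Normalise case and whitespace so 'msft' and 'MSFT ' count once;
        # skip anything that is not a non-empty string.
        unique.update(
            t.strip().upper()
            for t in tickers
            if isinstance(t, str) and t.strip()
        )
    return sorted(unique)


# With yahoo_fin installed this could be called as, for example:
# universe = merge_tickers(tickers_dow(), tickers_sp500(), tickers_nasdaq())
print(merge_tickers(["msft", "AAPL"], ["MSFT ", "nflx"]))  # ['AAPL', 'MSFT', 'NFLX']
```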
options module
We can import any method from the options module like this:
from yahoo_fin.options import get_options_chain
Just replace get_options_chain with any other method. Also, we can import all methods at once like so:
from yahoo_fin.options import *
# or...
from yahoo_fin import options
get_calls(ticker, date = None)
Scrapes call options data for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/options?p=NFLX).
Returns a pandas data frame containing the call options data for the given ticker and expiration date.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
date: Expiration date. Default is None, which will return the earliest upcoming expiration date's data.
get_calls('nflx')
get_calls('nflx', '06/19/2020')
get_expiration_dates(ticker)
Scrapes expiration dates for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/options?p=NFLX).
Returns a list of expiration dates for the input ticker. This list is based off the drop-down selection box on the options data webpage for the input ticker.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
get_expiration_dates('nflx')
get_expiration_dates('amzn')
get_options_chain(ticker, date = None)
Scrapes calls and puts tables for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/options?p=NFLX).
Returns a dictionary with two data frames. The keys of the dictionary are labeled calls (which maps to the calls data table) and puts (which maps to the puts data table).
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
date: Expiration date. Default is None, which will return the earliest upcoming expiration date's data.
# get data on the earliest upcoming expiration date
get_options_chain('nflx')
# specify an expiration date
get_options_chain('amzn', '03/15/2019')
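To gather chains for more than one expiration, get_expiration_dates and get_options_chain can be used together. This is a sketch under assumptions: chains_by_expiration is a hypothetical helper, and it assumes the date strings returned by get_expiration_dates are accepted as the date argument of get_options_chain. The two function arguments exist only so the loop can be tried with stand-ins.

```python
def chains_by_expiration(ticker, get_dates=None, get_chain=None, limit=None):
    """Return {expiration date: options chain dict} for a ticker."""
    if get_dates is None or get_chain is None:
        # Imported lazily; requires yahoo_fin and network access.
        from yahoo_fin import options
        get_dates = get_dates or options.get_expiration_dates
        get_chain = get_chain or options.get_options_chain
    dates = get_dates(ticker)
    if limit is not None:
        dates = dates[:limit]  # keep the number of requests small
    return {d: get_chain(ticker, d) for d in dates}


# Example (needs yahoo_fin and network access):
# chains = chains_by_expiration("nflx", limit=3)
# first_calls = next(iter(chains.values()))["calls"]
```

Each value is the dictionary described above, so the calls and puts tables stay available under their respective keys.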
get_puts(ticker, date = None)
Scrapes put options data for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/options?p=NFLX).
Returns a pandas data frame containing the put options data for the given ticker and expiration date.
Possible parameters
ticker: Stock ticker (e.g. 'MSFT', 'AMZN', etc.). Case insensitive. Required as input.
date: Expiration date. Default is None, which will return the earliest upcoming expiration date's data.
get_puts('nflx')
get_puts('nflx', '06/19/2020')
yahoo_fin news module
Currently the news module contains a single function, get_yf_rss, which retrieves the Yahoo Finance news RSS feeds for an input ticker.
from yahoo_fin import news
news.get_yf_rss("nflx")