arbitragelab.util.data_importer
This module is a user data helper that wraps several Yahoo Finance libraries.
Module Contents
Classes
DataImporter: Wrapper class that imports data from yfinance and yahoo_fin.
- class DataImporter
Wrapper class that imports data from yfinance and yahoo_fin.
This class allows fast pulling and processing of the information needed for the research process. This includes retrieving ticker groups for various indexes, pulling the relevant pricing data, and processing that data.
- static get_sp500_tickers() list
Gets all S&P 500 stock tickers.
- Returns:
(list) List of tickers.
- static get_dow_tickers() list
Gets all Dow Jones stock tickers.
- Returns:
(list) List of tickers.
- static remove_nuns(dataframe: pandas.DataFrame, threshold: int = 100) pandas.DataFrame
Removes tickers whose number of null values exceeds a threshold.
- Parameters:
dataframe – (pd.DataFrame) Asset price data.
threshold – (int) The number of null values allowed.
- Returns:
(pd.DataFrame) Price data without any null values.
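The threshold rule can be illustrated with a minimal sketch. It assumes the rule is a per-column null count compared against `threshold`. It is not the library's own implementation, and the helper name `drop_sparse_tickers` is hypothetical. Any further handling of the nulls that remain in the kept columns is not shown.

```python
import numpy as np
import pandas as pd


def drop_sparse_tickers(prices: pd.DataFrame, threshold: int = 100) -> pd.DataFrame:
    """Hypothetical helper: keep columns whose null count is <= threshold."""
    null_counts = prices.isnull().sum()  # number of nulls per ticker column
    keep = null_counts[null_counts <= threshold].index
    return prices[keep]


# Toy price data (made-up values, for illustration only).
prices = pd.DataFrame({
    "AAA": [1.0, 1.1, 1.2, 1.3],            # 0 nulls
    "BBB": [2.0, np.nan, np.nan, np.nan],   # 3 nulls
    "CCC": [3.0, 3.1, np.nan, 3.3],         # 1 null
})

filtered = drop_sparse_tickers(prices, threshold=1)
print(list(filtered.columns))  # ['AAA', 'CCC']
```

With `threshold=1`, the column with three nulls is dropped while the column with a single null is kept.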
- static get_price_data(tickers: list, start_date: str, end_date: str, interval: str = '5m') pandas.DataFrame
Gets price data for a custom start date, end date, and interval. For daily prices, only the closing price is kept.
- Parameters:
tickers – (list) List of tickers to download.
start_date – (str) Download start date string (YYYY-MM-DD).
end_date – (str) Download end date string (YYYY-MM-DD).
interval – (str) Valid intervals: [1m,2m,5m,15m,30m,60m,90m,1h,1d,5d,1wk,1mo,3mo].
- Returns:
(pd.DataFrame) The requested price_data.
- static get_returns_data(price_data: pandas.DataFrame) pandas.DataFrame
Calculates return data from the supplied price data.
- Parameters:
price_data – (pd.DataFrame) Asset price data.
- Returns:
(pd.DataFrame) Price data converted to returns.
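As a rough illustration of converting prices to returns, the sketch below assumes simple percentage returns, (p_t / p_{t-1}) - 1. The reference above does not say whether the library uses simple or log returns, so treat this as an assumption rather than the method's actual behaviour. The price values are made up.

```python
import pandas as pd

# Toy price data (made-up values, for illustration only).
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 99.0], "BBB": [50.0, 50.0, 55.0]},
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"]),
)

# Simple returns: each price divided by the previous one, minus 1.
# The first row has no predecessor, so it is NaN and gets dropped.
returns = (prices / prices.shift(1) - 1).dropna()
print(returns.round(4))
```

The result has one row fewer than the input, because the first observation has no prior price to compare against.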
- get_ticker_sector_info(tickers: list, yf_call_chunk: int = 20) pandas.DataFrame
This method loops through all the tickers, uses the yfinance library to request ticker info, and retrieves the ‘sector’ and ‘industry’ information.
This method uses the yfinance ‘Tickers’ object, which limits the number of tickers that can be supplied as a string argument. To work around this limit, the supplied ticker list is broken into small chunks that are passed sequentially to the helper function.
- Parameters:
tickers – (list) List of asset symbols.
yf_call_chunk – (int) Number of tickers allowed per ‘Tickers’ object. This should always be less than 200.
- Returns:
(pd.DataFrame) DataFrame with input asset tickers and their respective sector and industry information.
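The chunking approach described above can be sketched in plain Python. The helper name `chunk_tickers` is hypothetical and is not part of the library. The space-separated join is an assumption about how each chunk would be turned into the string argument.

```python
from typing import Iterator, List


def chunk_tickers(tickers: List[str], chunk_size: int = 20) -> Iterator[List[str]]:
    """Hypothetical helper: yield consecutive slices of at most chunk_size tickers."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(tickers), chunk_size):
        yield tickers[start:start + chunk_size]


# 45 placeholder symbols (made up, for illustration only).
tickers = [f"T{i}" for i in range(45)]
chunks = list(chunk_tickers(tickers, chunk_size=20))

# Assumption: each chunk is joined into one space-separated string
# before being handed to a 'Tickers' object.
joined = [" ".join(chunk) for chunk in chunks]
print([len(chunk) for chunk in chunks])  # [20, 20, 5]
```

Every ticker appears in exactly one chunk and the original order is preserved, so the per-chunk results can be concatenated back into one table.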