Using Alphalens

In this example we demonstrate how to build a simple pipeline that computes a factor, and then analyze that factor with Alphalens.

Loading the data bundle

[2]:
import warnings
warnings.filterwarnings('ignore')

import os
import pandas as pd
from datetime import timedelta

os.environ['ZIPLINE_ROOT'] = os.path.join(os.getcwd(), '.zipline')
os.listdir(os.environ['ZIPLINE_ROOT'])

os.environ['ZIPLINE_TRADER_CONFIG'] = os.path.join(os.getcwd(), "./zipline-trader.yaml")

with open(os.environ['ZIPLINE_TRADER_CONFIG'], 'r') as f:
    data = f.read()
    print(data[:20])
import zipline
from zipline.data import bundles

from zipline.utils.calendars import get_calendar
from zipline.pipeline.data import USEquityPricing
from zipline.pipeline.factors import CustomFactor
from zipline.research.utils import get_pricing, create_data_portal, create_pipeline_engine

# Set the trading calendar
trading_calendar = get_calendar('NYSE')
bundle_name = 'alpaca_api'
start_date = pd.Timestamp('2015-12-31', tz='utc')
pipeline_start_date = start_date + timedelta(days=365*2+10)
end_date = pd.Timestamp('2020-12-28', tz='utc')
data_portal = create_data_portal(bundle_name, trading_calendar, start_date)

Creating the Pipeline Engine

[3]:
engine = create_pipeline_engine(bundle_name)

Let’s create our custom factor

This factor computes each stock's trailing one-year return: the percentage change in the closing price over a 252-trading-day window.

[4]:
from zipline.pipeline.factors import CustomFactor, Returns
from zipline.pipeline.data import USEquityPricing

class YearlyReturns(CustomFactor):
    inputs = [USEquityPricing.close]
    window_length = 252
    def compute(self, today, assets, out, prices):
        # Percentage change from the first to the last close in the window
        out[:] = (prices[-1] - prices[0]) / prices[0]
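
The class above computes a plain one-year return. If you wanted the regression-slope variant sometimes used for momentum (fit a line through the log prices and take its slope), the per-column computation inside `compute` could look like this plain-numpy sketch (`yearly_slope` is an illustrative helper, not part of zipline):

```python
import numpy as np

def yearly_slope(prices):
    # Slope of an ordinary least-squares fit of log prices against time:
    # a positive slope means the series trended up over the window.
    log_prices = np.log(prices)
    t = np.arange(len(prices))
    slope, _intercept = np.polyfit(t, log_prices, 1)
    return slope

# A series growing 0.1% per day has a log-price slope of ~0.001.
prices = 100 * np.exp(0.001 * np.arange(252))
print(round(yearly_slope(prices), 6))  # 0.001
```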

Our Pipeline

[7]:
from zipline.pipeline.domain import US_EQUITIES
from zipline.pipeline.factors import AverageDollarVolume
from zipline.pipeline import Pipeline
from zipline.pipeline.classifiers.custom.sector import ZiplineTraderSector, SECTOR_LABELS

# A simple liquidity measure that could be used to screen the universe
universe = AverageDollarVolume(window_length=5)

pipeline = Pipeline(
    columns = {
            'MyFactor' : YearlyReturns(),
#             'Returns': Returns(window_length=252),  # same as YearlyReturns
            'Sector' : ZiplineTraderSector()
    }, domain=US_EQUITIES
)

Run the Pipeline

[8]:
# Run our pipeline for the given start and end dates
factors = engine.run_pipeline(pipeline, pipeline_start_date, end_date)

factors.head()
[8]:
                                            MyFactor  Sector
2018-01-09 00:00:00+00:00 Equity(0 [A])     0.455036       1
                          Equity(1 [AAL])   0.118720      11
                          Equity(2 [ABC])   0.133466       7
                          Equity(3 [ABMD])  0.774400       7
                          Equity(4 [ADP])   0.150161      10

Data preparation

Alphalens input consists of two types of information: the factor values for the period under analysis and the historical asset prices (or returns). Alphalens does not need to know how the factor was computed; the historical factor values alone are enough. This is useful because it lets us evaluate factors for which we have the data but not the implementation details.

Alphalens requires that factor and price data follow a specific format, and it provides a utility function, get_clean_factor_and_forward_returns, that accepts factor data, price data, and optionally group information (for example sector groups, useful for sector-specific analysis) and returns the data suitably formatted for Alphalens.
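
As a sketch of the shapes involved (with made-up values): the factor must be a Series with a (date, asset) MultiIndex, which is exactly what run_pipeline returns for a single column, and the prices must be a wide DataFrame with one column per asset:

```python
import pandas as pd

# Factor values: a Series indexed by (date, asset). Values are made up.
dates = pd.to_datetime(["2018-01-10", "2018-01-11"], utc=True)
factor = pd.Series(
    [0.45, 0.12, 0.47, 0.11],
    index=pd.MultiIndex.from_product([dates, ["A", "AAL"]],
                                     names=["date", "asset"]),
)

# Prices: a wide DataFrame, one column per asset, indexed by date.
# It must extend beyond the factor dates by at least the longest
# forward-return period requested (e.g. 10 days for periods=[1, 5, 10]).
prices = pd.DataFrame(
    {"A": [70.8, 70.8, 71.7], "AAL": [53.8, 56.2, 58.2]},
    index=pd.to_datetime(["2018-01-10", "2018-01-11", "2018-01-12"],
                         utc=True),
)

print(factor.index.nlevels, prices.shape)  # 2 (3, 2)
```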

[9]:
asset_list = factors.index.levels[1].unique()

prices = get_pricing(
        data_portal,
        trading_calendar,
        asset_list,
        pipeline_start_date,
        end_date)
prices.head()
[9]:
Equity(0 [A]) Equity(1 [AAL]) Equity(2 [ABC]) Equity(3 [ABMD]) Equity(4 [ADP]) Equity(5 [AEE]) Equity(6 [AES]) Equity(7 [AJG]) Equity(8 [ALB]) Equity(9 [ALK]) ... Equity(495 [TWTR]) Equity(496 [UNM]) Equity(497 [UNP]) Equity(498 [UPS]) Equity(499 [VMC]) Equity(500 [VRSK]) Equity(501 [VZ]) Equity(502 [WBA]) Equity(503 [WEC]) Equity(504 [ZTS])
2018-01-10 00:00:00+00:00 70.80 53.78 97.20 208.21 117.66 56.49 10.775 63.575 132.96 72.07 ... 24.240 57.520 139.71 129.77 132.39 96.32 51.69 74.06 63.68 73.90
2018-01-11 00:00:00+00:00 70.80 56.18 98.12 210.17 117.18 56.15 10.960 63.390 134.68 74.82 ... 24.340 58.210 140.40 133.46 135.18 96.67 52.12 75.36 63.58 74.60
2018-01-12 00:00:00+00:00 71.72 58.22 98.99 215.19 118.39 55.50 11.030 63.875 133.46 73.52 ... 25.415 58.600 141.17 134.08 133.97 97.48 51.87 76.03 63.63 75.40
2018-01-16 00:00:00+00:00 71.24 57.73 99.55 218.27 119.38 55.45 10.680 63.410 128.00 71.28 ... 24.680 55.530 140.30 132.90 133.67 97.19 51.66 76.03 63.15 75.55
2018-01-17 00:00:00+00:00 72.05 57.91 101.29 224.23 122.05 55.79 10.730 64.040 126.96 70.83 ... 24.550 56.355 140.46 134.01 134.58 98.01 51.73 75.78 63.56 76.76

5 rows × 505 columns

[10]:
import alphalens as al
[11]:
factor_data = al.utils.get_clean_factor_and_forward_returns(
        factor=factors["MyFactor"],
        prices=prices,
        quantiles=5,
        periods=[1, 5, 10],
        groupby=factors["Sector"],
        binning_by_group=True,
        groupby_labels=SECTOR_LABELS,
    max_loss=0.8)
Dropped 1.5% entries from factor data: 1.5% in forward returns computation and 0.0% in binning phase (set max_loss=0 to see potentially suppressed Exceptions).
max_loss is 80.0%, not exceeded: OK!
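
Under the hood, the forward returns that end up in the 1D/5D/10D columns are shifted percentage changes of the price table. A minimal pandas sketch with made-up prices (`forward_returns` is an illustrative helper, not an alphalens function):

```python
import pandas as pd

# Hypothetical close prices for two assets.
prices = pd.DataFrame(
    {"A": [100.0, 101.0, 103.0, 102.0],
     "AAL": [50.0, 49.0, 50.0, 52.0]},
    index=pd.date_range("2018-01-10", periods=4, tz="UTC"),
)

def forward_returns(prices, period):
    # Return earned from each date to `period` rows later.
    return prices.pct_change(period).shift(-period)

fwd_1d = forward_returns(prices, 1)
print(round(fwd_1d.iloc[0, 0], 6))  # 0.01: A went from 100 to 101
```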
[12]:
factor_data.head(10)
[12]:
                                                  1D        5D       10D    factor             group  factor_quantile
date                      asset
2018-01-10 00:00:00+00:00 Equity(0 [A])     0.000000  0.019492  0.043079  0.491687     Capital Goods                4
                          Equity(1 [AAL])   0.044626  0.080141 -0.017851  0.085696    Transportation                2
                          Equity(2 [ABC])   0.009465  0.037757  0.073765  0.149593       Health Care                3
                          Equity(3 [ABMD])  0.009414  0.084914  0.135200  0.868852       Health Care                5
                          Equity(4 [ADP])  -0.004080  0.027707  0.025497  0.162133        Technology                1
                          Equity(5 [AEE])  -0.006019 -0.019295  0.004780  0.102944  Public Utilities                3
                          Equity(6 [AES])   0.017169  0.070070  0.067285 -0.050309  Basic Industries                1
                          Equity(7 [AJG])  -0.002910  0.007000  0.032796  0.213659           Finance                3
                          Equity(8 [ALB])   0.012936 -0.113147 -0.120751  0.507343  Basic Industries                5
                          Equity(9 [ALK])   0.038157 -0.031636 -0.138754 -0.225000    Transportation                1

Running Alphalens

Once the factor data is ready, running the Alphalens analysis is simple: a single function call generates the full factor report (statistical tables and plots). Remember that you can use Python's built-in help function to view the details of any of these functions.

[13]:
al.tears.create_full_tear_sheet(factor_data, long_short=True, group_neutral=False, by_group=True)
Quantiles Statistics
                      min       max      mean       std  count    count %
factor_quantile
1               -0.882437  0.336851 -0.221892  0.173761  77857  20.919291
2               -0.788964  0.440738 -0.049322  0.143953  74322  19.969477
3               -0.671378  0.560727  0.058053  0.144160  70527  18.949804
4               -0.570969  0.765367  0.176418  0.153070  74354  19.978075
5               -0.463738  9.102661  0.436519  0.325251  75118  20.183353
Returns Analysis
                                                   1D      5D     10D
Ann. alpha                                      0.012   0.007   0.008
beta                                           -0.236  -0.309  -0.262
Mean Period Wise Return Top Quantile (bps)     -0.345  -0.764  -0.528
Mean Period Wise Return Bottom Quantile (bps)   3.034   2.960   2.288
Mean Period Wise Spread (bps)                  -3.380  -3.663  -2.783
../_images/notebooks_Alphalens_17_5.png
../_images/notebooks_Alphalens_17_6.png
Information Analysis
                      1D      5D     10D
IC Mean            0.017   0.017   0.010
IC Std.            0.274   0.279   0.272
Risk-Adjusted IC   0.061   0.060   0.036
t-stat(IC)         1.648   1.640   0.970
p-value(IC)        0.100   0.101   0.332
IC Skew           -0.228  -0.323  -0.357
IC Kurtosis       -0.256  -0.193  -0.076
../_images/notebooks_Alphalens_17_9.png
Turnover Analysis
                                     1D     5D    10D
Quantile 1 Mean Turnover          0.054  0.113  0.155
Quantile 2 Mean Turnover          0.132  0.267  0.354
Quantile 3 Mean Turnover          0.159  0.312  0.408
Quantile 4 Mean Turnover          0.131  0.265  0.353
Quantile 5 Mean Turnover          0.056  0.117  0.162
                                     1D     5D    10D
Mean Factor Rank Autocorrelation  0.994  0.975  0.954
../_images/notebooks_Alphalens_17_13.png
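
The IC values in the Information Analysis table above are Spearman rank correlations between the factor and the forward returns, computed cross-sectionally on each date. A toy example for a single date (values made up):

```python
import pandas as pd

# One date's cross-section: factor values and realized next-day returns.
factor = pd.Series({"A": 0.45, "AAL": 0.12, "ABC": 0.13, "ABMD": 0.77})
fwd_1d = pd.Series({"A": 0.000, "AAL": 0.045, "ABC": 0.009, "ABMD": 0.009})

# Information coefficient: Spearman rank correlation of factor vs. returns.
ic = factor.corr(fwd_1d, method="spearman")
print(round(ic, 3))
```

Here the highest-factor name earned a middling return while the lowest-factor name earned the best one, so the IC for this date is negative.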

The individual reports are also available separately:

al.tears.create_returns_tear_sheet(factor_data,
                                   long_short=True,
                                   group_neutral=False,
                                   by_group=False)

al.tears.create_information_tear_sheet(factor_data,
                                       group_neutral=False,
                                       by_group=False)

al.tears.create_turnover_tear_sheet(factor_data)

al.tears.create_event_returns_tear_sheet(factor_data, prices,
                                         avgretplot=(5, 15),
                                         long_short=True,
                                         group_neutral=False,
                                         std_bar=True,
                                         by_group=False)

Working with Pyfolio

We can use pyfolio to analyze the returns as if they had been generated by a backtest, like so:

[14]:
pf_returns, pf_positions, pf_benchmark = \
    al.performance.create_pyfolio_input(factor_data,
                                        period='1D',
                                        capital=100000,
                                        long_short=True,
                                        group_neutral=False,
                                        equal_weight=True,
                                        quantiles=[1,5],
                                        groups=None,
                                        benchmark_period='1D')


[15]:
import pyfolio as pf
pf.tears.create_full_tear_sheet(pf_returns,
                                positions=pf_positions,
                                benchmark_rets=pf_benchmark,
                                hide_positions=True)
Start date    2018-01-10
End date      2020-12-11
Total months          50
                    Backtest
Annual return        -3.584%
Cumulative returns  -14.318%
Annual volatility    10.102%
Sharpe ratio           -0.31
Calmar ratio           -0.14
Stability               0.36
Max drawdown        -26.261%
Omega ratio             0.93
Sortino ratio          -0.39
Skew                   -2.46
Kurtosis               25.59
Tail ratio              0.84
Daily value at risk  -1.285%
Gross leverage          0.69
Alpha                  -0.01
Beta                   -0.23
Worst drawdown periods  Net drawdown in %   Peak date  Valley date  Recovery date  Duration
0                                   26.26  2020-03-17   2020-12-09            NaT       NaN
1                                    9.46  2019-08-26   2019-11-04     2020-03-11       143
2                                    5.86  2018-09-03   2019-04-17     2019-08-08       244
3                                    2.92  2018-06-04   2018-07-30     2018-08-31        65
4                                    1.90  2018-01-18   2018-02-07     2018-02-22        26
Stress Events    mean     min    max
New Normal     -0.01%  -7.48%  3.07%
../_images/notebooks_Alphalens_22_3.png
../_images/notebooks_Alphalens_22_4.png
../_images/notebooks_Alphalens_22_5.png
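
Most of the summary statistics in the tear sheet above are simple functions of the daily returns series. For example, the annualized Sharpe ratio (assuming a zero risk-free rate and 252 trading days) can be sketched as follows, with made-up daily returns standing in for pf_returns:

```python
import numpy as np
import pandas as pd

# Hypothetical daily strategy returns.
rets = pd.Series([0.002, -0.001, 0.0015, 0.0005, -0.0005])

# Annualized Sharpe ratio: mean over standard deviation of daily returns,
# scaled by sqrt(252) trading days (risk-free rate assumed to be zero).
sharpe = np.sqrt(252) * rets.mean() / rets.std()
print(round(sharpe, 2))  # 6.23
```

The inflated value here is an artifact of the five-day toy sample; over a realistic history the same formula produces figures like the -0.31 reported above.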