updated 21 October 2016

Who is this Guy?

Proprietary/Principal Trading

  • proprietary or principal traders are a specific "member" structure with the exchanges

  • high barriers to entry, large up-front capital requirements

  • many strategies pursued in this structure have capacity constraints

  • benefits on the other side are low fees, potentially high leverage

  • money management needs to be very focused on drawdowns, leverage, volatility

Generalities: Regional Firm Cultures

New York and London

  • Bank Roots: principals who came out of bank buy-side desks, capital concentrated in relatively few partners
  • Asset Classes: all asset classes are represented, but equities dominate, with secondary representation from bonds and FX
  • Strategies: principally market neutral, stat arb, automated market making, all others represented

Asia

  • Entrepreneurial Roots: most principals represent family wealth made initially in various non-financial businesses
  • Asset Classes: dominated by FX and domestic stocks
  • Strategies: most prevalent is multiple small desks, trading short term directional strategies

Chicago

  • Floor Roots: principals almost universally started on the floors, deep culture of 'skin in the game'
  • Asset Classes: generally specialized by firm, dominated by futures and options, with representation from all other asset classes
  • Strategies: wide range from individual directional to calendar spreads to automated market making

Generalities: Types of Proprietary Traders

One-Three Person Desk

  • small desks are generally run by a trader who comes out of a larger firm
  • biggest challenge is avoiding risk of ruin, many small traders leave the market
  • strategies are generally not automated, and very limited in number

Large Semi-Automated Desks

  • growth of the small trader model, only more established and successful, or
  • consolidation of a desk around asset class or strategy type in a more corporate firm

  • generally consists of senior trader(s), 'clerks', and perhaps a small number of programmers or quants
  • typically 24 hour operations

  • larger desks have revenues to start automating trades, but rarely have much experience with automation

Large Automated Desks

  • Blair Hull describes the minimum team as a trader (someone with market and operational knowledge), a programmer, and a quant

  • more scalable than less automated desks
  • large amounts of time spent on pre-production testing: mistakes are expensive

Bank Desks

  • typically organized by asset class
  • less typically organized by strategy style
  • almost always in regional financial centers
  • typically semi-automated

Backtesting, art or science?

Back-testing. I hate it - it's just optimizing over history. You never see a bad back-test. Ever. In any strategy. - Josh Diedesch (2014), CalSTRS

Every trading system is in some form an optimization. - Emilio Tomasini (2009)

Moving Beyond Assumptions

Many system developers consider

"I hypothesize that this strategy idea will make money"

to be adequate.

Instead, strive to:

  • understand your business constraints and objectives
  • build a hypothesis for the system
  • build the system in pieces
  • test the system in pieces
  • measure how likely it is that you have overfit

Constraints and Objectives

Constraints

  • capital available
  • products you can trade
  • execution platform

Benchmarks

  • published or synthetic?
  • what are the limitations?
  • are you held to it, or just measured against it?

Objectives

  • formulate objectives for testability
  • make sure they reflect your real business goals

Building a Hypothesis

To create a testable idea (a hypothesis):

  • formulate a declarative conjecture
  • make sure the conjecture is predictive
  • define the expected outcome
  • describe means of verifying/testing

Good, complete hypothesis statements include:

  • what is being analyzed (the subject)
  • dependent variable(s) (the result/prediction)
  • independent variables (inputs to the model)
  • the anticipated possible outcomes, including direction or comparison
  • how you will validate or refute each hypothesis

Building Blocks

Filters

  • select the instruments to trade
  • categorize market characteristics that are favorable to the strategy

Indicators

  • quantitative values derived from market data
  • includes all common "technicals"

Signals

  • describe the interaction between filters, market data, and indicators
  • can be viewed as a prediction at a point in time

Rules

  • make path-dependent actionable decisions
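A minimal sketch of how the four building blocks fit together, in base R on synthetic prices. Every name and parameter here (the 20-day average, the crossover signal, the exit rule) is an illustrative assumption, not the quantstrat implementation, and the "filter" is only a stand-in for a real instrument or regime filter.

```r
# Hypothetical toy example: all names and parameters are illustrative assumptions.
set.seed(42)
price <- 100 + cumsum(rnorm(250))            # synthetic daily closes

# Indicator: a quantitative value derived from market data (20-day simple moving average)
n   <- 20
sma <- as.numeric(stats::filter(price, rep(1 / n, n), sides = 1))

# Filter (stand-in): only consider bars where the indicator is defined
tradable <- !is.na(sma)

# Signal: a prediction at a point in time (close crosses above the average)
above  <- tradable & price > sma
signal <- c(FALSE, diff(above) == 1)

# Rule: a path-dependent decision (enter only when flat, exit on a close below the average)
position <- numeric(length(price))
for (i in 2:length(price)) {
  prev <- position[i - 1]
  if (prev == 0 && signal[i]) {
    position[i] <- 1
  } else if (prev == 1 && tradable[i] && price[i] < sma[i]) {
    position[i] <- 0
  } else {
    position[i] <- prev
  }
}
table(position)
```

Keeping the pieces in separate objects is what makes it possible to test the indicator, the signal, and the rule one at a time.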

Test the System in Pieces

How to Screw Up Less

Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise. - John Tukey (1962) p. 13

Fail quickly, think deeply, or both?

No matter how beautiful your theory, no matter how clever you are or what your name is, if it disagrees with experiment, it’s wrong. - Richard P. Feynman (1965)

Things to Watch Out For, or, Types of Overfitting

Look Ahead Bias

  • directly using knowledge of future events
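A small illustration of look-ahead bias, assuming synthetic returns with no real edge. The "signal" is deliberately computed from the same bar it trades, then lagged by one bar to show the difference. Variable names are hypothetical.

```r
set.seed(1)
ret    <- rnorm(500, sd = 0.01)              # synthetic daily returns, no real edge
signal <- sign(ret)                          # "signal" computed from the same day's return

# Biased: trades bar t using information only known at the close of bar t
biased <- signal * ret

# Corrected: a signal known at the close of bar t can first be traded on bar t + 1
lagged <- c(0, head(signal, -1))
honest <- lagged * ret

c(biased = sum(biased), honest = sum(honest))
```

The biased version can never lose on any bar, which is the usual symptom: results that look too good are often a missing lag.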

Data Mining Bias

  • caused by testing multiple configurations and parameters over multiple runs, with adjustments between backtest runs
  • exhaustive searches may or may not introduce biases

Data Snooping

  • knowledge of the data set can contaminate your choices
  • making changes after failures without having strong experimental design

Measuring Indicators

A good indicator measures something in the market:

  • a theoretical "fair value" price, or
  • the impact of a factor on that price, or
  • turning points, direction, or slope of the series

Make sure the indicator is testable:

  • hypothesis and tests for the indicator
    • standard errors and goodness of fit
    • t-tests or p-values
  • custom 'perfect foresight' model built from a periodogram or signature plot

If your indicator doesn't have testable information content, throw it out and start over.
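One hedged way to check an indicator for information content is a simple regression of forward returns on the indicator, reading off the standard error, t statistic, p-value, and goodness of fit. The data below are synthetic, with a small effect planted on purpose; the variable names are assumptions for illustration.

```r
set.seed(7)
n_obs     <- 500
indicator <- rnorm(n_obs)                      # hypothetical indicator values
fwd_ret   <- 0.1 * indicator + rnorm(n_obs)    # synthetic forward return with a small planted effect

fit   <- lm(fwd_ret ~ indicator)
coefs <- summary(fit)$coefficients

# Estimate, standard error, t statistic, and p-value for the indicator
coefs["indicator", c("Estimate", "Std. Error", "t value", "Pr(>|t|)")]

# Goodness of fit
summary(fit)$r.squared
```

With real data the forward return must be aligned so that the indicator at time t is paired only with returns after t.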

Measuring Signals

Signals make predictions; all the literature on forecasting is applicable:

  • mean squared forecast error, BIC, etc.
  • box plots or additive models for forward expectations
  • "revealed performance" approach of Racine and Parmeter (2012)
  • re-evaluate assumptions about the method of action of the strategy
  • detect information bias or luck before moving on

quantstrat::distributional.boxplot()
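A sketch of treating a signal as a forecast, assuming a synthetic long/short signal: fit on the first half, score mean squared forecast error on the second half against a naive zero forecast, and summarize forward returns by signal state. This is base R for illustration only and makes no claim about what the quantstrat function above computes internally.

```r
set.seed(11)
n_obs   <- 400
signal  <- sample(c(-1, 1), n_obs, replace = TRUE)    # hypothetical long/short signal
fwd_ret <- 0.002 * signal + rnorm(n_obs, sd = 0.01)   # synthetic forward returns

# Fit the signal's forecast on the first half only, then score it on the second half
train <- 1:(n_obs / 2)
test  <- (n_obs / 2 + 1):n_obs
beta  <- coef(lm(fwd_ret[train] ~ 0 + signal[train]))[[1]]

msfe_signal <- mean((fwd_ret[test] - beta * signal[test])^2)
msfe_naive  <- mean((fwd_ret[test] - 0)^2)             # benchmark: always forecast zero

# Forward-return distribution by signal state (the numbers behind a box plot)
by_state <- tapply(fwd_ret[test], signal[test], summary)

c(signal = msfe_signal, naive = msfe_naive)
by_state
```

If the signal's out-of-sample error is not lower than the naive benchmark, there is little reason to move on to rules.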

Measuring Rules

If your signal process doesn't have predictive power, stop now.

 

  • rules should refine the way the strategy 'listens' to signals
  • entries may be passive or aggressive, or may level or pyramid into a position
  • exits may have their own signal process, or may be derived empirically
  • risk rules should be added near the end, for empirical 'stops' or to meet business constraints

Beware of Rule Burden:

  • having too many rules is an invitation to overfitting
  • adding rules after being disappointed in backtest results is almost certainly an exercise in overfitting (data snooping)
  • strategies with fewer rules are more likely to be robust out of sample

Walk Forward

quantstrat::walk.forward

quantstrat::chart.forward
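The idea behind walk forward analysis can be sketched in base R: choose the best parameter on a training window, apply it to the next unseen window, then roll forward. The momentum rule, parameter grid, and window lengths are hypothetical, and this is not the quantstrat implementation.

```r
set.seed(3)
ret   <- rnorm(600, mean = 0.0002, sd = 0.01)   # synthetic daily returns
price <- cumprod(1 + ret)
lookbacks <- c(5, 10, 20, 40)                   # hypothetical parameter grid
train_len <- 200
test_len  <- 50

# Strategy returns for a momentum rule: long when the trailing n-day return is positive
strat_ret <- function(n) {
  mom <- c(rep(NA, n), price[(n + 1):length(price)] / price[1:(length(price) - n)] - 1)
  pos <- c(0, head(ifelse(is.na(mom), 0, as.numeric(mom > 0)), -1))  # lag to avoid look-ahead
  pos * ret
}
all_ret <- sapply(lookbacks, strat_ret)         # one column per parameter value

starts <- seq(1, length(ret) - train_len - test_len + 1, by = test_len)
oos <- lapply(starts, function(s) {
  train <- s:(s + train_len - 1)
  test  <- (s + train_len):(s + train_len + test_len - 1)
  best  <- which.max(colSums(all_ret[train, , drop = FALSE]))   # optimize in-sample only
  data.frame(start = test[1], lookback = lookbacks[best],
             oos_ret = sum(all_ret[test, best]))
})
wf <- do.call(rbind, oos)
wf
```

Only the `oos_ret` column counts as evidence; how often `lookback` changes between windows is a rough indicator of parameter stability.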

Measuring the Whole System

Net profit as a sole evaluation method ignores many of the characteristics important to this decision. - Robert Pardo (2008)

blotter::chart.MAE

  • evaluating backtest/simulated trades can tell you more about the dynamics of your system
  • Maximum Adverse/Favorable Excursion (MAE/MFE) looks at how far a trade moves for/against you before it is closed
  • blotter is probably used as often in the real world for post-trade analysis of production trades as it is for backtests
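For a single trade, MAE and MFE can be read directly off the mark-to-market P&L path. A minimal sketch with synthetic prices and a hypothetical long trade of 100 units:

```r
set.seed(5)
price <- 100 + cumsum(rnorm(30, sd = 0.5))   # synthetic prices over the life of one trade
entry_price <- price[1]
side <- 1                                    # +1 long, -1 short
qty  <- 100

open_pnl  <- side * qty * (price - entry_price)   # mark-to-market P&L at each bar
mae       <- min(open_pnl)                        # worst open P&L while the trade was on (<= 0)
mfe       <- max(open_pnl)                        # best open P&L while the trade was on (>= 0)
final_pnl <- tail(open_pnl, 1)

c(MAE = mae, MFE = mfe, final = final_pnl)
```

Plotting MAE against final P&L across all trades is what suggests where an empirical stop might sit.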

Using Trade Statistics

All trading and backtesting platforms (should) provide trade statistics:

  • number of trades w/ gross and net P&L
  • mean/median, standard deviation of trading P&L per trade
  • percent of positive/negative trades
  • Profit Factor: absolute value of the ratio of gross profits to gross losses
  • Drawdown statistics
  • start-trade drawdown (Fitschen 2013, 185)
  • win/loss ratios of winning over losing trade P&L (total/mean/median)

Dangers of aggregate statistics:

  • hiding the most common outcomes
  • focusing on extremes
  • not enough trades or history for validity
  • collinearities of overlapping "trades"

see blotter::tradeStats and blotter::perTradeStats
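Several of the statistics listed above can be computed by hand from a vector of per-trade P&L. The numbers below are made up for illustration; this sketch does not reproduce the output format of the blotter functions.

```r
trade_pnl <- c(120, -80, 45, -30, 200, -150, 60, 10, -25, 90)   # hypothetical net P&L per trade

wins   <- trade_pnl[trade_pnl > 0]
losses <- trade_pnl[trade_pnl < 0]
cum    <- cumsum(trade_pnl)

stats <- list(
  n_trades      = length(trade_pnl),
  net_pnl       = sum(trade_pnl),
  mean_pnl      = mean(trade_pnl),
  median_pnl    = median(trade_pnl),
  sd_pnl        = sd(trade_pnl),
  pct_positive  = 100 * length(wins) / length(trade_pnl),
  profit_factor = abs(sum(wins) / sum(losses)),    # gross profits over gross losses
  avg_win_loss  = abs(mean(wins) / mean(losses)),  # mean winner over mean loser
  max_drawdown  = min(cum - cummax(cum))           # deepest fall from a prior peak in cumulative P&L
)
str(stats)
```

Ten trades is far too few for any of these numbers to be meaningful, which is the "not enough trades" danger noted above.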

Detecting Backtest Overfitting

Did we overdo it?

A big computer, a complex algorithm and a long time does not equal science. - Robert Gentleman

  • White's Reality Check: from White (2000) and Hansen (2005)
  • k-fold cross validation: improves on the single hold-out model by randomly dividing the sample of size T into sequential sub-samples of size T/k (Hastie, Tibshirani, and Friedman 2009)
  • CSCV sampling (combinatorially symmetric cross validation): "generate \(S/2\) testing sets of size \(T/2\) by recombining the \(S\) slices of the overall sample of size \(T\)" (Bailey et al. 2014, 17)
  • Multiple Hypothesis Testing: looks at Type I vs Type II error in evaluating backtests and at appropriate haircuts based on these probabilities (Harvey and Liu 2015)
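A rough sketch of the CSCV idea, under the assumption (my reading of Bailey et al.) that each split trains on one choice of S/2 slices and tests on the remaining S/2. The return matrix is synthetic noise, so no trial has real skill, and the Sharpe-style performance measure and all names are illustrative.

```r
set.seed(2014)
T_obs <- 240; N <- 20; S <- 6                        # observations, strategy trials, slices (S even)
M <- matrix(rnorm(T_obs * N, sd = 0.01), T_obs, N)   # synthetic returns: no trial has real skill

sharpe <- function(x) colMeans(x) / apply(x, 2, sd)
slice  <- rep(1:S, each = T_obs / S)                 # assign each row to one of S contiguous slices
combos <- combn(S, S / 2)                            # every way to pick S/2 slices for training

logits <- apply(combos, 2, function(train_slices) {
  is_train <- slice %in% train_slices
  perf_is  <- sharpe(M[is_train, , drop = FALSE])
  perf_oos <- sharpe(M[!is_train, , drop = FALSE])
  best     <- which.max(perf_is)                     # trial that looks best in-sample
  omega    <- rank(perf_oos)[best] / (N + 1)         # its relative rank out-of-sample
  log(omega / (1 - omega))
})

# Share of splits where the in-sample winner lands at or below the out-of-sample median
pbo <- mean(logits <= 0)
pbo
```

In practice the columns of `M` would be the returns of every parameter set tried during development, including the ones that were discarded.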

Monte Carlo and the bootstrap

Sampling from limited information

  • estimate the 'true' properties of a distribution from incomplete information
  • evaluate the likelihood (test the hypothesis) that a particular result is
    • not the result of chance
    • not overfit
  • understand confidence intervals for other descriptive statistics on the backtest
  • simulate different paths that the results might have taken, if the ordering had been different

History of Monte Carlo and bootstrap simulation

  • Laplace was the first to describe the mathematical properties of sampling from a distribution
  • Mahalanobis extended this work in the 1930s to describe sampling from dependent distributions, and anticipated the block bootstrap by examining these dependencies
  • Monte Carlo simulation was developed by Stanislaw Ulam and John von Neumann as part of the hydrogen bomb program in 1946 (Richard Rhodes, Dark Sun, p.304)
  • computational implementation of Monte Carlo simulation was constructed by Nicholas Metropolis on the ENIAC and MANIAC machines
  • Metropolis was an author in 1953 of the prior distribution sampler extended by W. K. Hastings to the modern Metropolis-Hastings form in 1970
  • Maurice Quenouille and John Tukey developed the modern 'jackknife' in the 1950s
  • Bradley Efron described the modern bootstrap in 1979

Empirical Example Setup

  • Bollinger Bands demo from quantstrat
  • only one instrument in the demo
  • levels into positions
  • allows for flat.to.flat and flat.to.reduced trade sizing

Sample from Daily P&L, with replacement

P&L Quantiles:

     0%    25%    50%    75%   100%
 -20898   1030   5249  10832  39399

  • block sampling, with replacement, provides multiple paths
  • mimics some of the autocorrelation structure of returns
  • may create deeper drawdowns if down streaks are effectively repeated
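Block sampling with replacement can be sketched in a few lines of base R. The daily P&L series, block length, and number of paths below are hypothetical, and this is a simplified fixed-length block bootstrap, not the code that produced the quantiles above.

```r
set.seed(99)
daily_pnl <- rnorm(250, mean = 20, sd = 400)   # hypothetical daily P&L
block_len <- 5
n_paths   <- 1000
n_blocks  <- ceiling(length(daily_pnl) / block_len)
max_start <- length(daily_pnl) - block_len + 1

# Build one path by stitching together randomly chosen blocks of consecutive days
one_path <- function() {
  starts <- sample.int(max_start, n_blocks, replace = TRUE)
  idx    <- as.vector(sapply(starts, function(s) s:(s + block_len - 1)))
  daily_pnl[idx[seq_along(daily_pnl)]]
}
paths <- replicate(n_paths, one_path())        # rows = days, columns = simulated paths

final_pnl <- colSums(paths)
max_dd    <- apply(paths, 2, function(p) min(cumsum(p) - cummax(cumsum(p))))

quantile(final_pnl)
quantile(max_dd)
```

Longer blocks preserve more of the autocorrelation in the original series at the cost of fewer distinct paths.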

Disadvantages of Sampling from portfolio P&L:

  • not transparent
  • potentially unrealistic
  • really only a statistical confidence model
  • path won't line up with historical market regimes

Sample from Trades, with replacement

P&L Quantiles:

     0%    25%    50%    75%   100%
 -23163  -7432    180   7929  24703

  • generate random trades similar to backtest
  • round turn size, direction, duration are sampled from the trades and any flat periods
  • applied to market data as new transactions at the then-prevalent price
  • trade expectations in the random-trade model, compared to backtest expectations

Disadvantages:

  • more complicated to model trade dynamics
  • maintaining constraints e.g. max position

Advantages:

  • can more closely compare strategy to random entries and exits with same overall dynamic
  • creates a distribution around the trading dynamics, not just the daily P&L

  • best for modeling skill vs. luck
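A simplified sketch of the random-trade idea: draw round turns and flat periods with replacement from the backtest's own trades, lay them end to end, and mark them against the market prices. The trade table and price series are invented, constraints such as maximum position are ignored, and this is not the txnsim implementation.

```r
set.seed(8)
price <- 100 + cumsum(rnorm(300, sd = 0.5))        # synthetic market prices

# Hypothetical backtest round turns: duration in bars and signed quantity (0 = flat period)
trades <- data.frame(duration = c(5, 12, 3, 8, 20, 6, 10, 4),
                     qty      = c(100, 0, -100, 0, 200, 0, 100, 0))

random_path <- function() {
  pos <- numeric(0)
  while (length(pos) < length(price)) {
    k   <- sample.int(nrow(trades), 1)             # draw a trade or flat period with replacement
    pos <- c(pos, rep(trades$qty[k], trades$duration[k]))
  }
  pos <- pos[seq_along(price)]
  # the position held over bar t earns the price change from t to t + 1
  sum(head(pos, -1) * diff(price))
}

random_pnl <- replicate(500, random_path())
quantile(random_pnl)
```

The backtest's actual P&L can then be located within `random_pnl` to judge how much of the result random entries with the same trading style would also have produced.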

Using Returns for Analysis

  • Returns create a standard mechanism for comparing multiple strategies or managers
  • Choice of the denominator matters

Sample Analyses:

  • tail risk measures
  • volatility analysis
  • factor analysis / factor model Monte Carlo
  • style analysis
  • comparing strategies in return space
  • applicability to asset allocation

see blotter::PortfReturns to extract returns from cash P&L and all of PerformanceAnalytics
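Why the denominator matters can be seen with two common choices applied to the same cash P&L. The P&L values and starting equity are hypothetical.

```r
daily_pnl      <- c(500, -200, 300, -400, 250)   # hypothetical daily cash P&L
initial_equity <- 100000

# Denominator 1: fixed initial equity (simple to compare, ignores compounding)
ret_fixed <- daily_pnl / initial_equity

# Denominator 2: prior-day account equity (compounds gains and losses)
equity     <- initial_equity + cumsum(daily_pnl)
prior_eq   <- c(initial_equity, head(equity, -1))
ret_equity <- daily_pnl / prior_eq

cbind(ret_fixed, ret_equity)
```

The two series agree on the first day and drift apart afterwards; volatility and tail-risk measures inherit whichever choice is made, so it should be stated alongside any return-based statistic.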

Asset Allocation

  • we tend to do asset allocation studies only after strategies are in production.

  • backtests are most often done on 1-lots, and initial scaling is done ad-hoc.

  • strategy daily returns become returns of a synthetic asset (the strategy) as inputs to optimization

  • optimizer should use your business objectives as the portfolio objective

PortfolioAnalytics is used extensively by traditional asset managers, but it was developed to optimize capital allocation for trading strategies
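To show strategy returns acting as synthetic assets, here is a base R sketch that computes closed-form minimum-variance weights across three hypothetical strategies. Minimum variance is only a placeholder objective, shorts are allowed, and a real study would encode the actual business objectives and constraints in an optimizer such as PortfolioAnalytics.

```r
set.seed(21)
# Hypothetical daily returns for three strategies, treated as synthetic assets
R <- cbind(stratA = rnorm(250, 0.0004, 0.010),
           stratB = rnorm(250, 0.0003, 0.006),
           stratC = rnorm(250, 0.0005, 0.015))

Sigma <- cov(R)
ones  <- rep(1, ncol(R))
w_raw <- solve(Sigma, ones)
w     <- w_raw / sum(w_raw)                  # fully invested minimum-variance weights

port_sd  <- sqrt(drop(t(w) %*% Sigma %*% w))
equal_sd <- sqrt(drop(t(ones / 3) %*% Sigma %*% (ones / 3)))

round(w, 3)
c(min_variance = port_sd, equal_weight = equal_sd)
```

Because backtests are usually run on 1-lots, the returns fed in here depend on the denominator chosen when converting P&L to returns.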

Conclusion

  • understand the business context you operate in
    • constraints
    • benchmarks
    • objectives
  • separate the components of the strategy
  • construct testable hypotheses at each step of the process
  • evaluate the components separately
  • test yourself often

Thanks

Thank You for Your Attention

 

Thanks to my team, and my family, who make it possible.

©2015-2016 Brian G. Peterson brian@braverock.com

Code to apply the techniques discussed here may be found in the R blotter, quantstrat, PerformanceAnalytics, and PortfolioAnalytics packages.

The paper this presentation is largely drawn from is available here: http://goo.gl/na4u5d

Thanks to Jasen Mackie as the primary author of mcsim and txnsim.

All views expressed in this presentation are those of Brian Peterson, and do not necessarily reflect the opinions or policies of DV Trading.

All remaining errors or omissions should be attributed to the author.

Installing an R strategy toolchain

PerformanceAnalytics, PortfolioAnalytics, FinancialInstrument, xts, and quantmod are all on CRAN.

Development for these packages is on github.

Install devtools

If on Windows, install Rtools

Install PerformanceAnalytics and FinancialInstrument using install.packages

install.packages('FinancialInstrument')
install.packages('PerformanceAnalytics')

Install current versions of xts, quantmod, blotter, and quantstrat from github:

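library(devtools)                        # provides install_github(); errors if devtools is missing
install_github('joshuaulrich/xts')       # install xts and quantmod before the packages that depend on them
install_github('joshuaulrich/quantmod')
install_github('braverock/blotter')      # blotter must be installed before quantstrat
install_github('braverock/quantstrat')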

Resources

Bailey, David H, Jonathan M Borwein, Marcos López de Prado, and Qiji Jim Zhu. 2014. “The Probability of Backtest Overfitting.” http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2326253.

Fitschen, Keith. 2013. Building Reliable Trading Systems. John Wiley & Sons, Inc.

Hansen, Peter R. 2005. “A Test for Superior Predictive Ability.” Journal of Business and Economic Statistics.

Harvey, Campbell R., and Yan Liu. 2015. “Backtesting.” SSRN. http://ssrn.com/abstract=2345489.

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition. Springer.

Pardo, Robert. 2008. The Evaluation and Optimization of Trading Strategies. Second ed. John Wiley & Sons.

Racine, Jeffrey S, and Christopher F Parmeter. 2012. “Data-Driven Model Evaluation: A Test for Revealed Performance.” McMaster University. http://socserv.mcmaster.ca/econ/rsrch/papers/archive/2012-13.pdf.

Tomasini, Emilio, and Urban Jaekle. 2009. Trading Systems: A New Approach to System Development and Portfolio Optimisation.

Tukey, John W. 1962. “The Future of Data Analysis.” The Annals of Mathematical Statistics. JSTOR, 1–67. http://projecteuclid.org/euclid.aoms/1177704711.

White, Halbert L. 2000. “System and Method for Testing Prediction Models and/or Entities.” Google Patents. http://www.google.com/patents/US6088676.