From the January 01, 2012 issue of Futures Magazine

Back- vs. forward testing: Test twice (or more), trade once

Data considerations

When testing a trading idea, the historical data can be divided into two or more segments to provide more reliable results. The data that is used during the initial testing and optimization is called the in-sample data. The second data set, referred to as the out-of-sample data, is a "clean" data set that is not used until the in-sample backtesting and optimizations have been completed. Because the out-of-sample data has not been used in any of the optimizations, traders can apply the optimized system to this reserved historical data to determine if the two data sets provide similar results.

Before backtesting or optimizing begins, the historical data can be divided into two distinct periods to accommodate in-sample and out-of-sample testing. One method is to divide the data into thirds, reserving one-third for out-of-sample testing and using the remaining two-thirds for in-sample testing and optimization. To keep the out-of-sample data clean, only the in-sample data should be used during optimization.
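The two-thirds/one-third division can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function name `split_in_out_of_sample`, the sample price list, and the choice to reserve the most recent third as out-of-sample data are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def split_in_out_of_sample(
    bars: Sequence[float], out_fraction: float = 1 / 3
) -> Tuple[List[float], List[float]]:
    """Split chronologically ordered data into in-sample and out-of-sample parts.

    Assumption: the most recent `out_fraction` of the data is reserved as
    out-of-sample; the older remainder is used for testing and optimization.
    """
    if not 0 < out_fraction < 1:
        raise ValueError("out_fraction must be between 0 and 1")
    out_count = round(len(bars) * out_fraction)
    split_at = len(bars) - out_count
    return list(bars[:split_at]), list(bars[split_at:])


# Hypothetical data: nine closing prices, oldest first.
prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
in_sample, out_of_sample = split_in_out_of_sample(prices)

print("in-sample:", in_sample)          # used for backtesting and optimization
print("out-of-sample:", out_of_sample)  # left untouched until optimization is done
```

Splitting by position rather than at random keeps each segment a continuous stretch of market history, so the reserved segment can be tested as one uninterrupted period.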

The results of the in- and out-of-sample testing can be evaluated by comparing the performance statistics or by reviewing the corresponding equity curves of the two data sets. Positive correlation exists where the results are similar, which suggests that the system has promise. Negative correlation, where the out-of-sample results are poor compared to the in-sample results, indicates that the system may have been curve-fit to the in-sample data.
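One way to make that comparison concrete is to compute the same handful of performance figures for each data set and print them side by side. The sketch below is an assumption-laden example: the `summarize` helper, the chosen statistics, and the per-trade profit and loss values are all hypothetical and not drawn from any real system.

```python
from typing import Dict, List


def summarize(trade_pnls: List[float]) -> Dict[str, float]:
    """Return basic performance figures for a list of per-trade profit/loss values."""
    if not trade_pnls:
        raise ValueError("need at least one trade")
    equity = 0.0        # running equity curve value, starting from zero
    peak = 0.0          # highest equity reached so far
    max_drawdown = 0.0  # largest peak-to-valley drop in equity
    for pnl in trade_pnls:
        equity += pnl
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, peak - equity)
    wins = sum(1 for pnl in trade_pnls if pnl > 0)
    return {
        "net_profit": equity,
        "avg_trade": equity / len(trade_pnls),
        "win_rate": wins / len(trade_pnls),
        "max_drawdown": max_drawdown,
    }


# Hypothetical per-trade results for each data segment.
in_sample_trades = [100.0, -50.0, 200.0, -50.0, 100.0, 50.0]
out_of_sample_trades = [50.0, -50.0, 100.0]

in_stats = summarize(in_sample_trades)
out_stats = summarize(out_of_sample_trades)

for name in in_stats:
    print(f"{name:>12}: in-sample {in_stats[name]:8.2f} | "
          f"out-of-sample {out_stats[name]:8.2f}")
```

Because the two segments usually differ in length, per-trade figures such as average trade and win rate tend to be easier to compare than raw net profit.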

The stronger the correlation between the in- and out-of-sample testing, the higher the probability that the system will do well in forward performance testing and live trading. "Positive promise" (below) illustrates a strategy that has positive correlation between in- and out-of-sample testing, as well as forward performance testing. There is a good probability that this strategy would perform well in live trading.
