When you set out to build a trading system, there are some fundamental questions that you must ask before you get to the step of entry rule development. If these fundamental questions aren’t answered clearly and definitively, then you are sure to have problems.
For example, does the system work best on one market or a basket of markets? If you go with a single-market system, you have to be far more careful about curve-fitting and need a stronger premise, such as using silver as a proxy for inflation to forecast changes in Treasuries. Another question: if you are developing for a basket, do you use the same parameters for all markets or vary them by market? As is often the case in system development, the answer is “it depends.”
THE BIG QUESTION
The most important decision to make is whether you are going to use out-of-sample testing to validate your program. This question is two-sided, and not as simple as assuming you should always do out-of-sample testing.
If 10% of a data series is held out of sample, a good system might be thrown out because its results in the out-of-sample period are poor, even though that number of cases is too small to prove anything. This problem is most evident in long-term systems, which generate relatively few trades. However, if trading systems are developed on diversified baskets of markets, and care is taken in the analysis to avoid curve-fitting, it may not be necessary to rely on out-of-sample testing.
Another solution to this problem is using a method called walk-forward testing. Walk-forward testing works like this. Assume you have 10 years of data — say 1995 through 2004 — for the markets you want to trade. Your trading strategy needs a minimum of three years of data for testing and optimization. Start by developing and optimizing your system using only the first three years of data, 1995-97. On these three years of data, try as many ideas as you like and optimize parameters in as many ways as you can. It is critical to keep any and all data after 1997 untouched. When you are done, record the rules for your system and the optimum parameters.
Now, use these rules and settings to test the system on the first month of 1998. After you record the out-of-sample results, slide the three-year data window forward to include the first month of 1998. Repeat the analysis, including optimization, and record the rules and optimized parameters. Use these parameters to test on the second month of 1998. Continue walking forward and optimizing the three-year data periods. When the data run out in 2004, test the system for the entire period from 1998 to 2004. Switch the rules and parameters each month to use the ones that you found and recorded.
The system performance for these seven out-of-sample years (84 out-of-sample months) is a much better indication of how a system will perform in real-time than the performance of any single period used for optimization. There is nothing magic about the assumed periods — three years for system development and one month for the walk-forward interval. In many ways, these parameters can be adjusted to best fit your own logistical situation.
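The walk-forward schedule described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the names `add_months` and `walk_forward_windows` are hypothetical, and the sketch only generates the in-sample and out-of-sample date windows, leaving the optimization and testing steps to whatever platform you use.

```python
from datetime import date

def add_months(d, months):
    """Return the first day of the month that is `months` after date d."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)

def walk_forward_windows(start, end, in_sample_months=36, step_months=1):
    """Yield (in_sample_start, in_sample_end, oos_start, oos_end) tuples.

    End dates are exclusive; `end` is the first day after the last data month.
    """
    in_start = start
    while True:
        in_end = add_months(in_start, in_sample_months)
        oos_end = add_months(in_end, step_months)
        if oos_end > end:
            break
        yield in_start, in_end, in_end, oos_end
        # Slide the whole in-sample window forward by one step.
        in_start = add_months(in_start, step_months)

# Data from 1995 through 2004, three years in sample, one month out of sample.
windows = list(walk_forward_windows(date(1995, 1, 1), date(2005, 1, 1)))
print(len(windows), "out-of-sample months")
print("first window:", windows[0])
print("last window: ", windows[-1])
```

With the periods used in the text, the schedule contains 84 out-of-sample months, the first being January 1998. Changing `in_sample_months` or `step_months` adapts the schedule to other development and walk-forward intervals.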
Finally, if the results of the out-of-sample months look good, continue the walk-forward process in real-time. A good rule of thumb would be to paper trade the system for the length of time you are using for your in-sample optimization: three months in our case. Then, when that period is up, you can optimize on those three months and have the values to use with real money, assuming the system still meets your risk-tolerance and profitability standards.
OPTIMIZATION DO’s AND DON’Ts
Optimization is a necessary evil in trading system development. No matter how sound a trading concept is, there will be sets of parameters that work better for a given market or basket of markets. Optimization is not bad in its own right. Used correctly, optimization can be very powerful. The key is understanding how to pick the best set of parameters and gauging how robust a system will be in actual trading by studying the optimization process.
We can get a better understanding of optimization using a trend-following basket system and comparing a dual-moving-average crossover system to a triple-moving-average system. The analysis starts by selecting the markets that will be used. In this case study, we will use a basket of 20 markets including major currencies, softs, grains, metals and energies.
We will test both systems on this basket of markets and show how to select the best set of parameters. We will use two simple systems. The code for the dual-moving-average system is shown in “Two averages” (below). In the testing, we will deduct $25 for commissions and $75 for slippage. The testing period will run from Jan. 1, 1980 to Sept. 15, 2006.
Forget for a moment that we are talking about moving-average-crossover systems. Let’s assume that we have developed a new system and want to see how robust it is across a wide range of parameters. A good first quick test is to do the following:
For all sets of valid parameters, calculate the average net profit and the standard deviation of net profit. Then, calculate the average net profit minus one, two and three standard deviations. Also run this calculation using the drawdown column. This can be done for the results of the combined long and short tests. Here are the results for this system:
This test gives an idea of the consistency of profits through the optimization period. The numbers could be adjusted by standard error for the number of cases in the sample, but this is a good first estimate as long as the number of cases is reasonable.
The average profit was about $554,000 with a standard deviation (STD) of more than 60% of the average profit. This is not a good number: at two times the standard deviation, the system loses money. Ideally, the system should still be profitable at the average profit minus three times the standard deviation of those profits.
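The quick test above is simple enough to sketch in Python. The function name `robustness_table` and the profit figures below are hypothetical and are not the results from the article's optimization run. The sketch assumes one net profit figure per valid parameter set.

```python
from statistics import mean, pstdev

def robustness_table(values, multiples=(1, 2, 3)):
    """Return the average, the standard deviation and the average minus k deviations."""
    avg = mean(values)
    std = pstdev(values)  # population form; statistics.stdev gives the sample form
    table = {"average": avg, "std": std}
    for k in multiples:
        table[f"avg_minus_{k}_std"] = avg - k * std
    return table

# Hypothetical net profits in dollars, one per valid parameter set.
net_profits = [150_000, 420_000, 610_000, 880_000, 300_000, 720_000]
table = robustness_table(net_profits)
for name, value in table.items():
    print(f"{name:>18}: {value:>12,.0f}")
```

The same function can be applied to the drawdown column by passing the drawdown figures instead of the net profits.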
The profits at the portfolio level are shown in “Across the board” (below). These use the 20/65 set of parameters to see how this system did on a market-by-market basis. This set of parameters was one of the better performing sets in the optimization test.
The system was profitable in most markets. There are some markets where its performance was better, but the overall profits are not due to one or two outlier markets.
The market-by-market analysis looks good, but we need to look at more than just the end numbers of one parameter combination. It’s also important to look at the results of parameter values next to those that show the strongest performance. Unfortunately for this system, examining the smoothness of the optimization space reveals problems with the dual-moving-average system. With a short-term average of 20 and a long-term average of 60 to 80, the results are strong. Beyond that range, the results fall off sharply. This shows the danger of this system if the market shifts.
The problems revealed by this test are related to the fact that the math behind the dual-moving-average crossover system is flawed. Assuming that a market moves in perfect sine waves, a moving-average crossover system using a short-term moving average length that is 50% of the dominant cycle and a longer-term moving average length that is 100% of the dominant cycle would buy every top and sell every bottom. The system tested here does not behave that badly, but the example illustrates the fundamental flaw in this approach and shows that a premise is sometimes not what you expect. Even if we did not know this, we would be able to see it using this simple test.
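The sine-wave argument can be checked numerically. The sketch below assumes simple (unweighted) moving averages, a 40-bar cycle and a price series that is a pure sine wave. These choices and the function names are illustrative and are not taken from the article's test code.

```python
import math

def sma(values, length, t):
    """Simple moving average of the `length` bars ending at index t."""
    window = values[t - length + 1 : t + 1]
    return sum(window) / length

def crossover_entry_prices(cycle=40, n_cycles=5):
    """Return the prices at which a 50%/100%-of-cycle crossover buys and sells."""
    prices = [math.sin(2 * math.pi * t / cycle) for t in range(cycle * n_cycles)]
    short_len, long_len = cycle // 2, cycle
    buys, sells = [], []
    prev_diff = None
    for t in range(long_len - 1, len(prices)):
        diff = sma(prices, short_len, t) - sma(prices, long_len, t)
        if prev_diff is not None:
            if prev_diff <= 0 < diff:    # short average crosses above long: buy
                buys.append(prices[t])
            elif prev_diff >= 0 > diff:  # short average crosses below long: sell
                sells.append(prices[t])
        prev_diff = diff
    return buys, sells

buys, sells = crossover_entry_prices()
print("buy prices: ", [round(p, 3) for p in buys])
print("sell prices:", [round(p, 3) for p in sells])
```

Under these assumptions, the long average is flat because it spans a full cycle, and the short average lags price by about a quarter cycle. Each buy therefore lands on a cycle peak and each sell on a cycle trough.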
The results for the triple-moving-average system look much better than those for the dual-moving-average system. Even at three times the standard deviation, the profit measure is still more than $375,000, vs. a loss of almost $500,000 for the dual-moving-average example. Maximum drawdown is only $222,000, compared with more than $620,000.
These results suggest that the triple-moving-average crossover system is more robust than the dual-moving-average crossover system. Further, “Behind the results” (right) shows that profits are evenly dispersed across the markets, with natural gas accounting for 15% of the total. Because the triple-moving-average system holds up over a wide range of parameters, we will continue our analysis on this system.
IS TIME ON YOUR SIDE?
This set of parameters has made money in every year since 1980 except 1999 and 2003. In addition, you want to look at the change in equity over a moving window. Good window lengths are 128 or 254 days. You want to see this change stay above zero. Ideally, the curve should also be sloping up in the most recent periods.
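A moving window of equity change is straightforward to compute from a daily equity curve. The sketch below is one possible reading of the check, taking the change as the equity today minus the equity one window ago. The function name and the equity curve are hypothetical.

```python
def rolling_equity_change(equity, window=128):
    """Change in equity over each `window`-day span: equity[t] - equity[t - window]."""
    return [equity[t] - equity[t - window] for t in range(window, len(equity))]

# Hypothetical daily equity curve: steady gains with a losing stretch in the middle.
equity = [0.0]
for day in range(1, 600):
    daily_pnl = -400.0 if 250 <= day < 330 else 250.0
    equity.append(equity[-1] + daily_pnl)

changes = rolling_equity_change(equity, window=128)
windows_below_zero = sum(1 for c in changes if c < 0)
print(f"windows below zero: {windows_below_zero} of {len(changes)}")
print(f"latest 128-day change: {changes[-1]:,.0f}")
```

Passing `window=254` gives the longer of the two windows mentioned in the text.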
You’ll also want to dig deeper into the performance of nearby parameter values. The optimization results for the triple-moving-average crossover system are shown in “Smooth results” (below).
The parameter set of five, 20 and 75 is the best set, and many nearby parameter sets produce similar results. A short-term period of five appears many times in this top grouping, as do an intermediate-term period of 20 and longer-term periods of 65 to 80. However, the top set of parameters may not be the best one to use going forward. Often, the third or fourth best set of parameters turns out to be the best one to use.
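One way to put numbers on "nearby parameter values" is to score each parameter set by the average result of the set and its adjacent grid neighbors. This is a sketch of that idea and not the method used in the article: the function name, the two-parameter grid and the profit figures are all hypothetical.

```python
from statistics import mean

def neighborhood_scores(results, steps):
    """Average net profit of each parameter set together with its grid neighbors.

    results: dict mapping a parameter tuple to net profit.
    steps:   grid spacing for each parameter, in the same order as the tuple.
    """
    scores = {}
    for params in results:
        nearby = [
            profit for other, profit in results.items()
            if all(abs(a - b) <= step for a, b, step in zip(params, other, steps))
        ]
        scores[params] = mean(nearby)  # the set itself is included
    return scores

# Hypothetical results: (intermediate length, long length) -> net profit in dollars.
results = {
    (15, 65): 440_000, (15, 70): 460_000, (15, 75): 430_000,
    (20, 65): 450_000, (20, 70): 470_000, (20, 75): 300_000,
    (25, 65): 250_000, (25, 70): 150_000, (25, 75): 620_000,
}
scores = neighborhood_scores(results, steps=(5, 5))
best_single = max(results, key=results.get)
best_neighborhood = max(scores, key=scores.get)
print("highest single result:       ", best_single)
print("highest neighborhood average:", best_neighborhood)
```

In this made-up grid, the highest single result is an isolated spike surrounded by weak results, while the highest neighborhood average sits on a plateau of similar values. Sets on the edge of the grid have fewer neighbors, so their averages are based on fewer results.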
Once you have a trading methodology and an initial set of parameters, it is time to fine-tune your approach. It is at this stage that you might learn how to improve the system with money management stops or profit targets. It’s also important to learn more about your system’s characteristics, because if those change going forward, it is a danger signal that the system might blow up, even if it is making money at the time.
Because we have developed and tested a trend-following system, if the average winning trade suddenly becomes the same size as our average losing trade, it would be cause for worry. Trend-following systems make money because the average winning trade is larger than the average losing trade, not from a number of smaller winners or a large winning percentage.
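The characteristic described above, average winner versus average loser, can be tracked with a small helper. The function name and the trade lists below are hypothetical and serve only to show the kind of shift that would be a warning sign.

```python
def win_loss_profile(trade_profits):
    """Return (winning percentage, average win, average loss, win/loss ratio)."""
    wins = [p for p in trade_profits if p > 0]
    losses = [-p for p in trade_profits if p < 0]
    if not wins or not losses:
        raise ValueError("need at least one winning and one losing trade")
    avg_win = sum(wins) / len(wins)
    avg_loss = sum(losses) / len(losses)
    return len(wins) / len(trade_profits), avg_win, avg_loss, avg_win / avg_loss

# Hypothetical closed-trade profits in dollars.
historical = [-900, 4200, -1100, -800, 6100, -1000, -950, 3800, -1200, -1050]
recent = [-1000, 1100, -950, 1050, -1100, 900, -1000, 1000]

for label, trades in (("historical", historical), ("recent", recent)):
    win_pct, avg_win, avg_loss, ratio = win_loss_profile(trades)
    print(f"{label}: win% {win_pct:.0%}, avg win {avg_win:,.0f}, "
          f"avg loss {avg_loss:,.0f}, ratio {ratio:.2f}")
```

In this example the recent trades win more often than the historical ones, yet the ratio of average win to average loss has collapsed to about one, which is the change the text says should cause worry in a trend-following system.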
You also want to look more closely at drawdown figures. Don’t rely simply on the maximum drawdown. Consider something called “start trade drawdown” (see “Downside review,” below). For example, even though the maximum drawdown for the system is $120,000, there is better than a 92% chance that the drawdown will be less than $55,000. In evaluating drawdowns, this chart is much more valuable than simply looking at maximum drawdown. The reason is that most maximum drawdowns are caused by extreme market shocks, such as September 11, which are not typical and are impossible to predict.
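The article does not spell out how start trade drawdown is calculated, so the sketch below rests on an assumption: each trade is treated in turn as the point where trading begins, and the drawdown recorded for it is the deepest dip of cumulative profit below that starting equity. The function names and trade data are hypothetical.

```python
def start_trade_drawdowns(trade_profits):
    """For each trade taken as a starting point, return the deepest dip of
    cumulative profit below the starting equity (0 if equity never dips).
    Assumed definition; the article does not give the formula."""
    drawdowns = []
    for start in range(len(trade_profits)):
        equity, deepest = 0.0, 0.0
        for profit in trade_profits[start:]:
            equity += profit
            deepest = max(deepest, -equity)
        drawdowns.append(deepest)
    return drawdowns

def share_below(drawdowns, threshold):
    """Fraction of starting points whose drawdown stayed under `threshold`."""
    return sum(1 for d in drawdowns if d < threshold) / len(drawdowns)

# Hypothetical closed-trade profits in dollars.
trades = [-900, 4200, -1100, -800, 6100, -1000, -950, 3800, -1200, -1050]
drawdowns = start_trade_drawdowns(trades)
print("largest start trade drawdown:", max(drawdowns))
print("share of starts with drawdown under 2,000:", share_below(drawdowns, 2000))
```

Looking at the whole distribution of these values, rather than only the largest one, is what allows a statement of the form "there is better than an X% chance that the drawdown will be less than Y."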
You also need to study scatter charts, such as profit vs. market volatility and adverse excursion vs. final trade profit. Once you have a good understanding of what to expect from your trading system, you will know what to look for during actual trading. Now that we have a better understanding of the tough questions in system development, we have a better idea of how to develop our rules, set the parameters for our rules and fine-tune our rules. In the next installment, we will put all of this understanding together to build a viable system for actual trading.
Murray A. Ruggiero Jr. is a consultant in East Haven, Conn. His firm, Ruggiero Associates, develops market-timing systems. He is editor-in-chief of Inside Advantage Gold Club (www.iagoldclub.com) and author of Cybernetic Trading Strategies (John Wiley & Sons). E-mail: email@example.com.