Advanced model building for trading the S&P 500

Support vector machine (SVM) models are closely related to neural network models. In short, they construct a hyperplane in an n-dimensional space that separates data into different classes. This analysis isn’t for the uninitiated, but for those who have done their homework, it can be used to develop an S&P 500 model that outperforms the market (see “Breaking new ground with neural nets,” February 2013, for the general SVM concept).

Our first step is simple: Determine what, exactly, we are modeling. For us, that’s the S&P 500 cash index. Next, we consider the time frame — not as straightforward as you may think. Although an intermediate-term trader might gravitate toward daily bars and a day-trader might assume 15-minute bars are appropriate, noise is a factor. Because of this, longer-term periods have an advantage. (They also make fundamental inputs viable; employment data is relevant on a weekly basis, but hardly significant over a few minutes.) We’ll use weekly data for this reason.

Next, we must identify our target, because the choice of target determines which independent variables are relevant. We won’t attempt to predict the actual price level. Instead, we’ll identify a metric that reflects forward momentum of the S&P 500.

Determining which independent variables are predictive is tricky. For our model, our variables include weekly earnings of the stocks in the S&P 500 (from Pinnacle Data Corp.), simulated M3 money supply (from Shadow Stats), unemployment and other key measures. Here’s a rundown:

  • S&P 500 close
  • S&P 500 earnings
  • S&P 500 dividends
  • Dow Jones bond index
  • Long bond rates
  • Three-month commercial paper rates
  • The consumer price index (CPI)
  • Measures of money supply (M1, M2 and M3)
  • S&P 500 commitment of traders data
  • The yield curve
  • Difference between CPI and producer price index
  • Gross domestic product

There is nothing magic or particularly secret about these numbers. All have been discussed routinely in both academic and professional settings. Indeed, by using CPI, PPI, M1, 10- vs. 2-year Treasury rates and gross domestic product — and linearly interpolating the data to a monthly basis — we can model S&P 500 price levels accurately.

One tool for developing independent inputs for our models is a simple scatter chart. Because we are building a relatively long-term model, we need to normalize our data so that long-term trends in earnings and stock prices do not become an issue. First, we must develop a simple normalized momentum indicator: 
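A minimal Python sketch of one such indicator follows. It assumes the form described later for the SVM inputs, the log of the ratio of the current value to its moving average. The function names and sample prices are illustrative only, and the code is not taken from the tools used for this article.

```python
import math

def moving_average(values, length):
    """Simple moving average; None until `length` values are available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= length:
            running -= values[i - length]  # drop the value leaving the window
        out.append(running / length if i >= length - 1 else None)
    return out

def normalized_momentum(values, length=16):
    """log(value / moving average): measured relative to the recent average,
    so a long-term rise in the level of prices or earnings drops out."""
    averages = moving_average(values, length)
    return [None if avg is None else math.log(v / avg)
            for v, avg in zip(values, averages)]

# Made-up prices, for illustration only.
closes = [100.0, 102.0, 101.0, 105.0, 107.0]
mom = normalized_momentum(closes, length=3)
```

Because the value is measured against its own recent average, a series at 100 and a series at 1,500 produce readings on the same scale, which is what lets us compare decades of data in one scatter chart.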

Now, let’s compare our data in a scatter chart. (We multiply our momentum by 100 so that the scales are easier to read.) First, we’ll plot the current S&P 500 momentum against the S&P 500 momentum one week into the future (see “Moved by momentum,” below). We will use 16 weeks as the moving average length for both. There’s a clear linear relationship between the current value of the momentum indicator and the future value of the S&P 500 indicator. It’s not perfect (the data are relatively spread out), but the relationship is there.

“Driven by earnings” (below) shows the S&P 500 weekly momentum and our weekly earnings momentum in a scatter chart. Again, we are showing the relationship between the current value of the indicator and the future value of the S&P 500 momentum gauge. As we would expect, the data spreads and relationships are similar, and both show a linear pattern.

Testing our models

We can use our inputs to create simple stop-and-reverse trading systems. Our systems also will allow us to further analyze the predictive nature of our inputs, as well as analyze how given inputs interact with one another. 
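A stop-and-reverse system is always in the market: it flips from long to short, or back, whenever its signal changes side. The Python sketch below shows only those mechanics. The zero-threshold rule is a hypothetical placeholder rather than the optimized rules tested in this article, and the prices and indicator values are made up.

```python
def stop_and_reverse(closes, indicator, threshold=0.0):
    """Always-in-the-market backtest, P&L in index points.
    Long (+1) when indicator > threshold, short (-1) when below it.
    The threshold rule is an illustrative assumption."""
    position = 0          # 0 until the first signal, then +1 or -1
    entry_price = None
    trades = []           # closed-trade results in points
    for price, value in zip(closes, indicator):
        if value is None:
            continue      # indicator not yet available
        if value > threshold:
            desired = 1
        elif value < threshold:
            desired = -1
        else:
            desired = position
        if desired != position and desired != 0:
            if position != 0:
                # Close the old trade and reverse at the same price.
                trades.append(position * (price - entry_price))
            position, entry_price = desired, price
    open_pnl = position * (closes[-1] - entry_price) if position else 0.0
    return trades, open_pnl

# Made-up data: long at 110, reversed to short at 115, still short at 105.
closes = [100.0, 110.0, 120.0, 115.0, 105.0]
indicator = [None, 0.5, 0.2, -0.3, -0.1]
trades, open_pnl = stop_and_reverse(closes, indicator)
```

Statistics such as percent profitable and profit factor can be computed from the list of closed trades, while the open trade is reported separately.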

Consider the following set of rules:

These simple rules (which have been optimized) create a model that beats buy and hold. Tested from Jan. 2, 1978 through May 23, 2013, our trades are 75.68% profitable with an average win of $507.02 and an average loss of $258.19. Our maximum intraday drawdown is $485.62, and the profit factor is a whopping 5.32. The total net profit over the period is $2,401.49. The system makes money on both the long and short sides of the market. The implication of these results, even though they are not walk-forward, is that earnings momentum is predictive of future S&P 500 price movement.

This variant is based on dividends:

It does not do as well as the earnings momentum version, but it still beats buy and hold. We make $1,427.80 on the long side with an open trade of $508.80. We also make $369.31 on the short side. This shows that dividend momentum is indeed predictive as well. 

We also can use a variation of our momentum indicator that replaces the moving average with the log of the ratio of the value vs. the value N bars ago:
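As a minimal Python sketch of this variation (the function name and sample values are illustrative assumptions):

```python
import math

def log_ratio_momentum(values, lookback):
    """log(value now / value `lookback` bars ago); None until enough history."""
    return [None if i < lookback else math.log(values[i] / values[i - lookback])
            for i in range(len(values))]

# Made-up values: a steady 10% rise per bar gives a constant reading.
series = [100.0, 110.0, 121.0]
lr = log_ratio_momentum(series, lookback=1)
```

A log ratio treats equal percentage moves equally, whatever the level of the series.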

Let’s use this variation to further develop an S&P 500 weekly model. These will be the three primary independent variables:

  • S&P 500 weekly prices 
  • S&P 500 weekly earnings
  • S&P 500 weekly dividends

Our first step is to select the precise inputs. We need to make sure our inputs are predictive while keeping in mind that we cannot predict our target perfectly. Out-of-sample correlations of 0.25 to 0.50 would be within expectations. Our target is the S&P 500 momentum indicator shifted forward: 

We will shift our future S&P 500 close from zero bars to 10 bars into the future. Depending on the moving average length, our output will not use completely future data. For example, if we shift two bars into the future, and we use a 10-bar moving average, then we are using eight past values for our moving average. Let’s now see how our target, ranging from a zero-bar forward shift to a 10-bar forward shift, compares as a trading system. Results are shown in “Zero to 10” (left).
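The alignment can be sketched in Python as follows. The helper names are hypothetical. The second function simply restates the arithmetic of the example above: a 10-bar average ending two bars ahead still contains eight values that are already known.

```python
def shift_forward(series, bars):
    """Pair each bar with the value `bars` ahead.
    The last `bars` rows have no target yet, so they are None."""
    if bars == 0:
        return list(series)
    return list(series[bars:]) + [None] * bars

def past_bars_in_window(ma_length, shift):
    """Number of already-known bars in an `ma_length`-bar average
    whose window ends `shift` bars in the future."""
    return max(ma_length - shift, 0)

# Made-up momentum readings.
momentum = [0.01, 0.02, -0.01, 0.03]
target = shift_forward(momentum, 1)       # next bar's momentum
known = past_bars_in_window(10, 2)        # the example from the text
```

Rows whose target is None sit at the end of the data and must be left out of any training or correlation study.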

The results show that predicting only one, two or three bars into the future helps our performance. In fact, predicting nine or 10 bars into the future makes about one-third as much as a 10-bar average shifted just one bar into the future. 

Let’s now look at a more direct target: N-day momentum based on prices shifted Y bars into the future. Future values are represented by negative numbers, so -1 is one bar in the future and 1 represents a one-bar lag (see “Testing perfection,” below). When the next day’s close is higher than the current day’s, we buy at the next day’s open. This is why we have some losing trades even for a perfect, one-day-ahead indicator. We see that the maximum profit is a bit over 24,000 points. However, we also can see that many combinations are in the 10,000 to 12,000 range. 

We’ve considered a large number of inputs, and more are viable. Volatility can be quite predictive of price movement. However, when using SVM (and most modeling approaches, for that matter), it’s important to keep our number of inputs as small as possible. Conventional wisdom also suggests that inputs should not be correlated among themselves. For example, since 1943, earnings and dividends have had a correlation coefficient of 0.94; this is exceptionally high. Rather than support the conventional wisdom, however, some research shows that models actually can perform better on out-of-sample data with correlated inputs. 
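Checking how strongly two candidate inputs move together takes only a few lines. The Python sketch below computes a plain Pearson correlation coefficient. The numbers are made up for illustration and are not the earnings and dividend series behind the 0.94 figure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only.
input_a = [1.0, 2.0, 3.0, 4.0]
input_b = [2.1, 3.9, 6.2, 7.8]
r = pearson(input_a, input_b)
```

The same function can be used to compare a model's predictions with the realized target when judging out-of-sample correlation.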

Although space prohibits a deeper look at the development of these larger models, we can examine core code elements from the tools used to build them (TradersStudio and the NeuralStudio add-in) to demonstrate key stages. Our code supports four different inputs. They are the S&P 500 cash close, earnings, dividends and the cash long bond rate. Here are the variable definitions, along with momentum calculations using a log of the ratio of the current value divided by a moving average of that variable:


Next, we must code the training of our models. As part of this process, we sample the variables and store them for future analysis. We then shift our examination period backward by “LookAhead” bars. This way the current value of momentum appears like a future value. The following code is used to set up which bars to train:

We do not train until we reach a given number of bars. In this case, that number is 1,000 bars, and we retrain every 200 bars using the most recent 1,000 bars.
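The schedule and the look-ahead shift can be sketched in Python as follows. This is an illustration under our own naming, not the TradersStudio code; `look_ahead` stands in for the “LookAhead” setting, and the tiny sample series are made up.

```python
def training_windows(total_bars, train_size=1000, retrain_every=200):
    """Yield (train_start, train_end, apply_end) index triples.
    End indexes are exclusive. Nothing is yielded before `train_size` bars."""
    start = train_size
    while start < total_bars:
        yield (start - train_size, start,
               min(start + retrain_every, total_bars))
        start += retrain_every

def training_pairs(inputs, momentum, look_ahead):
    """Pair the inputs from `look_ahead` bars ago with today's momentum,
    so that today's known value plays the role of the 'future' target."""
    return [(inputs[t - look_ahead], momentum[t])
            for t in range(look_ahead, len(momentum))]

windows = list(training_windows(1450))
pairs = training_pairs([10, 11, 12, 13], [0.1, 0.2, 0.3, 0.4], look_ahead=2)
```

With 1,450 bars, the model is first trained on bars 0 to 999 and used for bars 1,000 to 1,199, then retrained on bars 200 to 1,199, and so on. Each prediction is therefore made by a model that has never seen the bar it is predicting.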

We’re still not ready to pass the data to the SVM. First, it must be normalized. To train the SVM, each of the inputs X and the output Y are independently scaled and offset so that over the variation in the training set each input and output feature varies around zero and has a typical variation of zero to 1. For example, for the output value Y, the scaling factors A and B are chosen so that the scaled array YS is properly scaled before it is passed to the support vector algorithms.

Once trained, and presented with a scaled set of input features, the output of the support vector machine YP is reverse-scaled before being presented as the predicted value P.
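One common way to do this, sketched below in Python, is a linear scaling YS = A × Y + B that is undone with P = (YP − B) / A. Choosing A and B to give zero mean and unit standard deviation is our assumption for illustration; the exact factors used by the add-in may differ.

```python
import statistics

def fit_scaler(values):
    """Return (A, B) so that A*v + B has zero mean and unit standard
    deviation over `values` (an assumed choice of scaling)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    a = 1.0 / std if std > 0 else 1.0   # guard against a constant series
    return a, -mean * a

def scale(values, a, b):
    """Forward scaling applied before training: YS = A*Y + B."""
    return [a * v + b for v in values]

def unscale(values, a, b):
    """Reverse scaling applied to model output: P = (YP - B) / A."""
    return [(v - b) / a for v in values]

# Made-up target values.
y = [2.0, 4.0, 6.0]
a, b = fit_scaler(y)
ys = scale(y, a, b)
restored = unscale(ys, a, b)
```

The factors should be fitted on the training window only and then reused, unchanged, on the bars being predicted.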

A critical step is the optimization of C and gamma. These values are optimized and used to calculate the models. Models then are scored and the best ones are selected. Once we find the best model and make it the active model, we generate a prediction. A mean squared error calculation is effective for determining which model is best.
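The selection loop can be sketched in Python as a simple grid search. Here `train_fn` is a hypothetical hook for whatever SVM trainer is in use, and `toy_trainer` is a stand-in that is not an SVM at all; it exists only so the loop can be run. The grids and data are made up.

```python
from itertools import product

def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def select_best_model(train_fn, c_values, gamma_values,
                      x_train, y_train, x_val, y_val):
    """Train one model per (C, gamma) pair and keep the lowest-MSE model.
    `train_fn(x, y, c, gamma)` must return a function mapping inputs to a
    prediction; it is an assumed interface, not a real library call."""
    best = None
    for c, gamma in product(c_values, gamma_values):
        predict = train_fn(x_train, y_train, c, gamma)
        error = mean_squared_error(y_val, [predict(x) for x in x_val])
        if best is None or error < best["mse"]:
            best = {"mse": error, "C": c, "gamma": gamma, "predict": predict}
    return best

def toy_trainer(x_train, y_train, c, gamma):
    """Stand-in only, NOT an SVM: always predicts the constant c * gamma."""
    return lambda x: c * gamma

best = select_best_model(toy_trainer, [1.0, 2.0], [0.5, 1.0],
                         [], [], [[0.0], [0.0]], [2.0, 2.0])
# Trade on the sign of the active model's prediction.
signal = 1 if best["predict"]([0.0]) > 0 else -1
```

Scoring on bars held out from training, rather than on the training bars themselves, keeps the search from rewarding the pair of values that merely memorizes the data best.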

The code above integrates a simple rule based on our predicted value. If our prediction is above zero, we buy. If it’s below zero, we sell. 

We can try many different combinations of inputs and also various transforms. For example, we might take a ratio of two variables. We can look at the ratio of earnings momentum and dividend momentum but only include the earnings momentum as a variable. This could limit collinearity and allow us to extract valuable information from correlated variables. We also could try predicting derived variables, such as a corrected signal from simple systems. For example, we could take the signal from the simple rule-based earnings model we discussed earlier, use 1 for long and –1 for short, and correct the signals that produce most of the drawdowns and losing trades. In turn, we could try predicting this new output. 

Using macros, we can develop code that tries the many different combinations of inputs and parameters. A genetic optimizer then can be used to find the best combinations of models from the many raw inputs and pre-processing concepts. Because computer power is now reasonably inexpensive, with quad-core machines commonplace, everyday machines can test these complex theories. Indeed, we easily can see a new era on the horizon, one where intelligent models are no longer bound by computer power, but by the creativity of the model developers.

Murray A. Ruggiero Jr. is the author of “Cybernetic Trading Strategies” (Wiley). E-mail him at
