We’re still not ready to pass the data to the SVM. First, it must be normalized. To train the SVM, each of the inputs X and the output Y is independently scaled and offset so that, over the training set, each input and output feature varies around zero with a typical variation of about 1. For example, for the output value Y, the scaling factors A and B are chosen so that the scaled array YS is centered near zero, with a spread of about 1, before it is passed to the support vector algorithms.
Once the support vector machine is trained and presented with a scaled set of input features, its output YP is reverse-scaled before being presented as the predicted value P.
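The scaling and reverse-scaling steps can be sketched in a few lines of Python. This is only an illustration: the text names the factors A and B but does not give their formula, so the sketch assumes a linear form, YS = A * Y + B, with A and B chosen to give the training values a mean of zero and a standard deviation of 1. The function names and sample values are hypothetical.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Return (a, b) so that a * v + b has mean 0 and standard deviation 1.

    Assumption: a plain z-score; the article does not specify the formula.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot scale a constant series")
    return 1.0 / sigma, -mu / sigma

def scale(values, a, b):
    """Apply the linear scaling to raw values."""
    return [a * v + b for v in values]

def unscale(scaled, a, b):
    """Invert the scaling, e.g. to turn a model output YP into a prediction P."""
    return [(s - b) / a for s in scaled]

# Illustrative values only, not market data.
y = [2.0, 4.0, 6.0, 8.0]
a, b = fit_scaler(y)
ys = scale(y, a, b)      # what the SVM would be trained on
p = unscale(ys, a, b)    # reverse-scaling recovers the original units
```

The same factors fitted on the training set would be reused when scaling new inputs and when reverse-scaling the model output, so that predictions come back in the original units.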
A critical step is the optimization of C and gamma. Candidate values of these two parameters are used to calculate models, the models are scored, and the best ones are selected. A mean square error calculation is an effective way to determine which model is best. Once we find the best model and make it the active model, we generate a prediction.
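One simple way to carry out this search is an exhaustive grid over candidate values of C and gamma, ranking each pair by mean square error on held-out data. The sketch below assumes that approach. The text does not say which search method is used, and `fit_predict` is a hypothetical stand-in for whatever routine trains the SVM and returns its predictions.

```python
from itertools import product

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def grid_search(fit_predict, x_train, y_train, x_valid, y_valid,
                c_values, gamma_values):
    """Score every (C, gamma) pair and return results sorted best-first.

    fit_predict is a hypothetical callable supplied by the caller:
        fit_predict(x_train, y_train, x_valid, c, gamma) -> list of predictions
    Each result is a tuple (mse, c, gamma).
    """
    results = []
    for c, gamma in product(c_values, gamma_values):
        predicted = fit_predict(x_train, y_train, x_valid, c, gamma)
        results.append((mean_squared_error(y_valid, predicted), c, gamma))
    results.sort(key=lambda r: r[0])  # lowest error first
    return results
```

The first entry of the returned list would be the candidate to promote to the active model. The scoring data should be kept separate from the training data, because a model scored on the data it was trained on will look better than it is.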
The code above integrates a simple rule based on our predicted value. If our prediction is above zero, we buy. If it’s below zero, we sell.
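As a separate illustration in Python, not the article's own code, the rule can be written as a function that maps a prediction to a position. The text does not say what happens when the prediction is exactly zero; the sketch assumes no position in that case.

```python
def signal_from_prediction(prediction):
    """Map a predicted value to a position: 1 = buy, -1 = sell."""
    if prediction > 0:
        return 1
    if prediction < 0:
        return -1
    return 0  # assumption: stay flat when the prediction is exactly zero

# Illustrative predictions only.
signals = [signal_from_prediction(p) for p in [0.8, -0.2, 0.0]]
```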
We can try many different combinations of inputs and also various transforms. For example, we might take a ratio of two variables. We can feed the model the ratio of earnings momentum to dividend momentum while including only earnings momentum as a separate variable. This could limit collinearity and allow us to extract valuable information from correlated variables. We also could try predicting derived variables, such as a corrected signal from simple systems. For example, we could take the signal from the simple rule-based earnings model we discussed earlier, use 1 for long and –1 for short, and correct the signals that produce most of the drawdowns and losing trades. In turn, we could try predicting this new output.
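A ratio transform of this kind might look like the following Python sketch. The series names and values are hypothetical, and returning `None` where the denominator is effectively zero is an assumption about how to handle undefined ratios, not something the text specifies.

```python
def ratio_feature(numerator, denominator, eps=1e-12):
    """Element-wise ratio of two series; None where the denominator is ~0."""
    out = []
    for n, d in zip(numerator, denominator):
        out.append(n / d if abs(d) > eps else None)
    return out

# Illustrative values only, not market data.
earnings_momentum = [0.04, 0.02, -0.01]
dividend_momentum = [0.02, 0.00, 0.01]

ratio = ratio_feature(earnings_momentum, dividend_momentum)

# Inputs for the model: the ratio plus earnings momentum alone, leaving
# dividend momentum out as a separate variable.
inputs = list(zip(ratio, earnings_momentum))
```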
Using macros, we can develop code that tries many different combinations of inputs and parameters. A genetic optimizer can then develop the best combinations of models from the many raw inputs and pre-processing concepts. Because computer power is now reasonably inexpensive, with quad-core machines commonplace, everyday machines can test out these complex theories. Indeed, we can easily see a new era on the horizon, one where intelligent models are no longer bound by computer power but by the creativity of the model developers.
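To give a sense of how many combinations such a search covers, the Python sketch below enumerates every subset of a list of candidate inputs. It uses plain exhaustive enumeration as a simple stand-in for the macro-driven search and is not a genetic optimizer. The input names are hypothetical.

```python
from itertools import combinations

def candidate_input_sets(names, min_size=1, max_size=None):
    """Yield every combination of input names between min_size and max_size."""
    max_size = len(names) if max_size is None else max_size
    for size in range(min_size, max_size + 1):
        for combo in combinations(names, size):
            yield combo

# Hypothetical input names for illustration.
names = ["earnings_momentum", "dividend_momentum", "momentum_ratio"]
input_sets = list(candidate_input_sets(names, min_size=1, max_size=2))
```

The number of subsets roughly doubles with each added input, so exhaustive enumeration becomes impractical quickly. That growth is one reason to turn to a guided search such as a genetic optimizer once the list of raw inputs gets long.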
Murray A. Ruggiero Jr. is the author of “Cybernetic Trading Strategies” (Wiley). E-mail him at email@example.com.