From the July 01, 2006 issue of Futures Magazine

Building trading systems the automatic way

Advances in technology now allow reliable trading systems to be built with evolutionary software that “writes” the rules for the strategies. Such trading system software generators can operate up to 200 times faster than other related algorithms and produce actual trading system code in a variety of languages, including TradeStation’s EasyLanguage.

This is not simply optimizing an existing trading system with a genetic algorithm, nor does it rely on predictive analytics such as neural networks. The central engine is an advanced evolutionary algorithm called genetic programming, which has been adapted for the financial sector and is used specifically to produce trading systems.

Here we will discuss the basics of this technology. A second article will demonstrate the process and use it to develop a single market directional trading system, which will be made available in EasyLanguage format.

An important distinction of this process is that it operates at the machine code level. As a result, it can evolve trading systems at rates of more than 100,000 systems per hour, depending on the amount of data used and the speed of the processor. In our case study, we will use 62 inputs on nearly five years of data, yet the run will take only about three minutes on a normal desktop PC. This is millions of times faster than brute-force optimization can accomplish and is one of the benefits of this approach.

This is not an experiment in optimization. In fact, this approach takes pains to avoid it. The parameters that are typically the focus of optimization, such as the length of a price average, are fixed throughout the run. The program may shift these variables during the run, but only by operating outside of the established input feature. In our case, a new trading system will be developed automatically based on out-of-sample performance during the evolutionary process.

BETTER TOOLS

The central algorithm used in our trading system engine is called the AIM-GP, which stands for Automatic Induction of Machine Code with Genetic Programming. For more information on this tool, see “Efficient Evolution of Machine Code for CISC Architectures Using Blocks and Homologous Crossover,” by Nordin, J.P., Francone, F., and Banzhaf, W., in Advances in Genetic Programming 3, MIT Press.

The AIM-GP stores individual programs (trading systems) as linear strings of native binary machine code, which are directly executed by the central processor. This approach is superior to other techniques for several reasons. First, loading instruction sets directly into the CPU removes the program interpreter from the evaluation process. Second, the evolutionary process consists of the traditional operations (crossover, reproduction and mutation) but also includes several important additional operations. Non-homologous crossover, for example, allows programs of varied size to evolve, which is very different from the fixed-length instruction sets found in genetic algorithms. Third, the AIM-GP produces new, actual computer program code.
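To see how a crossover step can change program size, consider the following minimal Python sketch. It is a generic two-point crossover on linear programs, not the AIM-GP's actual machine-code operator, and the function name and the instruction strings are hypothetical placeholders. Because the cut points are drawn independently in each parent, the swapped segments can differ in length.

```python
import random


def two_point_crossover(parent_a, parent_b, rng):
    """Swap one contiguous segment between two linear programs.

    Illustrative only: each program is a list of opaque instruction tokens.
    Cut points are chosen independently in each parent, so the children may
    be longer or shorter than the parents. Each parent needs at least one
    instruction.
    """
    a_start, a_end = sorted(rng.sample(range(len(parent_a) + 1), 2))
    b_start, b_end = sorted(rng.sample(range(len(parent_b) + 1), 2))
    child_a = parent_a[:a_start] + parent_b[b_start:b_end] + parent_a[a_end:]
    child_b = parent_b[:b_start] + parent_a[a_start:a_end] + parent_b[b_end:]
    return child_a, child_b


# Hypothetical instruction tokens standing in for machine-code instructions.
program_a = ["a1", "a2", "a3", "a4"]
program_b = ["b1", "b2", "b3", "b4", "b5", "b6"]
child_a, child_b = two_point_crossover(program_a, program_b, random.Random(0))
```

No instructions are created or lost in the exchange; only their distribution between the two children changes, which is what lets program length drift during a run.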

But perhaps the most important benefit of this approach is how selective it is. The AIM-GP will readily discard an entire input or set of inputs if it determines that the input does not contribute to the solution. In fact, we will see that most of our 62 inputs will be discarded completely and quickly.

Other development tools are available. However, the more recent AIM-GP algorithm is among the fastest and most accurate available. See, for example, “Extending the boundaries of design optimization by integrating fast optimization techniques with machine-code-based, linear genetic programming,” by Frank D. Francone and Larry M. Deschaine, in Information Sciences-Informatics and Computer Science: An International Journal, vol 161. The AIM-GP produces consistently good solutions relative to similar approaches, and it is well suited to complex, noisy, poorly understood domains, such as financial market data.

Of course, developing any good trading system requires reducing the potential for curve-fitting, that is, designing a system that profits only in past markets. Curve-fitting can be identified when out-of-sample results don’t match in-sample results.
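A minimal Python sketch of that in-sample versus out-of-sample comparison follows. The function names, the 70/30 split and the 50% retention threshold are illustrative assumptions, not values taken from the tool discussed here.

```python
def chronological_split(bars, in_sample_count):
    """Split a time-ordered series into in-sample and out-of-sample parts.

    The split is chronological (no shuffling), so the out-of-sample part
    always lies after the data used for development.
    """
    return bars[:in_sample_count], bars[in_sample_count:]


def looks_curve_fit(in_sample_avg_trade, out_sample_avg_trade, min_retention=0.5):
    """Flag a system whose out-of-sample average trade falls well short of
    its in-sample average trade. `min_retention` is an assumed threshold."""
    return (in_sample_avg_trade > 0
            and out_sample_avg_trade < min_retention * in_sample_avg_trade)


bars = list(range(100))                      # placeholder for price bars
in_sample, out_sample = chronological_split(bars, 70)
suspect = looks_curve_fit(in_sample_avg_trade=100.0, out_sample_avg_trade=20.0)
```

Any single threshold is a judgment call; the point is only that the check compares the same performance measure on data the system has and has not seen.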

Developers use methods such as a large testing base, minimal parameters and sensitivity analysis of parameters to minimize curve-fitting. However, many trading systems fail in out-of-sample performance. To create trading systems and indicators automatically while minimizing the potential for curve-fitting, the following steps and considerations are necessary:

1. Parsimony pressure, which places mathematical reduction effects on the evolved programs to favor simpler programs, draws from the concept of minimal description length. The simpler a program is, the less likely it has curve-fit the data.

2. Randomization of 16 higher level genetic programming parameters, including crossover, mutation, migration and population size to help find solutions that are globally optimum.

3. Randomization of more than 30 machine level instructions to assist in finding globally optimum solutions.

4. Multi-run conditions implemented to re-evolve trading systems with reinitialized and rerandomized parameters.

5. Dynamically varied program size making use of non-homologous crossover, reducing fixed-length program size limitations and allowing simple and elegant solutions to emerge.

6. Employing an unbiased initial terminal set making use of 62 technical patterns and indicators used as the initial genetic material in the evolutionary process.

7. Evolution initiated at a zero point origin, making no initial assumptions regarding the direction of the market, nor how any particular indicator or pattern is to be used within the evolutionary process. Thus, the trading system is not developer-biased because no trading system structure is defined before the evolution begins.
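The first item in the list, parsimony pressure, can be illustrated with a short Python sketch. A linear length penalty is only one common way to apply such pressure; the penalty form, the `pressure` value and the candidate figures below are assumptions for illustration and are not taken from the AIM-GP.

```python
def penalized_fitness(raw_fitness, program_length, pressure=0.01):
    """Raw fitness minus a penalty proportional to program length.

    `pressure` is an assumed tuning constant: the larger it is, the more
    strongly shorter programs are favored.
    """
    return raw_fitness - pressure * program_length


# Hypothetical candidates: (name, raw fitness, number of instructions).
candidates = [
    ("long_program", 101.0, 200),
    ("short_program", 100.0, 20),
]

best = max(candidates, key=lambda c: penalized_fitness(c[1], c[2]))
```

Here the shorter program is selected even though its raw fitness is slightly lower, because the longer program's small edge does not cover its length penalty.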

Speed of evolution is critical. Interpretive software, such as charting, testing platforms, spreadsheets and document editors, operates far away from the CPU and typically suffers from software bloat, caused by an abundance of visually pleasing, human-friendly interfaces. This bloat grinds down processing speed. For example, to evaluate the expression:

Z = Y + Z,

requires about 20 clock cycles in the typical interpretive software system. Our system, on the other hand, can execute the above expression as a single instruction in one clock cycle. While higher-level software suffers from inefficiencies, lower level code is highly compressed, efficient and fast. Further, compactness of the CPU provides a high-speed environment ready to accept these low-level instructions.

However, to exploit this high-speed processing, you must program code at the machine level. The AIM-GP operates at this efficient machine level with highly compact machine code and will also produce efficient assembly code, as well as C code, Java and translated EasyLanguage code, as its output.

AVOID THE BOG

Designing a trading system the traditional way is time intensive. Trading system developers spend countless hours testing and optimizing trading systems. However, to test and optimize a trading system, you must first have an existing trading system or at least a general idea or group of ideas for one. That sounds simplistic, but in reality, trading system developers must have a starting point to build on, which is generally a theory or premise of market movement.

Some examples of starting points for system development are filtered adaptive channel breakouts, adaptive support and resistance counter moves and lagged patterns with indicator filters. The system developer has some kernel of an idea to work with; and through testing, optimization, de-optimization, parameter space evaluation, sensitivity analysis, related market studies, filter additions and random noise injection, he ends up with a trading system.

At any rate, many hours will be spent on this development while the most important tool the developer has will sit mostly idle and underutilized. That tool is, of course, the CPU.

In 1955, software costs were one-tenth of a project’s cost. Today it is the hardware that accounts for one-tenth of a project’s cost, and one consequence of this software crisis is that 99% of CPU cycles simply go unused. Programming costs are still high today, even considering off-shore programming and outsourcing. Using an algorithm that produces programs automatically is beneficial to the user, the industry and the consumer.

The AIM-GP evolutionary process includes sets of instructions operating on inputs derived from traditional financial market data, such as patterns, indicators and inter-market data. An example of a Boolean pattern-based input parameter is:

CLOSE >= CLOSE[1].

This expression has one of two states: true or false. When combined with an indicator-based Boolean input parameter, such as:

C <= AVERAGE(CLOSE, DC) - EMGP,

a sell or buy setup pattern emerges. Note that we do not define this as a buy, sell or remain flat pattern or determine if it is relevant at all. We simply state it as initial genetic material. Here, DC is the dominant cycle of the time series, and EMGP is what we call an “emergent variable,” which is created by the genetic programming.
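For readers who do not work in EasyLanguage, the two Boolean inputs can be sketched in Python as follows. The closing prices and the values of DC and EMGP below are made up for illustration; in the process described here, DC would come from the data preprocessor and EMGP from the genetic programming run. The average is assumed to be a simple moving average that includes the current bar.

```python
def simple_average(values, length):
    """Simple average of the most recent `length` values (current bar included)."""
    window = values[-length:]
    return sum(window) / len(window)


def boolean_inputs(closes, dc, emgp):
    """Evaluate both example inputs on the latest bar.

    Returns (pattern_input, indicator_input), each True or False.
    Neither value is interpreted as a buy or sell signal here.
    """
    if len(closes) < max(2, dc):
        raise ValueError("not enough bars for the requested lookback")
    pattern_input = closes[-1] >= closes[-2]              # CLOSE >= CLOSE[1]
    indicator_input = closes[-1] <= simple_average(closes, dc) - emgp
    return pattern_input, indicator_input


# Illustrative values only.
closes = [100.0, 102.0, 104.0, 101.0, 99.0, 100.0]
pattern_input, indicator_input = boolean_inputs(closes, dc=4, emgp=0.5)
```

With these sample values both inputs evaluate to True: the last close is at least the prior close, and it also sits at or below the four-bar average less the 0.5 offset.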

In addition to standard mathematical operators, such as +, -, * and ÷, conditionals and transcendental functions are included in the instruction sets. We also may use other preprocessed data series within our data set, including the Commodity Futures Trading Commission Commitment of Traders data, intermarket data, fundamental data, options data, etc. The genetic programming will determine if the inclusion of additional elements contributes to the trading system. Following thousands of instruction-set manipulations at the machine level, a trading system will usually emerge within 10,000 to 100,000 passes through the data. This may take a few minutes of run time depending on the data size.

The evolution may be targeted to one of many criteria, including a Sharpe ratio, net profit/max drawdown, profit factor or simply raw net profit. Failure to improve the target results in the termination of an element of the system. In our example, which we’ll step through in the next article, we will evolve a trading system that targets net profit. Note that the highest net profit may not always be the best goal when evolving a trading system.
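The target criteria named above can each be computed from a list of per-trade profits and losses, as in this minimal Python sketch. The function names, the `TARGETS` table and the sample trades are assumptions for illustration. The Sharpe ratio shown is a simplified per-trade version (mean over sample standard deviation, no risk-free rate, no annualization), which may differ from the definition used in any particular tool.

```python
import statistics


def net_profit(trades):
    return sum(trades)


def profit_factor(trades):
    """Gross profit divided by gross loss."""
    gross_profit = sum(t for t in trades if t > 0)
    gross_loss = -sum(t for t in trades if t < 0)
    return gross_profit / gross_loss if gross_loss > 0 else float("inf")


def max_drawdown(trades):
    """Largest peak-to-valley decline of the cumulative equity curve."""
    equity = peak = worst = 0.0
    for t in trades:
        equity += t
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def net_profit_to_drawdown(trades):
    drawdown = max_drawdown(trades)
    return net_profit(trades) / drawdown if drawdown > 0 else float("inf")


def sharpe_ratio(trades):
    """Simplified per-trade Sharpe ratio; needs at least two differing trades."""
    return statistics.mean(trades) / statistics.stdev(trades)


# The evolution would be pointed at exactly one of these.
TARGETS = {
    "net_profit": net_profit,
    "profit_factor": profit_factor,
    "net_profit_to_drawdown": net_profit_to_drawdown,
    "sharpe_ratio": sharpe_ratio,
}

sample_trades = [500.0, -200.0, 300.0, -100.0, 400.0]   # made-up results
score = TARGETS["net_profit"](sample_trades)
```

Swapping the key passed to `TARGETS` changes what the search rewards, which is why the highest net profit is not automatically the best goal: a system can rank first on one measure and poorly on another.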

The only other constraint we’ll add is that there must be a minimum of 12 round-turn trades per year. This forces the program to evolve more active systems, thus increasing the trade-to-parameter ratio of our evolved code and consequently increasing its chance for robustness.
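One simple way to express such a constraint is sketched below in Python. How the constraint is actually enforced inside the generator is not described here, so rejecting a candidate outright (by giving it the worst possible score) is an assumption, as are the function names.

```python
def meets_trade_minimum(round_turns, years, min_per_year=12):
    """True if the system averages at least `min_per_year` round-turn trades."""
    return round_turns / years >= min_per_year


def constrained_fitness(net_profit, round_turns, years, min_per_year=12):
    """Net profit for systems that trade often enough, otherwise the worst
    possible score so the candidate cannot be selected (assumed policy)."""
    if not meets_trade_minimum(round_turns, years, min_per_year):
        return float("-inf")
    return net_profit


# Over five years of data, 60 round turns is the smallest count that qualifies.
accepted = constrained_fitness(net_profit=25000.0, round_turns=60, years=5)
rejected = constrained_fitness(net_profit=40000.0, round_turns=59, years=5)
```

In this example the more profitable candidate is rejected because it trades too rarely, which reflects the intent of the constraint: profit earned on very few trades is weaker evidence of robustness.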

We do not know where any of the typical measures of system performance, such as profit factor, percent accuracy, net profit or drawdown will end up. What we can do is assign our criteria, begin the run, wait a few minutes for our trading system to emerge and review the results.

So, the three steps to design our trading system are:

1. Run the data preprocessor on our selected market.

2. Run the trading system generator.

3. Translate our code into EasyLanguage.

Once automated, this will usually take only a few minutes of our time. In the next article we will cover each step in detail and explain the process of building a trading system. We’ll then build and detail a trading system for gold futures developed with this automated process.

Michael L. Barna is a registered commodity trading advisor and president of Trading System Lab. He has degrees in mathematics and astronautical engineering. Reach him via his Web site at www.tradingsystemlab.com.
