In defense of data

Just as there is no such thing as being a little bit pregnant, there’s no such thing as data being a little bit wrong. Old-school chartists appreciate the importance of accurate data. When you’ve built charts by hand, poring in excruciating detail over the minute differences between successive highs, lows, closes and opens, you become a bit perturbed if you later find that the data were invalid.

This appreciation also is fostered by an intimate understanding of price data behavior. Because you are creating the chart by hand, you tend to remember important numbers, such as past significant tops and bottoms. Old tops and bottoms, which may have long passed from most traders’ memories, are easy to spot, and you’re also sensitive to subtle changes in how markets close or settle. One way to develop this sensitivity, if you’re so inclined, is to monitor a live data feed while charting the daily market by hand using end-of-day data.

Of course, modern technology has taken over the role of charting daily data for most of us. There are many software packages available that can provide a plethora of charts, markets and data compressions. However, with the advancement of these products, there have been two disturbing developments that many consumers of several data feeds, this author included, have come to notice. The first is a nonchalant attitude toward the accuracy of data and the need to correct errors systematically. The second is that data corrections, when they do take place, often occur without prior or subsequent notice.

These two problems have affected serious market research and trading by a number of traders, this author included. Indeed, it’s to the point where the tools to analyze the data, and not the data quality itself, are the product that software vendors are selling these days. This has been partially responsible for a laissez-faire attitude regarding data accuracy (see “Missing the point”). Traders are starting to doubt their data, and once you’ve done that, then no indicator in the world can save you.

Of course, with disclaimers all the rage, there’s little the concerned trader can do once they’ve signed on the bottom line. Your data, and the dollars you expected to earn from trading that data, are ultimately at the mercy of vendor and exchange benevolence (see “Without warranty”). It doesn’t end with the vendors, however. The first two disclaimers in “Without warranty” are from exchanges, not vendors. Of course, this raises the question: If the exchange or the vendor is admitting that data errors happen, then they must have some statistics as to the frequency of the errors. Why not share those statistics so consumers can make informed decisions?

This cannot be overstated: Accurate data are needed to give a trader an opportunity to come to reasonable conclusions in attempts to capture a profit from the market based on trading systems. If systems are being created with inaccurate data, the applicable term is “garbage-in, garbage-out.”


Some may argue that errors are so infrequent that smoothing the data eventually will cause the inaccuracy to be filtered out, but that depends on the trading methodology being used. Such data may be usable for a smoothed moving-average-based system, but some systems depend on accurately measured moves from high to low or close to close. Some traders use Gann angles or trendlines drawn from tops and bottoms. Other traders rely on chart patterns that are dictated by closes. Still others use oscillators, which use the closing price as a factor.

Traders and investors use a variety of fundamental or technical signals to enter the markets. Through study and practice, they develop a trading system, backtest the methodology, and once they achieve the desired results, apply real money to the system.

At first, traders rigidly follow the system, taking wins and losses along the way, but after a while (in some cases) the desired results may not follow. These traders may begin to blame their original system. They may even switch to fading signals or doubling up their positions, betting the next trade will be a big winner. Lucky traders caught in this spiral will give up and go back to the analysis table before it’s too late. Unlucky traders will trade themselves into financial oblivion.

Sometimes the end may come in a sudden blowup of losses. More often than not, the inaccurate data, not the system, are at fault. The inaccuracies range from incorrect volume figures to highs and lows that change after the fact to closes and settlements being used interchangeably.


Many traders use volume in their studies of market movement. While this may be a valid indicator over the long run, the distribution of the volume data actually may be the cause of problems in the analysis. Trades are often missed because signals are not being generated in real time the same way that they were generated in a back test.

This problem is caused by volume being posted at the end of the day, usually as an estimate. The following day, however, the final volume should be posted. If your signal requires a minimum amount of volume, today’s estimate may actually be below that amount, failing to generate a buy or sell. The next day, when the final volume is posted, you may see that your minimum volume level was actually achieved. Of course, this knowledge comes a day late (and now, probably, a dollar short).

A backtest, however, would have picked up this trade. This makes it critical to know how and when exchanges report volume to the data vendors. There may be a number of preliminary estimates of volume before the final volume number is posted. While this is not an error in reporting, as a trader, you must know which volume figure your data vendor is using, and whether the vendor is using it on a consistent basis.
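The gap between a live signal and a backtested one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DailyBar` record, the `volume_signal` rule, the 150,000-contract threshold and both volume figures are hypothetical and are not taken from any vendor's feed.

```python
from dataclasses import dataclass


@dataclass
class DailyBar:
    date: str
    close: float
    volume: int
    volume_is_final: bool  # False while the vendor still shows an estimate


def volume_signal(bar: DailyBar, min_volume: int) -> bool:
    """Return True when the bar meets a (hypothetical) minimum-volume rule."""
    return bar.volume >= min_volume


MIN_VOLUME = 150_000  # hypothetical threshold for the trading rule

# What the trader sees at the close: an estimate just under the threshold.
live_bar = DailyBar("day 1", 100.0, 148_500, volume_is_final=False)

# What the database holds a day later, and what a backtest will read.
final_bar = DailyBar("day 1", 100.0, 153_200, volume_is_final=True)

live_signal = volume_signal(live_bar, MIN_VOLUME)       # no trade taken
backtest_signal = volume_signal(final_bar, MIN_VOLUME)  # trade "taken" in hindsight
```

Storing a flag such as `volume_is_final` alongside each bar is one way to let a backtest replay only the figures that were actually available on the day.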

A recent example of this occurred in November 2008 soybean futures. One end-of-day vendor posted the same volume figure, 152,632, for both July 2 and July 3. This is because the volume for July 3 is an estimated figure. The actual volume for July 3 will be posted at the end of the day on July 7, the Monday after a holiday weekend. In addition to the posting of a final volume number, exchanges have been known to adjust volume figures for several previous days. Again, the problem with your trading system may not be your logic, but changes in the data.


Gann analysts and traders, in particular, rely on the accuracy of tops and bottoms to draw important lines for future support and resistance. A general rule is to draw down-trending Gann angles from two-bar swing tops and up-trending Gann angles from two-bar swing bottoms. Unlike trendlines, Gann angles move up or down at a uniform rate of speed. These angles represent both the trend of the market and support and resistance.

Changing a top or bottom several days after its original posting can have a material effect on this analysis. If tops and bottoms are changed, Gann angles have to be redrawn. Drawing charts by hand means erasing the old Gann lines, putting in the new Gann lines and recalculating all of the support and resistance points along the angles as well as recalculating new square dates. To the computerized trader, it means deleting then replacing Gann angles on a chart. However, in either case, the issue is not the work involved, but a proper notification of the change.
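To see why a revised bottom matters, consider a rough sketch of the arithmetic. It assumes a Gann angle can be modeled as a line that moves a fixed number of price units per bar from its pivot. The function names, the 10-points-per-bar scale and the prices are all hypothetical.

```python
def gann_angle_price(pivot_price, bars_since_pivot, price_per_bar, rising=True):
    """Price on an angle that moves a fixed amount per bar from its pivot."""
    step = price_per_bar * bars_since_pivot
    return pivot_price + step if rising else pivot_price - step


def angle_levels(pivot_price, price_per_bar, n_bars, rising=True):
    """Support (rising) or resistance (falling) levels for the next n_bars."""
    return [
        gann_angle_price(pivot_price, n, price_per_bar, rising)
        for n in range(1, n_bars + 1)
    ]


# Up-trending angle drawn from a swing bottom first posted at 1300.0.
original = angle_levels(1300.0, 10.0, 5)

# The same bottom after the vendor quietly revises it to 1304.0.
revised = angle_levels(1304.0, 10.0, 5)

# Every level on the angle moves by the size of the revision.
shifts = [new - old for old, new in zip(original, revised)]
```

Because the angle moves at a uniform rate, a four-point change in the pivot carries through to every future support level, which is why an unannounced correction leaves the whole line wrong until it is redrawn.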

Many times, the trader may not find these changes until a weekly chart is created or updated. By then it is sometimes several days from the occurrence. Several days or weeks to a Gann trader may mean the angle long awaited is off by several points. Data vendors should correct data errors when possible and when changes are expected, but a failure to notify a user of this change can and will cause traders to miss trading opportunities, or make trades that should have never been attempted.
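Until vendors announce their corrections, one workaround is to keep your own copy of each day's bars and compare it with what the vendor serves later. The sketch below assumes bars are held as plain dictionaries keyed by date. The `find_revisions` helper and the sample prices are hypothetical.

```python
from typing import Dict, List, Tuple

Bar = Dict[str, float]  # e.g. {"high": ..., "low": ..., "close": ...}


def find_revisions(
    saved: Dict[str, Bar], fresh: Dict[str, Bar]
) -> List[Tuple[str, str, float, float]]:
    """List (date, field, old, new) for every value that changed since it was saved."""
    revisions = []
    for date in sorted(saved):
        if date not in fresh:
            continue  # nothing to compare against for this date
        for field, old_value in saved[date].items():
            new_value = fresh[date].get(field)
            if new_value is not None and new_value != old_value:
                revisions.append((date, field, old_value, new_value))
    return revisions


# Snapshot taken when the bars were first posted (made-up numbers).
saved_bars = {
    "day-1": {"high": 3010.0, "low": 2960.0, "close": 2990.0},
    "day-2": {"high": 3005.0, "low": 2950.0, "close": 2975.0},
}

# The same dates downloaded again a few days later.
fresh_bars = {
    "day-1": {"high": 3010.0, "low": 2960.0, "close": 2990.0},
    "day-2": {"high": 3005.0, "low": 2945.0, "close": 2975.0},
}

changes = find_revisions(saved_bars, fresh_bars)
```

Run daily, a check like this surfaces a changed bottom on the day the correction appears rather than when the weekly chart is next updated.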


The settlement price is the most important price of the day. It tells the world what the particular futures contract is worth. Unfortunately, it is also the most abused price. Studies show that several problems can arise with the settlement. Although most data vendors claim to get the settlement price from an exchange, there are still a number of errors that occur. “Blowing the close” (right) shows settlements for the ICE September 2008 cocoa contract for the last 10 trading days of June.

All of the vendors shown claim they use the ICE settlement as the source for the data; however, Vendor A and Vendor B have, at times, different settlements. One interesting note is that the differences all appear to be random and range from two to 20. On several days, Vendor A and Vendor B have the same settlement, while on other days, Vendors A, C and D all agree on the settlement.

There’s no question: An inaccurate settlement affects many trading indicators. These include moving averages, stochastics, RSI and ADX. A trader working from an accurate settlement price immediately has an advantage over those who are not.
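The effect on the simplest of these indicators is easy to work out. In an N-day simple moving average, one settlement that is off by e points shifts the average by e/N for every day the bad price stays in the window. The sketch below uses made-up closes and a hypothetical 10-point error to show it.

```python
def simple_moving_average(closes, period):
    """Simple moving average, one value per full window of `period` closes."""
    return [
        sum(closes[i - period + 1 : i + 1]) / period
        for i in range(period - 1, len(closes))
    ]


PERIOD = 5

# Made-up settlement series.
true_closes = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0]

# Same series with the third settlement posted 10 points too high.
bad_closes = list(true_closes)
bad_closes[2] = 112.0

true_sma = simple_moving_average(true_closes, PERIOD)
bad_sma = simple_moving_average(bad_closes, PERIOD)

# Error in the average for each window that contains the bad settlement.
sma_errors = [bad - good for good, bad in zip(true_sma, bad_sma)]
```

Here a 10-point error in one settlement leaves the five-day average two points too high until the bad price drops out of the window. Indicators with longer memories carry the error longer.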

Data vendors that provide extensive technical tools, as well as trading platforms, have the most trouble keeping up with data changes. In these cases, the vendor should at least provide the source of the original data so a proper comparison can be made. Merely suggesting another source with which to compare data does not assure accuracy. This is the case for Vendor A and Vendor B. These are the settlement prices of the actual live and end-of-day data used in this author’s analysis. It is possible for both to be wrong while agreeing on the same settlement. The answer in this case is clear: The exchange must stand as the comparison measure.
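A comparison against the exchange might look like the following sketch. It assumes settlements are available as simple date-to-price mappings. The helper names and every price here are hypothetical and are not the figures from “Blowing the close.”

```python
def compare_to_exchange(exchange, vendor):
    """Return {date: vendor_price - exchange_price} for every mismatch."""
    diffs = {}
    for date, official in exchange.items():
        reported = vendor.get(date)
        if reported is not None and reported != official:
            diffs[date] = reported - official
    return diffs


def match_rate(exchange, vendor):
    """Fraction of shared dates on which the vendor matches the exchange."""
    common = [date for date in exchange if date in vendor]
    if not common:
        return 0.0
    matches = sum(1 for date in common if vendor[date] == exchange[date])
    return matches / len(common)


# Hypothetical official settlements and two vendors' versions of them.
exchange = {"d1": 3000, "d2": 3010, "d3": 2995, "d4": 3020}
vendor_a = {"d1": 3000, "d2": 3012, "d3": 2995, "d4": 3020}
vendor_b = {"d1": 3000, "d2": 3012, "d3": 2995, "d4": 3000}

errors_a = compare_to_exchange(exchange, vendor_a)
errors_b = compare_to_exchange(exchange, vendor_b)
rate_a = match_rate(exchange, vendor_a)
rate_b = match_rate(exchange, vendor_b)
```

On day `d2` the two vendors agree with each other and are both wrong, which only the exchange figure reveals. The match rate is the kind of simple accuracy statistic a vendor could publish.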

Traders have to be aware of the various ways the data they receive may not be suitable for their trading system or analysis. Real-time trading results may not be the same as backtested results; however, the issue may not be with the logic of the system. In many cases, the data, the most important part of many trading programs, are simply different between real-time and end-of-day feeds.

Unfortunately, the only solution for traders is to make ongoing comparisons and, much like the hand chartists of decades past, learn the intricacies of the mistakes their vendors make. More solutions can be offered on the vendor side, however:

• More information has to be disclosed by the data vendor as to its original source of data so that comparisons can be made.

• Data vendors should provide the customer with statistics demonstrating the accuracy of the data, as measured against that original source.

• Price changes and data corrections also should be presented to the user in a more timely manner.

• Price corrections must be announced at the time they are made so traders can adjust their analysis accordingly.

Technology is a wonderful thing. It has enabled traders to make lightning-fast trading decisions. However, those decisions must be made with accurate information. If vendors would make a renewed effort to apply that technology to the data instead of the pre-packaged analysis tools and fancy graphical charting options, they would truly empower their customers to take full advantage of their product.

James A. Hyerczyk is a Gann technician and trading educator who has been analyzing markets since 1982. He authored "Pattern, Price & Time," and writes a futures, forex, ETF and equities advisory newsletter for traders and institutions.

