The industry’s solution is a new way to process data: stream processing. Stream processing makes it possible to execute the same types of queries and computations against real-time streaming data that were previously possible only on stored data.
Programmers have traditionally built real-time analytical applications on high-volume event streams by using languages such as C++ and Java.
But custom coding with low-level tools can translate into high development costs. Stream processing instead takes a different approach to data, offering faster performance, easier programming and integrated access to both real-time and historical data.
Stream processing engines (SPEs) use what's called inbound processing: incoming event streams are processed in memory before being stored in a database. Traditional systems instead use outbound processing, in which data must be written to disk before it can be processed.
Programming also is easier with SPE applications because they can use StreamSQL, a high-level language that extends the familiar SQL standard of database processing to perform complex operations on continuous data streams. For example, StreamSQL extensions provide operators that compute indicators such as moving averages and that filter, merge, combine, correlate and apply user-defined analytical functions over streams. These operators have built-in mechanisms to manage stream disorder and late or missing data, which is particularly helpful in fast markets, where a mistake is even more costly.
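StreamSQL syntax varies by vendor, but the core idea of a windowed operator over a continuous stream can be sketched in plain Python (the tick values below are made up for illustration):

```python
from collections import deque

def moving_average(ticks, window):
    """Windowed operator: emit the average of the last `window` ticks
    as each new tick arrives, without storing the full stream."""
    buf = deque(maxlen=window)
    for price in ticks:
        buf.append(price)
        yield sum(buf) / len(buf)

# Illustrative mid-price ticks
averages = list(moving_average([1.0, 2.0, 3.0, 4.0], window=2))
```

An SPE evaluates operators like this in memory as events arrive, which is what makes inbound processing fast.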
A key feature of StreamSQL is that it can be used to access and manipulate real-time (streaming) and historical (stored) data in a uniform manner.
This can be done by referring to both streams and stored tables in the FROM clause of a StreamSQL statement.
SPEs also provide a spectrum of integrated storage options for all data volume ranges. An SPE can store and query gigabytes of data with near-zero latency, or slice and dice terabytes of historical data spanning a period of months or years, all within the same process. “Dicing up data” shows the available storage options and the corresponding access latency vs. data capacity characteristics.
Forex transaction volumes are continually increasing. Opportunity depth is high, owing to extreme liquidity, round-the-clock trading, a large and diverse set of participants, and profit potential even in falling markets. At the same time, opportunity windows are narrowing as automation increases and algorithmic trading tools emerge. These characteristics call for customizable real-time applications that can be adapted or adjusted on the fly. This is a scenario where stream processing technology shines.
On the sell side, it is critical for forex institutions to continually optimize the overall price delivery from price sourcing, setting and publishing to trade processing. Price quality is the key differentiator and is a function of speed given the high market volatility and increasing choices available to the buyers. Latency is a key consideration in both data cleaning and price setting, which are two fundamental pricing-engine tasks. Even with manual operations, sub-second latencies are highly desirable.
On the buy side, the trend toward integrated access to multiple sell-side institutions and liquidity portals, such as FxConnect and Hotspot, creates new opportunities for arbitrage and cross-market trading. With algorithmic trading, latency requirements become drastically tighter; milliseconds can make a big difference. In particular, forex-based hedge funds are aggressively exploiting inefficiencies by arbitraging price differences across multiple liquidity providers.
Besides latency, customization and agility are big concerns. Off-the-shelf solutions can be too simplistic, while fully custom ones are expensive to build, evolve and maintain. When new trading opportunities arise, it is important to quickly revise the processes to capitalize on them before they disappear. Stream processing can offer significant value here, as it allows easy customization that lets you focus on the semantics without worrying about the back end.
Data validation and pricing are two applications that can make immediate use of stream processing. As traders know, bad data happen. Backward prices (ask < bid) and partial data (a bid with no ask) need to be detected and removed or corrected. Selecting the right values from multiple data feeds can often be done with a staleness criterion: Although it is rarely possible to assess the real age of values that lack time stamps, it is possible to track arrival rates and look for shifts or spikes that indicate potential throttling problems.
Outlier detection is also common. Outliers are either ignored or marked to suggest they are less trustworthy. Big changes in spreads may also indicate unreliable data and can be detected using time-localized comparisons, such as changes from the last tick or comparisons with moving averages.
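These checks translate directly into stream operators. Here is a minimal Python sketch; the field layout, the 3x spread-jump rule and all thresholds are illustrative assumptions, not any vendor's schema:

```python
def validate_tick(bid, ask, last_spread, max_spread_jump=3.0):
    """Flag backward, partial, or suspicious quotes.

    Returns (status, spread), where status is 'ok', 'reject', or 'suspect'.
    """
    if bid is None or ask is None:
        return "reject", None   # partial data: a bid with no ask, or vice versa
    if ask < bid:
        return "reject", None   # backward price
    spread = ask - bid
    # A spread several times wider than the previous one suggests unreliable
    # data; mark it as an outlier rather than discarding it outright.
    if last_spread and spread > max_spread_jump * last_spread:
        return "suspect", spread
    return "ok", spread
```

In a real pipeline this logic would run per feed, per currency pair, with the previous spread carried as operator state.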
Once the data are sanitized, the prices can be calculated. This process is complicated by various date and holiday adjustments, especially when composite crossing currencies (those that are not directly traded) are involved. For example, a composite such as GBP/USD can be derived using GBP/EUR, EUR/JPY and JPY/USD.
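The triangulation reduces to a chain product of leg rates, since the intermediate currencies cancel. A sketch, using made-up quotes rather than market data:

```python
def cross_rate(legs):
    """Multiply a chain of quoted rates to derive a composite cross.

    legs: list of (pair_name, rate) tuples whose intermediate currencies
    cancel, e.g. GBP/EUR * EUR/JPY * JPY/USD = GBP/USD.
    """
    rate = 1.0
    for _, leg_rate in legs:
        rate *= leg_rate
    return rate

# Illustrative (not real) quotes:
gbp_usd = cross_rate([("GBP/EUR", 1.17), ("EUR/JPY", 160.0), ("JPY/USD", 0.0067)])
```

The date and holiday adjustments mentioned above would sit on top of this arithmetic, selecting which leg quotes are valid for a given value date.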
On the buy side, it is often useful to produce a view of the raw input streams that hides the details and complexities from the trader. A super-feed application can aggregate prices from various liquidity platforms and present a single best number.
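At its core, a super-feed keeps the best bid and best ask across venues. A minimal sketch, assuming each feed supplies one bid/ask quote per pair (venue names and prices are illustrative):

```python
def best_quote(quotes):
    """Aggregate quotes from multiple liquidity venues into a single
    best number: the highest bid and the lowest ask across all feeds."""
    best_bid = max(q["bid"] for q in quotes)
    best_ask = min(q["ask"] for q in quotes)
    return best_bid, best_ask

# Hypothetical per-venue quotes for one currency pair
feeds = [
    {"venue": "FxConnect", "bid": 1.2001, "ask": 1.2004},
    {"venue": "Hotspot",   "bid": 1.2002, "ask": 1.2005},
]
best = best_quote(feeds)
```

Note that the best bid and best ask can come from different venues, which is exactly the cross-venue opportunity the trader wants surfaced.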
This view can also derive positions for composite crosses. Alerting-style applications are also common: patterns in cash prices and derivatives (options, futures, etc.) can be tracked in real time, and the trader can be alerted when a pattern of interest is identified.
For example, an application can track a moving average convergence divergence (MACD) indicator, where two exponential moving averages with different speeds (for example, a 10-minute and a one-minute moving average) are computed in real time and an alert produced when the averages are about to cross over.
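A minimal sketch of the crossover logic follows. The smoothing factors stand in for the 10-minute and one-minute speeds; a production version would weight by elapsed time rather than tick count:

```python
def ema_stream(prices, alpha):
    """Exponential moving average over a tick stream.

    A larger alpha reacts faster (the 'one-minute' average);
    a smaller alpha reacts slower (the '10-minute' average)."""
    ema = None
    out = []
    for p in prices:
        ema = p if ema is None else alpha * p + (1 - alpha) * ema
        out.append(ema)
    return out

def crossover_alerts(prices, fast_alpha=0.5, slow_alpha=0.1):
    """Return the tick indices at which the fast EMA crosses the slow EMA."""
    fast = ema_stream(prices, fast_alpha)
    slow = ema_stream(prices, slow_alpha)
    alerts = []
    for i in range(1, len(prices)):
        if (fast[i - 1] > slow[i - 1]) != (fast[i] > slow[i]):
            alerts.append(i)
    return alerts
```

A jump in price makes the fast average overtake the slow one, and the index of that tick is reported as the alert.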
Alternatively, the system can be instructed to automatically take buy or sell actions if a cross price hits or exceeds a threshold. The threshold might be a fixed value, changeable dynamically, or derived via MACD or another user-specified tool. A dynamic stop/loss limit can also track the price and buy or sell at a prespecified minimum or maximum price.
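The dynamic stop can be sketched as tracking the running peak and firing when the price falls a set distance below it; the trail distance here is an illustrative parameter:

```python
def trailing_stop(prices, trail):
    """Fire a sell when the price falls `trail` below its running peak
    (a dynamic stop/loss). Returns the firing tick index, or None."""
    peak = prices[0]
    for i, p in enumerate(prices):
        peak = max(peak, p)
        if peak - p >= trail:
            return i
    return None
```

Because the peak ratchets upward with the price, the stop level rises in a rally and locks in gains, unlike a fixed threshold.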
In a similar manner, an application can track trading performance in real time by computing the profit/loss through a window of recent transactions and trigger alerts if the performance is below a threshold.
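Windowed profit/loss monitoring follows the same sliding-window pattern as the other operators. A sketch with illustrative per-trade P&L values and an assumed floor of zero:

```python
from collections import deque

def pnl_alerts(trades, window, floor=0.0):
    """Track profit/loss over a sliding window of recent trades and
    flag each position at which the windowed P&L drops below `floor`."""
    buf = deque(maxlen=window)
    alerts = []
    for i, pnl in enumerate(trades):
        buf.append(pnl)
        if len(buf) == window and sum(buf) < floor:
            alerts.append(i)
    return alerts
```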
Finally, stream processing simplifies the integration of historic data for calculating a derived currency value, along with other attributes such as variance and actual volatility. The system can annualize the data, deliver the results to analysts and traders, and use the results in a particular trading or pricing scenario.
Real-time risk management operations can also benefit from stream processing. Because forex traders must reduce portfolio risk amid many moving variables, risk values must be recalculated continually and precisely to guide future trading actions. Here, auto-hedging can be used in conjunction with auto-trading: the system continually monitors risk exposure across currency positions and trades when the risk exceeds a customizable threshold.
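The auto-hedging loop reduces to comparing net exposure per currency against a limit and emitting offsetting orders. A sketch, with an assumed notional limit and made-up positions:

```python
def hedge_orders(positions, limit):
    """Given net positions per currency, emit offsetting (hedge) orders
    for any currency whose absolute exposure exceeds `limit`."""
    orders = {}
    for ccy, net in positions.items():
        if abs(net) > limit:
            # Hedge the exposure back down to the limit,
            # trading in the opposite direction to the position.
            excess = abs(net) - limit
            orders[ccy] = -excess if net > 0 else excess
    return orders
```

In an SPE this check would rerun on every position update, so the hedge fires as soon as a threshold is breached rather than at a batch cutoff.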
Some aspects of auto-compliance exhibit similar characteristics; for example, cross currency positions can be tracked in real time to make sure compliance thresholds are met.
THE FUTURE OF FOREX
System and protocol standardization are fueling a big push toward automating all aspects of forex trading. Transaction volumes and complexity will continue to increase, and opportunity windows will narrow as markets consolidate and new participants, such as hedge funds and large proprietary traders, enter the market.
Stream processing also will empower applications to analyze historical data and consult historical trends within real-time queries. StreamSQL makes it possible to switch processing from live to historical data instantly, thereby simplifying the execution of sophisticated “what if?” and “why it happened” scenarios. Moreover, increased automation will enable more fundamental analysis and risk-taking models that automatically respond to events from various electronic information sources, such as news feeds.
In the long run, the forex market will look increasingly like the exchange market, where stream processing has already proven its value in a short time. Stream processing engines promise to be a key component of next-generation forex platforms, thanks to their performance, flexibility and agility advantages.
Ugur Cetintemel is an assistant professor in the department of computer science at Brown University, focusing on the architecture and performance of advanced information systems and databases. William Hobbib is a vice president at StreamBase Systems, which develops real-time event processing software. www.streambase.com.