Neurensic is a Chicago-based company that has developed a set of tools for the detection of disruptive trading practices such as spoofing. It uses a form of artificial intelligence to search through trading data and identify patterns of activity that may raise a red flag for regulators. In this article, two of Neurensic’s founders explain how machine learning works and why it is particularly suitable for this type of compliance problem.
Since the passage of Dodd-Frank, market regulators and law enforcement officials have had a new tool in their efforts to prevent market abuse. “Spoofing” has been codified as a prohibited market activity and the Commodity Futures Trading Commission is now working with the Department of Justice and the Federal Bureau of Investigation to enforce this new law through several high-profile cases.
Many would argue, however, that the term is ill-defined. The regulators appear to believe that their written guidance communicates their expectations for market participants, but at the same time they describe the law as necessarily vague. Market participants are now struggling to understand what types of trading activity might be viewed as spoofing, and compliance officers seeking to protect their firms from possible enforcement action are combing through massive amounts of trading data to determine whether any of their trading might be construed as spoofing.
Existing market surveillance tools search for violations of specific rules. This approach works well with something like cross trading. Has a broker matched the buy order of one customer against the sell order of another? This yes/no question can be evaluated and alerts can be generated when the rule is broken.
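To make the yes/no nature of a rules-based check concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's actual surveillance logic: the `Fill` record, its field names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fill:
    """One side of an executed trade (all field names are hypothetical)."""
    trade_id: str
    broker: str
    customer: str
    side: str  # "buy" or "sell"


def find_cross_trades(fills):
    """Return ids of trades where one broker matched the buy order of one
    customer against the sell order of a different customer."""
    by_trade = {}
    for fill in fills:
        by_trade.setdefault(fill.trade_id, []).append(fill)

    alerts = []
    for trade_id, sides in by_trade.items():
        buys = [f for f in sides if f.side == "buy"]
        sells = [f for f in sides if f.side == "sell"]
        # The rule is a plain yes/no test on each buy/sell pair.
        crossed = any(
            b.broker == s.broker and b.customer != s.customer
            for b in buys
            for s in sells
        )
        if crossed:
            alerts.append(trade_id)
    return alerts


# Made-up sample data for illustration.
fills = [
    Fill("T1", "BrokerA", "Cust1", "buy"),
    Fill("T1", "BrokerA", "Cust2", "sell"),
    Fill("T2", "BrokerA", "Cust1", "buy"),
    Fill("T2", "BrokerB", "Cust3", "sell"),
]
print(find_cross_trades(fills))
```

In this made-up data only trade T1 trips the rule, because the same broker sits on both sides for two different customers. The check needs no judgment about intent, which is why this kind of rule is easy to automate.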
The problem with spoofing is that the definition is vague by design. A spoof can involve five contracts or five thousand, seven order messages or seven thousand. The duration could be measured in milliseconds or hours. To be considered spoofing, an order must have been placed with the intent to cancel before execution. Because intent cannot be observed directly and the pattern has no fixed size or duration, simple rules-based software cannot solve this problem. The evaluation of trading behavior is not a yes/no problem and requires a new type of tool.
Machine learning is that tool. It is a field of artificial intelligence that involves the development of self-learning algorithms. Rather than following a list of instructions in a program, the machine learns to make classifications from examples, and the more data it is exposed to, the more effective it becomes in making those classifications.
Machine learning techniques are being applied all around us. Today’s cameras, for example, are able to identify faces and adjust the focus accordingly. Although this is an easy task for people, a software solution to this problem proved elusive until advanced machine learning techniques met microprocessors small enough and powerful enough to fit into a camera.
One of the techniques cameras use is called a support vector machine (SVM). While the mathematics involved is tricky, the technique relies upon the concept of features. A feature is a measurable piece of data, such as an area with facial colors or the ratio of the distances between the eyes, nose and chin. The software that controls the camera’s viewfinder continuously analyzes every pixel it captures and scores pieces of the image based upon those features.
If these scores were plotted on a graph, you would find clusters with similar scores. In the middle you would find sets of pixels that scored high and are more likely to be a face. A computer can take this several steps further and plot the problem on as many axes as there are useful features. It is not possible to visualize a support vector machine that evaluates a problem in fifteen dimensions, but the concept is the same.
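The feature-scoring idea described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the method used by any camera or by our tools: the two features (`skin_tone_score`, `proportion_score`) and all of the sample values are invented, and the model is a simple linear SVM fitted by subgradient descent on the hinge loss rather than a production library.

```python
def decision_value(w, b, x):
    """Signed score w.x + b; positive means the 'face' side of the boundary."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit a linear SVM by subgradient descent on the regularized hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * decision_value(w, b, x)
            # L2 regularization shrinks the weights a little on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Sample is misclassified or inside the margin:
                # move the boundary away from it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def classify(w, b, x):
    """Return +1 (face) or -1 (not a face) for one feature vector."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical feature scores per image patch:
# (skin_tone_score, proportion_score), each between 0 and 1.
samples = [(0.9, 0.8), (0.1, 0.2), (0.8, 0.9), (0.2, 0.1),
           (0.7, 0.85), (0.3, 0.25), (0.85, 0.7), (0.15, 0.3)]
labels = [1, -1, 1, -1, 1, -1, 1, -1]  # +1 = face, -1 = not a face

w, b = train_linear_svm(samples, labels)
print(classify(w, b, (0.9, 0.9)), classify(w, b, (0.1, 0.1)))
```

With this toy data, a patch that scores high on both features should land on the face side of the boundary and a patch that scores low on both should land on the other side. Adding a feature simply adds one more number to each tuple, which is the sense in which the same concept extends to fifteen dimensions.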
There are many examples of companies applying machine learning to solve a host of practical problems. We see machine learning tools applied in spam filters that improve over time, recommendation engines on websites such as Netflix, medical technology for disease diagnosis and of course the driverless cars that have attracted so much attention in the media.
Our company is now applying machine learning to the problem of detecting spoofing. We offer no view on whether the regulatory definition of spoofing is right or wrong, or whether a trading pattern constitutes a violation. We simply offer a tool that allows firms to identify trading activities that may fit this definition. In other words, we train the computer to know what regulators are looking for and then search client data for patterns that resemble this activity. Each pattern receives a risk score from 200 to 800; the higher the score, the greater the risk of attracting regulatory attention.
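As a purely illustrative sketch of how a model output could be expressed on a 200 to 800 scale, the Python below maps a probability linearly onto that range. The linear mapping, the function name `risk_score`, and the idea that the input is a probability between 0 and 1 are all assumptions made for this example; the article does not describe how the actual scores are computed.

```python
def risk_score(probability, low=200, high=800):
    """Map a probability in [0, 1] linearly onto the low..high score range.

    Hypothetical mapping for illustration only.
    """
    p = min(1.0, max(0.0, probability))  # clamp out-of-range inputs
    return round(low + (high - low) * p)


for p in (0.0, 0.5, 0.95):
    print(p, risk_score(p))
```

Under this assumed mapping, a probability of 0 gives the floor of 200, a probability of 1 gives the ceiling of 800, and 0.5 lands in the middle at 500.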
Keep in mind the enormous amounts of data that need to be processed. The North American futures industry, for instance, generates over 100 billion order messages each day and the securities markets generate billions more. In our view, a machine learning solution is much more effective at processing this amount of data than rules-based software solutions.
The anonymized data on the next page is a real example of order messages generated by a trader in the coffee futures market. Our tool identified a cluster of activity that appeared to be related by behavioral intent and scored this cluster with a very high probability of attracting regulatory attention.
The trading activity shown is similar to patterns that regulators have identified as spoofing in recent enforcement actions, where a trader’s alleged intention was to cancel a bid or offer before execution.
An examination of the rest of this trader’s activity found multiple instances of this trading pattern, further strengthening the impression that the intent of the trader’s order actions was to gain an unfair execution advantage.
Our second tool takes this one step further. We are training the computer to understand normal market activity and tell us when the market is unbalanced, that is, when liquidity is illusory or volume is artificial.
As we said above, it is not for us to judge who is or is not a bad actor. We cannot judge intent. What we can do is identify clusters of actions that appear to share intent, score the likelihood of attracting regulatory attention and measure the effect that the actions had on the market. The rest is in the hands of the market participants.
The detection of spoofing is, of course, not the only application for machine learning in the financial markets. In addition to our firm, there are a number of solution providers who have begun to solve difficult problems.
On the hardware side, Nervana Systems has developed a cloud computing solution that allows deep learning solutions to better handle large-scale datasets. In other words, they are embedding intelligent software inside computer hardware so that data can be stored, accessed and moved far faster than with conventional hardware.