Michael Williams presents another article courtesy of Broz on Bonds: Academic Corner.

Many academic studies on market microstructure use trade classification algorithms to infer whether a given trade is initiated by a buyer or a seller. Yet, authors Paul Asquith, Rebecca Oman, and Christopher Safaya (Short Sales and Trade Classification Algorithms; Journal of Financial Markets: 2010, Vol. 13-1, pg. 157-173) caution that trade classification algorithms tend to overwhelmingly misclassify seller-initiated trades as buyer-initiated trades. The authors present a stark warning to those using classification algorithms for trading and inference purposes.


The issue of trade classification stems from the fact that most trade-by-trade datasets, whether they are real-time or delayed, do not explicitly specify whether a given trade is buyer-initiated or seller-initiated. Yet, knowing whether a trade is buyer-initiated or seller-initiated is useful for many applications such as trying to gauge the sentiment of the market, determining the potential for short-sale (or even short-sale-predatory) strategies, measuring the depth and liquidity of an asset, and determining the levels of private information in a given marketplace (e.g. the Probability of Informed Trading measure of David Easley, Nicholas M. Kiefer, Maureen O’Hara, and Joseph B. Paperman in Liquidity, Information, and Infrequently Traded Stocks; Journal of Finance: 1996, Vol. 51-4, pg. 1405-1436).


One solution to the need for classifying trading intent can be found in the highly popular Lee and Ready Trade Classification Algorithm (Inferring Trade Direction from Intraday Data; Journal of Finance: 1991, Vol. 46-2, pg. 733-746, authors: Charles M.C. Lee and Mark J. Ready). This algorithm takes a sequential approach to classifying trades by first using a quote-based test and then, if necessary, applying a tick-based test. So, for a given trade, the algorithm first looks at the trade price relative to the contemporaneously-quoted bid and ask prices. If the trade price is closer to the ask price, then the trade is classified as buyer-initiated. If, on the other hand, the trade price is closer to the bid price, then the trade is classified as seller-initiated.
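The quote-based test described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `quote_test` and its arguments are hypothetical, and the sketch assumes the prevailing bid and ask have already been matched to the trade.

```python
from typing import Optional


def quote_test(price: float, bid: float, ask: float) -> Optional[str]:
    """Classify one trade against the prevailing quote.

    Returns "buy" if the price is closer to the ask, "sell" if it is
    closer to the bid, and None if it sits exactly at the mid-quote
    (the case in which the quote-based test cannot decide).
    """
    mid = (bid + ask) / 2.0  # bid-ask mid-quote
    if price > mid:
        return "buy"   # above the mid-quote means closer to the ask
    if price < mid:
        return "sell"  # below the mid-quote means closer to the bid
    return None        # exactly at the mid-quote: undecided


# Hypothetical quote of 10.00 bid, 10.50 ask
print(quote_test(10.50, 10.00, 10.50))
print(quote_test(10.00, 10.00, 10.50))
print(quote_test(10.25, 10.00, 10.50))
```

Comparing the price with the mid-quote is equivalent to asking which side of the quote the price is closer to, which keeps the sketch to a single comparison.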


The quote-based test fails when the trade price is halfway between the bid and ask prices (i.e., the trade price is equal to the bid-ask mid-quote). In that case, the Lee and Ready Algorithm reverts to a tick-based test. Here, if the current trade price is higher than the last (old) trade price, the trade is classified as buyer-initiated. Conversely, if the current trade price is lower than the last (old) trade price, the trade is classified as seller-initiated. If the current trade price is equal to the old trade price, the previously determined trade classification is used (given that buy (sell) trades tend to follow previous buy (sell) trades). Finally, if the quote test fails and two unchanged trades are recorded back-to-back, the current trade is classified as indeterminate.
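The full sequence, quote test first and tick test as the fallback, might be sketched as follows. This is an illustrative reading of the description above, not the published algorithm: the function name `classify_trades` and the `(price, bid, ask)` tuple layout are hypothetical. The sketch also assumes one particular interpretation of the last two rules, namely that the first unchanged price reuses the previous classification and a second consecutive unchanged price is labeled indeterminate.

```python
from typing import List, Optional, Tuple

# Hypothetical record layout: (trade price, prevailing bid, prevailing ask)
Trade = Tuple[float, float, float]


def classify_trades(trades: List[Trade]) -> List[str]:
    """Label each trade "buy", "sell", or "indeterminate"."""
    labels: List[str] = []
    last_price: Optional[float] = None
    last_label = "indeterminate"
    unchanged_run = 0  # consecutive trades with no price change

    for price, bid, ask in trades:
        if last_price is not None and price == last_price:
            unchanged_run += 1
        else:
            unchanged_run = 0

        mid = (bid + ask) / 2.0
        if price > mid:                # quote test: closer to the ask
            label = "buy"
        elif price < mid:              # quote test: closer to the bid
            label = "sell"
        elif last_price is None:       # at the mid-quote, no prior trade
            label = "indeterminate"
        elif price > last_price:       # tick test: uptick
            label = "buy"
        elif price < last_price:       # tick test: downtick
            label = "sell"
        elif unchanged_run == 1:       # first unchanged price: reuse label
            label = last_label
        else:                          # assumed: repeated unchanged prices
            label = "indeterminate"

        labels.append(label)
        last_price, last_label = price, label
    return labels


# Hypothetical trades, all against a 10.00 bid and a 10.50 ask
sample = [
    (10.50, 10.00, 10.50),
    (10.25, 10.00, 10.50),
    (10.25, 10.00, 10.50),
    (10.25, 10.00, 10.50),
    (10.50, 10.00, 10.50),
]
print(classify_trades(sample))
```

Real trade-and-quote data would also need a rule for matching each trade to the quote in force at that time, which the sketch sidesteps by taking the bid and ask as given.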


Authors Paul Asquith, Rebecca Oman, and Christopher Safaya thus caution that, despite their popularity, trade-classification algorithms tend to misclassify trades. This tendency to misclassify can bias the results of inferential academic studies as well as render trading strategies built around them unprofitable.



