Many academic studies on market microstructure use trade classification algorithms to infer whether a given trade is initiated by a buyer or a seller. Yet Paul Asquith, Rebecca Oman, and Christopher Safaya (Short Sales and Trade Classification Algorithms; Journal of Financial Markets: 2010, Vol. 13-1, pg. 157-173) caution that these algorithms tend to overwhelmingly misclassify seller-initiated trades as buyer-initiated trades. Their findings are a stark warning to anyone using classification algorithms for trading or inference purposes.
The issue of trade classification stems from the fact that most trade-by-trade datasets, whether real-time or delayed, do not explicitly specify whether a given trade is buyer-initiated or seller-initiated. Yet knowing which side initiated a trade is useful for many applications: gauging market sentiment, determining the potential for short-sale (or even short-sale-predatory) strategies, measuring the depth and liquidity of an asset, and estimating the level of private information in a given marketplace (e.g. the Probability of Informed Trading measure of David Easley, Nicholas M. Kiefer, Maureen O’Hara, and Joseph B. Paperman in Liquidity, Information, and Infrequently Traded Stocks; Journal of Finance: 1996, Vol. 51-4, pg. 1405-1436).
One solution to the need for classifying trading intent can be found in the highly popular Lee and Ready Trade Classification Algorithm (Inferring Trade Direction from Intraday Data; Journal of Finance: 1991, Vol. 46-2, pg. 733-746, authors: Charles M.C. Lee and Mark J. Ready). This algorithm takes a sequential approach to classifying trades by first using a quote-based test and then, if necessary, applying a tick-based test. For a given trade, the algorithm first compares the trade price with the contemporaneously quoted bid and ask prices. If the trade price is closer to the ask price, the trade is classified as buyer-initiated. If, on the other hand, the trade price is closer to the bid price, the trade is classified as seller-initiated.
The quote-based test fails when the trade price is halfway between the bid and ask prices (i.e. the trade price is equal to the bid-ask mid-quote). In that case, the Lee and Ready Algorithm reverts to a tick-based test. Here, if the current trade price is higher than the last (old) trade price, the trade is classified as buyer-initiated. Conversely, if the current trade price is lower than the last (old) trade price, the trade is classified as seller-initiated. If the current trade price is equal to the old trade price, the previously determined trade classification is used (given that buy (sell) trades tend to follow previous buy (sell) trades). Finally, if the quote-based test fails and two unchanged trades are recorded back-to-back, the current trade is classified as indeterminate.
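The two-step procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' reference implementation: the `Trade` record, the `classify_trades` function, and the label strings are hypothetical names, prices are assumed to be integer cents so that equality checks are exact, and "two unchanged trades back-to-back" is interpreted here as a second consecutive trade whose price equals that of the trade before it. A mid-quote trade with no earlier trade to compare against is also treated as indeterminate, which is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

BUY, SELL, INDETERMINATE = "buy", "sell", "indeterminate"


@dataclass
class Trade:
    # All prices are integer cents (an assumption that keeps comparisons exact).
    price: int
    bid: int
    ask: int


def classify_trades(trades: List[Trade]) -> List[str]:
    """Label each trade using a quote-based test, then a tick-based test."""
    labels: List[str] = []
    prev_price: Optional[int] = None   # price of the previous trade
    prev_label: str = INDETERMINATE    # label given to the previous trade
    prev_unchanged = False             # did the previous trade repeat its predecessor's price?

    for t in trades:
        unchanged = prev_price is not None and t.price == prev_price

        # Quote-based test: compare the price with the mid-quote.
        # 2 * price vs. bid + ask avoids computing a fractional midpoint.
        if 2 * t.price > t.bid + t.ask:
            label = BUY                # closer to the ask
        elif 2 * t.price < t.bid + t.ask:
            label = SELL               # closer to the bid
        # Tick-based test: only reached when the trade is at the mid-quote.
        elif prev_price is None:
            label = INDETERMINATE      # no earlier trade to compare with
        elif t.price > prev_price:
            label = BUY                # uptick
        elif t.price < prev_price:
            label = SELL               # downtick
        elif prev_unchanged:
            label = INDETERMINATE      # second unchanged trade in a row
        else:
            label = prev_label         # first unchanged trade: reuse prior label

        labels.append(label)
        prev_price, prev_label, prev_unchanged = t.price, label, unchanged

    return labels


example = [
    Trade(price=1010, bid=1000, ask=1010),  # at the ask
    Trade(price=1000, bid=1000, ask=1010),  # at the bid
    Trade(price=1005, bid=1000, ask=1010),  # mid-quote, uptick
    Trade(price=1005, bid=1000, ask=1010),  # mid-quote, unchanged once
    Trade(price=1005, bid=1000, ask=1010),  # mid-quote, unchanged twice
    Trade(price=1004, bid=1000, ask=1008),  # mid-quote, downtick
]
print(classify_trades(example))
# ['buy', 'sell', 'buy', 'buy', 'indeterminate', 'sell']
```

Only the three mid-quote trades in the middle of the example and the final one reach the tick-based test; the first two are settled by the quotes alone. Real datasets raise further issues that this sketch ignores, such as aligning trade and quote timestamps.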
Authors Paul Asquith, Rebecca Oman, and Christopher Safaya thus caution that, despite their popularity, trade-classification algorithms tend to misclassify trades. This tendency to misclassify can bias the results of inferential academic studies as well as render trading strategies built around them unprofitable.
1 The BrozOnBonds: Academic Corner is a periodic, educational piece intended to highlight recent academic work to the trading and investing community. These articles are authored by Dr. Michael Williams, an Assistant Finance Professor in the College of Business and Public Administration at Governors State University. Dr. Williams both teaches courses and conducts research on derivative assets and their markets. For more information, please contact Dr. Williams at email@example.com or visit brozonbonds.com.