Editors' picks
OPEN-SOURCE SCRIPT

Market Microstructure Analytics

The Hidden Toll on Every Trade

Every time you buy or sell a financial instrument, you pay a cost that never appears on your brokerage statement. It is not a commission. It is not a fee. It is the spread between the price at which someone is willing to sell to you and the price at which someone is willing to buy from you. That gap, measured in ticks, basis points, or fractions of a percent, is the bid-ask spread. Over a single trade it looks small. Over thousands of trades, across a year, for a fund managing billions, it compounds into one of the most significant sources of performance drag in all of finance.

For decades, institutional traders have measured this cost obsessively. Research desks at hedge funds and investment banks have dedicated entire teams to understanding when spreads are wide, why they widen, who is causing them to widen, and what that signal implies about the near-term behaviour of a market. Retail traders, however, have had almost no access to this kind of analysis. The reason is simple: measuring the bid-ask spread in real time requires access to the order book, tick-by-tick trade data, and quote data that most platforms either do not provide or lock behind expensive data terminals.

This indicator changes that. Using only the OHLCV data that every chart on TradingView already contains, it reconstructs spread estimates and liquidity conditions through seven statistically validated models drawn directly from the academic market microstructure literature. It cannot replicate what a full order book feed provides, and the documentation is explicit about where the approximations are. But it gets considerably closer than anything available to the typical chart-based trader, and on short intraday charts it delivers information that is genuinely useful for both execution decisions and regime assessment.


What Market Microstructure Actually Measures


Market microstructure is the academic field that studies how prices are formed at the level of individual transactions. Its central question is not where a price will go tomorrow but how the mechanics of trading itself affect price formation right now. Two papers published decades apart established the framework this indicator builds on.

The first was by Roll (1984), who noticed something elegant: in an efficient market, the prices of consecutive trades should not be correlated with each other, because any predictability would be arbitraged away. But if you look at actual trade-by-trade price changes, you consistently find negative autocorrelation. Prices bounce back and forth. The reason, Roll argued, is the bid-ask spread itself. Buyers trade at the ask and sellers at the bid, so consecutive trades alternate between two price levels. This bouncing creates a predictable negative covariance in price changes, and the size of that covariance is directly related to the size of the spread. From this insight he derived the formula S = 2 times the square root of the negative covariance of consecutive price changes. If you observe a series of trades and measure how negatively they correlate with each other, you can back out the spread without ever seeing a quote.

The second foundational contribution came from Kyle (1985), who approached the problem from a completely different angle. He asked: if a market contains some traders who have private information about the true value of an asset, how do their orders affect price? His answer was the lambda coefficient, a measure of how much the price moves per unit of net order flow. A high lambda means the market is thin and informed: each additional unit of buying or selling pushes the price significantly. A low lambda means the market absorbs flow without moving much. Lambda is not just a spread measure; it is a measure of how much information asymmetry exists in the market at any given moment. This is the adverse selection component of the spread, and it is arguably the most strategically useful signal the indicator produces.


The Spread Estimators

The first layer of computation produces four distinct estimates of the bid-ask spread, each using a different statistical approach.

The Roll (1984) estimator is the oldest and most widely cited. It computes the rolling covariance between a price change and the price change that came before it, then takes two times the square root of the negative of that covariance. One important detail: Roll's model is defined in terms of absolute price changes, not log-returns. Using log-returns introduces a scaling distortion tied to the price level of the asset, which biases the spread estimate upward at high prices. This implementation correctly uses delta-P throughout.
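The indicator itself runs in Pine Script, but the Roll calculation is compact enough to sketch in a few lines of Python/numpy. The function name and window handling here are illustrative, not taken from the script; what matters is that the covariance is computed on absolute price changes, as the model requires.

```python
import numpy as np

def roll_spread(prices, window=20):
    """Roll (1984) spread estimate over a trailing window.

    Works on absolute price changes (delta-P), not log-returns, as the
    model specifies. Returns NaN when the serial covariance is
    non-negative, where the estimator is undefined.
    """
    dp = np.diff(np.asarray(prices, dtype=float))
    if len(dp) < 2:
        return np.nan
    dp = dp[-window:]                    # trailing window of price changes
    cov = np.cov(dp[1:], dp[:-1])[0, 1]  # serial covariance of consecutive changes
    if cov >= 0:
        return np.nan                    # spread undefined for positive covariance
    return 2.0 * np.sqrt(-cov)
```

On a simulated bid-ask bounce with independent trade directions, the estimate recovers the true spread, which is the behaviour Roll's derivation predicts.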

The Corwin-Schultz (2012) estimator takes a fundamentally different approach. Rather than looking at the serial structure of price changes, it uses the high-low range of a bar. The core insight is that the high price of any trading period is most likely a transaction that occurred at the ask, while the low price is most likely a transaction at the bid. If you look at a two-period window, the combined high-low range reflects the true price variance over those two periods plus the spread component. A single-period range conflates variance and spread; the two-period structure allows them to be separated algebraically. The resulting formula involves a decomposition using the constant k = 3 minus 2 times the square root of 2, which emerges from the statistical properties of the high-low range under continuous diffusion. Corwin and Schultz (2012) validated this estimator extensively against actual quoted spreads across thousands of US equities and found it performs well both in cross-section and over time.
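A minimal per-pair sketch of the Corwin-Schultz calculation, following the published formulas, looks like this in Python. The negative-alpha floor is a common practical adjustment; the indicator's own rolling implementation and edge handling may differ.

```python
import numpy as np

K = 3.0 - 2.0 * np.sqrt(2.0)  # constant from the high-low range decomposition

def corwin_schultz(high, low):
    """Corwin-Schultz (2012) spread estimate from two consecutive bars.

    high, low: sequences of length 2 (bars t and t+1).
    Negative alphas are floored at zero before conversion to a spread.
    """
    h, l = np.asarray(high, float), np.asarray(low, float)
    beta = np.log(h[0] / l[0]) ** 2 + np.log(h[1] / l[1]) ** 2
    gamma = np.log(h.max() / l.min()) ** 2           # combined two-period range
    alpha = (np.sqrt(2 * beta) - np.sqrt(beta)) / K - np.sqrt(gamma / K)
    alpha = max(alpha, 0.0)                          # floor negative estimates
    return 2.0 * (np.exp(alpha) - 1.0) / (1.0 + np.exp(alpha))
```

The limiting case is instructive: when two identical bars produce a two-period range no wider than each single-period range, the model attributes the entire range to the spread, because zero incremental range implies zero variance.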

The Abdi-Ranaldo (2017) estimator is the most recent of the three and, in empirical tests, the most stable. For each bar, it computes a quantity called c, defined as the log of the close price minus the average of the log-high and log-low. This is the signed deviation of the close price from the geometric midpoint of the bar's range, expressed in log-space. Abdi and Ranaldo proved that the expected value of the product of c at time t and c at time t plus one equals negative one quarter of the spread squared. This means that by measuring how negatively c correlates with the next period's c, you can recover the spread. The estimator inherits much of the intuition of Roll but anchors itself to the intrabar price range rather than the close-to-close change, which tends to reduce noise substantially. To handle cases where the high and low are identical, which occurs on 1-tick bars or extremely liquid instruments, the implementation excludes invalid pairs from the covariance calculation rather than substituting zeros, which would bias the estimate toward zero.
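A sketch of this logic, following the relation stated above (function name and exclusion mechanics are illustrative, not the script's own code):

```python
import numpy as np

def abdi_ranaldo(high, low, close, window=20):
    """Abdi-Ranaldo style estimate using E[c_t * c_{t+1}] = -s^2 / 4,
    where c is the log close minus the log mid-range of the bar.

    Bars with high == low yield invalid pairs; these are excluded from
    the average rather than replaced with zeros, which would bias the
    estimate toward zero.
    """
    h, l, c = (np.log(np.asarray(a, float)) for a in (high, low, close))
    mid = 0.5 * (h + l)
    cdev = c - mid                  # signed deviation from geometric mid-range
    valid = h != l                  # flag degenerate bars (high == low)
    prod = cdev[1:] * cdev[:-1]
    ok = valid[1:] & valid[:-1]     # a pair is valid only if both bars are
    prod, ok = prod[-window:], ok[-window:]
    if not ok.any():
        return np.nan
    m = prod[ok].mean()
    return 2.0 * np.sqrt(-m) if m < 0 else np.nan
```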

The effective spread proxy takes yet another approach. Rather than estimating the quoted spread, it attempts to estimate the effective spread, which is the actual cost paid by a specific trade. The formula is two times the trade direction multiplied by the signed difference between the transaction price and the quote midpoint. Trade direction is approximated using the tick rule, which assigns a positive sign to transactions at prices higher than the previous price and a negative sign to those at lower prices, carrying the previous sign forward when the price is unchanged. This classification method was formalised by Lee and Ready (1991) and remains the standard approach for assigning direction when quote data is unavailable. The bar midpoint substitutes for the true quote midpoint, which introduces a systematic upward bias because the high and low of a bar are extreme transaction prices, not quotes. The effective spread proxy is therefore most reliable as a relative indicator of whether transaction costs are rising or falling, rather than as an absolute estimate of the quoted spread.
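The tick rule plus the proxy formula together fit in a short sketch (again a Python illustration of the logic described above, not the Pine Script source):

```python
import numpy as np

def effective_spread_proxy(high, low, close):
    """Per-bar effective spread proxy: 2 * direction * (close - midpoint),
    with direction from the tick rule (Lee-Ready 1991 style), carrying
    the previous sign forward on unchanged prices. The bar midpoint
    stands in for the quote midpoint, so readings are upward biased.
    """
    h, l, c = (np.asarray(a, float) for a in (high, low, close))
    mid = 0.5 * (h + l)
    q = np.zeros(len(c))
    for i in range(1, len(c)):
        if c[i] > c[i - 1]:
            q[i] = 1.0
        elif c[i] < c[i - 1]:
            q[i] = -1.0
        else:
            q[i] = q[i - 1]             # carry previous sign on zero ticks
    return 2.0 * q * (c - mid)
```

Note that the signed difference and the tick-rule direction always agree in sign for a correctly classified trade, so a valid reading is positive whether the trade printed above or below the midpoint.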


The Liquidity Metrics

The second layer moves beyond spread estimation into broader liquidity measurement. The key distinction is this: the spread tells you what it costs to execute one trade right now. Liquidity metrics tell you something about the structure of the market, how deep it is, how much information is embedded in the current order flow, and how efficiently prices are absorbing volume.

The Amihud (2002) illiquidity ratio is the most widely used liquidity measure in the academic asset pricing literature. Its construction is conceptually simple: it divides the absolute value of a log return by the dollar volume of trading in the same period. What this measures is price impact per dollar traded. If a stock moves one percent and 10 million dollars changed hands, the ratio is small. If the same one percent move happened on only 50,000 dollars of volume, the ratio is large, indicating a thin market where small amounts of capital move prices significantly. Unlike the spread measures, which capture the cost of a single round trip, the Amihud ratio captures market depth. This implementation uses dollar volume rather than share or contract volume, which is the correct specification for comparability across instruments at different price levels. The ratio is scaled by a factor of 100 million for display purposes; its absolute level is asset-dependent and should always be interpreted relative to the instrument's own history.
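The construction is simple enough to state directly in code. This Python sketch mirrors the description (the 1e8 display scale comes from the text above; the averaging and guard against zero-volume bars are my own illustrative choices):

```python
import numpy as np

def amihud_illiquidity(close, volume, window=20, scale=1e8):
    """Amihud (2002) illiquidity: |log return| / dollar volume,
    averaged over a trailing window and scaled for display.
    Absolute levels are asset-dependent; compare only against the
    instrument's own history.
    """
    c = np.asarray(close, float)
    v = np.asarray(volume, float)
    r = np.abs(np.diff(np.log(c)))
    dollar_vol = c[1:] * v[1:]                        # dollar volume per bar
    ratio = np.where(dollar_vol > 0, r / dollar_vol, np.nan)
    return scale * np.nanmean(ratio[-window:])
```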

Kyle lambda, estimated here via ordinary least squares regression of price changes on signed volume, is the most theoretically sophisticated metric in the indicator. Each bar's signed volume is the total volume signed by the tick rule direction: positive if the bar closed higher than the previous bar, negative if it closed lower. The regression coefficient from regressing price changes on this signed volume is the lambda estimate. A high positive lambda means prices are moving more than expected for the amount of flow being absorbed, which is the signature of informed trading. When lambda rises, someone in the market likely knows something that others do not, and market makers are widening their spreads in response. The critical implementation detail here is that the volume must not be normalised before the regression. Normalising the signed volume changes the regression coefficient from a price-impact-per-share measure to a dimensionless sensitivity measure, which is a different quantity and does not correspond to Kyle's original model.
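A minimal version of that regression, keeping the volume in raw units as the text insists (the sign carry-forward and array alignment are illustrative assumptions):

```python
import numpy as np

def kyle_lambda(close, volume, window=20):
    """Kyle (1985) lambda via OLS: regress price changes on signed
    volume. Volume is NOT normalised, so the coefficient retains its
    price-impact-per-unit-flow interpretation.
    """
    c = np.asarray(close, float)
    v = np.asarray(volume, float)
    dp = np.diff(c)
    sign = np.sign(dp)
    for i in range(1, len(sign)):       # tick rule: carry sign on zero ticks
        if sign[i] == 0:
            sign[i] = sign[i - 1]
    sv = sign * v[1:]                   # signed volume per bar
    dp, sv = dp[-window:], sv[-window:]
    denom = np.sum((sv - sv.mean()) ** 2)
    if denom == 0:
        return np.nan
    return np.sum((sv - sv.mean()) * (dp - dp.mean())) / denom
```

On data generated with a known impact coefficient, the regression recovers it exactly, which is a useful sanity check before trusting the live readings.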

The Parkinson (1980) range-based volatility estimator serves a supporting role: it estimates intrabar variance from the high-low range using the formula sigma-squared equals one over four times the natural log of two, multiplied by the square of the log ratio of high to low. This estimator is approximately five times more statistically efficient than the classic close-to-close variance estimator for the same number of observations (Parkinson 1980). Its role in this indicator is to help decompose the high-low range: the range reflects both volatility and the spread, and the ratio of the composite spread estimate to the Parkinson volatility tells you which component is dominant at any given time.


The Composite and the Regime System

Having computed multiple independent estimates of the spread, the natural question is how to combine them. Simple averaging is theoretically suboptimal when the estimators have different levels of noise. The precision-weighted composite assigns each estimator a weight inversely proportional to its robust variance, so that noisier estimators contribute less to the final reading.

The key word is robust. Rather than computing standard rolling variance, which is dominated by extreme observations and can make a normally well-behaved estimator look unreliable for weeks after a single outlier bar, this implementation uses a variance estimator based on the Median Absolute Deviation, or MAD. The MAD is the median of the absolute deviations from the rolling median. Multiplied by the consistency factor 1.4826, it provides an equivalent to the standard deviation that is resistant to outliers with a breakdown point of 0.5, meaning up to half the observations in a window can be extreme values without corrupting the estimate. This approach follows Rousseeuw and Croux (1993), who established the formal properties of MAD-based scale estimators.
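The MAD-based scale estimate itself is a two-liner, and its outlier resistance is easy to verify:

```python
import numpy as np

def mad_sigma(x):
    """Robust scale: 1.4826 * median(|x - median(x)|), the MAD scaled
    to be consistent with the standard deviation under normality
    (Rousseeuw & Croux 1993).
    """
    x = np.asarray(x, float)
    med = np.median(x)
    return 1.4826 * np.median(np.abs(x - med))
```

Replacing one observation in a small sample with an absurd outlier leaves the MAD-based scale untouched, whereas an ordinary standard deviation would be dominated by it.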

Two further safeguards stabilise the weights. A ridge regularisation term, set to five percent of the mean robust variance across active estimators, prevents any weight from exploding toward infinity when an estimator is temporarily near-constant. And a weight cap, set by default at 70 percent of the total, prevents any single estimator from dominating the composite during regimes where it happens to be locally smooth. The live weights are displayed in the dashboard so the user can always see how the composite is currently distributed.
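The weighting scheme with both safeguards can be sketched as follows. The ridge fraction and 70 percent cap come from the text above; the proportional redistribution of the capped excess is an assumption about how the renormalisation works, not a transcription of the script:

```python
import numpy as np

def precision_weights(robust_vars, ridge_frac=0.05, max_w=0.70):
    """Inverse-variance weights with ridge regularisation and a cap.

    ridge = 5% of the mean robust variance; any weight above the cap
    is clamped and the excess redistributed proportionally. With a
    cap >= 0.5 at most one normalised weight can exceed it.
    """
    v = np.asarray(robust_vars, float)
    ridge = ridge_frac * v.mean()          # prevents division blow-up
    w = 1.0 / (v + ridge)
    w = w / w.sum()
    over = w > max_w
    if over.any():
        w[over] = max_w
        free = ~over
        w[free] *= (1.0 - max_w * over.sum()) / w[free].sum()
    return w
```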

The regime detection system answers the question of whether the current spread level is historically unusual. This is done through a robust z-score: the composite spread is compared to its rolling median, and the deviation is normalised by the MAD. The result is a standardised score that tells you how many robust standard deviations the current spread is from its recent typical level. A score of two or above signals a statistically unusual widening event. The same procedure is applied independently to the Amihud illiquidity ratio and to the absolute value of Kyle lambda.
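In code, the robust z-score is the familiar z-score with the median and scaled MAD substituted for the mean and standard deviation (a sketch; the streaming approximation the script uses is discussed under Structural Limitations):

```python
import numpy as np

def robust_zscore(series, window=100):
    """Robust z for the latest observation: deviation from the rolling
    median, scaled by 1.4826 * MAD of the window."""
    x = np.asarray(series, float)[-window:]
    med = np.median(x)
    sigma = 1.4826 * np.median(np.abs(x - med))
    if sigma == 0:
        return 0.0
    return (x[-1] - med) / sigma
```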

These three scores are then combined into the Liquidity Stress Index, computed as their equal-weighted average after each component has been winsorised at plus or minus three robust standard deviations. The winsorisation prevents a single extreme reading in one dimension from overwhelming the composite. The result is mapped to a zero-to-100 scale using the hyperbolic tangent function, where 50 represents neutral conditions, readings in the 65 to 80 range indicate elevated stress, and readings above 80 indicate severe stress across multiple liquidity dimensions simultaneously.
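The winsorise-average-squash pipeline can be sketched in a few lines. The winsorisation cap and the 0-to-100 tanh mapping centred at 50 follow the description; the internal scaling of the tanh argument (here dividing the average z by 3) is an assumption for illustration, since the description does not specify it:

```python
import numpy as np

def liquidity_stress_index(z_spread, z_amihud, z_lambda, cap=3.0):
    """Liquidity Stress Index: winsorise each robust z at +/-cap,
    average equally, then map through tanh to a 0-100 scale with 50
    as neutral. The z/3 tanh scaling is an illustrative assumption.
    """
    zs = np.clip([z_spread, z_amihud, z_lambda], -cap, cap)
    return 50.0 * (1.0 + np.tanh(zs.mean() / 3.0))
```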


Practical Use Cases

For a retail trader, the most immediately useful output is the composite spread and its regime classification. When the composite spread widens and the regime indicator shifts to Elevated or Stress, entering a new position becomes more expensive than usual. On illiquid instruments this widening can be dramatic, consuming a significant fraction of the expected profit in transaction costs before the trade even begins. Conversely, when spreads are compressed, the market is functioning efficiently and execution is cheap. Timing entries and exits around spread conditions is a simple, evidence-based way to reduce the invisible drag that erodes returns over time.

The spread trend indicator, which compares a five-period exponential moving average of the composite spread against a twenty-period average, provides a simple directional signal. A widening trend often precedes a period of higher volatility, lower liquidity, or increased uncertainty. This does not tell you which direction the price will move, but it tells you that the environment is becoming less predictable and more costly to trade, which is operationally important information.
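The crossover logic behind this signal is simple enough to show in full (the 5/20 spans come from the description; the recursive EMA form and the sign convention are the standard ones):

```python
import numpy as np

def ema(x, span):
    """Recursive exponential moving average with alpha = 2 / (span + 1)."""
    x = np.asarray(x, float)
    alpha = 2.0 / (span + 1)
    out = np.empty(len(x))
    out[0] = x[0]
    for i in range(1, len(x)):
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return out

def spread_trend(spread, fast=5, slow=20):
    """Positive when the fast EMA of the composite spread sits above
    the slow EMA, i.e. spreads are widening."""
    s = np.asarray(spread, float)
    return ema(s, fast)[-1] - ema(s, slow)[-1]
```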

For professional traders and systematic strategy developers, the Kyle lambda signal has specific applications. When lambda is elevated relative to its own history, which the dashboard displays as the adverse selection z-score, it indicates that price changes are disproportionate to the measured order flow. This is consistent with the presence of informed traders, a phenomenon central to the theoretical work of Kyle (1985) and Glosten and Milgrom (1985). Elevated adverse selection is one of the clearest early warning signs of an impending directional move driven by asymmetric information, such as pre-announcement positioning, earnings whispers, or macroeconomic data leakage.

The Amihud illiquidity ratio is particularly valuable for cross-asset comparisons and for monitoring the liquidity conditions of a specific instrument over time. Portfolio managers can use it to time their entries and exits in less liquid securities: entering when illiquidity is below its historical median and exiting before a period of known low liquidity such as a holiday period or low-volume session. Research by Amihud and Mendelson (1986) established that expected returns are positively correlated with illiquidity, meaning that investors demand higher compensation for holding assets where transaction costs are high. The illiquidity z-score in this indicator allows that premium to be tracked in real time.

The spread-to-volatility ratio is a metric that practitioners familiar with the work of Corwin and Schultz (2012) will recognise immediately. It expresses the composite spread as a percentage of the Parkinson volatility estimate. When this ratio is high, the spread accounts for a large fraction of the observed price range, which typically indicates a market where market makers are cautious and price discovery is slow. When it is low, the price range is driven primarily by genuine information, not by the mechanics of the spread. This ratio is useful for distinguishing between a volatile and actively traded market, which is generally healthy, and a wide-spread market that looks volatile but is actually just illiquid.

The Liquidity Stress Index in its scaled zero-to-100 form provides an accessible summary for traders who do not want to track multiple metrics simultaneously. During normal market conditions the reading sits near 50. When all three components, the spread, the illiquidity ratio, and the adverse selection estimate, are simultaneously elevated relative to their own histories, the index rises sharply. The historical examples of this pattern occurring together include the flash crash of May 2010, the August 2015 China-driven volatility spike, the COVID-19 crash of March 2020, and various cryptocurrency deleveraging events. In each case, the simultaneous widening of spreads, collapse of market depth, and spike in price impact coefficients preceded the most severe price dislocations by enough time to be actionable.


Configuration and Settings

The Estimation Window controls the rolling window for all covariance and liquidity calculations. A shorter window, around 10 to 20 bars, makes the estimators more responsive to recent changes but increases noise. A longer window, around 50 bars, produces smoother estimates that better reflect structural conditions but lag more. The default of 20 is a reasonable starting point for most intraday timeframes.

The EMA Smoothing parameter applies an exponential moving average to each raw spread estimate before it is used in the composite and displayed on the chart. This reduces bar-to-bar noise without introducing the same lag that a longer estimation window would create. Setting it to 1 disables smoothing entirely, which is useful for research purposes but not for trading.

The Regime Window determines how far back the robust z-scores look when assessing whether current conditions are unusual. A setting of 100 means the indicator asks whether the current spread is unusual relative to the last 100 bars. For daily charts, 100 bars is approximately five months of trading. For tick charts, it represents the most recent 100 tick bars. This parameter should be set large enough to capture at least one full market cycle of the relevant timeframe.

The Maximum Composite Weight prevents any single estimator from being assigned more than the specified fraction of total weight. The default of 70 percent is conservative; in practice, during regimes where all three estimators agree and produce similar variances, the weights tend to distribute fairly evenly. The cap becomes most important when one estimator is temporarily quiet and its MAD-based variance falls to near zero, which would otherwise assign it almost all the weight.

The LSI Winsorisation Cap limits the influence of extreme readings in any single component before they contribute to the Liquidity Stress Index. At the default of three robust standard deviations, a reading of ten, which would represent a truly exceptional event, contributes the same as a reading of three. This prevents a single data anomaly or calculation artifact from permanently elevating the stress index.


Structural Limitations

No representation is made that these outputs are equivalent to actual exchange quote data. They are not. TradingView provides bars, not tick-by-tick trades, and the academic models on which this indicator is based were developed for transaction-level data. The Roll estimator assumes that each observation is a single trade; when a bar aggregates hundreds or thousands of trades, the covariance structure it observes is a convolution of many individual trade-level covariances, and the result understates the true spread. This bias grows with bar duration and trade frequency. On 1-tick or 5-tick bars the bias is minimal; on daily bars it can be substantial.

The tick rule classification, which assigns trade direction to bars and underpins both the Kyle lambda and effective spread estimates, was designed for individual trades. Applied to the close price of aggregated bars, it misclassifies a material fraction of bars. Ellis, Michaely and O'Hara (2000) documented misclassification rates of 30 to 50 percent on daily stock data. On short intraday bars the performance is better, but it never reaches the accuracy achievable with actual quote data.

The rolling MAD computation is a streaming approximation to the exact finite-window MAD. In a stationary process the difference is negligible and the heavy-tail robustness property is preserved. In rapidly changing regimes the approximation introduces a small second-order error that does not materially affect the interpretation of the outputs.

The effective spread proxy suffers from a systematic upward bias because it uses the bar midpoint rather than the true quote midpoint. This bias is largest when the intrabar range is wide relative to the actual spread, which is precisely when the estimate is most needed. On very short tick bars the range collapses toward the actual spread and the bias diminishes, but on longer bars the effective spread reading should be treated as an upper bound rather than a point estimate.

Pine Script v6 introduced the built-in variables bid and ask, which return the current best bid and ask prices from a connected broker feed when accessed on the 1-tick timeframe via request.security(syminfo.tickerid, "1T", bid) and request.security(syminfo.tickerid, "1T", ask). This is a genuine improvement over bar-based proxies for the single most recent bar. However, these variables carry three constraints that prevent them from replacing the statistical estimators in this indicator. First, they carry no historical record: the values exist only at the current bar and return na on all prior bars, which makes it impossible to compute rolling covariances, MAD-based z-scores, or any of the regime detection logic that requires a lookback window. Second, the data is only available through a live broker connection on TradingView. Users on free accounts, paper trading environments, or instruments not covered by their connected broker will receive na throughout. Third, instrument coverage is uneven: major forex pairs, selected cryptocurrency pairs on exchanges such as Binance, and equities through brokers such as Interactive Brokers are generally supported, but futures, CFDs on many instruments, and equities through data-only feeds often return no data. The statistical estimators in this indicator therefore remain the primary analytical engine. If a broker connection is active, the live bid-ask spread retrieved via these built-in variables can serve as a real-time reference point to validate whether the rolling estimates are in a plausible range for the current session, but it cannot contribute to the historical signal calculations.

None of the outputs should be used as the sole basis for any trading decision.


References

Abdi, F. & Ranaldo, A. (2017) A Simple Estimation of Bid-Ask Spreads from Daily Close, High, and Low Prices. Review of Financial Studies, 30(12).

Amihud, Y. (2002) Illiquidity and Stock Returns: Cross-Section and Time-Series Effects. Journal of Financial Markets, 5(1), 31-56.

Amihud, Y. & Mendelson, H. (1986) Asset Pricing and the Bid-Ask Spread. Journal of Financial Economics, 17(2).

Corwin, S.A. & Schultz, P. (2012) A Simple Way to Estimate Bid-Ask Spreads from Daily High and Low Prices. Journal of Finance, 67(2).

Ellis, K., Michaely, R. & O'Hara, M. (2000) The Accuracy of Trade Classification Rules: Evidence from Nasdaq. Journal of Financial and Quantitative Analysis, 35(4).

Glosten, L.R. & Milgrom, P.R. (1985) Bid, Ask and Transaction Prices in a Specialist Market with Heterogeneously Informed Traders. Journal of Financial Economics, 14(1).

Hasbrouck, J. (2009) Trading Costs and Returns for U.S. Equities: Estimating Effective Costs from Daily Data. Journal of Finance, 64(3).

Kyle, A.S. (1985) Continuous Auctions and Insider Trading. Econometrica, 53(6).

Lee, C.M.C. & Ready, M.J. (1991) Inferring Trade Direction from Intraday Data. Journal of Finance, 46(2).

Parkinson, M. (1980) The Extreme Value Method for Estimating the Variance of the Rate of Return. Journal of Business, 53(1).

Roll, R. (1984) A Simple Implicit Measure of the Effective Bid-Ask Spread in an Efficient Market. Journal of Finance, 39(4).

Rousseeuw, P.J. & Croux, C. (1993) Alternatives to the Median Absolute Deviation. Journal of the American Statistical Association, 88(424).

Disclaimer

The information and publications are not meant to be, and do not constitute, financial, investment, trading, or other types of advice or recommendations supplied or endorsed by TradingView. Read more in the Terms of Use.