Reinforcement learning improves execution quality in financial markets

ORLANDO, Fla. — In financial markets, the hardest decisions are often not whether to trade, but how to trade.

Algorithmic and high-frequency strategies now account for over 60% of equity trading volume in major markets, so much of the visible liquidity is shaped by automated systems reacting in milliseconds. In that environment, execution quality depends not only on a trading signal, but on how well a system anticipates and responds to other models operating at similar speed.

That is where Alex Chen, founding data scientist at Ondo Finance, has focused much of his work. In a project run jointly by UC Berkeley and a leading financial institution, Chen developed an adaptive execution framework that uses reinforcement learning to reduce execution costs and make market trade-offs more visible.

His analytical approach has also informed his work supporting due diligence at FPV Ventures, where he assesses the growth potential and competitive positioning of emerging startups.

Most people think of machine learning in finance as a forecasting tool. How do you think about it differently?

Trading is an execution problem. A trader may know what position to take and still perform poorly if the order is handled badly. That is where machine learning becomes most meaningful to me, not at the stage of predicting what the market will do, but at the stage of deciding how to act within it.

The basic tension is between speed, price and market response. A market order prioritizes immediacy but can expose you to higher execution costs. A limit order prioritizes price but risks sitting unfilled while the market moves away. Once order size increases, the problem becomes more strategic because showing too much demand too quickly can move the market against you.
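To make the size effect concrete, here is a minimal sketch of a buy market order "walking" the ask side of an order book. The book levels, prices and the function name `market_buy_cost` are illustrative assumptions, not data or code from Chen's project.

```python
from typing import List, Tuple


def market_buy_cost(asks: List[Tuple[float, int]], quantity: int) -> float:
    """Average price paid when a buy market order consumes ask levels in order.

    `asks` is a list of (price, displayed_size) pairs, best (lowest) price first.
    """
    remaining = quantity
    paid = 0.0
    for price, size in asks:
        take = min(remaining, size)  # fill as much as this level offers
        paid += take * price
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("not enough displayed liquidity to fill the order")
    return paid / quantity


# Hypothetical ask side of the book: (price, shares available).
asks = [(100.00, 200), (100.05, 300), (100.10, 500)]

small = market_buy_cost(asks, 100)  # fits inside the best level
large = market_buy_cost(asks, 800)  # consumes three levels
print(f"small order average price: {small:.5f}")
print(f"large order average price: {large:.5f}")
```

In this toy book the small order fills entirely at the best ask, while the large order pays a higher average price because it exhausts the cheaper levels. A limit order resting at 100.00 would avoid that extra cost, at the risk of never being filled.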

You built an adaptive execution framework at UC Berkeley around reinforcement learning. How does it work?

The framework is built around three connected elements: a limit order book simulator, a reinforcement learning agent and feature selection. Instead of relying on fixed assumptions about market impact, the system learns in a simulated environment where actions generate responses and those responses shape the next decision.

States capture both trader position and market conditions. Actions determine whether to use passive orders, aggressive orders or wait. Rewards are defined in economic terms tied to execution cost.
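The state, action and reward description above can be sketched roughly as follows. This is a sketch under stated assumptions: the specific state features, the shortfall-style reward and the tabular Q-learning update are all illustrative choices, since the article does not say which features or learning algorithm the framework actually uses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple


class Action(Enum):
    WAIT = 0
    PASSIVE = 1     # post a limit order and wait for a fill
    AGGRESSIVE = 2  # cross the spread for an immediate fill


@dataclass(frozen=True)
class State:
    # Trader position (hypothetical features)
    remaining_shares: int
    steps_left: int
    # Market conditions (hypothetical features)
    spread_ticks: int
    book_imbalance: float  # (bid size - ask size) / (bid size + ask size)


def reward(fill_price: float, filled_shares: int, arrival_price: float) -> float:
    """Negative execution cost for a buy: paying above the arrival price is penalized."""
    return -(fill_price - arrival_price) * filled_shares


QTable = Dict[Tuple[State, Action], float]


def q_update(q: QTable, state: State, action: Action, r: float,
             next_state: State, alpha: float = 0.1, gamma: float = 1.0) -> float:
    """One tabular Q-learning step (an assumed algorithm, chosen for brevity)."""
    best_next = max(q.get((next_state, a), 0.0) for a in Action)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return q[(state, action)]


q: QTable = {}
s0 = State(remaining_shares=1000, steps_left=10, spread_ticks=2, book_imbalance=0.1)
s1 = State(remaining_shares=800, steps_left=9, spread_ticks=2, book_imbalance=0.0)
r = reward(fill_price=100.05, filled_shares=200, arrival_price=100.00)
new_q = q_update(q, s0, Action.AGGRESSIVE, r, s1)
print(r, new_q)
```

In a full setup the simulator would supply `next_state` and the fill details after each action, which is how the agent's own orders feed back into what it sees next.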

You use a decision tree to represent the learned policy. Why does interpretability matter?

A policy you cannot explain is a policy you cannot trust at scale.

Reinforcement learning can produce strategies that are effective in simulation but opaque in practice. If a trader or risk manager cannot understand why the system is choosing to wait rather than execute, or why it is being aggressive at a particular moment, it becomes difficult to apply judgment when conditions differ from what the model was trained on.
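One way to picture the idea is to query an opaque policy across many states and fit the simplest possible tree, a single threshold split, to its choices. The sketch below does that by brute force in plain Python. The function `opaque_policy` is a hypothetical stand-in for a trained agent, the two features are invented for illustration, and none of this is the project's actual tree-fitting method.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def opaque_policy(time_left: float, spread: float) -> str:
    """Hypothetical stand-in for a trained agent whose internals are not readable."""
    score = 0.8 * (1.0 - time_left) + 0.2 * spread
    return "aggressive" if score > 0.46 else "passive"


Stump = Tuple[float, str, float, str, str]


def fit_stump(rows: List[Sequence[float]], labels: List[str],
              names: Sequence[str]) -> Optional[Stump]:
    """Find the single 'feature <= threshold' split that best reproduces the labels.

    Returns (accuracy, feature, threshold, label_if_true, label_if_false).
    """
    best: Optional[Stump] = None
    for j, name in enumerate(names):
        for threshold in sorted({row[j] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[j] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[j] > threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            left_label, left_hits = Counter(left).most_common(1)[0]
            right_label, right_hits = Counter(right).most_common(1)[0]
            accuracy = (left_hits + right_hits) / len(labels)
            if best is None or accuracy > best[0]:
                best = (accuracy, name, threshold, left_label, right_label)
    return best


# Sample the policy on a grid: fraction of the horizon left, and a normalized spread.
rows = [(i / 10, k / 4) for i in range(11) for k in range(5)]
labels = [opaque_policy(t, s) for t, s in rows]

stump = fit_stump(rows, labels, names=("time_left", "spread"))
accuracy, feature, threshold, if_true, if_false = stump
print(f"if {feature} <= {threshold}: {if_true}, else: {if_false} "
      f"(matches the policy on {accuracy:.0%} of sampled states)")
```

The printed rule is something a trader or risk manager can read and challenge, and the agreement figure shows how much of the original policy's behavior the simple rule fails to capture. A deeper tree would trade some of that readability for a closer match.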

What do older execution models get wrong that this approach corrects?

Older models helped clarify execution by simplifying price impact and assuming cleaner market structures. They were useful for their time. But real markets are more reactive than those assumptions allow.

A model becomes more useful when it can learn from dynamic conditions, incorporate order-book information and adapt to the fact that execution itself changes the environment. Every large order leaves a mark.

Machine learning matters less because it is fashionable and more because it can respond to changing conditions with more nuance than a fixed rule. If that nuance is well grounded, it can translate into lower execution costs.

Financial services firms are expected to invest heavily in AI through 2027. What should they be building toward?

They should be building for adaptability, but adaptability that remains clear.

The industry has invested heavily in prediction, but the larger opportunity is improving how decisions are made under real market constraints. As trading and decision-making systems become more dynamic, their policies need to remain understandable and subject to human judgment.

What should practitioners understand before reaching for machine learning in execution?

Practitioners need to begin with the mechanics of the market before reaching for the machinery of the model.

They need to understand what kind of order they are placing, what information the market can infer from it, how impact is created and what trade-off they are actually managing. Only then does machine learning add real value. Without that foundation, they may be optimizing something they have not fully defined.

Markets are responsive systems, not passive backgrounds. Execution quality depends on timing, order type, liquidity and visibility.

Markets reward systems that learn without losing sight of execution itself. The closer AI gets to the real structure of action, response and consequence, the more valuable it becomes.

Brody Wooddell, WFTV.com

Brody Wooddell is a digital journalist and media leader with more than a decade of experience in content strategy, audience growth, and digital storytelling across television and online news platforms.