ClarityX Research Institute

Design Paper

The Self-Improving Signal

How We Designed an Alpha Engine That Compounds Its Own Intelligence

Parson Tang — April 2026


Overview

This paper expands on the signal generation layer described in Compounding Analytical Intelligence. Where that paper describes the full five-stage learning architecture across all four agents, this paper zooms into one specific component: the alpha engine — how signals are structured, validated, and designed from day one to feed machine learning.


The Contract That Makes Everything Possible

Every strategy in the alpha engine follows a single, strict contract:

Input: Price and volume data across the investment universe
Output: Signal + metadata — entry, stop, target, confidence, pattern classification

That contract is not a technical convenience. It is the architectural decision that makes the entire engine extensible, testable, and ML-ready.

Because every strategy speaks the same language, the system does not care where a signal came from. A rules-based pattern scanner and a trained machine learning model are interchangeable at the output layer. The engine treats them identically.
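The contract described above can be made concrete as a typed interface. A minimal sketch, assuming illustrative names (`Bar`, `Signal`, `Strategy`, `BreakoutScanner`) that are not the production API:

```python
from dataclasses import dataclass
from typing import Protocol, Sequence

@dataclass(frozen=True)
class Bar:
    """One day of price/volume data for a single instrument."""
    date: str
    close: float
    volume: float

@dataclass(frozen=True)
class Signal:
    """Signal + metadata: the single output type every strategy emits."""
    ticker: str
    entry: float
    stop: float
    target: float
    confidence: float   # 0.0 .. 1.0
    pattern: str        # pattern classification

class Strategy(Protocol):
    """Any strategy, rules-based or ML, implements this one method."""
    def generate(self, ticker: str, bars: Sequence[Bar]) -> list[Signal]: ...

class BreakoutScanner:
    """Toy rules-based strategy: signal when the last close is a new high."""
    def generate(self, ticker, bars):
        closes = [b.close for b in bars]
        if len(closes) >= 2 and closes[-1] > max(closes[:-1]):
            entry = closes[-1]
            return [Signal(ticker, entry, stop=entry * 0.95,
                           target=entry * 1.10, confidence=0.6,
                           pattern="breakout")]
        return []
```

Because the engine depends only on the `Strategy` protocol, a trained model wrapped in the same `generate` signature is a drop-in replacement for `BreakoutScanner`.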

This has three concrete consequences:

  1. Adding a strategy requires writing one file, not rebuilding the plumbing
  2. Validating edge is systematic — every strategy runs through the same backtest harness before touching a live portfolio; strategies that cannot demonstrate edge are retired, not kept around
  3. ML can replace any strategy without changing the infrastructure around it — same input, same output, the engine does not notice the switch

Most systems are not built this way. Each strategy gets wired in differently, accumulating technical debt that eventually prevents the system from learning. The contract prevents that from the start.
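Because every strategy conforms to one contract, a single harness can walk any of them over history and score the result. A minimal sketch, assuming a strategy is any callable mapping a price history to a signal (or None); the horizon, minimum signal count, and hit-rate threshold are illustrative, not production parameters:

```python
def backtest(strategy, prices, horizon=5):
    """Walk forward through `prices`; record whether each signal's
    forward return over `horizon` bars was positive."""
    outcomes = []
    for t in range(1, len(prices) - horizon):
        sig = strategy(prices[:t + 1])
        if sig is not None:
            fwd = prices[t + horizon] / prices[t] - 1.0
            outcomes.append(fwd > 0)
    return outcomes

def has_edge(outcomes, min_signals=3, min_hit_rate=0.55):
    """Retirement rule sketch: keep a strategy only if it fired often
    enough and its hit rate clears the bar."""
    if len(outcomes) < min_signals:
        return False
    return sum(outcomes) / len(outcomes) >= min_hit_rate

# Toy momentum rule: signal when the last bar closes above the prior bar.
momentum = lambda hist: {"entry": hist[-1]} if hist[-1] > hist[-2] else None
```

A strategy that fails `has_edge` on its walk-forward outcomes never reaches the live portfolio, which is the retirement discipline the list above describes.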


Three Layers of Intelligence

The alpha engine is structured around three distinct layers. Each has a different job. Each feeds the next.

Layer One: Signal Generation

The first layer finds opportunities. Today it runs rules-based strategies — pattern recognition, momentum structures, volatility conditions — each validated against real price history before production deployment.

The design decision that matters here: this layer is explicitly designed to be replaceable. A machine learning model trained on outcomes — what did price actually do in the 21 days after this pattern fired? — slots into the same position as any rules-based strategy. The surrounding system does not change. That is the point of the contract.

Layer Two: Signal Combination

The second layer asks a different question: not did a signal fire, but when does it matter most?

Five signals firing simultaneously is not five times better than one. Context determines value. A momentum signal in a trending macro regime is a different proposition than the same signal firing during a late-cycle transition. A technical setup with strong fundamental backing is different from one firing in isolation.

This layer learns those differences from the data the engine produces. The feature set for the combination model is:

Feature                               Source
Signal type and confidence            Alpha engine output
Macro regime at time of signal        Macro Agent classification
Fundamental score at time of signal   Fundamental Agent
Cross-signal alignment                Whether multiple strategies agree
21-day forward return                 Recorded outcome

Every row in this table is produced automatically as a byproduct of running the system. No separate data pipeline required.
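Assembling one of those rows can be sketched directly, since every field comes from an output the system already produces. A minimal sketch, assuming illustrative input shapes (a signal dict, a regime string, a fundamental score, and the list of other strategies firing at the same time):

```python
def build_feature_row(signal, regime, fundamental_score, concurrent_signals):
    """Assemble one training row for the combination model. The 21-day
    forward return stays None until the outcome is known."""
    return {
        "signal_type": signal["pattern"],
        "confidence": signal["confidence"],
        "regime_at_signal": regime,               # Macro Agent classification
        "fundamental_score": fundamental_score,   # Fundamental Agent
        # Cross-signal alignment: how many other strategies agree right now.
        "cross_signal_alignment": len(concurrent_signals),
        "return_21d": None,                       # recorded outcome, filled later
    }
```

For example, `build_feature_row({"pattern": "breakout", "confidence": 0.6}, "trending", 0.8, ["momentum", "volatility"])` yields a row with an alignment of 2 and an empty outcome slot waiting for the 21-day mark.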

Layer Three: Position Intelligence

The third layer manages what happens after entry. This is where reinforcement learning is the natural fit.

The problem is precisely specified:

  • State: Entry price, current P&L, days held, regime, news sentiment
  • Actions: Hold, add, trim, or exit
  • Reward: Realized P&L when the position closes

The scanner finds entries. Position intelligence optimizes exits and sizing. Different jobs, same data pipeline. The separation matters — conflating signal generation with position management is how systems get overfit to both tasks and excel at neither.
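The state/action/reward specification above can be stated as an environment. A minimal sketch, a problem statement in code rather than a trained policy; the price path, sizing increments, and field names are illustrative:

```python
ACTIONS = ("hold", "add", "trim", "exit")

class PositionEnv:
    """Toy environment for Layer 3: manage one open position."""

    def __init__(self, entry_price, prices, regime="neutral", sentiment=0.0):
        self.entry = entry_price
        self.prices = prices      # forward price path after entry
        self.regime = regime
        self.sentiment = sentiment
        self.day = 0
        self.units = 1.0
        self.done = False

    def state(self):
        price = self.prices[self.day]
        return {
            "entry_price": self.entry,
            "pnl": (price - self.entry) * self.units,
            "days_held": self.day,
            "regime": self.regime,
            "sentiment": self.sentiment,
        }

    def step(self, action):
        """Apply an action; reward is realized P&L, paid only on close."""
        assert action in ACTIONS and not self.done
        if action == "add":
            self.units += 0.5
        elif action == "trim":
            self.units = max(0.0, self.units - 0.5)
        price = self.prices[self.day]
        if action == "exit" or self.day == len(self.prices) - 1:
            self.done = True
            return (price - self.entry) * self.units   # realized P&L
        self.day += 1
        return 0.0                                     # nothing until close
```

Note that the reward is zero on every step except the close — exactly the delayed-reward structure that makes this a reinforcement learning problem rather than a supervised one.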


Training Data That Builds Itself

The most operationally important property of this architecture is that the training dataset accumulates automatically — not as a separate data engineering project, but as a direct byproduct of running the system.

The schema for a single signal row looks like this:

Column                   Value
signal_id                Unique identifier
strategy                 Which rules-based pattern fired
entry_date               Date of signal
regime_at_signal         Macro Agent classification at time of fire
fundamental_score        Fundamental Agent score at time of fire
confidence               Model-assigned confidence classification
cross_signal_alignment   Number of strategies agreeing
return_5d                Actual price return, 5 days forward
return_21d               Actual price return, 21 days forward
outcome_label            Win / loss / neutral (populated after outcome date)

Every row is produced automatically. Every outcome populates return_5d, return_21d, and outcome_label without manual intervention. There is no separate pipeline — the clock starts on day one of production.

The result is a labeled training dataset: real signals, in real market conditions, with real outcomes attached. This is the input to Layer 2 combination models and, eventually, Layer 3 RL. It is also the data that cannot be replicated quickly — which is covered in Compounding Analytical Intelligence.


What This Is Not

It is not a black box. Every signal traces to a specific pattern, a specific set of conditions, a specific validation history. The reasoning is inspectable at every layer. This matters for institutional review.

It is not overfit to history. The validation framework is designed to surface strategies with genuine out-of-sample edge and retire those that do not hold up. The system is as rigorous about killing bad ideas as it is about scaling good ones.

It is not regime-agnostic. The architecture tracks signal performance by regime condition because strategies that work in trending markets routinely fail during macro transitions. Regime-awareness is captured at signal time, not estimated after the fact.


Status

Layer 1 — signal generation and systematic validation — is operational and running against a live universe. Layer 2 data capture is active; the feature table described above is being populated with each daily run. Layer 3 is the designed next stage, with the state and reward structure already defined in the architecture.

The moat argument — why accumulated signal history compounds into an advantage that cannot be quickly replicated — is covered in Compounding Analytical Intelligence.


An Invitation

ClarityX Research Institute builds investment intelligence systems designed for the way sophisticated allocators actually work — multi-asset, multi-regime, with the rigor that institutional mandates demand.

If you are evaluating how systematic signal generation should be structured to remain extensible, auditable, and ML-ready as markets evolve — I welcome the conversation.

Parson Tang
clarityxresearch@gmail.com


Parson Tang is the founder of ClarityX Research Institute and the architect of MARY, a production multi-agent investment intelligence system. He has over 20 years of experience in asset management and private banking, including roles at Goldman Sachs, J.P. Morgan, and Credit Suisse. He holds a Computer Science degree from the University of Southern California and an MBA from the University of Oxford. His analytical work is published at clarityxinstitute.com.