ClarityX Research Institute

Design Paper

The Self-Improving Signal

How We Designed an Alpha Engine That Compounds Its Own Intelligence

Parson Tang — April 2026


Overview

This paper expands on the signal generation layer described in Compounding Analytical Intelligence. Where that paper describes the full five-stage learning architecture across all four agents, this paper zooms into one specific component: the alpha engine — how signals are structured, validated, and designed from day one to feed machine learning.


The Contract That Makes Everything Possible

Every strategy in the alpha engine follows a single, strict contract:

Input: Price and volume data across the investment universe
Output: Signal + metadata — entry, stop, target, confidence, pattern classification

That contract is not a technical convenience. It is the architectural decision that makes the entire engine extensible, testable, and ML-ready.

Because every strategy speaks the same language, the system does not care where a signal came from. A rules-based pattern scanner and a trained machine learning model are interchangeable at the output layer. The engine treats them identically.
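The contract described above can be made concrete as a typed interface. A minimal sketch, assuming illustrative names (`Bar`, `Signal`, `Strategy`, `BreakoutScanner`) that are not the production API:

```python
from dataclasses import dataclass
from typing import Protocol, Sequence

@dataclass(frozen=True)
class Bar:
    """One day of price/volume data for a single instrument."""
    date: str
    close: float
    volume: float

@dataclass(frozen=True)
class Signal:
    """Signal + metadata: the single output type every strategy emits."""
    ticker: str
    entry: float
    stop: float
    target: float
    confidence: float   # 0.0 .. 1.0
    pattern: str        # pattern classification

class Strategy(Protocol):
    """Any strategy, rules-based or ML, implements this one method."""
    def generate(self, ticker: str, bars: Sequence[Bar]) -> list[Signal]: ...

class BreakoutScanner:
    """Toy rules-based strategy: signal when the last close is a new high."""
    def generate(self, ticker, bars):
        closes = [b.close for b in bars]
        if len(closes) >= 2 and closes[-1] > max(closes[:-1]):
            entry = closes[-1]
            return [Signal(ticker, entry, stop=entry * 0.95,
                           target=entry * 1.10, confidence=0.6,
                           pattern="breakout")]
        return []
```

Because the engine depends only on the `Strategy` protocol, a trained model wrapped in the same `generate` signature is a drop-in replacement for `BreakoutScanner`.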

This has three concrete consequences:

  1. Adding a strategy requires writing one file, not rebuilding the plumbing
  2. Validating edge is systematic — every strategy runs through the same backtest harness before touching a live portfolio; strategies that cannot demonstrate edge are retired, not kept around
  3. ML can replace any strategy without changing the infrastructure around it — same input, same output, the engine does not notice the switch

Most systems are not built this way. Each strategy gets wired in differently, accumulating technical debt that eventually prevents the system from learning. The contract prevents that from the start.
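Because every strategy conforms to one contract, a single harness can walk any of them over history and score the result. A minimal sketch, assuming a strategy is any callable mapping a price history to a signal (or None); the horizon, minimum signal count, and hit-rate threshold are illustrative, not production parameters:

```python
def backtest(strategy, prices, horizon=5):
    """Walk forward through `prices`; record whether each signal's
    forward return over `horizon` bars was positive."""
    outcomes = []
    for t in range(1, len(prices) - horizon):
        sig = strategy(prices[:t + 1])
        if sig is not None:
            fwd = prices[t + horizon] / prices[t] - 1.0
            outcomes.append(fwd > 0)
    return outcomes

def has_edge(outcomes, min_signals=3, min_hit_rate=0.55):
    """Retirement rule sketch: keep a strategy only if it fired often
    enough and its hit rate clears the bar."""
    if len(outcomes) < min_signals:
        return False
    return sum(outcomes) / len(outcomes) >= min_hit_rate

# Toy momentum rule: signal when the last bar closes above the prior bar.
momentum = lambda hist: {"entry": hist[-1]} if hist[-1] > hist[-2] else None
```

A strategy that fails `has_edge` on its walk-forward outcomes never reaches the live portfolio, which is the retirement discipline the list above describes.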


Three Layers of Intelligence

The alpha engine is structured around three distinct layers. Each has a different job. Each feeds the next.

Layer One: Signal Generation

The first layer finds opportunities. Today it runs rules-based strategies — pattern recognition, momentum structures, volatility conditions — each validated against real price history before production deployment.

The design decision that matters here: this layer is explicitly designed to be replaceable. A machine learning model trained on outcomes — what did price actually do in the 21 days after this pattern fired? — slots into the same position as any rules-based strategy. The surrounding system does not change. That is the point of the contract.

Layer Two: Signal Combination

The second layer asks a different question: not did a signal fire, but when does it matter most?

Five signals firing simultaneously is not five times better than one. Context determines value. A momentum signal in a trending macro regime is a different proposition than the same signal firing during a late-cycle transition. A technical setup with strong fundamental backing is different from one firing in isolation.

This layer learns those differences from the data the engine produces. The feature set for the combination model is:

Feature                               Source
Signal type and confidence            Alpha engine output
Macro regime at time of signal        Macro Agent classification
Fundamental score at time of signal   Fundamental Agent
Cross-signal alignment                Whether multiple strategies agree
21-day forward return                 Recorded outcome

Every row in this table is produced automatically as a byproduct of running the system. No separate data pipeline required.
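Assembling one of those rows can be sketched directly, since every field comes from an output the system already produces. A minimal sketch, assuming illustrative input shapes (a signal dict, a regime string, a fundamental score, and the list of other strategies firing at the same time):

```python
def build_feature_row(signal, regime, fundamental_score, concurrent_signals):
    """Assemble one training row for the combination model. The 21-day
    forward return stays None until the outcome is known."""
    return {
        "signal_type": signal["pattern"],
        "confidence": signal["confidence"],
        "regime_at_signal": regime,               # Macro Agent classification
        "fundamental_score": fundamental_score,   # Fundamental Agent
        # Cross-signal alignment: how many other strategies agree right now.
        "cross_signal_alignment": len(concurrent_signals),
        "return_21d": None,                       # recorded outcome, filled later
    }
```

For example, `build_feature_row({"pattern": "breakout", "confidence": 0.6}, "trending", 0.8, ["momentum", "volatility"])` yields a row with an alignment of 2 and an empty outcome slot waiting for the 21-day mark.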

Layer Three: Position Intelligence

The third layer manages what happens after entry. This is where reinforcement learning is the natural fit.

The problem is precisely specified:

  • State: Entry price, current P&L, days held, regime, news sentiment
  • Actions: Hold, add, trim, or exit
  • Reward: Realized P&L when the position closes

The scanner finds entries. Position intelligence optimizes exits and sizing. Different jobs, same data pipeline. The separation matters — conflating signal generation with position management is how systems get overfit to both tasks and excel at neither.
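The state/action/reward specification above can be stated as an environment. A minimal sketch, a problem statement in code rather than a trained policy; the price path, sizing increments, and field names are illustrative:

```python
ACTIONS = ("hold", "add", "trim", "exit")

class PositionEnv:
    """Toy environment for Layer 3: manage one open position."""

    def __init__(self, entry_price, prices, regime="neutral", sentiment=0.0):
        self.entry = entry_price
        self.prices = prices      # forward price path after entry
        self.regime = regime
        self.sentiment = sentiment
        self.day = 0
        self.units = 1.0
        self.done = False

    def state(self):
        price = self.prices[self.day]
        return {
            "entry_price": self.entry,
            "pnl": (price - self.entry) * self.units,
            "days_held": self.day,
            "regime": self.regime,
            "sentiment": self.sentiment,
        }

    def step(self, action):
        """Apply an action; reward is realized P&L, paid only on close."""
        assert action in ACTIONS and not self.done
        if action == "add":
            self.units += 0.5
        elif action == "trim":
            self.units = max(0.0, self.units - 0.5)
        price = self.prices[self.day]
        if action == "exit" or self.day == len(self.prices) - 1:
            self.done = True
            return (price - self.entry) * self.units   # realized P&L
        self.day += 1
        return 0.0                                     # nothing until close
```

Note that the reward is zero on every step except the close — exactly the delayed-reward structure that makes this a reinforcement learning problem rather than a supervised one.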


Training Data That Builds Itself

The most operationally important property of this architecture is that the training dataset accumulates automatically — not as a separate data engineering project, but as a direct byproduct of running the system.

The schema for a single signal row looks like this:

Column                   Value
signal_id                Unique identifier
strategy                 Which rules-based pattern fired
entry_date               Date of signal
regime_at_signal         Macro Agent classification at time of fire
fundamental_score        Fundamental Agent score at time of fire
confidence               Model-assigned confidence classification
cross_signal_alignment   Number of strategies agreeing
return_5d                Actual price return, 5 days forward
return_21d               Actual price return, 21 days forward
outcome_label            Win / loss / neutral (populated after outcome date)

Every row is produced automatically. Every outcome populates return_5d, return_21d, and outcome_label without manual intervention. There is no separate pipeline — the clock starts on day one of production.

The result is a labeled training dataset: real signals, in real market conditions, with real outcomes attached. This is the input to Layer 2 combination models and, eventually, Layer 3 RL. It is also the data that cannot be replicated quickly — which is covered in Compounding Analytical Intelligence.


What This Is Not

It is not a black box. Every signal traces to a specific pattern, a specific set of conditions, a specific validation history. The reasoning is inspectable at every layer. This matters for institutional review.

It is not overfit to history. The validation framework is designed to surface strategies with genuine out-of-sample edge and retire those that do not hold up. The system is as rigorous about killing bad ideas as it is about scaling good ones.

It is not regime-agnostic. The architecture tracks signal performance by regime condition because strategies that work in trending markets routinely fail during macro transitions. Regime-awareness is captured at signal time, not estimated after the fact.


Status

Layer 1 — signal generation and systematic validation — is operational and running against a live universe. Layer 2 data capture is active; the feature table described above is being populated with each daily run. Layer 3 is the designed next stage, with the state and reward structure already defined in the architecture.

The moat argument — why accumulated signal history compounds into an advantage that cannot be quickly replicated — is covered in Compounding Analytical Intelligence.


An Invitation

ClarityX Research Institute builds investment intelligence systems designed for the way sophisticated allocators actually work — multi-asset, multi-regime, with the rigor that institutional mandates demand.

If you are evaluating how systematic signal generation should be structured to remain extensible, auditable, and ML-ready as markets evolve — I welcome the conversation.

Parson Tang
clarityxresearch@gmail.com


Parson Tang is the founder of ClarityX Research Institute and the architect of MARY, a production multi-agent investment intelligence system. He has over 20 years of experience in asset management and private banking, including roles at Goldman Sachs, J.P. Morgan, and Credit Suisse. He holds a Computer Science degree from the University of Southern California and an MBA from the University of Oxford. His analytical work is published at clarityxinstitute.com.