Design Paper
The Self-Improving Signal
How We Designed an Alpha Engine That Compounds Its Own Intelligence
Parson Tang — April 2026
Overview
This paper expands on the signal generation layer described in Compounding Analytical Intelligence. Where that paper covers the full five-stage learning architecture across all four agents, this paper zooms in on one component: the alpha engine — how signals are structured, validated, and designed from day one to feed machine learning.
The Contract That Makes Everything Possible
Every strategy in the alpha engine follows a single, strict contract:
Input: Price and volume data across the investment universe
Output: Signal + metadata — entry, stop, target, confidence, pattern classification
That contract is not a technical convenience. It is the architectural decision that makes the entire engine extensible, testable, and ML-ready.
Because every strategy speaks the same language, the system does not care where a signal came from. A rules-based pattern scanner and a trained machine learning model are interchangeable at the output layer. The engine treats them identically.
This has three concrete consequences:
- Adding a strategy requires writing one file, not rebuilding the plumbing
- Validating edge is systematic — every strategy runs through the same backtest harness before touching a live portfolio; strategies that do not show demonstrated edge are retired, not kept around
- ML can replace any strategy without changing the infrastructure around it — same input, same output, the engine does not notice the switch
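A minimal sketch of what such a contract might look like in code. All names here (`Signal`, `Strategy`, `BreakoutScanner`, the field names) are illustrative assumptions for exposition, not the production schema; the toy breakout rule stands in for any real pattern scanner.

```python
from dataclasses import dataclass
from typing import Mapping, Protocol, Sequence


@dataclass(frozen=True)
class Signal:
    """Output side of the contract: signal plus metadata."""
    symbol: str
    entry: float
    stop: float
    target: float
    confidence: float  # 0.0 to 1.0
    pattern: str       # pattern classification


class Strategy(Protocol):
    """Input side of the contract: price and volume in, signals out."""
    name: str

    def generate(self,
                 prices: Mapping[str, Sequence[float]],
                 volumes: Mapping[str, Sequence[float]]) -> list[Signal]: ...


class BreakoutScanner:
    """A rules-based strategy satisfying the contract (toy logic)."""
    name = "breakout_20d"

    def generate(self, prices, volumes):
        signals = []
        for symbol, px in prices.items():
            # Fire when the latest close exceeds the prior 20-day high.
            if len(px) > 20 and px[-1] > max(px[-21:-1]):
                signals.append(Signal(symbol, entry=px[-1],
                                      stop=px[-1] * 0.95,
                                      target=px[-1] * 1.10,
                                      confidence=0.6,
                                      pattern="breakout"))
        return signals
```

Because the engine only ever sees `generate(...) -> list[Signal]`, adding a strategy really is one new class in one file; nothing upstream or downstream changes.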
Most systems are not built this way. Each strategy gets wired in differently, accumulating technical debt that eventually prevents the system from learning. The contract prevents that from the start.
Three Layers of Intelligence
The alpha engine is structured around three distinct layers. Each has a different job. Each feeds the next.
Layer One: Signal Generation
The first layer finds opportunities. Today it runs rules-based strategies — pattern recognition, momentum structures, volatility conditions — each validated against real price history before production deployment.
The design decision that matters here: this layer is explicitly designed to be replaceable. A machine learning model trained on outcomes — what did price actually do in the 21 days after this pattern fired? — slots into the same position as any rules-based strategy. The surrounding system does not change. That is the point of the contract.
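To make the drop-in point concrete, here is a self-contained sketch of a learned model occupying the same slot. `predict_up_prob` stands in for any fitted model's inference call — an assumption for illustration, not a specific library API — and the `Signal` record mirrors the contract's output fields.

```python
from collections import namedtuple

# Same output record as any rules-based strategy (field names illustrative).
Signal = namedtuple("Signal", "symbol entry stop target confidence pattern")


class LearnedScanner:
    """An ML model in the same position as a rules scanner.

    The surrounding engine calls generate() exactly as it would for any
    rules-based strategy; it cannot tell the difference.
    """
    name = "ml_pattern_v1"

    def __init__(self, predict_up_prob):
        self.predict_up_prob = predict_up_prob  # any callable: prices -> prob

    def generate(self, prices, volumes):
        out = []
        for symbol, px in prices.items():
            p = self.predict_up_prob(px)  # P(favorable move over the horizon)
            if p >= 0.55:                 # emit only above a threshold
                out.append(Signal(symbol, px[-1], px[-1] * 0.95,
                                  px[-1] * 1.10, p, "learned"))
        return out
```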
Layer Two: Signal Combination
The second layer asks a different question: not did a signal fire, but when does it matter most?
Five signals firing simultaneously is not five times better than one. Context determines value. A momentum signal in a trending macro regime is a different proposition than the same signal firing during a late-cycle transition. A technical setup with strong fundamental backing is different from one firing in isolation.
This layer learns those differences from the data the engine produces. The feature set for the combination model is:
| Feature | Source |
|---|---|
| Signal type and confidence | Alpha engine output |
| Macro regime at time of signal | Macro Agent classification |
| Fundamental score at time of signal | Fundamental Agent |
| Cross-signal alignment | Whether multiple strategies agree |
| 21-day forward return | Recorded outcome |
Every row in this table is produced automatically as a byproduct of running the system. No separate data pipeline required.
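The "byproduct" claim can be sketched as a single function that assembles one training row from objects the engine already holds during a daily run. Function and field names are hypothetical, chosen to mirror the table above; signals are represented as plain dicts for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeatureRow:
    signal_type: str
    confidence: float
    regime: str               # Macro Agent classification at signal time
    fundamental_score: float  # Fundamental Agent score at signal time
    alignment: int            # number of strategies agreeing on the symbol
    fwd_return_21d: Optional[float] = None  # filled once the outcome is known


def make_feature_row(signal, regime, fundamental_score, all_signals):
    """Assemble one combination-model training row during a normal run.

    No separate pipeline: every input is already in memory when the
    engine processes the day's signals.
    """
    alignment = sum(1 for s in all_signals if s["symbol"] == signal["symbol"])
    return FeatureRow(signal["strategy"], signal["confidence"],
                      regime, fundamental_score, alignment)
```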
Layer Three: Position Intelligence
The third layer manages what happens after entry. This is where reinforcement learning is the natural fit.
The problem is precisely specified:
- State: Entry price, current P&L, days held, regime, news sentiment
- Actions: Hold, add, trim, or exit
- Reward: Realized P&L when the position closes
The scanner finds entries. Position intelligence optimizes exits and sizing. Different jobs, same data pipeline. The separation matters — a system that conflates signal generation with position management tends to compromise on both tasks and excel at neither.
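The state-action-reward specification above translates directly into types. This is a sketch of the problem definition only, not a training loop; field names are illustrative, and the sparse reward (realized P&L paid only on exit) follows the spec.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    HOLD = "hold"
    ADD = "add"
    TRIM = "trim"
    EXIT = "exit"


@dataclass(frozen=True)
class PositionState:
    entry_price: float
    current_pnl: float    # unrealized P&L, in return terms
    days_held: int
    regime: str           # Macro Agent classification
    news_sentiment: float


def reward(state: PositionState, action: Action) -> float:
    """Sparse reward: realized P&L when the position closes, zero otherwise."""
    return state.current_pnl if action is Action.EXIT else 0.0
```

Keeping the reward tied strictly to realized P&L at close is what makes this a well-posed episodic RL problem rather than a heuristic overlay.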
Training Data That Builds Itself
The most operationally important property of this architecture is that the training dataset accumulates automatically — not as a separate data engineering project, but as a direct byproduct of running the system.
The schema for a single signal row looks like this:
| Column | Value |
|---|---|
| signal_id | Unique identifier |
| strategy | Which rules-based pattern fired |
| entry_date | Date of signal |
| regime_at_signal | Macro Agent classification at time of fire |
| fundamental_score | Fundamental Agent score at time of fire |
| confidence | Model-assigned confidence classification |
| cross_signal_alignment | Number of strategies agreeing |
| return_5d | Actual price return, 5 days forward |
| return_21d | Actual price return, 21 days forward |
| outcome_label | Win / loss / neutral (populated after outcome date) |
Every row is produced automatically. Every outcome populates return_5d, return_21d, and outcome_label without manual intervention. There is no separate pipeline — the clock starts on day one of production.
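The automatic labeling step can be sketched as one function that fires when a signal's forward window closes. The 2% win threshold and the return arithmetic here are illustrative assumptions; only the column names come from the schema above.

```python
def label_outcome(entry_price: float, price_5d: float, price_21d: float,
                  win_threshold: float = 0.02) -> dict:
    """Fill a signal row's outcome columns once forward prices are known.

    win_threshold is a hypothetical cutoff separating win/loss from noise.
    """
    r5 = price_5d / entry_price - 1.0    # 5-day forward return
    r21 = price_21d / entry_price - 1.0  # 21-day forward return
    if r21 > win_threshold:
        label = "win"
    elif r21 < -win_threshold:
        label = "loss"
    else:
        label = "neutral"
    return {"return_5d": r5, "return_21d": r21, "outcome_label": label}
```

Because the label is pure arithmetic on prices the system already observes, no human ever touches a training row.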
The result is a labeled training dataset: real signals, in real market conditions, with real outcomes attached. This is the input to Layer 2 combination models and, eventually, Layer 3 RL. It is also the data that cannot be replicated quickly — which is covered in Compounding Analytical Intelligence.
What This Is Not
It is not a black box. Every signal traces to a specific pattern, a specific set of conditions, a specific validation history. The reasoning is inspectable at every layer. This matters for institutional review.
It is not overfit to history. The validation framework is designed to surface strategies with genuine out-of-sample edge and retire those that do not hold up. The system is as rigorous about killing bad ideas as it is about scaling good ones.
It is not regime-agnostic. The architecture tracks signal performance by regime condition because strategies that work in trending markets routinely fail during macro transitions. Regime-awareness is captured at signal time, not estimated after the fact.
Status
Layer 1 — signal generation and systematic validation — is operational and running against a live universe. Layer 2 data capture is active; the feature table described above is being populated with each daily run. Layer 3 is the designed next stage, with the state and reward structure already defined in the architecture.
The moat argument — why accumulated signal history compounds into an advantage that cannot be quickly replicated — is covered in Compounding Analytical Intelligence.
An Invitation
ClarityX Research Institute builds investment intelligence systems designed for the way sophisticated allocators actually work — multi-asset, multi-regime, with the rigor that institutional mandates demand.
If you are evaluating how systematic signal generation should be structured to remain extensible, auditable, and ML-ready as markets evolve — I welcome the conversation.
Parson Tang
clarityxresearch@gmail.com
Parson Tang is the founder of ClarityX Research Institute and the architect of MARY, a production multi-agent investment intelligence system. He has over 20 years of experience in asset management and private banking, including roles at Goldman Sachs, J.P. Morgan, and Credit Suisse. He holds a Computer Science degree from the University of Southern California and an MBA from the University of Oxford. His analytical work is published at clarityxinstitute.com.