Zichen Luo

Machine Learning & Language Models · Tsinghua University
Architectural Experiments and AI Tooling

NeonBench: LLM Architecture Research

NeonBench is my experimental framework for rapidly iterating on novel language-model architectures, specifically targeting sub-10M-parameter models (primarily 5.0M non-embedding parameters) to test architectural efficiency before scaling up. Below is a summary of the core concepts I've been researching:

1. Quasi-Encoder Fusion & SplitBrain Attention

The SplitBrain architecture bifurcates the standard causal attention mask. By processing a "causal stream" alongside a "lookahead stream" (which is allowed to see future tokens up to a certain sequence depth, acting as a Quasi-Encoder), the network gains deep bidirectional context within a primarily autoregressive framework.

Recent experiments tuning the lookahead ratio (e.g., 25%, 50%, 100%) against fixed parameter budgets have surfaced clear trade-offs between training speed and long-context comprehension on evaluation tasks.
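The two masks can be sketched in a few lines. This is a minimal sketch under my reading of the description above; the function name, parameter names, and the exact windowing rule are illustrative assumptions, not NeonBench's API:

```python
import numpy as np

def splitbrain_masks(seq_len: int, lookahead_ratio: float = 0.5):
    """Boolean attention masks (True = may attend) for the two streams.

    Causal stream: standard lower-triangular visibility.
    Lookahead (Quasi-Encoder) stream: additionally sees up to
    `lookahead_ratio * seq_len` future positions.
    """
    i = np.arange(seq_len)[:, None]  # query positions
    j = np.arange(seq_len)[None, :]  # key positions
    causal = j <= i
    window = int(lookahead_ratio * seq_len)
    lookahead = (j - i) <= window
    return causal, lookahead
```

At `lookahead_ratio=1.0` the lookahead stream becomes fully bidirectional, matching the 100% visibility setting explored in the ratio experiments.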

2. Learnt Intent Streaming

Learnt Intent Streaming adds an independent set of learned residual states (the "intent stream") that flows alongside the primary token-representation stream. Rather than discarding aggregate sequence information at each layer, the intent stream accumulates contextual meaning, which is then fed back into the query projections of subsequent attention layers.

I observed that adding this explicit parallel intent stream prevents representation collapse in deep networks and improves multi-token reasoning within a single forward pass.
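The mechanism for one layer can be illustrated as follows. This is my own toy sketch: the weight names, the mean-pooling used to update the intent stream, and the additive feedback into the queries are all assumptions, not NeonBench's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy model width

# Hypothetical per-layer parameters (names are mine, not NeonBench's).
W_q = rng.standard_normal((d, d)) * 0.1       # query projection
W_intent = rng.standard_normal((d, d)) * 0.1  # intent-stream update

def layer_step(x, intent):
    """One layer of the intent-stream idea.

    x:      (seq_len, d) token representations
    intent: (d,) accumulated intent state entering this layer
    """
    # Aggregate this layer's token representations into the intent
    # stream (mean-pooled here purely for illustration).
    intent = intent + x.mean(axis=0) @ W_intent
    # Queries see the token states plus the broadcast intent state,
    # so accumulated context feeds back into the next attention layer.
    q = (x + intent) @ W_q
    return q, intent
```

Because `intent` is threaded through every layer rather than recomputed, the stream accumulates context across depth instead of being discarded per layer.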

3. Continuous Ablation Studies

I enforce strict parameter-parity bounds (e.g. 5,004,528 params) across all experimental models, dynamically adjusting the d_ff dimension to compensate for any parameters added when introducing SplitBrain Lookahead, Intent Streams, or Convolutional Mixing layers.
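The parity bookkeeping can be sketched as a small solver. It assumes a plain transformer parameter count (4·d² for attention plus 2·d·d_ff for the FFN per layer, biases ignored); NeonBench's exact accounting may differ:

```python
def solve_d_ff(budget: int, d_model: int, n_layers: int,
               extra_params: int = 0) -> int:
    """Pick d_ff so the non-embedding parameter count lands as close
    to `budget` as possible without exceeding it.

    extra_params: cost of any added module (lookahead heads, intent
    stream, conv mixing) that d_ff must shrink to absorb.
    """
    per_layer_attn = 4 * d_model * d_model
    remaining = budget - extra_params - n_layers * per_layer_attn
    # Each unit of d_ff costs 2 * d_model params per layer.
    return remaining // (n_layers * 2 * d_model)
```

With an illustrative config (d_model=256, 8 layers), the solver lands within one d_ff unit of the budget, so swapping a module in or out only perturbs the total by at most 2·d_model·n_layers parameters.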

  • Phase 6: Validating deep lookahead mask structures (50% vs 100% visibility).
  • Phase 7: Combining lookahead with Learnt Intent.
  • Phase 8 & 9 (Current): Pure intent-only and convolution-only ablations to isolate which component drives the representation gains.