
Mixture-of-Experts Transformers for Scalable 6G Signal Processing

Dr. Yifan Chen, Prof. Deniz Gunduz

Imperial College London

Feb 8, 2026

Abstract

We propose a Mixture-of-Experts (MoE) transformer architecture for 6G physical-layer signal processing that dynamically activates only the relevant expert sub-networks based on current channel conditions. This conditional computation approach matches the accuracy of a dense 1B-parameter model while activating only 200M parameters per inference, enabling real-time deployment on edge hardware. Experiments on OFDM channel estimation and MIMO detection demonstrate 2-3 dB gains over standard transformers at 5x lower computational cost.
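To make the conditional-computation idea concrete, the following is a minimal top-k gating sketch in NumPy. The shapes, the linear router, and the toy expert networks are illustrative assumptions, not the paper's implementation; the point is that only k of the n expert networks are evaluated per input, which is how an MoE model keeps active parameters far below total parameters.

```python
import numpy as np

def top_k_gating(router_logits, k=2):
    """Select the top-k experts and softmax-renormalize their weights.
    Generic MoE routing sketch; not the paper's exact router."""
    top_idx = np.argsort(router_logits)[::-1][:k]   # indices of the k best experts
    top_logits = router_logits[top_idx]
    weights = np.exp(top_logits - top_logits.max())
    weights /= weights.sum()                        # softmax over the selected experts
    return top_idx, weights

def moe_layer(x, experts, router_w, k=2):
    """Evaluate only k expert networks instead of all len(experts) of them."""
    logits = router_w @ x                           # one routing logit per expert
    idx, w = top_k_gating(logits, k)
    return sum(w_i * experts[i](x) for i, w_i in zip(idx, w))

# Hypothetical setup: 8 linear "experts" over an 8-dim feature vector.
rng = np.random.default_rng(0)
d, n_experts = 8, 8
experts = [(lambda W: (lambda x: W @ x))(rng.standard_normal((d, d)))
           for _ in range(n_experts)]
router_w = rng.standard_normal((n_experts, d))
y = moe_layer(rng.standard_normal(d), experts, router_w, k=2)
```

With k=2 of 8 experts active, the per-inference compute is roughly a quarter of the dense equivalent, mirroring the 200M-active-of-1B-total ratio described in the abstract.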

AI-Generated Summary
  • MoE transformer architecture for 6G PHY that conditionally activates expert sub-networks.
  • Achieves 1B-model accuracy with only 200M active parameters per inference.
  • 2-3 dB gains over standard transformers at 5x lower computational cost.
  • Validated on OFDM channel estimation and MIMO detection tasks.

Key Findings

  • Channel-condition-based routing outperforms random or load-balanced expert selection.
  • Sparse activation enables deployment on base station controllers with limited GPU memory.
  • MoE models generalize better across diverse deployment scenarios than dense models.
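The first finding, routing on channel conditions rather than on a generic load-balancing objective, can be illustrated with a rule-based stand-in: bucket the measured SNR and send each regime to a specialist expert. The SNR edges, the smoothing coefficients, and the toy shrinkage "denoiser" below are all hypothetical; the paper's router is learned, not hand-crafted.

```python
import numpy as np

def snr_conditioned_router(snr_db, edges=(0.0, 10.0, 20.0)):
    """Map a channel-quality measurement (SNR in dB) to an expert index.
    Simplified rule-based stand-in for a learned channel-aware router:
    each expert specializes in one SNR regime."""
    return int(np.searchsorted(edges, snr_db))   # 0 .. len(edges)

# Hypothetical per-regime experts: heavier shrinkage at lower SNR.
smoothing = [0.9, 0.6, 0.3, 0.1]                 # one coefficient per expert

def denoise(obs, snr_db):
    """Route to one expert and apply its (toy) shrinkage estimator."""
    alpha = smoothing[snr_conditioned_router(snr_db)]
    return (1 - alpha) * obs
```

Because each input touches exactly one expert, peak memory traffic per inference is bounded by the largest single expert rather than the full model, which is the mechanism behind the second finding about fitting into limited GPU memory.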

Industry Implications

Enables practical deployment of large AI models at the network edge for 6G.

Reduces the compute cost barrier for AI-native air interfaces.

Provides a scalable architecture that grows with network complexity.

MoE · Transformer · Signal Processing · Edge AI

Read the Original Paper

Access the full paper on arXiv for complete methodology, results, and references.

