Vision Transformers for RF Spectrum Monitoring and Classification
Dr. Tim O'Shea, Dr. Nathan West
DeepSig Inc.
Abstract
We apply Vision Transformers (ViT) to RF spectrum monitoring by treating spectrograms as images. Our ViT-RF model classifies 24 modulation types with 98.5% accuracy at 10 dB SNR, outperforming CNN baselines by 3.2%. The attention mechanism provides interpretable visualizations of which time-frequency regions drive classification decisions. The model runs in 2 ms per spectrogram on edge GPU hardware, enabling real-time spectrum monitoring for 6G cognitive radio applications.
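The paper's exact architecture and STFT settings are not given in this summary. As a minimal sketch of the "spectrogram as image" front end, the following turns a complex IQ capture into a log-power spectrogram and splits it into the flattened patch tokens a ViT consumes; the 64-point FFT, 64-sample hop, and 8x8 patch size are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def spectrogram(iq, n_fft=64, hop=64):
    """Log-power STFT magnitude of a 1-D complex baseband signal (assumed params)."""
    n_frames = (len(iq) - n_fft) // hop + 1
    frames = np.stack([iq[i * hop:i * hop + n_fft] for i in range(n_frames)])
    spec = np.fft.fftshift(np.fft.fft(frames * np.hanning(n_fft), axis=1), axes=1)
    return 20 * np.log10(np.abs(spec) + 1e-9)  # shape: (time, freq)

def to_patches(img, p=8):
    """Split an (H, W) spectrogram into flattened p x p patch tokens, ViT-style."""
    h, w = img.shape
    img = img[: h - h % p, : w - w % p]  # crop to a multiple of the patch size
    return img.reshape(h // p, p, w // p, p).swapaxes(1, 2).reshape(-1, p * p)

# Toy example: a QPSK-like random baseband burst (hypothetical input).
rng = np.random.default_rng(0)
iq = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], 4096).astype(np.complex64)
spec = spectrogram(iq)     # 4096 samples -> a (64, 64) time-frequency image
tokens = to_patches(spec)  # (64, 64) image -> 64 patch tokens of 64 values each
```

From here a standard ViT would linearly project each token, prepend a [CLS] token, and classify from its final embedding.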
AI Summary
- Vision Transformers applied to RF spectrogram classification.
- 98.5% accuracy on 24 modulation types at 10 dB SNR, 3.2% above CNN baselines.
- Interpretable attention maps show which time-frequency features drive decisions.
- Real-time inference at 2 ms per spectrogram on an edge GPU.
Key Findings
1. ViT's global attention captures inter-carrier relationships that CNNs miss.
2. Pre-training on synthetic spectrograms improves real-world performance by 5%.
3. Attention visualization aids in understanding and debugging classification errors.
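The summary does not say how the attention maps are computed. One common recipe for turning per-layer ViT attention into a single time-frequency saliency map is attention rollout (Abnar & Zuidema, 2020); the sketch below applies it to toy attention matrices, and the paper's actual visualization method may differ.

```python
import numpy as np

def attention_rollout(attn_layers):
    """Combine per-layer (tokens, tokens) row-stochastic attention matrices
    (already averaged over heads) into cumulative token-to-token attention."""
    n = attn_layers[0].shape[0]
    rollout = np.eye(n)
    for a in attn_layers:
        a = a + np.eye(n)                       # account for the residual path
        a = a / a.sum(axis=1, keepdims=True)    # re-normalize rows
        rollout = a @ rollout
    return rollout

# Toy example: 3 layers over 1 [CLS] token + 64 patch tokens (hypothetical sizes).
rng = np.random.default_rng(0)
layers = []
for _ in range(3):
    a = rng.random((65, 65))
    layers.append(a / a.sum(axis=1, keepdims=True))  # make rows sum to 1

rollout = attention_rollout(layers)
# [CLS] attention over the 8x8 patch grid: the saliency map one would overlay
# on the spectrogram to see which time-frequency regions drove the decision.
cls_saliency = rollout[0, 1:].reshape(8, 8)
```

In practice the `layers` list would come from the trained model's attention weights rather than random matrices.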
Industry Implications
Enables AI-driven spectrum management for 6G dynamic spectrum access.
Supports regulatory enforcement and unauthorized transmitter detection.
Interpretable AI builds trust for spectrum management decisions.
Read the Original Paper
Access the full paper on arXiv for complete methodology, results, and references.
Related Papers
- Multi-Agent Deep Reinforcement Learning for Dynamic Spectrum Access (University of Houston, 15 citations)
- Transformer-Based Channel Estimation for Massive MIMO Systems (Tsinghua University, 12 citations)
- Federated Reinforcement Learning for Distributed Network Optimization (Stanford University, 8 citations)