Diffusion-Based Generative Models for Synthetic Network Traffic Generation
Dr. Guillaume Chevalier, Prof. Jiayu Zhou
Michigan State University
Abstract
We develop a denoising diffusion probabilistic model (DDPM) for generating realistic synthetic network traffic data. The model captures complex temporal correlations, long-range dependencies, and multi-variate relationships in network KPIs. Synthetic data generated by our model passes 95% of statistical fidelity tests and, when used for training, improves downstream ML model performance by 18% in data-scarce scenarios. This enables operators to develop and test AI solutions without exposing proprietary network data.
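The forward (noising) half of a DDPM like the one described above can be sketched in a few lines. This is a minimal illustration on a toy 1-D "KPI" trace; the linear schedule, T = 1000 steps, and all names are common DDPM defaults chosen for illustration, not the paper's actual configuration.

```python
import numpy as np

# Minimal sketch of the DDPM forward (noising) process on a toy 1-D
# "network KPI" trace. Schedule and step count are common DDPM defaults,
# assumed here for illustration.

def make_beta_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule beta_1..beta_T."""
    return np.linspace(beta_start, beta_end, T)

def q_sample(x0, t, alphas_cumprod, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, eps ~ N(0, I)."""
    abar_t = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps, eps

rng = np.random.default_rng(0)
betas = make_beta_schedule()
alphas_cumprod = np.cumprod(1.0 - betas)

x0 = np.sin(np.linspace(0, 4 * np.pi, 64))   # toy periodic KPI trace
x_T, eps = q_sample(x0, t=999, alphas_cumprod=alphas_cumprod, rng=rng)
# By the final step the cumulative signal weight sqrt(abar_t) is near
# zero, so x_T is essentially pure Gaussian noise; a learned denoiser
# is trained to predict eps from (x_t, t) and invert this process.
```

Generation then runs the learned reverse process from pure noise back to a clean sample, which is what produces the synthetic traffic traces.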
AI Summary
- Diffusion model generating realistic synthetic network traffic data.
- Passes 95% of statistical fidelity tests for realism.
- 18% improvement in downstream ML performance when used for training-data augmentation.

- Enables AI development without exposing proprietary network data.
Key Findings
1. Diffusion models capture multi-variate network KPI relationships better than GANs.
2. Conditional generation enables creating data for specific network conditions.
3. Privacy analysis confirms synthetic data does not memorize individual records.
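The conditional-generation finding can be illustrated with a toy sketch of how a network condition might enter the denoiser. The concatenation scheme, one-hot scenario encoding, and all names below are assumptions for illustration; the paper's actual conditioning mechanism is not described in this summary.

```python
import numpy as np

# Hypothetical sketch of assembling a conditional denoiser input: a common
# approach is to concatenate a condition embedding (here a one-hot
# "network scenario", e.g. peak load vs. off-peak) and a timestep feature
# to the noisy sample before it enters the denoising network.

def build_denoiser_input(x_t, t, cond_id, num_conds=4, T=1000):
    t_embed = np.array([t / T])        # scalar timestep feature
    cond = np.zeros(num_conds)         # one-hot condition vector
    cond[cond_id] = 1.0
    return np.concatenate([x_t, t_embed, cond])

x_t = np.random.default_rng(1).standard_normal(64)  # noisy 64-sample trace
inp = build_denoiser_input(x_t, t=500, cond_id=2)   # 64 + 1 + 4 features
```

Training on (noisy sample, condition) pairs lets the model generate traffic matching a requested condition at sampling time.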
Industry Implications
Solves the data scarcity problem for telecom AI development.
Enables secure AI model benchmarking and sharing between organizations.
Could create a marketplace for synthetic network data.
Read the Original Paper
Access the full paper on arXiv for complete methodology, results, and references.
Related Papers
- Federated Split Learning for Privacy-Preserving AI in Multi-Operator Networks (University of Hong Kong / Imperial College London, 10 citations)
- AI-Native Air Interface Design: End-to-End Learning for 6G Physical Layer (University of Stuttgart, 41 citations)
- Digital Twin Networks: AI-Driven Real-Time Network Simulation for 6G (Oulu University / Ruhr University Bochum, 29 citations)