Generating Realistic Cyber Security Datasets for IoT networks with Diverse Complex Network Properties

Conference: Paper with proceedings at an international conference

In the cybersecurity community, finding suitable datasets for evaluating Intrusion Detection Systems (IDS) is a challenge, particularly because existing datasets offer limited diversity in complex network properties. This paper proposes a dual-purpose approach that generates diverse datasets while producing efficient, compact versions that maintain detection accuracy. Our approach employs three techniques (community mixing modification, centrality-based modification, and time-based modification), each targeting specific network property adjustments while achieving significant dataset size reductions (up to 81.5%). Our approach is validated on real-world datasets, including NF-UQ-NIDS, CCD-INID-V1, and TON-IoT, demonstrating its ability to generate realistic datasets while preserving network properties, attack patterns, and structural integrity. The generated datasets exhibit diverse complex network properties, making them particularly useful for evaluating IDS techniques that incorporate complex network measures. The reduced size and preserved accuracy (96.4%) make these datasets especially valuable for resource-constrained environments. Moreover, our approach facilitates the construction of homogeneous datasets required for federated learning settings where data distribution similarity across clients is essential. This contribution helps address both dataset scarcity and computational efficiency challenges while ensuring that the generated datasets retain the characteristics of real-world network traffic.