
CLASSNET: Community Labeling and Sharing of Security and Networking Test datasets

The Community Labeling and Sharing of Security and Networking Test datasets (CLASSNET) project will provide new, labeled, rich, and diverse datasets to the research community to support network and security research. The project will develop a framework for collaborative, community-driven enrichment and labeling of data, enabling use of these datasets for machine learning (ML) in networking and security. Furthermore, the CLASSNET project will make data available to researchers through multiple methods, preserving the privacy of the data while enabling flexible computation on it. The project will also generate diverse continuous (constantly and automatically updated) and curated (selected by humans) datasets for research use.

The CLASSNET project will innovate along three dimensions: data labeling, data distribution, and data sources. In data labeling, the CLASSNET collaborative framework will provide a low-friction way to share annotations among researchers. The framework will incentivize labeling with feedback mechanisms and user credits, and will support bulk, automatic, algorithmic labeling. In data distribution, CLASSNET will support multiple ways of accessing data, ranging from downloading anonymized data to processing data in the cloud, on provider machines, or via the code-to-data approach. Finally, CLASSNET data sources will provide new, diverse, continuous, and curated datasets that are useful for network and security research, including traffic packets and flows, network telescope data, Domain Name System (DNS) data, and Internet topology data.

The immediate impact of this project will include new types of labeled, curated, and continuous datasets that enable new security, networking, and ML research and education, reaching a large community. The broader impact of this data will be to foster research and education that make the Internet safer, more stable, and more secure, and that increase the community’s knowledge about the Internet. Given the Internet’s importance for tele-work, tele-medicine, remote learning, e-commerce, and e-government, these improvements will have a broad societal impact. In addition, CLASSNET datasets will support data-driven exercises for graduate and undergraduate education, and new PhD research. The CLASSNET project’s innovations in multiple pathways to data access, combined with the automated and incentivized enrichment framework, will improve the state of the art for responsible data sharing in related disciplines of information technology.

Data from CLASSNET will be made available to researchers at no cost and will be used to support education and research. Datasets can be requested by visiting: https://comunda.isi.edu/

Support: CLASSNET is supported by NSF/CISE under NSF grant CRI-8115780. CLASSNET is a joint effort of USC/ISI and Merit Network, Inc. See also: Community Labeling and Sharing of Security and Networking Test datasets (CLASSNET)

Publications:

Labeling Network Telescope Data: Challenges and New Directions
By Michalis Kallitsis, DINR 2023. Presentation slides

Detecting and Interpreting Changes in Scanning Behavior in Large Network Telescopes
By Michalis Kallitsis, Rupesh Prajapati, Vasant Honavar, Dinghao Wu, John Yen, IEEE Transactions on Information Forensics and Security, October 2022.

AMON-SENSS: Scalable and Accurate Detection of Volumetric DDoS Attacks at ISPs
By Rajat Tandon, Pithayuth Charnsethikul, Michalis Kallitsis, Jelena Mirkovic, GLOBECOM 2022 (2022 IEEE Global Communications Conference).

Collecting, Labeling, and Using Networking Data: the Intersection of AI and Networking
By John Heidemann, Jelena Mirkovic, Wes Hardaker, and Michalis Kallitsis, NSF Workshop on AI for Networking, Virtual Event, October 2021.