Abstract: In recent years, many high-quality datasets have supported the rapid development of deep learning in the fields of computer vision, speech, and natural language processing. Nevertheless, high-quality datasets are still lacking in the field of electromagnetic signal recognition. To promote the application of deep learning to electromagnetic signal recognition, a large-scale dataset of real-world electromagnetic signals is established based on Automatic Dependent Surveillance-Broadcast (ADS-B). An automatic data collection and labeling system is designed to capture ADS-B signals in open, real-world scenes. A high-quality ADS-B signal dataset is then built by cleaning and sorting the captured signals. The performance of deep learning models on the dataset is studied, and the models are evaluated comprehensively under different signal-to-noise ratios, sampling rates, and numbers of samples. The dataset provides a valuable benchmark for researchers in this area.