Abstract: To address the difficulty of identifying mixed binary protocol data frames without any prior knowledge, a clustering method that jointly uses a Gaussian Mixture Model (GMM) and an auto-encoder is proposed. For captured unknown binary data frames, features are first extracted by reducing their dimensionality with a stacked auto-encoder; the optimal number of clusters is then determined according to the corresponding criteria; finally, an auto-encoder with a modified cost function is trained on the binary data frames to improve clustering accuracy. Experimental results show that the method recognizes network binary protocol data frames with an accuracy of over 94%.
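The cluster-number selection step can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper. The abstract does not name the selection criterion, so the Bayesian Information Criterion (BIC) is assumed here. The stacked auto-encoder is replaced by a hand-crafted one-dimensional feature (bit density). The frames are synthetic. All function names (`bit_density`, `fit_gmm_1d`, `select_num_clusters`) are hypothetical. The final training stage with the modified cost function is not covered.

```python
import math
import random


def bit_density(frame: bytes) -> float:
    """Fraction of set bits in a frame (assumed stand-in for the encoder output)."""
    return sum(bin(b).count("1") for b in frame) / (8 * len(frame))


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component for one sample."""
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = max(sum(joint), 1e-300)  # guard against underflow to zero
    return [p / total for p in joint]


def fit_gmm_1d(xs, k, iters=100, var_floor=1e-6):
    """Fit a k-component 1-D GMM with EM; return (weights, means, variances, log_lik)."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic initialisation: means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: soft assignment of every sample to every component.
        resp = [responsibilities(x, weights, means, variances) for x in xs]
        # M-step: re-estimate weights, means and (floored) variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, var_floor)
    log_lik = sum(
        math.log(max(sum(w * normal_pdf(x, m, v)
                         for w, m, v in zip(weights, means, variances)), 1e-300))
        for x in xs
    )
    return weights, means, variances, log_lik


def select_num_clusters(xs, k_max=4):
    """Pick the number of components with the lowest BIC (assumed criterion)."""
    bics, models = {}, {}
    for k in range(1, k_max + 1):
        w, m, v, log_lik = fit_gmm_1d(xs, k)
        n_params = 3 * k - 1  # k means + k variances + (k - 1) free weights
        bics[k] = n_params * math.log(len(xs)) - 2 * log_lik
        models[k] = (w, m, v)
    best_k = min(bics, key=bics.get)
    return best_k, bics, models[best_k]


# Synthetic frames: two made-up "protocols" with different byte ranges.
rng = random.Random(0)
frames = [bytes(rng.randint(0, 63) for _ in range(32)) for _ in range(100)]
frames += [bytes(rng.randint(192, 255) for _ in range(32)) for _ in range(100)]

features = [bit_density(f) for f in frames]
best_k, bics, (weights, means, variances) = select_num_clusters(features)

# Hard cluster label for each frame: the most probable component.
labels = []
for x in features:
    r = responsibilities(x, weights, means, variances)
    labels.append(r.index(max(r)))
```

Fitting one model per candidate cluster count and comparing a penalized likelihood is a common way to choose the number of GMM components. With a learned multi-dimensional embedding, the same loop would apply with a multivariate GMM in place of the one-dimensional one.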