Abstract:Micro-blog text data is high-dimensional, bearing the obvious features of synonymy and polysemy. Traditional topic detection method based on Vector Space Model(VSM) combined with K-means has some problems such as low accuracy, complex calculation, and being difficult to determine the center of clustering. A Relevance Vector Machine(RVM) optimized VSM method is proposed to realize the text vectorization. Firstly, the dimension of VSM feature vector is reduced automatically by using the adaptive feature selection ability of RVM, and then Principal Component Analysis(PCA) is applied to determine the cluster center of K-means clustering algorithm. K-means algorithm is employed to get the clustering results. Finally, according to the number of micro-blog forwarding and comments, the topic with the largest heat index is the current hot topic. The results show that compared with two traditional methods, the accuracy of the proposed method is improved by 7.3% and 1.1%, and the real-time performance is improved by 45% and 53%, respectively.