Chinese keywords extraction algorithm based on frequent pattern mining
    Abstract:

    A keyword extraction algorithm for Chinese documents based on frequent pattern mining is proposed to address two problems of existing keyword extraction algorithms (KEA): high computational complexity and the capture of only shallow semantic information. The algorithm uses an improved FP-Growth technique to extract word co-occurrence information and remove noisy words. It then applies a semantic similarity algorithm to eliminate synonyms and simplify the candidate features, which reduces storage space and computation while maintaining high precision and recall. Experimental results show that the average F value on the corpus reaches 59.7%, which is higher than that of classical algorithms, and that the support threshold is a key influencing factor.
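
    The role of the support threshold can be illustrated with a minimal sketch. It is not the authors' implementation: brute-force counting stands in for the improved FP-Growth step, the synonym-elimination step is omitted, and the function names `frequent_cooccurrences` and `f_measure` as well as the sample sentences are hypothetical. Word segmentation is assumed to have been done upstream.

```python
from collections import Counter
from itertools import combinations


def frequent_cooccurrences(sentences, min_support):
    """Return word sets of size 1 and 2 whose sentence-level support
    is at least min_support.

    Assumption: brute-force counting replaces FP-Growth here. It yields
    the same frequent 1- and 2-itemsets but scales worse.
    """
    counts = Counter()
    for words in sentences:
        unique = sorted(set(words))  # each sentence is one transaction
        counts.update((w,) for w in unique)
        counts.update(combinations(unique, 2))
    n = len(sentences)
    return {items: c / n for items, c in counts.items() if c / n >= min_support}


def f_measure(extracted, reference):
    """Harmonic mean of precision and recall for two keyword collections."""
    extracted, reference = set(extracted), set(reference)
    hits = len(extracted & reference)
    if hits == 0:
        return 0.0
    precision = hits / len(extracted)
    recall = hits / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical pre-segmented sentences, used only for illustration.
sentences = [
    ["关键词", "提取", "算法"],
    ["频繁", "模式", "挖掘", "算法"],
    ["关键词", "提取", "频繁", "模式"],
    ["关键词", "提取"],
]
patterns = frequent_cooccurrences(sentences, min_support=0.5)
for items, support in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print(items, support)
```

    Raising `min_support` prunes rare words and pairs (here the word 挖掘, which occurs in only one sentence, is dropped as noise), while lowering it admits more candidates at a higher cost. This is consistent with the abstract's observation that the threshold strongly affects the result.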

Get Citation

Cui Chengyu (崔诚煜), Ran Xiaomin (冉晓旻). Chinese keywords extraction algorithm based on frequent pattern mining [J]. Journal of Terahertz Science and Electronic Information Technology, 2015, 13(2): 279-284

History
  • Received: April 14, 2014
  • Revised: May 16, 2014
  • Online: May 12, 2015