
Text classification on heterogeneous information network via enhanced GCN and knowledge

Document type: Foreign-language journal

Authors: 李慧; 阎琰; 王烁; 刘娟; 崔运鹏

Author affiliations:

Keywords: Text classification; Graph convolutional networks; Knowledge graph; Heterogeneous information network; Pre-trained model

Journal: NEURAL COMPUTING & APPLICATIONS

ISSN: 0941-0643

Year, volume, issue: 2023

Pages:

Indexed in: SCIE (2023 edition); EI (2023 edition)

Abstract: Graph convolutional network (GCN)-based text classification methods have improved classification results by considering the structural relationships between words and texts. However, existing GCN-based methods tend to ignore the semantic representation of nodes and the global structural information among nodes. Moreover, they represent a text using only the word-granularity information within the text, i.e., the endogenous source. Furthermore, existing graph convolutional network approaches face major challenges in handling large and dense graphs, namely neighbor explosion and noisy inputs. To address these shortcomings, this paper proposes an inductive learning-based text classification method that utilizes representation learning on heterogeneous information networks and exogenous knowledge. First, a weighted heterogeneous information network for text (HINT) is constructed by introducing exogenous knowledge, in which the node types cover texts, entities, and words. The unstructured text is thus represented as a structured heterogeneous information network, which expands the granularity of text features and makes full use of the exogenous structural information and explicit semantic information to enhance the interpretability of text information. In addition, the graph neural network is enhanced against the neighbor explosion and noisy inputs derived from HINT using two strategies, graph sampling and DropEdge, for semi-supervised learning with improved classification performance. The effectiveness of the model is demonstrated on four publicly available text classification datasets. Based on the experimental results, the approach achieves state-of-the-art performance on these datasets.
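
The DropEdge strategy named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `drop_edge`, the edge-tuple layout, and the toy text/word/entity graph are all hypothetical, chosen only to show the general idea of randomly removing a fixed fraction of edges from a weighted heterogeneous graph before a training pass.

```python
import random


def drop_edge(edges, drop_rate, rng=None):
    """Return a random subset of `edges` with about `drop_rate` of them removed.

    `edges` is a list of (source, target, weight) tuples. This layout is an
    assumption made for illustration, not a format taken from the paper.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    if rng is None:
        rng = random.Random()
    # Number of edges that survive this round of dropping.
    keep_count = len(edges) - int(len(edges) * drop_rate)
    # Sample without replacement so no edge is duplicated.
    return rng.sample(edges, keep_count)


# Hypothetical toy graph with three node types: text, word, entity.
toy_edges = [
    (("text", 0), ("word", "graph"), 0.8),
    (("text", 0), ("entity", "GCN"), 0.6),
    (("text", 1), ("word", "graph"), 0.5),
    (("word", "graph"), ("entity", "GCN"), 0.9),
]

# In training, a fresh subset would typically be drawn for every epoch.
kept = drop_edge(toy_edges, drop_rate=0.5, rng=random.Random(0))
print(len(kept), "of", len(toy_edges), "edges kept")
```

Drawing a new subset each epoch means the network sees a slightly different, sparser graph every time, which is the usual motivation given for this kind of edge dropping on dense, noisy graphs.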

Classification number:
