Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
Pages
Posts
portfolio
Incorporating world knowledge to document clustering via heterogeneous information networks
Chenguang Wang, Yangqiu Song, Ahmed El-Kishky, Dan Roth, Ming Zhang, and Jiawei Han.
In Proc. 2015 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD 2015).
[paper] [slides] [video] [code] [data]

We provide three ways to specify the world knowledge to domains by resolving the ambiguity of the entities and their types, and represent the data with world knowledge as a heterogeneous information network.
Text classification with heterogeneous information network kernels
Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, and Jiawei Han.
In Proc. 2016 AAAI Conf. on Artificial Intelligence (AAAI 2016).
[paper] [slides] [code] [data]

This paper presents a novel text-as-network classification framework, which introduces a structured and typed heterogeneous information network (HIN) representation of texts and a meta-path-based approach to link texts.
Crowd-in-the-loop: A hybrid approach for annotating semantic roles
Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, and Anbang Xu.
In Proc. 2017 Conf. on Empirical Methods in Natural Language Processing (EMNLP 2017).
[paper] [data] [slides]

Our experimental evaluation shows that the proposed approach reduces the workload for experts by over two-thirds, and thus significantly reduces the cost of producing SRL annotation at little loss in quality.
Language models with Transformers
Chenguang Wang, et al.
In arXiv preprint arXiv:1904.09408 (arXiv 2019).
[paper] [code] [slides]

Received more than 4.4k blog views and more than 320 Likes and Retweets on Twitter. Experimental results on PTB, WikiText-2, and WikiText-103 show that the proposed method achieves perplexities between 20.42 and 34.11 on all problems, i.e., an average improvement of 12.0 perplexity units compared to state-of-the-art LSTMs.
Language Models are Open Knowledge Graphs
Chenguang Wang, Xiao Liu, and Dawn Song.
In arXiv preprint arXiv:2010.11967 (arXiv 2020).
[paper] [slides]

What's the relationship between deep language models (e.g., BERT, GPT-2, GPT-3) and knowledge graphs? Can we use the pre-trained deep language models to construct knowledge graphs? We find that we can construct knowledge graphs from the pre-trained language models. The generated knowledge graphs not only cover the knowledge already in existing knowledge graphs, such as Wikidata, but also feature open factual knowledge that is new.
publications
Future-related future prediction system by query subtopic analysis based on Chinese news web pages
Chenguang Wang, Chongwen Wang, and Jie Bing.
In Proc. 2010 Sciencepaper Online (in Chinese).
[paper]
Study of cloud computing security based on private face recognition
Chenguang Wang and Huaizhi Yan.
In Proc. 2010 Int. Conf. on IEEE Computational Intelligence and Software Engineering (CiSE 2010).
[paper]
ENGtube: an integrated subtitle environment for ESL
Chi-Ho Li, Shujie Liu, Chenguang Wang, and Ming Zhou.
In MT Summit XIII: the Thirteenth Machine Translation Summit (MTSummit 2011).
[paper]
Paraphrasing adaptation for web search ranking
Chenguang Wang, Nan Duan, Ming Zhou, and Ming Zhang.
In Proc. 2013 Annual Meeting of the Association for Computational Linguistics (ACL 2013).
[paper] [slides]
Measuring domain influence in heterogeneous networks
Quan Liu, Chenguang Wang, and Ming Zhang.
In Proc. 2014 ACM Int. Conf. on Web Search and Data Mining Workshop on Diffusion Networks and Cascade Analytics (WSDM 2014 Workshop).
[paper]
Spectral label refinement for noisy and missing text labels
Yangqiu Song, Chenguang Wang, Ming Zhang, Hailong Sun, and Qiang Yang.
In Proc. 2015 AAAI Conf. on Artificial Intelligence (AAAI 2015).
[paper]
Incorporating world knowledge to document clustering via heterogeneous information networks
Chenguang Wang, Yangqiu Song, Ahmed El-Kishky, Dan Roth, Ming Zhang, and Jiawei Han.
In Proc. 2015 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD 2015).
[paper] [slides] [video] [code] [data]
Constrained information-theoretic tripartite graph clustering to identify semantically similar relations
Chenguang Wang, Yangqiu Song, Dan Roth, Chi Wang, Jiawei Han, Heng Ji, and Ming Zhang.
In Proc. 2015 Int. Joint Conf. on Artificial Intelligence (IJCAI 2015).
[paper] [slides]
KnowSim: A document similarity measure on structured heterogeneous information networks
Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, and Jiawei Han.
In Proc. of 2015 IEEE Int. Conf. on Data Mining (ICDM 2015).
[paper] [slides] [code] [data]
Text classification with heterogeneous information network kernels
Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, and Jiawei Han.
In Proc. 2016 AAAI Conf. on Artificial Intelligence (AAAI 2016).
[paper] [slides] [code] [data]
RelSim: Relation similarity search in schema-rich heterogeneous information networks
Chenguang Wang, Yizhou Sun, Yanglei Song, Jiawei Han, Yangqiu Song, Lidan Wang, and Ming Zhang.
In Proc. 2016 SIAM Int. Conf. on Data Mining (SDM 2016).
[paper] [slides]
World knowledge as indirect supervision for document clustering
Chenguang Wang, Yangqiu Song, Ahmed El-Kishky, Dan Roth, Ming Zhang, and Jiawei Han.
In ACM Transactions on Knowledge Discovery from Data (TKDD 2016).
[paper] [data]
HINE: Heterogeneous information network embedding
Yuxin Chen and Chenguang Wang.
In Proc. 2017 Int. Conf. on Database Systems for Advanced Applications (DASFAA 2017).
[paper]
Towards re-defining relation understanding in financial domain
Chenguang Wang, Doug Burdick, Laura Chiticariu, Rajasekar Krishnamurthy, Yunyao Li, and Huaiyu Zhu.
In Proc. of 2017 ACM SIGMOD Int. Conf. on Management of Data Workshop (SIGMOD 2017 Workshop).
[paper] [slides] [video]
Semi-supervised learning over heterogeneous information networks by ensemble of meta-graph guided random walks
He Jiang, Yangqiu Song, Chenguang Wang, Ming Zhang, and Yizhou Sun.
In Proc. 2017 Int. Joint Conf. on Artificial Intelligence (IJCAI 2017).
[paper] [code]
Active learning for black-box semantic role labeling with neural factors
Chenguang Wang, Laura Chiticariu, and Yunyao Li.
In Proc. 2017 Int. Joint Conf. on Artificial Intelligence (IJCAI 2017).
[paper] [data] [slides]
Crowd-in-the-loop: A hybrid approach for annotating semantic roles
Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, and Anbang Xu.
In Proc. 2017 Conf. on Empirical Methods in Natural Language Processing (EMNLP 2017).
[paper] [data] [slides]
Distant meta-path similarities for text-based heterogeneous information networks
Chenguang Wang, Yangqiu Song, Haoran Li, Yizhou Sun, Ming Zhang, and Jiawei Han.
In Proc. 2017 ACM Int. Conf. on Information and Knowledge Management (CIKM 2017).
[paper] [data] [slides]
Unsupervised meta-path selection for similarity measure on heterogeneous information networks
Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, and Jiawei Han.
In Proc. 2018 Data Mining and Knowledge Discovery (DMKD 2018).
[paper] [code] [data]
Co-occurrent features in semantic segmentation
Hang Zhang, Han Zhang, Chenguang Wang, and Junyuan Xie.
In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 2019).
[paper]
From shallow to deep language representations: pre-training, fine-tuning, and beyond
Sheng Zha, Aston Zhang, Haibin Lin, Chenguang Wang, Mu Li, and Alexander Smola.
In Proc. 2019 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD 2019).
[paper] [code]
Language models with Transformers
Chenguang Wang, et al.
In arXiv preprint arXiv:1904.09408 (arXiv 2019).
[paper] [code] [slides]
GluonCV and GluonNLP: deep learning in computer vision and natural language processing
Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, and Shuai Zheng.
In Journal of Machine Learning Research (JMLR 2020).
[paper] [code]
Transformer on a Diet
Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, and Alexander Smola.
In arXiv preprint arXiv:2002.06170 (arXiv 2020).
[paper] [code]
PoD: Positional Dependency-Based Word Embedding for Aspect Term Extraction
Yichun Yin, Chenguang Wang, and Ming Zhang.
In Proc. 2020 Int. Conf. on Computational Linguistics (COLING 2020).
[paper]
Language Models are Open Knowledge Graphs
Chenguang Wang, Xiao Liu, and Dawn Song.
In arXiv preprint arXiv:2010.11967 (arXiv 2020).
[paper] [slides]