Experimental results on PTB, WikiText-2, and WikiText-103 show that the proposed method achieves perplexities between 20.42 and 34.11 on all problems, i.e., an average improvement of 12.0 perplexity units over state-of-the-art LSTMs.
I was a Research Staff Member at IBM Research-Almaden. I received my Ph.D. degree from Peking University, where I was supervised by Dr. Ming Zhang. I was also a visiting Ph.D. student at the University of Illinois at Urbana-Champaign under the supervision of Dr. Jiawei Han.
My research interests span machine learning, natural language understanding, and graph mining. The goal of my research is to bring better intelligence to real-world applications in people's daily lives. To achieve this goal, I am now working on machine learning with indirect supervision and its applications to deep language understanding, as well as to graph construction and representation learning, enabling machines to better understand humans.
I enjoy contributing to open source projects, including:
- GluonNLP (co-creator): An easy-to-use deep learning toolkit for NLP.
- AutoGluon (co-creator): An AutoML toolkit for deep learning.
- TextHIN (creator): A text-to-network representation and semantic parsing toolkit.
- SystemT (contributor): A declarative information extraction system.
- UniversalPropositions (contributor): A multilingual shallow semantic parsing toolkit.
- D2L.ai (contributor): An interactive deep learning book with code, math, and discussions.
- Apache MXNet (contributor): One of the major deep learning frameworks.
Selected Publications [Full List]
Crowd-in-the-loop: A hybrid approach for annotating semantic roles
Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, and Anbang Xu.
In Proc. 2017 Conf. on Empirical Methods in Natural Language Processing (EMNLP 2017).
[paper] [data] [slides]
Our experimental evaluation shows that the proposed approach reduces the workload for experts by over two-thirds, and thus significantly reduces the cost of producing SRL annotation at little loss in quality.
Text classification with heterogeneous information network kernels
Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang and Jiawei Han.
In Proc. 2016 AAAI Conf. on Artificial Intelligence (AAAI 2016).
[paper] [slides] [code] [data]
This paper presents a novel text-as-network classification framework, which introduces a structured and typed heterogeneous information network (HIN) representation of texts, and a meta-path based approach to link texts.
Incorporating world knowledge to document clustering via heterogeneous information networks
Chenguang Wang, Yangqiu Song, Ahmed El-Kishky, Dan Roth, Ming Zhang, and Jiawei Han.
In Proc. 2015 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD 2015).
[paper] [slides] [video] [code] [data]
We provide three ways to specify the world knowledge to domains by resolving the ambiguity of the entities and their types, and represent the data with world knowledge as a heterogeneous information network.
- Dec, 2019: AutoGluon has been released [code] [website] with 1.9k GitHub stars.
- Jul, 2019: Our paper “GluonCV and GluonNLP: deep learning in computer vision and natural language processing” is on arXiv [paper] [GluonNLP code] [GluonCV code].
- Jul, 2019: GluonNLP v0.7.1 has been released [code] [news].
- Apr, 2019: Our paper “Language models with Transformers” is on arXiv [paper] [code] [slides], and has received more than 4.4k blog views [in Chinese] and more than 320 Likes and Retweets on Twitter [link].
- Apr, 2019: Our tutorial “From shallow to deep language representations: pre-training, fine-tuning, and beyond” is accepted by KDD [website].
- Apr, 2019: I will serve as PC for EMNLP 2019.
- Mar, 2019: Our paper “Co-occurrent features in semantic segmentation” is accepted by CVPR [paper].
- Feb, 2019: I will serve as PC for ACL 2019.
- Jan, 2019: I will serve as PC for KDD 2019.
- Oct, 2018: I will serve as PC for ICML 2019.
- Aug, 2018: I will give a talk about GluonNLP at the MXNet Seattle meetup [video].