Data Mining Group of the Database Systems Laboratory at
Group members § Principal
investigator § Jianyong
Wang (Professor) § Current
members § Zhuo
Wang (Ph.D. student) § Jiacheng Xu (Ph.D. student) § Zhichao Duan (Ph.D. student) § Cong Han (Ph.D. student) § Zhenyu Li (Ph.D. student) § Yunfei Yang (Ph.D. student) § Bowen Dong (Ph.D. student) § Yutao Sun (Bachelor Student) § Tengyu Pan (Bachelor Student) § Former student members § Xiuxing
Li (Ph.D., 2022) § Rui Zhang (M.E., 2022) § Bowen Dong (B.E., 2022) § Zhongkai He (B.E., 2022) § Ning
Liu (Ph.D., Assistant Professor, Shandong University) § Zhenyu Li (B.E., 2021) § Jianyuan Lu (Postdoc Researcher) § Chenwei Ran (Ph.D., 2020) § Zujiang Pan (M.E., 2020) § Jiacheng Xu (B.E., 2020) § Yuanquan Lu (M.E., 2019) § Long Guo (B.E., 2019) § Yifan Li (B.E., 2019) § Gang Chen (M.E., 2018) § Pan Lu (M.E., 2018, Tsinghua University Outstanding Master's Thesis Award) § Xingzhi
Niu (B.E., 2018) § Junyi
Fu (B.E., 2018) § Jianhua
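Yin (Ph.D., 2017, Tenured Associate Professor, Shandong University) § Xinding
Wei (B.E., 2017) § Wei Zhang (Ph.D., 2016, Professor,
Beijing Outstanding Doctoral Graduate, Tsinghua University Outstanding Doctoral Dissertation Award, Zijiang Scholar at East China Normal University) § Yuda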
Zang (M.E., 2016) § Chao
Wang (M.E., 2016) § Wei
Feng (Ph.D., 2015, Beijing Outstanding Doctoral Graduate, Tsinghua University Outstanding Doctoral Dissertation Award) § Zhaoxu
Tu (M.E., 2015) § Wei
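Shen (Ph.D., 2014, Associate Professor, Chinese Association for Artificial Intelligence Outstanding Doctoral Dissertation Award, Tsinghua University Outstanding Doctoral Dissertation Award, Nankai University Revitalization Plan) § Zhenhua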
Song (M.E., 2014) § Xianjun
Zhang (M.E., 2014) § Chenwei
Ran (B.E., 2014) § Hongda
Ren (M.E., 2013) § Haijun
Xia (B.E., 2013) § Junlin
Lin (Visiting Master student from NTHU, 2013) § Lili
Jiang (Ph.D., visiting from Lanzhou Univ., 2012, Associate Professor, Umea
University, Sweden) § Xu Pu
(M.E., 2012, Tsinghua University Outstanding Master's Thesis Award) § Shuyong
Chen (M.E., 2012) § Qingyan
Yang (M.E., 2011) § ZhiJie
He (B.E., 2011) § Jun
Zhang (M.E., 2010) § Chuancong
Gao (M.E., 2010, Tsinghua University Outstanding Master's Thesis Award) § Yuzhou
Zhang (Ph.D., 2010) § Yiting
Bian (B.E., 2009) § Yan
Li (M.E., 2009) § Xiaoming
Fan (M.E., 2009) § Chun
Li (M.E., 2009, Tsinghua University Outstanding Master's Thesis Award) § Zhiping
Zeng (Ph.D., 2009) § Jing Wang
(B.E., 2008, Tsinghua University Outstanding Undergraduate Graduate; Associate
Professor, Business School, Hong Kong University of Science and Technology) § Bing
Lv (M.E., 2008) § Wei
Fu (B.E., 2007)
Current
research topics § Knowledge
graph and medical data mining: we focus on problems such as entity
disambiguation, relation extraction, entity linking, personalized recommender
systems, medical data mining, and interpretable learning models.
Representative publications include: § Lili
Jiang, Jianyong Wang, Ning An, Shengyuan Wang, Jian Zhan, Lian Li. GRAPE: A
Graph-Based Framework for Disambiguating People Appearances in Web Search. IEEE ICDM'09 (PP:199-208) § Qingyan
Yang, Ju Fan, Jianyong Wang, Lizhu Zhou. Personalizing Web Page
Recommendation via Collaborative Filtering and Topic-Aware Markov Model. IEEE ICDM'10 (PP: 1145-1150) § Xiaoming
Fan, Jianyong Wang, Xu Pu, Lizhu Zhou, Bing Lv. On Graph-based Name
Disambiguation. ACM Journal of Data and Information Quality, February 2011. (ACM JDIQ, Vol.
2, No. 2, Article 10.) § Wei
Shen, Jianyong Wang, Ping Luo, Min Wang, Conglei Yao. REACTOR: A Framework
for Semantic Relation Extraction and Tagging over Enterprise Data. WWW'11 (PP: 121-122) § Xu
Pu, Jianyong Wang, Ping Luo, Min Wang. AWETO: Efficient Incremental Update
and Querying for RDF Storage System. ACM CIKM'11
(PP: 2445–2448) § Wei Shen,
Jianyong Wang, Ping Luo, and Min Wang. LINDEN: Linking Named Entities with
Knowledge Base via Semantic Knowledge. Proc. WWW'12. (PP: 449-458) § Lili
Jiang, Jianyong Wang, Ping Luo, Ning An, Min Wang. Towards Alias Detection
Without String Similarity: an Active Learning based Approach. ACM SIGIR'12. (poster paper,
PP: 1155-1156) § Wei
Feng, Jianyong Wang. Incorporating Heterogeneous Information for Personalized
Tag Recommendation in Social Tagging Systems. ACM
SIGKDD'12. (PP: 1276-1284) § Wei Shen, Jianyong Wang, Ping Luo, Min Wang. LIEGE:
Link Entities in Web Lists with Knowledge Base.
ACM SIGKDD'12. (PP: 1424-1432) § Jun
Zhang, Xiaoming Fan, Jianyong Wang, Lizhu Zhou. Keyword-Propagation-Based
Information Enriching and Noise Removal for Web News Videos. ACM SIGKDD'12. (Industry track, PP:
561-569) § Wei Shen, Jianyong Wang, Ping Luo, Min Wang. A
Graph-Based Approach for Ontology Population with Named Entities. Proc. ACM CIKM'12. (PP: 345-354.) § Wei
Feng, Jianyong Wang. Retweet or not? Personalized Tweet Re-ranking. ACM WSDM'13. (PP: 577-586) § Wei Zhang, Wei Feng, Jianyong Wang. Integrating
Semantic Relatedness and Words’ Intrinsic Features for Keyword Extraction.
Proc. IJCAI'13. (PP:2225-2231) § Wei Shen, Jianyong Wang, Ping Luo, Min Wang. Linking
Named Entities in Tweets with Knowledge Base via User Interest Modeling.
Proc. ACM SIGKDD'13. (PP: 68-76) § Wei Zhang, Jianyong Wang, Wei Feng. Combining
Latent Factor Model with Location Features for Event-based Group
Recommendation. Proc. ACM SIGKDD'13.
(PP:910-918) § Lili
Jiang, Ping Luo, Jianyong Wang, Yuhong Xiong, Bingduan Lin, Min Wang, Ning
An. GRIAS: an Entity-Relation Graph based Framework for Discovering Entity
Aliases. Proc. IEEE ICDM'13. (PP:310-319) § Wei Feng, Jianyong Wang, Wei Zhang. We Can
Learn Your #Hashtags: Connecting Tweets to Explicit Topics. Proc. IEEE ICDE'14. (PP:856-867) § Wei Shen, Jiawei Han, Jianyong Wang. A
Probabilistic Model for Linking Named Entities in Web Text with Heterogeneous
Information Networks. Proc. ACM SIGMOD'14.
(PP:1199-1210) § Jianhua
Yin, Jianyong Wang. A Dirichlet Multinomial Mixture Model-based Approach for
Short Text Clustering. Proc. ACM SIGKDD'14.
(PP: 233-242) § Wei Shen, Jianyong Wang, Jiawei Han. Entity
Linking with a Knowledge Base: Issues, Techniques, and Solutions. IEEE TKDE (Vol. 27, No. 2, Feb. 2015, PP:
443-460). § Wei Feng, Chao Zhang, Wei Zhang, Jiawei Han, Jianyong
Wang, Charu Aggarwal, Jianbin Huang. StreamCube:
Hierarchical Spatio-temporal Hashtag Clustering for Event Exploration over
the Twitter Stream. IEEE ICDE'15 (PP: 1561-1572).
§ Wei
Zhang, Jianyong Wang. A Collective Bayesian Poisson Factorization Model for
Cold-start Local Event Recommendation. ACM
SIGKDD'15 (PP: 1455-1456). § Wei
Zhang, Jianyong Wang. A Location and Time Aware Social Collaborative
Retrieval Approach for New Successive Point-of-Interest Recommendation. ACM CIKM'15 (PP:
1221-1230). § Chenwei Ran, Wei Shen, Jianyong Wang, Xuan Zhu. Domain-specific
knowledge base enrichment using Wikipedia tables. IEEE ICDM'15. § Jianhua
Yin, Jianyong Wang. A Model-based Approach for Text Clustering with Outlier
Detection. IEEE ICDE'16. § Wei Zhang, Quan Yuan, Jiawei Han, Jianyong Wang. Collaborative
Multi-Level Embedding Learning from Reviews for Rating Prediction.
IJCAI'16. § Jianhua Yin, Jianyong Wang. A Text
Clustering Algorithm Using an Online Clustering Scheme for Initialization. ACM
SIGKDD'16. § Wei
Zhang, Jianyong Wang. Integrating Topic and Latent Factors for Scalable
Personalized Review-based Rating Prediction. IEEE
TKDE, November 2016. § Wei
Shen, Jiawei Han, Jianyong Wang, Xiaojie Yuan, Zhenglu Yang. SHINE+: A
General Framework for Domain-Specific Entity Linking with Heterogeneous
Information Networks. IEEE TKDE, February
2018. § Pan Lu, Hongsheng Li, Wei Zhang, Jianyong Wang,
Xiaogang Wang. Co-attending Free-form Regions and Detections with
Multi-modal Multiplicative Feature Embedding for Visual Question Answering. AAAI'18. § Chenwei Ran, Wei Shen, Jianyong Wang. An
Attention Factor Graph Model for Tweet Entity Linking. WWW'18. § Wei Shen, Yinan Liu, Jianyong Wang. Predicting
Named Entity Location Using Twitter. IEEE ICDE’18. § Jianhua
Yin, Daren Chao, Zhongkun Liu, Wei Zhang, Xiaohui Yu, Jianyong Wang.
Model-based Clustering of Short Text Streams. ACM SIGKDD'18. § Pan Lu, Lei Ji, Wei Zhang, Nan Duan, Ming Zhou,
Jianyong Wang. R-VQA: Learning Visual Relation Facts with Semantic
Attention for Visual Question Answering. ACM
SIGKDD'18. § Ning
Liu, Pan Lu, Wei Zhang, Jianyong Wang. Knowledge-Aware
Deep Dual Networks for Text-Based Mortality Prediction. IEEE
ICDE’19. § Zhuo Wang, Wei Zhang, Ning Liu, Jianyong Wang. Transparent
Classification with Multilayer Logical Perceptrons and Random
Binarization. Accepted to appear
in Proc. of the Thirty-Fourth AAAI Conference on Artificial Intelligence, Feb.
7-12, New York, USA. AAAI'20. § Xiuxing Li, Zhenyu Li, Zhengyan Zhang, Ning Liu, Haitao Yuan,
Wei Zhang, Zhiyuan Liu, Jianyong Wang. Effective Few-Shot Named Entity
Linking by Meta-Learning. Accepted to appear in Proceedings of the
38th IEEE International Conference on Data Engineering, Kuala Lumpur, Malaysia, May
9-12, 2022. (IEEE ICDE’22). § Zhuo Wang, Wei Zhang, Ning Liu, Jianyong
Wang. Scalable Rule-Based
Representation Learning for Interpretable Classification. Accepted to appear in Proceedings of the
Thirty-Fifth Conference on Neural Information Processing Systems (NeurIPS’21).
Past research topics § Graph data mining: we investigate problems such as
coherent subgraph mining, community detection in large networks,
graph generator mining for classification, and structural anonymization of graph
data (joint work with IBM). Representative publications include: § Jianyong Wang, Zhiping Zeng, Lizhu Zhou. CLAN:
An Algorithm for Mining Closed Cliques from Large Dense Graph Databases.
IEEE ICDE'06 (Full research paper,
Article No. 73). § Zhiping Zeng, Jianyong Wang, Lizhu Zhou, George
Karypis. Out-of-Core
Coherent Closed Quasi-Clique Mining from Large Dense Graph Databases. ACM TODS, June 2007 (Volume 32, Issue 2, Article No. 13). § Yuzhou
Zhang, Jianyong Wang, Zhiping Zeng, Lizhu Zhou. Parallel Mining of Closed
Quasi-Cliques. IEEE IPDPS'08 (Full research paper,
Article No. 2). § Yuzhou Zhang, Jianyong Wang, Yi Wang, Lizhu
Zhou. Parallel Community Detection on Large Networks with Propinquity
Dynamics. ACM SIGKDD'09 (PP:997-1005). § Zhiping Zeng, Jianyong Wang, Jun Zhang, Lizhu Zhou. FOGGER:
An Algorithm for Graph Generator Discovery. EDBT'09
(PP: 517-528). §
Chun Li, Charu Aggarwal, Jianyong Wang. On Anonymization of Multi-graphs. SIAM
SDM'11 (PP: 711-722). § Sequence data mining: we mainly study problems
such as closed sequential pattern mining, gap-constrained sequential pattern
mining, sequence generator pattern mining, summarization subsequence mining
for clustering, and sequential pattern based XML document clustering (joint work
with IBM). Representative publications include: § Jianyong Wang, Jiawei Han. BIDE: Efficient
Mining of Frequent Closed Sequences. (Most
cited paper in IEEE ICDE 2004) § Jianyong
Wang, Jiawei Han, Chun Li. Frequent Closed Sequence Mining without Candidate
Maintenance. IEEE TKDE, August 2007
(PP: 1042-1056). § Charu C. Aggarwal, Na Ta, Jianyong Wang,
Jianhua Feng, and Mohammed J. Zaki. XProj: A Framework for Projected Structural
Clustering of XML Documents. ACM SIGKDD'07 (PP: 46-55). § Chuancong Gao, Jianyong Wang, Yukai He,
Lizhu Zhou. Efficient Mining of Frequent Sequence Generators. WWW'08
(Posters track, PP: 1051-1052). § Jianyong Wang, Yuzhou Zhang, Lizhu Zhou,
George Karypis, Charu C. Aggarwal. CONTOUR: An Efficient Algorithm for
Discovering Discriminating Subsequences. Int.
J. Data Mining and Knowledge Discovery, Feb. 2009 (Vol. 18, No. 1, PP: 1-29). § Chun Li, Qingyan Yang, Jianyong Wang, Ming
Li. Efficient Mining of Gap-Constrained Subsequences and its Various
Applications. ACM Transactions on Knowledge Discovery from Data, Vol. 6,
No.1, Article No. 2, March 2012. (ACM
TKDD) § Uncertain data mining: we mainly work on problems in this topic
such as frequent pattern discovery from uncertain data (joint work with IBM),
and mining patterns for classifying uncertain data. Representative
publications include: § Charu C. Aggarwal, Yan
Li, Jianyong Wang, Jing Wang. Frequent
Pattern Mining with Uncertain Data. ACM
SIGKDD'09 (PP: 29-37). § Chuancong Gao, Jianyong Wang. Direct Mining
of Discriminative Patterns for Classifying Uncertain Data. ACM SIGKDD'10 (PP:
861-870). § Other data mining topics: we also work on problems such as stream
data mining. Representative publications include: § Charu C. Aggarwal, Jiawei Han, Jianyong Wang, Philip S. Yu. A
Framework for Clustering Evolving Data Streams. (Most cited paper in VLDB 2003) § Chuancong Gao, Jianyong Wang.
Efficient Itemset Generator Discovery Over a Stream Sliding Window. ACM CIKM'09 (PP:
355-364). § Chuancong Gao, Jianyong Wang, Qingyan Yang.
Efficient Mining of Closed Sequential Patterns on Stream Sliding Window. IEEE ICDM'11 (PP:
1044-1049).