Data Mining and Information Systems Lab
PI: Prof. Jaewoo Kang, Dept. of Computer Science and Engineering, Korea Univ.
(Laboratory of Prof. Jaewoo Kang, Department of Computer Science, Korea University)
Data science has advanced to the point where it is changing our world. It is now at the center of exploring and uncovering knowledge in different domains and acts as a bridge to connect them. With an ever-growing amount of data and opportunities to explore, the DMIS Lab aims to drive the data science revolution.
The DMIS (Data Mining and Information Systems) Lab seeks to develop explainable AI in the following areas: Drug Discovery, Bioinformatics Analysis, Biomedical Image Processing, Recommender Systems, Question Answering, Search, Financial Data Analysis, and much more. We focus on finding models, algorithms, and systems for any kind of data analysis, with applications in prediction, knowledge discovery, representation learning, and anomaly detection. The DMIS Lab also participates in various data science competitions, such as the DREAM Challenges, to solve difficult real-world problems and facilitate knowledge sharing with other research teams around the world.
Sep. 2020: Jungsoo Park's paper, Adversarial Subword Regularization for Robust Neural Machine Translation, was accepted to Findings of ACL: EMNLP 2020, an anthology journal of ACL, which is one of the top-tier conferences for computational linguistics.
Sep. 2020: Miyoung Ko's paper, Look at the First Sentence: Position Bias in Question Answering, was accepted to EMNLP 2020, one of the most renowned conferences for NLP-related publications!
Sep. 2020: BioBERT: a pre-trained biomedical language representation model for biomedical text mining, co-first authored by Dr. Jinhyuk Lee and Wonjin Yoon, has been ranked as one of the most read papers in Bioinformatics, which is one of the top-tier journals in the domain.
Also, BioBERT was included in the Best Papers for the Natural Language Processing Section of the 2020 IMIA (International Medical Informatics Association) Yearbook (link). Congratulations once again to the authors for this grand achievement!
Jul. 2020: Enhancing the interpretability of transcription factor binding site prediction using attention mechanism, co-first authored by Sungjoon Park, Yookyung Koh, and Hwisang Jeon, was accepted to Scientific Reports. Congratulations!
Apr. 2020: MAPS: Multi-Agent reinforcement learning-based Portfolio management System, co-first authored by Jinho Lee and Raehyun Kim, was accepted to IJCAI 2020, one of the top conferences for general AI.
MAPS is a hedge-fund-like portfolio management system trained with cooperative multi-agent reinforcement learning.
It is inspired by the fact that a hedge fund's entire portfolio is managed by multiple investors working together to maximize risk-adjusted return.
Apr. 2020: Congratulations to Sunkyu Kim on the publication in Cell Systems!
Sunkyu Kim's team won 1st place in the NCI-CPTAC DREAM Proteogenomics Challenge in 2017 (outperforming UCLA (3rd) and Stanford (13th)).
Assessment of the Limits of Predictability of Protein and Phosphorylation Levels in Cancer is the paper for the DREAM challenge, written in collaboration with Heidelberg University, Icahn School of Medicine, and New York University.
Apr. 2020: Sunkyu Kim's paper, Improved survival analysis by learning shared genomic information from pan-cancer data, was accepted to ISMB 2020, a top conference in bioinformatics.
Two papers were accepted to ACL 2020, one of the top conferences in NLP.
Dr. Jinhyuk Lee's paper, Contextualized Sparse Representations for Real-Time Open-Domain Question Answering, was accepted to ACL 2020.
Mujeen Sung's paper, Biomedical Entity Representations with Synonym Marginalization, was accepted to ACL 2020.
Jan. 2020: Wonjin Yoon received the NAVER Ph.D. Fellowship Award for outstanding performance in his research area.
Nov. 2019: Congratulations! Our DMIS team (Sungjoon Park, Minji Jeon, Sunkyu Kim, Junhyun Lee, Seongjun Yun, Bumsoo Kim, Buru Chang) was selected as one of the top performers in the IDG-DREAM Drug-Kinase Binding Prediction Challenge. As one of the best performers, we presented our model at the RSG with DREAM Conference in New York in November. (Link)
Sep. 2019: Congratulations! The DMIS team outperformed the Google team and won 1st place at the BioASQ challenge, a challenge on large-scale biomedical semantic indexing and question answering.
By using BioBERT, our team (Wonjin Yoon, Jinhyuk Lee, Donghyeon Kim, Minbyul Jeong) produced outstanding results for all 5 test batches on BioASQ Task 7B-Phase B (challenge results - http://bioasq.org/participate/seventh-challenge-winners ).
Sep. 2019: Congratulations to Dr. Jinhyuk Lee and Wonjin Yoon on the publication of BioBERT in Bioinformatics!
BioBERT: a pre-trained biomedical language representation model for biomedical text mining is the first biomedical language representation model pre-trained on a large-scale biomedical corpus, and it achieves state-of-the-art performance on various biomedical NLP tasks. (paper, code)
With BioBERT, the DMIS team won 1st place at the BioASQ challenge.
Sep. 2019: Seongjun Yun's paper, Graph Transformer Networks, was accepted to Advances in Neural Information Processing Systems (NeurIPS 2019), one of the top-tier conferences in machine learning alongside ICML.
Aug. 2019: Donghyeon Park's paper, KitcheNette: Prediction and Ranking Food Ingredient Pairings based on Siamese Neural Network, was accepted to IJCAI 2019, one of the top-tier conferences for general AI.
May 2019: Real-Time Open-Domain Question Answering on Wikipedia with Dense-Sparse Phrase Index, co-first authored by Jinhyuk Lee, was accepted to ACL 2019, the top conference in computational linguistics and natural language processing.
May 2019: ReSimNet: Drug Response Similarity Prediction using Siamese Neural Networks, co-first authored by Minji Jeon and Donghyeon Park, has been accepted to Bioinformatics, one of the top journals for computational biology.
ReSimNet measures the transcriptional response similarity of two chemical compounds, and the team achieved first place in the Multi-targeting Drug DREAM Challenge with this model (outperforming Janssen Pharmaceutica).
Apr. 2019: Self-Attention Graph Pooling, co-first authored by Junhyun Lee and Inyeop Lee, has been accepted to ICML 2019, the top conference in machine learning.
Apr. 2019: SAIN: Self-Attentive Integration Network for Recommendation, co-first authored by Seoungjun Yun and Raehyun Kim, was accepted to SIGIR 2019, the best conference in Information Retrieval.
Apr. 2019: Congratulations to Dr. Minji Jeon for her first Nature series publication! (accepted to Nature Communications)
Previously, Dr. Minji Jeon's team won 2nd place in the AstraZeneca Sanger Drug Synergy Prediction DREAM challenge (outperforming Stanford (6th) and MIT (11th)).
Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen is an overview paper for the DREAM challenge and is coauthored by the top-performing teams and organizers from AstraZeneca-Sanger. (bioRxiv)
Dec. 2018: Our DMIS team (Minji Jeon, Donghyeon Park, Jinhyuk Lee, Hwisang Jeon, Miyoung Ko, Sunkyu Kim, Yonghwa Choi) won 1st place in the Multi-targeting Drug DREAM Challenge. The team outperformed multinational pharmaceutical firms such as Janssen Pharmaceutica. (Link1, Link2)
Dec. 2018: Predicting Multiple Demographic Attributes with Task Specific Embedding Transformation and Attention Network, co-first authored by Raehyun Kim and Hyunjae Kim, was accepted as a full paper to SDM19, one of the top-tier conferences in data mining.
Nov. 2018: Buru Chang received the NAVER Ph.D. Fellowship Award for the stellar performance shown in his papers.
Aug. 2018: Jinhyuk Lee's paper, Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering, was accepted to EMNLP 2018, one of the most renowned conferences in the NLP field.
Aug. 2018: Learning User Preferences and Understanding Calendar Contexts for Event Scheduling (co-first authored by Donghyeon Kim and Jinhyuk Lee) was accepted to CIKM 2018, one of the top-tier international conferences in the database, data mining, and information retrieval fields, with a 17% acceptance rate.
Jul. 2018: Buru Chang's paper, Content-Aware Point-of-Interest Embedding Model for Successive POI Recommendation, was accepted to IJCAI 2018, one of the top-tier conferences for general AI.
Nov. 2017: Our DMIS team (Sunkyu Kim, Heewon Lee, Keonwoo Kim, Hwisang Jeon, Minji Jeon, Yonghwa Choi, Daehan Kim) was named the best performer in the NCI-CPTAC DREAM Proteogenomics Challenge, sponsored by the National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium (CPTAC). This was the very first time that a Korean team won the Challenge. (Link)
UCLA: 3rd place
Stanford University: 13th place
Aug. 2017: Jinhyuk Lee's paper, Name Nationality Classification with Recurrent Neural Network, was accepted to IJCAI 2017, one of the top-tier conferences for general AI.
Apr. 2017: Constructing and Evaluating a Novel Crowdsourcing-based Paraphrased Opinion Spam Dataset, co-first authored by Seongsoon Kim and Seongwoon Lee, has been accepted to WWW 2017, one of the top conferences for the web.
Oct. 2016: Among 42 teams from different parts of the world, our DMIS team ranked 2nd in the Disease Module Identification DREAM Challenge: Discover disease pathways in genomic networks. The goal is to systematically assess module identification methods on a panel of state-of-the-art genomic networks and to discover novel network pathways.
Oct. 2016: We participated in the Disease Module Identification DREAM Challenge: Discover disease pathways in genomic networks, which aims to discover disease-related modules in biological networks, and tied for 2nd place overall among 42 teams!
Mar. 2016: Our DMIS team won 2nd place at the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge, which is designed to predict synergistic drug combinations and to identify associated biomarkers. The challenge was hosted by AstraZeneca, one of the top 10 pharmaceutical companies in the world. (Link)
Stanford University: 6th place
MIT: 11th place