DMIS LAB

Data Mining and Information Systems Lab

PI: Prof. Jaewoo Kang, Dept. of Computer Science and Engineering, Korea Univ.

(Lab of Prof. Jaewoo Kang, Department of Computer Science and Engineering, Korea University)

About

Data science has advanced to the point where it is changing our world. It is now central to exploring and uncovering knowledge in different domains and acts as a bridge connecting them. With an ever-growing amount of data and opportunities to explore it, the Data Mining and Information Systems (DMIS) lab aims to drive the data science revolution.

The main focus of DMIS Lab is to leverage AI and machine learning (ML) to solve problems in bioinformatics, drug discovery, and biomedical text mining. To diversify and strengthen its arsenal, DMIS Lab also conducts research in other areas, such as natural language processing (NLP) and graph ML, to develop more refined techniques for its larger mission. Aside from conducting research, DMIS Lab participates in international challenges and competitions, such as the DREAM Challenges and the BioASQ Challenges, to contribute to the communal effort to tackle unmet needs in biomedicine.

News

Jan. 2026: Yein's paper, ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack, was accepted to ICLR 2026, one of the top conferences in artificial intelligence. Congratulations!

Jan. 2026: Two papers got accepted to EACL 2026, one of the top conferences in NLP. Congratulations!
- Dain and Jiwoo's paper, Benchmarking Direct Preference Optimization for Medical Large Vision–Language Models (Findings)
- Heedou's paper, SCRIPTMIND: Crime Script Inference and Cognitive Evaluation for LLM-based Social Engineering Scam Detection System (Industry Track)

Dec. 2025: Our DMIS Team (Suhyeon Lim, Sungwook Jung, Hyeon Hwang, Jueon Park, and Prof. Jaewoo Kang) has been honored with multiple Excellence Awards in CURE-Bench @ NeurIPS 2025, an international competition focused on agentic AI reasoning for therapeutic decision-making at scale.
Out of the four evaluation categories, the team earned Excellence Awards in three: prediction consistency, tool use, and human-aligned evaluation.

Nov. 2025: Taewhoo Lee's paper, The Curious Case of Analogies: Investigating Analogical Reasoning in Large Language Models, was accepted to AAAI 2026, one of the top conferences in artificial intelligence. Congratulations!

Sep. 2025: Our DMIS team (Hajung Kim, Hoonick Lee, Yewon Cho, Jungwoo Park, Jueon Park, Soyon Park, Yan Ting Chok, Seungheun Baek, Donghyeon Lee, and Prof. Jaewoo Kang) achieved 1st place in the BioASQ 13B Challenge, an international competition in biomedical semantic indexing and question answering.
We outperformed teams from NCU, New Delhi, UR, UA, and BSRC, winning 1st place in three out of four batches and securing the overall victory.

Sep. 2025: Congratulations to Dr. Donghee Choi on his appointment as a tenure-track Assistant Professor in the School of Computer Science and Engineering at Pusan National University!

Aug. 2025: Our DMIS team (Jongmyung Jung, Hyeongsoon Hwang, Yein Park, Minju Song, Jaehoon Yoon, Hyeon Hwang, Sanghoon Lee, Jiwoong Sohn, and Jaewoo Kang) achieved 1st place in MedHopQA 2025, the Biomedical Multi-hop Question Answering Challenge at BioCreative IX / IJCAI-25, Montréal.

Aug. 2025: Two papers got accepted to EMNLP 2025, one of the top conferences in NLP. Congratulations!
- Jaehoon Yun, Jiwoong Sohn, and Jungwoo Park's paper, Med-PRM: Medical Reasoning Models with Step-wise Guideline-verified Process Rewards
- Hyeon Hwang's paper, Assessing LLM Reasoning Steps via Principal Knowledge Grounding (Findings)

Aug. 2025: Junseok Choe and Hajung Kim's paper, Retrosynthetic Crosstalk between Single-step Reaction and Multi-step Planning, was published in the Journal of Cheminformatics. Congratulations!

Jul. 2025: 📢 Our Meerkat paper is now published in npj Digital Medicine! (https://www.nature.com/articles/s41746-025-01653-8)

Titled "Small language models learn enhanced reasoning skills from medical textbooks", the paper highlights how Meerkat-7B became the first 7B model to pass the USMLE by leveraging textbook-based instruction tuning.

Kudos to Dr. Hyunjae Kim and the Meerkat team for pushing the boundaries of open-source medical LLMs!

May. 2025: Understanding and Tackling Over-Dilution in Graph Neural Networks, first authored by Junhyun Lee, was accepted to KDD 2025, one of the top conferences in Data Mining. Congratulations!

Mar. 2025: Three papers got accepted to ACL 2025, one of the top conferences in NLP. Congratulations!
- Yein Park's paper, Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information
- Jungwoo Park's paper, Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
- Chanhwi Kim's paper, Learning from Negative Samples in Biomedical Generative Entity Linking (Findings)

Apr. 2025: GPO-VAE: Modeling Explainable Gene Perturbation Responses utilizing GRN-Aligned Parameter Optimization, co-first authored by Seungheun Baek and Soyon Park, was accepted to ISMB/ECCB 2025, one of the top conferences in Bioinformatics. Congratulations!

Jan. 2025: Two papers got accepted to ICLR 2025, one of the top conferences in AI. Congratulations!
- Yein Park's paper, ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
- Jungwoo Park's paper, Monet: Mixture of Monosemantic Experts for Transformers

Jan. 2025: Three papers got accepted to NAACL 2025, one of the top conferences in NLP. Congratulations!
- Jiwoong Sohn's paper, Rationale-Guided Retrieval Augmented Generation for Medical Question Answering
- Taewhoo Lee's paper, ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
- Chanwoong Yoon and Gangwoo Kim's paper, Ask Optimal Questions: Aligning Large Language Models with Retriever’s Preference in Conversation (Findings)

Dec. 2024: Cradle-VAE: Enhancing Single-Cell Gene Perturbation Modeling with Counterfactual Reasoning-based Artifact Disentanglement, co-first authored by Seungheun Baek and Soyon Park, was accepted to AAAI 2025, one of the top conferences in artificial intelligence. Congratulations!

Oct. 2024: Dr. Hyunjae Kim's paper, Augmenting Biomedical Named Entity Recognition with General-domain Resources, was published in the Journal of Biomedical Informatics (JBI), one of the top journals in Bioinformatics. This paper is a result of our fruitful collaboration with the University of Liverpool, Yale University, and NIH. Congratulations!

Oct. 2024: Kiwoong Yoo's paper, TurboHopp: Accelerated Molecule Scaffold Hopping with Consistency Models, was accepted to NeurIPS 2024, one of the top conferences in Machine Learning. Congratulations!

Sep. 2024: Chanwoong Yoon's paper, CompAct: Compressing Retrieved Documents Actively for Question Answering, was accepted to EMNLP 2024 (Miami), one of the top conferences in NLP. Congratulations!

Sep. 2024: Congratulations to Dr. Mogan Gim on his appointment as a tenure-track Assistant Professor in the Department of Biomedical Engineering at Hankuk University of Foreign Studies!

Sep. 2024: Congratulations to Dr. Bumsoo Kim on his appointment as a tenure-track Assistant Professor in the School of Computer Science and Engineering at Chung-Ang University!

Jul. 2024: Donghee's and Jinkyu's paper, DeepClair: Utilizing Market Forecasts for Effective Portfolio Selection, and Heedou's paper, LAPIS: Language Model-Augmented Police Investigation System, were both accepted to the Conference on Information and Knowledge Management (CIKM), which will be held in person in Boise, Idaho, USA on October 21-25, 2024. The former is based on a collaboration with Imperial College London and Shinhan Bank, while the latter is based on a collaboration with Imperial College London and the Korean National Police Agency. Congratulations to the authors of these two accepted papers!

Jun. 2024: Mogan's and Jueon's paper, MolPLA: a molecular pretraining framework for learning cores, R-groups and their linker joints, and Minbyul's paper, Improving medical reasoning through retrieval and self-reflection with retrieval-augmented large language models, were published in Bioinformatics (OUP), and both will be presented orally at the upcoming ISMB conference in Montreal, Canada. The former is based on a collaboration with Aigen Sciences, while the latter is with Kyung Hee University. Congratulations to the authors of these two published papers!

Apr. 2024: 🎉 Introducing Meerkat-7B: The First 7B Model to Pass USMLE! 🥳

There's a noticeable difference in performance between commercial large LMs and open-source small LMs in the medical domain. While GPT-4 scored an impressive 90% accuracy on USMLE-style questions, the previous best 7B model managed only 52%, falling significantly below the USMLE passing threshold of 60%.

Our new medical LM, Meerkat-7B, achieved a groundbreaking milestone by surpassing the 60% passing threshold for the United States Medical Licensing Examination (USMLE) for the first time among 7B-parameter models (with scores of 74.3% on the MedQA dataset and 71.4% on the USMLE sample test). Additionally, our system outperformed GPT-3.5 (175B) by 13.1% across seven medical benchmarks, indicating significant progress in open-source model development within the medical field. (Dong-A Ilbo, Kookmin Ilbo, AI Times)

Congratulations to Dr. Hyunjae Kim and the Meerkat team on their remarkable achievement!

- Paper: https://arxiv.org/abs/2404.00376

- Model: https://huggingface.co/dmis-lab/meerkat-7b-v1.0

Mar. 2024: Congratulations to Dr. Mujeen Sung on his appointment as a tenure-track Assistant Professor in the School of Computing at Kyung Hee University!

Feb. 2024: Donghee Choi's paper, CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions, was accepted to LREC-COLING 2024, a notable international conference on computational linguistics resources. This feat is a result of our fruitful collaboration with Sony Research. Congratulations Donghee-san!

Jan. 2024: Ngoc-Quang Nguyen's paper, MulinforCPI: enhancing precision of compound–protein interaction prediction through novel perspectives on multi-level information integration, was published in Briefings in Bioinformatics, one of the top journals in Bioinformatics. Congratulations!

Jan. 2024: Hyunjae Kim's paper, Fine-tuning CLIP Text Encoders with Two-step Paraphrasing, was accepted to EACL 2024 (Findings of the ACL), one of the top conferences in NLP. Congratulations!

Oct. 2023: Two papers got accepted to EMNLP 2023 (Singapore), one of the top conferences in NLP. Congratulations!
- Mujeen Sung's paper, Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification, is accepted to EMNLP 2023.
- Gangwoo Kim's paper, Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models, is accepted to EMNLP 2023.

Aug. 2023: We are celebrating over 4,000 citations to our BioBERT paper, which is both the first and most cited biomedical domain-specific transformer-based large language model. Congratulations to our team: Jinhyuk Lee (currently at Google DeepMind), Wonjin Yoon (Harvard Medical School), Donghyeon Kim (Hyundai Motors AI), Sunkyu Kim (AIGEN Sciences), and Chan Ho So (TmaxSoft)!

Jul. 2023: Our DMIS team (Gangwoo Kim, Hajung Kim, Chanhwi Kim, Mujeen Sung, Hyunjae Kim) achieved 1st place in RadSum23, the Multi-modal and Multi-anatomical Radiology Report Summarization Challenge. Congratulations! (AI Newspaper, Edaily)
- We outperformed leading AI research groups, including Stanford University, Siemens, University College London, and The University of Texas at San Antonio.
- DMIS led the multinational team effort, with researchers from Microsoft Research Asia, AIGEN Sciences, KAIST, and Beihang University participating.
- The paper describing the winning model is available here.

Jul. 2023: Mogan's paper, ArkDTA: attention regularization guided by non-covalent interactions for explainable drug–target binding affinity prediction, was published in Bioinformatics (OUP) and will be orally presented at the upcoming ISMB conference in Lyon, France. This work wouldn't have been completed without the supportive efforts of Junseok, Seungheun, Jueon, Chaeeun, Minjae and Sumin. Congratulations!

Jun. 2023: Congratulations to Dr. Buru Chang on the appointment as a tenure-track Assistant Professor in the Department of Artificial Intelligence at Sogang University!

May. 2023: Minbyul's paper, Consistency Enhancement of Model Prediction on Document-level Named Entity Recognition, will be published in Bioinformatics (OUP). Congratulations!

May. 2023: Two papers got accepted to ACL 2023 (Toronto, Canada), one of the top conferences in NLP. Congratulations!
- Hyunjae Kim's paper, Automatic Creation of Named Entity Recognition Datasets by Querying Phrase Representations, is accepted to ACL 2023.
- Mujeen Sung's paper, Optimizing Test-Time Query Representations for Dense Retrieval, is accepted to Findings of ACL 2023.

Apr. 2023: Donghee Choi's paper, KitchenScale: Learning to Predict Ingredient Quantities from Recipe Contexts, will be published in Expert Systems with Applications, one of the most recognized journals, with an impact factor of 8.665. This paper is based on collaborative work with Sony Research and Sejong University. Congratulations!

Jan. 2023: Congratulations to Dr. Seongjun Yun, who joined Amazon (Vancouver, BC) as an applied scientist. Dr. Yun joined the M5 team that builds large pretrained models to support machine learning applications at Amazon. Congratulations!

Jan. 2023: Mujeen Sung received the NAVER Ph.D Fellowship Award as he showed outstanding performance in his research area.

Nov. 2022: Ngoc-Quang Nguyen's paper, Perceiver CPI: A nested cross-attention network for compound-protein interaction prediction, will be published in Bioinformatics (OUP), one of the top journals in the field of bioinformatics. Congratulations!

Nov. 2022: LIQUID: A Framework for List Question Answering Dataset Generation, co-first authored by Seongyun Lee and Hyunjae Kim, was accepted to AAAI 2023, one of the top conferences in artificial intelligence. Congratulations!

Oct. 2022: Four papers got accepted to EMNLP 2022, one of the top conferences in NLP. Congratulations!
- Hyunjae Kim's paper, Simple Questions Generate Named Entity Recognition Datasets, is accepted to EMNLP 2022.
- Gangwoo Kim's paper, Generating Information-Seeking Conversations from Unlabeled Documents, is accepted to EMNLP 2022.
- Gangwoo Kim's paper (co-authored), Saving Dense Retriever from Shortcut Dependency in Conversational Search, is accepted to EMNLP 2022.
- Wonjin Yoon's paper, Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework, is accepted to EMNLP 2022 (Industry Track).

Sep. 2022: WonJin Yoon received an academic award, the "Standigm Paper Award 2022", from the Korean Society for Bioinformatics for the paper entitled Sequence Tagging for Biomedical Extractive Question Answering (Bioinformatics 2022). Congratulations!

Sep. 2022: BERN2: an advanced neural biomedical named entity recognition and normalization tool, co-first authored by Mujeen Sung and Minbyul Jeong, will be published in Bioinformatics (OUP). Congratulations! [Demo]

Aug. 2022: RecipeMind: Guiding Ingredient Choices from Food Pairing to Recipe Completion using Cascaded Set Transformer, co-first authored by Mogan Gim and Donghee Choi, was accepted at CIKM 2022, one of the top-tier conferences in the Information and Knowledge Management domain. This paper is the result of a collaboration between our DMIS lab, Professor Park (FNAI Lab, Sejong University), and Sony AI (Tokyo, Japan) that aims to promote creative cooking in the food industry.

Jul. 2022: Congratulations to Dr. Jinhyuk Lee, who joined Google (Mountain View, CA) as a research scientist. Dr. Lee joined the NLP team at Google Research that created BERT and Transformer, working alongside Jeff Dean. Congratulations!

Jun. 2022: Sequence Tagging for Biomedical Extractive Question Answering, co-authored by WonJin Yoon and researchers at AstraZeneca UK and Sweden, as one of the results of research collaboration, will be published in Bioinformatics (OUP). Congratulations!

Apr. 2022: DyGRAIN: An Incremental Learning Framework for Dynamic Graphs, co-authored by Seoyoon Kim and Seongjun Yun, will be presented at IJCAI 2022. Congratulations!

Apr. 2022: Congratulations, Jungsoo Park, whose papers got accepted at ACL and NAACL!
- Consistency Training with Virtual Adversarial Discrete Perturbation. NAACL 2022
- FAVIQ: FAct Verification from Information-seeking Questions. ACL 2022

Apr. 2022: MSTR: Multi-Scale Transformer for End-to-End Human-Object Interaction Detection, co-authored by Bumsoo Kim and Junhyun Lee, got accepted at CVPR 2022. Congratulations!

Feb. 2022: Congratulations to Dr. Minji Jeon on her appointment as a tenure-track Assistant Professor in the Department of Medicine at Korea University Medical School.

Feb. 2022: Congratulations to Dr. Donghyeon Park on the appointment as a tenure-track Assistant Professor in the Department of Data Science at Sejong University.

Dec. 2021: Congratulations to Dr. Kyubum Lee on joining Amgen Inc. (CA, USA) as Principal Data Scientist. Dr. Lee will work on data-driven clinical trial design and execution using ML and NLP.

Dec. 2021: WonJin Yoon et al.'s paper, KU-DMIS at BioASQ 9: Data-centric and model-centric approaches for biomedical question answering, is selected as the best paper in the BioASQ Lab at CLEF2021, one of the most highly valued venues in Biomedical NLP. Congratulations!

Nov. 2021: Seongjun Yun's paper, Neo-GNNs: Neighborhood Overlap-aware Graph Neural Networks for Link Prediction, was accepted to NeurIPS 2021, one of the top conferences in Machine Learning. Congratulations!

Nov. 2021: The DMIS team achieved top performance in two challenge tracks held by the BioCreative VII workshop. (AI Newspaper, Newsis)
- Won third place at the relation extraction task: DrugProt: Text mining drug/chemical-protein interactions (Track 1). 🥉
  Paper: Using Knowledge Base to Refine Data Augmentation for Biomedical Relation Extraction (Wonjin Yoon, Sean Yi, Richard Jackson (External affiliation), Hyunjae Kim, Sunkyu Kim, Jaewoo Kang)
- Won first place at the named entity recognition task: NLM-Chem Track: Full text Chemical Identification and Indexing in PubMed articles (Track 2). 🥇
  Paper: Improving Tagging Consistency and Entity Coverage for Chemical Identification in Full-text Articles (Hyunjae Kim, Mujeen Sung, Wonjin Yoon, Sungjoon Park, Jaewoo Kang)
- Workshop information: https://biocreative.bioinformatics.udel.edu/news/

Aug. 2021: Two papers got accepted to EMNLP 2021, one of the top conferences in NLP. Congratulations!
- Dr. Jinhyuk Lee's paper, Phrase Retrieval Learns Passage Retrieval, Too, is accepted to EMNLP 2021.
- Mujeen Sung's paper, Can Language Models be Biomedical Knowledge Bases?, is accepted to EMNLP 2021.

May. 2021: Two papers got accepted to ACL-IJCNLP 2021, one of the top conferences in NLP. Congratulations!
- Dr. Jinhyuk Lee's paper, Learning Dense Representations of Phrases at Scale, is accepted to ACL-IJCNLP 2021. (Korea Economic Daily, IT Chosun)
- Gangwoo Kim's paper, Learn to Resolve Conversational Dependency: A Consistency Training Framework for Conversational Question Answering, is accepted to ACL-IJCNLP 2021.

Apr. 2021: Gwanghoon Jang's paper, Predicting mechanism of action of novel compounds using compound structure and transcriptomic signature co-embedding, co-advised by Dr. Sungjoon Park and Prof. Jaewoo Kang, was accepted to ISMB/ECCB 2021. Congratulations!

Mar. 2021: Bumsoo Kim's paper, HOTR: End-to-End Human-Object Interaction Detection with Transformers, was accepted to CVPR 2021 (Virtual, June 19-25), one of the top-tier conferences for computer vision, for oral presentation!

Feb. 2021: Mujeen Sung received the 2020 KU Graduate School Achievement Award as he showed outstanding performance in his research area.

Oct. 2020: Donghyeon Park's paper, FlavorGraph: A large-scale food-chemical graph for generating food representations and recommending food pairings, was accepted to Scientific Reports, an online peer-reviewed open access scientific mega journal published by Nature Research.

Oct. 2020: Minbyul Jeong, Mujeen Sung, Gangwoo Kim, Donghyeon Kim, Jaehyo Yoo, Wonjin Yoon and Jaewoo Kang won first place in both question answering and summarization of the BioASQ 8B (Phase B) Task B Challenge!
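- The research team from the Korea University Department of Computer Science and Engineering won the BioASQ challenge, an international competition for AI systems that answer medical and biological questions, for the second consecutive year, beating the University of California San Diego (UCSD), the University of Massachusetts (UMass), Fudan University (China), and the University of Tokyo (Japan). (Edaily)
- Because the system can answer questions in sentences that read naturally to humans, it is expected to be useful for developing clinically meaningful decision-support tools.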

Sep. 2020: Jungsoo Park's paper, Adversarial Subword Regularization for Robust Neural Machine Translation, was accepted to Findings of ACL: EMNLP 2020, an anthology journal of ACL which is one of the top-tier conferences for computational linguistics.

Sep. 2020: Miyoung Ko's paper, Look at the First Sentence: Position Bias in Question Answering, was accepted to EMNLP 2020, one of the most renowned conferences for NLP-related publications!

Sep. 2020: Recently, BioBERT: a pre-trained biomedical language representation model for biomedical text mining, co-first authored by Dr. Jinhyuk Lee and Wonjin Yoon, has been ranked among the most-read papers in Bioinformatics, one of the top-tier journals in the domain.

Also, BioBERT was included in the Best Papers for the Natural Language Processing Section of the 2020 IMIA (International Medical Informatics Association) Yearbook (link). Congratulations once again to the authors for this grand achievement!

Jul. 2020: Enhancing the interpretability of transcription factor binding site prediction using attention mechanism, co-first authored by Sungjoon Park, Yookyung Koh, and Hwisang Jeon, was accepted to Scientific Reports. Congratulations!

Apr. 2020: MAPS: Multi-Agent reinforcement learning-based Portfolio management System, co-first authored by Jinho Lee and Raehyun Kim, is accepted to IJCAI 2020, one of the top conferences for general AI.
- MAPS is a hedge-fund-like portfolio management system trained with cooperative multi-agent reinforcement learning.
- It is inspired by the fact that a hedge fund's entire portfolio is managed by multiple investors working together to maximize risk-adjusted return.

Apr. 2020: Congratulations to Sunkyu Kim on the publication in Cell Systems!
- Sunkyu Kim's team won 1st place in the NCI-CPTAC DREAM Proteogenomics Challenge in 2017 (outperforming UCLA (3rd) and Stanford (13th)).
- Assessment of the Limits of Predictability of Protein and Phosphorylation Levels in Cancer is a paper for the DREAM challenge, written in collaboration with Heidelberg University, Icahn School of Medicine, and New York University.

Apr. 2020: Sunkyu Kim's paper, Improved survival analysis by learning shared genomic information from pan-cancer data, was accepted to ISMB 2020, a top conference in Bioinformatics.

Two papers got accepted to ACL 2020, one of the top conferences in NLP.
- Dr. Jinhyuk Lee's paper, Contextualized Sparse Representations for Real-Time Open-Domain Question Answering, is accepted to ACL 2020.
- Mujeen Sung's paper, Biomedical Entity Representations with Synonym Marginalization, is accepted to ACL 2020.

Jan. 2020: Wonjin Yoon received the NAVER Ph.D Fellowship Award as he showed outstanding performance in his research area.

Nov. 2019: Congratulations! Our DMIS team (Sungjoon Park, Minji Jeon, Sunkyu Kim, Junhyun Lee, Seongjun Yun, Bumsoo Kim, Buru Chang) has been selected as the top performers in the IDG-DREAM Drug-Kinase Binding Prediction Challenge. As one of the best performers, we presented our model at the RSG with DREAM Conference, NY in November. (Link)
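- The team was invited as a winner to the RSG with DREAM Conference held in New York in November, where it presented its AI-based virtual drug screening model. (Maeil Business Newspaper, University News Network, Yonhap News)
- The DREAM Challenges are internationally recognized data science competitions in the biomedical field, organized by IBM (USA) and Sage Bionetworks. The team was named a joint top performer in the drug activity prediction DREAM Challenge, together with the University of Illinois-Tsinghua University consortium and the University of North Carolina team.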

Oct. 2019: [Special Contribution] Data-Driven Science and Artificial Intelligence. (DATA@KU, Issue 4, Oct. 2019)

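[Special Contribution] Data-Driven Science and Artificial Intelligence: Leading the Paradigm Shift in Science. Jim Gray, a recipient of the Turing Award, often called the Nobel Prize of computer science, describes the paradigm shifts in science from ancient to modern times as follows. Thousands of years ago, the way humanity did science was empirical, describing natural phenomena; for the past few hundred years it has been theoretical, modeling and generalizing those phenomena; for the past few decades, complex phenomena that resist theoretical analysis have been interpreted through computer simulation; and today ...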

Sep. 2019: Congratulations! The DMIS team outperformed the Google team and won 1st place at the BioASQ challenge, a challenge on large-scale biomedical semantic indexing and question answering.
- By using BioBERT, our team (Wonjin Yoon, Jinhyuk Lee, Donghyeon Kim, Minbyul Jeong) produced outstanding results for all 5 test batches on BioASQ Task 7B-Phase B (challenge results: http://bioasq.org/participate/seventh-challenge-winners).
- Took 1st place, ahead of Google, at BioASQ, a question answering system competition in the biomedical domain [Task 7B-Phase B] (Korea University press release, ETNews, Yonhap News)

Sep. 2019: Congratulations to Dr. Jinhyuk Lee and Wonjin Yoon for BioBERT publication in Bioinformatics!
- BioBERT: a pre-trained biomedical language representation model for biomedical text mining is the first biomedical language representation model pre-trained on large-scale biomedical corpora, and it achieves state-of-the-art performance on various biomedical NLP tasks. (paper, code)
- With BioBERT, the DMIS team won 1st place at the BioASQ challenge.

Sep. 2019: Seongjun Yun's paper, Graph Transformer Networks, was accepted to Advances in Neural Information Processing Systems (NeurIPS 2019), one of the top-tier conferences in Machine Learning alongside ICML.

Aug. 2019: Donghyeon Park's paper, KitcheNette: Prediction and Ranking Food Ingredient Pairings based on Siamese Neural Network, got accepted to IJCAI 2019, one of the top-tier conferences for general AI.
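- The DMIS team analyzed one million recipes and developed a Siamese Neural Network-based AI model that recommends food ingredient pairings. The model far exceeded the prediction and recommendation performance of traditional machine learning models, and the results will be presented at IJCAI-19 in Macao, one of the most prestigious AI conferences. (Korea University press release, YTN Science, Maeil Business Newspaper, Yonhap News, Seoul Shinmun, IT Chosun)
- The team provides a web page where users can look up ingredient pairings themselves and make use of the research results. (KitcheNette)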

May. 2019: Real-Time Open-Domain Question Answering on Wikipedia with Dense-Sparse Phrase Index, co-first authored by Jinhyuk Lee, is accepted to ACL 2019, the top conference in computational linguistics and natural language processing.

May. 2019: ReSimNet: Drug Response Similarity Prediction using Siamese Neural Networks, co-first authored by Minji Jeon and Donghyeon Park, has been accepted to Bioinformatics, the best journal for computational biology.
- ReSimNet measures the transcriptional response similarity of the two chemical compounds, and the team achieved first place in the Multi-targeting Drug DREAM Challenge with this model (outperforming Janssen Pharmaceutica).

Apr. 2019: Self-Attention Graph Pooling, co-first authored by Junhyun Lee and Inyeop Lee, has been accepted to ICML 2019, the top conference in machine learning.

Apr. 2019: SAIN: Self-Attentive Integration Network for Recommendation, co-first authored by Seongjun Yun and Raehyun Kim, was accepted to SIGIR 2019, the best conference in Information Retrieval.

Apr. 2019: Congratulations to Dr. Minji Jeon for her first Nature series publication! (accepted to Nature Communications)
- Previously, Dr. Minji Jeon's team won 2nd place in the AstraZeneca Sanger Drug Synergy Prediction DREAM challenge (outperforming Stanford (6th) and MIT (11th)).
- Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen is an overview paper for the DREAM challenge and is coauthored by top-performing teams and organizers from AstraZeneca-Sanger. (bioRxiv)

Dec. 2018: Our DMIS team (Minji Jeon, Donghyeon Park, Jinhyuk Lee, Hwisang Jeon, Miyoung Ko, Sunkyu Kim, Yonghwa Choi) won 1st place in the Multi-targeting Drug DREAM Challenge. The team outperformed multinational pharmaceutical firms such as Janssen Pharmaceutica. (Link1, Link2)
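- The DMIS team won the challenge, beating multinational pharmaceutical companies (Janssen, Bayer, and others)! The team developed a model that discovers new drug candidate compounds and was selected as the winning team after the potential of its AI-selected compounds was validated. (Maeil Business Newspaper, Yonhap News)

Dec. 2018: Predicting Multiple Demographic Attributes with Task Specific Embedding Transformation and Attention Network, co-first authored by Raehyun Kim and Hyunjae Kim, has been accepted as a full paper at SDM19, one of the top-tier conferences in data mining.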

Nov. 2018: Buru Chang received the NAVER Ph.D Fellowship Award as he showed stellar performance with his papers.

Aug. 2018: Jinhyuk Lee's paper, Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering, was accepted to EMNLP 2018, one of the most renowned conferences in the NLP field.

Aug. 2018: Learning User Preferences and Understanding Calendar Contexts for Event Scheduling (co-first authored by Donghyeon Kim and Jinhyuk Lee) was accepted to CIKM 2018, one of the top-tier international conferences in the Database/Data Mining/Information Retrieval field, with a 17% acceptance rate.

Jul. 2018: Buru Chang's paper, Content-Aware Point-of-Interest Embedding Model for Successive POI Recommendation, was accepted to IJCAI 2018, one of the top-tier conferences for general AI.

Nov. 2017: Our DMIS team (Sunkyu Kim, Heewon Lee, Keonwoo Kim, Hwisang Jeon, Minji Jeon, Yonghwa Choi, Daehan Kim) was recognized as the best performer in the NCI-CPTAC DREAM Proteogenomics Challenge, sponsored by the National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium (CPTAC). This was the very first time that a Korean team won the Challenge. (Link)
- UCLA: 3rd place
- Stanford University: 13th place

- Nov. 2017: Prof. Jaewoo Kang's research team at Korea University took part in the NCI-CPTAC DREAM Proteogenomics Challenge, which involves predicting protein activity levels in cancer patients, and became the first Korean team to win in the challenge's history! The challenge was hosted by the Clinical Proteomic Tumor Analysis Consortium of the U.S. National Cancer Institute (NCI-CPTAC). (Yonhap News, Maeil Business Newspaper, Seoul Economic Daily)

Aug. 2017: Jinhyuk Lee's paper, Name Nationality Classification with Recurrent Neural Network, got accepted for IJCAI 2017, one of the top-tier conferences for general AI.

Apr. 2017: Constructing and Evaluating a Novel Crowdsourcing-based Paraphrased Opinion Spam Dataset, co-first authored by Seongsoon Kim and Seongwoon Lee, has been accepted to WWW 2017, one of the top conferences for web.

Oct. 2016: Among 42 teams from different parts of the world, our DMIS team ranked 2nd place at the Disease Module Identification DREAM Challenge: Discover disease pathways in genomic networks. The goal is to systematically assess module identification methods on a panel of state-of-the-art genomic networks and to discover novel network pathways.
- Oct. 2016: Took part in the Disease Module Identification DREAM Challenge: Discover disease pathways in genomic networks, which aims to discover disease-associated modules in biological networks, and tied for 2nd place overall among 42 teams!

Mar. 2016: Our DMIS team won 2nd place at the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge, which is designed to predict synergistic drug combinations and to identify associated biomarkers. As the challenge was hosted by AstraZeneca, one of the top 10 pharmaceutical companies in the world, the DMIS team showed stellar performance in this grand competition, ranking 2nd place. (Link)
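- Stanford University: 6th place
- MIT: 11th place

- Mar. 2016: Took part in the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge, which involves predicting the efficacy of anticancer drug combination therapy, and placed 2nd among 62 teams worldwide! The challenge was hosted by AstraZeneca, one of the world's top 10 pharmaceutical companies, and Prof. Jaewoo Kang's team took 2nd place, finishing well ahead of Stanford University (6th) and MIT (11th). (Kyunghyang Shinmun, Seoul Economic Daily)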

Address: Office 502, Jung Woonoh IT & General Education Center, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, Republic of Korea 02841

Tel: +82-2-3290-3566

Copyright © 2025, By Data Mining & Information Systems Laboratory, Department of Computer and Radio Communications, Korea University, All Rights Reserved.
