Biomedical AI


For Prospective Students and Research Fellows

The Data Mining and Information Systems (DMIS) Lab is looking for MS/Ph.D. students and postdoctoral fellows with enthusiasm for multi-faceted projects related to Drug Discovery and Precision Medicine.

DMIS Lab has a prestigious history of outperforming  


DMIS Lab aims to use AI and machine learning to tackle a variety of research topics such as large-scale Language Modeling, Question Answering, and Named Entity Recognition. Our research has been presented in recent years at top-tier conferences and journals such as ACL, NAACL, EMNLP, and Bioinformatics. In particular, our pre-trained language model BioBERT has been cited over 2,400 times and was selected as one of the Best Papers for the Natural Language Processing Section of the 2020 IMIA (International Medical Informatics Association) Yearbook. BioBERT is by far the most cited pre-trained language model in the biomedical domain.


Aside from research publications, DMIS Lab has participated in many international BioNLP challenges and achieved top performance, including BioASQ Phase B (2019, 2020) and BioCreative VII NER (NLM-Chem) (2021). We are also currently collaborating with global technology and pharmaceutical companies including Amazon, Adobe (USA), AstraZeneca (UK), and other valued partners. Our alumni continue to produce outstanding achievements in both academia and industry, as tenure-track assistant professors at Korea University and Sejong University (Korea) and as research scientists at Google, Amgen (USA), NAVER (Korea), and elsewhere.

Recent News

2024

Dec. 1 paper accepted to AAAI 2025.
Oct. 1 paper accepted to NeurIPS 2024.
Jul. 1 paper accepted to ISMB 2024.
Jan. 1 paper accepted to Briefings in Bioinformatics.

2023

Dec. 1 paper accepted to BIBM 2023.
Aug. 1 paper accepted to JAMIA 2023.
Jul. 1 paper accepted to ISMB/ECCB 2023. 

2022

Nov. 1 paper accepted to Bioinformatics.
Aug. 1 paper accepted to CIKM 2022.

2021

Nov. 1 paper accepted to Frontiers in Oncology.
Oct. 1 paper accepted to IEEE Access.
Sep. 1 paper accepted to PLOS Computational Biology.
Apr. 1 paper accepted to ISMB/ECCB 2021.
Apr. 1 paper accepted to Nature Communications.

~2020

Oct.2020 1 paper accepted to Scientific Reports.
Jul.2020 1 paper accepted to Scientific Reports.
Jun.2020 1 paper accepted to Scientific Data.
Apr.2020 1 paper accepted to Cell Systems.
Apr.2020 1 paper accepted to ISMB 2020.
Nov.2019 Our DMIS team was selected as the top-performing team in the IDG-DREAM Drug-Kinase Binding Prediction Challenge.
Nov.2019 1 paper accepted to Nucleic Acids Research.
Nov.2019 1 paper accepted to Genes.
Jul.2019 1 paper accepted to BMC Medical Genomics.
Aug.2019 1 paper accepted to IJCAI 2019.
May.2019 1 paper accepted to Bioinformatics.
Apr.2019 1 paper accepted to Nature Communications.
Dec.2018 1 paper accepted to BMC Medical Imaging.
Dec.2018 Our DMIS team won 1st place in the Multi-targeting Drug DREAM Challenge.
Sep.2018 1 paper accepted to PLOS ONE.
Aug.2018 1 paper accepted to Methods.
Aug.2018 1 paper accepted to Information Sciences.
Jul.2018 1 paper accepted to PLOS ONE.
Apr.2018 1 paper accepted to BMC Medical Genomics.
Mar.2018 1 paper accepted to International Journal of Genomics.
Mar.2018 2 papers accepted to BMC Systems Biology.
Jan.2018 1 paper accepted to BMC Bioinformatics.
Jan.2018 1 paper accepted to PLOS ONE.
Jan.2018 1 paper accepted to Database.
Nov.2017 Our DMIS team was named the best performer in the NCI-CPTAC DREAM Proteogenomics Challenge.
Mar.2017 1 paper accepted to PLOS ONE.
Jan.2017 1 paper accepted to Nucleic Acids Research.
Dec.2016 1 paper accepted to BMC Bioinformatics.
Oct.2016 1 paper accepted to BIOLOGY DIRECT.
Oct.2016 1 paper accepted to PLOS ONE.
Oct.2016 Our DMIS team ranked 2nd in the Disease Module Identification DREAM Challenge: Discover disease pathways in genomic networks.
Mar.2016 Our DMIS team won 2nd place in the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge.
Jul.2015 1 paper accepted to BMC Bioinformatics.
Jul.2015 1 paper accepted to PLOS ONE.
Jun.2015 1 paper accepted to BMC Genomics.
May.2015 1 paper accepted to Bioinformatics.
Apr.2015 1 paper accepted to Journal of Clinical Neurology.
Sep.2014 1 paper accepted to Bioinformatics.
Jan.2014 1 paper accepted to Bioinformatics.
Sep.2013 1 paper accepted to International Journal of Data Mining and Bioinformatics.
Sep.2013 1 paper accepted to Human Genomics.
Apr.2013 1 paper accepted to BMC Medical Informatics and Decision Making.
Sep.2012 1 paper accepted to International Journal of Data Mining and Bioinformatics.

Publications

2024

CRADLE-VAE: Enhancing Single-Cell Gene Perturbation Modeling with Counterfactual Reasoning-based Artifact Disentanglement
Seungheun Baek, Soyon Park, Yan Ting Chok, Junhyun Lee, Jueon Park, Mogan Gim*, Jaewoo Kang*
AAAI 2025
[Paper]

TurboHopp: Accelerated Molecule Scaffold Hopping with Consistency Models
Kiwoong Yoo, Owen Oertell, Junhyun Lee, Sanghoon Lee, Jaewoo Kang*
NeurIPS 2024
[Paper]

MolPLA: A Molecular Pre-training Framework for Learning Cores, R-Groups and their Linker Joints
Mogan Gim, Jueon Park, Soyon Park, Sanghoon Lee, Seungheun Baek, Junhyun Lee, Ngoc-Quang Nguyen, Jaewoo Kang*
ISMB 2024, Bioinformatics
[Paper] [Code]

MulinforCPI: enhancing precision of compound–protein interaction prediction through novel perspectives on multi-level information integration
Ngoc-Quang Nguyen, Sejeong Park, Mogan Gim, Jaewoo Kang*
Briefings in Bioinformatics
[Paper] [Code]

2023

Enhancing Clinical Outcome Predictions through Auxiliary Loss and Sentence-Level Self-Attention  
Sanghoon Lee, Gwanghoon Jang, Chanhwi Kim, Sejeong Park, Kiwoong Yoo, Jihye Kim, Sunkyu Kim*, Jaewoo Kang*
BIBM 2023
[Paper] [Code]

Evaluation of crowdsourced mortality prediction models as a framework for assessing artificial intelligence in medicine  
Timothy Bergquist, Thomas Schaffter, Yao Yan, Thomas Yu, Justin Prosser, Jifan Gao, Guanhua Chen, Łukasz Charzewski, Zofia Nawalany, Ivan Brugere, Renata Retkute, Alidivinas Prusokas, Augustinas Prusokas, Yonghwa Choi, Sanghoon Lee, Junseok Choe, Inggeol Lee, Sunkyu Kim, Jaewoo Kang, Sean D. Mooney*, Justin Guinney* and the Patient Mortality Prediction DREAM Challenge Consortium 
JAMIA
[Paper]

ArkDTA: Attention Regularization guided by non-Covalent Interactions for Explainable Drug-Target Binding Affinity Prediction  
Mogan Gim, Junseok Choe, Seungheun Baek, Jueon Park, Chaeeun Lee, Minjae Ju, Sumin Lee, Jaewoo Kang*
ISMB 2023, Bioinformatics
[Paper] [Code]

2022

Perceiver CPI: A nested cross-attention network for compound-protein interaction prediction  
Ngoc-Quang Nguyen, Gwanghoon Jang, Hajung Kim, Jaewoo Kang*
Bioinformatics 2022
[Paper] [Code]

RecipeMind: Guiding Ingredient Choices from Food Pairing to Recipe Completion using Cascaded Set Transformer  
Mogan Gim†, Donghee Choi, Kana Maruyama, Jihun Choi, Hajung Kim, Donghyeon Park* and Jaewoo Kang*   
CIKM 2022
[Paper] [Code]

2021

Deep-Learning-Based Natural Language Processing of Serial Free-Text Radiological Reports for Predicting Rectal Cancer Patient Survival  
Sunkyu Kim†, Choong-kun Lee†, Yonghwa Choi, Eun Sil Baek, Jeong Eun Choi, Joonseok Lim, Jaewoo Kang* and Sang Joon Shin*  
Frontiers in Oncology
[Paper] [Code]

Crowdsourced identification of multi-target kinase inhibitors for RET- and TAU-based disease: the Multi-Targeting Drug DREAM Challenge
Zhaoping Xiong†, Minji Jeon†, Robert J Allaway†, Jaewoo Kang, Donghyeon Park, Jinhyuk Lee, Hwisang Jeon, Miyoung Ko, Hualiang Jiang, Mingyue Zheng, Aik Choon Tan, Xindi Guo, The Multi-Targeting Drug DREAM Challenge Community, Kristen K Dang, Alex Tropsha, Chana Hecht, Tirtha K. Das, Heather A. Carlson, Ruben Abagyan, Justin Guinney, Avner Schlessinger*, Ross Cagan*  
PLOS Computational Biology
[Paper]

RecipeBowl: A Cooking Recommender for Ingredients and Recipes Using Set Transformer
Mogan Gim†, Donghyeon Park†, Michael Spranger, Kana Maruyama, Jaewoo Kang*  
IEEE Access
[Paper]

Crowdsourced mapping of unexplored target space of kinase inhibitors  
Anna Cichonska, Balaguru Ravikumar, Robert J Allaway, Fangping Wan, Sungjoon Park, Olexandr Isayev, Shuya Li, Michael Mason, Andrew Lamb, Zia-ur-Rehman Tanoli, Minji Jeon, Sunkyu Kim, Mariya Popova, Stephen Capuzzi, Jianyang Zeng, Kristen Dang, Gregory Koytiger, Jaewoo Kang, et al.   
Nature Communications  

Predicting mechanism of action of novel compounds using compound structure and transcriptomic signature co-embedding  
Gwanghoon Jang, Sungjoon Park*, Sanghoon Lee, Sunkyu Kim, Sejeong Park, Jaewoo Kang*
ISMB/ECCB 2021, Bioinformatics  
[Paper] [Code]