Principal Investigator: Thanh Thieu, Ph.D.

Dr. Thieu received his Ph.D. in Computer Science from the University of Missouri-Columbia. Prior to Moffitt, he was an Assistant Professor in the Computer Science Department at Oklahoma State University and a postdoc at the U.S. National Institutes of Health. His research is in natural language processing, machine learning, and artificial intelligence, with applications in healthcare and education. Dr. Thieu has taught and mentored students and junior colleagues at the University of Missouri, the NIH Clinical Center, ACT Inc., Oklahoma State University, and Moffitt Cancer Center. Having worked in academia, government, and industry, Dr. Thieu has developed the ability to mentor and collaborate with people across educational backgrounds, ethnicities, genders, and origins.


Ph.D. Student: Thanh Duong

Thanh is a Ph.D. student in Computer Science at Oklahoma State University and will transfer to the University of South Florida, where he will be advised by Dr. Licato and Dr. Thieu. His research focuses on NLP algorithms, lexical complexity, and language generation. He is interested in applying NLP algorithms to process free-text clinical notes in electronic health records and free-text scientific reports in the medical literature. Thanh aims to develop a generative data augmentation method that expands clinical-note datasets to improve the training of language models.


Ph.D. Student: Tuan-Dung Le

Tuan-Dung is a Ph.D. student in Computer Science at the University of South Florida. He received his master’s degree in Information Systems from the Hanoi University of Science and Technology in Vietnam and worked as an AI engineer at FPT.AI for two years. Before joining Moffitt Cancer Center as a research trainee, he worked under the supervision of Dr. Thieu for a year. He has built a database of host-pathogen interactions from scientific literature to help the biomedical research community in the field of infectious diseases. He also developed a natural language processing system that extracts patient history information from unstructured clinical text to support the medical billing process.