Fall Research Expo 2020

A Phenotype-driven COVID-19 Knowledge Graph from Biomedical Literature Drives Hypothesis Generation

Since December 2019, the scientific community has seen an explosion of literature on the novel coronavirus first reported in Wuhan, China. As a result, it has become increasingly difficult for researchers in the field to stay informed about new developments in the published corpus. To address this problem and help researchers collect, analyze, and organize the vast amount of information, we created a knowledge graph (KG) cataloguing the relationships found between entities, as evidenced by papers in the COVID-19 Open Research Dataset (CORD-19). We trained an embedding model so that the KG can be applied to downstream tasks such as predicting new treatments, symptoms, and risk factors for COVID-19. The embedding model achieved a classification accuracy above 70%, with hits@10 of 0.61 and 0.18 depending on the expansiveness of the KG. We also built an interactive web application that lets researchers explore the KG and form novel questions. In conclusion, our KG compiles and extracts COVID-19 information useful for developing diagnostics and treatments. The web application is available at http://covid19nlp.wglab.org:3001/.
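For readers unfamiliar with the hits@10 metric quoted above, the following is a minimal sketch of how such a score is commonly computed for link prediction on a knowledge graph. It is not the authors' evaluation code. The function names `rank_of_true` and `hits_at_k` and the example ranks are hypothetical, and the sketch assumes that a higher score means a more plausible triple.

```python
from typing import Sequence


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """Return the 1-based rank of the true candidate.

    Assumption: a higher score means the model finds the triple more plausible.
    Ties are resolved optimistically (only strictly better scores count).
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of test triples whose true entity is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks for eight held-out triples (illustrative data only).
example_ranks = [1, 3, 12, 7, 250, 10, 45, 2]
print(hits_at_k(example_ranks, k=10))  # 5 of 8 ranks are <= 10, so 0.625
```

Under this definition, a hits@10 of 0.61 would mean that for 61% of held-out triples the correct entity appeared among the model's ten highest-ranked candidates.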

PRESENTED BY

Ryan H Lee

PURM - Penn Undergraduate Research Mentoring Program

College of Arts & Sciences 2023

Advised By

Dr. Kai Wang

kai@pennmedicine.upenn.edu

https://wglab.org/

September 17 | 1:00 PM

Join Ryan for a virtual discussion

Virtual Meeting

Comments

Triples

Are triples the convention for this kind of NLP and analysis? Would adding a fourth term as a metathesaurus concept or semantic network relation (e.g., A or B <verb> C, A <verb 1 or 2> affects B) be possible or useful? This work is extremely timely and presented very well!

Great Job!

Hi Ryan, congratulations on this very interesting project. I really liked how your research is interactive and very visual. I was curious though, how was the learning curve for developing the natural language processing? Were there any libraries or computational techniques you had to learn specifically for this research?

The University of Pennsylvania - CURF Presents