A Biomedical Knowledge Base of Named Entities

Alex Fang has designed and constructed a biomedical knowledge base to facilitate research in bioinformatics. The knowledge base contains entries representing named entities such as drugs, diseases, and genes, together with their pairwise combinations, all of which are central to the intelligent processing of biomedical literature.

Entity              Quantity    Information
Drugs               210,486     Drug name, generic name, brand name, active ingredient, MeSH ID, info source
Diseases            80,014      Disease name, ICD code, MeSH ID, info source
Genes               405,599     Gene name, gene symbol, chromosome, chromosomal start, chromosomal end, NCBI gene ID, info sources
Drug-disease pairs  1,907,474   Drug name, disease name, PubMed ID, info sources
Drug-gene pairs     747,996     Gene symbol, gene name, info source, interaction types, PubMed ID, drug name
Disease-gene pairs  18,754,794  Gene symbol, gene name, info source, interaction type, PubMed ID, disease name
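
The pair entries listed above can be pictured as simple records. The following is a minimal sketch, assuming a Python representation. The class name DrugGenePair, its field names, the helper pairs_for_gene, and all values are hypothetical illustrations, not the knowledge base's actual schema or contents.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DrugGenePair:
    """Hypothetical record mirroring the fields listed for drug-gene pairs."""
    gene_symbol: str
    gene_name: str
    drug_name: str
    info_source: str
    interaction_types: Tuple[str, ...] = ()
    pubmed_id: str = ""


def pairs_for_gene(pairs: List[DrugGenePair], gene_symbol: str) -> List[DrugGenePair]:
    """Return every pair whose gene symbol matches, ignoring letter case."""
    wanted = gene_symbol.upper()
    return [p for p in pairs if p.gene_symbol.upper() == wanted]


# Placeholder values only; these are not real knowledge base entries.
example_pairs = [
    DrugGenePair("GENE_A", "example gene A", "example drug 1",
                 "example source", ("inhibitor",), "0000001"),
    DrugGenePair("GENE_B", "example gene B", "example drug 1",
                 "example source", ("agonist",), "0000002"),
    DrugGenePair("GENE_A", "example gene A", "example drug 2",
                 "example source", ("inhibitor",), "0000003"),
]

matches = pairs_for_gene(example_pairs, "gene_a")
drug_names = [p.drug_name for p in matches]
```

Here drug_names collects the drugs linked to one gene symbol, the kind of lookup that a record layout like this would make straightforward.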

So far, the knowledge base contains a total of 22,106,363 entries. Coupled with Alex’s past research in automatic terminology extraction, it supports robust approaches to biomedical information engineering, such as drug repurposing.
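
As a quick check, the stated total can be reproduced from the six per-category quantities. This is a minimal sketch in Python; the variable names are illustrative, and the counts are copied from the table above.

```python
# Per-category quantities as reported in the table.
entry_counts = {
    "Drugs": 210_486,
    "Diseases": 80_014,
    "Genes": 405_599,
    "Drug-disease pairs": 1_907_474,
    "Drug-gene pairs": 747_996,
    "Disease-gene pairs": 18_754_794,
}

# Sum all categories and find the largest one.
total = sum(entry_counts.values())
largest = max(entry_counts, key=entry_counts.get)

print(f"{total:,}")  # 22,106,363
print(largest)       # Disease-gene pairs
```

Disease-gene pairs account for the large majority of the entries, so the overall size of the knowledge base is driven mainly by that category.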


Fang, C Y, Liu, Y, Lu, Y, Cao, J, and Xia, J. (2018). A Corpus-Oriented Perspective on Terminologies of Side Effect and Adverse Reaction in Support of Text Retrieval for Drug Repurposing. International Journal of Data Mining and Bioinformatics, Vol. 21, No. 3, pp. 269–286.