|
PubTator.key |
|
|
|
A BioC format for PubTator and other NER tools (i.e., tmChem, DNorm, tmVar, SR4GN, and GenNorm) developed by the Biomedical Text Mining group at NCBI |
|
The goal of this collection is to provide easy access to the text and bio-concept annotations for PMC articles. |
|
|
|
collection: A group of PubMed documents, each organized into title, abstract, and other passages |
|
|
|
source: PubMed, PubMed Central, etc. |
|
|
|
date: Document download date |
|
|
|
document: abstract, full-text article, free-text document, etc. |
|
|
|
id: PubMed ID (or other ID in a given collection) of the document |
|
|
|
passage: Title, abstract and other passages |
|
|
|
infon["type"]: The type of passage: "title", "abstract", or another passage type |
|
|
|
offset: The title has an offset of zero; each subsequent passage (e.g., the abstract) is assumed to begin after the end of the previous passage plus one space |
|
|
|
text: Text of the passage |
|
|
|
annotation: One bio-concept in the passage, as determined by tmChem, DNorm, tmVar, SR4GN, or GenNorm |
|
|
|
infon["type"]: The type of bioconcept, e.g. "Gene", "Species", "Disease", "Chemical" or "Mutation" |
|
|
|
infon["MeSH"]: The bio-concept identifier in MeSH as detected by DNorm or tmChem |
|
|
|
infon["OMIM"]: The bio-concept identifier in OMIM as detected by DNorm |
|
|
|
infon["NCBI_Gene"]: The bio-concept identifier in NCBI Gene as detected by GenNorm |
|
|
|
infon["NCBI_Taxonomy"]: The bio-concept identifier in NCBI Taxonomy as detected by SR4GN |
|
|
|
infon["ChEBI"]: The bio-concept identifier in ChEBI as detected by tmChem |
|
|
|
infon["tmVar"]: A synthetic key generated for the mention detected by tmVar, in the form <Sequence type>|<Mutation type>|<Wild type>|<Mutation position>|<Mutant> |
|
|
|
location: Position of the mention, given as the global document "offset" at which the bio-concept starts and the "length" of the mention |
|
|
|
text: Mention of the bio-concept |
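
The passage "offset" rule described above (title at zero, each later passage starting after the previous passage plus one space) can be sketched as follows. This is a minimal illustration, not the tool's own code: the function name `passage_offsets` and the sample texts are hypothetical, and the sketch assumes offsets count Python string characters.

```python
def passage_offsets(texts):
    """Return the starting offset of each passage, in document order.

    Assumption: the first passage (the title) starts at 0 and every later
    passage starts one space after the end of the previous passage.
    """
    offsets = []
    position = 0
    for text in texts:
        offsets.append(position)
        # Advance past this passage and the single separating space.
        position += len(text) + 1
    return offsets


# Hypothetical title and abstract, used only for illustration.
title = "BRCA1 and cancer."
abstract = "We studied BRCA1."
print(passage_offsets([title, abstract]))
```

Under these assumptions the abstract offset equals the title length plus one.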
|
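
Because the annotation "location" uses a global document offset while "text" belongs to a single passage, recovering a mention requires subtracting the passage offset first. The sketch below shows that arithmetic; the helper name `mention_text` and the sample values are hypothetical, and offsets are again assumed to count Python string characters.

```python
def mention_text(passage_offset, passage_text, location_offset, location_length):
    """Slice a mention out of a passage using a document-level location.

    passage_offset:  the passage's "offset" within the document
    location_offset: the annotation's global "offset"
    location_length: the annotation's "length"
    """
    # Convert the global offset into an index local to this passage.
    start = location_offset - passage_offset
    end = start + location_length
    if start < 0 or end > len(passage_text):
        raise ValueError("location falls outside this passage")
    return passage_text[start:end]


# Hypothetical abstract starting at document offset 18.
print(mention_text(18, "We studied BRCA1.", 29, 5))
```

Comparing the returned slice with the annotation's own "text" field is a simple consistency check when reading a collection.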
|
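
The five-field infon["tmVar"] key described above can be split into named parts as in the sketch below. The names `TmVarKey` and `parse_tmvar_key` and the sample key are hypothetical. The sketch handles only the five-field shape given in this key file and rejects anything else rather than guessing at its meaning.

```python
from typing import NamedTuple


class TmVarKey(NamedTuple):
    """Fields of a tmVar key, in the order listed in this key file."""
    sequence_type: str
    mutation_type: str
    wild_type: str
    position: str
    mutant: str


def parse_tmvar_key(key):
    """Split a pipe-delimited tmVar key into its five named fields."""
    fields = key.split("|")
    if len(fields) != 5:
        # Assumption: only the five-field form documented here is supported.
        raise ValueError(f"expected 5 fields, got {len(fields)}: {key!r}")
    return TmVarKey(*fields)


# Illustrative key only; not taken from a real annotation.
parsed = parse_tmvar_key("p|SUB|V|600|E")
print(parsed.mutation_type, parsed.position)
```

The position is kept as a string because the key file does not say whether it is always a plain integer.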