ScanBious: Text-mining tool for highlighting concepts from PubMed
Background:
Concept-centered semantic maps were created from a text-mining analysis of PubMed. The objects ("concepts") of a semantic map can be MeSH terms or other terms (names of proteins, diseases, chemical compounds, etc.) organized as controlled vocabularies. An edge between two objects is calculated automatically from the index of semantic similarity, which is proportional to the number of publications related to both objects simultaneously. On the one hand, an individual semantic map built from already published papers allows us to trace the course of scientific inquiry. On the other hand, a prospective analysis based on PubMed search history enables us to determine possible directions for future research.
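As a rough illustration of how such edges could be derived, the sketch below counts the publications shared by each pair of concepts and keeps the pairs that share at least one. The function name `cooccurrence_edges`, the `min_shared` parameter, and the toy PMID sets are assumptions made for this example; the raw shared count stands in for the similarity index, whose exact formula is not given here.

```python
from itertools import combinations


def cooccurrence_edges(concept_to_pmids, min_shared=1):
    """Return {(a, b): shared_count} for concept pairs that share publications.

    concept_to_pmids maps a concept name to the set of publication IDs
    annotated with it. The raw shared count is used as the edge weight,
    which is an assumption standing in for the semantic similarity index.
    """
    edges = {}
    for a, b in combinations(sorted(concept_to_pmids), 2):
        shared = len(concept_to_pmids[a] & concept_to_pmids[b])
        if shared >= min_shared:
            edges[(a, b)] = shared
    return edges


# Toy data: the publication IDs are invented, not real PubMed records.
toy = {
    "Neoplasms": {1, 2, 3, 4},
    "Apoptosis": {2, 3, 5},
    "Insulin": {6, 7},
}
edges = cooccurrence_edges(toy)
print(edges)
```

In this toy input only "Apoptosis" and "Neoplasms" share publications, so only that pair receives an edge. A real map would likely normalize the count by how often each concept occurs, but that choice is not specified in the text.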
In the process of navigating PubMed, researchers unknowingly generate user-specific reading profiles that can be shared within a social networking environment. This paper examines the structure of the social networking environment generated by PubMed users.
We used a semantic score to assess the connectivity between two proteins based on the number of shared relevant or related articles. Using this score, we created a semantic network for 150 human proteins belonging to different metabolic pathways. Analysis of the network showed that proteins involved in the same molecular processes grouped together into distinct subgraphs.
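One simple way to look for such subgraphs is to drop weak edges and then collect the connected components of what remains. The sketch below does this with a depth-first search; the function name, the `threshold` parameter, the protein names, and the scores are all hypothetical, and the original analysis may have used a different grouping method.

```python
from collections import defaultdict


def connected_components(nodes, weighted_edges, threshold=1):
    """Group nodes into components using only edges with weight >= threshold."""
    adjacency = defaultdict(set)
    for (a, b), weight in weighted_edges.items():
        if weight >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    components = []
    for start in sorted(nodes):
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited node.
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components


# Hypothetical semantic scores between a handful of proteins.
proteins = ["HK1", "PFKM", "CYP3A4", "CYP2D6", "ALB"]
scores = {
    ("HK1", "PFKM"): 12,
    ("CYP2D6", "CYP3A4"): 30,
    ("HK1", "CYP3A4"): 1,
}
components = connected_components(proteins, scores, threshold=5)
print(components)
```

With the threshold set to 5, the weak HK1 to CYP3A4 link is ignored and the two pairs fall into separate components, while ALB remains isolated. Lowering the threshold to 1 merges the two pairs into one component, which shows how sensitive the grouping is to this cutoff.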
A comparison of personalized bibliographic profiles can be represented as a semantic network in which the nodes are the names of scientists and the edge weights are proportional to the calculated similarity between their bibliographic profiles. This representation allowed us to show how the scientific trajectories within one scientific school relate to each other and to correlate the results with world trends.
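The similarity measure is not named in the text. As one plausible choice, the sketch below treats each profile as a bag of terms with counts and compares two profiles by cosine similarity. The function name, the example terms, and the counts are assumptions for illustration only.

```python
import math
from collections import Counter


def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two term-count profiles (0.0 if either is empty)."""
    shared_terms = set(profile_a) & set(profile_b)
    dot = sum(profile_a[t] * profile_b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical profiles: term -> number of the scientist's papers using it.
a = Counter({"Proteomics": 5, "Mass Spectrometry": 3})
b = Counter({"Proteomics": 2, "Drug Design": 4})
c = Counter({"Neural Networks": 7})

print(cosine_similarity(a, b))  # partial overlap, strictly between 0 and 1
print(cosine_similarity(a, c))  # no shared terms
```

Pairwise values computed this way could serve as edge weights between scientists, with a cutoff deciding which edges are drawn.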
Way2Drug portal
Gene-Centric Knowledgebase: Focus on Human Chromosome 18 encoded proteins
Background:
The Gene-centric Content Management System (GenoCMS) is a web-based tool for comparing public resources with proprietary results by representing proteins as a color-coded catalog. Within the CMS, the features of protein-coding genes are uploaded from the public domain and then supplemented with additional features derived from original experimental workflows.
The knowledgebase consolidates public web resources with proprietary experimental and predicted data generated in the course of the Russian part of the Human Proteome Project.
The GenoCMS is an example of a project-oriented informational system, which is important for public data sharing.