Ontology for Investigations on Pathogenicity

OIP Ontology
Formalising the data from host-pathogen interaction profiling experiments between plants and fungi
Plant pathogenic fungi infection as seen at different levels - from the field to the cell.
Language OWL
Namespace http://oip.rothamsted.ac.uk

Project goals

Our project aims to develop an ontology that will support the following efforts of the pathogenicity research community:

  • Data curation. Data curators will be able to benefit from formally-defined controlled vocabularies, developed according to the best practices of OWL ontology development. OIP will help them to ensure that all curated data is bound to unambiguous and semantically-distinct terms. Additional rules encoded in OWL also provide guidance on how to create a complete and consistent data model. OIP incorporates links to other relevant ontologies and therefore also facilitates their re-use.

  • Sharing and integration of experimental results. The OIP ontology binds terms to globally unique identifiers (URIs) and will therefore enable data sharing and analysis using Semantic Web technologies. These, in turn, will facilitate further integration with other data resources. As a proof of concept, we have already converted part of the PHI-base data into RDF format and explored the possibility of linking it to the RDF representation of the UniProt database. The results of this work can be accessed via our SPARQL endpoint.

  • Development of novel computational analyses. The use of Semantic Web technologies allows easy creation of highly sophisticated integrated datasets through the use of globally unique identifiers. Construction of such integrated datasets can bring in essential biological knowledge to complement the data available in databases like PHI-base. These datasets can facilitate further analysis using traditional bioinformatics techniques (e.g. guilt-by-association, biological network structure-based methods). The use of OWL ontologies also opens up the possibility of using semantic reasoning analysis.
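
As a minimal sketch of the linking idea described in the list above, the Python fragment below represents statements as (subject, predicate, object) URI triples and joins two datasets wherever they share a URI. Only the OIP namespace and the UniProt URI prefix are taken from real resources. The "/" separator after the namespace, the term names (hasPathogenGeneProduct, hasObservedPhenotype, ReducedVirulence), the example.org statements and the choice of accession P12345 are all assumptions made for illustration, not actual OIP terms or PHI-base content.

```python
# Sketch only: every term name and record below is hypothetical.
OIP = "http://oip.rothamsted.ac.uk/"          # namespace from this page; "/" separator is assumed
UNIPROT = "http://purl.uniprot.org/uniprot/"  # URI prefix used for UniProt protein entries
EX = "http://example.org/"                    # placeholder namespace for a second dataset

# Hypothetical curated interaction record expressed as URI triples.
phi_triples = [
    (OIP + "interaction/example-1", OIP + "hasPathogenGeneProduct", UNIPROT + "P12345"),
    (OIP + "interaction/example-1", OIP + "hasObservedPhenotype", OIP + "ReducedVirulence"),
]

# Hypothetical statements about the same protein from another resource.
external_triples = [
    (UNIPROT + "P12345", EX + "participatesIn", EX + "pathway/demo"),
]


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def join_on_shared_uri(left, right):
    """Pair each statement in `left` with statements in `right`
    whose subject is the same URI as the left statement's object."""
    by_subject = {}
    for s, p, o in right:
        by_subject.setdefault(s, []).append((p, o))
    return [
        (s, o, p2, o2)
        for s, _p, o in left
        for p2, o2 in by_subject.get(o, [])
    ]


linked = join_on_shared_uri(phi_triples, external_triples)
print(to_ntriples(phi_triples + external_triples))
for interaction, protein, predicate, value in linked:
    print(interaction, "->", protein, "->", value)
```

Because both datasets refer to the protein by the same URI, the join needs no identifier mapping table. In practice this kind of query would more likely be written in SPARQL against an RDF store than in hand-written Python.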

The OIP ontology aims only to formalise how experiments that profile host-pathogen interactions are described. The current project does not aim to go beyond the experimental data and does not attempt to organise the information with respect to the biological problem that those experiments ultimately investigate. Specifically, OIP does not offer terms for modelling known biological processes of pathogenic interactions or their relationship to higher-level phenotypes; it covers only observations arising from specific experiments.


One of the challenges faced by the plant pathogenic fungi research community is the unavoidable dispersion of research effort across a large number of diverse species, both in terms of plant hosts and fungal pathogens. The importance of this area of research is defined both by the need to balance food security and nature conservation concerns and by the increasing threat of fungicide resistance, which is leading to the loss of previously successful pest control chemicals. To fulfil the needs of the ever-growing human population, agricultural crop productivity needs to increase. For this to happen, there needs to be a significant reduction in the potential losses to agricultural crops from the myriad pathogenic microbes and invertebrate pests that cause diseases and damage. These plant-infecting organisms include phytopathogenic bacteria, fungi, protists, viruses, insects and nematodes. The resulting biotic stresses cause major losses to crop yield and product quality, and severe disease epidemics can result in the complete loss of the harvest.[1] An estimated 20-40% reduction in agricultural production occurs globally each year due to pathogens and pests.[2][3] In addition, a steady increase in the virulence of plant-infecting organisms has been observed over the past few decades, which is making the task of pathogen and pest control even more difficult.[4]

The problem of consistent data sharing is particularly acute for this area of research. Different combinations of host and pathogen pairs can be studied in parallel by very distinct communities. At the same time, findings may often hold global relevance, for example in cases where fungal species can be pathogens of both humans and plants. Some progress towards relevant standards has now been made, in particular through the development of the PAMGO ontology. Another key resource for collecting and sharing such data is the PHI-base database. Since 2005, PHI-base has catalogued experimentally verified pathogenicity, virulence and effector genes from fungal, protist and bacterial pathogens which infect animal, plant, fish, insect and/or fungal hosts.[5] PHI-base is an online resource devoted to the identification and presentation of phenotypic and genotypic information on pathogenicity and effector genes and their host interactions. As such, it is a valuable resource for comparative genomic analyses, comparative gene function analyses and for the discovery of candidate targets for chemical intervention in medically and agronomically important species. All the data in PHI-base is manually curated.

However, the currently available standards and resources still require a formally defined structure to organise all these pieces of data. As the scope of the PAMGO ontology is to provide annotations at gene level, the closest initiative so far that could provide such a standard is the one defined by the PHI-base schema and the small selection of controlled vocabularies that support it. The Ontology for Investigations on Pathogenicity builds upon the experiences of the PHI-base project to formally define such a standard in the OWL language.


The ontology was originally developed based on experience gained from the development of the PHI-base database. The PHI-base resource was established to catalogue the growing number of experimentally verified genes needed for an organism to cause disease and/or trigger a host response. As well as collecting information about genes, PHI-base also captures details of experimental set-up, phenotypic observations and known gene-level annotation. As PHI-base was originally established to hold information about fungal and Oomycete pathogens, its data model is primarily designed to accommodate the types of experiments relevant for those organisms. Although PHI-base does not restrict the range of host species for which data is collected (as some fungi and Oomycetes can infect a wide range of different species), its primary focus is still on the pathogens of plants of agricultural significance.

The Ontology for Investigations on Pathogenicity (OIP) was conceived as a way of formally modelling data in the PHI-base database. As such, OIP may at present also reproduce some of the original biases of PHI-base. The intention is to resolve them in the final version and to develop a data model that is both widely applicable and generic. The aim is not only to provide a well-defined, formal vocabulary for this domain of knowledge, but also to enable ontology-based reasoning and the use of Semantic Web technologies to further disseminate and improve analysis of these data.

How to contribute

We welcome all comments and suggestions from the community and have created this wiki to collect such feedback. As there have been persistent problems with malicious spam attacks, we have instituted a manual approval process for any new accounts. Therefore, to get an account, please send an e-mail to Dr Artem Lysenko at artem.lysenko ‘at’ rothamsted.ac.uk.

The wiki pages hold a lossless view of the entire OWL ontology structure (classes, properties and restriction axioms are all included). We use semi-automated methods, supporting both export and import of the entire ontology, to synchronise the OWL ontology master file with the content of the wiki. The OWL semantics are realised using the Semantic MediaWiki extension. Therefore, if you would like to contribute to the ontology development directly, you will need to familiarise yourself both with the Semantic MediaWiki syntactical constructs and with our adopted conventions for using them. Alternatively, everyone is welcome to provide feedback using the discussion pages for individual OWL classes and properties.

Access options


This work was funded by BBSRC and realised as part of a collaboration between Rothamsted Research and Brunel University.



  1. Agrios, G. N. (2005). Plant Pathology. Academic Press, London.
  2. Strange, R. N. & Scott, P. R. (2005). Plant disease: a threat to global food security. Annual Review of Phytopathology 43, 83-116.
  3. Oerke, E.-C. (2006). Crop losses to pests. Journal of Agricultural Science 144, 31-43.
  4. Fisher, M. C., Henk, D. A., Briggs, C. J., Brownstein, J. S., Madoff, L. C., McCraw, S. L. & Gurr, S. J. (2012). Emerging fungal threats to animal, plant and ecosystem health. Nature 484, 186-194.
  5. Winnenburg, R., Baldwin, T. K., Urban, M., Rawlings, C., Köhler, J. & Hammond-Kosack, K. E. (2006). PHI-base: a new database for pathogen host interactions. Nucleic Acids Research 34 (Database issue), D459-D464.