SysMO-DB

SysMO is a European trans-national funding and research initiative on “Systems Biology of Microorganisms”. The goal pursued by SysMO is to record and describe the dynamic molecular processes occurring in microorganisms in a comprehensive way and to present these processes in the form of computerized mathematical models. The aim is to pool research capacities and know-how from eleven projects. To facilitate this process, the Data Management Group (DMG) has been created to support data access and integration.

Each of the individual projects in SysMO is working towards different research outcomes, and together they represent a cross-section of microorganisms, including bacteria, archaea and yeast. The environmental conditions for each organism also vary widely, with organisms growing in culture, soil, water and animal hosts.

As a consequence of this diversity, there is no single model for experimentation, for the types of data collected, or for the types of models produced. In order to pool the research outcomes for SysMO, our job is to support and manage this diversity and to promote a shared understanding across the community by using the same technologies.

The main objectives of SysMO-DB are to facilitate the web-based exchange of data between research groups, both within and between consortia, and to provide an integrated platform for the dissemination of the results of the SysMO projects to the scientific community. We aim to devise a progressive and scalable solution to the data management needs of the SysMO initiative that:

  • facilitates and maximises the potential for data exchange between SysMO research groups;
  • maximises the ‘shelf life’ and utility of data generated by SysMO;
  • provides an integrated platform for the dissemination of the results of the SysMO projects to the scientific community; and
  • facilitates standardization of practices in Systems Biology for the interfacing of modelling and experimentation.

We follow several key principles:

  • exploit what is already available, both within the consortium and outside it, and do not reinvent;
  • identify the least we can do to deliver a benefit, and do this incrementally.

We propose a strategy built around the following:

  • SysMO-HUB: unified access to SysMO resources, and integrated queries across data, workflow and model catalogues and repositories, through a customised web portal.
  • SysMO-SEEK: a self-curated, access-controlled catalogue of assets, accessed through the HUB or from the portals, groupware and applications already used by SysMO projects.
  • SysMO-JERM: the “Just Enough Results Model” (JERM) which is the minimum content metadata model required to usefully exchange and interlink data.
  • JWS-Online: a shared, but access-controlled, annotated model repository and simulator.
  • Taverna workflows: for automating processes that combine SysMO resources and external resources (datasets, tools, programs), build and populate SBML models, and validate the results of model simulations.
  • myExperiment: a shared, community repository and social curation environment for a pool of Taverna workflows.
  • Data management practice: to promote and support the setting up of appropriate databases, such as SABIO-RK and BRENDA, and the adoption of data standards; to develop agreements on common data exchange formats and controlled vocabularies for annotations, and on the adoption of community-accepted standards; and to facilitate the annotation of data and models using local and external controlled vocabularies.
  • Training: on databases, models, workflow systems and web services, and best practice for the annotation of resources by metadata.

The architecture is a layered one.

A key approach is the JERM exchange. We define and build extractors to extract and access results from the SysMO projects' data resources, in whatever form they take, through the JERM Interface, which is implemented as a web access interface. This means that we are able to cater for the heterogeneity of the underlying data resources by hiding them behind an interface; moreover, we can evolve the interface and the extractors as we establish data solutions at the sites and as we improve the annotation of data. This will allow us to incrementally assess the cost of building interfaces onto existing solutions, as opposed to developing complete migrations to suggested SysMO community solutions, and will facilitate an impact analysis on current applications and practices by each partner. In other words, we identify what is necessary and sufficient for each partner and for exchange, and no more.
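
The extractor idea can be illustrated with a minimal Python sketch. It is not the SysMO-DB implementation: the record fields, the class names (JermRecord, Extractor, CsvExtractor, DictExtractor), the function collect, and the sample project names and column names are all assumptions made up for illustration. The sketch only shows how differently stored results might be presented through one common interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Iterable, List
import csv
import io


@dataclass(frozen=True)
class JermRecord:
    """Hypothetical minimal metadata record; the field names are assumptions."""
    project: str
    title: str
    asset_type: str  # for example "data" or "model"
    location: str    # where the asset itself can be found


class Extractor(ABC):
    """Common interface that hides how a project stores its results."""

    @abstractmethod
    def extract(self) -> List[JermRecord]:
        """Return this project's results as uniform records."""


class CsvExtractor(Extractor):
    """Reads records from a CSV export with columns name, kind, path (assumed)."""

    def __init__(self, project: str, csv_text: str) -> None:
        self.project = project
        self.csv_text = csv_text

    def extract(self) -> List[JermRecord]:
        reader = csv.DictReader(io.StringIO(self.csv_text))
        return [
            JermRecord(self.project, row["name"], row["kind"], row["path"])
            for row in reader
        ]


class DictExtractor(Extractor):
    """Reads records from a list of dicts standing in for a project database."""

    def __init__(self, project: str, rows: List[Dict[str, str]]) -> None:
        self.project = project
        self.rows = rows

    def extract(self) -> List[JermRecord]:
        return [
            JermRecord(self.project, r["label"], r["category"], r["uri"])
            for r in self.rows
        ]


def collect(extractors: Iterable[Extractor]) -> List[JermRecord]:
    """Gather records from every source through the shared interface."""
    records: List[JermRecord] = []
    for extractor in extractors:
        records.extend(extractor.extract())
    return records


if __name__ == "__main__":
    sources = [
        CsvExtractor("ProjectA", "name,kind,path\nGlucose pulse,data,files/pulse.csv\n"),
        DictExtractor("ProjectB", [
            {"label": "Glycolysis model", "category": "model", "uri": "models/1"},
        ]),
    ]
    for record in collect(sources):
        print(record)
```

In this sketch a new storage format is supported by adding one more Extractor subclass, and the code that consumes the records is left unchanged, which mirrors the incremental approach described above.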

 
start.txt · Last modified: 2009/04/01 10:28 (external edit)