SysMO-DB is creating a web-based platform, and associated tooling, for finding, sharing and exchanging Data, Models and Processes in Systems Biology. It was designed to support the SysMO Consortium (Systems Biology for Micro-Organisms), but the principles and methods employed are equally applicable to other multi-site Systems Biology projects.
The main objectives of SysMO-DB are to facilitate the web-based exchange of data between research groups within the consortium and with other consortia, and to provide an integrated platform for the dissemination of the results of the SysMO projects to the scientific community. Our system is a progressive and scalable solution to the data management needs of the SysMO initiative that:
- facilitates and maximises the potential for data exchange between SysMO research groups;
- maximises the 'shelf life' and utility of data generated by SysMO;
- provides an integrated platform for the dissemination of the results of the SysMO projects to the scientific community; and
- facilitates standardisation of practices in Systems Biology for the interfacing of modelling and experimentation.
We follow several key principles:
- exploit what is already available, from the consortium and from the wider scientific community, and avoid reinvention;
- identify the least we can do to deliver a benefit, and do this incrementally.
SysMO-DB focuses on the exchange of data, models and expertise across the consortium. Therefore, we must make it easy for scientists to adopt our methods. Rather than oblige all partners to define or adopt comprehensive formats or controlled vocabularies, or force them to change their data management solutions (if they have them), we define a "Just Enough Results Model" (JERM), which is the minimum metadata model required to usefully interlink data.
The JERM is based on the ISA format (Investigations, Studies and Assays). ISA is a community standardisation activity to allow the association of multiple 'Omics datasets in the context of the experiments that created them. By building on this existing work, we open the possibility of exporting data in ISA-TAB format and therefore exchanging data with a much wider community.
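As a rough illustration of the Investigation, Study and Assay hierarchy described above, the following Python sketch nests the three levels and collects the data files reachable from one investigation. The class and field names (`title`, `data_files`, `all_data_files`) are assumptions made for this example; they are not taken from the JERM or from the ISA-TAB specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    """One measurement or analysis, with the data files it produced."""
    title: str
    data_files: List[str] = field(default_factory=list)


@dataclass
class Study:
    """A group of related assays (hypothetical minimal fields)."""
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """Top-level context that groups related studies."""
    title: str
    studies: List[Study] = field(default_factory=list)

    def all_data_files(self) -> List[str]:
        """Collect every data file reachable from this investigation."""
        return [f for s in self.studies for a in s.assays for f in a.data_files]


# Illustrative data only; the titles and file names are invented.
inv = Investigation(
    title="Example investigation",
    studies=[
        Study("Example study", [
            Assay("Transcriptomics run", ["expr.csv"]),
            Assay("Metabolomics run", ["metab_a.csv", "metab_b.csv"]),
        ])
    ],
)
print(inv.all_data_files())
```

Keeping each dataset attached to its assay, and each assay to its study, is what lets different 'Omics datasets be related through the experiments that produced them.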
JERM extractors process and extract results, which we can then access through the JERM Interface. This means that we are able to cater for the heterogeneity of the underlying data resources by hiding them behind an interface; moreover, we can evolve the interface and evolve the extractors as we establish data solutions at the sites and as we improve the annotation of data. This will allow us to incrementally assess the cost of building interfaces onto existing solutions as opposed to developing complete migrations to suggested SysMO community solutions, and facilitate an impact analysis on current applications and practices by each partner. In other words, we identify what is necessary and sufficient for each partner and for exchange, and no more.
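The idea of hiding heterogeneous sources behind one interface can be sketched as follows. This is a minimal illustration, not the SysMO-DB implementation: `JermRecord`, `Extractor`, `CsvExtractor`, `DictExtractor` and `collect`, together with all field and column names, are hypothetical and chosen only for this example.

```python
import csv
import io
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class JermRecord:
    """Hypothetical minimal metadata shared by all sites."""
    title: str
    owner: str
    site: str


class Extractor(ABC):
    """Common interface; each site-specific source gets its own subclass."""

    @abstractmethod
    def extract(self) -> List[JermRecord]:
        ...


class CsvExtractor(Extractor):
    """Reads records from CSV text with assumed columns 'name' and 'contact'."""

    def __init__(self, text: str, site: str) -> None:
        self.text = text
        self.site = site

    def extract(self) -> List[JermRecord]:
        reader = csv.DictReader(io.StringIO(self.text))
        return [JermRecord(row["name"], row["contact"], self.site) for row in reader]


class DictExtractor(Extractor):
    """Reads records from dictionaries with assumed keys 'label' and 'pi'."""

    def __init__(self, entries: List[Dict[str, str]], site: str) -> None:
        self.entries = entries
        self.site = site

    def extract(self) -> List[JermRecord]:
        return [JermRecord(e["label"], e["pi"], self.site) for e in self.entries]


def collect(extractors: List[Extractor]) -> List[JermRecord]:
    """Callers see only JermRecord objects, never the underlying formats."""
    records: List[JermRecord] = []
    for extractor in extractors:
        records.extend(extractor.extract())
    return records


records = collect([
    CsvExtractor("name,contact\ngrowth curves,alice\n", site="site-a"),
    DictExtractor([{"label": "flux model", "pi": "bob"}], site="site-b"),
])
for record in records:
    print(record)
```

Because callers depend only on `Extractor` and `JermRecord`, a site can replace its storage format by supplying a new extractor without changing the code that consumes the records.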
As a consequence of this approach, we can also allow researchers to store data at their own institutions, extracting it only when required and only when those requests are from other researchers with access rights to that data.
SEEK is the SysMO Assets Catalogue. It contains information about who holds what data, models, protocols and expertise, and where those assets are held. The SEEK is the main web-based access point to the system and provides an access control layer to enable researchers to restrict access to collaborators, colleagues or other individuals until they are ready to share with the whole consortium or the wider community.
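A catalogue with an access control layer of this kind could be modelled roughly as below. This sketch assumes a very simple policy (an asset is visible if it is public, if the viewer holds it, or if the holder has shared it with the viewer); the names `CatalogueEntry`, `can_view` and `visible_entries` are hypothetical and do not describe the actual SEEK data model.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class CatalogueEntry:
    """Records who holds an asset and where, not the asset itself."""
    title: str
    holder: str
    location: str
    public: bool = False
    shared_with: Set[str] = field(default_factory=set)

    def can_view(self, user: str) -> bool:
        """Assumed policy: public, held by the user, or explicitly shared."""
        return self.public or user == self.holder or user in self.shared_with


def visible_entries(catalogue: List[CatalogueEntry], user: str) -> List[str]:
    """Titles of the catalogue entries the given user is allowed to see."""
    return [entry.title for entry in catalogue if entry.can_view(user)]


# Illustrative entries only; the names and locations are invented.
catalogue = [
    CatalogueEntry("Growth data", "alice", "site-a", shared_with={"bob"}),
    CatalogueEntry("Kinetic model", "bob", "site-b", public=True),
    CatalogueEntry("Draft protocol", "carol", "site-c"),
]
print(visible_entries(catalogue, "bob"))
```

Under this assumed policy, widening access is a matter of adding names to `shared_with` or setting `public`, which mirrors sharing first with collaborators and later with the wider community.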
Some of the content of SysMO-SEEK is available to the public and some is only available to members of the SysMO consortium. To view public SEEK assets, please see the SysMO-SEEK