biochem4j

Citation formats

Standard

biochem4j: integrated and extensible biochemical knowledge through graph databases. / Swainston, Neil; Batista-Navarro, Riza Theresa; Carbonell, Pablo; Dobson, Paul; Dunstan, Mark; Jervis, Adrian; Vinaixa, Maria; Williams, Alan; Ananiadou, Sophia; Faulon, Jean-Loup; Pedrosa Mendes, Pedro; Kell, Douglas; Scrutton, Nigel; Breitling, Rainer.

In: PLoS ONE, Vol. 12, No. 7, e0179130, 2017.

Research output: Contribution to journal, Article, peer-review

Bibtex

@article{efde089c2cab47319c1fce8ea96962cf,
title = "biochem4j: integrated and extensible biochemical knowledge through graph databases",
abstract = "Biologists and biochemists have at their disposal a number of excellent, publicly available data resources such as UniProt, KEGG, and NCBI Taxonomy, which catalogue biological entities. Despite the usefulness of these resources, they remain fundamentally unconnected. While links may appear between entries across these databases, users are typically only able to follow such links by manual browsing or through specialised workflows. Although many of the resources provide web-service interfaces for computational access, performing federated queries across databases remains a non-trivial but essential activity in interdisciplinary systems and synthetic biology programmes. What is needed are integrated repositories to catalogue both biological entities and – crucially – the relationships between them. Such a resource should be extensible, such that newly discovered relationships – for example, those between novel, synthetic enzymes and non-natural products – can be added over time. With the introduction of graph databases, the barrier to the rapid generation, extension and querying of such a resource has been lowered considerably. With a particular focus on metabolic engineering as an illustrative application domain, biochem4j, freely available at http://biochem4j.org, is introduced to provide an integrated, queryable database that warehouses chemical, reaction, enzyme and taxonomic data from a range of reliable resources. The biochem4j framework establishes a starting point for the flexible integration and exploitation of an ever-wider range of biological data sources, from public databases to laboratory-specific experimental datasets, for the benefit of systems biologists, biosystems engineers and the wider community of molecular biologists and biological chemists.",
keywords = "graph database, neo4j, Metabolism, enzymes, metabolomics, systems biology, modelling, metabolic engineering, pathway design",
author = "Neil Swainston and Batista-Navarro, {Riza Theresa} and Pablo Carbonell and Paul Dobson and Mark Dunstan and Adrian Jervis and Maria Vinaixa and Alan Williams and Sophia Ananiadou and Jean-Loup Faulon and {Pedrosa Mendes}, Pedro and Douglas Kell and Nigel Scrutton and Rainer Breitling",
year = "2017",
doi = "10.1371/journal.pone.0179130",
language = "English",
volume = "12",
journal = "PLoS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "7",

}

RIS

TY - JOUR

T1 - biochem4j

T2 - integrated and extensible biochemical knowledge through graph databases

AU - Swainston, Neil

AU - Batista-Navarro, Riza Theresa

AU - Carbonell, Pablo

AU - Dobson, Paul

AU - Dunstan, Mark

AU - Jervis, Adrian

AU - Vinaixa, Maria

AU - Williams, Alan

AU - Ananiadou, Sophia

AU - Faulon, Jean-Loup

AU - Pedrosa Mendes, Pedro

AU - Kell, Douglas

AU - Scrutton, Nigel

AU - Breitling, Rainer

PY - 2017

Y1 - 2017

AB - Biologists and biochemists have at their disposal a number of excellent, publicly available data resources such as UniProt, KEGG, and NCBI Taxonomy, which catalogue biological entities. Despite the usefulness of these resources, they remain fundamentally unconnected. While links may appear between entries across these databases, users are typically only able to follow such links by manual browsing or through specialised workflows. Although many of the resources provide web-service interfaces for computational access, performing federated queries across databases remains a non-trivial but essential activity in interdisciplinary systems and synthetic biology programmes. What is needed are integrated repositories to catalogue both biological entities and – crucially – the relationships between them. Such a resource should be extensible, such that newly discovered relationships – for example, those between novel, synthetic enzymes and non-natural products – can be added over time. With the introduction of graph databases, the barrier to the rapid generation, extension and querying of such a resource has been lowered considerably. With a particular focus on metabolic engineering as an illustrative application domain, biochem4j, freely available at http://biochem4j.org, is introduced to provide an integrated, queryable database that warehouses chemical, reaction, enzyme and taxonomic data from a range of reliable resources. The biochem4j framework establishes a starting point for the flexible integration and exploitation of an ever-wider range of biological data sources, from public databases to laboratory-specific experimental datasets, for the benefit of systems biologists, biosystems engineers and the wider community of molecular biologists and biological chemists.

KW - graph database

KW - neo4j

KW - Metabolism

KW - enzymes

KW - metabolomics

KW - systems biology

KW - modelling

KW - metabolic engineering

KW - pathway design

U2 - 10.1371/journal.pone.0179130

DO - 10.1371/journal.pone.0179130

M3 - Article

VL - 12

JO - PLoS ONE

JF - PLoS ONE

SN - 1932-6203

IS - 7

M1 - e0179130

ER -