Flexible informatics for linking experimental data to mathematical models via DataRail
Julio Saez-Rodriguez 1,2,
Arthur Goldsipe 1,3,
Jeremy Muhlich 1,2,
Leonidas G. Alexopoulos 1,2,
Bjorn Millard 1,2,
Douglas A. Lauffenburger 1,3 and
Peter K. Sorger 1,2,3
1 Center for Cell Decision Processes, 2 Department of Systems Biology, Harvard Medical School, Boston, MA 02115 and 3 Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
Bioinformatics 2008 24(6):840-847. Open Access Article.
Abstract
Motivation: Linking experimental data to mathematical models in biology is impeded by the lack of suitable software to manage and transform data. Model calibration would be facilitated and models would increase in value were it possible to preserve links to training data along with a record of all normalization, scaling, and fusion routines used to assemble the training data from primary results.
Results: We describe the implementation of DataRail, an open source MATLAB-based toolbox that stores experimental data in flexible multi-dimensional arrays, transforms arrays so as to maximize information content, and then constructs models using internal or external tools. Data integrity is maintained via a containment hierarchy for arrays, imposition of a metadata standard based on a newly proposed MIDAS format, assignment of semantically typed universal identifiers, and implementation of a procedure for storing the history of all transformations with the array. We illustrate the utility of DataRail by processing a newly collected set of 22 000 measurements of protein activities obtained from cytokine-stimulated primary and transformed human liver cells.
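The idea of storing a transformation history with each array can be illustrated with a minimal sketch. This is not DataRail code and does not reproduce its MATLAB API: it is written in Python for brevity, and the names `DataArray`, `transform`, `scale_to_max`, and the `array:` identifier prefix are all hypothetical, chosen only to show one way an array might carry its dimension labels, a typed identifier, and a record of every transformation applied to it.

```python
from dataclasses import dataclass, field
from typing import Callable, Tuple
import uuid


@dataclass(frozen=True)
class DataArray:
    """Hypothetical array that carries its own labels and provenance."""
    values: Tuple[Tuple[float, ...], ...]  # 2-D for brevity; real data would have more dimensions
    dims: Tuple[str, ...]                  # one label per dimension
    history: Tuple[str, ...] = ()          # ordered record of transformations applied so far
    # Assumed identifier scheme: a type prefix plus a random UUID.
    uid: str = field(default_factory=lambda: "array:" + uuid.uuid4().hex)

    def transform(self, name: str, fn: Callable[[float], float]) -> "DataArray":
        """Apply fn element-wise and return a NEW array with name appended to the history."""
        new_values = tuple(tuple(fn(v) for v in row) for row in self.values)
        return DataArray(new_values, self.dims, self.history + (name,))


def scale_to_max(array: DataArray) -> DataArray:
    """Divide every value by the largest value in the array, recording the peak used."""
    peak = max(max(row) for row in array.values)
    return array.transform(f"scale_to_max(peak={peak})", lambda v: v / peak)


# Usage: the primary array is left untouched; the derived array remembers how it was made.
raw = DataArray(((2.0, 4.0), (1.0, 8.0)), ("treatment", "signal"))
scaled = scale_to_max(raw)
print(scaled.history)
```

Because each transformation returns a new object rather than modifying the input, the primary data and every derived array remain available, and the history on a derived array is enough to reconstruct how it was assembled.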
Availability: DataRail is distributed under the GNU General Public License and available at http://code.google.com/p/sbpipeline/