The development of algorithms to infer the structure of gene regulatory networks based on expression data is an important subject in bioinformatics research. Validation of these algorithms requires benchmark data sets for which the underlying network is known. Since experimental data sets of the appropriate size and design are usually not available, there is a clear need to generate well-characterized synthetic data sets that allow thorough testing of learning algorithms in a fast and reproducible manner.

In this paper we describe a network generator that creates synthetic transcriptional regulatory networks and produces simulated gene expression data that approximate experimental data. Network topologies are generated by selecting subnetworks from previously described regulatory networks. Interaction kinetics are modeled by equations based on Michaelis-Menten and Hill kinetics. Our results show that the statistical properties of these topologies more closely approximate those of genuine biological networks than do those of different types of random graph models. Several user-definable parameters adjust the complexity of the resulting data set with respect to the structure learning algorithms.

This network generation technique offers a valid alternative to existing methods. The topological characteristics of the generated networks more closely resemble the characteristics of real transcriptional networks. Simulation of the network scales well to large networks. The generator models different types of biological interactions and produces biologically plausible synthetic gene expression data.

Recent technological advances have made the application of high throughput assays, such as microarrays, common practice. The ability to simultaneously measure the expression level of a large number of genes makes it possible to take a system-wide view of the cell. Developing reliable data analysis methods that infer the complex network of interactions between the various constituents of a living system based on high throughput data is a major issue in current bioinformatics research. Because data on transcriptional regulation are most accessible, much effort goes to the development of algorithms that infer the structure of transcriptional regulatory networks (TRNs) from these data.

Gaining statistical knowledge about the performance of these algorithms requires repeatedly testing them on large, high-quality data sets obtained from many experimental conditions and derived from different well-characterized networks. Unfortunately, experimental data sets of the appropriate size and design are usually not available. Moreover, knowledge about the underlying biological TRN is often incomplete or unavailable.

As a consequence, validation strategies applied to experimentally obtained data are often limited to confirming previously known interactions in the reconstructed network. With such an approach, however, false positive interactions are not penalized, because assessing the relevance of predicted interactions that have not been experimentally confirmed is infeasible. Furthermore, an algorithm can usually only be applied to data from a single network, which complicates algorithm design and validation.

Due to these limitations of real experimental data, the use of simulated data for benchmarking structure learning algorithms is gaining interest.
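
To make the mention of Michaelis-Menten and Hill kinetics more concrete, the following is a minimal sketch of how a Hill-type rate law might describe the effect of a regulator on a target gene. It is not the generator's actual set of equations: the function names, the parameters `k`, `n` and `v_max`, and the multiplicative rule for combining an activator with a repressor are all assumptions made for illustration.

```python
def hill_activation(x, k, n):
    """Fraction of maximal transcription driven by an activator at level x.

    k is the regulator level giving half-maximal effect (assumed > 0) and
    n is the Hill coefficient. With n = 1 this reduces to the
    Michaelis-Menten form x / (k + x).
    """
    xn = x ** n
    return xn / (k ** n + xn)


def hill_repression(x, k, n):
    """Fraction of maximal transcription remaining under a repressor at level x."""
    return 1.0 - hill_activation(x, k, n)


def target_rate(v_max, activator, repressor, k=0.5, n=2.0):
    """Transcription rate of a target gene with one activator and one repressor.

    Hypothetical combination rule (an assumption, not taken from the text):
    the two regulator effects are simply multiplied.
    """
    return v_max * hill_activation(activator, k, n) * hill_repression(repressor, k, n)


if __name__ == "__main__":
    # Sweep the activator level while the repressor is absent.
    for level in (0.0, 0.25, 0.5, 1.0):
        print(level, target_rate(1.0, activator=level, repressor=0.0))
```

In a sketch like this, raising `n` makes the response more switch-like, which is one way a user-definable parameter could change how hard the resulting data set is for a structure learning algorithm.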