BiDaS: a web-based Monte Carlo BioData Simulator based on sequence/feature characteristics

Authors :
Emmanouil Athanasiadis
Maria D. Paraskevopoulou
George Spyrou
Ioannis S. Vlachos
Source :
Nucleic Acids Research
Publication Year :
2013
Publisher :
Oxford University Press (OUP), 2013.

Abstract

BiDaS is a web application that can generate massive Monte Carlo simulated sequence or numerical feature data sets (e.g. dinucleotide content, composition, transition, distribution properties) based on small user-provided data sets. The BiDaS server enables users to analyze their data and generate large amounts of: (i) simulated DNA/RNA and amino acid (AA) sequences whose sequence and/or extracted feature distributions are practically identical to those of the original data; (ii) simulated numerical features that present identical distributions while preserving the exact 2D or 3D between-feature correlations observed in the original data sets. The server can project the provided sequences to multidimensional feature spaces based on: (i) 38 DNA/RNA features describing conformational and physicochemical nucleotide sequence features from the B-DNA-VIDEO database, (ii) 122 DNA/RNA features based on conformational and thermodynamic dinucleotide properties from the DiProDB database and (iii) the pseudo-amino acid composition of the initial sequences. To the best of our knowledge, this is the first available web server that allows users to generate vast numbers of biological data sets with realistic characteristics, while keeping between-feature associations. These data sets can be used for a wide variety of current biological problems, such as the in-depth study of gene, transcript, peptide and protein groups/families; the creation of large data sets from just a few available members; and the strengthening of machine learning classifiers. All simulations use advanced Monte Carlo sampling techniques. The BiDaS web application is available at http://bioserver-3.bioacademy.gr/Bioserver/BiDaS/.
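The abstract does not say which sampling scheme BiDaS uses. As a rough illustration of simulating sequences that follow the transition statistics of a small input set, the sketch below fits a first-order Markov chain to a few seed sequences and samples new ones from it. This is an assumed, simplified technique and not the BiDaS implementation; the function names `fit_transitions` and `simulate` and the seed sequences are hypothetical.

```python
import random
from collections import Counter, defaultdict


def fit_transitions(seqs):
    """Count start symbols and symbol-to-symbol transitions in the seed set."""
    starts = Counter(s[0] for s in seqs if s)
    trans = defaultdict(Counter)
    for s in seqs:
        for a, b in zip(s, s[1:]):
            trans[a][b] += 1
    return starts, trans


def draw(counter, rng):
    """Draw one symbol with probability proportional to its observed count."""
    symbols, weights = zip(*counter.items())
    return rng.choices(symbols, weights)[0]


def simulate(starts, trans, length, rng):
    """Sample one sequence by walking the fitted first-order Markov chain."""
    seq = [draw(starts, rng)]
    while len(seq) < length:
        successors = trans.get(seq[-1])
        # If a symbol was never followed by anything in the seed set,
        # fall back to the start distribution (an assumption of this sketch).
        seq.append(draw(successors if successors else starts, rng))
    return "".join(seq)


# Hypothetical seed sequences, for illustration only.
seeds = ["ACGTACGT", "ACGGACGT"]
starts, trans = fit_transitions(seeds)
rng = random.Random(0)  # fixed seed so runs are repeatable
simulated = [simulate(starts, trans, 20, rng) for _ in range(5)]
print(simulated)
```

With these seeds every symbol has at least one observed successor, so each simulated sequence contains only dinucleotides that also occur in the input. A chain of this kind reproduces mono- and dinucleotide statistics only; it does not by itself preserve the between-feature correlations the abstract describes.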

Details

ISSN :
1362-4962 and 0305-1048
Volume :
41
Database :
OpenAIRE
Journal :
Nucleic Acids Research
Accession number :
edsair.doi.dedup.....474679d8098f0946542206ccd84314e7
Full Text :
https://doi.org/10.1093/nar/gkt420