14 results for "Schwämmle, Veit"
Search Results
2. Automated workflow composition in mass spectrometry based proteomics
- Author
- Software Technology, Sub Software Technology, Palmblad, Magnus, Lamprecht, A.L., Ison, Jon, and Schwämmle, Veit
- Published
- 2019
3. Tools and data services registry: a community effort to document bioinformatics resources
- Author
- Ison, Jon, Rapacki, Kristoffer, Ménager, Hervé, Kalaš, Matúš, Rydza, Emil, Chmura, Piotr, Anthon, Christian, Beard, Niall, Berka, Karel, Bolser, Dan, Booth, Tim, Bretaudeau, Anthony, Brezovsky, Jan, Casadio, Rita, Cesareni, Gianni, Coppens, Frederik, Cornell, Michael, Cuccuru, Gianmauro, Davidsen, Kristian, Vedova, Gianluca Della, Dogan, Tunca, Doppelt-Azeroual, Olivia, Emery, Laura, Gasteiger, Elisabeth, Gatter, Thomas, Goldberg, Tatyana, Grosjean, Marie, Grüning, Björn, Helmer-Citterich, Manuela, Ienasescu, Hans, Ioannidis, Vassilios, Jespersen, Martin Closter, Jimenez, Rafael, Juty, Nick, Juvan, Peter, Koch, Maximilian, Laibe, Camille, Li, Jing-Woei, Licata, Luana, Mareuil, Fabien, Mičetić, Ivan, Friborg, Rune Møllegaard, Moretti, Sebastien, Morris, Chris, Möller, Steffen, Nenadic, Aleksandra, Peterson, Hedi, Profiti, Giuseppe, Rice, Peter, Romano, Paolo, Roncaglia, Paola, Saidi, Rabie, Schafferhans, Andrea, Schwämmle, Veit, Smith, Callum, Sperotto, Maria Maddalena, Stockinger, Heinz, Vařeková, Radka Svobodová, Tosatto, Silvio C.E., de la Torre, Victor, Uva, Paolo, Via, Allegra, Yachdav, Guy, Zambelli, Federico, Vriend, Gert, Rost, Burkhard, Parkinson, Helen, Løngreen, Peter, and Brunak, Søren
- Abstract
Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.
- Published
- 2016
4. MetaboLink: a web application for streamlined processing and analysis of large-scale untargeted metabolomics data.
- Author
- Mendes A, Havelund JF, Lemvig J, Schwämmle V, and Færgeman NJ
- Abstract
Motivation: The post-processing and analysis of large-scale untargeted metabolomics data face significant challenges due to the intricate nature of correction, filtration, imputation, and normalization steps. Manual execution across various applications often leads to inefficiencies, human-induced errors, and inconsistencies within the workflow., Results: Addressing these issues, we introduce MetaboLink, a novel web application designed to process LC-MS metabolomics datasets combining established methodologies and offering flexibility and ease of implementation. It offers visualization options for data interpretation, an interface for statistical testing, and integration with PolySTest for further tests and with VSClust for clustering analysis., Availability: The fully functional tool is publicly available at https://computproteomics.bmb.sdu.dk/Metabolomics/. The source code is available at https://github.com/anitamnd/MetaboLink and a detailed description of the app can be found at https://github.com/anitamnd/MetaboLink/wiki. A tutorial video can be found at https://youtu.be/ZM6j10S6Z8Q., Supplementary Information: Supplementary data are available at Bioinformatics online., (© The Author(s) 2024. Published by Oxford University Press.)
- Published
- 2024
- Full Text
- View/download PDF
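The processing steps named in the record above (filtration, imputation, normalization) are generic enough to sketch. Below is a minimal, hypothetical illustration of two such steps, median normalization and a simple row-mean imputation, on a toy intensity table; it is not MetaboLink's code, just an example of the kind of operation such a pipeline automates.

```python
import numpy as np

# Toy intensity matrix: rows = metabolite features, columns = samples.
# np.nan marks missing values, as is common in untargeted LC-MS data.
X = np.array([
    [1.2e6, 1.4e6, np.nan, 1.1e6],
    [3.0e5, np.nan, 2.8e5, 3.3e5],
    [8.0e6, 7.5e6, 8.4e6, 7.9e6],
])

# Median normalization: scale each sample so its median intensity
# matches the overall median (one common normalization choice).
sample_medians = np.nanmedian(X, axis=0)
X_norm = X * (np.nanmedian(X) / sample_medians)

# Simple imputation: replace missing values with the feature's row mean.
# (Real pipelines offer more careful schemes, e.g. kNN or half-minimum.)
row_means = np.nanmean(X_norm, axis=1, keepdims=True)
X_imputed = np.where(np.isnan(X_norm), row_means, X_norm)

print(X_imputed)
```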
5. Variability analysis of LC-MS experimental factors and their impact on machine learning.
- Author
- Rehfeldt TG, Krawczyk K, Echers SG, Marcatili P, Palczynski P, Röttger R, and Schwämmle V
- Subjects
- Chromatography, Liquid, Tandem Mass Spectrometry, Machine Learning
- Abstract
Background: Machine learning (ML) technologies, especially deep learning (DL), have gained increasing attention in predictive mass spectrometry (MS) for enhancing the data-processing pipeline from raw data analysis to end-user predictions and rescoring. ML models need large-scale datasets for training and repurposing, which can be obtained from a range of public data repositories. However, applying ML to public MS datasets on larger scales is challenging, as they vary widely in terms of data acquisition methods, biological systems, and experimental designs., Results: We aim to facilitate ML efforts in MS data by conducting a systematic analysis of the potential sources of variability in public MS repositories. We also examine how these factors affect ML performance and perform comprehensive transfer-learning experiments to evaluate the benefits of current best-practice methods in the field., Conclusions: Our findings show significantly higher levels of homogeneity within a project than between projects, which indicates that it is important to construct datasets most closely resembling future test cases, as transferability is severely limited for unseen datasets. We also found that transfer learning did not increase model performance compared to a non-pretrained model., (© The Author(s) 2023. Published by Oxford University Press on behalf of GigaScience.)
- Published
- 2022
- Full Text
- View/download PDF
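The headline finding of the record above, higher homogeneity within a project than between projects, can be illustrated with a toy computation. The sketch below compares mean pairwise correlation of simulated runs within versus across two projects; the simulation and the correlation metric are illustrative assumptions, not the paper's actual analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate feature vectors for LC-MS runs from two projects: runs share
# a project-level offset, so within-project runs correlate more strongly.
def make_project(n_runs, n_features, offset_scale=2.0):
    offset = rng.normal(0, offset_scale, n_features)
    return [offset + rng.normal(0, 1, n_features) for _ in range(n_runs)]

proj_a = make_project(5, 200)
proj_b = make_project(5, 200)

def mean_corr(runs_x, runs_y, same=False):
    """Mean Pearson correlation over all run pairs."""
    cors = []
    for i, x in enumerate(runs_x):
        for j, y in enumerate(runs_y):
            if same and j <= i:  # skip self-pairs and duplicates
                continue
            cors.append(np.corrcoef(x, y)[0, 1])
    return np.mean(cors)

within = np.mean([mean_corr(proj_a, proj_a, same=True),
                  mean_corr(proj_b, proj_b, same=True)])
between = mean_corr(proj_a, proj_b)
print(f"within-project r = {within:.2f}, between-project r = {between:.2f}")
```

Because the project-level offset dominates the per-run noise, the within-project correlation comes out far higher, mirroring why models trained on one project transfer poorly to unseen datasets.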
6. VIQoR: a web service for visually supervised protein inference and protein quantification.
- Author
- Tsiamis V and Schwämmle V
- Subjects
- Algorithms, Peptides chemistry, Software, Proteins chemistry, Proteomics methods
- Abstract
Motivation: In quantitative bottom-up mass spectrometry (MS)-based proteomics, the reliable estimation of protein concentration changes from peptide quantifications between different biological samples is essential. This estimation is not a single task but comprises the two processes of protein inference and protein abundance summarization. Furthermore, due to the high complexity of proteomics data and associated uncertainty about the performance of these processes, there is a demand for comprehensive visualization methods able to integrate protein-level with peptide-level quantitative data, including post-translational modifications. However, a suitable tool that provides post-identification quantitative analysis of proteins with simultaneous interactive visualization has been lacking., Results: In this article, we present VIQoR, a user-friendly web service that accepts peptide quantitative data from both labeled and label-free experiments and combines the crucial components of protein inference and summarization with interactive visualization modules, including the novel VIQoR plot. We implemented two different parsimonious algorithms to solve the protein inference problem, while protein summarization is facilitated by a well-established factor analysis algorithm called fast-FARMS followed by a weighted average summarization function that minimizes the effect of missing values. In addition, summarization is optimized by the so-called Global Correlation Indicator (GCI). We test the tool on three publicly available ground truth datasets and demonstrate the ability of the protein inference algorithms to handle shared peptides. We furthermore show that GCI increases the accuracy of the quantitative analysis in datasets with replicated design., Availability and Implementation: VIQoR is accessible at: http://computproteomics.bmb.sdu.dk/Apps/VIQoR/. The source code is available at: https://bitbucket.org/veitveit/viqor/., Supplementary Information: Supplementary data are available at Bioinformatics online., (© The Author(s) 2022. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)
- Published
- 2022
- Full Text
- View/download PDF
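The summarization step described above, a weighted average over peptides that limits the influence of missing values, can be sketched as follows. The completeness-based weights here are an illustrative stand-in; VIQoR itself derives weights via fast-FARMS and optimizes them with the Global Correlation Indicator.

```python
import numpy as np

# Peptide log-intensities for one protein: rows = peptides, cols = samples.
peptides = np.array([
    [10.1, 10.4, 10.2, 11.0],
    [ 9.8, np.nan, 9.9, 10.6],     # one missing value
    [10.0, 10.3, np.nan, np.nan],  # two missing values
])

# Weight each peptide by its completeness, so peptides with many missing
# values contribute less to the protein-level profile (a hypothetical
# stand-in for VIQoR's factor-analysis-derived weights).
weights = np.mean(~np.isnan(peptides), axis=1)

def summarize(pep, w):
    """Weighted mean per sample, ignoring missing peptide values."""
    mask = ~np.isnan(pep)
    w_col = w[:, None] * mask            # zero weight where value missing
    return np.nansum(np.nan_to_num(pep) * w_col, axis=0) / w_col.sum(axis=0)

print(summarize(peptides, weights))
```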
7. MS2AI: automated repurposing of public peptide LC-MS data for machine learning applications.
- Author
- Rehfeldt TG, Krawczyk K, Bøgebjerg M, Schwämmle V, and Röttger R
- Subjects
- Chromatography, Liquid methods, Software, Proteome chemistry, Tandem Mass Spectrometry methods, Peptides analysis
- Abstract
Motivation: Liquid-chromatography mass-spectrometry (LC-MS) is the established standard for analyzing the proteome in biological samples by identification and quantification of thousands of proteins. Machine learning (ML) promises to considerably improve the analysis of the resulting data, however, there is yet to be any tool that mediates the path from raw data to modern ML applications. More specifically, ML applications are currently hampered by three major limitations: (i) absence of balanced training data with large sample size; (ii) unclear definition of sufficiently information-rich data representations for e.g. peptide identification; (iii) lack of benchmarking of ML methods on specific LC-MS problems., Results: We created the MS2AI pipeline that automates the process of gathering vast quantities of MS data for large-scale ML applications. The software retrieves raw data from either in-house sources or from the proteomics identifications database, PRIDE. Subsequently, the raw data are stored in a standardized format amenable for ML, encompassing MS1/MS2 spectra and peptide identifications. This tool bridges the gap between MS and AI, and to this effect we also present an ML application in the form of a convolutional neural network for the identification of oxidized peptides., Availability and Implementation: An open-source implementation of the software can be found at https://gitlab.com/roettgerlab/ms2ai., Supplementary Information: Supplementary data are available at Bioinformatics online., (© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)
- Published
- 2022
- Full Text
- View/download PDF
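The record above describes retrieving raw data from PRIDE. Below is a hedged sketch of how one might list a project's files through the PRIDE Archive REST API; the endpoint path and response fields are assumptions based on the public v2 API and should be checked against current documentation before use.

```python
import requests

# Hypothetical example: list files for a PRIDE project accession.
# Endpoint path and JSON field names are assumed from the public PRIDE
# Archive v2 REST API; verify at https://www.ebi.ac.uk/pride/ws/archive/v2/
ACCESSION = "PXD000001"
url = f"https://www.ebi.ac.uk/pride/ws/archive/v2/files/byProject?accession={ACCESSION}"

resp = requests.get(url, timeout=30)
resp.raise_for_status()

for f in resp.json():
    # Raw vendor files are what a pipeline like MS2AI would download
    # before extracting MS1/MS2 spectra into an ML-ready format.
    print(f.get("fileName"), f.get("fileCategory", {}).get("value"))
```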
8. biotoolsSchema: a formalized schema for bioinformatics software description.
- Author
- Ison J, Ienasescu H, Rydza E, Chmura P, Rapacki K, Gaignard A, Schwämmle V, van Helden J, Kalaš M, and Ménager H
- Subjects
- Databases, Factual, Humans, Semantics, Software, Biological Science Disciplines, Computational Biology
- Abstract
Background: Life scientists routinely face massive and heterogeneous data analysis tasks and must find and access the most suitable databases or software in a jungle of web-accessible resources. The diversity of information used to describe life-scientific digital resources presents an obstacle to their utilization. Although several standardization efforts are emerging, no information schema has been sufficiently detailed to enable uniform semantic and syntactic description, and cataloguing, of bioinformatics resources., Findings: Here we describe biotoolsSchema, a formalized information model that balances the needs of conciseness for rapid adoption against the provision of rich technical information and scientific context. biotoolsSchema results from a series of community-driven workshops and is deployed in the bio.tools registry, providing the scientific community with >17,000 machine-readable and human-understandable descriptions of software and other digital life-science resources. We compare our approach to related initiatives and provide alignments to foster interoperability and reusability., Conclusions: biotoolsSchema supports the formalized, rigorous, and consistent specification of the syntax and semantics of bioinformatics resources, and enables cataloguing efforts such as bio.tools that help scientists to find, comprehend, and compare resources. The use of biotoolsSchema in bio.tools promotes the FAIRness of research software, a key element of open and reproducible developments for data-intensive sciences., (© The Author(s) 2021. Published by Oxford University Press on behalf of GigaScience.)
- Published
- 2021
- Full Text
- View/download PDF
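To make the schema concrete, here is a minimal, hypothetical tool description in the spirit of biotoolsSchema, written as a Python dict. The field names follow the general shape of the schema (name, description, homepage, EDAM topic/operation annotations), but the exact required attributes and controlled vocabularies are defined by the official schema and by EDAM, so treat everything here as illustrative.

```python
import json

# A minimal, hypothetical bio.tools-style entry. Validate against the
# official biotoolsSchema before any real registration; EDAM URIs below
# are illustrative and should be checked against the ontology.
tool_entry = {
    "name": "ExampleTool",
    "biotoolsID": "exampletool",  # assumed identifier field
    "description": "Toy peptide quantification utility.",
    "homepage": "https://example.org/exampletool",
    "toolType": ["Command-line tool"],
    "topic": [
        {"term": "Proteomics", "uri": "http://edamontology.org/topic_0121"}
    ],
    "function": [
        {"operation": [{"term": "Quantification",
                        "uri": "http://edamontology.org/operation_3799"}]}
    ],
}

print(json.dumps(tool_entry, indent=2))
```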
9. Community curation of bioinformatics software and data resources.
- Author
- Ison J, Ménager H, Brancotte B, Jaaniso E, Salumets A, Raček T, Lamprecht AL, Palmblad M, Kalaš M, Chmura P, Hancock JM, Schwämmle V, and Ienasescu HI
- Subjects
- Computational Biology standards, Database Management Systems, Europe, Humans, Community Participation, Computational Biology methods, Software
- Abstract
The corpus of bioinformatics resources is huge and expanding rapidly, presenting life scientists with a growing challenge in selecting tools that fit the desired purpose. To address this, the European Infrastructure for Biological Information is supporting a systematic approach towards a comprehensive registry of tools and databases for all domains of bioinformatics, provided under a single portal (https://bio.tools). We describe here the practical means by which scientific communities, from individual developers and projects through major service providers and research infrastructures, can describe their own bioinformatics resources and share these via bio.tools., (© The Author(s) 2019. Published by Oxford University Press.)
- Published
- 2020
- Full Text
- View/download PDF
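Descriptions registered this way can also be consumed programmatically. Below is a hedged sketch querying the public bio.tools API for proteomics tools; the endpoint, parameter names, and response keys are assumptions based on the documented API (https://bio.tools/api/) and may need adjusting.

```python
import requests

# Hypothetical query: search bio.tools for proteomics resources.
# Endpoint and parameter names assumed from the public bio.tools API;
# consult the official API documentation for the authoritative spec.
resp = requests.get(
    "https://bio.tools/api/t/",
    params={"topic": "Proteomics", "format": "json"},
    timeout=30,
)
resp.raise_for_status()
data = resp.json()

print("matches:", data.get("count"))
for tool in data.get("list", [])[:5]:
    print(tool.get("biotoolsID"), "-", tool.get("name"))
```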
10. Automated workflow composition in mass spectrometry-based proteomics.
- Author
- Palmblad M, Lamprecht AL, Ison J, and Schwämmle V
- Subjects
- Computational Biology, Software, Mass Spectrometry, Proteomics, Workflow
- Abstract
Motivation: Numerous software utilities operating on mass spectrometry (MS) data are described in the literature and provide specific operations as building blocks for the assembly of on-purpose workflows. Working out which tools and combinations are applicable or optimal in practice is often hard. Thus researchers face difficulties in selecting practical and effective data analysis pipelines for a specific experimental design., Results: We provide a toolkit to support researchers in identifying, comparing and benchmarking multiple workflows from individual bioinformatics tools. Automated workflow composition is enabled by the tools' semantic annotation in terms of the EDAM ontology. To demonstrate the practical use of our framework, we created and evaluated a number of logically and semantically equivalent workflows for four use cases representing frequent tasks in MS-based proteomics. Indeed we found that the results computed by the workflows could vary considerably, emphasizing the benefits of a framework that facilitates their systematic exploration., Availability and Implementation: The project files and workflows are available from https://github.com/bio-tools/biotoolsCompose/tree/master/Automatic-Workflow-Composition., Supplementary Information: Supplementary data are available at Bioinformatics online., (© The Author(s) 2018. Published by Oxford University Press.)
- Published
- 2019
- Full Text
- View/download PDF
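The key idea in the record above, composing workflows automatically from tools' semantic input/output annotations, can be sketched as a small search over type-compatible tool chains. This is a deliberately tiny toy, not the authors' EDAM-based framework: the tool names and format labels below are made up stand-ins for EDAM terms.

```python
# Toy tool annotations: each tool consumes one data format and produces
# another (stand-ins for EDAM 'format' terms). Names are hypothetical.
TOOLS = {
    "RawConverter":  ("raw", "mzML"),
    "SearchEngine":  ("mzML", "pepXML"),
    "Validator":     ("pepXML", "pepXML.validated"),
    "Quantifier":    ("pepXML.validated", "protein_table"),
    "AltQuantifier": ("mzML", "protein_table"),
}

def compose(source, target, path=()):
    """Depth-first enumeration of tool chains from source to target format."""
    if source == target:
        yield path
        return
    for name, (inp, out) in TOOLS.items():
        if inp == source and name not in path:  # avoid revisiting tools
            yield from compose(out, target, path + (name,))

for workflow in compose("raw", "protein_table"):
    print(" -> ".join(workflow))
```

Run as-is, this enumerates two semantically valid pipelines from raw data to a protein table, echoing the paper's point that several logically equivalent workflows can exist for the same task and are worth comparing systematically.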
11. VSClust: feature-based variance-sensitive clustering of omics data.
- Author
- Schwämmle V and Jensen ON
- Subjects
- Algorithms, Software, Cluster Analysis, Genomics methods, Metabolomics
- Abstract
Motivation: Data clustering is indispensable for identifying biologically relevant molecular features in large-scale omics experiments with thousands of measurements at multiple conditions. Optimal clustering results yield groups of functionally related features that may include genes, proteins and metabolites in biological processes and molecular networks. Omics experiments typically include replicated measurements of each feature within a given condition to statistically assess feature-specific variation. Current clustering approaches ignore this variation by averaging, which often leads to incorrect cluster assignments., Results: We present VSClust that accounts for feature-specific variance. Based on an algorithm derived from fuzzy clustering, VSClust unifies statistical testing with pattern recognition to cluster the data into feature groups that more accurately reflect the underlying molecular and functional behavior. We apply VSClust to artificial and experimental datasets comprising hundreds to >80 000 features across 6-20 different conditions including genomics, transcriptomics, proteomics and metabolomics experiments. VSClust avoids arbitrary averaging methods, outperforms standard fuzzy c-means clustering and simplifies the data analysis workflow in large-scale omics studies., Availability and Implementation: Download VSClust at https://bitbucket.org/veitveit/vsclust or access it through computproteomics.bmb.sdu.dk/Apps/VSClust., Supplementary Information: Supplementary data are available at Bioinformatics online.
- Published
- 2018
- Full Text
- View/download PDF
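One simple way to make clustering variance-sensitive, in the spirit of the record above: shrink each feature's condition profile according to its replicate standard deviation before clustering, so noisy features exert less influence on cluster assignment. This is a hypothetical illustration only; VSClust instead feeds the estimated variance into a feature-specific fuzzifier within fuzzy c-means.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(1)

# Toy omics data: 100 features x 4 conditions x 3 replicates, with
# feature-specific noise levels as in real replicated experiments.
n_feat, n_cond, n_rep = 100, 4, 3
true_means = rng.normal(0, 1, (n_feat, n_cond))
noise_sd = rng.uniform(0.1, 2.0, (n_feat, 1, 1))
data = true_means[:, :, None] + rng.normal(0, 1, (n_feat, n_cond, n_rep)) * noise_sd

# Condition means and average replicate standard deviation per feature.
means = data.mean(axis=2)
sds = data.std(axis=2, ddof=1).mean(axis=1, keepdims=True)

# Variance-sensitive scaling: dividing each centered profile by its
# replicate SD flattens uncertain features toward zero, so they
# contribute less to the cluster structure.
profiles = (means - means.mean(axis=1, keepdims=True)) / sds

# Plain k-means on the scaled profiles as a stand-in for fuzzy c-means.
centroids, labels = kmeans2(profiles, k=5, seed=0, minit="++")
print(np.bincount(labels))
```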
12. Accumulation of histone variant H3.3 with age is associated with profound changes in the histone methylation landscape.
- Author
- Tvardovskiy A, Schwämmle V, Kempf SJ, Rogowska-Wrzesinska A, and Jensen ON
- Subjects
- Animals, Carcinoma, Hepatocellular metabolism, Cells, Cultured, Hepatocytes metabolism, Histone Code, Humans, Liver Neoplasms metabolism, Lysine metabolism, Male, Methylation, Mice, Inbred C57BL, Protein Isoforms metabolism, Protein Processing, Post-Translational, Aging metabolism, Histones metabolism
- Abstract
Deposition of replication-independent histone variant H3.3 into chromatin is essential for many biological processes, including development and reproduction. Unlike replication-dependent H3.1/2 isoforms, H3.3 is expressed throughout the cell cycle and becomes enriched in postmitotic cells with age. However, lifelong dynamics of H3 variant replacement and the impact of this process on chromatin organization remain largely undefined. Using quantitative middle-down proteomics we demonstrate that H3.3 accumulates to near saturation levels in the chromatin of various mouse somatic tissues by late adulthood. Accumulation of H3.3 is associated with profound changes in global levels of both individual and combinatorial H3 methyl modifications. A subset of these modifications exhibit distinct relative abundances on H3 variants and remain stably enriched on H3.3 throughout the lifespan, suggesting a causal relationship between H3 variant replacement and age-dependent changes in H3 methylation. Furthermore, the H3.3 level is drastically reduced in human hepatocarcinoma cells as compared to nontumoral hepatocytes, suggesting the potential utility of the H3.3 relative abundance as a biomarker of abnormal cell proliferation activity. Overall, our study provides the first quantitative characterization of dynamic changes in H3 proteoforms throughout lifespan in mammals and suggests a role for H3 variant replacement in modulating H3 methylation landscape with age., (© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2017
- Full Text
- View/download PDF
13. Tools and data services registry: a community effort to document bioinformatics resources.
- Author
- Ison J, Rapacki K, Ménager H, Kalaš M, Rydza E, Chmura P, Anthon C, Beard N, Berka K, Bolser D, Booth T, Bretaudeau A, Brezovsky J, Casadio R, Cesareni G, Coppens F, Cornell M, Cuccuru G, Davidsen K, Vedova GD, Dogan T, Doppelt-Azeroual O, Emery L, Gasteiger E, Gatter T, Goldberg T, Grosjean M, Grüning B, Helmer-Citterich M, Ienasescu H, Ioannidis V, Jespersen MC, Jimenez R, Juty N, Juvan P, Koch M, Laibe C, Li JW, Licata L, Mareuil F, Mičetić I, Friborg RM, Moretti S, Morris C, Möller S, Nenadic A, Peterson H, Profiti G, Rice P, Romano P, Roncaglia P, Saidi R, Schafferhans A, Schwämmle V, Smith C, Sperotto MM, Stockinger H, Vařeková RS, Tosatto SC, de la Torre V, Uva P, Via A, Yachdav G, Zambelli F, Vriend G, Rost B, Parkinson H, Løngreen P, and Brunak S
- Subjects
- Data Curation, Software, Computational Biology, Registries
- Abstract
Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR, the European infrastructure for biological information, that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools., (© The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2016
- Full Text
- View/download PDF
14. A simple and fast method to determine the parameters for fuzzy c-means cluster analysis.
- Author
- Schwämmle V and Jensen ON
- Subjects
- Oligonucleotide Array Sequence Analysis methods, Pattern Recognition, Automated methods, Proteomics, Cluster Analysis, Fuzzy Logic
- Abstract
Motivation: Fuzzy c-means clustering is widely used to identify cluster structures in high-dimensional datasets, such as those obtained in DNA microarray and quantitative proteomics experiments. One of its main limitations is the lack of a computationally fast method to set optimal values of algorithm parameters. Wrong parameter values may either lead to the inclusion of purely random fluctuations in the results or ignore potentially important data. The optimal solution has parameter values for which the clustering does not yield any results for a purely random dataset but which detects cluster formation with maximum resolution on the edge of randomness., Results: Estimation of the optimal parameter values is achieved by evaluation of the results of the clustering procedure applied to randomized datasets. In this case, the optimal value of the fuzzifier follows common rules that depend only on the main properties of the dataset. Taking the dimension of the set and the number of objects as input values instead of evaluating the entire dataset allows us to propose a functional relationship determining the fuzzifier directly. This result speaks strongly against using a predefined fuzzifier as typically done in many previous studies. Validation indices are generally used for the estimation of the optimal number of clusters. A comparison shows that the minimum distance between the centroids provides results that are at least equivalent or better than those obtained by other computationally more expensive indices.
- Published
- 2010
- Full Text
- View/download PDF
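The record above proposes two practical devices: calibrating the fuzzifier against randomized data, and choosing the cluster number via the minimum distance between centroids. The sketch below implements a minimal fuzzy c-means and the minimum-centroid-distance scan; the decision rule (take the largest c before the minimum distance collapses) is a simplified reading of the index described in the abstract, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

def fuzzy_cmeans(X, c, m=1.5, n_iter=100):
    """Minimal fuzzy c-means; returns centroids and membership matrix."""
    n = X.shape[0]
    U = rng.dirichlet(np.ones(c), size=n)  # random initial memberships
    for _ in range(n_iter):
        W = U ** m
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centroids[None], axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1.0)))  # standard membership update
        U /= U.sum(axis=1, keepdims=True)
    return centroids, U

# Three well-separated Gaussian clusters in 5 dimensions.
X = np.vstack([rng.normal(mu, 0.3, (50, 5)) for mu in (0, 2, 4)])

# Minimum centroid distance as a validation index: once c exceeds the
# true cluster number, at least two centroids land in the same group
# and the minimum pairwise distance drops sharply.
for c in range(2, 7):
    cent, _ = fuzzy_cmeans(X, c)
    dists = [np.linalg.norm(a - b) for i, a in enumerate(cent)
             for b in cent[i + 1:]]
    print(c, round(min(dists), 2))
```

Running this shows the index staying large up to the true number of clusters (three here) and collapsing beyond it, which is exactly the inexpensive behavior the abstract credits the minimum-centroid-distance index with.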