101 results on '"Stein SE"'
Search Results
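2. Die Sicherheit der laparoskopisch assistierten Uterusbiopsie (LAUB) zur intraoperativen Diagnostik der Adenomyosis uteri [The safety of laparoscopically assisted uterine biopsy (LAUB) for the intraoperative diagnosis of adenomyosis uteri]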
- Author
-
Dakkak R, Stein SE, Mechsner S, Herbst H, Halis G, and Ebert AD
- Published
- 2007
3. Molecular Oxygen (O2) Artifacts in Tandem Mass Spectra.
- Author
-
Liang Y, Neta P, Yang X, Garraffo HM, Bukhari TH, Liu Y, and Stein SE
- Abstract
Peak annotation plays an important role in mass spectral evaluation of the NIST 2023 tandem mass spectral library. While most fragment ions are formed by neutral losses, there are peaks that represent adduct ions from these fragments. Previously, we have reported two main types of addition reactions in the collision cell, namely addition of H2O and N2. Here we report a different reaction in the collision cell, with addition of O2 leading to a small peak that could only be assigned to a peroxyl radical ion. For example, some protonated iodoaromatics lose an iodine atom to form a radical cation [M+H-I]+•, which reacts with O2 to generate a peroxyl radical ion [M+H-I+O2]+•. Higher concentrations of O2 result in higher peroxyl radical peaks, which become dominant while the precursor ions are consumed, as examined by five compounds under different concentrations of O2. The correlation of the peroxyl radical peak intensities to the concentration of O2 provides a tool to estimate trace amounts of O2 within the instrument. In the NIST 2023 tandem mass spectral library, the peaks for [M+H-X+O2]+• are most abundant in numbers and in intensity for X = NO2 or I, are much less abundant for X = Br, and are rare for X = Cl. Other leaving groups in this library are SO3H, SO2NH2, CSNH2, CO2C6F5, SO2CH3, and COCH3. The O2 addition reaction is also observed with negative ions in this library. While adducts of H2O and N2 often constitute major peaks, the peaks of the peroxyl radicals under standard conditions are mostly very small and may be mistaken for noise, but their correct annotation improves the quality of the spectra and is important when comparing spectra from different instruments or conditions.
- Published
- 2024
4. Variation of Site-Specific Glycosylation Profiles of Recombinant Influenza Glycoproteins.
- Author
-
Goecker ZC, Burke MC, Remoroza CA, Liu Y, Mirokhin YA, Sheetlin SL, Tchekhovskoi DV, Yang X, and Stein SE
- Subjects
- Glycosylation, Humans, HEK293 Cells, Hemagglutinin Glycoproteins, Influenza Virus metabolism, Hemagglutinin Glycoproteins, Influenza Virus chemistry, Glycoproteins metabolism, Glycopeptides metabolism, Recombinant Proteins metabolism, Polysaccharides metabolism, Neuraminidase metabolism
- Abstract
This work presents a detailed determination of site-specific N-glycan distributions of the recombinant influenza glycoproteins hemagglutinin (HA) and neuraminidase. Variation in glycosylation among recombinant glycoproteins is not predictable and can depend on details of the biomanufacturing process as well as details of protein structure. In this study, recombinant influenza proteins were analyzed from eight strains of four different suppliers. These include five HA and three neuraminidase proteins, each produced from a HEK293 cell line. Digestion was conducted using a series of complex multienzymatic methods designed to isolate glycopeptides containing single N-glycosylated sites. Site-specific glycosylation profiles of intact glycopeptides were produced using a recently developed method and comparisons were made using spectral similarity scores. Variation in glycan abundances and distribution was most pronounced between different strains of virus (similarity score = 383 out of 999), whereas digestion replicates and injection replicates showed relatively little variation (similarity score = 957). Notably, glycan distributions for homologous regions of influenza glycoprotein variants showed low variability. Due to the multiple possible sources of variation and inherent analytical difficulties in site-specific glycan determinations, variations were individually examined for multiple factors, including differences in supplier, production batch, protease digestion, and replicate measurement. After comparing all glycosylation distributions, four distinguishable classes could be identified for the majority of sites. Finally, attempts to identify glycosylation distributions on adjacent potential N-glycosylated sites of one HA variant were made. Only the second site (NnST) was found to be occupied using two rarely used proteases in proteomics, subtilisin and esperase, both of which did selectively cleave these adjacent sites. (Published by Elsevier Inc.)
- Published
- 2024
5. An XIC-Centric Strategy for Improved Identification and Quantification in Proteomic Data Analyses.
- Author
-
Wang G, Zhang Z, Liu Y, Burke MC, Sheetlin SL, and Stein SE
- Subjects
- Humans, Chromatography, Liquid methods, Reproducibility of Results, Glycoproteins analysis, Glycoproteins chemistry, Glycopeptides analysis, Glycopeptides chemistry, Data Analysis, Mass Spectrometry methods, Tandem Mass Spectrometry methods, Proteomics methods, Hair chemistry
- Abstract
Reproducibility is a "proteomic dream" yet to be fully realized. A typical data analysis workflow utilizing extracted ion chromatograms (XICs) often treats the information path from identification to quantification as a one-way street. Here, we propose an XIC-centric approach in which the data flow is bidirectional: identifications are used to derive XICs whose information is in turn applied to validate the identifications. In this study, we employed liquid chromatography-mass spectrometry data from glycoprotein and human hair samples to illustrate the XIC-centric concept. At the core of this approach was XIC-based monoisotope repicking. Taking advantage of the intensity information for all detected isotopes across the whole range of an XIC peak significantly improved the accuracy and uncovered misidentifications originating from monoisotope assignment mistakes. It could also rescue non-top-ranked glycopeptide hits. Identification of glycopeptides is particularly susceptible to precursor mass errors for their low abundances, large masses, and glycans differing by 1 or 2 Da easily confused as isotopes. In addition, the XIC-centric strategy significantly reduced the problem of one XIC peak associated with multiple unique identifications, a source of quantitative irreproducibility. Taken together, the proposed approach can lead to improved identification and quantification accuracy and, ultimately, enhanced reproducibility in proteomic data analyses.
- Published
- 2024
6. Comparison of N-Glycopeptide to Released N-Glycan Abundances and the Influence of Glycopeptide Mass and Charge States on N-Linked Glycosylation of IgG Antibodies.
- Author
-
Remoroza CA, Burke MC, Mak TD, Sheetlin SL, Mirokhin YA, Cooper BT, Goecker ZC, Lowenthal MS, Yang X, Wang G, Tchekhovskoi DV, and Stein SE
- Subjects
- Humans, Glycosylation, Glycopeptides analysis, Polysaccharides chemistry, Ions, Immunoglobulin G, Tandem Mass Spectrometry
- Abstract
We report the comparison of mass-spectral-based abundances of tryptic glycopeptides to fluorescence abundances of released labeled glycans and the effects of mass and charge state and in-source fragmentation on glycopeptide abundances. The primary glycoforms derived from Rituximab, NISTmAb, Evolocumab, and Infliximab were high-mannose and biantennary complex galactosylated and fucosylated N-glycans. Except for Evolocumab, in-source ions derived from the loss of HexNAc or HexNAc-Hex sugars are prominent for other therapeutic IgGs. After excluding in-source fragmentation of glycopeptide ions from the results, a linear correlation was observed between fluorescently labeled N-glycan and glycopeptide abundances over a dynamic range of 500. Different charge states of human IgG-derived glycopeptides containing a wider variety of abundant attached glycans were also investigated to examine the effects of the charge state on ion abundances. These revealed a linear dependence of glycopeptide abundance on the mass of the glycan with higher charge states favoring higher-mass glycans. Findings indicate that the mass spectrometry-based bottom-up approach can provide results as accurate as those of glycan release studies while revealing the origin of each attached glycan. These site-specific relative abundances are conveniently displayed and compared using previously described glycopeptide abundance distribution spectra "GADS" representations. Mass spectrometry data are available from the MassIVE repository (MSV000093562).
- Published
- 2024
7. AIRI: Predicting Retention Indices and Their Uncertainties Using Artificial Intelligence.
- Author
-
Geer LY, Stein SE, Mallard WG, and Slotta DJ
- Subjects
- Uncertainty, Artificial Intelligence, Neural Networks, Computer
- Abstract
The Kováts retention index (RI) is a quantity measured using gas chromatography and is commonly used in the identification of chemical structures. Creating libraries of observed RI values is a laborious task, so we explore the use of a deep neural network for predicting RI values from structure for standard semipolar columns. This network generated predictions with a mean absolute error of 15.1 and, in a quantification of the tail of the error distribution, a 95th percentile absolute error of 46.5. Because of the Artificial Intelligence Retention Indices (AIRI) network's accuracy, it was used to predict RI values for the NIST EI-MS spectral libraries. These RI values are used to improve chemical identification methods and the quality of the library. Estimating uncertainty is an important practical need when using prediction models. To quantify the uncertainty of our network for each individual prediction, we used the outputs of an ensemble of 8 networks to calculate a predicted standard deviation for each RI value prediction. This predicted standard deviation was corrected to follow the error between the observed and predicted RI values. The Z scores using these predicted standard deviations had a standard deviation of 1.52 and a 95th percentile absolute Z score corresponding to a mean RI value of 42.6.
- Published
- 2024
8. Improved Sample Preparation Method for Protein and Peptide Identification from Human Hair.
- Author
-
Zhang Z, Wallace WE, Wang G, Burke MC, Liu Y, Sheetlin SL, and Stein SE
- Subjects
- Humans, Electrophoresis, Hair chemistry, Hair metabolism, Proteins analysis, Peptides analysis
- Abstract
A fast and sensitive direct extraction (DE) method developed in our group can efficiently extract proteins in 30 min from a 5 cm-long hair strand. Previously, we coupled DE to downstream analysis using gel electrophoresis followed by in-gel digestion, which can be time-consuming. In searching for a better alternative, we found that a combination of DE with a bead-based method (SP3) can lead to significant improvements in protein discovery in human hair. Since SP3 is designed for general applications, we optimized it to process hair proteins following DE and compared it to several other in-solution digestion methods. Of particular concern are genetically variant peptides (GVPs), which can be used for human identification in forensic analysis. Here, we demonstrated improved GVP discovery with the DE and SP3 workflow, which was 3 times faster than the previous in-gel digestion method and required significantly less instrument time depending on the number of gel slices processed. Additionally, it led to an increased number of identified proteins and GVPs. Among the tested in-solution digestion methods, DE combined with SP3 showed the highest sequence coverage, with higher abundances of the identified peptides. This provides a significantly enhanced means for identifying proteins and GVPs in human hair.
- Published
- 2024
9. Determining Site-Specific Glycan Profiles of Recombinant SARS-CoV-2 Spike Proteins from Multiple Sources.
- Author
-
Burke MC, Liu Y, Remoroza C, Mirokhin YA, Sheetlin SL, Tchekhovskoi DV, Wang G, Yang X, and Stein SE
- Abstract
Glycopeptide Abundance Distribution Spectra (GADS) were recently introduced as a means of representing, storing, and comparing glycan profiles of intact glycopeptides. Here, using that representation, an extensive analysis is made of multiple commercial sources of the recombinant SARS-CoV-2 spike protein, each containing 22 N-linked glycan sites (sequons). Multiple proteases are used along with variable energy fragmentation followed by ion trap confirmation. This enables a detailed examination of the reproducibility of the method across multiple types of variability. These results show that GADS are consistent between replicates and laboratories for sufficiently abundant glycopeptides. Derived GADS enable the examination and comparison of the glycan profiles between commercial sources of the spike protein. Multiple distinct glycopeptide distributions, generated by multiple proteases, confirm these profiles. Comparisons of GADS derived from 11 sources of recombinant spike protein reveal that sources for which protein expression methods were the same produced near-identical glycan profiles, thereby demonstrating the ability of this method to measure GADS of sufficient reliability to distinguish different glycoform distributions between commercial vendors and potentially to reliably determine and compare differences in glycosylation for any glycoprotein under different conditions of production. All mass spectrometry data files have been deposited in the MassIVE repository under the identifier MSV000091776.
- Published
- 2023
10. Inferring the Nominal Molecular Mass of an Analyte from Its Electron Ionization Mass Spectrum.
- Author
-
Moorthy AS, Kearsley AJ, Mallard WG, Wallace WE, and Stein SE
- Abstract
The performance of three algorithms for predicting nominal molecular mass from an analyte's electron ionization mass spectrum is presented. The Peak Interpretation Method (PIM) attempts to quantify the likelihood that a molecular ion peak is contained in the mass spectrum, whereas the Simple Search Hitlist Method (SS-HM) and iterative Hybrid Search Hitlist Method (iHS-HM) leverage results from mass spectral library searching. These predictions can be employed in combination (recommended) or independently. The methods were tested on two sets of query mass spectra searched against libraries that did not contain the reference mass spectra of the same compounds: 19,074 spectra of various organic molecules searched against the NIST17 mass spectral library and 162 spectra of small molecule drugs searched against SWGDRUG version 3.3. Individually, each molecular mass prediction method had computed precisions (the fraction of positive predictions that were correct) of 91, 89, and 74%, respectively. The methods become more valuable when predictions are taken together. When all three predictions were identical, which occurred in 33% of the test cases, the predicted molecular mass was almost always correct (>99%).
- Published
- 2023
11. AIomics: Exploring More of the Proteome Using Mass Spectral Libraries Extended by Artificial Intelligence.
- Author
-
Geer LY, Lapin J, Slotta DJ, Mak TD, and Stein SE
- Subjects
- Artificial Intelligence, Tandem Mass Spectrometry, Algorithms, Phosphopeptides, Databases, Protein, Software, Proteome metabolism, Peptide Library
- Abstract
The unbounded permutations of biological molecules, including proteins and their constituent peptides, present a dilemma in identifying the components of complex biosamples. Sequence search algorithms used to identify peptide spectra can be expanded to cover larger classes of molecules, including more modifications, isoforms, and atypical cleavage, but at the cost of false positives or false negatives due to the simplified spectra they compute from sequence records. Spectral library searching can help solve this issue by precisely matching experimental spectra to library spectra with excellent sensitivity and specificity. However, compiling spectral libraries that span entire proteomes is pragmatically difficult. Neural networks that predict complete spectra containing a full range of annotated and unannotated ions can be used to replace these simplified spectra with libraries of fully predicted spectra, including modified peptides. Using such a network, we created predicted spectral libraries that were used to rescore matches from a sequence search done over a large search space, including a large number of modifications. Rescoring improved the separation of true and false hits by 82%, yielding an 8% increase in peptide identifications, including a 21% increase in nonspecifically cleaved peptides and a 17% increase in phosphopeptides.
- Published
- 2023
12. Unexpected Gas-Phase Nitrogen-Oxygen Smiles Rearrangement: Collision-Induced Dissociation of Deprotonated 2-(N-Methylanilino)ethanol and Morpholinylbenzoic Acid Derivatives.
- Author
-
Liang Y, Simón-Manso Y, Neta P, and Stein SE
- Abstract
A nitrogen-oxygen Smiles rearrangement was reported to occur after collisional activation of the PhN(R)CH2CH2O⁻ (R = alkyl) anion, which undergoes a five-membered ring rearrangement to form a phenoxide ion C6H5O⁻. When R = H, such a Smiles rearrangement is unlikely since the negative charge is more favorably located on the nitrogen atom than the oxygen atom; hence, alternative neutral losses dominate the fragmentation. For example, collisional activation of deprotonated 2-anilinoethanol (PhN⁻CH2CH2OH) leads to the formation of an anilide anion (C6H5NH⁻, m/z 92) rather than a phenoxide ion (C6H5O⁻, m/z 93.0343). However, when the amino hydrogen of 2-anilinoethanol is substituted by a methyl group, i.e., 2-(N-methylanilino)ethanol, a Smiles rearrangement does occur, leading to the phenoxide ion, as the negative charge can only reside on the oxygen atom. To confirm the Smiles rearrangement mechanism, 2-(N-methylanilino)ethanol-¹⁸O was synthesized and subjected to collisional activation, leading to an intense peak at m/z 95.0385, which corresponds to the ¹⁸O phenoxide ion ([C6H5¹⁸O]⁻). The abundance of the phenoxide ion is sensitive to substituents on the N atom, as demonstrated by the observation that an ethyl substituent results in the rearrangement ion with a much lower abundance. The nitrogen-oxygen Smiles rearrangement also occurs for various morpholinylbenzoic acid derivatives with a multistep mechanism, where the phenoxide ion is found to be predominantly formed after loss of CO2, proton transfers, breaking of the morpholine ring, and Smiles rearrangement. The Smiles mechanism is also supported by density functional theory calculations and other observations.
- Published
- 2022
13. Mass Spectral Library Methods for Analysis of Site-Specific N-Glycosylation: Application to Human Milk Proteins.
- Author
-
Remoroza CA, Burke MC, Yang X, Sheetlin S, Mirokhin Y, Markey SP, Tchekhovskoi DV, and Stein SE
- Subjects
- Glycopeptides analysis, Glycoproteins metabolism, Glycosylation, Humans, Lactoferrin metabolism, Tenascin metabolism, Milk Proteins metabolism, Milk, Human chemistry
- Abstract
We present a mass spectral library-based method for analyzing site-specific N-linked protein glycosylation. Its operation and utility are illustrated by applying it to both newly measured and available proteomics data of human milk glycoproteins. It generates two varieties of mass spectral libraries. One contains glycopeptide abundance distribution spectra (GADS). The other contains tandem mass spectra of the underlying glycopeptides. Both originate from identified glycopeptides in proteolytic digests of human milk and purified glycoproteins, which include tenascin, lactoferrin, and several antibodies. Analysis was also applied to digests of a NIST human milk standard reference material (SRM), leading to a GADS library of N-glycopeptides, enabling the direct comparison of glycopeptide distributions for individual proteins. Tandem spectra underlying each glycopeptide GADS peak are combined to create a second type of library that contains spectra of the underlying glycopeptide spectra. These were acquired by higher-energy (stepped) collision dissociation fragmentation followed by ion-trap fragmentation. Spectra are annotated using MS_Piano, recently reported annotation software. This data, with extensions of a widely used spectral library search and display software, provides accessible mass spectral libraries.
- Published
- 2022
14. Representing and Comparing Site-Specific Glycan Abundance Distributions of Glycoproteins.
- Author
-
Remoroza CA, Burke MC, Liu Y, Mirokhin YA, Tchekhovskoi DV, Yang X, and Stein SE
- Subjects
- Glycopeptides metabolism, Glycoproteins, Glycosylation, Humans, Polysaccharides, Reproducibility of Results, SARS-CoV-2, Spike Glycoprotein, Coronavirus, COVID-19
- Abstract
A method for representing and comparing distributions of N-linked glycans located at specific sites on proteins is presented. The representation takes the form of a simple mass spectrum for a given peptide sequence, with each peak corresponding to a different glycopeptide. The mass (in place of m/z) of each peak is that of the glycan mass, and its abundance corresponds to its relative abundance in the electrospray MS1 spectrum. This provides a facile means of representing all identifiable glycopeptides arising from a single protein "sequon" on a specific sequence, thereby enabling the comparison and searching of these distributions as routinely done for mass spectra. Likewise, these reference glycopeptide abundance distribution spectra (GADS) can be stored in searchable libraries. A set of such libraries created from available data is provided along with an adapted version of the widely used NIST-MS library-search software. Since GADS contain only MS1 abundances and identifications, they are equally suitable for expressing collision-induced fragmentation and electron-transfer dissociation determinations of glycopeptide identity. Comparisons of GADS for N-glycosylated sites on several proteins, especially the SARS-CoV-2 spike protein, demonstrate the potential reproducibility of GADS and their utility for comparing site-specific distributions.
- Published
- 2021
15. MS_Piano: A Software Tool for Annotating Peaks in CID Tandem Mass Spectra of Peptides and N-Glycopeptides.
- Author
-
Yang X, Neta P, Mirokhin YA, Tchekhovskoi DV, Remoroza CA, Burke MC, Liang Y, Markey SP, and Stein SE
- Subjects
- Reproducibility of Results, Software, Tandem Mass Spectrometry, Glycopeptides, Proteomics
- Abstract
Annotating product ion peaks in tandem mass spectra is essential for evaluating spectral quality and validating peptide identification. This task is more complex for glycopeptides and is crucial for the confident determination of glycosylation sites in glycoproteins. MS_Piano (Mass Spectrum Peptide Annotation) software was developed for reliable annotation of peaks in collision induced dissociation (CID) tandem mass spectra of peptides or N-glycopeptides for given peptide sequences, charge states, and optional modifications. The program annotates each peak in high or low resolution spectra with possible product ion(s) and the mass difference between the measured and theoretical m/z values. Spectral quality is measured by two major parameters: the ratio between the sum of unannotated vs all peak intensities in the top 20 peaks, and the intensity of the highest unannotated peak. The product ions of peptides, glycans, and glycopeptides in spectra are labeled in different class-type colors to facilitate interpretation. MS_Piano assists validating peptide and N-glycopeptide identification from database and library searches and provides quality control and optimizes search reliability in custom developed peptide mass spectral libraries. The software is freely available in .exe and .dll formats for the Windows operating system.
- Published
- 2021
16. Creation and filtering of a recurrent spectral library of CHO cell metabolites and media components.
- Author
-
Telu KH, Marupaka R, Andriamaharavo NR, Simón-Manso Y, Liang Y, Mirokhin YA, Bukhari TH, Preston RJ, Kashi L, Kelman Z, and Stein SE
- Subjects
- Animals, CHO Cells, Cricetulus, Culture Media chemistry, Metabolome, Metabolomics, Small Molecule Libraries
- Abstract
This paper reports the first implementation of a new type of mass spectral library for the analysis of Chinese hamster ovary (CHO) cell metabolites that allows users to quickly identify most compounds in any complex metabolite sample. We also describe an annotation methodology developed to filter out artifacts and low-quality spectra from recurrent unidentified spectra of metabolites. CHO cells are commonly used to produce biological therapeutics. Metabolic profiles of CHO cells and media can be used to monitor process variability and look for markers that discriminate between batches of product. We have created a comprehensive library of both identified and unidentified metabolites derived from CHO cells that can be used in conjunction with tandem mass spectrometry to identify metabolites. In addition, we present a workflow that can be used for assigning confidence to a NIST MS/MS Library search match based on prior probability of general utility. The goal of our work is to annotate and identify (when possible) all liquid chromatography-mass spectrometry generated metabolite ions as well as create automatable library building and identification pipelines for use by others in the field. (© 2021 The Authors. Biotechnology and Bioengineering Published by Wiley Periodicals LLC. This article has been contributed to by US Government employees and their work is in the public domain in the USA.)
- Published
- 2021
17. Comprehensive Analysis of Tryptic Peptides Arising from Disulfide Linkages in NISTmAb and Their Use for Developing a Mass Spectral Library.
- Author
-
Dong Q, Yan X, Liang Y, Markey SP, Sheetlin SL, Remoroza CA, Wallace WE, and Stein SE
- Subjects
- Amino Acid Sequence, Chromatography, Liquid, Disulfides, Humans, Peptides, Tandem Mass Spectrometry
- Abstract
This work presents methods for identifying and then creating a mass spectral library for disulfide-linked peptides originating from the NISTmAb, a reference material of the humanized IgG1k monoclonal antibody (RM 8671). Analyses involved both partially reduced and non-reduced samples under neutral and weakly basic conditions followed by nanoflow liquid chromatography tandem mass spectrometry (LC-MS/MS). Spectra of peptides containing disulfide bonds are identified by both MS1 ion and MS2 fragment ion data in order to completely map all the disulfide linkages in the NISTmAb. This led to the detection of 383 distinct disulfide-linked peptide ions, arising from fully tryptic cleavage, missed cleavage, irregular cleavage, complex Met/Trp oxidation mixtures, and metal adducts. Fragmentation features of disulfide bonds under low-energy collision dissociation were examined. These include (1) peptide bond cleavage leaving disulfide bonds intact; (2) disulfide bond cleavage, often leading to extensive fragmentation; and (3) double cleavage products resulting from breakages of two peptide bonds or both peptide and disulfide bonds. Automated annotation of various complex MS/MS fragments enabled the identification of disulfide-linked peptides with high confidence. Peptides containing each of the nine native disulfide bonds were identified along with 86 additional disulfide linkages arising from disulfide bond shuffling. The presence of shuffled disulfides was nearly completely abrogated by refining digest conditions. A curated spectral library of 702 disulfide-linked peptide spectra was created from this analysis and is publicly available for free download. Since all IgG1 antibodies have the same constant regions, the resulting library can be used as a tool for facile identification of "hard-to-find" disulfide-bonded peptides. Moreover, we show that one may identify such peptides originating from IgG1 proteins in human serum, thereby serving as a means of monitoring the completeness of protein reduction in proteomics studies. Data are available via ProteomeXchange with identifier PXD023358.
- Published
- 2021
18. CID Fragmentation of Deprotonated N-Acyl Aromatic Sulfonamides. Smiles-Type and Nitrogen-Oxygen Rearrangements.
- Author
-
Liang Y, Simón-Manso Y, Neta P, Yang X, and Stein SE
- Abstract
The NIST tandem mass spectral library (2020 version) includes over 800 aromatic sulfonamides. In negative mode, upon collisional activation most benzenesulfonamides lose a neutral SO2 molecule leading to an anilide anion (C6H5NH⁻, m/z 92). However, for deprotonated N-benzoyl aromatic sulfonamides, the phenoxide ion (C6H5O⁻, m/z 93.0343) is the principal product ion. A variety of N-acylbenzenesulfonamide derivatives were also found to overwhelmingly produce the phenoxide ion as the most intense product ion. A mechanism is proposed in which, at low energy, a carbonyl oxygen atom (C═O) is transferred to a benzene ring, known as a Smiles-type rearrangement (the amide oxygen atom attacks the arylsulfonyl group at the ipso position); in parallel, and determining the reaction at high energy, a nitrogen-oxygen rearrangement mechanism leads to the formation of the phenoxide ion. Tandem mass spectra of deprotonated N-benzoyl-¹⁸O-benzenesulfonamide and N-thiobenzoyl-p-toluenesulfonamide confirmed the rearrangement, since base peaks were observed at m/z 95.0384 and 123.0270, which correspond to an ¹⁸O phenoxide ion ([C6H5¹⁸O]⁻) and a 4-methylbenzenethiolate anion ([CH3C6H4S]⁻), respectively. The parallel mechanism is supported by the strong correlation between the observed product ion intensities and the corresponding activation energies obtained by Density Functional Theory calculations. This is an example of a relatively simple ion with a complex path to fragmentation, being a cautionary tale for indiscriminate use of in silico spectra in place of actual measurement.
- Published
- 2021
19. Increasing the Coverage of a Mass Spectral Library of Milk Oligosaccharides Using a Hybrid-Search-Based Bootstrapping Method and Milks from a Wide Variety of Mammals.
- Author
-
Remoroza CA, Liang Y, Mak TD, Mirokhin Y, Sheetlin SL, Yang X, San Andres JV, Power ML, and Stein SE
- Subjects
- Animals, Humans, Species Specificity, Tandem Mass Spectrometry, Mammals, Milk chemistry, Milk, Human chemistry, Oligosaccharides chemistry, Small Molecule Libraries
- Abstract
This study significantly expands both the scope and method of identification for construction of a previously reported tandem mass spectral library of 74 human milk oligosaccharides (HMOs) derived from results of combined LC-MS/MS experiments and comprehensive structural analysis of HMOs. In the present work, a hybrid search "bootstrap" identification method was employed that substantially broadens the coverage of milk oligosaccharides and thereby increases the utility of a spectrum library-based method for the rapid tentative identification of all distinguishable glycans in milk. This involved hybrid searching of the previous library, which was itself constructed using the hybrid search of oligosaccharide spectra in the NIST 17 Tandem MS Library. The general approach appears applicable to library construction of other classes of compounds. The coverage of oligosaccharides was significantly extended using milks from a variety of mammals, including bovine, Asian buffalo, African lion, and goat. This new method led to the identification of another 145 oligosaccharides, including an additional 80 HMOs from reanalysis of human milk. The newly identified compounds were added to a freely available mass spectral reference database of 219 milk oligosaccharides. We also provide suggestions to overcome several limitations and pitfalls in the interpretation of spectra of unknown oligosaccharides.
- Published
- 2020
20. Mass Spectral Library of Acylcarnitines Derived from Human Urine.
- Author
-
Yan X, Markey SP, Marupaka R, Dong Q, Cooper BT, Mirokhin YA, Wallace WE, and Stein SE
- Subjects
- Carnitine chemistry, Carnitine metabolism, Carnitine urine, Chromatography, Liquid, Humans, Molecular Structure, Tandem Mass Spectrometry, Carnitine analogs & derivatives
- Abstract
We describe the creation of a mass spectral library of acylcarnitines and conjugated acylcarnitines from the LC-MS/MS analysis of six NIST urine reference materials. To recognize acylcarnitines, we conducted in-depth analyses of fragmentation patterns of acylcarnitines and developed a set of rules, derived from spectra in the NIST17 Tandem MS Library and those identified in urine, using the newly developed hybrid search method. Acylcarnitine tandem spectra were annotated with fragments from carnitine and acyl moieties as well as neutral loss peaks from precursors. Consensus spectra were derived from spectra having similar retention time, fragmentation pattern, and the same precursor m/z and collision energy. The library contains 157 different precursor masses, 586 unique acylcarnitines, and 4332 acylcarnitine consensus spectra. Furthermore, from spectra that partially satisfied the fragmentation rules of acylcarnitines, we identified 125 conjugated acylcarnitines represented by 987 consensus spectra, which appear to originate from Phase II biotransformation reactions. To our knowledge, this is the first report of conjugated acylcarnitines. The mass spectra provided by this work may be useful for clinical screening of acylcarnitines as well as for studying relationships among fragmentation patterns, collision energies, structures, and retention times of acylcarnitines. Further, these methods are extensible to other classes of metabolites.
- Published
- 2020
21. Disparate Metabolomics Data Reassembler: A Novel Algorithm for Agglomerating Incongruent LC-MS Metabolomics Datasets.
- Author
-
Mak TD, Goudarzi M, Laiakis EC, and Stein SE
- Subjects
- Animals, Chromatography, Liquid, Databases, Factual, Humans, Mass Spectrometry, Mice, Algorithms, Metabolomics
- Abstract
In the past decade, the field of LC-MS-based metabolomics has transformed from an obscure specialty into a major "-omics" platform for studying metabolic processes and biomolecular characterization. However, as a whole the field is still very fractured, as the nature of the instrumentation and the information produced by the platform essentially creates incompatible "islands" of datasets. This lack of data coherency results in the inability to accumulate a critical mass of metabolomics data that has enabled other -omics platforms to make impactful discoveries and meaningful advances. As such, we have developed a novel algorithm, called Disparate Metabolomics Data Reassembler (DIMEDR), which attempts to bridge the inconsistencies between incongruent LC-MS metabolomics datasets of the same biological sample type. A single "primary" dataset is postprocessed via traditional means of peak identification, alignment, and grouping. DIMEDR utilizes this primary dataset as a progenitor template by which data from subsequent disparate datasets are reassembled and integrated into a unified framework that maximizes spectral feature similarity across all samples. This is accomplished by a novel procedure for universal retention time correction and comparison via identification of ubiquitous features in the initial primary dataset, which are subsequently utilized as endogenous internal standards during integration. For demonstration purposes, two human and two mouse urine metabolomics datasets from four unrelated studies acquired over 4 years were unified via DIMEDR, which enabled meaningful analysis across otherwise incomparable and unrelated datasets.
- Published
- 2020
22. Sensitive Method for the Confident Identification of Genetically Variant Peptides in Human Hair Keratin.
- Author
-
Zhang Z, Burke MC, Wallace WE, Liang Y, Sheetlin SL, Mirokhin YA, Tchekhovskoi DV, and Stein SE
- Subjects
- Artifacts, Chromatography, Liquid, Databases, Protein, Forensic Medicine, Humans, Male, Mass Spectrometry, Proteomics, Reproducibility of Results, Hair metabolism, Keratins, Hair-Specific metabolism, Peptides metabolism, Proteome metabolism
- Abstract
Recent reports have demonstrated that genetically variant peptides (GVPs) derived from human hair shaft proteins can be used to differentiate individuals of different biogeographic origins. We report a method involving direct extraction of hair shaft proteins that is more sensitive for GVP detection than previously published methods. It involves one step for protein extraction and was found to provide reproducible results. A detailed proteomic analysis of this data is presented that led to the following four results: (i) a peptide spectral library was created and made available for download. It contains all identified peptides from this work, including GVPs that, when appropriately expanded with diverse hair-derived peptides, can provide a routine, reliable, and sensitive means of analyzing hair digests; (ii) an analysis of artifact peptides arising from side reactions is also made using a new method for finding unexpected modifications; (iii) detailed analysis of the gel-based method employed clearly shows the high degree of cross-linking or protein association involved in hair digestion, with major GVPs eluting over a wide range of high molecular weights while others apparently arise from distinct non-cross-linked proteins; and (iv) finally, we show that some of the specific GVP identifications depend on the sample preparation method. (Published 2019. This article is a U.S. Government work and is in the public domain in the USA. Journal of Forensic Sciences.)
- Published
- 2020
23. Correction to Mass Spectrometry Fingerprints of Small-Molecule Metabolites in Biofluids: Building a Spectral Library of Recurrent Spectra for Urine Analysis.
- Author
-
Simón-Manso Y, Marupaka R, Yan X, Liang Y, Telu KH, Mirokhin Y, and Stein SE
- Published
- 2020
24. NIST Interlaboratory Study on Glycosylation Analysis of Monoclonal Antibodies: Comparison of Results from Diverse Analytical Methods.
- Author
-
De Leoz MLA, Duewer DL, Fung A, Liu L, Yau HK, Potter O, Staples GO, Furuki K, Frenkel R, Hu Y, Sosic Z, Zhang P, Altmann F, Grunwald-Grube C, Shao C, Zaia J, Evers W, Pengelley S, Suckau D, Wiechmann A, Resemann A, Jabs W, Beck A, Froehlich JW, Huang C, Li Y, Liu Y, Sun S, Wang Y, Seo Y, An HJ, Reichardt NC, Ruiz JE, Archer-Hartmann S, Azadi P, Bell L, Lakos Z, An Y, Cipollo JF, Pucic-Bakovic M, Štambuk J, Lauc G, Li X, Wang PG, Bock A, Hennig R, Rapp E, Creskey M, Cyr TD, Nakano M, Sugiyama T, Leung PA, Link-Lenczowski P, Jaworek J, Yang S, Zhang H, Kelly T, Klapoetke S, Cao R, Kim JY, Lee HK, Lee JY, Yoo JS, Kim SR, Suh SK, de Haan N, Falck D, Lageveen-Kammeijer GSM, Wuhrer M, Emery RJ, Kozak RP, Liew LP, Royle L, Urbanowicz PA, Packer NH, Song X, Everest-Dass A, Lattová E, Cajic S, Alagesan K, Kolarich D, Kasali T, Lindo V, Chen Y, Goswami K, Gau B, Amunugama R, Jones R, Stroop CJM, Kato K, Yagi H, Kondo S, Yuen CT, Harazono A, Shi X, Magnelli PE, Kasper BT, Mahal L, Harvey DJ, O'Flaherty R, Rudd PM, Saldova R, Hecht ES, Muddiman DC, Kang J, Bhoskar P, Menard D, Saati A, Merle C, Mast S, Tep S, Truong J, Nishikaze T, Sekiya S, Shafer A, Funaoka S, Toyoda M, de Vreugd P, Caron C, Pradhan P, Tan NC, Mechref Y, Patil S, Rohrer JS, Chakrabarti R, Dadke D, Lahori M, Zou C, Cairo C, Reiz B, Whittal RM, Lebrilla CB, Wu L, Guttman A, Szigeti M, Kremkow BG, Lee KH, Sihlbom C, Adamczyk B, Jin C, Karlsson NG, Örnros J, Larson G, Nilsson J, Meyer B, Wiegandt A, Komatsu E, Perreault H, Bodnar ED, Said N, Francois YN, Leize-Wagner E, Maier S, Zeck A, Heck AJR, Yang Y, Haselberg R, Yu YQ, Alley W, Leone JW, Yuan H, and Stein SE
- Subjects
- Antibodies, Monoclonal metabolism, Glycomics methods, Glycopeptides metabolism, Glycosylation, Humans, Laboratories, Polysaccharides metabolism, Protein Processing, Post-Translational, Proteomics methods, Antibodies, Monoclonal chemistry, Biological Products, Biopharmaceutics methods
- Abstract
Glycosylation is a topic of intense current interest in the development of biopharmaceuticals because it is related to drug safety and efficacy. This work describes results of an interlaboratory study on the glycosylation of the Primary Sample (PS) of NISTmAb, a monoclonal antibody reference material. Seventy-six laboratories from industry, university, research, government, and hospital sectors in Europe, North America, Asia, and Australia submitted a total of 103 reports on glycan distributions. The principal objective of this study was to report and compare results for the full range of analytical methods presently used in the glycosylation analysis of mAbs. Therefore, participation was unrestricted, with laboratories choosing their own measurement techniques. Protein glycosylation was determined in various ways, including at the level of intact mAb, protein fragments, glycopeptides, or released glycans, using a wide variety of methods for derivatization, separation, identification, and quantification. Consequently, the diversity of results was enormous, with the number of glycan compositions identified by each laboratory ranging from 4 to 48. In total, 116 glycan compositions were reported, of which 57 compositions could be assigned consensus abundance values. These consensus medians provide community-derived values for NISTmAb PS. Agreement with the consensus medians did not depend on the specific method or laboratory type. The study provides a view of the current state-of-the-art for biologic glycosylation measurement and suggests a clear need for harmonization of glycosylation analysis methods. (© 2020 De Leoz et al.)
- Published
- 2020
25. Hybrid Search: A Method for Identifying Metabolites Absent from Tandem Mass Spectrometry Libraries.
- Author
-
Cooper BT, Yan X, Simón-Manso Y, Tchekhovskoi DV, Mirokhin YA, and Stein SE
- Subjects
- Peptide Fragments chemistry, Proteomics methods, Databases, Factual, Metabolomics methods, Tandem Mass Spectrometry
- Abstract
Metabolomics has a critical need for better tools for mass spectral identification. Common metabolites may be identified by searching libraries of tandem mass spectra, which offers important advantages over other approaches to identification. But tandem libraries are not nearly complete enough to represent the full molecular diversity present in complex biological samples. We present a novel hybrid search method that can help identify metabolites not in the library by similarity to compounds that are. We call it "hybrid" searching because it combines conventional, direct peak matching with the logical equivalent of neutral-loss matching. A successful hybrid search requires the library to contain "cognates" of the unknown: similar compounds with a structural difference confined to a single region of the molecule, that does not substantially alter its fragmentation behavior. We demonstrate that the hybrid search is highly likely to find similar compounds under such circumstances.
- Published
- 2019
26. Mass Spectrometry Fingerprints of Small-Molecule Metabolites in Biofluids: Building a Spectral Library of Recurrent Spectra for Urine Analysis.
- Author
-
Simón-Manso Y, Marupaka R, Yan X, Liang Y, Telu KH, Mirokhin Y, and Stein SE
- Subjects
- Carnitine urine, Chromatography, Liquid, Humans, Mass Spectrometry, Software, Body Fluids chemistry, Carnitine analogs & derivatives, Neoplasms urine, Small Molecule Libraries analysis
- Abstract
A large fraction of ions observed in electrospray liquid chromatography-mass spectrometry (LC-ESI-MS) experiments of biological samples remain unidentified. One of the main reasons for this is that spectral libraries of pure compounds fail to account for the complexity of the metabolite profiling of complex materials. Recently, the NIST Mass Spectrometry Data Center has been developing a novel type of searchable mass spectral library that includes all recurrent unidentified spectra found in the sample profile. These libraries, in conjunction with the NIST tandem mass spectral library, allow analysts to explore most of the chemical space accessible to LC-MS analysis. In this work, we demonstrate how these libraries can provide a reliable fingerprint of the material by applying them to a variety of urine samples, including an extremely altered urine from cancer patients undergoing total body irradiation. The same workflow is applicable to any other biological fluid. The selected class of acylcarnitines is examined in detail, and derived libraries and related software are freely available. They are intended to serve as online resources for continuing community review and improvement.
- Published
- 2019
27. False Discovery Rate Estimation for Hybrid Mass Spectral Library Search Identifications in Bottom-up Proteomics.
- Author
-
Burke MC, Zhang Z, Mirokhin YA, Tchekhovskoi DV, Liang Y, and Stein SE
- Subjects
- Algorithms, Peptide Library, Peptides classification, Software, Tandem Mass Spectrometry, Databases, Protein, Peptides genetics, Proteomics methods
- Abstract
We present a method for FDR estimation of mass spectral library search identifications made by a recently developed method for peptide identification, the hybrid search, based on an extension of the target-decoy approach. In addition to estimating confidence for a given identification, this allows users to compare and integrate identifications from the hybrid mass spectral library search method with other peptide identification methods, such as a sequence database-based method. In addition to a score, each hybrid identification is associated with a "DeltaMass" value, which is the difference in mass of the search and library peptide, which can correspond to the mass of a modification. We explored the relation between FDR and DeltaMass using 100 concatenated random decoy libraries and discovered that a small number of DeltaMass values were especially likely to result from decoy searches. Using these values, FDR values could be adjusted for these specific values and a reliable FDR generated for any DeltaMass value. Finally, using this method, we find and examine common, reliable identifications made by the hybrid search for a range of proteomic studies.
- Published
- 2019
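The abstract of entry 27 builds on the target-decoy approach. As a rough illustration of that basic idea only, the sketch below estimates FDR at a score threshold as the number of decoy hits divided by the number of target hits. It does not implement the DeltaMass-specific adjustment the abstract describes; the function name `estimate_fdr` and the `(score, is_decoy)` input layout are assumptions made for this example.

```python
from typing import Iterable, Tuple


def estimate_fdr(hits: Iterable[Tuple[float, bool]], threshold: float) -> float:
    """Basic target-decoy FDR estimate (hypothetical helper, not the paper's code).

    hits: (score, is_decoy) pairs from a search against a concatenated
          target + decoy library.
    Returns decoys / targets among hits scoring at or above the threshold.
    """
    hits = list(hits)
    targets = sum(1 for score, is_decoy in hits if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits if score >= threshold and is_decoy)
    # With no target hits above the threshold there is nothing to report.
    return decoys / targets if targets else 0.0


# Made-up scores, for illustration only.
example_hits = [(900, False), (850, False), (800, True), (700, False), (600, True)]
fdr_at_650 = estimate_fdr(example_hits, 650)  # 1 decoy / 3 targets
```

Lowering the threshold admits more identifications at the cost of a higher estimated FDR; in practice the threshold is chosen so that the estimate stays below a fixed level such as 1%.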
28. Cross-Ring Fragmentation Patterns in the Tandem Mass Spectra of Underivatized Sialylated Oligosaccharides and Their Special Suitability for Spectrum Library Searching.
- Author
-
De Leoz MLA, Simón-Manso Y, Woods RJ, and Stein SE
- Subjects
- CA-19-9 Antigen, N-Acetylneuraminic Acid chemistry, Polysaccharides analysis, Polysaccharides chemistry, Small Molecule Libraries, Oligosaccharides analysis, Oligosaccharides chemistry, Tandem Mass Spectrometry methods
- Abstract
Reference spectral library searching, while widely used to identify compounds in other areas of mass spectrometry, is not commonly used in glycomics. Building on a study by Cotter and coworkers on analysis of sialylated oligosaccharides using atmospheric pressure-matrix-assisted laser-induced tandem mass spectrometry (MS/MS), we show that library search methods enable the automated differentiation of such sialylated oligosaccharide isomers using MS/MS derived from electrospray collision-induced dissociation in ion trap and beam-type fragmentation mass spectrometers. We compare MS/MS spectra of five sets of native sialylated oligosaccharide isomers and show a spectral library search method that can distinguish between these isomers using the precursor ion [M+2X-H]+, where X=Li, Na, or K. Sialic acid linkage (α2,3 vs. α2,6) is known to have a dramatic effect on the fragmentation of the sialylated compounds. We found that the 2,4A3 cross-ring fragment at the terminal monosaccharide in sialyllactoses, sialyllactosamines, and sialyl pentasaccharides is highly abundant in the MS/MS spectra of [M+2X-H]+ species of α2,6-NeuAc glycans, while the (2,4A3-H2O) fragment is highly abundant in the α2,3-NeuAc moiety. The 2,4A3-H2O peak is specific to NeuAc-α2,3-Gal-β1,4-Y (Y=GlcNAc or Glc). To our knowledge, this observation was not reported previously. Theoretical calculations reveal major conformational differences between α2,6-NeuAc and α2,3-NeuAc structures that provide reasonable explanations for the observed fragmentation patterns. Other singly-charged ions ([M+X]+) do not show similar cross-ring cleavages. Implemented in a searchable library, these spectral differences provide a facile method to distinguish sialyl isomers without derivatization. We also found good spectral matching across instruments. MS/MS spectra and tools are available at http://chemdata.nist.gov/glycan/spectra .
- Published
- 2019
29. Creating a Mass Spectral Reference Library for Oligosaccharides in Human Milk.
- Author
-
Remoroza CA, Mak TD, De Leoz MLA, Mirokhin YA, and Stein SE
- Subjects
- Chromatography, High Pressure Liquid methods, Chromatography, High Pressure Liquid standards, Humans, Reference Standards, Spectrometry, Mass, Electrospray Ionization standards, Tandem Mass Spectrometry standards, Milk, Human chemistry, Oligosaccharides analysis, Spectrometry, Mass, Electrospray Ionization methods, Tandem Mass Spectrometry methods
- Abstract
We report the development and availability of a mass spectral reference library for oligosaccharides in human milk. This represents a new variety of spectral library that includes consensus spectra of compounds annotated through various data analysis methods, a concept that can be extended to other varieties of biological fluids. Oligosaccharides from the NIST Standard Reference Material (SRM) 1953, composed of human milk pooled from 100 breastfeeding mothers, were identified and characterized using hydrophilic interaction liquid chromatography electrospray ionization tandem mass spectrometry (HILIC-ESI-MS/MS) and the NIST 17 Tandem MS Library. Consensus reference spectra were generated, incorporated into a searchable library, and matched using the newly developed hybrid search algorithm to elucidate unknown oligosaccharides. The NIST hybrid search program facilitates the structural assignment of complex oligosaccharides especially when reference standards are not commercially available. High accuracy mass measurement for precursor and product ions, as well as the relatively high MS/MS signal intensities of various oligosaccharide precursors with Fourier transform ion trap (FT-IT) and higher energy dissociation (HCD) fragmentation techniques, enabled the assignment of multiple free and underivatized fucosyllacto- and sialyllacto-oligosaccharide spectra. Neutral and sialylated isomeric oligosaccharides have distinct retention times, allowing the identification of 74 oligosaccharides in the reference material. This collection of newly characterized spectra based on a searchable, reference MS library of annotated oligosaccharides can be applied to analyze similar compounds in other types of milk or any biological fluid containing milk oligosaccharides.
- Published
- 2018
30. The NISTmAb tryptic peptide spectral library for monoclonal antibody characterization.
- Author
-
Dong Q, Liang Y, Yan X, Markey SP, Mirokhin YA, Tchekhovskoi DV, Bukhari TH, and Stein SE
- Subjects
- Animals, Chromatography, Liquid instrumentation, Humans, Adalimumab analysis, Adalimumab chemistry, Mass Spectrometry methods, Peptide Library
- Abstract
We describe the creation of a mass spectral library composed of all identifiable spectra derived from the tryptic digest of the NISTmAb IgG1κ. The library is a unique reference spectral collection developed from over six million peptide-spectrum matches acquired by liquid chromatography-mass spectrometry (LC-MS) over a wide range of collision energy. Conventional one-dimensional (1D) LC-MS was used for various digestion conditions and 20- and 24-fraction two-dimensional (2D) LC-MS studies permitted in-depth analyses of single digests. Computer methods were developed for automated analysis of LC-MS isotopic clusters to determine the attributes for all ions detected in the 1D and 2D studies. The library contains a selection of over 12,600 high-quality tandem spectra of more than 3,300 peptide ions identified and validated by accurate mass, differential elution pattern, and expected peptide classes in peptide map experiments. These include a variety of biologically modified peptide spectra involving glycosylated, oxidized, deamidated, glycated, and N/C-terminal modified peptides, as well as artifacts. A complete glycation profile was obtained for the NISTmAb with spectra for 58% and 100% of all possible glycation sites in the heavy and light chains, respectively. The site-specific quantification of methionine oxidation in the protein is described. The utility of this reference library is demonstrated by the analysis of a commercial monoclonal antibody (adalimumab, Humira®), where 691 peptide ion spectra are identifiable in the constant regions, accounting for 60% coverage for both heavy and light chains. The NIST reference library platform may be used as a tool for facile identification of the primary sequence and post-translational modifications, as well as the recognition of LC-MS method-induced artifacts for human and recombinant IgG antibodies. Its development also provides a general method for creating comprehensive peptide libraries of individual proteins.
- Published
- 2018
31. Collision-Induced Dissociation of Deprotonated Peptides. Relative Abundance of Side-Chain Neutral Losses, Residue-Specific Product Ions, and Comparison with Protonated Peptides.
- Author
-
Liang Y, Neta P, Yang X, and Stein SE
- Subjects
- Ions chemistry, Tandem Mass Spectrometry, Amino Acids chemistry, Peptides chemistry, Protons
- Abstract
High-accuracy MS/MS spectra of deprotonated ions of 390 dipeptides and 137 peptides with three to six residues are studied. Many amino acid residues undergo neutral losses from their side chains. The most abundant is the loss of acetaldehyde from threonine. The abundance of losses from the side chains of other amino acids is estimated relative to that of threonine. While some amino acids lose the whole side chain, others lose only part of it, and some exhibit two or more different losses. Side-chain neutral losses are less abundant in the spectra of protonated peptides, being significant mainly for methionine and arginine. In addition to the neutral losses, many amino acid residues in deprotonated peptides produce specific negative ions after peptide bond cleavage. An expanded list of fragment ions from protonated peptides is also presented and compared with those of deprotonated peptides. Fragment ions are mostly different for these two cases. These lists of fragments are used to annotate peptide mass spectral libraries and to aid in the confirmation of specific amino acids in peptides.
- Published
- 2018
32. Reverse and Random Decoy Methods for False Discovery Rate Estimation in High Mass Accuracy Peptide Spectral Library Searches.
- Author
-
Zhang Z, Burke M, Mirokhin YA, Tchekhovskoi DV, Markey SP, Yu W, Chaerkady R, Hess S, and Stein SE
- Subjects
- Amino Acid Sequence, Humans, Peptides chemistry, Software, Tandem Mass Spectrometry, Algorithms, Amino Acids chemistry, Image Processing, Computer-Assisted statistics & numerical data, Libraries, Special methods, Peptide Library, Peptides analysis
- Abstract
Spectral library searching (SLS) is an attractive alternative to sequence database searching (SDS) for peptide identification due to its speed, sensitivity, and ability to include any selected mass spectra. While decoy methods for SLS have been developed for low mass accuracy peptide spectral libraries, it is not clear that they are optimal or directly applicable to high mass accuracy spectra. Therefore, we report the development and validation of methods for high mass accuracy decoy libraries. Two types of decoy libraries were found to be suitable for this purpose. The first, referred to as Reverse, constructs spectra by reversing a library's peptide sequences except for the C-terminal residue. The second, termed Random, randomly replaces all non-C-terminal residues and either retains the original C-terminal residue or replaces it based on the amino-acid frequency of the library's C-terminus. In both cases the m/z values of fragment ions are shifted accordingly. Determination of FDR is performed in a manner equivalent to SDS, concatenating a library with its decoy prior to a search. The utility of Reverse and Random libraries for target-decoy SLS in estimating false-positives and FDRs was demonstrated using spectra derived from a recently published synthetic human proteome project (Zolg, D. P.; et al. Nat. Methods 2017, 14, 259-262). For data sets from two large-scale label-free and iTRAQ experiments, these decoy building methods yielded highly similar score thresholds and spectral identifications at 1% FDR. The results were also found to be equivalent to those of using the decoy-free PeptideProphet algorithm. Using these new methods for FDR estimation, MSPepSearch, which is freely available search software, led to 18% more identifications at 1% FDR and 23% more at 0.1% FDR when compared with other widely used SDS engines coupled to postprocessing approaches such as Percolator. 
An application of these methods for FDR estimation for the recently reported "hybrid" library search (Burke, M. C.; et al. J. Proteome Res. 2017, 16, 1924-1935) method is also made. The application of decoy methods for high mass accuracy SLS permits the merging of these results with those of SDS, thereby increasing the number of peptides assigned and leading to deeper proteome coverage.
- Published
- 2018
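The "Reverse" decoy construction described in the abstract of entry 32 (reverse a library peptide's sequence while leaving the C-terminal residue in place) can be sketched at the sequence level as follows. This is an illustration under simplifying assumptions, not the authors' implementation: the function name `reverse_decoy` is hypothetical, modifications are ignored, and the corresponding shift of fragment-ion m/z values mentioned in the abstract is not shown.

```python
def reverse_decoy(sequence: str) -> str:
    """Reverse every residue except the C-terminal one (hypothetical helper)."""
    if len(sequence) < 2:
        # Nothing to reverse for empty or single-residue sequences.
        return sequence
    return sequence[:-1][::-1] + sequence[-1]


decoy = reverse_decoy("PEPTIDEK")
print(decoy)  # EDITPEPK
```

Keeping the C-terminal residue fixed preserves the tryptic character of the decoy peptide (a terminal K or R), so decoys resemble real library entries in that respect.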
33. Combining Fragment-Ion and Neutral-Loss Matching during Mass Spectral Library Searching: A New General Purpose Algorithm Applicable to Illicit Drug Identification.
- Author
-
Moorthy AS, Wallace WE, Kearsley AJ, Tchekhovskoi DV, and Stein SE
- Subjects
- Ions chemistry, Molecular Structure, Molecular Weight, Tandem Mass Spectrometry, Algorithms, Illicit Drugs analysis, Search Engine
- Abstract
A mass spectral library search algorithm that identifies compounds that differ from library compounds by a single "inert" structural component is described. This algorithm, the Hybrid Similarity Search, generates a similarity score based on matching both fragment ions and neutral losses. It employs the parameter DeltaMass, defined as the mass difference between query and library compounds, to shift neutral loss peaks in the library spectrum to match corresponding neutral loss peaks in the query spectrum. When the spectra being compared differ by a single structural feature, these matching neutral loss peaks should contain that structural feature. This method extends the scope of the library to include spectra of "nearest-neighbor" compounds that differ from library compounds by a single chemical moiety. Additionally, determination of the structural origin of the shifted peaks can aid in the determination of the chemical structure and fragmentation mechanism of the query compound. A variety of examples are presented, including the identification of designer drugs and chemical derivatives not present in the library.
- Published
- 2017
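The peak-matching idea in the abstract of entry 33 (match fragment ions directly, and match neutral-loss peaks after shifting library peaks by DeltaMass) can be illustrated with a toy sketch. Everything here is an assumption made for the example: the function name `hybrid_matches`, the fixed m/z tolerance, singly charged ions, and the use of precursor m/z values as a stand-in for compound masses. It only labels matched peaks and does not compute the similarity score used by the actual algorithm.

```python
from typing import List, Tuple


def hybrid_matches(
    query_mz: List[float],
    library_mz: List[float],
    query_precursor: float,
    library_precursor: float,
    tol: float = 0.01,
) -> Tuple[float, List[Tuple[float, str]]]:
    """Label each library peak as a direct or DeltaMass-shifted match (toy sketch)."""
    # DeltaMass: mass difference between query and library compounds
    # (approximated here by precursor m/z, assuming singly charged ions).
    delta_mass = query_precursor - library_precursor

    def query_has_peak(target: float) -> bool:
        return any(abs(mz - target) <= tol for mz in query_mz)

    matches = []
    for mz in library_mz:
        if query_has_peak(mz):
            matches.append((mz, "direct"))       # same fragment ion in both spectra
        elif query_has_peak(mz + delta_mass):
            matches.append((mz, "shifted"))      # same neutral loss, fragment carries the change
    return delta_mass, matches


# Made-up spectra: the query differs from the library compound by 14.0 Da.
delta, matched = hybrid_matches(
    query_mz=[91.0, 164.0],
    library_mz=[91.0, 150.0],
    query_precursor=214.0,
    library_precursor=200.0,
)
```

In this made-up case the peak at 91.0 matches directly, while the library peak at 150.0 matches the query peak at 164.0 only after the shift, which suggests that this fragment retains the modified part of the molecule.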
34. Extending a Tandem Mass Spectral Library to Include MS2 Spectra of Fragment Ions Produced In-Source and MSn Spectra.
- Author
-
Yang X, Neta P, and Stein SE
- Abstract
Tandem mass spectral library searching is finding increased use as an effective means of determining chemical identity in mass spectrometry-based omics studies. We previously reported on constructing a tandem mass spectral library that includes spectra for multiple precursor ions for each analyte. Here we report our method for expanding this library to include MS2 spectra of fragment ions generated during the ionization process (in-source fragment ions) as well as MS3 and MS4 spectra. These can assist the chemical identification process. A simple density-based clustering algorithm was used to cluster all significant precursor ions from MS1 scans for an analyte acquired during an infusion experiment. The MS2 spectra associated with these precursor ions were grouped into the same precursor clusters. Subsequently, a new top-down hierarchical divisive clustering algorithm was developed for clustering the spectra from fragmentation of ions in each precursor cluster, including the MS2 spectra of the original precursors and of the in-source fragments as well as the MSn spectra. This algorithm starts with all the spectra of one precursor in one cluster and then separates them into sub-clusters of similar spectra based on the fragment patterns. Herein, we describe the algorithms and spectral evaluation methods for extending the library. The new library features were demonstrated by searching the high resolution spectra of E. coli extracts against the extended library, allowing identification of compounds and their in-source fragment ions in a manner that was not possible before.
- Published
- 2017
35. The Hybrid Search: A Mass Spectral Library Search Method for Discovery of Modifications in Proteomics.
- Author
-
Burke MC, Mirokhin YA, Tchekhovskoi DV, Markey SP, Heidbrink Thompson J, Larkin C, and Stein SE
- Subjects
- Animals, CHO Cells, Cell Line, Tumor, Cricetulus, Databases, Factual, Humans, Ions, Molecular Weight, Proteomics standards, Algorithms, Databases, Protein, Proteomics methods, Tandem Mass Spectrometry methods
- Abstract
We present a mass spectral library-based method to identify tandem mass spectra of peptides that contain unanticipated modifications and amino acid variants. We describe this as a "hybrid" method because it combines matching both ion m/z and mass losses. The mass loss is the difference between the mass of an ion peak and the mass of its precursor. This difference, termed DeltaMass, is used to shift the product ions in the library spectrum that contain the modification, thereby allowing library product ions that contain the unexpected modification to match the query spectrum. Clustered unidentified spectra from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) and Chinese hamster ovary cells were used to evaluate this method. The results demonstrate the ability of the hybrid method to identify unanticipated modifications, insertions, and deletions, which may include those due to an incomplete protein sequence database or to search settings that exclude the correct identification, in high-resolution tandem mass spectra without regard to their precursor mass. This has been made possible by indexing of the m/z value of each fragment ion and its difference in mass from its precursor ion.
- Published
- 2017
36. Mass Spectral Library Quality Assurance by Inter-Library Comparison.
- Author
-
Wallace WE, Ji W, Tchekhovskoi DV, Phinney KW, and Stein SE
- Abstract
A method to discover and correct errors in mass spectral libraries is described. Comparing entries for compounds that have the same chemical structure across a set of highly curated reference libraries quickly identifies entries that are outliers. In cases where three or more entries for the same compound are compared, the outlier as determined by visual inspection was almost always found to contain the error. These errors were either in the spectrum itself or in the chemical descriptors that accompanied it. The method is demonstrated on finding errors in compounds of forensic interest in the NIST/EPA/NIH Mass Spectral Library. The target list of compounds checked was the Scientific Working Group for the Analysis of Seized Drugs (SWGDRUG) mass spectral library. Some examples of errors found are described. A checklist of errors that curators should look for when performing inter-library comparisons is provided.
- Published
- 2017
- Full Text
- View/download PDF
37. Analysis of human plasma metabolites across different liquid chromatography/mass spectrometry platforms: Cross-platform transferable chemical signatures.
- Author
-
Telu KH, Yan X, Wallace WE, Stein SE, and Simón-Manso Y
- Published
- 2017
- Full Text
- View/download PDF
38. Human Plasma Metabolites Measured with Different Liquid Chromatography/Mass Spectrometry (LC/MS) Platforms.
- Author
-
Telu KH, Yan X, Wallace WE, Stein SE, and Simón-Manso Y
- Published
- 2016
- Full Text
- View/download PDF
39. Interconversion of Peptide Mass Spectral Libraries Derivatized with iTRAQ or TMT Labels.
- Author
-
Zhang Z, Yang X, Mirokhin YA, Tchekhovskoi DV, Ji W, Markey SP, Roth J, Neta P, Hizal DB, Bowen MA, and Stein SE
- Subjects
- Databases, Protein, Mass Spectrometry, Molecular Weight, Staining and Labeling, Peptide Library, Proteomics methods
- Abstract
Derivatization of peptides with isobaric tags such as iTRAQ and TMT is widely employed in proteomics due to their compatibility with multiplex quantitative measurements. We recently made publicly available a large peptide library derived from iTRAQ 4-plex labeled spectra. This resource has not been used for identifying peptides labeled with related tags with different masses, because values for virtually all masses of precursor and most product ions would differ for ions containing the different tags as well as containing different tag-specific peaks. We describe a method for interconverting spectra from iTRAQ 4-plex to TMT (6- and 10-plex) and to iTRAQ 8-plex. We interconvert spectra by appropriately mass shifting sequence ions and discarding derivative-specific peaks. After this "cleaning" of search spectra, we demonstrate that the converted libraries perform well in terms of peptide spectral matches. This is demonstrated by comparing results using sequence database searches as well as by comparing search effectiveness using original and converted libraries. At 1% FDR, TMT labeled query spectra match 97% as many spectra against a converted iTRAQ library as compared to an original TMT library. Overall, this interconversion strategy provides a practical way to extend results from one derivatization method to others that share related chemistry and do not significantly alter fragmentation profiles.
- Published
- 2016
- Full Text
- View/download PDF
40. Data quality issues in proteomics - there are many paths to enlightenment.
- Author
-
Haynes PA, Stein SE, and Washburn MP
- Subjects
- Humans, Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization, Proteomics methods, Proteomics standards
- Published
- 2016
- Full Text
- View/download PDF
41. In-Depth Characterization and Spectral Library Building of Glycopeptides in the Tryptic Digest of a Monoclonal Antibody Using 1D and 2D LC-MS/MS.
- Author
-
Dong Q, Yan X, Liang Y, and Stein SE
- Subjects
- Chromatography, Liquid methods, Databases, Protein, Humans, Polysaccharides analysis, Software, Tandem Mass Spectrometry, Trypsin metabolism, Antibodies, Monoclonal metabolism, Glycopeptides analysis, Immunoglobulin G metabolism, Proteomics methods
- Abstract
This work presents a detailed analysis of glycopeptides produced in the tryptic digestion of an IgG1 reference material. Analysis was done by nanospray ESI LC-MS/MS over a wide range of HCD collision energies with both conventional 1D separation for various digestion conditions and a 20 fraction 2D-LC study of a single digest. An extended version of NIST-developed software for analysis of "shotgun" proteomics served to identify the glycopeptides from their precursor masses and product ions for peptides with up to three missed cleavages. A peptide with a single missed cleavage, TKPREEQYNSTYR, was dominant and led to the determination of almost all glycans reported in this study. The 2D studies found a total of 247 glycopeptide ions and 60 glycans of different masses, including 30 glycans found in the 1D studies. This significantly larger number of glycans than found in any other glycoanalysis of therapeutic glycoproteins is due to both the improved separation of sialylated versus asialylated species in the first (high-pH) dimension and the ability to inject large amounts of glycosylated peptides in the 2D studies. Systematic variations in retention with glycan size were also noted. Energy-dependent changes in HCD fragmentation confirmed the proposed glycan structures and led to a peak-annotated mass spectral library to aid the analysis of glycopeptides derived from IgG1 drugs.
- Published
- 2016
- Full Text
- View/download PDF
42. Analysis of human plasma metabolites across different liquid chromatography/mass spectrometry platforms: Cross-platform transferable chemical signatures.
- Author
-
Telu KH, Yan X, Wallace WE, Stein SE, and Simón-Manso Y
- Subjects
- Chromatography, High Pressure Liquid methods, Chromatography, Reverse-Phase methods, Humans, Metabolome, Plasma chemistry, Metabolomics methods, Plasma metabolism, Tandem Mass Spectrometry methods
- Abstract
Rationale: The metabolite profiling of a NIST plasma Standard Reference Material (SRM 1950) on different liquid chromatography/mass spectrometry (LC/MS) platforms showed significant differences. Although these findings suggest caution when interpreting metabolomics results, the degree of overlap of both profiles allowed us to use tandem mass spectral libraries of recurrent spectra to evaluate to what extent these results are transferable across platforms and to develop cross-platform chemical signatures. Methods: Non-targeted global metabolite profiles of SRM 1950 were obtained on different LC/MS platforms using reversed-phase chromatography and different chromatographic scales (conventional HPLC, UHPLC and nanoLC). The data processing and the metabolite differential analysis were carried out using publicly available (XCMS), proprietary (Mass Profiler Professional) and in-house software (NIST pipeline). Results: Repeatability and intermediate precision showed that the non-targeted SRM 1950 profiling was highly reproducible when working on the same platform (relative standard deviation (RSD) <2%); however, substantial differences were found in the LC/MS patterns originating on different platforms or even using different chromatographic scales (conventional HPLC, UHPLC and nanoLC) on the same platform. A substantial degree of overlap (common molecular features) was also found. A procedure to generate consistent chemical signatures using tandem mass spectral libraries of recurrent spectra is proposed. Conclusions: Different platforms rendered significantly different metabolite profiles, but the results were highly reproducible when working within one platform. Tandem mass spectral libraries of recurrent spectra are proposed to evaluate the degree of transferability of chemical signatures generated on different platforms. Chemical signatures based on our procedure are most likely cross-platform transferable. (Published in 2016. This article is a U.S. Government work and is in the public domain in the USA.)
- Published
- 2016
- Full Text
- View/download PDF
43. A Description of the Clinical Proteomic Tumor Analysis Consortium (CPTAC) Common Data Analysis Pipeline.
- Author
-
Rudnick PA, Markey SP, Roth J, Mirokhin Y, Yan X, Tchekhovskoi DV, Edwards NJ, Thangudu RR, Ketchum KA, Kinsinger CR, Mesri M, Rodriguez H, and Stein SE
- Subjects
- Biomarkers, Tumor metabolism, Humans, Proteome metabolism, Neoplasms diagnosis, Neoplasms metabolism, Proteomics
- Abstract
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these data sets were produced from 2D liquid chromatography-tandem mass spectrometry analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false-discovery rate-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the data sets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level ("rolled-up") precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data to enable comparisons between different samples and cancer types as well as across the major omics fields.
- Published
- 2016
- Full Text
- View/download PDF
44. Reaction of arylium ions with the collision gas N2 in electrospray ionization mass spectrometry.
- Author
-
Liang Y, Neta P, Simón-Manso Y, and Stein SE
- Subjects
- Ions chemistry, Nitrogen chemistry, Spectrometry, Mass, Electrospray Ionization methods
- Abstract
Rationale: The tandem mass spectra of many compounds contained peaks which could not have arisen from the precursor ion. Such peaks were found to be due to reaction of arylium ions with N2 in the collision cell. Therefore, this reaction was studied in detail with representative compounds. Methods: Various classes of compounds were dissolved in acetonitrile/water/formic acid and studied by electrospray ionization mass spectrometry to record their MS(2) and pseudo-MS(3) spectra in a QqQ mass spectrometer and their accurate m/z values in an Orbitrap Elite instrument. Arylium ions were found to react with N2 in the collision cell. The reaction was confirmed by pseudo-MS(3) studies, by comparison with authentic diazonium ions, and by the pressure dependence of the product ion survival yield. Results: Reactions of arylium ions with N2 were observed with p-toluenesulfonic acid, o-toluenesulfonamide, phenylphosphonic acid, phenol, aniline, aminonaphthalenes, benzoic acid, benzophenone, and other compounds. By using a QqQ mass spectrometer, we observed that the protonated compounds produce arylium ions, which then react with N2 to form diazonium ions. The diazonium ion was produced with N2 but not with Ar in the collision cell, and its abundance increased with increasing N2 pressure. Conclusions: Arylium ions generated from a wide variety of compounds in electrospray ionization tandem mass spectrometry may react with N2 to form diazonium ions. The abundance of the diazonium ions is affected by collision energy and N2 pressure. This reaction should be considered when annotating peaks in MS/MS libraries. Published in 2015. This article is a U.S. Government work and is in the public domain in the USA.
- Published
- 2015
- Full Text
- View/download PDF
45. Unexpected peaks in tandem mass spectra due to reaction of product ions with residual water in mass spectrometer collision cells.
- Author
-
Neta P, Farahani M, Simón-Manso Y, Liang Y, Yang X, and Stein SE
- Subjects
- Cations chemistry, Folic Acid analogs & derivatives, Reproducibility of Results, Spectrometry, Mass, Electrospray Ionization methods, Folic Acid chemistry, Tandem Mass Spectrometry methods, Water chemistry
- Abstract
Rationale: Certain product ions in electrospray ionization tandem mass spectrometry are found to react with residual water in the collision cell. This reaction often leads to the formation of ions that cannot be formed directly from the precursor ions, and this complicates the mass spectra and may distort MRM (multiple reaction monitoring) results. Methods: Various drugs, pesticides, metabolites, and other compounds were dissolved in acetonitrile/water/formic acid and studied by electrospray ionization mass spectrometry to record their MS(2) and MS(n) spectra in several mass spectrometers (QqQ, QTOF, IT, and Orbitrap HCD). Certain product ions were found to react with residual water in collision cells. The reaction was confirmed by MS(n) studies and the rate of reaction was determined in the IT instrument using zero collision energy and variable activation times. Results: Examples of product ions reacting with water include phenyl and certain substituted phenyl cations, benzoyl-type cations formed from protonated folic acid and similar compounds by loss of the glutamate moiety, product ions formed from protonated cyclic siloxanes by loss of methane, product ions formed from organic phosphates, and certain negative ions. The reactions of product ions with residual water varied greatly in their rate constant and in the extent of reaction (due to isomerization). Conclusions: Various types of product ions react with residual water in mass spectrometer collision cells. As a result, tandem mass spectra may contain unexplained peaks and MRM results may be distorted by the occurrence of such reactions. These often unavoidable reactions must be taken into account when annotating peaks in tandem mass spectra and when interpreting MRM results. Published in 2014. This article is a U.S. Government work and is in the public domain in the USA.
- Published
- 2014
- Full Text
- View/download PDF
46. Creation of libraries of recurring mass spectra from large data sets assisted by a dual-column workflow.
- Author
-
Mallard WG, Andriamaharavo NR, Mirokhin YA, Halket JM, and Stein SE
- Subjects
- Citric Acid chemistry, Humans, Urine chemistry, Gas Chromatography-Mass Spectrometry, Small Molecule Libraries chemistry
- Abstract
An analytical methodology has been developed for extracting recurrent unidentified spectra (RUS) from large GC/MS data sets. Spectra were first extracted from original data files by the Automated Mass Spectral Deconvolution and Identification System (AMDIS; Stein, S. E. J. Am. Soc. Mass Spectrom. 1999, 10, 770-781) using settings designed to minimize spurious spectra, followed by searching the NIST library with all unidentified spectra. The spectra that could not be identified were then filtered to remove poorly deconvoluted data and clustered. The results were assumed to be unidentified components. This was tested by requiring each unidentified spectrum to be found in two chromatographic columns with slightly different stationary phases. This methodology has been applied to a large set of pediatric urine samples. A library of spectra and retention indices for derivatized urine components, both identified and recurrent unidentified, has been created and is available for download.
- Published
- 2014
- Full Text
- View/download PDF
47. Loss of H2 and CO from protonated aldehydes in electrospray ionization mass spectrometry.
- Author
-
Neta P, Simón-Manso Y, Liang Y, and Stein SE
- Subjects
- Models, Molecular, Protons, Aldehydes chemistry, Carbon Monoxide chemistry, Hydrogen chemistry, Spectrometry, Mass, Electrospray Ionization methods
- Abstract
Rationale: Electrospray ionization mass spectrometry (ESI-MS) of many protonated aldehydes shows loss of CO as a major fragmentation pathway. However, we find that certain aldehydes undergo loss of H2 followed by reaction with water in the collision cell. This complicates interpretation of tandem mass (MS/MS) spectra and affects multiple reaction monitoring (MRM) results. Methods: 3-Formylchromone and other aldehydes were dissolved in acetonitrile/water/formic acid and studied by ESI-MS to record their MS(2) and MS(n) spectra in several mass spectrometers (QqQ, QTOF, ion trap (IT), and Orbitrap HCD). Certain product ions were found to react with water and the rate of reaction was determined in the IT instrument using zero collision energy and variable activation times. Theoretical calculations were performed to help with the interpretation of the fragmentation mechanism. Results: Protonated 3-formylchromones and 3-formylcoumarins undergo loss of H2 as a major fragmentation route to yield a ketene cation, which reacts with water to form a protonated carboxylic acid. In general, protonated aldehydes which contain a vicinal group that forms a hydrogen bridge with the formyl group undergo significant loss of H2. Subsequent losses of CO and C3O are also observed. Theoretical calculations suggest mechanistic details for these losses. Conclusions: Loss of H2 is a major fragmentation channel for protonated 3-formylchromones and certain other aldehydes and it is followed by reaction with water to produce a protonated carboxylic acid, which undergoes subsequent fragmentation. This presents a problem for reference libraries and raises concerns about MRM results. (Published in 2014. This article is a U.S. Government work and is in the public domain in the USA.)
- Published
- 2014
- Full Text
- View/download PDF
48. Tandem mass spectral libraries of peptides in digests of individual proteins: Human Serum Albumin (HSA).
- Author
-
Dong Q, Yan X, Kilpatrick LE, Liang Y, Mirokhin YA, Roth JS, Rudnick PA, and Stein SE
- Subjects
- Chromatography, Liquid, Humans, Proteolysis, Spectrometry, Mass, Electrospray Ionization, Tandem Mass Spectrometry, Trypsin chemistry, Peptide Library, Serum Albumin chemistry
- Abstract
This work presents a method for creating a mass spectral library containing tandem spectra of identifiable peptide ions in the tryptic digestion of a single protein. Human serum albumin (HSA) was selected for this purpose owing to its ubiquity, high level of characterization and availability of digest data. The underlying experimental data consisted of ∼3000 one-dimensional LC-ESI-MS/MS runs with ion-trap fragmentation. In order to generate a wide range of peptides, studies covered a broad set of instrument and digestion conditions using multiple sources of HSA and trypsin. Computer methods were developed to enable the reliable identification and reference spectrum extraction of all peptide ions identifiable by current sequence search methods. This process made use of both MS2 (tandem) spectra and MS1 (electrospray) data. Identified spectra were generated for 2918 different peptide ions, using a variety of manually-validated filters to ensure spectrum quality and identification reliability. The resulting library was composed of 10% conventional tryptic and 29% semitryptic peptide ions, along with 42% tryptic peptide ions with known or unknown modifications, which included both analytical artifacts and post-translational modifications (PTMs) present in the original HSA. The remaining 19% contained unexpected missed-cleavages or were under/over alkylated. The methods described can be extended to create equivalent spectral libraries for any target protein. Such libraries have a number of applications in addition to their known advantages of speed and sensitivity, including the ready re-identification of known PTMs, rejection of artifact spectra and a means of assessing sample and digestion quality. (© 2014 by The American Society for Biochemistry and Molecular Biology, Inc.)
- Published
- 2014
- Full Text
- View/download PDF
49. Quality control for building libraries from electrospray ionization tandem mass spectra.
- Author
-
Yang X, Neta P, and Stein SE
- Abstract
Electrospray ionization (ESI) tandem mass spectrometry coupled with liquid chromatography is a routine technique for identifying and quantifying compounds in complex mixtures. The identification step can be aided by matching acquired tandem mass spectra (MS(2)) against reference library spectra as is routine for electron ionization (EI) spectra from gas chromatography/mass spectrometry (GC/MS). However, unlike the latter spectra, ESI MS(2) spectra are likely to originate from various precursor ions for a given target molecule and may be acquired at varying energies and resolutions and have characteristic noise signatures, requiring processing methods very different from EI to obtain complete and high quality reference spectra for individual analytes. This paper presents procedures developed for creating a tandem mass spectral library that addresses these factors. Library building begins by acquiring MS(2) spectra for all major MS(1) peaks in an infusion run, followed by assigning MS(2) spectra to clusters and creating a consensus spectrum for each. Intensity-based constraints for cluster membership were developed, as well as peak testing to recognize and eliminate suspect peaks and reduce noise. Consensus spectra were then examined by a human evaluator using a number of criteria, including a fraction of annotated peaks and consistency of spectra for a given ion at different energies. These methods have been developed and used to build a library from >9000 compounds, yielding 230,000 spectra.
- Published
- 2014
- Full Text
- View/download PDF
50. Improved normalization of systematic biases affecting ion current measurements in label-free proteomics data.
- Author
-
Rudnick PA, Wang X, Yan X, Sedransk N, and Stein SE
- Subjects
- Chromatography, Liquid, Data Interpretation, Statistical, Mass Spectrometry, Models, Statistical, Proteins chemistry, Biometry methods, Peptides analysis, Proteins analysis, Proteomics methods
- Abstract
Normalization is an important step in the analysis of quantitative proteomics data. If this step is ignored, systematic biases can lead to incorrect assumptions about regulation. Most statistical procedures for normalizing proteomics data have been borrowed from genomics where their development has focused on the removal of so-called 'batch effects.' In general, a typical normalization step in proteomics works under the assumption that most peptides/proteins do not change; scaling is then used to give a median log-ratio of 0. The focus of this work was to identify other factors, derived from knowledge of the variables in proteomics, which might be used to improve normalization. Here we have examined the multi-laboratory data sets from Phase I of the NCI's CPTAC program. Surprisingly, the most important bias variables affecting peptide intensities within labs were retention time and charge state. The magnitude of these observations was exaggerated in samples of unequal concentrations or "spike-in" levels, presumably because the average precursor charge for peptides with higher charge state potentials is lower at higher relative sample concentrations. These effects are consistent with reduced protonation during electrospray and demonstrate that the physical properties of the peptides themselves can serve as good reporters of systematic biases. Between labs, retention time, precursor m/z, and peptide length were most commonly the top-ranked bias variables, over the standardly used average intensity (A). A larger set of variables was then used to develop a stepwise normalization procedure. This statistical model was found to perform as well or better on the CPTAC mock biomarker data than other commonly used methods. Furthermore, the method described here does not require a priori knowledge of the systematic biases in a given data set. These improvements can be attributed to the inclusion of variables other than average intensity during normalization.
- Published
- 2014
- Full Text
- View/download PDF
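The conventional normalization step mentioned in the abstract above (scaling so that the median log-ratio is 0) can be sketched as follows. This is a minimal illustration of that baseline only, not the stepwise procedure the paper develops. The function name `median_center` and the sample values are hypothetical.

```python
from statistics import median
from typing import List


def median_center(log_ratios: List[float]) -> List[float]:
    """Shift log-ratios so that their median becomes 0.

    Rests on the assumption stated in the abstract that most
    peptides/proteins do not change between the compared samples.
    """
    m = median(log_ratios)
    return [r - m for r in log_ratios]


# Hypothetical log2 ratios carrying a systematic positive offset.
centered = median_center([0.5, 1.0, 1.5, 3.0])
print(centered, median(centered))
```

Because the same shift is applied to every peptide, this step cannot correct biases that vary with retention time or charge state, which is the limitation the abstract addresses.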