Matches in SemOpenAlex for { <https://semopenalex.org/work/W2600587704> ?p ?o ?g. }
- W2600587704 endingPage "1161" @default.
- W2600587704 startingPage "1151" @default.
- W2600587704 abstract "Traditional “bottom-up” proteomic approaches use proteolytic digestion, LC-MS/MS, and database searching to elucidate peptide identities and their parent proteins. Protein sequences absent from the database cannot be identified, and even if present in the database, complete sequence coverage is rarely achieved even for the most abundant proteins in the sample. Thus, sequencing of unknown proteins such as antibodies or constituents of metaproteomes remains a challenging problem. To date, there is no available method for full-length protein sequencing, independent of a reference database, in high throughput. Here, we present Database-independent Protein Sequencing, a method for unambiguous, rapid, database-independent, full-length protein sequencing. The method is a novel combination of non-enzymatic, semi-random cleavage of the protein, LC-MS/MS analysis, peptide de novo sequencing, extraction of peptide tags, and their assembly into a consensus sequence using an algorithm named “Peptide Tag Assembler.” As proof-of-concept, the method was applied to samples of three known proteins representing three size classes and to a previously un-sequenced, clinically relevant monoclonal antibody. Excluding leucine/isoleucine and glutamic acid/deamidated glutamine ambiguities, end-to-end full-length de novo sequencing was achieved with 99–100% accuracy for all benchmarking proteins and the antibody light chain. Accuracy of the sequenced antibody heavy chain, including the entire variable region, was also 100%, but there was a 23-residue gap in the constant region sequence. 
The main goal of mass spectrometry-based proteomic experiments is typically protein identification. To achieve this goal, current approaches utilize proteolytic digestion of protein samples followed by LC-MS/MS and database searching to identify the peptides, and thereby their parent proteins (1. Zhang Y. Fonslow B.R. Shan B. Baek M.C. Yates 3rd, J.R. Protein analysis by shotgun/bottom-up proteomics. Chem. Rev. 2013; 113: 2343-2394; 2. Yates J.R. Ruse C.I. Nakorchevsky A. Proteomics by mass spectrometry: approaches, advances, and applications. Annu. Rev. Biomed. Eng. 2009; 11: 49-79). 
Although very powerful when analyzing well characterized organisms, the method has several significant drawbacks when analyzing samples that are not well characterized. First, it strictly depends on a protein database that contains the correct sequence of the measured peptides. Unknown protein sequences cannot be identified. Second, it relies on identification of proteolytic peptides, typically tryptic. Trypsin is used for several reasons, including its high efficiency and specificity. However, because trypsin cleaves the protein only after lysine and arginine residues, tryptic digestion of typical proteins results in some peptides that are too short, too long, too hydrophobic, or contain a sequence of residues that is poorly ionized or fragmented. As a result, even for the most abundant proteins in the sample, sequence coverage of a protein (i.e. the percentage of the entire amino acid sequence covered by measured peptides) is almost never 100%, and there are likely to be regions with no overlap between identified peptides. Enzymatic digestion by other proteases is sometimes performed for specific applications, but they too might result in peptides that are not amenable to identification by LC-MS/MS. Another strategy for proteomic analysis is peptide de novo sequencing, where the peptide sequence is inferred directly from the MS/MS spectrum, without referring to a database (3. Allmer J. Algorithms for the de novo sequencing of peptides from tandem mass spectra. Expert Rev. Proteomics. 2011; 8: 645-657; 4. Seidler J. Zinn N. Boehm M.E. Lehmann W.D. De novo sequencing of peptides by MS/MS. Proteomics. 2010; 10: 634-649). This is done by identifying mass differences between peaks in the MS/MS spectrum that correspond exactly to specific amino acids. The advantage of this approach is that no database is required for identification of a peptide. 
However, the inherent chemical properties of the peptide and inefficiencies of the instrument might lead to gaps in the de novo sequencing, resulting in only partial or imperfect peptide sequences. Thus, obtaining confident and accurate peptide sequences de novo in high throughput is very challenging. Furthermore, even if the peptide was correctly sequenced de novo, inference to its parent protein is, again, strictly dependent on matching the peptide to a known protein in a database, using a BLAST search, for example. (The abbreviations used are: BLAST, basic local alignment search tool; pTA, peptide tag assembler; DiPS, database-independent protein sequencing; PTM, post-translational modification; IAA, iodoacetamide; IAc, iodoacetate; MAAH, microwave-assisted acid hydrolysis; mSPS, meta-SPS.) For an unknown protein, reconstruction of its full amino acid sequence is very challenging using current bottom-up approaches. Determination of a protein sequence without prior knowledge is an important and rate-limiting step in analysis of poorly characterized protein samples, such as ones derived from unsequenced organisms, environmental samples, and microbiomes. Other important cases are antibodies and T-cell receptors for which the variable region sequences are unknown. To infer the amino acid sequence of an unknown monoclonal antibody of interest, typically cDNA from the source hybridoma is produced, sequenced, and translated. However, hybridoma cells are not always available, or the primers used to amplify the cDNA might not match the target antibody DNA sequence. In such cases, the amino acid sequence of the protein has to be determined directly by proteomic techniques. 
To date, there are only a few reported methods attempting to perform end-to-end de novo protein sequencing by LC-MS/MS, of which ALPS and meta-SPS (mSPS) are among the most recent ones (5. Guthals A. Clauser K.R. Bandeira N. Shotgun protein sequencing with meta-contig assembly. Mol. Cell. Proteomics. 2012; 11: 1084-1096; 6. Bandeira N. Clauser K.R. Pevzner P.A. Shotgun protein sequencing: assembly of peptide tandem mass spectra from mixtures of modified proteins. Mol. Cell. Proteomics. 2007; 6: 1123-1134; 7. Guthals A. Clauser K.R. Frank A.M. Bandeira N. Sequencing-grade de novo analysis of MS/MS triplets (CID/HCD/ETD) from overlapping peptides. J. Proteome Res. 2013; 12: 2846-2857; 8. Tran N.H. Rahman M.Z. He L. Xin L. Shan B. Li M. Complete de novo assembly of monoclonal antibody sequences. Sci. Rep. 2016; 6: 31730; 9. Vyatkina K. Wu S. Dekker L.J. VanDuijn M.M. Liu X. Tolić N. Dvorkin M. Alexandrova S. Luider T.M. Paša-Tolić L. Pevzner P.A. De novo sequencing of peptides from top-down tandem mass spectra. J. Proteome Res. 2015; 14: 4450-4462; 10. Bandeira N. Pham V. Pevzner P. Arnott D. Lill J.R. Automated de novo protein sequencing of monoclonal antibodies. Nat. Biotechnol. 2008; 26: 1336-1338). All such bottom-up methods rely on enzymatic digestion by multiple proteases to generate overlapping peptides, followed by de novo peptide sequencing and assembly. Some of these methods use results from searches against a reference protein database for improving the assembly process. 
If the analyzed proteins or their close homologs are not represented in that database, thus requiring the use of only de novo sequenced peptides for assembly, these methods are expected to have inferior performance. For example, without using results from a database search against an in-house-generated antibody database, ALPS (8) resulted in a fragmented assembly of all light and heavy chains of their analyzed antibodies. Even with the use of the database search results, one of the two heavy chains analyzed resulted in a fragmented assembly by ALPS, specifically at the variable region of the heavy chain (8). A fragmented assembly is detrimental for determination of the full-length sequence of an unknown protein, because without prior knowledge of the protein sequence, it is not possible to determine which among all contigs (the assembled amino acid sequence stretches that cover part of the polypeptide chain) should be used for the assembly. Using de novo data only, the longest contig assembled by mSPS was 194 amino acids long (using analysis of multiple proteolytic digests with three different fragmentation methods) (7), and no protein was reported to be fully assembled (from N to C terminus) by mSPS in a single contig (5). 
Thus, de novo sequencing of typical full-length proteins is still challenging using current techniques. Here we present a proof-of-concept for a full-length de novo protein sequencing method that we named Database-independent Protein Sequencing (DiPS). The method is based on cleavage of the protein at semi-random sites by non-enzymatic, microwave-assisted acid hydrolysis (MAAH), enrichment of LC-MS/MS-amenable peptides from the hydrolysate by solid-phase extraction, LC-MS/MS analysis, de novo peptide sequencing of resulting peptides, extraction of peptide tags from the de novo peptide sequences, and their assembly into consensus contigs (Fig. 1). Within minutes of sample processing followed by standard proteomic analysis, full-length de novo protein sequences can be obtained. Three proteins were subjected to DiPS analysis in three independent replicates. These included bovine serum albumin (BSA), fetuin-A, and myoglobin as benchmarks. Additionally, AR37, a previously un-sequenced monoclonal antibody, was also subjected to DiPS as a test case. All chemicals and proteins were purchased from Sigma-Aldrich, unless stated otherwise. For MAAH, 10 μg of dry protein powder of each BSA (Uniprot accession no. P02769, Sigma-Aldrich catalog no. A2153), fetuin-A (Uniprot accession no. P12763, Sigma-Aldrich catalog no. F2379), or equine myoglobin (Uniprot accession no. P68082, Sigma-Aldrich catalog no. M1882) were dissolved in 200 μl of 8 M urea, 0.1 M Tris-HCl, pH 7.9. Dithiothreitol (DTT) was added to a final concentration of 5 mM and incubated at 37 °C for 50 min. Iodoacetamide (IAA) was added at a final concentration of 10 mM and incubated 30 min in the dark. Buffer was exchanged to water using an Amicon 3-kDa MWCO filter (Millipore UFC500396) by adding 300 μl of H2O and centrifuging at 14,000 × g until the remaining volume was about 40 μl, and the process was repeated. 
The remaining volume was collected and transferred to a glass vial with a pre-slit cap (Waters, catalog no. 186000307C). HCl was added to a final concentration of 3 M; the vial was placed on ice in a beaker and microwaved for 4 min (stopping every 1 min to replenish ice) in a standard home microwave (LG Intellowave 1,200 watts) at highest settings. Hydrolysates were then subjected to solid-phase extraction (Oasis HLB, Waters, catalog no. 186001828BA), and peptides were eluted with 80% acetonitrile. Peptide samples were dried using a vacuum centrifuge (Eppendorf Concentrator Plus) and resuspended in 3% acetonitrile, 0.1% formic acid for nanoLC-MS/MS analysis. 1.2 μg of resuspended hydrolyzed protein were loaded onto the chromatography column. AR37 was isolated as described previously (11. Carvalho S. Lindzen M. Lauriola M. Shirazi N. Sinha S. Abdul-Hai A. Levanon K. Korach J. Barshack I. Cohen Y. Onn A. Mills G. Yarden Y. An antibody to amphiregulin, an abundant growth factor in patients' fluids, inhibits ovarian tumors. Oncogene. 2016; 35: 438-447). Sample processing of AR37 was performed as above with the exception of alkylation with iodoacetate (IAc) (Sigma-Aldrich catalog no. I4386) instead of IAA (identical concentration and conditions to IAA alkylation). For tryptic digestion, proteins were dissolved in 8 M urea, 0.1 M Tris-HCl, pH 7.9, reduced, and alkylated as described above. Samples were diluted to 2 M urea with 50 mM ammonium bicarbonate. Proteins were then subjected to digestion with trypsin (Promega; Madison, WI) overnight at 37 °C (50:1 protein amount/trypsin), followed by a second trypsin digestion for 4 h. The digestions were stopped by addition of trifluoroacetic acid (1%). Following digestion, peptides were desalted, dried, and resuspended as described above. ULC/MS grade solvents were used for all chromatographic steps. 
Each sample was loaded once (without technical replicates), using split-less nano-ultra performance liquid chromatography (nanoAcquity; Waters). The mobile phase was as follows: A, H2O + 0.1% formic acid; B, acetonitrile + 0.1% formic acid. Desalting of the samples was performed on line using a reversed-phase Symmetry C18 trapping column (180-μm internal diameter, 20-mm length, 5-μm particle size; Waters). The peptides were then separated using an HSS T3 nano-column (75-μm internal diameter, 250-mm length, 1.8-μm particle size; Waters) at 0.35 μl/min. For BSA, fetuin-A, and AR37, peptides were eluted from the column into the mass spectrometer in 3 h using the following gradient: 4–30% B in 140 min and 30–90% B in 25 min, maintained at 95% for 5 min, and then back to initial conditions. For myoglobin, the smallest protein, peptides were eluted from the column into the mass spectrometer in 2 h using the following gradient: 4–30% B in 105 min and 30–90% B in 15 min, maintained at 95% for 5 min, and then back to initial conditions. The nano-UPLC was coupled on line through a nano-ESI emitter (10-μm tip; New Objective; Woburn, MA) to a quadrupole orbitrap mass spectrometer (Q Exactive Plus, Thermo Fisher Scientific) using a FlexIon nanospray apparatus (Proxeon). Data were acquired in data-dependent acquisition (DDA) mode, using a Top20 method. MS1 resolution was set to 70,000 (at 400 m/z), maximum injection time of 20 ms, scan range was 300–1650 m/z, and AGC target of 3e6. MS2 resolution was set to 70,000, maximum injection time of 120 ms, isolation window 1.7 m/z, and AGC target of 1e6. Normalized collision energy was set to 30. Raw data were analyzed using the PEAKS 7.0 software (Bioinformatics Solutions Inc, Waterloo, Ontario, Canada) using the de novo module for DiPS or using the database search module for assessment of cleavage efficiency. 
For BSA, fetuin-A, and myoglobin, analysis parameters included no enzyme specificity, no fixed modifications, and variable modifications as follows: methionine oxidation, cysteine carbamidomethylation, cysteine carboxymethylation, and arginine citrullination. For AR37 (alkylated with IAc), the de novo parameters included no enzyme specificity, fixed modification of cysteine carboxymethylation, and variable modifications as follows: methionine oxidation, arginine citrullination, and glutamine to pyroglutamate conversion. Parent Mass Error Tolerance was 10.0 ppm. Fragment Mass Error Tolerance was 0.02 Da. Maximum variable PTM per peptide was 5. Unfiltered de novo sequenced peptides (minimum “average local confidence score” = 0) were exported as a ‘.csv’ file and used as input for pTA using default parameters: k-mer size = 7, k-mer min overlap = 5, unite overlap size = 5, unite minimum extension = 7, merge minimum quality = 0.7. For AR37 validation, the tryptic digest was searched against a database containing the DiPS determined heavy and light sequences, as well as 123 common laboratory contaminants. The search was performed using the database search module of the PEAKS algorithm with parameters specifying nonspecific digestion, fixed modification of cysteine carboxymethylation, and variable modifications of methionine oxidation, asparagine/glutamine deamidation, and N-terminal asparagine/glutamine to pyroglutamate. Data were filtered at 1% FDR at the peptide level based on a reversed sequence decoy database search. The pTA executable and example data are provided as a supplemental file. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (12. Vizcaíno J.A. Csordas A. del-Toro N. Dianes J.A. Griss J. Lavidas I. Mayer G. Perez-Riverol Y. Reisinger F. Ternent T. Xu Q.W. Wang R. Hermjakob H. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res. 2016; 44: D447-D456) partner repository with the dataset identifier PXD003804. 
The pTA tool is available as a Windows executable (supplemental material) and the code is available via https://bitbucket.org/incpm/dips. DiPS is based on assembly of overlapping de novo sequenced peptide tags into a final consensus sequence of the protein. To this aim, a modified MAAH protocol (first described in Ref. 13. Zhong H. Zhang Y. Wen Z. Li L. Protein sequencing by mass analysis of polypeptide ladders after controlled protein hydrolysis. Nat. Biotechnol. 2004; 22: 1291-1296) was developed as a simple, cost-effective, and rapid method to cleave proteins at semi-random peptide bonds, thus producing peptides overlapping in sequence, which cover the full protein sequence. Because of the physical and chemical properties of different peptide bonds along the polypeptide chain, the process of generating peptide tags by DiPS is not completely random. To enhance randomization, each step of the process was optimized based on the highest number of unique peptides identified from a BSA hydrolysate subjected to nanoLC-MS/MS and a standard database search as a benchmark. Optimized parameters included hydrolysis time (supplemental Fig. S1), solid-phase extraction elution (supplemental Fig. S2), and normalized collision energy (NCE) for peptide fragmentation (supplemental Fig. S3). After MAAH treatment, the resulting hydrolysate was subjected to nanoLC-MS/MS and de novo peptide sequencing using the commercial PEAKS 7.0 software. An algorithm that we named Peptide Tag Assembler (pTA) was developed for extraction of confident peptide tags from the PEAKS de novo output and their assembly into consensus contigs, based on the de Bruijn graph approach. Here, we refer to “peptide tags” as high confidence sections of de novo sequenced peptides. Fig. 
2 contains a detailed description of the pTA logic and method of action. Starting with a single seed sequence, the algorithm extends the ends of the growing contig with the most likely residues (in terms of occurrences and confidence scores) at the next positions, as evidenced in the PEAKS de novo output. After initial contig assembly using all unique peptide tags as seeds, pTA performs several refinement steps, including merging of similar contigs into consensus sequences, and uniting these merged contigs into longer contigs if sufficient overlap exists between them. pTA outputs several files summarizing the analysis results at all stages of analysis, including an html report (supplemental Fig. S4). Certain free amino acids have previously been reported to be modified during acid hydrolysis (14. Fountoulakis M. Lahm H.W. Hydrolysis and amino acid composition of proteins. J. Chromatogr. A. 1998; 826: 109-134; 15. Inglis A.S. Nicholls P.W. Roxburgh C.M. Hydrolysis of the peptide bond and amino acid modification with hydriodic acid. Aust. J. Biol. Sci. 1971; 24: 1235-1240). By examining MS/MS spectra of partially correct de novo assignments of known peptides, we discovered that some of these modifications also result from MAAH in the context of a peptide chain, in addition to unreported modifications (Table I), and they were considered in the data analysis.
Table I. Observed amino acid modifications resulting from microwave-assisted acid hydrolysis sample preparation
MAAH-modified residue | Modification | Mass shift (Da) | Equivalent mass residue
Asn | Deamidation | +0.98402 | Asp
Gln | Deamidation | +0.98402 | Glu
Arg | Citrullination | +0.98402 | -
Cys | Carboxymethylation (a) | +58.00548 | -
(a) When IAA is used as the alkylating agent, the majority of cysteines are carboxymethylated and the rest are carbamidomethylated. When IAc is used, only carboxymethylation of cysteines occurs and can thus be considered as a fixed modification for de novo sequencing.
Glutamine and asparagine deamidation into glutamic acid and aspartic acid, respectively, is very common during MAAH. In some cases, little or no evidence for the original Gln or Asn residues remains in the hydrolysate for specific residue positions, especially for poorly covered regions along the protein sequence. In such cases, Glu or Asp residues are selected during the assembly at these specific positions. At positions where the decision is not conclusive regarding the identity of the residue (e.g. potential sequence variants, deamidated Gln/Glu, deamidated Asn/Asp), both options and their coverage are presented at the top panel of the pTA html report (“sequence with potential ambiguities/variants,” supplemental Fig. S4). The final sequence decided upon is presented at the middle panel of the report (“final consensus sequence,” supplemental Fig. S4), where selected residues are color-coded for confidence in assignment. The isobaric leucine and isoleucine cannot be differentiated in the MS/MS spectra, and thus pTA-reported Leu at all relevant positions is regarded as “Leu or Ile.” Finally, pTA reports the sequence coverage at every position along the consensus sequence (i.e. number of peptide tags covering this position) (“coverage graph”, bottom panel, supplemental Fig. S4). To benchmark DiPS, it was applied to samples containing BSA (583 amino acids), equine myoglobin (153 amino acids), or bovine fetuin-A (342 amino acids) in triplicate. These proteins were chosen for their diversity in size and structure. 
The single resulting contig for each experiment matched the respective known protein sequence with 99–100% accuracy, covering 100% of the sequence (after processing of the N-terminal methionine, signal peptide, and propeptide where relevant) (Fig. 3). The only sequencing mistakes were the result of a swap of two residues (e.g. “Asp-Pro” instead of “Pro-Asp”) or the result of Ile:Leu, deamidated Gln:Glu, and deamidated Asn:Asp ambiguities, all of which are identified as potential ambiguities by pTA. We show that incorporating peptide tags from an additional single analysis of a trypsin digest of the benchmarking protein into pTA correctly resolved most ambiguities (supplemental Fig. S5). An emerging therapeutic strategy in onco-immunology is the use of antibodies to control tumor growth or to eradicate cancer altogether using immunotherapy. We sought to demonstrate the utility of DiPS for therapeutic antibody research. Amphiregulin is a member of the epidermal growth factor (EGF) family and has been targeted by inhibitory monoclonal antibodies (11; 16. Lindzen M. Carvalho S. Starr A. Ben-Chetrit N. Pradeep C.R. Köstler W.J. Rabinkov A. Lavi S. Bacus S.S. Yarden Y. A recombinant decoy comprising EGFR and ErbB-4 inhibits tumor growth and metastasis. Oncogene. 2012; 31: 3505-3515; 17. Ferraro D.A. Gaborit N. Maron R. Cohen-Dvashi H. Porat Z. Pareja F. Lavi S. Lindzen M. Ben-Chetrit N. Sela M. Yarden Y. Inhibition of triple-negative breast cancer models by combinations of antibodies to EGFR. Proc. Natl. Acad. Sci. U.S.A. 2013; 110: 1815-1820). 
One of these, AR37, was selected from amphiregulin knock-out mice and shown to retard growth of human tumor cells in vitro and in vivo (data not shown). It was chosen as a test case for DiPS because cDNA amplification and sequencing failed to produce a product when a standard primer mix targeted at the conserved regions flanking the variable regions was used. For deeper coverage and improved resolution of potential ambiguities, peptide tags from two experiments of a single MAAH preparation (one LC-MS/MS experiment followed by another with an exclusion list containing confidently de novo sequenced peptides from the first experiment) and one experiment of a tryptic digest were included in the analysis. In an attempt to improve de novo sequencing of peptides spanning disulfide bonds (18. Samgina T.Y. Vorontsov E.A. Gorshkov V.A. Artemenko K.A. Nifant'ev I.E. Kanawati B. Schmitt-Kopplin P. Zubarev R.A. Lebedev A.T. Novel cysteine tags for the sequencing of non-tryptic disulfide peptides of anurans: ESI-MS study of fragmentation efficiency. J. Am. Soc. Mass Spectrom. 2011; 22: 2246-2255), which were expected to be crucial in antibody sequencing, IAc was used as the alkylating agent instead of IAA during sample preparation. Cysteine alkylation by IAc results in carboxymethylation (+58.00548 Da) and thus also has the added benefit over IAA alkylation of resolving ambiguities that are the result of the isobaric glycine and carbamidomethyl (+57.02146 Da). In our hands IAA performed better than IAc alkylation in terms of peptide identifications in typical bottom-up workflows, but for DiPS of disulfide-bound proteins, alkylation with IAc should be considered. Cyclization of N-terminal glutamine to pyroglutamate is a common modification of recombinant monoclonal antibodies (19. Liu H. Ponniah G. Zhang H.M. Nowak C. Neill A. Gonzalez-Lopez N. Patel R. Cheng G. Kita A.Z. Andrien B. In vitro and in vivo modifications of recombinant and human IgG antibodies. MAbs. 2014; 6: 1145-1154) and was therefore included as a variable modification for the PEAKS de novo peptide-sequencing analysis. 
When applicable, pTA also reports the number of occurrences of pyroglutamate, glutamine, and glutamic acid at the relevant position. DiPS analysis resulted in three contigs. Because no prior knowledge of the protein in the sample was assumed, an initial characterization of the three contigs was performed by subjecting them to a BLAST search. Contig1 was 215 residues long. A BLAST search revealed near-perfect identity of contig1 to G0YP42 mouse anti-human langerin 2G3 λ-type light chain (supplemental Fig. S6). The first residue of contig1 aligns to the first G0YP42 residue, after cleavage of the predicted signal peptide. The constant region of G0YP42 is matched perfectly by contig1 (with one deamidated-Q/E ambiguity), and the variable region differs in only five positions. Interestingly, the majority of glutamine residues at position 1 of contig" @default.
- W2600587704 created "2017-04-07" @default.
- W2600587704 creator A5001348816 @default.
- W2600587704 creator A5010496018 @default.
- W2600587704 creator A5011594133 @default.
- W2600587704 creator A5013496664 @default.
- W2600587704 creator A5041215099 @default.
- W2600587704 creator A5051842735 @default.
- W2600587704 creator A5054102757 @default.
- W2600587704 creator A5037284924 @default.
- W2600587704 date "2017-06-01" @default.
- W2600587704 modified "2023-09-24" @default.
- W2600587704 title "Database-independent Protein Sequencing (DiPS) Enables Full-length de Novo Protein and Antibody Sequence Determination" @default.
- W2600587704 cites W1963703398 @default.
- W2600587704 cites W1969175190 @default.
- W2600587704 cites W1982002760 @default.
- W2600587704 cites W1996372164 @default.
- W2600587704 cites W1998829729 @default.
- W2600587704 cites W2018221147 @default.
- W2600587704 cites W2035662171 @default.
- W2600587704 cites W2037765158 @default.
- W2600587704 cites W2046914949 @default.
- W2600587704 cites W2057282070 @default.
- W2600587704 cites W2073649272 @default.
- W2600587704 cites W2087481305 @default.
- W2600587704 cites W2090647436 @default.
- W2600587704 cites W2135581618 @default.
- W2600587704 cites W2137309994 @default.
- W2600587704 cites W2140985419 @default.
- W2600587704 cites W2158088874 @default.
- W2600587704 cites W2162443513 @default.
- W2600587704 cites W2238266328 @default.
- W2600587704 cites W2314833185 @default.
- W2600587704 cites W2422876731 @default.
- W2600587704 cites W2461718750 @default.
- W2600587704 cites W2513487313 @default.
- W2600587704 doi "https://doi.org/10.1074/mcp.o116.065417" @default.
- W2600587704 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/5461544" @default.
- W2600587704 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/28348172" @default.
- W2600587704 hasPublicationYear "2017" @default.
- W2600587704 type Work @default.
- W2600587704 sameAs 2600587704 @default.
- W2600587704 citedByCount "18" @default.
- W2600587704 countsByYear W26005877042018 @default.
- W2600587704 countsByYear W26005877042019 @default.
- W2600587704 countsByYear W26005877042020 @default.
- W2600587704 countsByYear W26005877042021 @default.
- W2600587704 countsByYear W26005877042022 @default.
- W2600587704 countsByYear W26005877042023 @default.
- W2600587704 crossrefType "journal-article" @default.
- W2600587704 hasAuthorship W2600587704A5001348816 @default.
- W2600587704 hasAuthorship W2600587704A5010496018 @default.
- W2600587704 hasAuthorship W2600587704A5011594133 @default.
- W2600587704 hasAuthorship W2600587704A5013496664 @default.
- W2600587704 hasAuthorship W2600587704A5037284924 @default.
- W2600587704 hasAuthorship W2600587704A5041215099 @default.
- W2600587704 hasAuthorship W2600587704A5051842735 @default.
- W2600587704 hasAuthorship W2600587704A5054102757 @default.
- W2600587704 hasBestOaLocation W26005877041 @default.
- W2600587704 hasConcept C10010492 @default.
- W2600587704 hasConcept C104317684 @default.
- W2600587704 hasConcept C167625842 @default.
- W2600587704 hasConcept C2778112365 @default.
- W2600587704 hasConcept C41008148 @default.
- W2600587704 hasConcept C41584329 @default.
- W2600587704 hasConcept C54355233 @default.
- W2600587704 hasConcept C70721500 @default.
- W2600587704 hasConcept C77088390 @default.
- W2600587704 hasConcept C86803240 @default.
- W2600587704 hasConceptScore W2600587704C10010492 @default.
- W2600587704 hasConceptScore W2600587704C104317684 @default.
- W2600587704 hasConceptScore W2600587704C167625842 @default.
- W2600587704 hasConceptScore W2600587704C2778112365 @default.
- W2600587704 hasConceptScore W2600587704C41008148 @default.
- W2600587704 hasConceptScore W2600587704C41584329 @default.
- W2600587704 hasConceptScore W2600587704C54355233 @default.
- W2600587704 hasConceptScore W2600587704C70721500 @default.
- W2600587704 hasConceptScore W2600587704C77088390 @default.
- W2600587704 hasConceptScore W2600587704C86803240 @default.
- W2600587704 hasIssue "6" @default.
- W2600587704 hasLocation W26005877041 @default.
- W2600587704 hasLocation W26005877042 @default.
- W2600587704 hasLocation W26005877043 @default.
- W2600587704 hasOpenAccess W2600587704 @default.
- W2600587704 hasPrimaryLocation W26005877041 @default.
- W2600587704 hasRelatedWork W1617298830 @default.
- W2600587704 hasRelatedWork W2012198878 @default.
- W2600587704 hasRelatedWork W2022667872 @default.
- W2600587704 hasRelatedWork W2094959732 @default.
- W2600587704 hasRelatedWork W2137789374 @default.
- W2600587704 hasRelatedWork W2510305690 @default.
- W2600587704 hasRelatedWork W2805252915 @default.
- W2600587704 hasRelatedWork W2948694765 @default.
- W2600587704 hasRelatedWork W3136663431 @default.
- W2600587704 hasRelatedWork W3099095423 @default.
- W2600587704 hasVolume "16" @default.
- W2600587704 isParatext "false" @default.
- W2600587704 isRetracted "false" @default.