Matches in SemOpenAlex for { <https://semopenalex.org/work/W1998359381> ?p ?o ?g. }
- W1998359381 endingPage "16883" @default.
- W1998359381 startingPage "16873" @default.
- W1998359381 abstract "MUC5B, mapped clustered with MUC6, MUC2, and MUC5AC to chromosome 11p15.5, is a human mucin gene of which the genomic organization is being elucidated. We have recently published the sequence and the peptide organization of its huge central exon, 10,713 base pairs (bp) in length. We present here the genomic organization of its 3′ region, which encompasses 10,690 bp. The genomic sequence has been completely determined. The 3′ region of MUC5B is composed of 18 exons ranging in size from 32 to 781 bp, contrasting thus with the very large central exon. The sizes of the 18 introns range from 114 to 1118 bp. Some repetitive sequences were identified in four introns. The peptide deduced from the sequence of the 18 exons consists of an 808-amino acid peptide. This carboxyl-terminal region exhibits extensive sequence similarity to MUC2, MUC5AC, and von Willebrand factor, particularly the number and the positions of the cysteine residues, suggesting that this domain may be derived from a common ancestral gene. The presence in these components of a cystine knot also found in growth factors such as transforming growth factor-β is of particular interest. Moreover, one part of this peptide is identical to the 196-amino acid sequence deduced from the cDNA clone pSM2-1, which codes for a part of the high molecular weight mucin MG1 isolated from human sublingual gland. Considering the expression pattern of MUC5B and the origin of MG1, we can thus conclude that MUC5B encodes MG1. 
Mucus is the layer that covers, protects, and lubricates the luminal surfaces of epithelial respiratory, gastrointestinal, and reproductive tracts. These basic properties are due to the viscous and viscoelastic properties of mucins, the major glycoprotein components of mucus. Mucins constitute a family of high molecular mass glycoproteins synthesized by the goblet cells of the epithelia and in some cases by submucosal glands (for more complete reviews, see Refs. 1 Gendler S.J. Spicer A.P. Annu. Rev. Physiol. 1995; 57: 607-634, 2 Bansil R. Stanley E. LaMont J.T. Annu. Rev. Physiol. 1995; 57: 635-657, 3 Forstner G. Annu. Rev. Physiol. 1995; 57: 585-605). 
Alterations of the biosynthesis of mucins affecting the protein core and/or the carbohydrate content linked to the peptide have been observed in numerous pathological situations such as various adenomas and carcinomas, inflammatory diseases such as cystic fibrosis, asthma, chronic bronchitis, or inflammatory bowel diseases (4 Verma M. Cancer Biochem. Biophys. 1994; 14: 151-162, 5 Ho S.B. Roberton A.M. Shekels L.L. Lyftogt C.T. Niehans G.A. Toribara N.W. Gastroenterology. 1995; 109: 735-747, 6 Buisine M.P. Janin A. Maunoury V. Audié J.P. Delescaut M.P. Copin M.C. Colombel J.F. Degand P. Aubert J.P. Porchet N. Gastroenterology. 1996; 110: 84-91, 7 Kaliner M. Shelhamer J.H. Borson B. Nadel J. Patow C. Marow Z. Am. Rev. Respir. Dis. 1986; 134: 612-621). Moreover, the hypersecretion of mucins and the presence of alternating hydrophobic and hydrophilic domains in mucins have been shown to play a central role in the pathogenesis of cholesterol gallstones (8 Lee S.P. LaMont J.T. Carey M.C. J. Clin. Invest. 1981; 67: 1712-1723, 9 Klinkskpoor J.H. Tytgat G.N.J. Groen A.K. Eur. J. Gastroenterol. Hepatol. 1993; 5: 226-234). All apomucins contain tandemly repeated sequences rich in threonine and/or serine. Due to the high carbohydrate content, the peptide moiety of mucins has been difficult to characterize. cDNA cloning has enabled researchers to approach the study of the mucins over the past decade. Today, the membrane-associated mucin MUC1 and the secreted MUC7 are the only mucins for which the full-length cDNA and the genomic organization have been reported (10 Lan M.S. Batra S.K. Qi W.N. Metzgar R.S. Hollingsworth M.A. J. Biol. Chem. 1990; 265: 15294-15299, 11 Lancaster C.A. Peat N. Duhig T. Wilson D. Taylor-Papadimitriou J. Gendler S.J. Biochem. Biophys. Res. Commun. 1990; 173: 1019-1029, 12 Bobek L.A. Tsai H. Biesbrock A.R. Levine M.J. J. Biol. Chem. 1993; 268: 20563-20569, 13 Bobek L.A. Liu J. Sait S.N.J. Shows T.B. Bobek Y.A. Levine M.J. Genomics. 1996; 31: 277-282). Both were revealed to be, in fact, small mucins. 
A complete cDNA of the large secreted mucin MUC2 (14 Gum J.R. Byrd J.C. Hicks J.W. Toribara N.W. Lamport D.T.A. Kim Y.S. J. Biol. Chem. 1989; 264: 6480-6487, 15 Toribara N.W. Gum J.R. Culhane P.J. Lagace R.E. Hicks J.W. Petersen G.M. Kim Y.S. J. Clin. Invest. 1991; 88: 1005-1013, 16 Gum Jr., J.R. Hicks J.W. Toribara N.W. Rothe E.M. Lagace R.E. Kim Y.S. J. Biol. Chem. 1992; 267: 21375-21383, 17 Gum Jr., J.R. Hicks J.W. Toribara N.W. Siddiki B. Kim Y.S. J. Biol. Chem. 1994; 269: 2440-2446) has been described. Partial cDNAs have been identified for the other human mucin genes that code for secreted mucins: MUC3 (18 Gum J.R. Hicks J.W. Swallow D.M. Lagace R.E. Byrd J.C. Lamport D.T.A. Siddiki B. Kim Y.S. Biochem. Biophys. Res. Commun. 1990; 171: 407-415), MUC4 (19 Porchet N. Nguyen V.C. Dufossé J. Audié J.P. Guyonnet Dupérat Gross M.S. Denis C. Degand P. Bernheim A. Aubert J.P. Biochem. Biophys. Res. Commun. 1991; 175: 414-422), MUC5AC (20 Aubert J.P. Porchet N. Crépin M. Duterque-Coquillaud M. Vergnes G. Mazzuca M. Debuire B. Petitprez D. Degand P. Am. J. Respir. Cell. Mol. Biol. 1991; 5: 178-185, 21 Guyonnet Dupérat V. Audié J.P. Debailleul V. Laine A. Buisine M.P. Galiegue-Zouitina S. Pigny P. Degand P. Aubert J.P. Porchet N. Biochem. J. 1995; 305: 211-219, 22 Meerzaman D. Charles P. Daskal E. Polymeropoulos M.H. Martin B.M. Rose M.C. J. Biol. Chem. 1994; 269: 12932-12939, 23 Lesuffleur T. Roches F. Hill A.S. Lacasa M. Fox M. Swallow D.M. Zweibaum A. Real F.X. J. Biol. Chem. 1995; 270: 13665-13673, 24 Klomp L.W.J. Van Rens L. Strous G.J. Biochem. J. 1995; 308: 831-838), MUC5B (25 Dufossé J. Porchet N. Audié J.P. Guyonnet Dupérat V. Laine A. Van Seuningen I. Marrakchi S. Degand P. Aubert J.P. Biochem. J. 1993; 293: 329-337), and MUC6 (26 Toribara N.W. Roberton A.M. Ho S.B. Kuo W.L. Gum E. Hicks J.W. Gum Jr., J.R. Byrd J.C. Siddiki B. Kim Y.S. J. Biol. Chem. 1993; 268: 5879-5885). 
Four mucin genes are mapped to 11p15.5: MUC5AC, MUC5B, MUC2, and MUC6 (26 Toribara N.W. Roberton A.M. Ho S.B. Kuo W.L. Gum E. Hicks J.W. Gum Jr., J.R. Byrd J.C. Siddiki B. Kim Y.S. J. Biol. Chem. 1993; 268: 5879-5885, 27 Nguyen V.C. Aubert J.P. Gross M.S. Porchet N. Degand P. Frézal J. Hum. Genet. 1990; 86: 167-172, 28 Griffiths B. Matthews D.J. West L. Attwood J. Povey S.M. Swallow D.M. Gum J.R. Kim Y.S. Ann. Hum. Genet. 1990; 54: 277-285). Recently, we have determined that the order of the four clustered 11p15.5 human mucin genes is tel-MUC6/MUC2/MUC5AC/MUC5B-cen (29 Pigny P. Guyonnet Dupérat V. Hill A.S. Pratt W.S. Galiegue-Zouitina S. Collyn D'Hooge M. Laine A. Van Seuningen I. Gum J.R. Kim Y.S. Swallow D.M. Aubert J.P. Porchet N. Genomics. 1996; 38: 340-352). 
We have also established that MUC2, MUC5AC, and MUC5B have a consensus cysteine-rich domain found twice in MUC2 (16 Gum Jr., J.R. Hicks J.W. Toribara N.W. Rothe E.M. Lagace R.E. Kim Y.S. J. Biol. Chem. 1992; 267: 21375-21383), at least four times in MUC5AC (21 Guyonnet Dupérat V. Audié J.P. Debailleul V. Laine A. Buisine M.P. Galiegue-Zouitina S. Pigny P. Degand P. Aubert J.P. Porchet N. Biochem. J. 1995; 305: 211-219, 22 Meerzaman D. Charles P. Daskal E. Polymeropoulos M.H. Martin B.M. Rose M.C. J. Biol. Chem. 1994; 269: 12932-12939, 24 Klomp L.W.J. Van Rens L. Strous G.J. Biochem. J. 1995; 308: 831-838), and seven times in MUC5B (30 Desseyn J.L. Guyonnet Dupérat V. Porchet N. Aubert J.P. Laine A. J. Biol. Chem. 1997; 272: 3168-3178). MUC5B is expressed mainly in bronchus glands and also in submaxillary glands, endocervix, gall bladder, and pancreas (31 Audié J.P. Janin A. Porchet N. Copin M.C. Gosselin B. Aubert J.P. J. Histochem. Cytochem. 1993; 41: 1479-1485, 32 Audié J.P. Tetaert D. Pigny P. Buisine M.P. Janin A. Aubert J.P. Porchet N. Boersma A. Hum. Reprod. 1995; 10: 98-102, 33 Balagué C. Gambus G. Carrato C. Porchet N. Aubert J.P. Kim Y.S. Real F.X. Gastroenterology. 1994; 106: 1056-1061, 34 Campion J.P. Porchet N. Aubert J.P. L'Helgoualc'h A. Clément B. Hepatol. 1995; 21: 223-231, 35 Balagué C. Audié J.P. Porchet N. Real F.X. Gastroenterology. 1995; 109: 953-964). The structural organization of the peptide deduced from the nucleotide sequence of the central region of MUC5B has been published recently (30 Desseyn J.L. Guyonnet Dupérat V. Porchet N. Aubert J.P. Laine A. J. Biol. Chem. 1997; 272: 3168-3178). The single large exon of 10,713 bp, containing all the tandem repeat domain, is, to our knowledge, the biggest described for a vertebrate gene. (The abbreviations used are: bp, base pair(s); aa, amino acid(s); BSM, bovine submaxillary gland mucin-like; CK, cystine knot; FIM, frog integumentary mucin; PCR, polymerase chain reaction; PSM, porcine submaxillary mucin; TGF, transforming growth factor; RACE, rapid amplification of cDNA ends; RT, reverse transcription; vWF, von Willebrand factor; ORF, open reading frame.) It codes for a 3570-amino acid peptide. Nineteen subdomains have been individualized. Most of the MUC5B subdomains show similarity to each other, creating four larger composite super-repeat units of 528 amino acids. Each super-repeat is made up of repeats consisting of an irregular repeat of 29 amino acids, one cysteine-rich subdomain (10 cysteine residues, 108 aa), and one unique sequence of 111 amino acid residues also rich in serine and threonine. 
The complete organization of the region downstream of the central region of the human MUC5B gene, i.e. its complete 3′ region, is reported in this paper; we present here the complete genomic nucleotide sequence, the exon-intron organization, and the full cDNA sequence coding for the carboxyl-terminal domain of the human MUC5B apomucin. This domain spans 808 amino acid residues and can be divided into six subdomains. The last five cysteine-rich subdomains exhibit extensive sequence similarity to MUC2, MUC5AC, and vWF (17 Gum Jr., J.R. Hicks J.W. Toribara N.W. Siddiki B. Kim Y.S. J. Biol. Chem. 1994; 269: 2440-2446, 22 Meerzaman D. Charles P. Daskal E. Polymeropoulos M.H. Martin B.M. Rose M.C. J. Biol. Chem. 1994; 269: 12932-12939, 23 Lesuffleur T. Roches F. Hill A.S. Lacasa M. Fox M. Swallow D.M. Zweibaum A. Real F.X. J. Biol. Chem. 1995; 270: 13665-13673, 36 Mancuso D.J. Tuley E.A. Westfield L.A. Worrall N.K. Shelton-Inloes B.B. Sorace J.M. Alevy Y.G. Sadler J.E. J. Biol. Chem. 1989; 264: 19514-19527), particularly the number and the positions of the cysteine residues, suggesting that this domain may be derived from a common ancestral gene. Moreover, with the exception of one substitution, which does not change the coded amino acid, one part of the cDNA sequence we determined is identical to the nucleotide sequence of pSM2-1. This cDNA codes for 196 amino acids in the carboxyl-terminal region of the high molecular weight mucin MG1 isolated from human sublingual gland (37 Troxler R.F. Offner G.D. Zhang F. Iontcheva I. Oppenheim G.O. Biochem. Biophys. Res. Commun. 1995; 217: 1112-1119). Considering the expression pattern of MUC5B (31 Audié J.P. Janin A. Porchet N. Copin M.C. Gosselin B. Aubert J.P. J. Histochem. Cytochem. 1993; 41: 1479-1485) and the origin of MG1, we can thus conclude that MUC5B encodes MG1. 
A λgt11 cDNA library constructed from human tracheal mucosa was screened with rabbit antibodies raised to deglycosylated Pronase glycopeptides from bronchial mucins (38 Crépin M. Porchet N. Aubert J.P. Degand P. Biorheology. 1991; 27: 471-484). Among the various positive clones obtained, the one designated TH71 and containing a poly(A) tail was of particular interest in the present study. A human genomic λEMBL4 phage library was screened using hybridization with the JER57 probe (25 Dufossé J. Porchet N. Audié J.P. Guyonnet Dupérat V. Laine A. Van Seuningen I. Marrakchi S. Degand P. Aubert J.P. Biochem. J. 1993; 293: 329-337). One positive clone, CEL5, was isolated and studied. 
A human placental genomic DNA library in pWE15 cosmid provided by Stratagene was screened using the JER57 probe. Two positive clones, BEN1 and BEN2, were obtained (30 Desseyn J.L. Guyonnet Dupérat V. Porchet N. Aubert J.P. Laine A. J. Biol. Chem. 1997; 272: 3168-3178). BEN2 was the useful clone in the present study. Oligonucleotide primers used in PCR, RACE-PCR, RT-PCR, and sequencing experiments were synthesized by Eurogentec (Liège, Belgium). Their sequences and locations are indicated in Table I. 
Table I. Primers used for cDNA synthesis and DNA and cDNA sequencing 
Primer designation | Primer sequence (5′ to 3′) | Position | Orientation (a) 
NAU61 | ACTCAATGCTCAGGGTTTATTTGC | 10582–10605 | AS 
NAU67 | GGGTTTATTTGCAAAACTG | 10575–10593 | AS 
NAU71 | AGTGCTGATTGCACACTGCGT | 838–859 | AS 
NAU102 | CCTGTCGCAGCTTCCTGGCAG | 10446–10466 | AS 
NAU106 | CAGTGAGCATAGGGGAAGCCT | 3387–3407 | S 
NAU127 | AGGCTTCCCCTATGCTCACTG | 3387–3407 | AS 
NAU128 | CGTGTCCACTGTGTCCTCCTCAGTC | 1–25 | S 
NAU140 | GATGGCGGAGGGCTGCTTCTG | 5139–5159 | S 
NAU141 | CAGACCGTGTGCACGCAGCAC | 1001–1021 | AS 
NAU142 | CCAGGGTAGGACTCCTGAGTG | 10246–10266 | AS 
NAU151 | TGAGCAGCGGTTTCAGCAAGA | 3168–3188 | S 
NAU152 | CAAGGTTGTGGCACTCAGCAA | 3837–3857 | AS 
NAU196 | CGAGGGTTCAGTGTCGGTG | 6013–6031 | S 
NAU200 | CAGTGTCCTTACCGGGAGA | 2221–2239 | AS 
NAU203 | ATTTAGGAAACCCATCGGGT | 5689–5708 | AS 
NAU207 | CGCGGGGTGCCACACACAGGCC | 10142–10163 | AS 
NAU208 | GGGTGTAGGTGTGCAGGATGG | 9927–9947 | AS 
NAU219 | GCAGGGAAGGGCGCCTGGGAA | 7394–7414 | AS 
NAU226 | AGCGGAAGGTGGGACAGCAGT | 6620–6640 | AS 
NAU227 | ACTGCTGTCCCACCTTCCGCT | 6620–6640 | S 
NAU232 | CTTCCCAGGCGCCCTTCCCTGC | 7393–7414 | S 
NAU233 | CTGCGAGACCGAGGTCAACATC | 9113–9134 | S 
NAU234 | GATGTTGACCTCGGTCTCGCAG | 9113–9134 | AS 
NAU249 | CTCCTCACAGGAGTAGCAGC | 8814–8832 | AS 
NAU277 | CAGTGACTGGCGAGGTGCAACTG | 3973–3995 | S 
NAU278 | GTATGGGGCCGCATGCGTTGTACACT | 4624–4649 | AS 
NAU280 | TGGACAGATGCCCAGGGTTGA | 5901–5921 | S 
NAU281 | TGCCATTGTACGAACACAGCT | 6776–6796 | AS 
NAU282 | CTGCAGGCCCCATTGGGTCAT | 7297–7317 | S 
NAU293 | ATGAGCCGTGGATGGGGTCCC | 1195–1215 | S 
NAU297 | TCATGGTCCTGGGCGGCTCCT | 5277–5297 | AS 
(a) Strand orientation: sense (S), antisense (AS). 
The 5′-AmpliFINDER RACE kit (CLONTECH) was used to synthesize first-strand cDNA from human trachea poly(A)+ RNA (1 μg) obtained from CLONTECH using NAU61 as a primer (Table I), followed by ligation of the 5′-ANCHOR adapter. The PCR was then performed using the nested primer NAU67 (Table I and Fig. 1) and the 5′-ANCHOR primer. Nested PCRs involving a second or third round amplification were carried out with 1 μl of the reaction mixture obtained from each previous round of PCR as template. Total RNA of human gall bladder was extracted as described previously (39 Glisin V. Orkvenjakov R. Byus C. Biochemistry. 1974; 13: 2633-2637). Single-stranded cDNA was synthesized using the 1st STRAND Synthesis kit (CLONTECH), random hexamers, and human trachea poly(A)+ mRNA (0.5 μg) (CLONTECH) or total gall bladder RNA (1 μg). PCR amplification reaction mixtures (50 μl) contained 0.3 mM dNTPs, 2.5 units of Taq DNA polymerase (Boehringer Mannheim), 15 pmol of the appropriate primers, the buffer system purchased with Taq DNA polymerase, and an aliquot of cDNA. The PCR was performed using a Perkin-Elmer Thermal Cycler 480. PCR parameters were 94 °C for 2 min, followed by 30 cycles at 94 °C for 30 s, 60 °C for 1 min, and 72 °C for 2 min, followed by a final extension at 72 °C for 15 min. The amplified products were electrophoresed on a 1% Seaplaque gel (FMC, Rockland, ME) and stained with ethidium bromide. The band was cut out, purified using Preps DNA purification resin (Promega), and subcloned into the T/A cloning vector, pMOSBlue T-vector (Amersham). Thereafter, cDNA clones were subcloned into pBluescript KS(+) vector (Stratagene) using the restriction enzymes (Boehringer Mannheim) PstI, SacI, and/or SmaI. Subclones were sequenced as described below using either universal primers or a series of oligonucleotides specific for both strands of the inserts (Table I). 
Fragments of the genomic clones CEL5 and BEN2 corresponding to the region downstream of the central exon were subcloned into pBluescript KS(+) vector as described previously (30 Desseyn J.L. Guyonnet Dupérat V. Porchet N. Aubert J.P. Laine A. J. Biol. Chem. 1997; 272: 3168-3178). The double-stranded plasmid inserts were sequenced manually using the dideoxynucleotide chain termination method (40 Sanger F. Nicklen S. Coulson A.R. Proc. Natl. Acad. Sci. U. S. A. 1977; 74: 5463-5467) using [α-35S]dATP (Amersham) and Sequenase 2.0 (U. S. Biochemical Corp.) according to the protocol indicated by the manufacturer. Universal primers or a series of specific oligonucleotides were used. Sequencing reaction mixtures were electrophoresed on 6% polyacrylamide gel (Sequagel-6™, National Diagnostics). The clones were sequenced on both strands several times. Direct DNA sequencing on cosmid was performed as described previously (30 Desseyn J.L. Guyonnet Dupérat V. Porchet N. Aubert J.P. Laine A. J. Biol. Chem. 1997; 272: 3168-3178). Computer analyses were performed using PC/GENE Software. The whole genomic sequence reported in this paper has been submitted to the EMBL Data Bank with accession number Y09788. The sequence of TH71 has been submitted to the EMBL Data Bank with accession number Y10080. To determine the exact number of repeats in the intron G, we first cut the subcloned genomic BglII-BglII fragment using SacI and RsaI, which flank the region containing these direct 59-bp repeats. The complete digestion with SmaI was obtained using 10 units/μg DNA for 3 h. The partial digestions were performed using 1 unit/μg DNA and 0.25 unit/μg DNA for 1 h. 
After electrophoretic separation on 1.5% agarose gel, the blot analysis was conducted using the antisense oligonucleotide NAU199 (5′-AGAGCCGAGGGGTCTGGG-3′), which had been previously radiolabeled using T4 polynucleotide kinase (Boehringer Mannheim) and [γ-32P]ATP from Amersham. The partial restriction maps of the genomic clones CEL5 and BEN2 were determined. Their overlapping parts present the same restriction map. The partial restriction map of the 3′ region of MUC5B is shown in Fig. 1 together with the overlapping fragments, which were separated and subcloned into pBluescript KS(+) vector. The fragments BamHI-SacII and PstI-SacII (in the left part of Fig. 1) contain the 3′ end of the central exon. All these clones were entirely sequenced after restriction digestion and subcloning. Primer walking using specific oligonucleotides (Table I) was also performed. Several cDNA-positive clones were obtained by screening the λgt11 cDNA library using antibodies as described previously (38 Crépin M. Porchet N. Aubert J.P. Degand P. Biorheology. 1991; 27: 471-484). The clone designated TH71 is 380 bp in length. Its sequence (Fig. 2), submitted to the EMBL data bank with accession number Y10080, revealed a poly(A) tail with 73 A, 16 bp downstream from a polyadenylation signal (AAUAAA). By sequencing the PstI-PstI subclone (noted with an asterisk in Fig. 1) obtained from the fragment NotI-BglII of the BEN2 clone, an identical 67-bp sequence was observed (Fig. 2), up to the A where the poly(A) addition occurs, indicating that the clone BEN2 contains the 3′ end of the MUC5B gene. Using the two synthesized oligonucleotides NAU61 and NAU67 chosen in this sequence, a 5′-RACE-PCR experiment was performed. After cloning of the fragment obtained, the insert of 88 bp designated RACE67 was sequenced. This sequence is identical to the 88-bp sequence determined in the PstI-PstI clone (Fig. 2). In contrast, the first 34 nucleotides differ from the sequence of TH71. 
The TH71 clone, which has been found using the antibodies directed against the repeat part of the MUC5B apomucin (38 Crépin M. Porchet N. Aubert J.P. Degand P. Biorheology. 1991; 27: 471-484), begins with a 132-bp sequence we found in the central exon. Between this sequence and the 3′ end identical to the RACE67, TH71 seems to have been rearranged; moreover, the following results show that an important part of the cDNA has been lost. We will discuss these data below. The NotI-BglII fragment from BEN2 contains two other clustered canonical polyadenylation signals, AATAAA. The first was located about 2 kilobase pairs downstream from the first polyadenylation signal and the second 298 base pairs downstream from this latter AATAAA. The significance of these two additional polyadenylation signals is not known. It will be interesting to determine if several forms of MUC5B mRNA can be transcribed by selection of alternative polyadenylation signals. The dinucleotides TG and GT were found with oligo(T) stretches in the region downstream from the first AATAAA motif within the PstI-PstI subclone. This region, referred to as “GT cluster,” is important for 3′ processing of polyadenylated mRNAs (41 Birnstiel M.L. Busslinger M. Strub K. Cell. 1985; 41: 349-359). Moreover, the pentanucleotide CATTG was found between the AATAAA sequence and the poly(A) site addition (Fig. 3). This CAYTG recognition element has been described to be related to cleavage site selection by Berget (42 Berget S.M. Nature. 1984; 309: 179-181). The author suggested that pre-polyadenylated RNA hybridized with the AAUAAA recognition element as related to primary site selection, and with CAYUG recognition element within the U4 small nuclear ribonucleoproteins as related to cleavage site selection. Hence, MUC5B combines some common features of the 3′ mRNA processing. 
From this nucleotide sequence, the new oligonucleotide NAU102 was synthesized to perform RT-PCR. 
Figure 3. Sequence of the 3′ region of the MUC5B gene. The sequence shows the entire region beginning at the sequence of the oligonucleotide NAU128 used for the RT-PCR. The sequences of exons are indicated in uppercase letters, and the sequences of introns are indicated in lowercase letters. Encoded amino acids are shown in single-letter code and numbered 1–845 in the right margin. Bold letters on the right indicate the names of the introns. GC boxes are shaded. The polyadenylation signal is bold boxed. GT stretches and the pentanucleotide CATTG that may be involved in processing or polyadenylation are underlined. The position of the signal of poly(A) addition is marked by an arrow. The splice donor and acceptor sites are in bold. 
Two specific overlapping cDNAs were synthesized by RT-PCR experiments. The locations of the oligonucleotides used in these experiments are indicated in Fig. 1. The oligonucleotide primer NAU151 was designed on the basis of the sequence determined for the BglII-BglII cosmid fragment (Fig. 1). 
This fragment hybridized with human tracheal RNA on Northern blot and probably contains coding sequences. An amplification product was obtained when the RT-PCR was performed with the two primers NAU151 and NAU102 using human tracheal first-strand cDNA as template. It was designated RT151-102 and is 2209 nucleotides in length. Another RT-PCR was then performed with the following oligonucleotide primers: NAU152, designed with the sequence of RT151-102, and NAU128, chosen in the 3′ end sequence of the MUC5B central exon (30 Desseyn J.L. Guyonnet Dupérat V. Porchet N. Aubert J.P. Laine A. J. Biol. Chem. 1997; 272: 3168-3178). The resultant 1166-bp amplification product, called RT128-152, and the RT151-102 were cloned into pMOS-Blue T-vector. They were subsequently subcloned into pBluescript KS(+) vector after cutting with the restriction enzymes PstI, SacI, and/or SmaI. The subclones were entirely sequenced on both strands several times using T3 and T7 primers and specific oligonucleotides (see Table I). The two amplification products RT128-152 and RT151-102 have overlapping sequences of 416 nucleotides. The 3′ region of the human MUC5B gene shown in Fig. 3 encompasses 10,690 bp, of which the first 113 nucleotides correspond to the 3′ end of the central exon we recently published (30 Desseyn J.L. Guyonnet Dupérat V. Porchet N. Aubert J.P. Laine A. J. Biol. Chem. 1997; 272: 3168-3178). The full-length sequence has been submitted to the EMBL data bank with accession number Y09788. The 3′ region of the MUC5B gene is composed of 18 exons ranging in size from 32 to 781 bp (Table II), in good agreement with the mean length of exons (43 Hawkins J.D. Nucleic Acids Res. 1988; 16: 9893-9905), in contrast to the extraordinarily large central exon of MUC5B (30 Desseyn J.L. Guyonnet Dupérat V. Porchet N. Aubert J.P. Laine A. J. Biol. Chem. 1997; 272: 3168-3178). The last exon is the largest one. It codes for the 72-amino acid COOH terminus of the core protein and comprises the 3′-untranslated region, 564 bp in lengt" @default.
- W1998359381 created "2016-06-24" @default.
- W1998359381 creator A5010941913 @default.
- W1998359381 creator A5014416641 @default.
- W1998359381 creator A5021414780 @default.
- W1998359381 creator A5035845246 @default.
- W1998359381 creator A5066843507 @default.
- W1998359381 date "1997-07-01" @default.
- W1998359381 modified "2023-10-17" @default.
- W1998359381 title "Genomic Organization of the 3′ Region of the Human Mucin Gene MUC5B" @default.
- W1998359381 cites W144814460 @default.
- W1998359381 cites W1492473590 @default.
- W1998359381 cites W1492833857 @default.
- W1998359381 cites W1533289312 @default.
- W1998359381 cites W1538346208 @default.
- W1998359381 cites W1546825042 @default.
- W1998359381 cites W1548863075 @default.
- W1998359381 cites W1562448272 @default.
- W1998359381 cites W1569084270 @default.
- W1998359381 cites W1581087233 @default.
- W1998359381 cites W1586051439 @default.
- W1998359381 cites W17339586 @default.
- W1998359381 cites W1760310372 @default.
- W1998359381 cites W1893700502 @default.
- W1998359381 cites W1894342464 @default.
- W1998359381 cites W1963550288 @default.
- W1998359381 cites W1965051773 @default.
- W1998359381 cites W1965642419 @default.
- W1998359381 cites W1971909044 @default.
- W1998359381 cites W1973948465 @default.
- W1998359381 cites W1974419017 @default.
- W1998359381 cites W1976616302 @default.
- W1998359381 cites W1980081860 @default.
- W1998359381 cites W2001817785 @default.
- W1998359381 cites W2003284803 @default.
- W1998359381 cites W2007538395 @default.
- W1998359381 cites W2017469581 @default.
- W1998359381 cites W2018043410 @default.
- W1998359381 cites W2022628365 @default.
- W1998359381 cites W2024467920 @default.
- W1998359381 cites W2025890252 @default.
- W1998359381 cites W2026899037 @default.
- W1998359381 cites W2037170193 @default.
- W1998359381 cites W2047693198 @default.
- W1998359381 cites W2048979488 @default.
- W1998359381 cites W2052742602 @default.
- W1998359381 cites W2055441104 @default.
- W1998359381 cites W2059464970 @default.
- W1998359381 cites W2060799184 @default.
- W1998359381 cites W2062079699 @default.
- W1998359381 cites W2062851048 @default.
- W1998359381 cites W2068507644 @default.
- W1998359381 cites W2068870265 @default.
- W1998359381 cites W2076892293 @default.
- W1998359381 cites W2087844645 @default.
- W1998359381 cites W2094803824 @default.
- W1998359381 cites W2124620276 @default.
- W1998359381 cites W2131211068 @default.
- W1998359381 cites W2132932170 @default.
- W1998359381 cites W2138270253 @default.
- W1998359381 cites W2139997043 @default.
- W1998359381 cites W2145120738 @default.
- W1998359381 cites W2148304994 @default.
- W1998359381 cites W2154695589 @default.
- W1998359381 cites W2158401046 @default.
- W1998359381 cites W2160035072 @default.
- W1998359381 cites W2166433018 @default.
- W1998359381 cites W2176052107 @default.
- W1998359381 cites W2194098842 @default.
- W1998359381 cites W2303389737 @default.
- W1998359381 cites W2412888379 @default.
- W1998359381 cites W4247986856 @default.
- W1998359381 doi "https://doi.org/10.1074/jbc.272.27.16873" @default.
- W1998359381 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/9201995" @default.
- W1998359381 hasPublicationYear "1997" @default.
- W1998359381 type Work @default.
- W1998359381 sameAs 1998359381 @default.
- W1998359381 citedByCount "105" @default.
- W1998359381 countsByYear W19983593812012 @default.
- W1998359381 countsByYear W19983593812014 @default.
- W1998359381 countsByYear W19983593812015 @default.
- W1998359381 countsByYear W19983593812016 @default.
- W1998359381 countsByYear W19983593812018 @default.
- W1998359381 countsByYear W19983593812019 @default.
- W1998359381 countsByYear W19983593812021 @default.
- W1998359381 countsByYear W19983593812022 @default.
- W1998359381 crossrefType "journal-article" @default.
- W1998359381 hasAuthorship W1998359381A5010941913 @default.
- W1998359381 hasAuthorship W1998359381A5014416641 @default.
- W1998359381 hasAuthorship W1998359381A5021414780 @default.
- W1998359381 hasAuthorship W1998359381A5035845246 @default.
- W1998359381 hasAuthorship W1998359381A5066843507 @default.
- W1998359381 hasBestOaLocation W19983593811 @default.
- W1998359381 hasConcept C104317684 @default.
- W1998359381 hasConcept C141231307 @default.
- W1998359381 hasConcept C179264091 @default.
- W1998359381 hasConcept C185592680 @default.
- W1998359381 hasConcept C22322919 @default.