Genotype to Phenotype

  1. 312 pages
  2. English

About This Book

This new edition builds on the success of the first by reviewing the increased understanding of the mechanisms of gene action in humans, focusing particularly on those derived from the study of genetic diseases. It deals mainly with the fundamental aspects of gene arrangement and expression rather than mutation. As well as updating and revising material from the first edition, it covers methods of exploring gene function and contains a range of chapters on specific systems which raise issues of special interest such as imprinting or homologous genes within clusters.

Genotype to Phenotype, by J. J. Goodship and S. Malcolm, is available in PDF and/or ePUB format and is catalogued under Biological Sciences and Genetics & Genomics.

Information

Year: 2003
ISBN: 9781135322922
Edition: 1
1 Genotype to phenotype: interpretation of the Human Genome Project
Sue Malcolm
The genome is done: you have seen it in the papers and heard it on the news. The year 2001 marked a milestone in our understanding of the human genome with the publication of two papers (International Human Genome Sequencing Consortium, 2001; Venter et al., 2001) as the culmination of a multi-million pound project. It is a giant step to move from there to the understanding of phenotype, and in particular how the single sequence which has been published can account for the wide range of phenotypic variation in appearance and susceptibility to disease actually found in humans.
1. The genome project
1.1 The sequence
The first two complete chromosomes to be sequenced were chromosome 22 in 1999 (Dunham et al., 1999) and chromosome 21 in 2000 (Hattori et al., 2000). Each contains about 33 million base pairs. The published sequence does not correspond to any one individual, because the bacterial artificial chromosome (BAC), P1-derived artificial chromosome (PAC), cosmid and fosmid libraries used came from multiple donor sources. The opportunity to donate DNA to the project was broadly advertised near the participating laboratories and volunteers of diverse backgrounds were accepted on a first-come, first-taken basis. Elaborate steps were taken to remove any means of linking clones or sequences to a particular donor. Analysis of overlapping sequences on chromosome 21 revealed multiple nucleotide variations and small deletions and insertions, leading to an estimate of an average of one sequence difference for each 787 base pairs. Only a very small proportion of each chromosome remained unsequenced. Interestingly, one of these areas on chromosome 22, in the proximal region of the long arm, corresponded to long, low copy number repeats which contribute to the instability associated with the DiGeorge or velocardiofacial critical region (Edelmann et al., 1999) described in Chapter 9.
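To give a feel for the scale of this variation, the short sketch below turns the two figures quoted above into an expected number of differences along a chromosome of this size. It assumes, purely for illustration, that differences are spread evenly along the chromosome, which the text does not claim.

```python
# Figures quoted in the text for chromosome 21.
CHROMOSOME_LENGTH_BP = 33_000_000  # "33 million base pairs"
BP_PER_DIFFERENCE = 787            # one sequence difference per 787 bp


def expected_differences(length_bp: int, bp_per_difference: int) -> float:
    """Expected number of differences over length_bp.

    Assumes a uniform rate along the chromosome (an illustrative
    simplification, not a statement from the text).
    """
    return length_bp / bp_per_difference


def differences_per_kb(bp_per_difference: int) -> float:
    """The same rate expressed per 1000 base pairs."""
    return 1000 / bp_per_difference


if __name__ == "__main__":
    total = expected_differences(CHROMOSOME_LENGTH_BP, BP_PER_DIFFERENCE)
    print(f"about {total:,.0f} differences over the chromosome")
    print(f"about {differences_per_kb(BP_PER_DIFFERENCE):.2f} differences per kb")
```

On these assumptions, two overlapping copies of the chromosome would be expected to differ at roughly 42 000 positions.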
1.2 Finding the genes
Methods were developed for annotating the raw sequence and for searching for functional genes. A prediction of 225 genes was made for chromosome 21 (127 known genes and 98 predicted) and 545 genes for chromosome 22. Based on these figures a total gene count of between 30 000 and 40 000 was extrapolated for the whole genome. This is broadly in line with two other experimental determinations (Roest Crollius et al., 2000; Ewing and Green, 2000) but considerably lower than the databases held by several biotech companies suggested (Liang et al., 2000). The reason for the discrepancy is unknown, but may be connected with variable splice forms which have been registered more than once. Gene counts of around 30 000 were confirmed in the two papers analyzing the whole genome (International Human Genome Sequencing Consortium, 2001; Venter et al., 2001). This is surprisingly few: only about twice as many as in the worm and fly.
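The text does not say how the extrapolation was carried out. As a rough plausibility check only, the sketch below scales the combined gene density of the two chromosomes up to the whole genome. The genome size of 3000 million base pairs is a round figure assumed here, not taken from the text, and the simple proportional method ignores the fact that gene density varies between chromosomes.

```python
# Figures quoted in the text.
PREDICTED_GENES = {"chr21": 225, "chr22": 545}
SEQUENCED_BP = {"chr21": 33_000_000, "chr22": 33_000_000}

# Assumption (not from the text): a round figure for the size of the genome.
ASSUMED_GENOME_BP = 3_000_000_000


def extrapolate_gene_count(genes: dict, sequenced_bp: dict, genome_bp: int) -> float:
    """Naive proportional estimate: pooled genes per base pair times genome size."""
    density = sum(genes.values()) / sum(sequenced_bp.values())
    return density * genome_bp


if __name__ == "__main__":
    estimate = extrapolate_gene_count(PREDICTED_GENES, SEQUENCED_BP, ASSUMED_GENOME_BP)
    print(f"naive whole-genome estimate: about {estimate:,.0f} genes")
```

Even this crude calculation lands inside the quoted range of 30 000 to 40 000, largely because it pools the relatively gene-poor chromosome 21 with the much denser chromosome 22.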
Both chromosome communities developed broadly similar hierarchical approaches to identifying genes. Basically, they combined gene prediction programs such as GENSCAN and GRAIL with sequence similarity searches using the BLAST suite of programs. The presence of CpG islands provided useful additional evidence. At the top came known human genes from the literature or public databases. Second came novel genes which could be correlated with known cDNAs or open reading frames from any organism. This category identified new members of human gene families as well as human homologs or orthologs of genes from yeast, Caenorhabditis elegans, Drosophila, mouse, etc. After that the position became murkier, with the next category containing novel genes which, in part, corresponded to a known protein domain such as a zinc finger. Finally, there was a class of novel anonymous genes defined solely by gene predictions that met some quality guidelines, for example strongly predicted exons, or adjacent predicted exons also found spliced together in Expressed Sequence Tags (ESTs). Both studies also found a surprisingly large number of pseudogenes (59 on chromosome 21 and 134 on chromosome 22), corresponding to 20% of the total. This should provide a cautionary note for amateur readers of the sequence.
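The hierarchy described above can be sketched as a simple decision rule in which each candidate is assigned to the highest category its evidence supports. The class and field names below are hypothetical labels invented for this illustration. They do not correspond to the annotation pipelines actually used, which combined program output with manual curation.

```python
from dataclasses import dataclass
from enum import IntEnum


class Category(IntEnum):
    """Evidence tiers, strongest first (names are illustrative)."""
    KNOWN_HUMAN_GENE = 1         # in the literature or public databases
    MATCHES_CDNA_OR_ORF = 2      # similar to a known cDNA/ORF from any organism
    MATCHES_PROTEIN_DOMAIN = 3   # partial match to a known domain, e.g. zinc finger
    PREDICTION_ONLY = 4          # gene prediction meeting quality guidelines
    UNSUPPORTED = 5              # nothing above applies


@dataclass
class Evidence:
    """Hypothetical summary of the evidence gathered for one candidate gene."""
    known_human_gene: bool = False
    cdna_or_orf_match: bool = False
    protein_domain_match: bool = False
    passes_prediction_guidelines: bool = False


def classify(evidence: Evidence) -> Category:
    """Return the highest-ranking category supported by the evidence."""
    if evidence.known_human_gene:
        return Category.KNOWN_HUMAN_GENE
    if evidence.cdna_or_orf_match:
        return Category.MATCHES_CDNA_OR_ORF
    if evidence.protein_domain_match:
        return Category.MATCHES_PROTEIN_DOMAIN
    if evidence.passes_prediction_guidelines:
        return Category.PREDICTION_ONLY
    return Category.UNSUPPORTED


if __name__ == "__main__":
    candidate = Evidence(protein_domain_match=True, passes_prediction_guidelines=True)
    print(classify(candidate).name)
```

Checking the tiers in order means that a candidate with several kinds of support is always reported under the strongest one.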
To what extent does the definitive sequence provide the definitive catalog of all genes? Clearly extensive experimental studies will have to be carried out to confirm the nature of the predicted sequences, but there are frequent examples where experimental exploration of biological function leads to the discovery of an increasingly complex array of products from a single locus, and it is sometimes purely a matter of semantics as to what should be defined as a/the gene. When a gene has the same coding region but has tissue specific splicing of 5′ non-coding exons from alternative promoters (Suter et al., 1994), perhaps two genes could be defined, each named for its tissue specificity. A particularly complex example, the GNAS1 locus, is described in detail in Chapter 8. A clinically important example for our understanding of cancer is the set of tumor suppressor genes found in chromosomal region 9p21, which is the site of a major locus for predisposition to melanoma. Three genes have been identified in the region: CDKN2A which encodes the p16 protein, CDKN2B which encodes the p15 protein, and also p14ARF which is encoded by an alternative exon (1β) about 12 kb upstream of CDKN2A. Exon 1β is spliced onto exon 2 of CDKN2A, leading to an alternate reading frame (ARF) and no sequence homology to CDKN2A at the amino acid level (Randerson-Moor et al., 2001; Stone et al., 1995). Only half the melanoma families linked to 9p21–22 have detectable mutations in the CDKN2A gene, but it appears to encode the critical protein for tumor suppression of melanoma. A family has now been reported who have a phenotype of melanoma plus neural system tumors in which only exon 1β is deleted. This will help to unravel the relative roles of p14ARF and p16.
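The way a shared exon can encode two unrelated peptides is easy to demonstrate with a toy example. In the sketch below the three sequences are invented for illustration and are not the real CDKN2A exons. All that matters is that the two alternative first exons differ in length by one nucleotide, so the shared second exon is read in a different frame in each transcript. Translation uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence from its first base, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Invented toy sequences (NOT the real exons).
FIRST_EXON_A = "ATGGAG"             # 6 nt: leaves the shared exon in frame
FIRST_EXON_B = "ATGGGTC"            # 7 nt: shifts the shared exon by one base
SHARED_EXON_2 = "GTCCTGAGGCACGAA"

if __name__ == "__main__":
    peptide_a = translate(FIRST_EXON_A + SHARED_EXON_2)
    peptide_b = translate(FIRST_EXON_B + SHARED_EXON_2)
    print(peptide_a)  # MEVLRHE
    print(peptide_b)  # MGRPEAR
```

Apart from the initiating methionine, the two toy peptides have nothing in common, even though most of their coding DNA is identical.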
Many of the most complex examples, perhaps only because they are also the most intensively investigated examples, arise from regions where genomic imprinting occurs. The small nuclear ribonucleoprotein N (SNRPN) gene was originally defined because of its role in splicing where spliceosomes contain small RNAs and associated polypeptides. The gene maps within the imprinted region on chromosome 15q11–13 implicated in the two neurodevelopmental disorders Prader-Willi syndrome and Angelman syndrome and is itself only expressed from the paternal chromosome. There are multiple alternatively spliced exons both 5′ and 3′ to the coding region of the gene (Buiting et al., 1997; Dittrich et al., 1996; Farber et al., 1999) none of which alter the coding capability. Small deletions within the 5′ region have been found in Angelman syndrome and Prader-Willi syndrome patients who are unable to switch their imprint between generations (as shown by the methylation pattern; Buiting et al., 1995; Dittrich et al., 1996) leading to the definition of an imprinting center. These fall into two close but non-overlapping clusters: those found in Angelman syndrome which stop the paternal to maternal switch (Buiting et al., 1999) and those found in Prader-Willi syndrome which stop a maternal to paternal switch (Saitoh et al., 1996). As there is to date no apparent connection between the spliceosome and the setting of a methylation imprint it is unclear how many ‘genes’ are coded for at the SNRPN locus.
Further glimpses into the complexity arising when sets of genes are regulated in a coordinated fashion, as they are at most imprinted loci, come from the discovery of spliced RNAs with no coding potential such as Imprinted in Prader-Willi (IPW) and H19 on chromosome 11p15 (Falls et al., 1999).
1.3 Will the real DNA sequence please stand up
All humans, except monozygotic twins, have different DNA sequences and even monozygotic twins will have differences in somatic tissues, mitochondrial DNA and at certain loci such as the immunoglobulin genes. The published human sequence will be an average of all these because of the methods used in its decipherment, and we are unlikely ever to know whether this hypothetical individual would have had straight blonde hair and blue eyes, hypertension or would have developed cancer. In order to answer such questions, large scale systematic efforts at defining the differences between individuals have been undertaken. These have concentrated particularly on single nucleotide changes or polymorphisms (SNPs) as these are most likely to have a functional role. A non-profit SNP consortium was set up between 10 pharmaceutical companies, academic centers and the Wellcome Trust to identify and map a large number of SNPs and make these freely available (snp.cshl.org). SNPs may occur in coding regions or intergenic DNA and may or may not lead to changes in amino acids. The significance of these changes and their frequency is discussed in Chapter 3.
2. Mutation vs. polymorphism
2.1 What is the difference?
As will already be clear, any substantial stretch of an individual’s DNA which is sequenced is likely to contain differences, mainly heterozygous, from the ‘standard’ sequence. This variation certainly establishes individuality, but methods are still only poorly developed to establish which changes are responsible for pathological changes found in the individual. Traditionally, evidence has been derived from genetic methods and functional methods. The best genetic evidence arises when a de novo mutation in a gene is found in a sporadic case of a genetic disorder. This can be very powerful evidence for dominant disorders but it will not be possible to observe in recessive disorders where both parents will be carriers. Several mutations have been shown to occur at a sufficiently high level in certain populations that 100 so-called ‘normal’ controls are likely to contain several individuals carrying the change. The gap junction protein Connexin 26 (CX26) provides a good example. Deletion of a guanine residue at cDNA position 35 (35delG) causes a frameshift of the coding sequence leading to premature chain termination at the 12th amino acid and has been found in numerous cases with recessive nonsyndromic deafness. Estivill et al. (1998) found mutations in the CX26 gene in 49% of participants from Italy and Spain with a family history of recessive deafness and 37% of sporadic cases. The 35delG mutation accounted for 85% of CX26 mutations. The carrier frequency of the 35delG mutation in the general population was 1 in 31 (95% CI, 1 in 19 to 1 in 87).
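The point about control panels can be checked with simple arithmetic. The sketch below takes the carrier frequency quoted above and asks how many carriers would be expected among 100 controls, and how likely it is that such a panel contains at least one. It assumes the controls are unrelated individuals drawn at random from a population with that carrier frequency, which is an idealization.

```python
# Carrier frequencies quoted in the text for 35delG (point estimate and 95% CI bounds).
POINT_ESTIMATE = 1 / 31
CI_BOUNDS = (1 / 87, 1 / 19)

N_CONTROLS = 100


def expected_carriers(n_controls: int, carrier_freq: float) -> float:
    """Mean number of carriers in a random panel of n_controls individuals."""
    return n_controls * carrier_freq


def prob_at_least_one_carrier(n_controls: int, carrier_freq: float) -> float:
    """Chance the panel contains one or more carriers (independent sampling assumed)."""
    return 1 - (1 - carrier_freq) ** n_controls


if __name__ == "__main__":
    for label, freq in [("lower bound", CI_BOUNDS[0]),
                        ("point estimate", POINT_ESTIMATE),
                        ("upper bound", CI_BOUNDS[1])]:
        mean = expected_carriers(N_CONTROLS, freq)
        p_any = prob_at_least_one_carrier(N_CONTROLS, freq)
        print(f"{label}: {mean:.1f} carriers expected, P(at least one) = {p_any:.2f}")
```

At a carrier frequency of 1 in 31, a panel of 100 controls is expected to include about three carriers, so finding a change in controls does not by itself show that it is harmless.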
Another example comes from our knowledge of hemochromatosis, for which the mutant gene is probably the most common in the Caucasian population. Iron overload in affected homozygous individuals can lead to liver cirrhosis and primary hepatocellular carcinoma. The faulty hemochromatosis gene (HFE) was isolated in 1996 (Feder et al., 1996). The major evidence for its being the causative gene was that the majority of clinically confirmed cases are homozygous for a point mutation in HFE (the substitution of a tyrosine for cysteine at position 282, or C282Y). Both the frequency and the non-conservative nature of the amino acid change suggest this is a causative mutation. However, even in such an apparently clear-cut case there are complications. Several reports have shown that a minority of individuals homozygous for C282Y do not fit normal diagnostic criteria (Tavill, 1999). Because the disease is fairly easily treatable, HFE testing has been proposed as suitable for population screening, but the reduced penetrance complicates this. A second sequence variant (H63D) has been found. On its own it is associated with only a slight increase in risk of disease, if any, but as a compound heterozygote together with C282Y it was found in 8/178 of the original cohort and its overall population frequency is 16.6%.
2.2 Changes involving splicing
Nucleotide changes within and around splice donor and acceptor sites have long been recognized as a major source of mutation in humans. More recently it has been suggested that some mutations within coding regions may in fact be exercising their effect through splicing by disrupting Exonic Splice Enhancers (ESEs; Blencowe, 2000; Cooper and Mattox, 1997). Accurate splicing is a complicated business requiring an array of small nuclear ribonucleoproteins and other factors which are components of the spliceosome. ESEs are present in constitutive and alternatively spliced exons and are required for efficient splicing of those exons. The ESEs in pre-mRNAs are recognized by serine/arginine rich (SR) proteins, a family of essential splicing factors that also regulate alternative splicing. ESEs contain a wide spectrum of sequences of approximately 6–8 nucleotides, but they are hard to detect because of their degeneracy. When missense mutations are identified in genomic DNA, particularly in a diagnostic laboratory, the usual interpretation is that the affected amino acid is crucial for the function of the protein if the change is non-conservative or the residue changed is highly conserved in evolution. However, if thes...

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Table of Contents
  6. Contributors
  7. Abbreviations
  8. Preface
  9. 1 Genotype to phenotype: interpretations of the human genome project
  10. 2 From protein sequence to structure and function
  11. 3 Genes in population
  12. 4 Gene-environment interaction: lipoprotein lipase and smoking and risk of CAD and the ACE and exercise-induced left ventricular hypertrophy as examples
  13. 5 Pharmacogenomics
  14. 6 Mitochondrial genetics
  15. 7 Identification of disease susceptibility genes (modifier) in mouse models: cancer and infectious disease
  16. 8 The GNAS1 gene
  17. 9 Genomic disorders
  18. 10 Genotype to phenotype in the spinocerebellar ataxias
  19. 11 Disorders of cholesterol biosynthesis
  20. 12 Mutations in the human HOX genes
  21. 13 PITX2 gene in development
  22. 14 The hedgehog pathway and developmental disorders
  23. 15 X-linked immunodeficiencies
  24. 16 The ubiquitin-proteasome system and genetic diseases: protein degradation gone awry
  25. Index