Part One
Basic Principles of Human Genetics
CHAPTER 1
DNA Structure and Function
Introduction
The 20th century will likely be remembered by historians of biological science for the discovery of the structure of DNA and the mechanisms by which information coded in DNA is translated into the amino acid sequence of proteins. Although the story of modern human genetics begins about 50 years before the structure of DNA was elucidated, we will start our exploration here. We do so because everything we know about inheritance must now be viewed in the light of the underlying molecular mechanisms. We will see here how the structure of DNA sets the stage both for its replication and for its ability to direct the synthesis of proteins. We will also see that the function of the system is tightly regulated, and how variations in the structure of DNA can alter function. The story of human genetics did not begin with molecular biology, and it will not end there, as knowledge is now being integrated to explain the behavior of complex biological systems. Molecular biology, however, remains a key engine of progress in biological understanding, so it is fitting that we begin our journey here.
Key Points
- DNA consists of a double-helical sugar–phosphate structure with the two strands held together by hydrogen bonding between adenine and thymine or cytosine and guanine bases.
- DNA replication involves local unwinding of the double helix and copying a new strand from the base sequence of each parental strand. Replication proceeds bidirectionally from multiple start sites in the genome.
- DNA is complexed with proteins to form a highly compacted chromatin fiber in the nucleus.
- Genetic information is copied from DNA into messenger RNA (mRNA) in a highly regulated process that involves activation or repression of individual genes.
- mRNA molecules are extensively processed in the nucleus, including removal of introns and splicing together of exons, prior to export to the cytoplasm for translation into protein.
- The base sequence of mRNA is read in triplet codons to direct the assembly of amino acids into protein on ribosomes.
- Some genes are permanently repressed by epigenetic marks such as methylation of cytosine bases. These include most genes on one of the two X chromosomes in female cells and one of the two copies of imprinted genes.
Deoxyribonucleic Acid
Mendel described dominant and recessive inheritance before the concept of the gene was introduced and long before the chemical basis of inheritance was known. Cell biologists during the late 19th and early 20th centuries had established that genetic material resides in the cell nucleus, and DNA was known to be a major chemical constituent. As the chemistry of DNA came to be understood, for a long time it was considered to be too simple a molecule (consisting of just four chemical building blocks, the bases adenine, guanine, thymine, and cytosine, along with sugar and phosphate) to account for the complexity of genetic transmission. Credit for recognition of the role of DNA in inheritance goes to the landmark experiments by Oswald Avery and his colleagues, who demonstrated that a phenotype of smooth or rough colonies of the bacterium Pneumococcus could be transmitted from cell to cell through DNA alone. Elucidation of the structure of DNA by James Watson and Francis Crick in 1953 opened the door to understanding the mechanisms whereby this molecule functions as the agent of inheritance (Sources of Information 1.1).
Sources of Information 1.1 Mendelian Inheritance in Man
Dr. Victor McKusick and his colleagues at Johns Hopkins School of Medicine began to catalog genes and human genetic traits in the 1960s. The first edition of the catalog Mendelian Inheritance in Man was published in 1966. Multiple print editions subsequently appeared, and now the catalog is maintained on the Internet as “Online Mendelian Inheritance in Man” (OMIM), located at www.omim.org.
OMIM is recognized as the authoritative source of information about human genes and genetic traits. The catalog can be searched by gene, phenotype, gene locus, and many other features. Each entry provides a synopsis of the gene or trait, including a summary of clinical features associated with mutations. There are links to other databases, providing access to gene and amino acid sequences, mutations, and so on. Each entry has a unique six-digit number, the MIM number. Autosomal dominant traits have entries beginning with 1, recessive traits with 2, X-linked traits with 3, and mitochondrial traits with 5. Specific genes have MIM numbers that start with 6.
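The numbering scheme just described can be sketched as a small lookup. This is only an illustration of the prefix rule as stated above: the function name `mim_category` is hypothetical, the example numbers are arbitrary placeholders and not real OMIM entries, and prefixes that the text does not mention are reported as such and not guessed.

```python
# Hypothetical helper illustrating the MIM prefix rule described in the text.
# Only the prefixes named in the text are mapped; others are left unclassified.
MIM_PREFIX_CATEGORY = {
    "1": "autosomal dominant trait",
    "2": "recessive trait",
    "3": "X-linked trait",
    "5": "mitochondrial trait",
    "6": "specific gene",
}


def mim_category(mim_number: str) -> str:
    """Return the category implied by the first digit of a six-digit MIM number."""
    digits = mim_number.strip()
    if len(digits) != 6 or not digits.isdigit():
        raise ValueError("a MIM number is expected to be six digits")
    return MIM_PREFIX_CATEGORY.get(digits[0], "prefix not described in the text")


# Arbitrary placeholder numbers, chosen only to exercise each branch.
for example in ("123456", "345678", "654321", "456789"):
    print(example, "->", mim_category(example))
```

A lookup of this kind says nothing about the content of an entry; the entry itself on OMIM remains the source for the gene or trait.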
Throughout this book, genes or genetic traits will be annotated with their corresponding MIM number to remind the reader that more information is available on OMIM and to facilitate access to the site.
DNA Structure
DNA consists of a pair of strands of a sugar–phosphate backbone attached to a set of pyrimidine and purine bases (Figure 1.1). The sugar is deoxyribose, that is, ribose missing an oxygen atom at its 2′ position. Each DNA strand consists of alternating deoxyribose molecules connected by phosphodiester bonds from the 5′ position of one deoxyribose to the 3′ position of the next. The strands are bound together by hydrogen bonds between adenine and thymine bases and between guanine and cytosine bases. Together these strands form a right-handed double helix. The two strands run in opposite (antiparallel) directions, so that one extends from 5′ to 3′ while the other goes from 3′ to 5′.
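The antiparallel arrangement has a practical consequence when sequences are written down: if one strand is read 5′ to 3′, its partner read in the same 5′ to 3′ convention is the complement in reverse order. A minimal sketch follows, assuming sequences are plain strings of the letters A, C, G, and T; the function name `partner_strand_5_to_3` and the example sequence are illustrative only.

```python
# Pairing rules from the text: A pairs with T, and G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand_5_to_3(strand_5_to_3: str) -> str:
    """Return the partner strand, written 5' to 3' like the input."""
    # Pair each base with its complement, position by position.
    paired = [COMPLEMENT[base] for base in strand_5_to_3.upper()]
    # The partner runs in the opposite direction, so reverse the order
    # to read it from its own 5' end.
    return "".join(reversed(paired))


print(partner_strand_5_to_3("ATGC"))  # GCAT
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully determines the other.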
The key feature of DNA, wherein resides its ability to encode information, is in the sequence of the four bases (Methods 1.1). The number of adenine bases (A) always equals the number of thymines (T), and the number of cytosines (C) always equals the number of guanines (G). This is because A on one strand is always paired with T on the other, and C is always paired with G. The pairing is noncovalent, due to hydrogen bonding between complementary bases. G–C base pairs form three hydrogen bonds, whereas A–T pairs form two, making the G–C pairs slightly more thermodynamically stable. Because the pairs always include one purine base (A or G) and one pyrimidine base (C or T), the distance across the helix remains constant.
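The base-count equalities and the hydrogen bond counts can be checked on a toy example. The sketch below assumes a perfectly paired double helix given as one strand of A, C, G, and T; the function names and the five-base sequence are made up for illustration and do not come from any real gene.

```python
from collections import Counter

# Pairing rules and hydrogen bonds per base pair, as stated in the text.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def duplex_base_counts(strand: str) -> Counter:
    """Count each base across both strands of a perfectly paired duplex."""
    partner = "".join(COMPLEMENT[base] for base in strand)
    return Counter(strand + partner)


def hydrogen_bond_count(strand: str) -> int:
    """Total hydrogen bonds holding the duplex together (one pair per base)."""
    return sum(HYDROGEN_BONDS[base] for base in strand)


counts = duplex_base_counts("AAGCT")
print(counts["A"], counts["T"])      # 3 3
print(counts["G"], counts["C"])      # 2 2
print(hydrogen_bond_count("AAGCT"))  # 12
```

The equalities hold for the duplex as a whole, not for either strand alone: the single strand AAGCT contains two A bases but only one T.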
Methods 1.1 Isolation of DNA
DNA, or in some cases RNA, is the starting point for most experiments aimed at studying gene structure or function. DNA can be isolated from any cell that contains a nucleus. The most commonly used tissue for human DNA isolation is peripheral blood, where white blood cells provide a readily accessible source of nucleated cells. Other commonly used tissues include cultured skin fibroblasts, epithelial cells scraped from the inner lining of the cheek, and fetal cells obtained by amniocentesis or chorionic villus biopsy. Peripheral blood lymphocytes can be transformed with Epstein–Barr virus into immortalized cell lines, providing permanent access to growing cells from an individual.
Nuclear DNA is complexed with proteins, which must be removed in order for the DNA to be analyzed. For some experiments it is necessary to obtain highly purified DNA, which involves digestion or removal of the proteins. In other cases, relatively crude preparations suffice. This is the case, for example, with DNA isolated from cheek scrapings. The small amount of DNA isolated from this source is usually released from cells with minimal effort to remove proteins. This preparation is adequate for limited analysis of specific gene sequences. DNA preparations can be obtained from very minute biological specimens, such as drops of dried blood, skin cells, or hair samples isolated from crime scenes for forensic analysis.
Isolation of RNA involves purification of nucleic acid from the nucleus and/or cytoplasm. This RNA can be used to study the patterns of gene expression in a particular tissue. RNA tends to be less stable than DNA, requiring special care during isolation to avoid degradation.
DNA Replication
The complementarity of A to T and G to C provides the basis for DNA replication, a point that was recognized by Watson and Crick in their paper describing the structure of DNA. DNA replication proceeds by a localized unwinding of the double helix, with each strand serving as a template for replication of a new sister strand (Figure 1.2). Wherever a G base is found on one strand, a C will be placed on the growing strand; wherever a T is found, an A will be placed; and so on. Bases are positioned in the newly synthesized strand by hydrogen bonding, and new phosphodiester bonds are formed in the growing strand by the enzyme DNA polymerase. This is referred to as semiconservative replication, because the newly synthesized DNA double helices ar...