Chapter 1: Introduction to Molecular Regulation and Signaling




Molecular biology has opened the doors to new ways to study embryology and to enhance our understanding of normal and abnormal development. Sequencing the human genome, together with creating techniques to investigate gene regulation at many levels of complexity, has taken embryology to the next level. Thus, from the anatomical to the biochemical to the molecular level, the story of embryology has progressed, and each chapter has enhanced our knowledge.

There are approximately 23,000 genes in the human genome, which represents only one fifth of the number predicted prior to completion of the Human Genome Project. Because of various levels of regulation, however, the number of proteins derived from these genes is closer to the original predicted number of genes. What has been disproved is the one-gene–one-protein hypothesis. Thus, through a variety of mechanisms, a single gene may give rise to many proteins.

Gene expression can be regulated at several levels: (1) different genes may be transcribed, (2) nuclear ribonucleic acid (RNA) transcribed from a gene may be selectively processed to regulate which RNAs reach the cytoplasm to become messenger RNAs (mRNAs), (3) mRNAs may be selectively translated, and (4) proteins made from the mRNAs may be differentially modified.

Gene Transcription

Genes are contained in a complex of DNA and proteins (mostly histones) called chromatin, and its basic unit of structure is the nucleosome (Fig. 1.1). Each nucleosome is composed of an octamer of histone proteins and approximately 140 base pairs of DNA. Nucleosomes themselves are joined into clusters by binding of DNA existing between nucleosomes (linker DNA) with other histone proteins (H1 histones; Fig. 1.1). Nucleosomes keep the DNA tightly coiled, such that it cannot be transcribed. In this inactive state, chromatin appears as beads of nucleosomes on a string of DNA and is referred to as heterochromatin. For transcription to occur, this DNA must be uncoiled from the beads. In this uncoiled state, chromatin is referred to as euchromatin.

Figure 1.1. Drawing showing nucleosomes that form the basic unit of chromatin.

Each nucleosome consists of an octamer of histone proteins and approximately 140 base pairs of DNA. Nucleosomes are joined into clusters by linker DNA and other histone proteins.

Genes reside within the DNA strand and contain regions called exons, which can be translated into proteins, and introns, which are interspersed between exons and are not translated into proteins (Fig. 1.2). In addition to exons and introns, a typical gene includes the following: a promoter region that binds RNA polymerase for the initiation of transcription; a transcription initiation site; a translation initiation site to designate the first amino acid in the protein; a translation termination codon; and a 3′ untranslated region that includes a sequence (the poly A addition site) that assists with stabilizing the mRNA, allows it to exit the nucleus, and permits it to be translated into protein (Fig. 1.2). By convention, the 5′ and the 3′ regions of a gene are specified in relation to the RNA transcribed from the gene. Thus, DNA is transcribed from the 5′ to the 3′ end, and the promoter region is upstream from the transcription initiation site (Fig. 1.2). The promoter region, where the RNA polymerase binds, usually contains the sequence TATA, and this site is called the TATA box (Fig. 1.2). In order to bind to this site, however, the polymerase requires additional proteins called transcription factors (Fig. 1.3). Transcription factors have a specific DNA-binding domain plus a transactivating domain that activates or inhibits transcription of the gene whose promoter or enhancer they have bound. In combination with other proteins, transcription factors activate gene expression by causing the DNA nucleosome complex to unwind, by releasing the polymerase so that it can transcribe the DNA template, and by preventing new nucleosomes from forming.

Figure 1.2. Drawing of a “typical” gene showing the promoter region containing the TATA box; exons that contain DNA sequences that are translated into proteins; introns; the transcription initiation site; the translation initiation site that designates the code for the first amino acid in a protein; and the 3′ untranslated region that includes the poly A addition site that participates in stabilizing the mRNA, allows it to exit the nucleus, and permits its translation into a protein.

Figure 1.3. Drawing showing binding of RNA polymerase II to the TATA box site of the promoter region of a gene.

This binding requires a complex of proteins plus an additional protein called a transcription factor. Transcription factors have their own specific DNA-binding domain and function to regulate gene expression.
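
The idea that a promoter is recognized by a short sequence motif can be illustrated with a minimal Python sketch. It only scans a DNA string for the literal sequence TATA; the function name and the promoter fragment are hypothetical and invented for illustration, and real promoter recognition depends on transcription factors and chromatin state, not on sequence matching alone.

```python
from typing import List


def find_tata_boxes(promoter: str) -> List[int]:
    """Return the 0-based start position of every 'TATA' motif in a DNA string."""
    seq = promoter.upper()
    positions = []
    start = seq.find("TATA")
    while start != -1:
        positions.append(start)
        # Advance by one base so that overlapping matches are also reported.
        start = seq.find("TATA", start + 1)
    return positions


# Hypothetical promoter fragment, invented for illustration only.
promoter_fragment = "GGCCGCTATAAAGGCGCGTATAT"
print(find_tata_boxes(promoter_fragment))  # [6, 18]
```

A sequence with no match returns an empty list, which in this toy model corresponds to a promoter lacking a TATA box.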

Enhancers are regulatory elements of DNA that activate utilization of promoters to control their efficiency and the rate of transcription from the promoter. Enhancers can reside anywhere along the DNA strand and do not have to reside close to a promoter. Like promoters, enhancers bind transcription factors (through the transcription factor’s transactivating domain) and are used to regulate the timing of a gene’s expression and its cell-specific location. For example, separate enhancers in a gene can be used to direct the same gene to be expressed in different tissues. The gene for the PAX6 transcription factor, which participates in pancreas, eye, and neural tube development, contains three separate enhancers, each of which regulates the gene’s expression in the appropriate tissue. Enhancers act by altering chromatin to expose the promoter or by facilitating binding of the RNA polymerase. Sometimes, enhancers can inhibit transcription and are called silencers. This phenomenon allows a transcription factor to activate one gene while silencing another by binding to different enhancers. Thus, transcription factors themselves have a DNA-binding domain specific to a region of DNA plus a transactivating domain that binds to a promoter or an enhancer and activates or inhibits the gene regulated by these elements.

DNA Methylation Represses Transcription

Methylation of cytosine bases in the promoter regions of genes represses transcription of those genes. Thus, some genes are silenced by this mechanism. For example, one of the X chromosomes in each cell of a female is inactivated (X chromosome inactivation) by this methylation mechanism. Similarly, genes in different types of cells are repressed by methylation, such that muscle cells make muscle proteins (their promoter DNA is mostly unmethylated), but not blood proteins (their DNA is highly methylated). In this manner, each cell can maintain its characteristic differentiated state. DNA methylation is also responsible for genomic imprinting in which only a gene inherited from the father or the mother is expressed, while the other gene is silenced. Approximately 40 to 60 human genes are imprinted and their methylation patterns are established during spermatogenesis and oogenesis. Methylation silences DNA by inhibiting binding of transcription factors or by altering histone binding resulting in stabilization of nucleosomes and tightly coiled DNA that cannot be transcribed.
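
The logic of cell type-specific silencing can be sketched as a simple lookup, assuming a deliberately simplified rule in which a gene is available for transcription only when its promoter is unmethylated. The gene labels, the table, and the function name are hypothetical; the "blood cell" row is an assumed mirror image of the muscle example and is not taken from the text.

```python
from typing import Dict, Set

# Hypothetical promoter methylation status per cell type (True = methylated).
# Labels and states are illustrative assumptions, not measured data.
PROMOTER_METHYLATED: Dict[str, Dict[str, bool]] = {
    "muscle cell": {"muscle protein gene": False, "blood protein gene": True},
    "blood cell": {"muscle protein gene": True, "blood protein gene": False},
}


def expressible_genes(cell_type: str) -> Set[str]:
    """Return the genes whose promoters are unmethylated in the given cell type."""
    status = PROMOTER_METHYLATED[cell_type]
    return {gene for gene, methylated in status.items() if not methylated}


print(expressible_genes("muscle cell"))  # {'muscle protein gene'}
```

Both cell types carry the same two genes; only the methylation pattern differs, which is the point the example is meant to make.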

Other Regulators of Gene Expression

The initial transcript of a gene is called nuclear RNA (nRNA) or sometimes premessenger RNA. nRNA is longer than mRNA because it contains introns that are removed (spliced out) as the nRNA moves from the nucleus to the cytoplasm. In fact, this splicing process provides a means for cells to produce different proteins from a single gene. For example, by removing different introns, exons are “spliced” in different patterns, a process called alternative splicing (Fig. 1.4). The process is carried out by spliceosomes, which are complexes of small nuclear RNAs (snRNAs) and proteins that recognize specific splice sites at the 5′ or the 3′ ends of the nRNA. Proteins derived from the same gene are called splicing isoforms (also called splice variants or alternative splice forms), and these afford the opportunity for different cells to use the same gene to make proteins specific for that cell type. For example, isoforms of the WT1 gene have different functions in gonadal versus kidney development.

Figure 1.4. Drawing of a hypothetical gene illustrating the process of alternative splicing to form different proteins from the same gene.

Spliceosomes recognize specific sites on the initial transcript of nRNA from a gene. Based on these sites, different introns are “spliced out” to create more than one protein from a single gene. Proteins derived from the same gene are called splicing isoforms.
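
How one gene can yield several transcripts can be shown with a short enumeration. The sketch below assumes the simplest splicing pattern, in which each optional exon is either kept or skipped while exon order is preserved; the exon names, the gene layout, and the function name are hypothetical, and real splicing includes other patterns that this model does not cover.

```python
from itertools import product
from typing import List, Tuple

# Hypothetical gene: (exon name, is_optional). The layout is invented for illustration.
EXONS: List[Tuple[str, bool]] = [
    ("E1", False),
    ("E2", True),
    ("E3", False),
    ("E4", True),
]


def splice_isoforms(exons: List[Tuple[str, bool]]) -> List[List[str]]:
    """Enumerate transcripts in which each optional exon is either kept or skipped.

    Exon order is preserved, and non-optional exons appear in every transcript.
    """
    optional = [name for name, is_optional in exons if is_optional]
    isoforms = []
    for choices in product([True, False], repeat=len(optional)):
        keep = dict(zip(optional, choices))
        isoforms.append(
            [name for name, is_optional in exons if not is_optional or keep[name]]
        )
    return isoforms


for isoform in splice_isoforms(EXONS):
    print("-".join(isoform))
# E1-E2-E3-E4
# E1-E2-E3
# E1-E3-E4
# E1-E3
```

With two optional exons the model yields four possible transcripts from a single gene, and each additional optional exon doubles that number.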

Even after a protein is made (translated), there may be post-translational modifications that affect its function. For example, some proteins have to be cleaved to become active, or they might have to be phosphorylated. Others need to combine with other proteins or be released from sequestered sites or be targeted to specific cell regions. Thus, there are many regulatory levels for synthesizing and activating proteins, such that although only 23,000 genes exist, the potential number of proteins that can be synthesized is probably closer to five times the number of genes.
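
The numbers quoted in this chapter can be checked with simple arithmetic. The sketch below uses only the approximate figures given in the text (23,000 genes, one fifth of the earlier prediction, and roughly five proteins per gene); the variable names are illustrative.

```python
GENES = 23_000          # approximate human gene count stated in the text
PREDICTION_RATIO = 5    # 23,000 is described as one fifth of the earlier prediction
PROTEINS_PER_GENE = 5   # "closer to five times the number of genes"

predicted_genes = GENES * PREDICTION_RATIO
estimated_proteins = GENES * PROTEINS_PER_GENE

print(predicted_genes)     # 115000
print(estimated_proteins)  # 115000
```

The two results coincide, which matches the statement that the number of proteins is close to the number of genes originally predicted.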

Induction and Organ Formation

Organs are formed by interactions between cells and tissues. Most often, one group of cells or tissues causes another set of cells or tissues to change their fate, a process called induction. In each such interaction, one cell type or tissue is the inducer that produces a signal, and one is the responder to that signal. The capacity to respond to such a signal is called competence, and competence requires activation of the responding tissue by a competence factor. Many inductive interactions occur between epithelial and mesenchymal cells and are called epithelial–mesenchymal interactions (Fig. 1.5). Epithelial cells are joined together in tubes or sheets, whereas mesenchymal cells are fibroblastic in appearance and dispersed in extracellular matrices (Fig. 1.5). Examples of epithelial–mesenchymal interactions include the following: gut endoderm and surrounding mesenchyme to produce gut-derived organs, including the liver and pancreas; limb mesenchyme with overlying ectoderm (epithelium) to produce limb outgrowth and differentiation; and epithelium of the ureteric bud and mesenchyme from the metanephric blastema to produce nephrons in the kidney. Inductive interactions can also occur between two epithelial tissues, such as induction of the lens by epithelium of the optic cup. Although an initial signal by the inducer to the responder initiates the inductive event, crosstalk between the two tissues or cell types is essential for differentiation to continue (Fig. 1.5, arrows).

Figure 1.5. Drawing illustrating an epithelial–mesenchymal interaction.

Following an initial signal from one tissue, a second tissue is induced to differentiate into a specific structure. The first tissue constitutes the inducer, and the second is the responder. Once the induction process is initiated, signals (arrows) are transmitted in both directions to complete the differentiation process.