RNA, Transcription and Processing
Lecture 13
Outline
RNA
Structure & Assembly
Types of RNAs
Gene structure
RNA Polymerase
Transcription
Four Basic Steps
Prokaryotic v Eukaryotic Differences
Enhancers and Silencers
Post-Transcriptional Message Processing
RNA Structure
RNA ribonucleotides are composed of a sugar, nucleotide base, and one or more phosphate groups, with two critical differences compared to DNA nucleotides
The bases adenine, guanine, and cytosine are the same, but thymine is replaced by uracil
The sugar ribose is used rather than deoxyribose
RNA Assembly and Structure
The similar sugars in RNA and DNA lead to formation of nearly identical sugar-phosphate backbones in the molecules
RNA strands are assembled by formation of phosphodiester bonds between the 5′ phosphate of one nucleotide and the 3′ hydroxyl of the adjacent nucleotide
RNA is synthesized from a DNA template using complementary base pairing (A with U and C with G)
RNA polymerase catalyzes the addition of each ribonucleotide to the 3′ end of the nascent strand, and forms the phosphodiester bonds between nucleotides
Two phosphates are eliminated in the process, as in DNA synthesis
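The base-pairing rule above can be sketched in a few lines of Python. This is a minimal illustration, not a model of the enzyme: the function names and the example template are hypothetical, and the template is assumed to be written 3′ to 5′ so that the RNA reads 5′ to 3′.

```python
# Pairing rules for reading a DNA template into RNA (template A pairs with U, not T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') built from a DNA template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

def coding_strand(template_3_to_5: str) -> str:
    """Return the coding (nontemplate) strand 5'->3': same as the RNA, with T for U."""
    return transcribe(template_3_to_5).replace("U", "T")

template = "TACGGCATT"            # hypothetical template strand, 3'->5'
print(transcribe(template))       # AUGCCGUAA
print(coding_strand(template))    # ATGCCGTAA
```

The RNA matches the coding strand except that U replaces T, which is why consensus sequences are usually written on the coding strand.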
Transcription Simple Video
https://youtu.be/5MfSYnItYvg
RNA Types and Classification
Messenger RNA (mRNA) is a short-lived intermediary between DNA and protein
Transfer RNAs (tRNAs) are encoded in dozens of forms and are responsible for binding an amino acid and depositing it for inclusion into a growing protein chain
Ribosomal RNA (rRNA) combines with numerous proteins to form ribosomes
Small nuclear RNA (snRNA) of various types is found in the nucleus of eukaryotes and plays a role in mRNA processing
Micro RNA (miRNA) is active in plant and animal cells and is involved in posttranscriptional regulation of mRNA
Small interfering RNA (siRNA) protects plant and animal cells from production of viruses and movement of transposons
Gene Structure
The gene contains several segments with distinct functions
The promoter is immediately upstream (5′) to the start of transcription, referred to as the +1 nucleotide
The promoter controls the access of RNA polymerase to the gene
The coding region of the gene is the portion that contains the information needed to synthesize the protein product
The termination region of the gene regulates cessation of transcription
The termination region is immediately downstream (3′) to the coding segment of the gene
RNA Polymerase Composition
The bacterial RNA polymerase holoenzyme is composed of a pentameric core enzyme that binds a sixth subunit, called the sigma (σ) subunit
The large core enzyme is composed of two α subunits, one β, one β′, and an ω subunit
The core enzyme can transcribe RNA from a DNA template but cannot bind the promoter or initiate RNA synthesis without the σ subunit
Bacteria have several different types of sigma subunits, called alternative sigma subunits
These alter core enzyme conformation in slightly different ways to facilitate association with different promoter regions
In bacteria, sigma subunits help decide whether genes are on or off, much as transcription factors do
Four Stages of Bacterial Transcription
Transcription is the synthesis of a single-stranded RNA molecule by RNA polymerase
The polymerase uses the template strand of the DNA to assemble a complementary, antiparallel strand of ribonucleotides
The coding strand of DNA, also called the nontemplate strand, is complementary to the template strand
Promoter recognition
Transcription initiation
Chain elongation
Chain termination
Promoter recognition
Consensus sequences are written in single-stranded shorthand form, 5′ to 3′ on the coding strand
At the -10 position is the Pribnow box, or -10 consensus sequence, 5′-TATAAT-3′
At -35 is a 6-bp region, the -35 consensus sequence, 5′-TTGACA-3′
RNA polymerase binds to -10 and -35 sequences and occupies the space between and around them
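A rough sketch of how the two consensus sequences could be located on a coding strand is shown below. The spacer range of 16 to 18 bp between the two boxes and the tolerance of one mismatch per box are assumptions made for illustration (the slides give only the consensus sequences), and the function names and test sequence are hypothetical.

```python
MINUS_35 = "TTGACA"   # -35 consensus, coding strand, 5'->3'
MINUS_10 = "TATAAT"   # -10 consensus (Pribnow box)

def mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_promoters(coding: str, spacer_range=(16, 18), max_mismatch=1):
    """Return (start of -35 box, start of -10 box, spacer length) for each candidate.

    spacer_range and max_mismatch are illustrative assumptions.
    """
    seq = coding.upper()
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer
            box = seq[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatch:
                hits.append((i, j, spacer))
    return hits

# Hypothetical sequence: both boxes separated by a 17-bp spacer.
seq = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GCGCGCAT"
print(find_promoters(seq))   # [(2, 25, 17)]
```

Allowing a mismatch reflects the point that real promoters vary around the consensus; a stricter or looser threshold changes how many candidates are reported.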
Transcription Initiation
First, the holoenzyme makes a loose attachment to the promoter sequence to form the closed promoter complex
The holoenzyme next unwinds about 18 bp of DNA around the -10 position to form the open promoter complex
Next, the holoenzyme progresses downstream to initiate RNA synthesis at the +1 site
Considerable sequence variation exists among promoters; alternative sigma subunits allow for holoenzyme binding to variant promoters
Transcription Elongation
The holoenzyme initiates RNA synthesis at the +1 site
It remains intact until the first 8 to 10 RNA nucleotides have been joined, at which point the sigma subunit dissociates from the core enzyme
DNA is unwound ahead of the enzyme to maintain about 18 nucleotides of unwound DNA; the double helix reforms behind the RNA polymerase
Termination of Transcription
When transcription of the gene is completed, the 5′ end of the RNA trails off the core enzyme
The core enzyme dissociates from the DNA
Shortly after one round of transcription is initiated, a second round begins
Termination: Intrinsic
Most bacterial termination occurs via intrinsic termination
Termination sequences include an inverted repeat followed by a string of adenines
mRNA containing the inverted repeats folds into a short stem-loop structure, called a hairpin
The hairpin followed by a series of Us in the mRNA causes the RNA polymerase to slow down and destabilize
The instability caused by the slowing polymerase and the U-A base pairs induces the polymerase to release the transcript and separate from the DNA
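The hairpin-plus-uracils pattern can be searched for in a transcript with a simple scan. This is only a sketch under stated assumptions: a perfect stem of fixed length, a loop of 3 to 8 nucleotides, and at least four Us directly after the stem are illustrative thresholds, and the function names and example transcript are hypothetical.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna: str) -> str:
    """Return the sequence that would base-pair with rna, read 5'->3'."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))

def find_intrinsic_terminators(rna: str, stem=6, loop_range=(3, 8), min_u=4):
    """Return (stem start, loop length) for each perfect hairpin followed by a U run.

    stem, loop_range and min_u are illustrative assumptions, not measured values.
    """
    rna = rna.upper()
    hits = []
    for start in range(len(rna) - 2 * stem - loop_range[0] + 1):
        left = rna[start:start + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem + loop
            right = rna[right_start:right_start + stem]
            tail = rna[right_start + stem:right_start + stem + min_u]
            if (len(right) == stem
                    and right == reverse_complement(left)
                    and tail == "U" * min_u):
                hits.append((start, loop))
    return hits

# Hypothetical transcript: inverted repeat (GCCGCC ... GGCGGC) followed by Us.
transcript = "AUGGC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "AC"
print(find_intrinsic_terminators(transcript))   # [(5, 4)]
```

Real terminators tolerate imperfect stems, so a practical search would score base pairing rather than demand an exact inverted repeat.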
Termination: Rho-Dependent
Certain bacterial genes require the action of rho protein to bind to the nascent mRNA and catalyze the separation of the mRNA from the RNA polymerase
Rho-dependent termination sequences do not have a string of uracils; instead they have a rho utilization (or rut) site, a stretch of about 50 nucleotides rich in cytosines
Rho protein contains six identical polypeptides with two functional domains each
They are activated by ATP binding to one of the functional domains, facilitating binding to the rut site
Rho then moves along the transcript to RNA polymerase and catalyzes the breakage of hydrogen bonds between the mRNA and the DNA template, and release of the polymerase
End Product
Eukaryotic Transcription Differences
Multiple polymerases
Eukaryotic promoters and consensus sequences are more diverse than those of bacteria
Eukaryotes have three different RNA polymerases that recognize different promoters and produce different types of RNAs
The complex that assembles to initiate and elongate transcription is more complex in eukaryotes than in bacteria
Eukaryotic genes carry introns and exons, and require processing to remove introns
Eukaryote DNA is associated with proteins to form chromatin; the chromatin composition of a gene affects its transcription
Chromatin thus plays an important role in gene regulation of eukaryotes
Eukaryotic Polymerases
RNA polymerase I (RNA pol I) transcribes three ribosomal RNA genes
RNA polymerase II (RNA pol II) transcribes protein coding genes and most small nuclear RNA genes
RNA polymerase III (RNA pol III) transcribes tRNA, one small nuclear RNA, and one ribosomal RNA
Each eukaryotic (and archaeal) RNA polymerase contains units that share homology with the 5 subunits of the bacterial polymerase
Archaea and eukaryotes have 6 to 11 additional subunits
All RNA polymerases share a similar “hand” shape with “fingers” that grasp DNA and a “palm” where RNA synthesis takes place
Promoter Elements
The most common eukaryotic promoter consensus sequence is the TATA box, located at about position -25
The consensus sequence is 5′-TATAAA-3′
A CAAT box is often found near the -80 position
A GC-rich box (consensus 5′-GGGCGG-3′) is located at -90, or further upstream
Diversity of Promoter Elements
Promoter Recognition
RNA pol II recognizes and binds to promoter sequences with the aid of proteins called transcription factors (TFs)
TFs bind to regulatory sequences and interact directly, or indirectly, with RNA polymerase; those interacting with pol II are called TFII factors
The TATA box is the principal binding site during promoter recognition
At the TATA box, TFIID, a multisubunit protein containing the TATA-binding protein (TBP), binds the TATA box sequence and a protein called the TBP-associated factor (TAF)
The assembled TFIID bound to the TATA box forms the initial committed complex
Next, TFIIB, TFIIF, and RNA pol II join the complex to form the minimal initiation complex
The minimal initiation complex is joined by TFIIE and TFIIH to form the complete initiation complex
The complete initiation complex contains multiple proteins commonly referred to as “general transcription factors”
The complete complex directs RNA pol II to the +1 position, where it begins to assemble mRNA
Detecting Promoter Consensus Elements
Research to verify that a segment of DNA is a functionally important component of a promoter has two components
Discovering the presence and location of DNA sequences that transcription factors will bind to
Mutational analysis to confirm the functionality of each sequence
Mutational Analysis of Promoters
Researchers produce many different point mutations and compare the level of transcription generated by the mutant sequence relative to wild type
Mutations inside the consensus region significantly reduce levels of transcription
Mutations outside the consensus region have nonsignificant effects on transcription
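The bookkeeping behind such an experiment can be sketched as follows: enumerate every single-base substitution of a promoter fragment and record whether it falls inside a consensus element. The fragment, the element coordinates, and the function names are hypothetical, and the code makes no prediction about transcription levels, which have to be measured for each mutant.

```python
def point_mutants(seq: str):
    """Yield (position, original base, new base, mutant sequence) for every substitution."""
    for i, ref in enumerate(seq):
        for alt in "ACGT":
            if alt != ref:
                yield i, ref, alt, seq[:i] + alt + seq[i + 1:]

def in_consensus(position: int, elements: dict) -> bool:
    """True if position lies inside any (start, end) element; end is exclusive."""
    return any(start <= position < end for start, end in elements.values())

fragment = "GCTATAAAGC"                 # hypothetical promoter fragment
elements = {"TATA box": (2, 8)}         # hypothetical 0-based coordinates

inside = [m for m in point_mutants(fragment) if in_consensus(m[0], elements)]
outside = [m for m in point_mutants(fragment) if not in_consensus(m[0], elements)]
print(len(inside), len(outside))        # 18 12
```

Each mutant would then be assayed and compared with wild type; the expectation from the slides is that the "inside" group shows the large reductions.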
Enhancers and Silencers
Promoters alone may not be sufficient to initiate eukaryotic transcription
Two categories of DNA regulatory sequences lead to differential expression of genes
These are enhancer sequences and silencer sequences
Enhancer Sequences
Enhancer sequences increase the level of transcription of specific genes
They bind proteins that interact with the proteins that are bound to gene promoters, and together the promoters and enhancers drive gene expression
Enhancers may be variable distances from the genes they affect and may be upstream or downstream of the gene
Enhancer Sequences and DNA Bending
Enhancer sequences bind activator proteins and associated coactivators that form a “protein bridge” that links the proteins at the enhancer sequence to the initiation complex at the promoter
This bridge bends the DNA so that the proteins at both locations are brought close enough together for them to interact
Silencer Sequences
Silencer sequences are DNA elements that act at a distance to repress transcription of their target genes
Silencers bind transcription factors called repressor proteins that induce bends in DNA
These bends reduce transcription of the target gene
Silencers may be located variable distances from their target genes, either upstream or downstream
Signal Transduction
Signal transduction pathways are sequential events that release regulatory molecules inside a cell in response to events outside the cell
They utilize transmembrane proteins, which receive signals externally through an extracellular interaction domain
They transmit signals within the cell via a binding domain inside the cell; this activates a transcription factor needed for expression of a target gene
RNA Polymerase I Promoters
RNA polymerase I transcribes genes for rRNA using a mechanism similar to that of RNA pol II
RNA pol I is recruited to upstream promoter elements following binding of transcription factors, and transcribes ribosomal genes found in the nucleolus
The nucleolus is a nuclear organelle containing rRNA and multiple copies of genes encoding rRNA
Promoters recognized by RNA pol I have two functional sequences near the start of transcription
The core element stretches from -45 to +20, and is needed for initiation of transcription; it is bound by sigma-like factor 1 (SL1) protein
The upstream control element spans from -100 to -150, and increases the level of transcription; it is bound by upstream binding factor 1 (UBF1)
Termination in RNA Pol I or Pol III Transcription
Each eukaryotic RNA polymerase has a different termination mechanism
RNA pol III transcription is terminated similarly to E. coli intrinsic transcription termination
It transcribes a terminator sequence that creates a string of uracils in the transcript, though no stem-loop structure forms near it
RNA pol I is terminated at a 17-bp consensus sequence that binds transcription-terminating factor 1 (TTF1)
A large rRNA precursor transcript is cleaved about 18 nucleotides ahead of the consensus sequence, which does not appear in the mature transcript
Post-Transcriptional Processing Modifies RNA Molecules
Eukaryotic transcripts are more stable than bacterial transcripts
In eukaryotes, transcription and translation are separated in time and location
Eukaryotic transcripts have introns, which are not found in bacterial transcripts
These features are all related to post-transcriptional modification of eukaryotic transcripts
Post-Transcriptional Processing
The initial eukaryotic gene mRNA is called the pre-mRNA whereas the fully processed mRNA is called the mature mRNA; modifications include
5′ capping
3′ polyadenylation
Intron splicing
Capping 5′ mRNA
After the first 20 to 30 nucleotides of mRNA have been synthesized, a special enzyme, guanylyl transferase, adds a guanine to the 5′ end of the pre-mRNA
Additional enzyme action methylates the newly added guanine and may also methylate nearby nucleotides of the transcript
The addition of the guanine to the mRNA is called 5′ capping
STEPS of Capping
Guanylyl transferase removes the γ phosphate of the 5′ end of the mRNA, leaving two phosphates on the message
The guanine to be added loses two phosphates to become guanine monophosphate
Guanylyl transferase joins the guanine monophosphate to the 5′ mRNA by a 5′ to 5′ triphosphate linkage
Functions of the 5′ Cap
Protection of mRNA from rapid degradation
Facilitating transport of mRNA out of the nucleus
Facilitating subsequent intron splicing
Enhancing translation efficiency by orienting the ribosome on the mRNA
Polyadenylation of 3′ Pre-mRNA
Termination of transcription by RNA pol II is not fully understood
The 3′ end of the pre-mRNA is created by enzyme action that removes a section of the 3′ message and replaces it with a string of adenines
This is thought to be associated with the subsequent termination of transcription
Steps of Polyadenylation
Cleavage and polyadenylation specificity factor (CPSF) binds near the polyadenylation signal sequence (5′-AAUAAA-3′), which is downstream of the stop codon
The pre-mRNA is cleaved 15 to 30 nucleotides downstream of the polyadenylation signal sequence
The cleavage releases a fragment of the mRNA which is bound by CFI, CFII and CstF; this fragment is later degraded
The 3′ end of the cut pre-mRNA undergoes enzymatic addition of 20 to 200 adenines through the action of CPSF and PAP
After addition of the first 10 adenines, molecules of poly-A-binding protein (PABII) join the adenine tail and increase the rate of addition of adenines
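The cleavage and tailing steps can be mimicked on a string as a rough sketch. Several choices here are assumptions for illustration: the cleavage offset is counted from the 3′ end of the AAUAAA signal, the first occurrence of the signal is used, and the default offset and tail length are arbitrary values within the ranges given above. The function name and example transcript are hypothetical.

```python
POLY_A_SIGNAL = "AAUAAA"

def polyadenylate(pre_mrna: str, cleave_offset: int = 20, tail_length: int = 50):
    """Return (tailed mRNA, released 3' fragment).

    cleave_offset: nucleotides between the end of the signal and the cut (15-30).
    tail_length:   number of adenines added (20-200).
    """
    if not 15 <= cleave_offset <= 30:
        raise ValueError("cleave_offset must be 15 to 30 nucleotides")
    if not 20 <= tail_length <= 200:
        raise ValueError("tail_length must be 20 to 200 adenines")
    site = pre_mrna.find(POLY_A_SIGNAL)
    if site == -1:
        raise ValueError("no polyadenylation signal found")
    cut = site + len(POLY_A_SIGNAL) + cleave_offset
    if cut > len(pre_mrna):
        raise ValueError("transcript ends before the cleavage site")
    return pre_mrna[:cut] + "A" * tail_length, pre_mrna[cut:]

# Hypothetical 3' region of a pre-mRNA.
pre = "GGCUGC" + "AAUAAA" + "C" * 20 + "GUGUGU"
mature, fragment = polyadenylate(pre)
print(fragment)        # GUGUGU
print(len(mature))     # 82
```

The released fragment corresponds to the piece that is later degraded; the tail is added only to the upstream product.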
Functions of Polyadenylation
Facilitating transport of mature mRNA across the nuclear membrane to the cytoplasm
Protecting the mRNA from degradation
Enhancing translation by enabling the ribosomal recognition of mRNA
Some eukaryotic transcripts (e.g., the histone genes) do not undergo polyadenylation
Pre-mRNA Intron Splicing
Intron splicing requires great precision to remove intron nucleotides accurately
Errors in intron removal would lead to incorrect protein sequences
Roberts and Sharp shared the 1993 Nobel Prize for their codiscovery of “split genes,” i.e., the presence of intron and exon sequences
Splicing Signal Sequences
Specific short sequences define the junctions between introns and exons
The 5′ splice site is at the 5′ intron end and contains a consensus sequence with an invariant GU dinucleotide at the 5′-most end of the intron
The 3′ splice site at the opposite end of the intron has an 11 nucleotide consensus with a pyrimidine rich region and a nearly invariant AG at the 3′-most end
The Branch Site
A third consensus region, called the branch site, is 20 to 40 nucleotides upstream of the 3′ end of the intron
It is pyrimidine-rich and contains an invariant adenine called the branch point adenine near the 3′ end of the consensus
Mutation analysis shows that all three consensus sequences are required for accurate splicing
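A simplified check of the three signals, plus removal of an intron from a pre-mRNA string, might look like the sketch below. The branch-site test is deliberately crude (any adenine in the window 20 to 40 nucleotides upstream of the 3′ end of the intron), the intron coordinates are supplied by hand rather than predicted, and all names and sequences are hypothetical.

```python
def check_intron(intron: str) -> dict:
    """Report whether the intron shows the three splicing signals (simplified)."""
    intron = intron.upper()
    window = intron[-40:-20]   # region 20 to 40 nt upstream of the intron's 3' end
    return {
        "5' GU": intron.startswith("GU"),
        "3' AG": intron.endswith("AG"),
        "branch A": "A" in window,   # crude stand-in for the branch-site consensus
    }

def splice(pre_mrna: str, introns) -> str:
    """Remove introns given as (start, end) coordinates, end exclusive, and join exons."""
    exons = []
    previous_end = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[previous_end:start])
        previous_end = end
    exons.append(pre_mrna[previous_end:])
    return "".join(exons)

# Hypothetical 46-nt intron flanked by two short exons.
intron = "GU" + "C" * 10 + "UACUAAC" + "U" * 25 + "AG"
pre = "AUGGCC" + intron + "GGAUAA"
print(check_intron(intron))
print(splice(pre, [(6, 6 + len(intron))]))   # AUGGCCGGAUAA
```

Shifting either coordinate by even one nucleotide changes the reading frame of the joined exons, which is why splicing must be so precise.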
Splicing
Spliceosome Composition
The spliceosome is a large complex made of multiple snRNPs (snRNPs U1 through U6)
The composition is dynamic, changing through the steps of splicing
Spliceosome components are recruited to 5′ and 3′ splice sites by SR proteins; these bind to sequences in exons called exonic splicing enhancers (ESEs), and ensure accurate splicing
Coupling of Pre-mRNA Processing Steps
Introns appear to be removed one by one, but not necessarily in order
The three steps of pre-mRNA processing are tightly coupled
The carboxyl terminal domain (CTD) of RNA polymerase II functions as an assembly platform and regulator of pre-mRNA processing machinery
Gene Expression Machines
Current models suggest that RNA pol II and an array of pre-mRNA processing proteins function as “gene expression machines”
The proteins that carry out capping, intron splicing, and polyadenylation associate with the CTD of pol II
All three processes are carried out simultaneously
Transcription Advanced Video
https://youtu.be/SMtWvDbfHLo
Take Home Messages
RNA
Structure & Assembly (e.g., uracil, ribose)
Types of RNAs (e.g., mRNA, tRNA)
Gene structure (e.g., ORF, promoter)
RNA Polymerase (e.g., sigma subunits)
Transcription
Four Basic Steps (e.g., initiation, termination)
Prokaryotic v Eukaryotic Differences
Enhancers and Silencers (how they influence expression)
Post-Transcriptional Message Processing (capping)