Bacterial Operons

The Bio 6B lab explores bacterial plasmids and operons through a set of connected experiments over multiple lab days. The concepts behind these labs are presented in a set of related pages on this site.

In addition, there are multiple pages for the experimental methods, which you'll find in the menus.

The control of the lac and trp operons in E. coli is a classic topic in biology; it’s featured in every textbook, and for good reason. Unfortunately, I find that many students learn the mechanisms of these operons without having a chance to reflect on why those mechanisms are important, and why these particular operons are so important in biology books. On this page, I want to focus on the context first, and then the details. If you just want to quickly compare and contrast the trp and lac operons, you can skip to the trp vs. lac section near the bottom of this page.

Gene expression overview

Ideally, you'd already be familiar with the basic concepts of transcription and translation, which are introduced in Campbell, Chapter 17: Gene Expression: From Gene to Protein. In case we haven't covered that chapter in lecture yet, I'll give a brief overview here. The operon concepts described on this page are featured in detail in Campbell, Chapter 18: Regulation of Gene Expression.

"Gene expression" refers to the process of producing a gene product, in this case a protein. (For some genes, the final gene product is an RNA, not a protein.) For protein-coding genes, gene expression has two phases:

Transcription
Making an RNA copy of a specific region of DNA.  
Translation
Using the nucleotide sequence information in the RNA to make a polypeptide (a polymer of amino acids that can be folded to become a protein). 

This diagram shows the basic steps of transcription and translation:

Transcription and translation

  1. Transcription begins when the enzyme RNA polymerase binds to a promoter, which is a specific sequence of nucleotides in the DNA. Once RNA polymerase binds, it separates the two DNA strands and begins to make an RNA copy of one strand.
  2. The region of DNA that gets copied into RNA is called the transcription unit. It begins at a start point specified by the promoter.
  3. RNA polymerase dissociates from the DNA after it has copied the entire transcription unit, and the RNA is released.
  4. The entire RNA molecule is called a transcript. For the examples on this page, the RNA codes for a protein, so the transcript is called a messenger RNA or mRNA. In bacteria, the mRNA can be translated immediately, even before transcription is complete. Each mRNA must contain at least one start codon and one stop codon. The region of the RNA transcript that codes for a protein is called the coding region; the transcript also contains untranslated regions that don't code for protein.
  5. Translation begins when a ribosome binds to a start codon, a specific sequence of three nucleotides in the mRNA. The start codon determines the reading frame, which is the way that the nucleotides are grouped into 3-nucleotide codons. This grouping, beginning with a start codon and ending with a stop codon, is called an open reading frame.
  6. As the ribosome moves along the mRNA, it produces a polypeptide (a chain of amino acids), with each codon in the mRNA coding for one amino acid in the polypeptide.
  7. When the ribosome reaches the stop codon, translation is terminated and the polypeptide is released. The polypeptide begins to fold on itself even before translation is complete; after translation is terminated, the polypeptide may be further folded and modified to become a complete functional protein.
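If it helps to see the steps above as a procedure, here's a minimal Python sketch. Everything in it is an assumption made for illustration: the DNA sequence is made up, the codon table is deliberately partial (only the codons this example needs), and transcription is simplified to rewriting the coding strand with U in place of T, rather than copying from the template strand as RNA polymerase actually does.

```python
# Toy model of transcription and translation (hypothetical sequence).
# Partial codon table: only the codons used below. None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met",  # start codon
    "UGG": "Trp",
    "GGC": "Gly",
    "AAA": "Lys",
    "UAA": None,
    "UAG": None,
    "UGA": None,
}


def transcribe(coding_strand):
    """Return the mRNA matching the coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so no translation
    polypeptide = []
    # The start codon sets the reading frame: step three nucleotides at a time.
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError for codons not in this toy table
        if amino_acid is None:
            break  # stop codon: the polypeptide is released
        polypeptide.append(amino_acid)
    return polypeptide


dna = "CCATGTGGGGCAAATAAGG"  # made-up transcription unit
mrna = transcribe(dna)
print(mrna)
print(translate(mrna))
```

Notice that the two nucleotides before the start codon and the two after the stop codon are transcribed but never translated; they stand in for the untranslated regions of the transcript.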

In eukaryotes, transcription happens inside the nucleus (where the DNA is), while translation occurs in the cytoplasm (where the ribosomes are). For the bacterial examples on this page, there is no physical separation between transcription and translation. In fact, ribosomes can attach to RNAs before transcription is completed so an mRNA can be translated while it's still being transcribed. After transcription, mRNAs can be translated repeatedly, until they are destroyed.

The process of gene expression is tightly regulated. There are multiple possible ways to regulate gene expression, but the operon examples on this page are regulated by controlling whether RNA polymerase is able to bind to the promoter.

Regulating an anabolic pathway: the trp operon

Cells control their activities by way of metabolic pathways. For example, if an E. coli cell needs to produce the amino acid tryptophan, the synthesis process will occur as a pathway, with several steps catalyzed by specific enzymes. (Humans can’t do this pathway; for us, tryptophan is an essential amino acid in our diet, since we can’t synthesize it.)

Tryptophan synthesis is an example of an anabolic pathway, since it’s synthesizing something. This pathway, like all biochemical pathways, needs to be regulated. Bacterial cells produce tryptophan and use it in protein synthesis. The cells produce tryptophan when the tryptophan level is low, and stop producing it when there is plenty of tryptophan. In other words, this pathway is regulated by feedback inhibition, in which the accumulation of the product inhibits the synthesis pathway. Thus, cells produce only the amount of tryptophan that they can use. There are two different mechanisms of feedback inhibition for this pathway in E. coli.

Feedback inhibition of tryptophan synthesis pathway

 

Enzyme activity
Tryptophan binds to an allosteric site on the enzyme and inhibits its activity. As you have seen before, many biochemical pathways are regulated by allosterically regulating an enzyme at the beginning of the pathway. We could call this post-translational control: the gene for the enzyme is transcribed and then translated, and the enzyme is allosterically inhibited after translation. This was covered in the enzymes and metabolism unit.
Gene expression
If the cell doesn’t need more tryptophan, then there is no need to produce more copies of the enzymes that synthesize tryptophan. The cell responds by down-regulating (turning off) the expression of the genes that code for those enzymes. Down-regulation occurs when tryptophan binds to a repressor protein. To understand how this transcriptional downregulation works, we need to look at the structure of the trp operon, which I'll describe below.

In both cases, the accumulation of tryptophan stops the production of more tryptophan.
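Feedback inhibition can be sketched as a simple loop. This is only a toy model under assumed numbers: the threshold, synthesis rate, and usage rate are arbitrary, and the pathway is treated as a plain on/off switch without distinguishing between the two mechanisms described above.

```python
def simulate_tryptophan(steps=50, threshold=10.0, synthesis=3.0, usage=1.0):
    """Track the tryptophan level when synthesis shuts off above a threshold.

    All parameter values are arbitrary and chosen only for illustration.
    """
    level = 0.0
    history = []
    for _ in range(steps):
        if level < threshold:
            level += synthesis  # tryptophan is low: the pathway is active
        # Protein synthesis uses up tryptophan in every time step.
        level = max(0.0, level - usage)
        history.append(level)
    return history


history = simulate_tryptophan()
print(history[:8])
print(max(history), min(history[5:]))
```

In this sketch the level climbs until it reaches the threshold and then hovers near it, which is the point of feedback inhibition: the cell makes about as much tryptophan as it uses.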

The trp operon

The genes that encode the enzymes for the tryptophan synthesis pathway in E. coli are in the trp operon. These enzymes work together; there is no need to produce any one of these enzymes without all the others. The genes that encode the enzymes are all together on the chromosome of E. coli, and they are all transcribed together as a single messenger RNA, as shown in this diagram:

trp operon structure.

Expression of the trp operon occurs in a series of steps:

  1. The complete trp operon includes the transcription unit, which codes for the polypeptides, and some regulatory regions of DNA (the promoter and operator). RNA polymerase binds tightly to the promoter to begin transcription. I'll describe the operator in the next section.
  2. The transcription unit is polycistronic, meaning that one mRNA codes for several different polypeptides at the same time.
  3. The completed mRNA transcript contains five open reading frames (ORFs, or protein coding regions), each beginning with a start codon and ending with a stop codon. Each open reading frame codes for one of the five proteins in the tryptophan synthesis pathway. Each start codon provides a place for a ribosome to attach and begin translation, so separate polypeptides are produced.
  4. When the polypeptides are complete, they fold and associate into three enzyme complexes, as shown. There are three enzyme complexes in the trp pathway, but two of them involve more than one polypeptide, so there are five polypeptides. These enzymes catalyze the steps of the tryptophan synthesis pathway.

Due to the polycistronic operon structure, the genes for the trp enzymes are coordinately controlled, meaning that they are all regulated together. In contrast, most eukaryotic genes are monocistronic, meaning that one mRNA has only one open reading frame and codes for only one polypeptide. The trp operon is regulated to ensure that the enzymes of the trp pathway are produced only when the level of tryptophan is low.
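To make "polycistronic" concrete, here's a rough Python sketch that scans a single mRNA for successive open reading frames. The sequence is made up and has only three very short ORFs; it is not the real trp mRNA, which has five much longer ones. The scanning rule (find the next AUG after the previous stop codon) is also a simplification of how ribosomes actually locate start codons.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(mrna):
    """Return (start, end) index pairs for successive ORFs, scanning 5' to 3'."""
    orfs = []
    position = 0
    while True:
        start = mrna.find("AUG", position)
        if start == -1:
            break  # no more start codons
        end = None
        # Read codon by codon in the frame set by this start codon.
        for i in range(start, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOP_CODONS:
                end = i + 3
                break
        if end is None:
            position = start + 1  # no in-frame stop codon; keep looking
            continue
        orfs.append((start, end))
        position = end  # look for the next ORF after this stop codon
    return orfs


# Hypothetical polycistronic mRNA: three ORFs separated by short spacers.
mrna = "GG" + "AUGAAAUGGUAA" + "CCC" + "AUGGGCUAG" + "GG" + "AUGUGGAAAUGA" + "C"
for start, end in find_orfs(mrna):
    print(start, end, mrna[start:end])
```

Each ORF found this way corresponds to one polypeptide, so one transcript yields several separate polypeptides.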

Transcription of the trp operon is controlled by a repressor protein

The expression of the trp operon is controlled by a repressor protein (called trpR) that blocks RNA polymerase from binding to its promoter when the concentration of tryptophan is high. Tryptophan allosterically activates the repressor protein, causing the repressor to bind to the operator region of the DNA, inside the promoter. 

trp operon structure and repression

Here's how this works:

  1. The repressor protein trpR is always present in the cell, but is inactive by itself. There is a separate gene for trpR, outside the trp operon.
  2. Tryptophan accumulates in the cells when it is being synthesized faster than it's being used to make new proteins. Tryptophan binds to the repressor protein.
  3. The repressor protein trpR is allosterically activated by tryptophan. In its active state, trpR binds tightly to the operator region of the DNA, which is inside the promoter region.
  4. When the active repressor is bound to the operator, RNA polymerase is blocked from the promoter, and transcription does not occur. Eventually, the supply of tryptophan will be depleted, trpR will lose its bound tryptophan, and it will no longer be allosterically activated. Transcription will resume.

In this example, RNA polymerase can bind to the promoter and start transcription whenever it’s not blocked by the repressor protein. This isn’t true for all genes; in eukaryotes, as well as some bacterial genes, RNA polymerase needs help from other proteins to attach to the promoter.

Since transcription is controlled by a repressor protein, this is considered a form of negative gene regulation.
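The logic of trp repression is simple enough to write as a few lines of Python. This is just a sketch of the reasoning: the function name and the idea of treating the tryptophan level as a yes/no value are my own simplifications.

```python
def trp_operon_transcribed(tryptophan_high):
    """Return True if RNA polymerase can transcribe the trp operon."""
    # trpR is inactive by itself; it becomes active only with tryptophan bound.
    repressor_active = tryptophan_high
    # An active repressor binds the operator and blocks the promoter.
    operator_blocked = repressor_active
    return not operator_blocked


for tryptophan_high in (False, True):
    print("tryptophan high:", tryptophan_high,
          "-> transcribed:", trp_operon_transcribed(tryptophan_high))
```

There is only one input here (tryptophan) and it turns the operon down, which is what makes this negative regulation.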

The basic explanation of the trp operon, which encodes an anabolic pathway, is simple: the genes are expressed until the end product, tryptophan, accumulates within the cell; then the genes are shut off. This type of regulation applies to many different anabolic pathways. Catabolic pathways work somewhat differently.

Regulating a catabolic pathway: the lac operon

The lac operon encodes a set of proteins involved in the pathway for catabolizing the sugar lactose. Lactose is a sugar that’s abundant in milk, but rare otherwise. E. coli bacteria live in the intestines of mammals; if lactose is present, it's because the mammal that is hosting the E. coli drank some milk. For E. coli, a particular set of proteins is required to make use of lactose. The lac operon provides a mechanism for allowing cells to produce these proteins only when needed. The lac operon is highly expressed only when two conditions are met:

  • There is lactose present for the cell to catabolize.
  • The cell needs the energy it could obtain by catabolizing the lactose.

There are two separate mechanisms of gene expression control for the lac operon.

Negative gene regulation

Transcription of the lac operon, like the trp operon, is controlled by a repressor protein. In this case, the repressor protein is active by itself.

  1. The process of lac operon expression begins when lactose is present in the cell. The lactose is a food source; it comes from outside the cell (unlike tryptophan, which is synthesized in the cell). Some lactose gets converted (isomerized) to a form called allolactose. For simplicity, I will just refer to it as lactose.
  2. Unlike the trp repressor, the lac repressor protein binds to the operator and blocks transcription all by itself. Lactose allosterically inactivates the lac repressor protein, so it does not bind to the operator region of the DNA.
  3. RNA polymerase is allowed to bind to the promoter and begin transcription when lactose is present.
  4. The lac operon, like the trp operon, is polycistronic; it contains three open reading frames (ORFs).
  5. The proteins encoded by the lac operon function in lactose metabolism and transport.

As long as no lactose is present, the repressor protein blocks transcription.

Positive gene regulation

For the lac operon, removing the repressor isn’t enough to cause the operon to be highly expressed. The reason is simple: lactose is a good source of energy, but glucose is better. Glucose can be catabolized without the extra proteins encoded by the lac operon, but lactose catabolism requires all the enzymes of glycolysis in addition to the specialized lac operon proteins. Therefore, it's not beneficial for the cell to express the lac operon when the glucose level is high. Thus, the regulation of the lac operon has one more level of regulation, ensuring that the lac operon is highly expressed only when lactose is present and glucose is not.

Positive regulation of the lac operon by CAP

This process is controlled by the messenger molecule cAMP (cyclic adenosine monophosphate), which is derived from ATP. When glucose level is low and the cell is starved for energy, cAMP accumulates in the cell. The cAMP acts as a signal to upregulate the expression of the lac operon.

  1. First, the lac repressor protein must be allosterically inactivated by lactose, as described above.
  2. If glucose level is low, the cell begins to produce cAMP, which signals the cell's low-energy state.
  3. cAMP allosterically activates CAP (catabolite activator protein).
  4. The active CAP/cAMP complex attaches to a CAP binding site on the DNA, adjacent to the promoter.
  5. When active CAP is bound to the CAP site, it stabilizes the binding of RNA polymerase to the promoter. This stabilization greatly increases the rate of transcription, upregulating gene expression.

This process works partly because RNA polymerase doesn't bind tightly to the lac operon promoter by itself. Even in the absence of an active repressor, RNA polymerase will only occasionally bind to the lac promoter long enough to start transcription. The level of gene expression will be low. (Keep in mind that the same RNA polymerase is used for both the trp and lac operons, but the nucleotide sequences of the promoters are different, which influences polymerase binding.)

If there’s plenty of glucose available, the cAMP level will be low and the lac operon will be expressed at a very low level. The lac operon will only be fully upregulated if glucose concentration is low and lactose is present.
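Putting the negative and positive controls together, lac operon expression depends on two inputs. Here's a minimal Python sketch of that logic. The labels "blocked", "low", and "high" and the yes/no treatment of lactose and glucose are simplifications I've chosen for illustration; real expression levels vary continuously.

```python
def lac_operon_expression(lactose_present, glucose_high):
    """Return a rough expression level for the lac operon."""
    # Negative regulation: the lac repressor is active by itself,
    # and lactose allosterically inactivates it.
    repressor_bound = not lactose_present
    # Positive regulation: low glucose -> cAMP accumulates -> CAP is activated
    # and binds the CAP site next to the promoter.
    camp_high = not glucose_high
    cap_bound = camp_high
    if repressor_bound:
        return "blocked"
    # Without CAP, RNA polymerase binds the lac promoter only weakly.
    return "high" if cap_bound else "low"


for lactose_present in (False, True):
    for glucose_high in (False, True):
        print("lactose:", lactose_present, "glucose high:", glucose_high,
              "->", lac_operon_expression(lactose_present, glucose_high))
```

Only one of the four combinations (lactose present, glucose low) gives high expression, which matches the two conditions listed at the start of this section.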

CAP = CRP

Unfortunately, there are two different names for the same protein. I think the most commonly used name is CAP (catabolite activator protein). Because it's produced as a result of catabolism, cAMP is called a catabolite, and when it binds to the CAP protein, the CAP/cAMP complex activates numerous genes for catabolic processes. This same protein is called CRP (cAMP Receptor Protein) in Campbell. This name makes sense, but seems to be less widely used. I think you're better off learning the most commonly used terms, so I'll try to use CAP consistently.

What is a gene?

Is an operon one gene, or a set of genes?

Unfortunately, there’s no universally accepted definition of a gene, so there's not a clear answer to this question. One traditional definition of a gene is that it's a segment of DNA that codes for one protein, but this is wrong in two basic ways. First, there are many kinds of RNA that don't code for proteins: tRNA, ribosomal RNA, microRNA, etc.; however, those RNAs are still transcribed from genes. Second, some genes code for more than one protein. Aside from the polycistronic operon examples, many eukaryotic mRNAs can be edited to code for more than one protein.

A more modern definition is that a gene is a transcription unit: a segment of DNA that gets transcribed to make an RNA. One problem with this definition is that it leaves out a lot of important genomic information. Every gene includes a transcription unit. However, the gene won't be transcribed unless it also contains some regulatory regions. At a minimum, there must be a promoter that provides a place for RNA polymerase to bind to the DNA. The transcription start point (the beginning of the transcription unit) is inside the promoter. Most genes also rely on other regulatory regions that are outside the transcription unit (such as operators in the operon examples on this page, or enhancers in eukaryotic genes). Should the regulatory regions be considered part of the gene? Unfortunately, the word "gene" is not precisely defined as to whether it includes the regulatory regions.

It's a little frustrating that the word at the heart of genetics doesn't have a precise definition. The word "gene" is still useful, but when you need to be more precise, you'll need to use more precise words.

Summary: trp vs. lac operon

The similarities between these two operons are more fundamental than the differences:

  • Both produce polycistronic mRNAs, which code for multiple proteins. Each polycistronic mRNA encodes a set of proteins that work together in a biochemical pathway.
  • Both use negative gene regulation by way of a repressor protein. When the repressor protein binds to the operator DNA, RNA polymerase is blocked from binding to the promoter, so transcription is repressed. Thus, both operons are regulated by controlling the initiation of transcription.

The differences between these two operons are related to the functions of the biochemical pathways involved:

  • The trp operon encodes proteins that make up an anabolic pathway. The cell keeps producing tryptophan until there is enough. At this point, the tryptophan acts as a co-repressor, allosterically activating the repressor protein. The repressor protein is inactive by itself, until it's allosterically activated.
  • The lac operon encodes proteins that make up a catabolic pathway. The cell produces these enzymes only if there is some lactose present to catabolize. The repressor protein is active by itself, until it's allosterically inactivated by lactose. In addition, strong up-regulation of this operon occurs only when the cell lacks glucose as an alternative.

It can be challenging to remember the details of these operons, but it's a lot easier if you start by asking why they would use different mechanisms.

Why are these two operons featured in every bio textbook?

There are two reasons: First, these two operons exemplify the regulation of enzymes in catabolic and anabolic pathways, which together account for all of biochemistry. There are many other operons in bacterial genomes, and they tend to resemble either the trp operon (anabolic pathways) or the lac operon (catabolic pathways). Second, these two operons represent classic discoveries in the history of biology. When the mechanisms of the lac operon of E. coli were first described by Francois Jacob, Jacques Monod, and Andre Lwoff, starting in the 1950s, it was the first time that anyone had figured out how gene expression was controlled in any organism.

Comparing bacterial operons to eukaryotic genes

Like bacterial operons, your own genes are often regulated by controlling the initiation of transcription. However, eukaryotic chromosomes are far more complex than those of E. coli. Eukaryotic DNA is extensively complexed with histones and other proteins, allowing for additional mechanisms of gene regulation, including epigenetic mechanisms. In addition, eukaryotes also make extensive use of gene regulation mechanisms that occur after transcription, such as RNA editing. Finally, most eukaryotic genes aren't polycistronic.

You'll explore the mechanisms of eukaryotic gene regulation later this quarter.

The GFP operon in pGLO

In the Bio 6B lab, you'll use the pGLO plasmid for a series of experiments. pGLO contains an engineered operon that controls the expression of a gene encoding Green Fluorescent Protein (GFP). The GFP operon was derived from a naturally-occurring operon that encodes proteins involved in the catabolism of a sugar called arabinose. The arabinose operon is regulated much like the lac operon, and the GFP operon used in pGLO is regulated the same way. In the pGLO lab experiments, you'll be able to take advantage of these mechanisms to control the production of Green Fluorescent Protein in E. coli cells that you genetically transform with pGLO. See the pGLO page for an explanation of how that operon works.

Review

Terms & concepts

There are a lot of words here, but memorizing them isn’t the same thing as understanding how operons work. The terminology is important, but only when you can put it in the context of the ways that operon regulation solves problems for living organisms. I’m defining some of the terms, but others I think you’ll figure out.

  • Allosteric regulation. Covered in detail in Chapter 8: Metabolism.
  • Anabolic pathways & catabolic pathways. See Chapter 8: Metabolism.
  • cAMP
  • CAP (Catabolite Activator Protein); also called CRP (cAMP Receptor Protein). Why do both these names make sense?
  • Coding region
  • Coordinate control
  • Down-regulation (and up-regulation). Gene expression control is usually more subtle than simply turning genes on and off, so molecular biologists often say that a gene is "upregulated," rather than "turned on."
  • Encode. To say that a gene encodes (or codes for) a protein means that the nucleotide sequence of the gene contains the information that determines the amino acid sequence of the polypeptide.
  • Feedback inhibition
  • Gene. Why is this term difficult to define?
  • Gene expression
  • lac operon
  • mRNA
  • Negative and positive gene regulation
  • Open reading frame (ORF). You'll do an assignment based on this concept; see the Open Reading Frames page for more information.
  • Operator
  • Operon
  • Polycistronic vs monocistronic
  • Promoter
  • Repressor protein
  • Ribosome
  • RNA polymerase
  • Start codon and stop codon
  • Transcript
  • Transcription unit
  • Transcription
  • trp operon

Review questions

  1. How are the lac and trp operons similar? How are they different?
  2. Compare and contrast the repressor proteins for the trp and lac operons. Why are they different?
  3. The CAP binding site is found in many operons. What sorts of operons would you expect to contain it?
  4. How many start codons would there be in the trp operon? How many stop codons? How many open reading frames? What about in the lac operon? (Compare these to the GFP operon in pGLO.)
  5. How tightly does RNA polymerase bind to the trp promoter? How tightly does it bind to the lac promoter? Why does this matter?
  6. Why is the level of glucose relevant to the expression of the lac operon (and not the trp operon)?
  7. Why does the lac operon use both positive and negative gene regulation, while the trp operon uses only negative regulation?
  8. Finish the sentence: RNA polymerase binds to the _______ region to start the process of ________.
  9. Finish the sentence: A ribosome binds to the _______  to start the process of ________.
  10. Which operon(s) are regulated by feedback inhibition?

References and further reading

Reading in Campbell

Chapter 17: Gene Expression: From Gene to Protein. Introduces transcription and translation.

Chapter 18: Regulation of Gene Expression. The beginning of this chapter describes the trp and lac operons.

Other sources

Prokaryotic Gene Regulation from OpenStax Biology, a free online textbook. Includes videos.

PDB-101. This site is all about proteins, and has some excellent images and descriptions of some of the proteins involved in the trp and lac operons:

Beyond Bio 6B

Brave Genius: A Scientist, a Philosopher, and Their Daring Adventures from the French Resistance to the Nobel Prize by Sean B. Carroll. This fascinating book describes the intersecting lives of molecular biology pioneer Jacques Monod (co-discoverer of the lac operon) and philosopher Albert Camus. The two were in the French Resistance during World War II, and remained friends throughout their lives. Way beyond Bio 6B, but highly recommended. Biologists should know some history, and this is a great story. The author, Sean B. Carroll, is a professor of molecular biology and has written several other excellent books.

 
