Overproduction of proteins in E. coli.
General ref: Current Protocols in Molecular Biology, Ausubel et al Eds.
Objectives
To understand the following issues with respect to production of foreign proteins in E. coli.
1. The need to provide an E. coli promoter and ribosomal binding site.
2. The need to keep expression turned off during growth and propagation of the clone.
3. Problems related to stability and purification.
4. Use of affinity purification systems.
5. Recombinant Phage display.
Reasons for over-expressing proteins.
1. To purify large amounts for study or for sale.
2. To purify from a more convenient heterologous organism.
3. To purify away from other components of the originating organism.
4. As a prelude to in vitro mutagenesis.
Overview
The amount of care necessary to successfully express a foreign protein in E. coli depends on how much yield you need. If you're just trying to get enough to detect activity, then almost any fusion to a valid E. coli promoter will probably do. For many research purposes, expressing the protein as about one percent of the bacterial protein is probably more than enough. If the gene comes with its own promoter, this may be achievable by simply putting the gene on a multicopy vector. If the gene is without a promoter (a cDNA for example), one can get this level of expression by fusing it to any of a number of strong E. coli promoters. At this level of expression, one is mainly concerned with avoiding problems caused by some noxious property of the gene or its product (e.g. instability, failure to fold, toxicity to the host, mRNA degradation signals in the untranslated regions).
Other purposes, for example supporting structural studies, require high yields. Yields in excess of 40% of total E. coli protein can be obtained. To reach these yields, one should expect to optimize every step in the expression pathway. Getting high level transcription is usually not too hard. One may have to supply optimal translational start signals, to supply a transcriptional terminator, to remove some nonoptimal codons, to remove or replace untranslated regions, and to be prepared to recover large amounts of insoluble protein and refold it.
This image is from New England Biolabs advertising information for one of their affinity expression systems. The amount of the recombinant protein produced on top of total cell protein can be seen in lane 3. (http://www.ebiotrade.com/buyf/neb/newprt/A33.asp)
Typically, after induction and an expression period, one spins down about a ml worth of cells, cooks them in SDS loading buffer, and then analyzes them by SDS PAGE. This is total protein, including insoluble inclusion bodies, membrane proteins, and soluble cytoplasmic protein. To distinguish whether the protein is in the soluble or insoluble fraction, one would open the cells by sonication, separate soluble and insoluble fractions, cook the insoluble fraction in SDS, and load each on the SDS polyacrylamide gel. In order to carry on with the affinity purification as indicated above, the protein would have to be in the soluble fraction. If it's in inclusion bodies, there is a series of washes to purify the inclusion bodies, and then one would have to denature and renature the protein before carrying on. If the protein is in the membranes, it may be solubilized by gentle detergent treatment, e.g. in Triton X-100.
When high level expression is coupled to in vitro mutagenesis, one should expect additional problems with the mutants. Mutant proteins are generally less stable, and therefore more susceptible to degradation and insolubility. Multiple mutations cause progressively more trouble.
Affinity Systems.
Most high level expression experiments in E. coli are done by making a fusion with some protein that is easily purified by affinity chromatography. Usually the fusion partner comes as part of the expression vector and will be the N-terminal domain of the construct. This is so that the novel sequence added is not near the translation signals and is not at risk of forming secondary structure with them. Typically the vector contains a cloning site just downstream of a proteolytic cleavage site. Your insert would most easily be added by PCR amplification with primers designed to add a 5' extension serving both to provide the restriction site and to supply an appropriate translational fusion.
The above figure is from Promega's advertisements for expression vectors (http://www.promega.com/vectors/bacterial_express_vectors.htm).
Note that in this typical case, one has to make the 5' primer to add the restriction site of choice and keep the fusion protein in phase:
For example:
MetLeuLysLeuMetLeuProSerGlnAspSer
ATGCTTAAGCTTATGTTGCCCTCTCAGGACAGC
might be used together with the HindIII cleavage site in the Xa-3 vector (where Met Leu Pro is the beginning of the natural protein). However, the protein after factor Xa cleavage will have the N-terminal sequence Glu Lys Leu Met Leu Pro ... There are a few vectors designed to get your protein back out without extra residues on the N-terminus.
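The reading frame of a primer like the one above can be checked quickly in silico. The following Python sketch translates the DNA line as printed and locates the HindIII recognition sequence (AAGCTT) within it. The function names and the abbreviated codon table are illustrative only; they are not taken from any vendor's software, and the vector's own reading frame still has to be checked against its map.

```python
# Abbreviated codon table: only the codons present in the example primer
# (standard genetic code).
CODONS = {
    "ATG": "Met", "CTT": "Leu", "AAG": "Lys", "TTG": "Leu",
    "CCC": "Pro", "TCT": "Ser", "CAG": "Gln", "GAC": "Asp", "AGC": "Ser",
}

HINDIII = "AAGCTT"  # HindIII recognition sequence


def translate(dna):
    """Translate frame 1, three bases at a time; '???' marks codons not in the table."""
    return [CODONS.get(dna[i:i + 3], "???") for i in range(0, len(dna) - 2, 3)]


def site_frame(dna, site=HINDIII):
    """Return (0-based position of the site, its offset within the reading frame)."""
    pos = dna.find(site)
    if pos < 0:
        raise ValueError("site not found")
    return pos, pos % 3


primer = "ATGCTTAAGCTTATGTTGCCCTCTCAGGACAGC"
print(" ".join(translate(primer)))
print(site_frame(primer))
```

An offset of 0 means the restriction site begins on a codon boundary of the primer's own frame. Note that in the standard genetic code CAG encodes Gln.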
Since there is PCR involved, expect to resequence the clone to rule out inadvertent PCR-induced mutations.
General methods of boosting expression.
1. Increase copy number of the gene.
2. Fuse to more powerful transcription and/or translation signals.
(e.g. lac, lambda PL, trp, tac, beta-lactamase)
Problems and potential solutions:
1. Codon preferences
· Resynthesize gene or segments thereof with favored codons, particularly codon #2, or replace runs of adjacent unfavorable codons.
· Use host strain with extra tRNAs.
2. Degradation of protein:
· lon- host.
· Fusion to another protein may stabilize small proteins.
· Use protease inhibitors after opening cells.
3. Insolubility of the expressed protein:
a. Find in inclusion bodies and solubilize by denaturation and renaturation.
b. Solubilize under nondenaturing conditions.
c. Increase solubility by use of a fusion partner.
d. Look out for missing cofactors (like metal ions) in the growth medium.
e. Co-express with a chaperonin.
f. Try growth at reduced temperature.
g. Express at a reduced rate to give the protein a chance to fold.
h. Be happy with the soluble portion. (But if it's a small portion, beware that it might represent mistranslated or unfolded material.)
4. Expression of the protein is toxic to E. coli - Use a tightly controlled promoter to keep expression turned off until the clone has been grown up.
5. Instability of the plasmid. This problem is particularly bad when the plasmid is maintained with ampicillin (or other antibiotic resisted by a beta-lactamase).
a. Keep expression suppressed during growth.
b. Eliminate unnecessary passages.
c. Consider a vector with a better antibiotic selection.
d. Use a recA- host.
6. Problems related to fusion partners.
· Protease to cleave the fusion domain off may cleave inside your protein.
· Cleavage at the protease cleavage site may be inhibited by presence of your protein.
· Extra residues added to protein may change its properties.
· Your protein may interfere with binding of the partner to its affinity resin.
· Your segment of mRNA may form secondary structure with the translation signals.
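For the codon preference problem (item 1 above), a gene can be screened for unfavorable codons before deciding what to resynthesize. The following is a minimal Python sketch, assuming a small illustrative set of codons often cited as rare in E. coli; the set, the function name, and the test sequence are all hypothetical, and a real analysis should use a codon usage table for the actual host strain.

```python
# Illustrative set of codons often cited as rare in E. coli (an assumption;
# consult a codon usage table for the host actually used).
RARE = {"AGA", "AGG", "ATA", "CTA", "CGA", "CCC"}


def rare_codon_report(orf, rare=RARE):
    """Return (rare codon positions, runs of >= 2 adjacent rare codons).

    Positions are 1-based codon numbers; runs are (first, last) pairs.
    """
    codons = [orf[i:i + 3].upper() for i in range(0, len(orf) - 2, 3)]
    hits = [n for n, c in enumerate(codons, start=1) if c in rare]
    runs, start = [], None
    # Walk one position past the end so a run ending at the last codon is closed.
    for n in range(1, len(codons) + 2):
        is_rare = n <= len(codons) and codons[n - 1] in rare
        if is_rare and start is None:
            start = n
        elif not is_rare and start is not None:
            if n - start >= 2:
                runs.append((start, n - 1))
            start = None
    return hits, runs


orf = "ATGAGAAGGCTGCCCGAAATATAA"  # hypothetical test sequence
hits, runs = rare_codon_report(orf)
print("rare codons at:", hits)
print("adjacent runs:", runs)
print("codon 2 is rare:", 2 in hits)
```

Runs of adjacent rare codons and a rare codon at position 2 are reported separately because those are the cases singled out above for replacement.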
Examples:
Somatostatin - Itakura et al. (1977) Science 198, 1056.
Somatostatin is a peptide hormone. From the known amino acid sequence, a somatostatin gene was synthesized with E. coli codon preferences. It was expressed from the lactose promoter with and without fusion to beta-galactosidase, and the fusion was found to stabilize the peptide. The fusion was made after a Met residue so that somatostatin could be recovered from the fusion protein by cyanogen bromide cleavage. The unfused construct produced no detectable somatostatin, and the fusion construct produced a disappointingly low yield of insoluble protein.
This was the first published attempt to mass produce a eucaryotic protein in E. coli. It mainly served to anticipate some of the problems that must be overcome for successful mass expression. The solubility problem remains something that requires a customized solution for each protein, although stable globular proteins do better than short peptides. This experiment did establish the strategy of fusing foreign peptides to a carrier protein to stabilize them. More specific means of cleaving the fusion junction are now available.
The low yield was related to a failure to adequately down-regulate the expression of the insert while the clone was being grown and propagated. The lac regulation was overpowered by the copy number of the vector (pBR322). Even though pBR322 exists in only about 20 molecules per cell, this is enough to titrate out the available lac repressor. This causes partially constitutive expression of the insert, which in turn selects for deletions that take out the promoter or the insert.
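The titration argument is simple bookkeeping, and can be sketched as a count of repressors against operators. The Python sketch below is only a stoichiometric tally, not a binding equilibrium. The figure of about 10 repressor tetramers per cell is a commonly quoted value for a wild-type host and is used here as an assumption; the value of 100 for a repressor-overproducing host is likewise illustrative, as is the function name.

```python
def repressor_balance(repressor_tetramers, plasmid_copies,
                      operators_per_plasmid=1, chromosomal_operators=1):
    """Repressor tetramers left over after assigning one to each operator.

    A negative result means some operators go unoccupied, i.e. the
    promoter is expected to leak.
    """
    operators = plasmid_copies * operators_per_plasmid + chromosomal_operators
    return repressor_tetramers - operators


# Assumed values: ~10 tetramers per cell (wild type), ~100 (overproducer),
# and the ~20 plasmid copies per cell quoted in the text for pBR322.
wild_type = repressor_balance(repressor_tetramers=10, plasmid_copies=20)
overproducer = repressor_balance(repressor_tetramers=100, plasmid_copies=20)
print(wild_type, overproducer)
```

Under these assumed numbers the wild-type host comes up short of repressor while the overproducing host has a surplus, which is the sense in which a modest copy number can overpower lac regulation.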
It is a common error for people to get a poor yield and blame it on degradation, when what really happened is that the gene or promoter was already genetically damaged in the construct by the time they looked for expression. The classic method to investigate protein degradation is to pulse label with [35S]-methionine and observe that the protein really is produced and then degraded. (Note: Look to be sure the protein has an internal Met codon first; the initiator Met is often removed by posttranslational processing.) An alternative would be to do a western blot. For several of the affinity tag systems, one can obtain commercial antibody to the tag, which could be used for this purpose. However, with modern expression systems, the recombinant protein should be obvious on a simple Coomassie stained SDS gel. One should run both a sample of the cell lysate and a sample obtained by cooking the insoluble cell debris in SDS. It will often be true that the major portion of the expressed protein is in the insoluble fraction.
Another symptom of genetic instability caused by expression leakage is that the yield drops off precipitously as the clone is propagated. So the clone might produce a great yield in a small pilot experiment, and then make almost nothing when scaled up to several liters. One should consider keeping back a small sample of the culture to allow examination of the plasmid DNA itself after the fact. Genetic instability will often show up as a heterogeneous set of deletions. However, you need to keep in mind that point mutations in the promoter, or even mutations in the host background can also destroy the expression of the insert.
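One way to see why scale-up exposes instability is to count the extra generations it requires, since each generation is another opportunity for non-expressing variants to gain ground. The following Python sketch does the arithmetic; the culture volumes and the density of 1e9 cells per ml at harvest are round-number assumptions for illustration, not measurements.

```python
import math


def generations(initial_cells, final_cells):
    """Number of doublings needed to go from initial_cells to final_cells."""
    return math.log2(final_cells / initial_cells)


CELLS_PER_ML = 1e9                 # assumed density at harvest
pilot = 2 * CELLS_PER_ML           # hypothetical 2 ml pilot culture
large = 5000 * CELLS_PER_ML        # hypothetical 5 liter culture

print(round(generations(1, pilot), 1))      # doublings from one cell to the pilot culture
print(round(generations(pilot, large), 1))  # additional doublings for the scale-up
```

With these assumptions the scale-up adds roughly eleven doublings on top of those already needed to reach the pilot culture.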
Instability and ampicillin resistance.
The instability problem when growing expression clones is worse when trying to maintain the clone with ampicillin resistance than with other antibiotics. This is because ampicillinase (beta-lactamase) leaks out of the cells while they are growing in liquid culture and destroys the ampicillin in the culture fluid. After that, bacteria that lose the plasmid tend to overgrow the culture. A typical experience goes as follows:
1. The clones behave as expected on an ampicillin plate.
2. Small scale cultures produce the protein as expected.
3. An overnight preculture is prepared to start a large scale growth.
4. When the large culture is inoculated the next day, the optical density increases only slightly, and then decreases. To the practiced eye, there is an accumulation of stringy debris indicative of lysis.
5. The effect is not reproducible. Sometimes the large scale culture grows and sometimes it lyses. When it does grow, there can be a long lag phase, and the protein yield is typically less than anticipated from the small scale culture.
The explanation is that the ampicillin is cleared from the preculture and then ampicillin sensitive bacteria that have lost the plasmid overgrow to various degrees by morning. When the preculture is used to inoculate media with fresh ampicillin, the bacteria begin to grow. But they cannot synthesize cell wall due to the ampicillin, so they lyse.
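The overgrowth described above can be illustrated with a toy two-population model of a culture in which the ampicillin has already been destroyed, so that there is no longer any selection. In the Python sketch below, the loss probability per division and the growth advantage of plasmid-free cells are hypothetical parameters chosen only for illustration, and the function name is likewise made up.

```python
def plasmid_free_fraction(generations, loss_per_division=0.01,
                          growth_advantage=1.3):
    """Fraction of plasmid-free cells after a number of generations.

    One 'generation' is one doubling of the plasmid-bearing cells.
    loss_per_division: fraction of their daughters that lack the plasmid.
    growth_advantage: doublings of plasmid-free cells per such generation.
    Both defaults are illustrative assumptions.
    """
    bearing, free = 1.0, 0.0
    for _ in range(generations):
        new_free = free * 2 ** growth_advantage + 2 * bearing * loss_per_division
        new_bearing = 2 * bearing * (1 - loss_per_division)
        total = new_free + new_bearing
        # Renormalize so the two values remain fractions of the population.
        bearing, free = new_bearing / total, new_free / total
    return free


for n in (0, 10, 20, 30):
    print(n, round(plasmid_free_fraction(n), 3))
```

The fraction of plasmid-free cells rises with every generation, and how far it gets by morning depends on when the ampicillin ran out, which is consistent with the variable outcome from one preculture to the next.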
This problem shows similar symptoms to a T1 phage infestation. T1 is a bacteriophage of E. coli that survives dehydration and spreads as an airborne contaminant. It causes aggressive lysis, producing plaques on plates the size of a quarter. T1 infestation is rare, but when a culture is accidentally infected, lyses, and is then opened, it can spread enough airborne contamination throughout the lab or even an entire building that no one can grow E. coli cultures for years afterwards. This forces everyone to derive T1 resistant versions of all of their strains. This is a tremendous setback when it happens, hence everyone is advised, upon seeing a culture of E. coli lyse unexpectedly, to autoclave it without opening it. Clearly, it is inadvisable to have a background of cultures lysing unexpectedly due to this ampicillin selection problem, because it reduces vigilance against the T1 infestation problem.
When working with an expression plasmid based on ampicillin selection, special precautions are required to maintain the selection. The growth is generally done more continuously to avoid precultures going to saturation. However, the growth may still be done in stages with inoculation into fresh medium.
Some biotech companies are promoting expression vectors based on different antibiotics to counter this effect.
Human growth hormone - Goeddel et al. (1979) Nature 281, 544.
This is probably the first published successful mass expression of a eucaryotic protein in E. coli. Human growth hormone is a 191 residue peptide hormone. The first 24 codons were resynthesized with an Eco RI site upstream of the AUG convenient for joining to the lac promoter and ribosome binding site. The other end was made as a Hae III site. The synthetic segment was first cloned and sequenced in an independent vector to verify the correct sequence.
The cDNA was cloned as a Hae III fragment which omits the first 24 codons. The two parts of the gene were then ligated together and joined to an Eco RI site downstream of two lac promoters.
They used lacIq (an overproducer of lactose repressor) to get tighter control over expression, and a downstream transcriptional fusion to the tet resistance gene of pBR322 to guard against deletion. Upon induction, they got 20% of cellular protein as HGH.