Proteins can be purified from other cellular components using a variety of techniques such as ultracentrifugation, precipitation, electrophoresis, and chromatography. The process usually begins with cell-conditioned media (for secreted proteins) or with cell lysates, in which the cell membrane is disrupted and the internal contents are released into a crude extract (for intracellular proteins). The resulting starting material can be purified by fractionation of the extracellular or intracellular components. Various types of chromatography are used to isolate the protein of interest based on its physical and chemical properties, such as size, net charge, hydrophobicity, and affinity for a binding partner. Genetic engineering is often used to add tags or fusions that help purify specific proteins from complex mixtures. For example, the His-tag and the Fc fusion are the most commonly used in protein preparation (see the list and discussion in the previous topic).
The objective of a protein purification process is to yield the maximal amount of functional protein with minimal contaminants. The procedure should be optimized to use the minimum number of steps in the least amount of time while ensuring the highest yield of functional protein. Another important consideration is the downstream application of the purified product: both the quantity and the purity must be sufficient for experimental analysis. In addition, properly folded and functional protein is normally required for downstream studies. During purification and subsequent storage, many events can affect protein quality and stability, including unfolding, aggregation, degradation, and loss of activity. Careful planning to purify the protein as quickly as possible under optimal stabilizing conditions maximizes the chance of a successful purification.
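The case for keeping the number of steps small follows from simple arithmetic: the overall recovery is the product of the per-step recoveries. The sketch below illustrates this in Python. The function name and the 80% per-step recovery are hypothetical values chosen for illustration, not measured data.

```python
from math import prod


def overall_yield(step_yields):
    """Return the cumulative recovery across a series of purification steps.

    Each entry is the fraction (0 to 1) of functional protein recovered
    in one step; the overall recovery is their product.
    """
    return prod(step_yields)


# Hypothetical per-step recoveries, for illustration only.
three_steps = [0.80] * 3
five_steps = [0.80] * 5

print(f"3 steps: {overall_yield(three_steps):.0%}")
print(f"5 steps: {overall_yield(five_steps):.0%}")
```

Under this assumption, three steps retain roughly half of the starting protein and five steps roughly a third, which is why each added step should be justified by the purity it gains.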
Chromatography in Protein Purification
Four major types of column chromatography are routinely employed in protein purification: affinity chromatography, ion exchange chromatography (IEX), hydrophobic interaction chromatography (HIC), and size exclusion chromatography (SEC). Most purification processes require one or more of these four procedures to reach the homogeneity required for downstream analysis and application. Choosing the appropriate chromatographic types and the order in which they are run is essential to optimizing a purification process. Analyzing the sequence of a recombinant protein can reveal characteristics that help with its purification. For example, the molecular size and the net charge of the protein (at a specific pH) can be calculated, and large stretches of hydrophobic residues can be located, to guide the selection of chromatographic types (see the table below).
Chromatography Type | Separation Mechanism | Binding Condition | Elution Condition
Affinity | Interaction with an immobilized ligand | No competing ligand | Competing ligand
Ion Exchange | Net surface charge | Low ionic strength | High ionic strength
Hydrophobic Interaction | Hydrophobicity | High ionic strength | Low ionic strength
Size Exclusion | Hydrodynamic radius or size | n/a | n/a
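The sequence analysis mentioned above can be sketched in a few lines of Python. The example estimates the molecular weight from standard average residue masses and flags hydrophobic stretches with the Kyte-Doolittle hydropathy scale. The function names, the 9-residue window, and the 1.6 threshold are assumptions made for illustration, and the example sequence is hypothetical.

```python
# Average residue masses in Da; one water (18.015 Da) is added per chain.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def molecular_weight(seq):
    """Approximate average molecular weight (Da) of an unmodified chain."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def hydrophobic_stretches(seq, window=9, threshold=1.6):
    """Return (start_index, mean_hydropathy) for each sliding window whose
    mean Kyte-Doolittle score exceeds the threshold (both are assumptions)."""
    hits = []
    for i in range(len(seq) - window + 1):
        mean = sum(HYDROPATHY[aa] for aa in seq[i:i + window]) / window
        if mean > threshold:
            hits.append((i, round(mean, 2)))
    return hits


# Hypothetical sequence: a polar segment followed by a hydrophobic one.
example = "MKDEESGRKDLLVVIALLAVLGSK"
print(f"{molecular_weight(example):.0f} Da")
print(hydrophobic_stretches(example))
```

A protein with many flagged windows is a candidate for HIC, while the estimated mass helps choose an SEC matrix with a suitable fractionation range. Post-translational modifications such as glycosylation are not accounted for here.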
Affinity Chromatography
Affinity chromatography relies on the specific and reversible binding of a protein to a matrix-immobilized ligand. The ligand can bind either directly to the protein of interest or to a tag that is genetically fused to the protein. Proteins can also be affinity purified in a non-selective manner when the immobilized ligand binds a group of proteins with similar binding behaviors. Affinity chromatography is often the most robust purification procedure and is typically used in the early stages of a purification process. Depending on the downstream application, affinity purification might be the only chromatographic step required to achieve adequate purity. The two most commonly used affinity chromatographic methods in recombinant protein purification are immobilized metal (nickel) and protein A/G affinity chromatography.
Immobilized Metal Affinity Chromatography
Six or more tandem histidines form a metal-binding structure. Proteins tagged with poly-histidine (His-tag) can therefore be purified by their affinity for nickel or cobalt using Immobilized Metal Affinity Chromatography (IMAC) columns. The His-tag has become the most commonly used tag for protein purification and preparation in almost all expression systems, including mammalian systems. IMAC is widely employed because its specific binding and elution conditions are less damaging to the protein than those of other methods. For example, a nickel-nitrilotriacetic acid (Ni2+-NTA) chelate can be covalently attached to a cross-linked agarose matrix for the selective purification of His-tagged recombinant proteins (see the diagram below).
His-tag can coordinate to the Ni2+ ion by replacing the bound water molecules (indicated by two red arrows). Imidazole can then be used to elute the protein, through its ability to coordinate with the Ni2+ ion and displace the bound protein.
Antibody Affinity Chromatography
Antibodies contain several regions of interest: Fc (fragment, constant), which is highly homologous among all antibodies of the same class, and Fab (fragment, antigen binding), which makes direct contact with the specific antigen. Specific classes of antibodies can be non-selectively purified by affinity chromatography in which the ligand (Protein A, G, or L) is conjugated to a cross-linked matrix. For example, a protein A or G affinity matrix allows the rapid and efficient purification of IgGs and derived Fc-fusion proteins. Typically a single step of protein A or G affinity chromatography yields products with more than 90% purity. Protein A, G, or L can also be used to purify specific proteins. In this case the antibody serves as an intermediate ligand that provides selectivity for its antigen (see the diagram and discussion below).
Antibodies are often purified based on the highly specific interaction between an antibody and its antigen or immunogen. For example, a peptide containing the antigen can be coupled to a matrix for specific binding of the antibody. Lowering the pH of the elution buffer disrupts the antibody-peptide interaction and releases the bound antibody. This method is routinely used to isolate specific antibodies from polyclonal antisera. In the reverse arrangement (e.g., an antibody immobilized on protein A or G), the antigen-binding (Fab) region of the antibody remains available for binding its specific antigen. Thus, a specific target protein or antigen can be purified through its interaction with the coupled antibody and eluted with a competing antigen, such as a peptide containing the epitope recognized by the antibody.
In many cases, one or a few steps of affinity purification do not give sufficient purity. Additional purification steps, such as ion exchange and size exclusion chromatography, are then needed.
Ion Exchange Chromatography
Ion exchange chromatography (IEX) separates proteins based on their net surface charge, through electrostatic interactions between the proteins and a charged matrix. Two types of IEX exist: anion exchange, in which a positively charged matrix binds negatively charged proteins, and cation exchange, in which a negatively charged matrix binds positively charged proteins.
Ion exchange chromatography is commonly used as an intermediate step in a protein purification process; however, it can yield high resolution for some proteins when used earlier or later in the purification. All proteins exhibit a net charge that depends on the amino acid composition of the protein and the pH of the buffer used.
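The dependence of net charge on composition and buffer pH can be sketched with the Henderson-Hasselbalch equation. A protein carrying a net negative charge at the working pH binds a positively charged (anion exchange) matrix, and a net positive protein binds a negatively charged (cation exchange) matrix. In the Python sketch below, the pKa values are approximate and assumed for illustration (published sets differ, and folding shifts them), and all function names are hypothetical.

```python
# Approximate pKa values (an assumption; published sets differ).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 8.0
C_TERM_PKA = 3.5


def net_charge(seq, ph):
    """Estimate the net charge of an unfolded chain at the given pH."""
    # Fraction protonated (charged) for basic groups.
    positive = 1 / (1 + 10 ** (ph - N_TERM_PKA))
    positive += sum(1 / (1 + 10 ** (ph - POSITIVE_PKA[aa]))
                    for aa in seq if aa in POSITIVE_PKA)
    # Fraction deprotonated (charged) for acidic groups.
    negative = 1 / (1 + 10 ** (C_TERM_PKA - ph))
    negative += sum(1 / (1 + 10 ** (NEGATIVE_PKA[aa] - ph))
                    for aa in seq if aa in NEGATIVE_PKA)
    return positive - negative


def isoelectric_point(seq, tol=0.01):
    """Find the pH of zero net charge by bisection between pH 0 and 14."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid  # still positive: the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2


def suggest_exchanger(seq, ph):
    """Suggest the IEX mode from the sign of the estimated net charge."""
    charge = net_charge(seq, ph)
    if charge < 0:
        return "anion exchanger"
    if charge > 0:
        return "cation exchanger"
    return "no net charge at this pH"


# Hypothetical acidic sequence evaluated at pH 7.4.
example = "MDEEDLSGEEAKDDE"
print(round(isoelectric_point(example), 1), suggest_exchanger(example, 7.4))
```

These are estimates for planning only: IEX responds to surface charge, which can differ from the sequence-based net charge, so buffer pH is usually confirmed empirically.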
Hydrophobic Interaction Chromatography
Hydrophobic interaction chromatography (HIC) separates proteins based on their hydrophobicity and is often used as an intermediate step in a purification process. Because proteins bind the matrix in a high ionic strength buffer, HIC is typically performed immediately after ion exchange chromatography, with no buffer exchange or dilution required. HIC is also commonly performed after ammonium sulfate precipitation, a procedure that quickly concentrates or removes proteins by precipitating some, but not all, of them with salt. HIC can also be applied in the early steps of a purification process or as a final step to remove trace impurities from the protein of interest.
Size Exclusion Chromatography
Size exclusion chromatography (SEC) separates proteins by their hydrodynamic radius, a characteristic determined by both the size and the shape of the protein. Unlike the chromatographic procedures described previously, proteins do not bind to the cross-linked matrix in SEC. Instead, they are separated by the speed at which they migrate through the matrix. SEC is most often used in the final steps of a purification process because it can differentiate between different forms of a protein. For example, oligomeric, unfolded, and degraded forms can all be separated from the native protein while the buffer system is exchanged at the same time. SEC is therefore also used as a fast and reliable method for buffer exchange.
G&P Biosciences Protein Production Capability
G&P Biosciences benefits from a number of technical advantages that provide the best chance of expressing a protein in soluble, active form with high yield. We have experience and expertise in all stages of protein expression, from codon optimization, vector/promoter choice, tag/fusion use, culture medium, additives, feeding, host cell engineering, and stable cell line creation to protein purification and formulation (see a typical workflow below). We develop unique, high-throughput mammalian cell expression systems that allow us to exploit every available technique to improve the productivity of recombinant protein expression. With innovative solutions for varying conditions and parameters, we can deliver recombinant proteins of the highest possible quality to our clients with a short turnaround time.
We offer custom recombinant protein expression and purification using mammalian cell expression systems (in particular HEK293 and CHO) for small or large quantities (up to grams) of protein. Our unique mammalian cell expression systems, which are based on our high-throughput gene expression technologies, exhibit significantly improved success rates and production yields. Our custom-made recombinant proteins are expressed in mammalian cells grown under animal component-free, antibiotic-free conditions, purified to the highest homogeneity possible, and formulated in a carrier-free or customer-specified buffer suitable for downstream in vitro and in vivo assays (visit our "Protein Products" and "Protein Services" pages to learn more and order).
Protein Classification
The human proteome is encoded by ~20,000 genes, which make up only 1–2% of our genome. According to their structural features, proteins can be broadly divided into three main classes: globular proteins, fibrous proteins, and membrane proteins. Most globular proteins are soluble, and many of them are enzymes. Fibrous proteins are often structural, such as collagen, the major component of our connective tissues. Membrane proteins often serve as receptors or channels for specific extracellular molecules, helping them transmit signals or pass through the cell membrane. Many membrane proteins play important roles in cell signaling, immune responses, cell adhesion and migration, and many other pathways.
Proteins can also be classified by class and function, as well as by the biological processes and pathways in which they participate (see the pie chart above for an example of the classification of the human proteome by class). Of the 22,340 proteins encoded by our genome, about 37% (8,309) are enzymes, the most abundant class, including hydrolases, transferases, oxidoreductases, proteases, kinases, ligases, phosphatases, lyases, and isomerases. About 22% (4,885) are regarded as membrane proteins, including cell junction proteins, receptors, transporters, cell adhesion molecules, extracellular matrix proteins, membrane traffic proteins, and transmembrane receptor regulatory/adaptor proteins. Nucleic acid binding proteins account for 2,724 (~12%) and transcription factors for 2,041 (~9%). Interestingly, only ~6% (1,286) are cytoskeletal and structural proteins.
In the post-genomic era, the continuing growth in demand for recombinant proteins has led to the development of many strategies and techniques to improve production yield, purity, and quality. For example, with advanced molecular biology and chemical synthesis of DNA, it is now possible to synthesize almost any gene of interest for the expression of a recombinant protein. Moreover, to improve the yield, the codons of the gene can be optimized for the selected expression system and host species. However, the optimal expression conditions differ from one protein to another because of intrinsic variability in quantity, solubility, stability, and functionality. It is therefore crucial for any laboratory producing recombinant proteins to be able to test as many different expression conditions as possible in a timely manner.
Bacterial systems are commonly used for recombinant protein expression. In cases where post-translational modifications are required for proper activity or folding, researchers prefer a eukaryotic protein production system such as mammalian cells. The choice of system ultimately depends on the nature of the protein to be produced and the application in which the recombinant product will be used. Mammalian cell expression systems provide the most comprehensive post-translational modifications, which are critical to protein function. For example, therapeutic proteins including antibodies (composed of 82-96% protein and 4-18% carbohydrate) often require glycosylation that can only be achieved in a mammalian expression system.
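Codon optimization can be illustrated with a deliberately naive sketch: back-translate the protein using one preferred codon per amino acid for the chosen host. The preference table below is an assumed, illustrative choice for a mammalian host and should be replaced with a real codon usage table. Practical tools also balance GC content, repeats, and restriction sites, which this Python example ignores. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Assumed preferred codon per amino acid (illustrative, not authoritative).
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(protein):
    """Encode a protein using the preferred codon for every residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


def translate(dna):
    """Translate an in-frame DNA sequence with the standard genetic code."""
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - len(dna) % 3, 3))


# HA epitope tag as the example; the round trip must return the same protein.
ha_tag = "YPYDVPDYA"
dna = back_translate(ha_tag)
print(dna, translate(dna) == ha_tag)
```

The round-trip check is the important design point: whatever codon choices are made, the encoded amino acid sequence must be unchanged.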
Protein Fusions and Epitope Tags
When expressing and purifying large quantities of soluble proteins, the major obstacles often include poor yield and solubility due to the formation of aggregates. There are a number of advances in recombinant protein expression, including the optimization of expression vector and host system, improving transcriptional level and stability, increasing translation with host-specific codon optimization, maximizing the use of secretory pathways, co-expression with chaperones, and decreasing proteolytic degradation. In many cases, protein fusions or peptide (epitope) tags can simplify the detection and purification, improve the solubility, and/or promote the proper folding of the protein of interest.
Protein fusions are fusion partners of more than a dozen amino acids, such as GFP, while epitope tags are short peptides, such as HA, Myc, and poly-histidine (His-tag). Many of them are used as affinity tags for protein purification because they bind a specific chemical ligand or antibody. Nowadays almost all recombinant proteins are expressed with fusions or tags. The localization and expression of the protein of interest can be monitored through the tag or fusion, even without a suitable detecting antibody for the protein itself. For example, tags or fusions can be used for protein detection by Western blot, ELISA, IP, flow cytometry, immunohistochemistry, and fluorescence microscopy, and some of them can also be used for protein purification (see the list of commonly used protein fusions and tags below).
Tag / Fusion | Sequence or Size | Detection Method | Purification Method
His-tag | HHHHHH or more | Antibody | Ni2+ or Co2+ / Imidazole
Fc | ~230 aa | Protein A/G, Antibody | Protein A or G
HA | YPYDVPDYA | Antibody | Antibody / Peptide
Myc | EQKLISEEDL | Antibody | Antibody / Peptide
Flag | DYKDDDDK | Antibody | Antibody / Peptide
Strep-tag II | WSHPQFEK | Streptavidin | Streptavidin / Biotin
GFP | ~220 aa | Fluorescence, Antibody | Not used
GST | ~220 aa | Antibody | Glutathione
SUMO | ~100 aa | Antibody | Not used
MBP | ~400 aa | Antibody | Amylose / Maltose
Thioredoxin | ~100 aa | Antibody | Not used
HRP | ~320 aa | Substrate (TMB, ECL) | Not used
Proper design and judicious use of the right fusion or tag can enhance the stability and solubility of the protein of interest. There are a few basic considerations in designing the tag or fusion. For example, the sequence of the tag/fusion must be in-frame with that of the protein of interest. Codon usage should be considered when a different host species is used. The tag or fusion can be placed at either end of the target protein. Some fusions or tags allow easy in situ detection, such as the HRP fusion (a mammalian expression optimized version of the enzyme is offered by G&P Biosciences for both intracellular and extracellular expression). Some can be used in tandem to combine desired features or applications (e.g., GFP and His-tag together for detection and purification). Some can be used to solve significant protein expression problems, such as extending half-life (e.g., Fc fusion) or improving protein solubility and folding (e.g., SUMO, MBP, thioredoxin, or GST fusion).
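The in-frame requirement can be checked mechanically before a construct is ordered. The Python sketch below tests an N-terminal tag joined to a target coding sequence: the tag length must be a multiple of three, the open reading frame should start with ATG, and no stop codon may appear before the end. The function name, the checks chosen, and the example sequences are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_fusion_frame(tag_dna, target_dna):
    """Return a list of problems found in an N-terminal tag + target ORF.

    An empty list means the simple checks below all passed.
    """
    problems = []
    orf = (tag_dna + target_dna).upper()
    if not orf.startswith("ATG"):
        problems.append("ORF does not start with ATG")
    if len(tag_dna) % 3 != 0:
        problems.append("tag length is not a multiple of 3: target out of frame")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]
    for number, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {number}")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("ORF does not end with a stop codon")
    return problems


# Hypothetical example: Met + 6xHis tag, then a short target ending in a stop.
his_tag = "ATG" + "CAC" * 6
target = "GGCAGCGAGAACCTGTAA"
print(check_fusion_frame(his_tag, target))        # in-frame construct
print(check_fusion_frame(his_tag + "C", target))  # one extra base shifts the frame
```

The same check applies to a C-terminal tag by swapping the arguments, provided the stop codon is removed from the target and placed after the tag.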
Cleavage Proteases
It is often not necessary to remove the tag/fusion, especially in the case of epitope tags, because of their small size. In addition, many Fc-fusion proteins are used successfully in the clinic as therapeutic agents. However, a linker can easily be added between the tag/fusion and the protein of interest to enable, for example, cleavage of the tag/fusion without interfering with the structure and activity of the target protein. The key criteria for selecting a cleavage site and protease are that the protease must be highly specific for its site in the tag or linker and that the target protein must not contain the recognition site.
Thrombin:
Due to its high specificity, thrombin is one of the most commonly used tag/fusion cleavage proteases. It cleaves after the arginine residue in the consensus cleavage site Leu-Val-Pro-Arg|Gly-Ser (L-V-P-R|G-S), which is often included in the linker region. After cleavage, residual amino acids from the recognition site remain on the cleaved products.
Factor Xa:
Factor Xa is also commonly used to remove the tag/fusion. It cleaves after the arginine residue in its preferred cleavage site Ile-(Glu/Asp)-Gly-Arg| (I-(E/D)-G-R|) and can thus completely remove a tag from the N-terminus of the target protein. However, it sometimes cleaves at other basic residues, depending on the conformation of the target protein. It does not cleave a site followed by proline or arginine.
TEV:
Tobacco etch virus (TEV) protease has a specific recognition site and cleaves with very high precision. It cleaves between Gln and Gly/Ser in the consensus site Glu-Asn-Leu-Tyr-Phe-Gln|(Gly/Ser) (E-N-L-Y-F-Q|(G/S)). Its activity is not inhibited by low concentrations of urea, which can be used to prevent protein aggregation and increase protein solubility.
Enterokinase:
Enterokinase recognizes the sequence Asp-Asp-Asp-Asp-Lys| (D-D-D-D-K|) and cleaves on the carboxyl side of the lysine. The FLAG-tag (DYKDDDDK) contains such a cleavage site.
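Whether a target protein already contains one of these recognition sites can be screened with simple pattern matching. The Python sketch below encodes the four sites listed above as regular expressions, including the rule that Factor Xa does not cleave when the site is followed by proline or arginine. The function name and example sequences are hypothetical, and real proteases may also cleave at secondary sites that this check does not model.

```python
import re

# Recognition sites from the text above, written as regular expressions.
PROTEASE_SITES = {
    "Thrombin": r"LVPRGS",
    # Negative lookahead: no cleavage when followed by Pro or Arg.
    "Factor Xa": r"I[ED]GR(?![PR])",
    "TEV": r"ENLYFQ[GS]",
    "Enterokinase": r"DDDDK",
}


def find_protease_sites(protein):
    """Map each protease to the 1-based start positions of its sites.

    Proteases with no site in the sequence are omitted from the result.
    """
    hits = {}
    for protease, pattern in PROTEASE_SITES.items():
        positions = [m.start() + 1 for m in re.finditer(pattern, protein)]
        if positions:
            hits[protease] = positions
    return hits


# Hypothetical target carrying a FLAG-tag: enterokinase would cut inside it.
print(find_protease_sites("MDYKDDDDKGSAVLIEGRPTT"))
```

An empty result for a given protease means its site is absent from the target, so that protease can be placed in the linker without risk of cutting the target at a canonical site.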
In addition to the above proteases, SUMO can be cleaved by SUMO protease, which recognizes the conformation of SUMO rather than a specific amino acid sequence. G&P Biosciences offers cleavage of the tags or fusions used for purification as an integral part of the choice of fusion or tag. We prefer TEV cleavage (the recognition site can easily be integrated between the tag/fusion and the protein of interest during gene synthesis and cloning), performed on-column following affinity purification. We have a panel of expression vectors with built-in TEV cleavage sites, allowing high-throughput cloning and expression of genes of interest.
Overall, the choice of tag or fusion depends on the nature and downstream application of the expressed protein. Every protein is unique, and no single fusion/tag or cleavage method will answer every need. The His-tag and Fc fusion are the most commonly used, and others are easy to apply as needed. Through genetic engineering, researchers can alter the protein sequence and hence its structure, binding behavior, physical properties, or even biological function. For example, unnatural amino acids can be incorporated into a recombinant protein using artificial tRNAs, which may allow the rational design of new proteins or protein-like biologics with novel affinity moieties for easy detection, modification, and/or purification.