Written by Yuning Wang, PhD
July 10, 2021
Introduction
Antibody sequences are critical for antibody engineering and protein characterization in therapeutic development. For antibody reagent users, knowing the sequence makes it possible to run sequence analysis and alignment, assess binding and potential cross-reactivity, and design experiments rationally.
There are a number of online antibody databases storing different types of information such as suppliers, usage data, and publication references. These types of antibody databases are often used for reproducibility purposes. However, other databases containing sequences from immunoglobulin genes or sequenced antibody proteins offer much more insight for antibody engineering and characterization.
When the relevant genome has not been sequenced, de novo protein sequencing is needed to determine the sequence. When the sequence is available, however, researchers can search for it in numerous antibody sequence databases.
Antibody sequence databases
UniProt
UniProt is a comprehensive, high-quality and freely accessible resource of protein sequence and functional information. UniProt does not specifically focus on antibodies; as such, it only includes a representative set of germline antibody sequences, and a limited number of non-germline antibody sequences.
IMGT
The international ImMunoGeneTics information system (IMGT) is the global reference in immunogenetics and immunoinformatics, a high-quality integrated knowledge resource specializing in the immunoglobulins (Ig), T cell receptors (TcR) and major histocompatibility complex (MHC) molecules of human and other vertebrates. IMGT comprises seven databases encompassing antibody sequence, genome, and structure. For example, IMGT/mAb-DB, the monoclonal antibodies database of IMGT, provides a unique expert-curated resource on antibodies, including monoclonal antibodies (mAbs), with clinical indications, and on fusion proteins for immune applications (FPIA).
abYsis
abYsis is a web-based antibody research system that includes an integrated database of antibody sequence and structure data. abYsis can be used in three main ways:
- to search the database to find information about sequences (and structures) including detailed annotations, identification of unusual residues, post-translational modification sites, etc.,
- to analyze trends across sequences in the database, and
- to enter one’s own sequences for analysis.
The ABCD (for AntiBodies Chemically Defined) database
Launched in 2020, the ABCD Database is a repository of sequenced antibodies, integrating curated information about the antibody and its antigen with cross-links to standardized databases of chemical and protein entities. Each antibody is assigned a unique ID number that can be used in academic publications to increase reproducibility of experiments. Their step-by-step tutorial explains how to obtain an antibody sequence.
SAbDab
The Structural Antibody Database (SAbDab) contains all the antibody structures available in the PDB, annotated and presented in a consistent fashion. Users can select structures by their annotations as well as by structural properties such as complementarity-determining region (CDR) loop conformation and variable domain orientation.
Thera-SAbDab
The Therapeutic Structural Antibody Database (Thera-SAbDab) tracks all antibody- and nanobody-related therapeutics recognized by the World Health Organization (WHO), and identifies any corresponding structures in SAbDab with exact or near-exact variable domain sequence matches.
AHo’s Amazing Atlas of Antibody Anatomy
AHo's Amazing Atlas of Antibody Anatomy (AAAAA) is a tool for antibody structural analysis, modelling and engineering. All variable domain sequences in AAAAA, both germline and rearranged, are numbered in an identical way. AAAAA is useful for checking antibody sequences for plausibility and for spotting unusual residues, sequencing mistakes, frameshifts, insertions, and deletions.
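As a rough illustration of a first-pass plausibility check, the Python sketch below flags any character in a sequence that is not one of the 20 standard amino acid codes, such as an ambiguous residue (X) or a stop (*). It is far cruder than the position-specific analysis that tools like AAAAA or abYsis perform and is not how those tools work. The function name and the example fragment are hypothetical and serve only as illustration.

```python
# One-letter codes for the 20 standard amino acids.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def flag_unusual_residues(sequence):
    """Return (position, character) pairs for characters outside the 20 standard codes.

    Positions are 1-based. Whitespace and line breaks are ignored,
    and lowercase input is accepted.
    """
    cleaned = "".join(sequence.split()).upper()
    return [
        (position, residue)
        for position, residue in enumerate(cleaned, start=1)
        if residue not in STANDARD_AA
    ]


# Hypothetical heavy-chain fragment containing an ambiguous residue and a stop.
example = "EVQLVESGGGLVQPGGSLRLSCAASGFTFSXYAMS*"
for position, residue in flag_unusual_residues(example):
    print(f"position {position}: unexpected character {residue!r}")
```

A sequence that passes this check can still contain frameshifts or implausible residues at conserved positions. Detecting those requires a consistent numbering scheme of the kind the databases above provide.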
Observed Antibody Space database
The Observed Antibody Space (OAS) database is a project to collect and annotate immune repertoires for use in large-scale analysis. It currently contains over one billion sequences from more than 79 studies. These repertoires cover diverse immune states, organisms (primarily human and mouse), and individuals. OAS now contains both unpaired and paired antibody sequences.
cAb-Rep
cAb-Rep is a database of curated antibody repertoires. It currently includes 306 B cell repertoires collected from 121 human individuals, including healthy, vaccinated, and autoimmune disease donors. The database holds 267.9 million heavy chain and 72.9 million light chain curated full-length V(D)J transcripts, which were generated using Illumina sequencing technology in previous studies.
What if the sequence is not available from online antibody databases?
Lack of sequence availability is a common problem. The sequences of a great number of industry antibodies are not public for proprietary reasons, and many genomes remain to be sequenced. Furthermore, because immune systems evolve constantly, global antibody discovery efforts keep producing novel antibodies that are not present in any database.
Researchers may use the aforementioned databases to find sequences of closely related antibodies and rely on bioinformatics tools to make predictions for the antibody of interest. However, predictions of this kind are not an ideal basis for confident, rational research. Moreover, when working with newly discovered or uncharacterized antibodies, online databases may not offer much value in acquiring sequence information.
Therefore, antibody sequencing techniques can often be used to determine and confirm the amino acid sequence of an antibody. Traditionally, researchers turn to hybridoma sequencing if they have the cell line for a monoclonal antibody. However, hybridoma sequencing has drawbacks that can cause inaccurate results. Worse, viable hybridomas are not always available. B-cell sequencing, including single-cell approaches, can give nucleotide sequence information (blueprints for thousands of possible antibodies) but may not be able to access the sequence of the full circulating antibody protein repertoire.
With our proprietary de novo protein sequencing platform, REmAb®, we are able to determine the primary amino acid sequences of monoclonal antibodies without the need for hybridoma cell lines or prior knowledge of DNA sequences. REmAb® directly analyzes the antibody protein (only 0.1 mg required) with 100% coverage and accuracy. Our team has successfully sequenced thousands of antibodies, including antibody reagents, therapeutics, and newly isolated antibodies, helping researchers decode their antibodies within two weeks. For researchers who cannot find the sequences of their antibodies in online databases, REmAb® offers a straightforward solution with these benefits.
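Finding closely related sequences amounts to ranking reference sequences by their similarity to a query. The Python sketch below shows the idea with the standard library's `difflib.SequenceMatcher`, which is a generic string comparison and not an alignment tool. In practice, researchers would use the search functions of the databases above or a dedicated tool such as BLAST. The function name, the reference names, and the sequence fragments are all hypothetical.

```python
from difflib import SequenceMatcher


def rank_by_similarity(query, references):
    """Return (name, ratio) pairs sorted from most to least similar to the query.

    The ratio is difflib's similarity score between 0.0 and 1.0,
    not a biological alignment score.
    """
    scored = [
        (name, SequenceMatcher(None, query, sequence, autojunk=False).ratio())
        for name, sequence in references.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical reference fragments; real entries would come from a database export.
references = {
    "ref_A": "EVQLVESGGGLVQPGGSLRLSCAAS",
    "ref_B": "QVQLQQSGAELARPGASVKMSCKAS",
}
query = "EVQLVESGGGLVQPGRSLRLSCAAS"

for name, score in rank_by_similarity(query, references):
    print(f"{name}: {score:.2f}")
```

A ranking like this only points to likely relatives. It does not reveal the actual sequence of the antibody of interest.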
Can I get sequences from polyclonal antibodies?
Now you can. Rapid Novor recently became the world’s first to sequence antibodies directly from a polyclonal protein sample without the use of DNA or other nucleotide sequencing technology. This means from now on researchers can expect to obtain antibody sequences even from a polyclonal mixture using Rapid Novor’s REpAb® platform.
Built on years of experience in mass spec-based proteomics and bioinformatics, our team has made the breakthrough to accomplish what used to be impossible. Our polyclonal sequencing technology can derive the sequences of dominant mAbs directly from a pAb mixture. Since the majority of antibody reagents and all naturally produced antibodies are polyclonal, the applications of REpAb® are limitless. If you're interested in learning more, contact us.
Talk to Our Scientists.
We Have Sequenced 5000+ Antibodies and We Are Eager to Help You.
Through next generation protein sequencing, Rapid Novor enables reliable discovery and development of novel reagents, diagnostics, and therapeutics. Thanks to our Next Generation Protein Sequencing and antibody discovery services, researchers have furthered thousands of projects, patented antibody therapeutics, and developed the first recombinant polyclonal antibody diagnostics.