Written by Yuning Wang, PhD, and María Gerpe, PhD
IgBLAST Definition
Developed by the National Center for Biotechnology Information (NCBI), IgBLAST is a tool for analyzing immunoglobulin and T cell receptor sequences supplied in FASTA format. Built on Basic Local Alignment Search Tool (BLAST) technology, IgBLAST finds regions of similarity between a nucleotide or protein query sequence and existing records in biological databases.
Why was IgBLAST created
Before IgBLAST was created, general BLAST searches were used to identify target immunoglobulin sequences. These searches, however, had to be repeated multiple times and often relied on manual work to assemble and compare the matches to different genes. IgBLAST was developed to minimize the errors associated with this procedure and to allow more targeted and precise sequence identification. Because immunoglobulins are encoded by several different genes, immunoglobulin sequencing research relies on accurate identification and comparison of each gene. IgBLAST can perform these steps and produce high-quality data.
The Functions of IgBLAST
By processing biological sequences efficiently, IgBLAST streamlines the laboratory workflow. It can also search multiple databases simultaneously, which improves the chances of identifying a gene.
As previously mentioned, multiple genes encode immunoglobulin sequences. These include the variable, diversity, and joining genes, known as the V, D, and J genes. IgBLAST recognizes these genes and reports the V, D, and J matches for a target sequence. It can also give more insight into the nucleotide structure of the V, D, and J segments.
IgBLAST can also interpret each of the four immunoglobulin domains, which provides higher specificity when analyzing sequences. It notes whether a sequence is rearranged and the type of rearrangement (in-frame or out-of-frame). This level of detail adds clarity and precision to antibody studies.
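As a rough illustration of the per-sequence report described above, the sketch below reads a tab-separated table in the AIRR rearrangement style (which standalone IgBLAST can write with `-outfmt 19`) and summarizes the V, D, and J calls together with the frame status. The sample rows, sequence IDs, and gene calls are invented for illustration, and the column names (`v_call`, `d_call`, `j_call`, `vj_in_frame`) are assumed to follow the AIRR schema, so check them against your own output before relying on this.

```python
import csv
import io

# Hypothetical two-row excerpt in AIRR-style TSV; all values are made up.
SAMPLE_TSV = (
    "sequence_id\tv_call\td_call\tj_call\tvj_in_frame\n"
    "seq1\tIGHV3-23*01\tIGHD3-10*01\tIGHJ4*02\tT\n"
    "seq2\tIGHV1-69*01\t\tIGHJ6*02\tF\n"
)


def summarize_rearrangements(handle):
    """Return one summary dict per row of an AIRR-style TSV file handle."""
    summaries = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # Assumed encoding: "T" for in-frame, "F" for out-of-frame.
        flag = row.get("vj_in_frame", "")
        if flag == "T":
            frame = "in-frame"
        elif flag == "F":
            frame = "out-of-frame"
        else:
            frame = "unknown"
        summaries.append({
            "id": row["sequence_id"],
            "v": row.get("v_call") or None,  # empty string means no match
            "d": row.get("d_call") or None,
            "j": row.get("j_call") or None,
            "frame": frame,
        })
    return summaries


if __name__ == "__main__":
    for s in summarize_rearrangements(io.StringIO(SAMPLE_TSV)):
        print(s["id"], s["v"], s["d"], s["j"], s["frame"])
```

For a real run, the `io.StringIO` handle would be replaced by an open file handle on the results table.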
How to Use IgBLAST
To use IgBLAST locally, users must download the version of the program that matches their operating system. They then need to download the BLAST databases available on the NCBI website, paying particular attention to the germline V, D, and J gene sequences of the organisms of interest in their study.
IgBLAST can be applied out of the box to several animals, including humans, rats, rabbits, and rhesus monkeys, and there is an option to apply it to a custom organism. Doing so, however, requires additional file preparation. The U.S. National Library of Medicine also hosts an IgBLAST web page where users can enter sequence data, database information, and search parameters to analyze immunoglobulin and T cell receptor sequences.
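A minimal sketch of what a standalone run can look like once the program and germline databases are in place. It only assembles an `igblastn` command line and does not execute it. The file names and the database path prefix are placeholders, the `_V`/`_D`/`_J` naming is an assumption about how the databases were built, and the flags shown (`-germline_db_V`, `-germline_db_D`, `-germline_db_J`, `-organism`, `-query`, `-outfmt`, `-out`) should be checked against the help output of your installed version.

```python
import shlex


def build_igblastn_command(query_fasta, db_prefix, organism="human",
                           out_path="igblast_results.tsv"):
    """Assemble an igblastn argument list (nothing is executed here).

    db_prefix is a hypothetical path prefix: this sketch assumes germline
    databases named <prefix>_V, <prefix>_D and <prefix>_J already exist.
    """
    return [
        "igblastn",
        "-germline_db_V", f"{db_prefix}_V",
        "-germline_db_D", f"{db_prefix}_D",
        "-germline_db_J", f"{db_prefix}_J",
        "-organism", organism,
        "-query", query_fasta,
        "-outfmt", "19",  # AIRR-style tab-separated output
        "-out", out_path,
    ]


if __name__ == "__main__":
    cmd = build_igblastn_command("antibody_reads.fasta", "database/human_ig")
    # Print the command for inspection; to run it for real, pass cmd to
    # subprocess.run(cmd, check=True) on a machine with IgBLAST installed.
    print(shlex.join(cmd))
```

Building the argument list separately from running it makes it easy to review the exact command before committing to a long analysis.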
Limitations of IgBLAST that can only be circumvented via de novo sequencing
Homology bias and IgBLAST
Because IgBLAST relies on homology-based searches, sequences of novel antibodies, or of antibodies from species that are not routinely studied or that have poorly annotated genomes, may be difficult to identify fully or, worse, may be affected by bias. This is why de novo protein sequencing is important: it makes it possible to sequence an antibody that is novel or that comes from an uncommon species.
Distinguishing between isoleucine and leucine using IgBLAST
Furthermore, with de novo sequencing, users can distinguish between same-mass residues such as isoleucine (Ile, I) and leucine (Leu, L), which may be confounded when searching proteomics databases. The I/L identity in a reference sequence provided by IgBLAST is based on the germline gene sequence, which may be reliable if the sequence has been verified consistently in a statistically significant way (e.g., reference sequences in Swiss-Prot).
However, if an I vs. L decision must be made during de novo protein sequencing of a novel or engineered sequence, or of a sequence from a poorly annotated genome, scientists may not always be able to rely on a reference sequence from a database. Thus, to confirm the identity of the isobaric residues I and L, de novo sequencing is critical. You can read more about how I/L elucidation is effected here.
I and L determination is especially important for antibodies because these residues are often present in the complementarity-determining regions (CDRs) and are important to resolve for binding. Tools like IgBLAST may also be unable to provide a reference sequence for CDR-H3, which is the most variable portion of an antibody sequence and thus highly heterogeneous, even within the same species and animal.
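To make the same-mass point concrete, the short sketch below computes monoisotopic residue masses from elemental composition. Isoleucine and leucine residues share the formula C6H11NO, so a mass measurement alone cannot tell them apart, whereas valine (included only for contrast) differs by one CH2 group. The function and table names are illustrative, not part of any sequencing software.

```python
# Monoisotopic masses of the most abundant isotopes, in daltons.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}

# Elemental composition of each residue (the amino acid minus one H2O).
RESIDUE_FORMULA = {
    "I": {"C": 6, "H": 11, "N": 1, "O": 1},  # isoleucine
    "L": {"C": 6, "H": 11, "N": 1, "O": 1},  # leucine
    "V": {"C": 5, "H": 9, "N": 1, "O": 1},   # valine, for contrast
}


def residue_mass(residue):
    """Monoisotopic mass of a single amino acid residue in daltons."""
    formula = RESIDUE_FORMULA[residue]
    return sum(MONOISOTOPIC[element] * count
               for element, count in formula.items())


if __name__ == "__main__":
    for code in ("I", "L", "V"):
        print(code, round(residue_mass(code), 5))
    # Identical composition means identical mass for I and L.
    print("I and L same mass:", residue_mass("I") == residue_mass("L"))
```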
CDR diversity can only really be deconvoluted via de novo sequencing
Though CDR1 and CDR2 are encoded in the V segment, CDR3 is somatically generated via recombination of the V and J segments (light chain CDR-L3) or of the V, D, and J segments (heavy chain CDR-H3). IgBLAST relies on reference germline genes, which do not account for the insertions, deletions, and/or rearrangements from V(D)J recombination that are vital to antibody sequence diversity.
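The toy sketch below illustrates why a rearranged junction is not simply present in a germline reference: segment ends are trimmed and extra nucleotides are inserted between them during recombination. The segment strings, trim lengths, and inserted nucleotides are arbitrary values chosen for illustration, not real germline genes, and the model is deliberately simplified.

```python
# Arbitrary toy segment ends; these are NOT real germline sequences.
TOY_V = "TGTGCGAGA"
TOY_D = "GGTATAGCAGCAGCTGGTAC"
TOY_J = "ACTACTTTGACTACTGG"


def recombine(v, d, j, v_trim=0, d_trim5=0, d_trim3=0, j_trim=0,
              n1="", n2=""):
    """Join V, D and J with end trimming and inserted nucleotides.

    v_trim removes bases from the 3' end of V, d_trim5/d_trim3 from either
    end of D, and j_trim from the 5' end of J; n1 and n2 are the
    nucleotides inserted at the V-D and D-J joins.
    """
    v_part = v[:len(v) - v_trim]
    d_part = d[d_trim5:len(d) - d_trim3]
    j_part = j[j_trim:]
    return v_part + n1 + d_part + n2 + j_part


if __name__ == "__main__":
    germline_concat = TOY_V + TOY_D + TOY_J
    junction = recombine(TOY_V, TOY_D, TOY_J, v_trim=2, d_trim5=3,
                         d_trim3=4, j_trim=1, n1="CC", n2="GT")
    print("germline concatenation:", germline_concat)
    print("rearranged junction:   ", junction)
    print("identical:", junction == germline_concat)
```

Because the trimmed and inserted positions differ from one rearrangement to the next, germline references can only account for the flanking V, D, and J portions of such a junction.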
If you think you would benefit from the de novo capabilities of next generation protein sequencing, reach out to discuss more with our scientists.
Talk to Our Scientists.
We Have Sequenced 5000+ Antibodies and We Are Eager to Help You.
Through next generation protein sequencing, Rapid Novor enables reliable discovery and development of novel reagents, diagnostics, and therapeutics. Thanks to our Next Generation Protein Sequencing and antibody discovery services, researchers have furthered thousands of projects, patented antibody therapeutics, and developed the first recombinant polyclonal antibody diagnostics.