Functional clustering of B cell receptors using sequence and structural features†
Abstract
The repertoires of B cell receptors (BCRs), which can be captured by single-cell-resolution sequencing technologies, contain a personal history of a donor's antigen exposure. One of the current challenges in analyzing such BCR sequence data is to assign sequences to groups with similar antigen and epitope binding specificity. This is a non-trivial task given the paucity of experimentally determined antibody–antigen structures and the fact that different gene combinations in B cells can lead to receptors that target the same antigen and epitope. Here, we describe a method for clustering BCRs based on sequence and predicted structural features in order to predict groups with similar antigen and epitope binding specificity. We show that all known experimentally determined structures of antibody–antigen complexes can be clustered accurately (AUC 0.981) and that the use of predicted structural features improved the accuracy of the epitope classification. We next show that an independent and non-redundant set of 104 anti-HIV antibody sequences could be clustered in agreement with manually assigned epitopes with a specificity of 99.7% and a sensitivity of 61.93%; the lower sensitivity is due almost entirely to one group of antibodies, those that target the gp120 V3 loop, which do not form a single, well-defined cluster. We next examined a diverse set of anti-hemagglutinin BCR sequences from humans and mice. We observed clusters that grouped human or mouse sequences together with anti-hemagglutinin antibodies of known structure. We also observed clusters that included both human and mouse sequences. Importantly, to the extent that the epitopes have been experimentally characterized, none of the observed clusters erroneously grouped different hemagglutinin binding regions. Taken together, these results demonstrate that the proposed clustering method provides high-throughput prediction of BCRs with common binding specificity across clonal lineages, donors and even species.
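The abstract reports specificity and sensitivity for the agreement between clusters and manually assigned epitopes but does not state how these were computed. One common convention, assumed here purely for illustration, is to count antibody pairs: a pair that shares an epitope and lands in the same cluster is a true positive, and a pair with different epitopes placed in different clusters is a true negative. The function name, cluster identifiers and epitope labels in this minimal Python sketch are hypothetical toy values, not data from the study.

```python
from itertools import combinations


def pairwise_sensitivity_specificity(cluster_ids, epitope_labels):
    """Pair-counting sensitivity and specificity of a clustering.

    Assumed convention (not taken from the article): a "positive" pair is
    two antibodies annotated with the same epitope; the clustering
    "predicts positive" when it places both in the same cluster.
    """
    if len(cluster_ids) != len(epitope_labels):
        raise ValueError("cluster_ids and epitope_labels must have equal length")

    tp = fp = tn = fn = 0
    for i, j in combinations(range(len(cluster_ids)), 2):
        same_cluster = cluster_ids[i] == cluster_ids[j]
        same_epitope = epitope_labels[i] == epitope_labels[j]
        if same_epitope and same_cluster:
            tp += 1
        elif same_epitope:
            fn += 1  # same epitope, but split across clusters
        elif same_cluster:
            fp += 1  # different epitopes merged into one cluster
        else:
            tn += 1

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy input: one epitope group forms a single cluster,
# the other is scattered across singleton clusters.
toy_epitopes = ["CD4bs", "CD4bs", "CD4bs", "V3", "V3", "V3"]
toy_clusters = [0, 0, 0, 1, 2, 3]

sens, spec = pairwise_sensitivity_specificity(toy_clusters, toy_epitopes)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In the toy input, no cluster mixes epitopes, so specificity stays at its maximum, while the scattered second group lowers sensitivity. This is the same qualitative pattern the abstract describes for the gp120 V3 loop antibodies.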
- This article is part of the themed collection: Engineering immunity with quantitative tools