We describe a protocol for the label-free identification of lymphocyte subtypes using quantitative phase imaging and a machine learning algorithm. Measurements of 3D refractive index tomograms of lymphocytes present 3D morphological and biochemical information for individual cells, which is then analyzed with a machine-learning algorithm for identification of cell types.
We describe here a protocol for the label-free identification of lymphocyte subtypes using quantitative phase imaging and machine learning. Identification of lymphocyte subtypes is important for the study of immunology as well as for the diagnosis and treatment of various diseases. Currently, standard methods for classifying lymphocyte types rely on labeling specific membrane proteins via antigen-antibody reactions. However, these labeling techniques carry the potential risk of altering cellular functions. The protocol described here overcomes these challenges by exploiting intrinsic optical contrasts measured by 3D quantitative phase imaging and a machine learning algorithm. Measurement of 3D refractive index (RI) tomograms of lymphocytes provides quantitative information about the 3D morphology and phenotypes of individual cells. The biophysical parameters extracted from the measured 3D RI tomograms are then quantitatively analyzed with a machine learning algorithm, enabling label-free identification of lymphocyte types at a single-cell level. We measured the 3D RI tomograms of B, CD4+ T, and CD8+ T lymphocytes and identified their cell types with over 80% accuracy. In this protocol, we describe the detailed steps for lymphocyte isolation, 3D quantitative phase imaging, and machine learning for identifying lymphocyte types.
Lymphocytes can be classified into various subtypes including B, helper (CD4+) T, cytotoxic (CD8+) T, and regulatory T cells. Each lymphocyte type has a different role in the adaptive immune system; for example, B lymphocytes produce antibodies, whereas T lymphocytes detect specific antigens, eliminate abnormal cells, and regulate B lymphocytes. Lymphocyte function and regulation are tightly controlled and are related to various diseases including cancers1, autoimmune diseases2, and viral infections3. Thus, the identification of lymphocyte types is important for understanding their pathophysiological roles in such diseases and for immunotherapy in clinics.
Currently, methods for classifying lymphocyte types rely on antigen-antibody reactions by targeting specific surface membrane proteins or surface markers4. Targeting surface markers is a precise and accurate method to determine lymphocyte types. However, it requires expensive reagents and time-consuming procedures. Furthermore, it carries risks of the modification of membrane protein structures and the alteration of cellular functions.
To overcome these challenges, the protocol described here introduces the label-free identification of lymphocyte types using 3D quantitative phase imaging (QPI) and machine learning5. This method enables the classification of lymphocyte types at a single-cell level based on morphological information extracted from label-free 3D imaging of individual lymphocytes. Unlike conventional fluorescence microscopy techniques, QPI utilizes refractive index (RI) distributions (intrinsic optical properties of live cells and tissues) as optical contrast6,7. The RI tomograms of individual lymphocytes represent phenotypic information specific to subtypes of lymphocytes. To exploit the 3D RI tomograms of individual lymphocytes systematically, we used a supervised machine learning algorithm.
Using various QPI techniques, the 3D RI tomograms of cells have been actively used for the study of cell pathophysiology because they provide a label-free, quantitative imaging capability8,9,10,11,12,13. Also, the 3D RI distributions of individual cells can provide morphological, biochemical, and biomechanical information about cells. 3D RI tomograms have been previously utilized in the fields of hematology14,15,16,17, infectious diseases18,19,20, immunology21, cell biology22,23, inflammation24, cancer25, neuroscience26,27, developmental biology28, toxicology29, and microbiology12,30,31,32.
Although 3D RI tomograms provide detailed morphological and biochemical information of cells, the classification of lymphocyte subtypes is difficult to achieve by simply imaging 3D RI tomograms5. To systematically and quantitatively exploit the measured 3D RI tomograms for the cell type classification, we utilized a machine learning algorithm. Recently, several works have been reported in which quantitative phase images of cells were analyzed with various machine learning algorithms33, including the detection of microorganisms34, classification of bacterial genus35,36, rapid and label-free detection of anthrax spores37, automated analysis of sperm cells38, analysis of cancer cells39,40, and detection of macrophage activation41.
This protocol provides detailed steps to perform label-free identification of lymphocyte types at the individual cell level using 3D QPI and machine learning. This includes: 1) lymphocyte isolation from mouse blood, 2) lymphocyte sorting via flow cytometry, 3) 3D QPI, 4) quantitative feature extraction from 3D RI tomograms, and 5) supervised learning for identifying lymphocyte types.
Animal care and experimental procedures were performed under the approval of the Institutional Animal Care and Use Committee of KAIST (KA2010-21, KA2014-01, and KA2015-03). All the experiments in this study were carried out in accordance with the approved guidelines.
1. Lymphocyte Isolation from Mouse Blood
2. Flow Cytometry and Sorting of Lymphocyte Subtypes
NOTE: Sorting lymphocytes depending on cell type is essential for establishing the ground-truth (i.e., correct) cell type labels to train and test a cell type classifier in supervised learning. Flow cytometry, a gold standard method, is used to identify and separate lymphocytes42.
3. 3D Quantitative Phase Imaging
4. Quantitative Morphological and Biochemical Feature Extraction from 3D RI Tomograms
5. Supervised Learning and Identification
Figure 1 shows the schematic process of the entire protocol. Using the procedure presented here, we isolated B (n = 149), CD4+ T (n = 95), and CD8+ T (n = 112) lymphocytes. To obtain phase and amplitude information at various angles of illumination, multiple 2D holograms of each lymphocyte were measured by changing the angle of illumination (from -60° to 60°). Typically, 50 holograms can be used to reconstruct a 3D RI tomogram, but the number of 2D holograms can be adjusted considering the imaging speed and quality. The amplitude and phase information of the measured holograms was retrieved using a field retrieval algorithm based on the Fourier transform43,44. The 3D RI tomogram of each lymphocyte was then reconstructed from the phase and amplitude information retrieved at the various angles of illumination using an optical diffraction tomography algorithm. Details of the image processing and 3D RI tomogram reconstruction method can be found elsewhere21,45.
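To illustrate the idea behind Fourier-transform-based field retrieval, the following minimal sketch recovers the phase of a sample field from a simulated off-axis hologram. It is a simplified 1D illustration written in Python with a naive discrete Fourier transform, not the 2D algorithm used in the protocol or in the cited references. The function names (dft, retrieve_field), the carrier frequency, the filter width, and the synthetic phase profile are all assumptions chosen for the example.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (forward or inverse)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(values[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                    for m in range(n))
        out.append(total / n if inverse else total)
    return out


def retrieve_field(hologram, carrier_bin, half_width):
    """Isolate the sideband around the carrier, move it to zero frequency,
    and transform back to obtain the complex sample field."""
    n = len(hologram)
    spectrum = dft(hologram)
    shifted = [0j] * n
    for offset in range(-half_width, half_width + 1):
        shifted[offset % n] = spectrum[(carrier_bin + offset) % n]
    return dft(shifted, inverse=True)


# Synthetic example (assumed values): a smooth phase object and a tilted reference.
N, CARRIER = 64, 16
true_phase = [0.5 * math.cos(2 * math.pi * m / N) for m in range(N)]
sample = [cmath.exp(1j * p) for p in true_phase]
reference = [cmath.exp(-2j * math.pi * CARRIER * m / N) for m in range(N)]
hologram = [abs(s + r) ** 2 for s, r in zip(sample, reference)]

field = retrieve_field(hologram, CARRIER, half_width=8)
retrieved_phase = [cmath.phase(f) for f in field]
```

In this sketch the carrier frequency is chosen so that the sideband carrying the sample field does not overlap the zero-frequency term or its conjugate; in a real measurement the carrier is set by the angle between the sample and reference beams.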
Figure 2A-2C shows representative 3D rendered RI tomograms of B, CD4+ T, and CD8+ T lymphocytes by allocating different color schemes according to RI values via the imaging software. From the RI values, quantitative morphological (SA, CV, and SI) and biochemical (PD and DM) features were calculated (Figure 2A-2C). This result clearly demonstrates that 3D RI distribution enables quantitative analysis of morphological as well as biochemical information of lymphocytes.
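As an illustrative sketch (written in Python, and not the code provided in Supplementary File 1), the features above could be derived from a thresholded tomogram roughly as follows. The sketch assumes definitions that are standard but are not stated in this text: sphericity as π^(1/3)(6·CV)^(2/3)/SA, protein density as the mean RI difference from the medium divided by an RI increment, and dry mass as PD × CV. The medium RI (1.337), the RI increment (0.2 mL/g), and the function names are assumed values for illustration only. Surface area is taken as an input because estimating it requires a surface mesh of the segmented cell.

```python
import math


def sphericity(surface_area_um2, volume_um3):
    """Sphericity index; equals 1 for a perfect sphere."""
    return math.pi ** (1 / 3) * (6 * volume_um3) ** (2 / 3) / surface_area_um2


def threshold_features(ri_voxels, ri_threshold, voxel_volume_um3,
                       ri_medium=1.337, ri_increment_ml_per_g=0.2):
    """Compute CV, PD, and DM from voxels at or above an RI threshold.

    ri_voxels is a flat list of RI values; ri_medium and
    ri_increment_ml_per_g are assumed constants, not values from the protocol.
    """
    cell = [n for n in ri_voxels if n >= ri_threshold]
    volume = len(cell) * voxel_volume_um3            # CV in cubic micrometers
    if not cell:
        return {"CV": 0.0, "PD": 0.0, "DM": 0.0}
    mean_ri = sum(cell) / len(cell)
    density = (mean_ri - ri_medium) / ri_increment_ml_per_g   # PD in g/mL
    dry_mass = density * volume                      # 1 g/mL x 1 um^3 = 1 pg
    return {"CV": volume, "PD": density, "DM": dry_mass}
```

Repeating threshold_features over a series of RI thresholds yields one set of features per threshold, which is how a multi-threshold feature vector could be assembled.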
Supervised machine learning was exploited to identify lymphocyte types at a single-cell level. The measured 3D RI tomograms were randomly split into training (70%; B: 104, CD4+ T: 66, and CD8+ T: 77) and test (30%; B: 45, CD4+ T: 29, and CD8+ T: 35) datasets. We optimized the classifiers to maximally utilize the cell-type-specific fingerprints encoded in the feature space. The total accuracy, sensitivity (true positive rate), and specificity (true negative rate) were calculated by comparing the classifier-predicted results with the ground-truth cell types.
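The split and the performance measures described above can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the function names and the random seed are hypothetical, and the split shown is a plain shuffle (whether the original split was stratified by cell type is not stated here).

```python
import random


def split_train_test(items, train_fraction=0.7, seed=0):
    """Randomly split items into training and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def binary_metrics(true_labels, predicted_labels, positive):
    """Accuracy, sensitivity, and specificity for one 'positive' class.

    Assumes both classes occur at least once in true_labels.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }
```

For the multiclass case, the same function can be applied once per cell type by treating that type as the positive class and all others as negative.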
In order to demonstrate proof-of-concept of the proposed protocol, we performed supervised machine learning on three different cases: binary classification of (i) B and T lymphocytes and (ii) two T lymphocyte subtypes (CD4+ and CD8+), and (iii) multiclass classification of all lymphocyte types.
Figure 3 shows the identification performance of the optimized classifiers for the training and test stages. The accuracy of the T and B lymphocyte classification was 93.15% and 89.81% for the training and test cases, respectively. For the classification of CD4+ and CD8+ T lymphocytes, the accuracy was 87.41% and 84.38% for the training and test sets, respectively. Lastly, the accuracy of the multiclass cell type classifier was 80.65% and 75.93% for the training and test stages, respectively.
Figure 1: Schematic diagrams of the label-free identification of lymphocyte types exploiting 3D quantitative phase imaging and machine learning.
Figure 2: Representative 3D rendered RI tomograms of each lymphocyte cell type with quantitative morphological and biochemical features. (A) B cell, (B) CD4+ T cell, and (C) CD8+ T cell. Scale bar = 2 µm. SA, surface area; CV, cellular volume; SI, sphericity; PD, protein density; DM, dry mass. This figure is modified with permission5.
Figure 3: Identification of individual lymphocyte types via supervised machine learning. (A) Binary classification of B and T cells, (B) binary classification of CD4+ and CD8+ T cells, and (C) multiclass classification of all three lymphocyte cell types, for both training and test sets. Note the small difference between the training and test cases, suggesting good generalization of the established classifiers. The numbers below the names of each cell type indicate the number of cells used. This figure is modified with permission5.
Supplementary File 1: Feature extraction code. Extracts features (SA, CV, SI, PD, and DM) after RI threshold-based segmentation of each tomogram. Implemented in image processing software.
Supplementary File 2: Training code. Trains a k-NN classifier based on selected features. Implemented in image processing software.
Supplementary File 3: Testing code. Tests a trained k-NN classifier on a new dataset (i.e., test set). Implemented in image processing software.
We present a protocol that enables the label-free identification of lymphocyte types exploiting 3D quantitative phase imaging and machine learning. The critical steps of this protocol are quantitative phase imaging and feature selection. For optimal holographic imaging, the density of cells should be controlled as described above. Mechanical stability of the cells is also important to obtain a precise 3D RI distribution because floating or vibrational cellular motions will disturb hologram measurements upon illumination angle changes. We therefore waited several minutes until the sample became stable and static in the imaging chamber before measuring holograms. Lastly, bubbles inside the imaging chamber are problematic when measuring holograms due to RI differences between air and the sample; thus, the sample should be carefully loaded into the imaging chamber.
Feature extraction and selection help determine the identification performance of the classifier. We calculated 5 quantitative morphological (CV, SA, SI) and biochemical (PD, DM) features from the 3D RI distribution at 20 different RI threshold values; thus, we extracted a total of 100 features. We exhaustively searched feature and classifier combinations and selected the combination with the best cross-validation accuracy. We tested 6 different machine learning algorithms, including k-NN (k = 4 and k = 6), linear discriminant analysis, quadratic discriminant analysis, naïve Bayes, and decision tree, and we found that k-NN (k = 4) showed the best identification performance. However, the identification accuracy might be improved further by using other machine learning methods, including support vector machines and neural networks.
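The search over feature and classifier combinations can be sketched as below. This is a minimal Python illustration under stated assumptions, not the code in Supplementary File 2: it uses leave-one-out cross-validation (the cross-validation scheme is not specified in this text), Euclidean distance, a tie-breaking rule that favors the closest neighbor, and a cap on the feature subset size to keep the search tractable. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean distance).
    Ties are resolved in favor of the label of the closest tied neighbor."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = [train_y[i] for i in order[:k]]
    votes = Counter(nearest)
    top = max(votes.values())
    return next(label for label in nearest if votes[label] == top)


def loo_accuracy(x, y, feature_idx, k):
    """Leave-one-out cross-validation accuracy using only feature_idx."""
    sub = [tuple(row[j] for j in feature_idx) for row in x]
    correct = 0
    for i in range(len(sub)):
        train_x = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        correct += knn_predict(train_x, train_y, sub[i], k) == y[i]
    return correct / len(sub)


def search_features(x, y, max_features=2, k_values=(4, 6)):
    """Try every feature subset up to max_features and every k;
    return (feature indices, k, accuracy) for the best combination found."""
    best = None
    for size in range(1, max_features + 1):
        for feature_idx in combinations(range(len(x[0])), size):
            for k in k_values:
                acc = loo_accuracy(x, y, feature_idx, k)
                if best is None or acc > best[2]:
                    best = (feature_idx, k, acc)
    return best
```

Because k-NN relies on distances, features measured on very different scales (for example, volume and sphericity) would normally be standardized before such a search; that step is omitted here for brevity.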
This protocol measures intrinsic optical properties via 3D quantitative phase imaging in order to identify lymphocyte types; thus, it does not require the labeling process based on antigen-antibody reactions that is used in fluorescence or magnetic bead-based cell-sorting techniques, which risk altering cellular function by modifying membrane protein structures. Moreover, the present method measures the 3D RI distribution and provides 3D morphological and biochemical information about the cell, which cannot be obtained by a single-shot holography method46; therefore, the protocol can identify cell types more accurately owing to this higher-dimensional information.
Minor limitations of this protocol are the manual adjustment of the sample stage and the labeling process required for supervised machine learning. We searched for a lymphocyte by adjusting the manual translational stage and then measured holograms, which are the most time-consuming steps. This limitation could be mitigated by employing an automated motorized stage or microfluidic channel devices. Regarding supervised learning, known lymphocyte types are required to establish the optimal classifier; thus, we had to first isolate and identify lymphocyte cell types using the antigen-antibody-based sorting technique. Nonetheless, this protocol still uses the intrinsic optical contrast of lymphocytes, and the labeling agents used to specify antibodies have negligible effects on the measured 3D RI signal. Therefore, the established classifier may be used for identifying lymphocytes in a label-free manner.
Although this protocol mainly utilizes phenotypes of lymphocytes by measuring 3D RI tomograms of individual cells, these 3D RI data can also be used in combination with other modalities addressing genotypes or proteomic information for better classification of subtypes. Recently, correlative microscopy techniques combining fluorescence imaging and QPI have been introduced47,48,49. The approach presented in this protocol can also be extended to these correlative imaging methods.
Label-free identification of lymphocyte types can be applied to studying pathophysiology or diagnosing disease by detecting abnormal lymphocytes or ratios among lymphocyte types. Furthermore, this protocol can be applied to whole blood analysis by identifying various cells including red blood cells, platelets, and white blood cells.
The authors have nothing to disclose.
This work was supported by the KAIST BK21+ Program, Tomocube, Inc., and the National Research Foundation of Korea (2015R1A3A2066550, 2017M3C1A3013923, 2018K000396). Y. Jo acknowledges support from the KAIST Presidential Fellowship and Asan Foundation Biomedical Science Scholarship.
Name | Company | Catalog Number | Comments |
Mouse | Daehan Biolink | C57BL/6J mice | gender and age-matched, 6 – 8 weeks |
Falcon conical centrifuge tube | ThermoFisher Scientific | 14-959-53A | 15 mL |
Phosphate-buffered saline | Sigma-Aldrich | 806544-500ML | |
Ammonium-chloride-potassium lysing buffer | ThermoFisher Scientific | A1049201 | |
RPMI-1640 medium | Sigma-Aldrich | R8758 | |
Fetal bovine serum | ThermoFisher Scientific | 10438018 | |
Antibody | BD Biosciences | 553140 (RRID:AB_394655) | CD16/32 (clone 2.4G2) |
Antibody | BD Biosciences | 555275 (RRID:AB_395699) | CD3ε (clone 17A2) |
Antibody | BioLegend | 100734 (RRID:AB_2075238) | CD8α (clone 53-6.7) |
Antibody | BD Biosciences | 557655 (RRID:AB_396770) | CD19 (clone 1D3) |
Antibody | BD Biosciences | 557683 (RRID:AB_396793) | CD45R/B220 (clone RA3-6B2) |
Antibody | BD Biosciences | 552878 (RRID:AB_394507) | NK1.1 (clone PK136) |
Antibody | eBioscience | 11-0041-85 (RRID:AB_464893) | CD4 (clone GK1.5) |
DAPI | Roche | 10236276001 | 4,6-diamidino-2-phenylindole |
Flow cytometry | BD Biosciences | Aria II or III | |
Imaging chamber | Tomocube, Inc. | TomoDish | |
Holotomography | Tomocube, Inc. | HT-1H | |
Holotomography imaging software | Tomocube, Inc. | TomoStudio | |
Image processing software | MathWorks | Matlab R2017b | |