Abstract
Objectives
Lymphocyte subsets are predictors for disease diagnosis, treatment, and prognosis. Determination of lymphocyte subsets is usually carried out by flow cytometry. Despite recent advances in flow cytometry analysis, analyzing most flow cytometry data by manual gating remains challenging because it is labor-intensive, time-consuming, and error-prone. This study aimed to develop an automated method to identify lymphocyte subsets.
Methods
We propose a method that combines knowledge-driven and data-driven approaches to gate automatically and thereby identify subsets. To improve accuracy and stability, we implemented a Loop Adjustment Gating step to optimize the gating result of the lymphocyte population. Furthermore, we incorporated an anomaly detection mechanism that issues warnings for samples that might not have been analyzed successfully, ensuring the quality of the results.
Results
The evaluation showed a 99.2 % correlation between the results of our method and manual analysis on a dataset of 2,000 individual cases from lymphocyte subset assays. Our proposed method attained 97.7 % accuracy for all cases and 100 % for the high-confidence cases. With our automated method, 99.1 % of manual labor can be saved when only the low-confidence cases are reviewed, while the average turnaround time is only 29 s, a reduction of 83.7 %.
Conclusions
Our proposed method can achieve high accuracy on flow cytometry data from lymphocyte subset assays. Additionally, it can save manual labor and reduce the turnaround time, which gives it potential for application in the laboratory.
- Research ethics: This study was approved by an Ethical Review Committee of Guangzhou KingMed Center for Clinical Laboratory. The approval ID is 2023158.
- Informed consent: Not applicable.
- Author contributions: All the authors have accepted responsibility for the entire content of this manuscript and approved its submission.
- Competing interests: The funding organization played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication. The authors state no conflict of interest.
- Research funding: This research was supported by the Science and Technology Program of Guangzhou, China (Grant Number: 2024A03J0752).
- Data availability: Not applicable.
To provide a clear description of our method, some concepts need to be explained or defined mathematically.
Gated cell population
We refer to the cell population obtained through gating as the “gated cell population” and treat it as a set denoted by a capital letter (for example, L denotes the gated lymphocyte population). Mathematical operations are defined on it, as described in Table A1.
Density distribution
We denote the density function of the intensity of cell population C on marker X by $d_{C@X}(\cdot)$. We define the set of peaks, that is, points whose density is larger than that of every other point within a surrounding window, as follows.
where l is the length of the window and
We refer to the minimum density value between the nth and (n + 1)th peak as
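The peak and trough definitions can be illustrated with a minimal Python sketch. It assumes the density has already been sampled on an evenly spaced grid, and it measures the window length l in grid points; the function names `find_peaks_in_window` and `trough_between` are hypothetical and are not taken from the original method.

```python
import numpy as np


def find_peaks_in_window(density, window):
    """Return grid indices whose density is strictly larger than every
    other value inside a window of `window` grid points centred on them.

    Assumption: `window` plays the role of the window length l, expressed
    in grid points rather than in intensity units.
    """
    half = window // 2
    peaks = []
    for i in range(len(density)):
        lo = max(0, i - half)
        hi = min(len(density), i + half + 1)
        # All values in the window except the candidate point itself.
        neighbours = np.delete(density[lo:hi], i - lo)
        if neighbours.size and density[i] > neighbours.max():
            peaks.append(i)
    return peaks


def trough_between(density, left_peak, right_peak):
    """Return (index, value) of the minimum density between two peaks."""
    segment = density[left_peak:right_peak + 1]
    offset = int(np.argmin(segment))
    return left_peak + offset, float(segment[offset])


# Toy bimodal density sampled at nine grid points.
toy_density = np.array([0.0, 1.0, 3.0, 1.0, 0.5, 2.0, 4.0, 2.0, 0.0])
toy_peaks = find_peaks_in_window(toy_density, window=3)
toy_trough = trough_between(toy_density, toy_peaks[0], toy_peaks[1])
```

In practice the density would first be estimated from the measured intensities (for example with a histogram or a kernel density estimate) before the peaks are searched.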
Empirical interval
Based on past samples, we select an empirical interval of the threshold for each marker. The lower bound and upper bound of the empirical interval are denoted as $LB_X$ and $UB_X$ respectively, where X is the marker. In general, the threshold should fall within the empirical interval, making the troughs in the empirical interval potential threshold candidates.
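As a rough illustration of how the empirical interval narrows the search, the sketch below keeps only those troughs whose intensity lies between the lower and upper bound. It assumes that peak indices on a sampled density grid are already available; the function name `trough_candidates` and its parameters are hypothetical, and the actual selection rules are those summarized in Table A2.

```python
import numpy as np


def trough_candidates(grid, density, peak_idx, lb, ub):
    """Return intensities of troughs between consecutive peaks that fall
    inside the empirical interval [lb, ub].

    grid     : intensity value of each grid point (same length as density)
    density  : density sampled on the grid
    peak_idx : sorted grid indices of the detected peaks
    lb, ub   : assumed lower and upper bound of the empirical interval
    """
    candidates = []
    for left, right in zip(peak_idx[:-1], peak_idx[1:]):
        segment = density[left:right + 1]
        trough = left + int(np.argmin(segment))
        if lb <= grid[trough] <= ub:
            candidates.append(float(grid[trough]))
    return candidates


# Toy example: a single trough at intensity 400 between two peaks.
toy_grid = np.arange(9) * 100.0
toy_density = np.array([0.0, 1.0, 3.0, 1.0, 0.5, 2.0, 4.0, 2.0, 0.0])
inside = trough_candidates(toy_grid, toy_density, [2, 6], lb=300.0, ub=500.0)
outside = trough_candidates(toy_grid, toy_density, [2, 6], lb=500.0, ub=700.0)
```

An empty candidate list corresponds to the situation in which no trough lies in the empirical interval, so a fallback rule is needed to choose the threshold.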
Proportional splitting
If the cell population consists mainly of a single cell subset, the distribution generally has no troughs. To select a threshold in this case, a proportional splitting approach can be used so that the entire cell subset falls into the same quadrant. If the proportion of the subset in the cell population is taken to be n%, a number greater than n% should be chosen as the splitting ratio. This is a hyperparameter, denoted as SR(%) in our method.
Following conventional usage, we define certain symbols and functions, as shown in Table A1. Based on these, our threshold selection strategy is summarized in formula form in Table A2, and our anomaly detection rules are summarized in formula form in Table A3.
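One way to read proportional splitting is as a percentile cut on the marker intensities. The sketch below is written under that assumption: the threshold is placed so that SR% of the cells end up on the side of the dominant subset. The function name `proportional_split_threshold` and the `dominant_side` parameter are hypothetical and only serve the illustration.

```python
import numpy as np


def proportional_split_threshold(intensities, sr_percent, dominant_side="low"):
    """Return a threshold that places sr_percent of the cells on the side
    of the dominant subset.

    dominant_side="low"  : the subset is marker-negative, so sr_percent of
                           the cells should lie at or below the threshold.
    dominant_side="high" : the subset is marker-positive, so sr_percent of
                           the cells should lie at or above the threshold.
    """
    if not 0 < sr_percent < 100:
        raise ValueError("sr_percent must be strictly between 0 and 100")
    if dominant_side == "low":
        return float(np.percentile(intensities, sr_percent))
    if dominant_side == "high":
        return float(np.percentile(intensities, 100 - sr_percent))
    raise ValueError("dominant_side must be 'low' or 'high'")


# Toy population of 100 cells with intensities 1..100 and SR = 99 %.
toy_cells = np.arange(1, 101, dtype=float)
low_cut = proportional_split_threshold(toy_cells, 99, dominant_side="low")
high_cut = proportional_split_threshold(toy_cells, 99, dominant_side="high")
```

Choosing SR above the expected proportion n% of the subset leaves a margin, so that the cut falls beyond the subset rather than through it.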
Supplementary Material
This article contains supplementary material (https://doi.org/10.1515/cclm-2023-1141).
© 2024 Walter de Gruyter GmbH, Berlin/Boston