HemeAI

A machine learning approach to hematological analysis

Abstract

This project aims to automate complete blood counts (CBCs) and peripheral blood smears. Both tests examine the cellular components of blood in order to detect abnormalities and identify any diseases present. Counting the cells, and reviewing them manually after an abnormal CBC, can involve long wait times and is prone to human error. We use YOLO (You Only Look Once), a computer vision model, to speed this process up by both performing the CBC and classifying each cell. The project is split into two parts: one YOLO model identifies and counts white blood cells, red blood cells, and platelets for the CBC portion, and a second YOLO model detects specific white blood cell types and abnormal red blood cells. The cell types detected by the second model are eosinophils, basophils, myelocytes, neutrophils, monocytes, lymphocytes, erythroblasts, and abnormal red blood cells. These detections are used to diagnose five diseases: anemia, thrombocytopenia, basophilia, eosinophilia, and leukemia. The first YOLO model (CBC portion) yielded an F1 score of 0.85. The second YOLO model (specific white blood cells and abnormal red blood cells) yielded an F1 score of 0.75.

Introduction

Complete blood counts (CBCs) are among the most commonly ordered lab tests and are critical in narrowing down possible diagnoses. When the results of a CBC are abnormal, a peripheral blood smear is usually performed. Peripheral blood smears are reviewed manually and are subject to human error. In addition, results can take several days to arrive, and even longer in rural and developing areas. To offer an alternative, this project aims to:

  • Automate a complete blood count
  • Automate a peripheral blood smear
  • Make a preliminary diagnosis based on the results

Project Demo

Results: Automatic CBC
Confusion Matrix after training the YOLOv8 model for CBC

From the above figure, we can conclude that the model performs well on all three cell classes. The true positive ratios for RBCs, WBCs, and platelets are 0.80, 0.98, and 0.92, respectively. However, the false positive ratio for the background class is a concern: the model predicts RBC for background relatively frequently (ratio = 0.94). The F1 score of this model is 0.85.
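The ratios quoted above can be read as entries of a column-normalized confusion matrix. The sketch below assumes that rows are predicted classes, columns are true classes, and each column is scaled to sum to 1; that layout is an assumption about the figure, not something the project states. The raw counts are made up for illustration, chosen only so that the resulting ratios match the ones quoted; they are not the project's data.

```python
def normalize_columns(matrix):
    """Divide each entry by its column total so every column sums to 1."""
    n_cols = len(matrix[0])
    col_sums = [sum(row[c] for row in matrix) for c in range(n_cols)]
    return [
        [row[c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n_cols)]
        for row in matrix
    ]


# Hypothetical raw counts (NOT project data).
# Rows = predicted class, columns = true class.
labels = ["RBC", "WBC", "Platelets", "background"]
counts = [
    [80, 0, 1, 47],  # predicted RBC
    [0, 49, 0, 1],   # predicted WBC
    [0, 0, 23, 2],   # predicted Platelets
    [20, 1, 1, 0],   # predicted background (missed cells)
]

normalized = normalize_columns(counts)

# Diagonal entries are the per-class true positive ratios.
for i, label in enumerate(labels[:3]):
    print(f"{label}: {normalized[i][i]:.2f}")

# Share of background false positives that were labeled RBC.
print(f"background predicted as RBC: {normalized[0][3]:.2f}")
```

Under this reading, the 0.94 figure is the share of background false positives that were labeled RBC, rather than the share of all RBC predictions that were wrong.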

Results: Disease Detection
Confusion Matrix after training the YOLOv8 model for Disease Detection

From the above figure, we can conclude that the model performs well on most classes. The true positive ratios for abnormal red blood cells (abnormal RBCs), band neutrophils, basophils, eosinophils, erythroblasts, lymphocytes, monocytes, myelocytes, neutrophils, and segmented neutrophils are 0.48, 0.71, 0.96, 0.96, 0.86, 0.90, 0.97, 0.92, 0.33, and 0.70, respectively, with abnormal RBCs (0.48) and neutrophils (0.33) being the weakest classes. However, the false positive ratio for the background class is a concern: the model predicts RBC for background relatively frequently (ratio = 0.94). The F1 score of this model is 0.75.
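For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows the standard calculation from true positive, false positive, and false negative counts. The function names and example counts are hypothetical, and this is not necessarily how the project's training pipeline aggregated its reported score.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from raw detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (NOT project data): 75 correct detections,
# 25 spurious detections, 25 missed cells.
precision, recall = precision_recall(tp=75, fp=25, fn=25)
f1 = f1_score(precision, recall)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, a model with high recall but many background false positives is penalized through its lower precision.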

Presentation

Research Paper

Conclusion

YOLO has the potential to improve the accuracy and turnaround time of CBCs and peripheral blood smears. The proposed approach uses YOLO to classify the different types of cells in an image of a blood sample, and then determines which disease the patient may have if abnormalities are found in the ratios between RBCs, WBCs, and platelets. This can significantly reduce the time needed for a peripheral blood smear, which is currently subject to human error and may take several days to complete and deliver. The results from this approach are promising: the first YOLO model yielded an F1 score of 0.85, and the second yielded an F1 score of 0.75.
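The counting-and-ratio step described above might look roughly like the following sketch. It assumes the detector's output has already been reduced to a list of class labels, one per detected cell. The function names, the ratio names, and especially the expected ranges are hypothetical placeholders: they are not clinical reference values and are not taken from the project.

```python
from collections import Counter


def count_cells(detected_labels):
    """Tally how many cells of each class were detected."""
    return Counter(detected_labels)


def component_ratios(counts):
    """Express WBC and platelet counts relative to the RBC count."""
    rbc = counts.get("RBC", 0)
    if rbc == 0:
        return None  # ratios are undefined without any RBCs
    return {
        "WBC_per_RBC": counts.get("WBC", 0) / rbc,
        "Platelets_per_RBC": counts.get("Platelets", 0) / rbc,
    }


def flag_abnormal(ratios, expected_ranges):
    """List every ratio that falls outside its expected (low, high) range."""
    flags = []
    for name, (low, high) in expected_ranges.items():
        value = ratios[name]
        if value < low:
            flags.append(f"{name} low")
        elif value > high:
            flags.append(f"{name} high")
    return flags


# PLACEHOLDER ranges for illustration only; NOT clinical reference values.
PLACEHOLDER_RANGES = {
    "WBC_per_RBC": (0.001, 0.003),
    "Platelets_per_RBC": (0.03, 0.1),
}

# Hypothetical detections from one or more images.
detections = ["RBC"] * 200 + ["WBC"] * 1 + ["Platelets"] * 2
counts = count_cells(detections)
ratios = component_ratios(counts)
flags = flag_abnormal(ratios, PLACEHOLDER_RANGES)
print(counts, ratios, flags)
```

Keeping the ranges in a separate table means the thresholds can be replaced with validated values without touching the counting logic.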

Try it out!

Try this project out for yourself! Follow the steps below:

  • Visit the GitHub link and clone the repository
  • Follow the steps under "Steps to run the project" in README.md to start the application
  • At this point, ensure that you have a set of blood smear images to test the models on. A sample dataset for the CBC model can be found on this Drive, and a sample dataset for the Disease Detection model can be found on this Drive
  • Navigate to the AutoCBC page through the header
  • Click Upload Image and select all blood smear images you would like to run the CBC model on
  • Once you have added enough images, click the process image button
  • If your CBC shows an abnormal result, you will be prompted with a button that leads to the disease detection portion. If any of the components do not load, there may have been an issue with the CORS settings. Simply refresh the page and re-upload the images.
  • Click Upload Image and select all blood smear images that you would like to run the Disease Detection model on
  • The page will display the potential disease indicated by the uploaded images. If any of the components do not load, there may have been an issue with the CORS settings. Simply refresh the page and re-upload the images.