LUAD GGN Patients Study: Digital Pathology and Machine Learning Model Validation

by drbyos

Revolutionizing Lung Cancer Diagnostics: A Deep Learning Approach to Identifying Ground-Glass Nodules

A recent study has made significant strides in improving the accuracy of lung cancer diagnostics, focusing on ground-glass nodules (GGNs), which are small areas of increased opacity in the lungs that may indicate early-stage lung adenocarcinoma (LUAD). This research, conducted at the Affiliated Zhongshan Hospital and the Affiliated Xinhua Hospital of Dalian University, enrolled 289 LUAD patients between January 2019 and December 2022.

Study Population and Inclusion Criteria

The study retrospectively and consecutively included patients who met specific criteria: aged over 18 years, no prior exposure to radiotherapy or chemotherapy, availability of matching CT and pathological images, and confirmation of LUAD through surgery and pathology. Patients with solid nodules, incomplete data, or additional cancers elsewhere were excluded. The final cohort was divided into training and test sets, each further categorized into invasive and non-invasive subgroups.

This research adhered to the principles laid out in the Declaration of Helsinki, receiving ethical approval from the Affiliated Zhongshan Hospital of Dalian University. Due to the retrospective nature, informed consent was waived.

Fig. 1

Flowchart showing the study population selected according to the inclusion and exclusion criteria.

Whole-Slide Imaging (WSI) Collection and Processing

Postoperative pathological tissues were fixed in formalin, embedded in paraffin, sectioned into 5 µm-thick slides, and stained with hematoxylin and eosin (H&E). Two pathologists, each with over 10 years of experience, made the diagnoses, with the chief pathologist deciding disputed cases. WSIs were scanned on an Easyscan scanner at 20× magnification, with a pixel resolution of 0.5 µm/pixel and an effective area of 15 mm × 15 mm. For each patient, the largest and most typical tumor tissue section was selected for analysis, yielding 289 WSIs.

WSI Annotation

WSIs were manually annotated as tumor or non-tumor areas by two experienced pathologists using QuPath-0.3.2 software. For inconsistent annotations, the chief pathologist made the final decision. Areas with folds, bleeding, or blurriness were excluded. Tumor areas were labeled 0 for non-invasive lesions (atypical adenomatous hyperplasia, AAH; adenocarcinoma in situ, AIS; minimally invasive adenocarcinoma, MIA) and 1 for invasive carcinoma (ICA). An annotated example is illustrated in Fig. 2.

Fig. 2

An example of annotation of regions of interest for WSI data.

WSI Preprocessing and Augmentation

WSIs were partitioned into non-overlapping 512 × 512-pixel patches to reduce computational demand. Patches were cleaned by thresholding, which excluded white background regions and patches with excessive brightness or too little tissue. Vahadane color normalization was then applied to minimize staining-related color variation. This preprocessing yielded 2,584,435 patches, which were augmented with random translations, rotations (including fixed 90-degree rotations), and flips to increase data diversity.
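The patch-cleaning step can be sketched in plain Python. This is a minimal illustration, not the study's exact pipeline: the 220 near-white cutoff and the 30% minimum tissue fraction are assumed thresholds chosen for demonstration.

```python
def is_informative(patch, white_threshold=220, min_tissue_fraction=0.3):
    """Keep a patch only if enough of it is darker than near-white background.

    `patch` is a 2-D grayscale image given as a list of rows of 0-255 values.
    The 220 cutoff and 30% tissue fraction are illustrative assumptions,
    not the thresholds used in the study.
    """
    pixels = [v for row in patch for v in row]
    tissue = sum(1 for v in pixels if v < white_threshold)
    return tissue / len(pixels) >= min_tissue_fraction


# A mostly-white (background) patch is discarded; a tissue-rich patch is kept.
white_patch = [[250] * 8 for _ in range(8)]
tissue_patch = [[120] * 8 for _ in range(8)]
```

In a real pipeline the same predicate would be applied to each 512 × 512 tile as it is cut from the slide, so blank glass never reaches the network.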

Model Specification

Deep Learning Model

The study employed three deep learning models—ResNet18, ResNet50, and ResNet101—for tumor region recognition. The models used ReLU activation functions throughout, except in the last fully connected layer, which used Softmax. Training parameters included a cosine-decay learning-rate schedule, a batch size of 128, 32 epochs, an initial learning rate of 0.01, and the Adam optimizer. The best-performing model was selected for feature extraction.
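The cosine-decay schedule mentioned above can be written in a few lines. The 32 epochs and 0.01 initial rate come from the study; the exact annealing variant (here, standard cosine annealing toward zero) is an assumption.

```python
import math


def cosine_decay_lr(epoch, total_epochs=32, initial_lr=0.01):
    """Cosine-annealed learning rate: starts at `initial_lr` and decays
    smoothly toward zero by `total_epochs`. The epoch count and initial
    rate match the study's setup; the annealing form is assumed."""
    return 0.5 * initial_lr * (1 + math.cos(math.pi * epoch / total_epochs))
```

The rate starts at 0.01, passes through half its initial value at the midpoint (epoch 16), and reaches zero at epoch 32, which avoids the abrupt drops of step schedules.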

Feature Selection

Patch-level WSIs were fed into the deep learning model to obtain labels and probabilities. Two multiple instance learning methods—patch likelihood histogram (PLH) and bag of words (BoW)—were used to aggregate features to the WSI-level. Correlated features were removed using Spearman correlation, and LASSO regression with cross-validation determined the optimal regularization weight.
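The PLH aggregation step can be illustrated with a small sketch: patch-level tumor probabilities are binned into a normalized histogram that serves as one WSI-level feature vector. The 10-bin count is an illustrative assumption, not the study's configuration.

```python
def patch_likelihood_histogram(probs, n_bins=10):
    """Aggregate patch-level tumor probabilities for one slide into a
    WSI-level feature vector: a normalized histogram over `n_bins`
    equal-width probability bins (bin count is illustrative)."""
    counts = [0] * n_bins
    for p in probs:
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        counts[idx] += 1
    total = len(probs)
    return [c / total for c in counts]


# One slide with one low-probability patch and three high-probability patches.
features = patch_likelihood_histogram([0.05, 0.95, 0.96, 0.97])
```

A slide dominated by confident tumor patches concentrates its mass in the top bins, so downstream classifiers can separate invasive from non-invasive slides without seeing individual patches.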

Machine Learning-Based Pathomics Model

The reduced features were input into several machine learning models—random forest, extremely randomized trees, extreme gradient boosting, and light gradient boosting machine—after hyperparameter tuning. The models' predictive performance was compared, and the best-performing model was selected to assist pathologists.

Assisting Model

The predictive accuracy of the pathomics model was compared with that of 10 pathologists of varying experience: 5 with 1-3 years and 5 with 6-9 years. Each pathologist read the slides before and after introduction of the model, with a one-month washout period between readings. Limited experience was defined as reviewing fewer than 50 LUAD surgical specimens annually.

Performance Metrics

The model's predictive performance was evaluated using ROC curves, the area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) in both the training and test cohorts. Decision curve analysis was used to assess clinical utility. A flowchart of feature extraction and model establishment is depicted in Fig. 3.
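The reported metrics all derive from a 2×2 confusion matrix of invasive (positive) versus non-invasive (negative) calls. A minimal sketch, with hypothetical counts for illustration:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from a 2x2 confusion matrix, where
    'positive' means an invasive-carcinoma call and 'negative' a
    non-invasive call."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical example counts (not the study's results).
m = diagnostic_metrics(tp=40, fp=10, tn=45, fn=5)
```

Reporting PPV and NPV alongside sensitivity and specificity matters here because the invasive/non-invasive class balance differs between cohorts, and predictive values shift with prevalence even when sensitivity and specificity do not.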

Fig. 3

Flowchart of extraction of features, model establishment, and performance evaluation.

Impact and Future Directions

This research represents a significant advancement in lung cancer diagnostics, particularly for the detection and classification of ground-glass nodules. By leveraging deep learning and whole-slide imaging, this study aims to enhance pathologists’ accuracy and efficiency, contributing to earlier and more precise cancer identification.

Future work could involve larger, multi-center studies to further validate the findings and explore additional applications of this approach in various clinical settings.

Conclusion

The study highlights the power of combining advanced imaging techniques with machine learning to improve diagnostic accuracy in lung cancer. As research continues to evolve, such innovations have the potential to transform patient care and outcomes.
