TY - JOUR
T1 - Lateral elbow tendinopathy and artificial intelligence
T2 - Binary and multilabel findings detection using machine learning algorithms
AU - Droppelmann, Guillermo
AU - Tello, Manuel
AU - García, Nicolás
AU - Greene, Cristóbal
AU - Jorquera, Carlos
AU - Feijoo, Felipe
N1 - Publisher Copyright:
Copyright © 2022 Droppelmann, Tello, García, Greene, Jorquera and Feijoo.
PY - 2022/9/23
Y1 - 2022/9/23
N2 - Background: Ultrasound (US) is a valuable technique to detect degenerative findings and intrasubstance tears in lateral elbow tendinopathy (LET). Machine learning methods allow supporting this radiological diagnosis. Aim: To assess multilabel classification models using machine learning models to detect degenerative findings and intrasubstance tears in US images with LET diagnosis. Materials and methods: A retrospective study was performed. US images and medical records from patients with LET diagnosis from January 1st, 2017, to December 30th, 2018, were selected. Datasets were built for training and testing models. For image analysis, features extraction, texture characteristics, intensity distribution, pixel-pixel co-occurrence patterns, and scales granularity were implemented. Six different supervised learning models were implemented for binary and multilabel classification. All models were trained to classify four tendon findings (hypoechogenicity, neovascularity, enthesopathy, and intrasubstance tear). Accuracy indicators and their confidence intervals (CI) were obtained for all models following a K-fold-repeated-cross-validation method. To measure multilabel prediction, multilabel accuracy, sensitivity, specificity, and receiver operating characteristic (ROC) with 95% CI were used. Results: A total of 30,007 US images (4,324 exams, 2,917 patients) were included in the analysis. The RF model presented the highest mean values in the area under the curve (AUC), sensitivity, and also specificity by each degenerative finding in the binary classification. The AUC and sensitivity showed the best performance in intrasubstance tear with 0.991 [95% CI, 099, 0.99], and 0.775 [95% CI, 0.77, 0.77], respectively. Instead, specificity showed upper values in hypoechogenicity with 0.821 [95% CI, 0.82, −0.82]. In the multilabel classifier, RF also presented the highest performance. The accuracy was 0.772 [95% CI, 0.771, 0.773], a great macro of 0.948 [95% CI, 0.94, 0.94], and a micro of 0.962 [95% CI, 0.96, 0.96] AUC scores were detected. Diagnostic accuracy, sensitivity, and specificity with 95% CI were calculated. Conclusion: Machine learning algorithms based on US images with LET presented high diagnosis accuracy. Mainly the random forest model shows the best performance in binary and multilabel classifiers, particularly for intrasubstance tears.
AB - Background: Ultrasound (US) is a valuable technique to detect degenerative findings and intrasubstance tears in lateral elbow tendinopathy (LET). Machine learning methods allow supporting this radiological diagnosis. Aim: To assess multilabel classification models using machine learning models to detect degenerative findings and intrasubstance tears in US images with LET diagnosis. Materials and methods: A retrospective study was performed. US images and medical records from patients with LET diagnosis from January 1st, 2017, to December 30th, 2018, were selected. Datasets were built for training and testing models. For image analysis, features extraction, texture characteristics, intensity distribution, pixel-pixel co-occurrence patterns, and scales granularity were implemented. Six different supervised learning models were implemented for binary and multilabel classification. All models were trained to classify four tendon findings (hypoechogenicity, neovascularity, enthesopathy, and intrasubstance tear). Accuracy indicators and their confidence intervals (CI) were obtained for all models following a K-fold-repeated-cross-validation method. To measure multilabel prediction, multilabel accuracy, sensitivity, specificity, and receiver operating characteristic (ROC) with 95% CI were used. Results: A total of 30,007 US images (4,324 exams, 2,917 patients) were included in the analysis. The RF model presented the highest mean values in the area under the curve (AUC), sensitivity, and also specificity by each degenerative finding in the binary classification. The AUC and sensitivity showed the best performance in intrasubstance tear with 0.991 [95% CI, 099, 0.99], and 0.775 [95% CI, 0.77, 0.77], respectively. Instead, specificity showed upper values in hypoechogenicity with 0.821 [95% CI, 0.82, −0.82]. In the multilabel classifier, RF also presented the highest performance. The accuracy was 0.772 [95% CI, 0.771, 0.773], a great macro of 0.948 [95% CI, 0.94, 0.94], and a micro of 0.962 [95% CI, 0.96, 0.96] AUC scores were detected. Diagnostic accuracy, sensitivity, and specificity with 95% CI were calculated. Conclusion: Machine learning algorithms based on US images with LET presented high diagnosis accuracy. Mainly the random forest model shows the best performance in binary and multilabel classifiers, particularly for intrasubstance tears.
KW - AUC curve
KW - diagnosis
KW - random forest
KW - tennis elbow
KW - ultrasound
UR - http://www.scopus.com/inward/record.url?scp=85140098879&partnerID=8YFLogxK
U2 - 10.3389/fmed.2022.945698
DO - 10.3389/fmed.2022.945698
M3 - Article
AN - SCOPUS:85140098879
SN - 2296-858X
VL - 9
JO - Frontiers in Medicine
JF - Frontiers in Medicine
M1 - 945698
ER -