Abstract

In this study, we designed a skin lesion classifier based on capsule networks within a multitask learning framework that provides healthcare stakeholders with both a classification of the skin lesion and a segmentation mask of the lesion. Our framework consists of five main parts: a feature extraction section, a capsule network, a capsule map, a classification prediction head, and a segmentation prediction head. The feature extraction funnel is a series of convolution and max-pooling layers that reduces the large derma image into compact, feature-rich activation maps for the capsule network, and also helps reconstruct the segmentation mask at the original image dimensions. The capsule network is the core of our framework; it produces a capsule map, which is the seed map for both the classification and segmentation prediction heads. The capsule map was the big bet of this study: whether or not it would learn to encode the pose and spatial relations of the skin lesion features. Our capsule network learnt excellent weights for feature extraction, which yielded a strong capsule map that gave both prediction heads the boost needed for phenomenal results. The classification prediction head was based on either two or three capsules, for the binary and tri-class frameworks respectively. The segmentation prediction head is a series of transpose convolution layers that merges the capsule map with the activation maps. Our frameworks were evaluated on the ISIC 2017 challenge, which generously offered training, validation, and test sets. The dataset covered three skin pathologies: melanoma, seborrheic keratosis, and nevus. We offer a thorough analysis of our SKINCAPs framework, a multitask learning framework based on capsule networks. We designed two frameworks: a binary framework and a tri-class framework. 
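The downsampling funnel and the transpose-convolution segmentation head described above can be sketched as a spatial shape flow. This is a minimal illustration, not the study's actual architecture: the input size (256), layer count (three), and kernel/stride choices are assumptions made only to show how the funnel shrinks the derma image and the segmentation head restores its original dimensions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Spatial size after a convolution or max-pooling layer.
    return (size + 2 * padding - kernel) // stride + 1

def tconv_out(size, kernel, stride=1, padding=0):
    # Spatial size after a transpose convolution layer.
    return (size - 1) * stride - 2 * padding + kernel

size = 256  # assumed side length of the input derma image
for _ in range(3):  # hypothetical three-stage funnel
    size = conv_out(size, 3, padding=1)  # 3x3 conv keeps spatial size
    size = conv_out(size, 2, stride=2)   # 2x2 max-pool halves it
print(size)  # 32: a compact, feature-rich map for the capsule network

for _ in range(3):  # segmentation head mirrors the funnel
    size = tconv_out(size, 2, stride=2)  # each transpose conv doubles size
print(size)  # 256: back to the original mask dimensions
```

The same bookkeeping generalises to any funnel depth: each pooling stage halves the spatial size, and a matching transpose convolution stage doubles it, so the predicted mask lines up pixel-for-pixel with the input image.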
The key objective of the binary framework was to classify between melanoma and non-melanoma cases (the latter comprising seborrheic keratosis and nevus), while the tri-class framework classified melanoma, seborrheic keratosis, and nevus cases separately. Both frameworks also produce segmentation masks for the skin lesion. We analysed our work thoroughly by comparing the two frameworks against each other, against the best submissions to ISIC 2017 challenge tasks 1 and 3 (segmentation and classification), and against one of the best capsule network research works on skin lesion analysis. Our analysis showed that the tri-class framework outperformed the binary framework, offering a sensitivity of 91.7% vs 71.8%. This confirmed our hypothesis about the capsule map: it performs better in the classification prediction head when the classes are disentangled separately than when seborrheic keratosis and nevus are combined into a single map, as in our binary framework. It was also clear that our tri-class models performed superiorly on sensitivity, which matters most for initial screening of the general population. We believe that for deploying AI systems in the healthcare industry, it is crucial to prioritise high sensitivity over specificity. Moreover, the multitask learning framework helped our model squeeze out better performance than the plain classification route. In general, both frameworks offered substantial improvements: the binary framework achieved an AUC of 98.3% and the tri-class framework an AUC of 99.1%. Our framework's segmentation predictions were also very close to the expert ground truth annotations. 
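The sensitivity and specificity figures quoted above come directly from the confusion counts on the melanoma class. As a reminder of the definitions, here is a small sketch; the counts used are illustrative only and are not the study's actual confusion matrix.

```python
def sensitivity(tp, fn):
    # True-positive rate on the positive (melanoma) class: TP / (TP + FN).
    return tp / (tp + fn)

def specificity(tn, fp):
    # True-negative rate on the non-melanoma classes: TN / (TN + FP).
    return tn / (tn + fp)

# Illustrative counts only (hypothetical, not from the study):
tp, fn, tn, fp = 110, 10, 350, 30
print(round(sensitivity(tp, fn), 3))  # 0.917, i.e. a 91.7% sensitivity
print(round(specificity(tn, fp), 3))  # 0.921
```

High sensitivity means few melanoma cases slip through as false negatives, which is why we weight it over specificity for population-level screening.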
According to the ISIC challenge committee, Jaccard indices above 78.6% are sufficient for inter-observer agreement. Our segmentation prediction head achieved Jaccard indices of 79.2% and 80% for the binary and tri-class frameworks respectively. In this work, we demonstrated the strength of capsule networks and how embedding them in multitask learning squeezed out more performance, surpassing the best submissions to the ISIC challenge, keeping in mind the small dataset used for training the model. It is worth noting that this is the first work to jointly address both task 1 (segmentation) and task 3 (classification) of the ISIC 2017 challenge using a single model architecture. This is also a step along the trajectory of explainable AI for healthcare stakeholders, where the model does not merely output predictions but also supplies more in-depth artefacts such as the segmentation mask of the lesion. We know that, as of now, this is not sufficient explanation, and we wish to extend this work with a group of dermatologists who can provide more details, features, attributes, and explanations about each lesion; this would let us extend our model with an extra prediction head offering the explainable points they look for in each lesion prediction. As our study has shown, more multitasking should lead to even better performance, with the added benefit of in-depth explainability insights for healthcare teams. This was the key focus of our study: embedding multitask learning with capsule networks to offer explainability to doctors while achieving phenomenal performance across many aspects. 
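The Jaccard index used for the segmentation evaluation is the intersection-over-union of the predicted and ground-truth lesion masks. A minimal sketch on toy one-dimensional binary masks (real masks are full-resolution two-dimensional segmentations):

```python
def jaccard_index(pred, truth):
    # Intersection-over-union of two binary masks given as flat 0/1 lists.
    inter = sum(p & t for p, t in zip(pred, truth))
    union = sum(p | t for p, t in zip(pred, truth))
    return inter / union if union else 1.0  # two empty masks agree fully

# Toy masks for illustration only:
pred  = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 1, 1, 0]
print(jaccard_index(pred, truth))  # 0.6: 3 shared pixels over 5 covered
```

Against this metric, the 79.2% and 80% scores sit above the 78.6% inter-observer threshold, meaning the model's masks disagree with the expert annotations roughly as little as experts disagree with each other.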
We would also like to further study whether there is a correlation between the centre pixels of each lesion and the predictions, and whether a segmentation model needs to be deployed first to extract the lesions from the derma images before sending them to the capsule network. We believe this would be an important study for analysing whether we can squeeze even more performance out of capsule networks.