Abstract

In this study, we designed a skin lesion classifier based on capsule networks within a multitask learning framework that provides healthcare stakeholders with both a classification of the skin lesion and a segmentation mask of the lesion. Our framework consists of five main parts: a feature extraction section, a capsule network, a capsule map, a classification prediction head, and a segmentation prediction head. The feature extraction funnel is a series of convolution and max-pooling layers that reduces the large derma image into compact, feature-rich activation maps for the capsule network, and also helps reconstruct the segmentation mask at the original image dimensions. The capsule network is the core of our framework; it produces a capsule map, which is the seed map for both the classification and segmentation prediction heads. The capsule map was the big bet of this study: whether or not it would learn to encode the pose and spatial relations of the skin lesion features. Our capsule network learnt excellent weights for feature extraction, which yielded a strong capsule map that gave both prediction heads the boost needed for phenomenal results. The classification prediction head was based on either two or three capsules, for the binary and tri-class frameworks respectively. The segmentation prediction head is a series of transpose convolution layers that merges the capsule map with the activation maps. Our frameworks were evaluated on the ISIC 2017 challenge, which generously offered training, validation, and test sets. The dataset covered three skin pathologies: melanoma, seborrheic keratosis, and nevus. We offer a thorough analysis of our SKINCAPs framework, a multitask learning framework based on capsule networks. We designed two frameworks: a binary framework and a tri-class framework. 
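The downsampling funnel and the transpose-convolution segmentation head described above can be sketched as a spatial shape flow. This is a minimal illustration, not the study's actual architecture: the input size (256), layer count (three), and kernel/stride choices are assumptions made only to show how the funnel shrinks the derma image and the segmentation head restores its original dimensions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Spatial size after a convolution or max-pooling layer.
    return (size + 2 * padding - kernel) // stride + 1

def tconv_out(size, kernel, stride=1, padding=0):
    # Spatial size after a transpose convolution layer.
    return (size - 1) * stride - 2 * padding + kernel

size = 256  # assumed side length of the input derma image
for _ in range(3):  # hypothetical three-stage funnel
    size = conv_out(size, 3, padding=1)  # 3x3 conv keeps spatial size
    size = conv_out(size, 2, stride=2)   # 2x2 max-pool halves it
print(size)  # 32: a compact, feature-rich map for the capsule network

for _ in range(3):  # segmentation head mirrors the funnel
    size = tconv_out(size, 2, stride=2)  # each transpose conv doubles size
print(size)  # 256: back to the original mask dimensions
```

The same bookkeeping generalises to any funnel depth: each pooling stage halves the spatial size, and a matching transpose convolution stage doubles it, so the predicted mask lines up pixel-for-pixel with the input image.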
The key objective of the binary framework was to classify between melanoma and non-melanoma cases (the latter comprising seborrheic keratosis and nevus), while the tri-class framework classified melanoma, seborrheic keratosis, and nevus cases separately. Both frameworks also produce segmentation masks for the skin lesion. We analysed our work thoroughly by comparing the two frameworks against each other, against the best submissions to ISIC 2017 challenge tasks 1 and 3 (segmentation and classification), and against one of the best capsule network research works on skin lesion analysis. Our analysis showed that the tri-class framework outperformed the binary framework, offering a sensitivity of 91.7% vs 71.8%. This confirmed our hypothesis about the capsule map: it performs better in the classification prediction head when the classes are disentangled separately than when seborrheic keratosis and nevus are combined into a single map, as in our binary framework. It was also clear that our tri-class models performed superiorly on sensitivity, which matters most for initial screening of the general population. We believe that for deploying AI systems in the healthcare industry, it is crucial to prioritise high sensitivity over specificity. Moreover, the multitask learning framework helped our model squeeze out better performance than the plain classification route. In general, both frameworks offered substantial improvements: the binary framework achieved an AUC of 98.3% and the tri-class framework an AUC of 99.1%. Our framework's segmentation predictions were also very close to the expert ground truth annotations. 
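The sensitivity and specificity figures quoted above come directly from the confusion counts on the melanoma class. As a reminder of the definitions, here is a small sketch; the counts used are illustrative only and are not the study's actual confusion matrix.

```python
def sensitivity(tp, fn):
    # True-positive rate on the positive (melanoma) class: TP / (TP + FN).
    return tp / (tp + fn)

def specificity(tn, fp):
    # True-negative rate on the non-melanoma classes: TN / (TN + FP).
    return tn / (tn + fp)

# Illustrative counts only (hypothetical, not from the study):
tp, fn, tn, fp = 110, 10, 350, 30
print(round(sensitivity(tp, fn), 3))  # 0.917, i.e. a 91.7% sensitivity
print(round(specificity(tn, fp), 3))  # 0.921
```

High sensitivity means few melanoma cases slip through as false negatives, which is why we weight it over specificity for population-level screening.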
According to the ISIC challenge committee, Jaccard indices above 78.6% are sufficient for inter-observer agreement. Our segmentation prediction head achieved Jaccard indices of 79.2% and 80% for the binary and tri-class frameworks respectively. In this work, we demonstrated the strength of capsule networks and how embedding them in multitask learning squeezed out more performance, surpassing the best submissions to the ISIC challenge, keeping in mind the small dataset used for training the model. It is worth noting that this is the first work to jointly address both task 1 (segmentation) and task 3 (classification) of the ISIC 2017 challenge using a single model architecture. This is also a step along the trajectory of explainable AI for healthcare stakeholders, where the model does not merely output predictions but also supplies more in-depth artefacts such as the segmentation mask of the lesion. We know that, as of now, this is not sufficient explanation, and we wish to extend this work with a group of dermatologists who can provide more details, features, attributes, and explanations about each lesion; this would let us extend our model with an extra prediction head offering the explainable points they look for in each lesion prediction. As our study has shown, more multitasking should lead to even better performance, with the added benefit of in-depth explainability insights for healthcare teams. This was the key focus of our study: embedding multitask learning with capsule networks to offer explainability to doctors while achieving phenomenal performance across many aspects. 
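The Jaccard index used for the segmentation evaluation is the intersection-over-union of the predicted and ground-truth lesion masks. A minimal sketch on toy one-dimensional binary masks (real masks are full-resolution two-dimensional segmentations):

```python
def jaccard_index(pred, truth):
    # Intersection-over-union of two binary masks given as flat 0/1 lists.
    inter = sum(p & t for p, t in zip(pred, truth))
    union = sum(p | t for p, t in zip(pred, truth))
    return inter / union if union else 1.0  # two empty masks agree fully

# Toy masks for illustration only:
pred  = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 1, 1, 0]
print(jaccard_index(pred, truth))  # 0.6: 3 shared pixels over 5 covered
```

Against this metric, the 79.2% and 80% scores sit above the 78.6% inter-observer threshold, meaning the model's masks disagree with the expert annotations roughly as little as experts disagree with each other.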
We would also like to further study whether there is a correlation between the centre pixels of each lesion and the predictions, and whether a segmentation model needs to be deployed first to extract the lesions from the derma images before sending them to the capsule network. We believe this would be an important study for analysing whether we can squeeze even more performance out of capsule networks.