In this assignment you are required to write code for detecting infections such as COVID-19 in X-Ray images:
- Use a CNN, pre-trained on ImageNet, to extract basic features from X-Ray images.
- Train the classification layers in order to detect instances of Infected (COVID-19 + Pneumonia) and Normal X-Ray images.
- Fine-tune the entire network to try to improve performance.
This assignment must be completed using PyTorch. Assignments in TensorFlow/Keras or any other deep learning library WILL NOT BE ACCEPTED.
Chest X-Ray Images Dataset
New studies [1] have revealed that the damage done to lungs by infections belonging to the family of coronaviruses (COVID-19, SARS, ARDS etc.) can be observed in X-Ray and CT scan images. With a worldwide shortage of test kits, careful analysis of X-Ray and CT scan images may be useful for diagnosing COVID-19 and for regularly assessing patients while they are recovering. In this assignment, we will use an open source dataset of X-Ray images and train a Convolutional Neural Network to try to detect instances of infection (COVID-19 and Pneumonia).
This dataset contains X-Ray images from 2 classes:
| Class | # of images in training set | # of images in validation set | # of images in test set |
Chest X-Ray images are taken in different views (AP or PA) depending on which side of the body is facing the X-Ray scanner. Images from different views have slightly different features. For this task, we will be using images without considering their views. A few sample images:
Fine-tuning in PyTorch:
In PyTorch, each layer’s weights are stored in a Tensor. Each tensor has an attribute called ‘requires_grad’, which specifies whether that tensor is updated during training. In fine-tuning tasks, we freeze the pre-trained network up to a certain layer and update only the remaining layers. In PyTorch we can loop through the network’s layers and set ‘requires_grad’ to False for every layer that we want frozen, and to True for every layer we want to fine-tune.
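The freezing step described above can be sketched as follows. This is a minimal illustration, not the required solution: the small `nn.Sequential` network is a hypothetical stand-in for a pre-trained model, and the optimizer, learning rate and random input are assumptions chosen only to make the example run on its own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small stand-in network (hypothetical); a pre-trained ResNet-18 or VGG-16
# exposes its parameters in the same way.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),  # index 0: plays the "pre-trained" conv layer
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),                            # index 4: classification layer
)

# Freeze every parameter first.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the layer we want to train.
for param in model[4].parameters():
    param.requires_grad = True

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)

# One training step on random data to check that frozen weights do not move.
conv_before = model[0].weight.clone()
fc_bias_before = model[4].bias.clone()

inputs = torch.randn(4, 3, 32, 32)
labels = torch.tensor([0, 1, 0, 1])
loss = nn.CrossEntropyLoss()(model(inputs), labels)

optimizer.zero_grad()
loss.backward()
optimizer.step()

frozen_unchanged = torch.equal(conv_before, model[0].weight)
head_changed = not torch.equal(fc_bias_before, model[4].bias)
```

Passing only the trainable parameters to the optimizer is a design choice of this sketch; frozen parameters receive no gradient either way, but filtering them makes the experimental setup explicit.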
Besides the code and report, you are all required to create a public GitHub repository where you will upload your code and results. The following conventions apply to the repository only:
- Name your code notebook as covid19_classification.ipynb
- Name your repository as rollNumber_COVID19_DLSpring2020
- Show your results (confusion matrices and accuracy) in Markdown (md)
- Create a heading of Dataset and provide the link that was shared with you on Classroom.
- Create a folder named ‘weights’ and upload fine-tuned models. Naming convention of models is mentioned in each respective task.
- Add the following description to repository
“This repository contains code and results for COVID-19 classification assignment by
Deep Learning Spring 2020 course offered at Information Technology University, Lahore, Pakistan. This assignment is only for learning purposes and is not intended to be used for clinical purposes.”
- You may refer to the following repository to see how they have organized their results and description:
Task 1: Load pretrained CNN model and fine-tune FC Layers
- In this task you will fine-tune two networks (ResNet-18 and VGG-16) pre-trained on ImageNet.
- Load these models in PyTorch and freeze all the layers except the last FC layers.
- Replace the FC layers with 2 FC layers. The first FC layer will have a number of neurons equal to:
○ (Last 2 digits of your roll number × 10) + 100
- The last FC layer will have as many neurons as there are classes
- You may try different learning rates
- Save your models and name them ‘vgg16_FC_Only.pth’ and ‘res18_FC_Only.pth’
Task 2: Fine-tune the CNN and FC layers of the network
- In this task you will fine-tune two networks (ResNet-18 and VGG-16) pre-trained on ImageNet
- Perform different experiments where you first unfreeze only a few Convolutional layers and then the entire network and fine-tune on your dataset
- Compare the performance of training in different experiments. Show what effect it has on accuracy when you fine-tune just FC layers, then a single Conv layer, then a few Conv layers and then the entire network.
- Save the models in which you fine-tuned the whole network and name them ‘vgg16_entire.pth’ and ‘pth’
In your report, for each task, you are required to provide the following:
- Confusion Matrix for train, test and validation sets
- Loss and accuracy curves for train and validation sets
- Experimental setup (learning rate, number of layers fine-tuned etc.)
- Two well-classified images and two of the worst-classified images from each class
- Final accuracy and F1 score for each experiment
- GitHub Repository link
- Analysis on each task and comparison of experiments to each other
- Share the loss and accuracy curves on the train and validation sets for both tasks.
- Use the same axis scales for both tasks so that the curves are comparable.
- Discuss Task 1 vs. Task 2: how and why the results differ, which approach works better, and why.
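The confusion matrix, accuracy and F1 score requested above can be computed from true and predicted labels as in the following sketch. The label encoding (0 = Normal, 1 = Infected), the function names and the toy label lists are assumptions made for illustration; they are not results from the dataset.

```python
def confusion_matrix(y_true, y_pred, num_classes=2):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def binary_scores(matrix, positive=1):
    """Accuracy, precision, recall and F1 for the chosen positive class."""
    tp = matrix[positive][positive]
    fp = sum(row[positive] for row in matrix) - tp   # predicted positive, actually not
    fn = sum(matrix[positive]) - tp                  # actually positive, predicted not
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels (assumed encoding: 0 = Normal, 1 = Infected).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

matrix = confusion_matrix(y_true, y_pred)   # [[3, 1], [1, 3]]
scores = binary_scores(matrix)              # accuracy 0.75, F1 0.75
```

In this toy example there are 3 true positives, 1 false positive and 1 false negative, so precision and recall are both 3/4 and the F1 score is 0.75.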
Please perform all tasks in the same notebook. Do NOT create separate notebooks for each task.
[1] Zu, Zi Yue, et al. “Coronavirus disease 2019 (COVID-19): a perspective from China.” Radiology (2020): 200490.