Description
Assignment #4
Deep Fashion
Transfer Learning and Multi-task Learning
Overview: Transfer Learning
• As discussed in lecture, transfer learning plays an essential role in many vision tasks.
• Torchvision provides many model architectures, along with weights pre-trained on the large, general-purpose ImageNet dataset.
Overview: MTL (Multi-Task Learning)
• Multi-Task Learning (MTL) is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias.
• It does this by learning tasks in parallel while using a shared representation; what is learned for each task can help other tasks be learned better.
• In this assignment, you will gain experience in transfer learning and MTL. You are to implement a multi-task model that predicts the category and attributes of a fashion item.
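The shared-representation idea can be sketched as a single network with one trunk and two output heads. This is an illustrative layout, not the required architecture: the class name `MultiTaskNet`, the tiny convolutional trunk, and the layer sizes are all assumptions, and the head sizes follow the 10 categories and 15 attributes used in this assignment.

```python
import torch
from torch import nn


class MultiTaskNet(nn.Module):
    """Shared trunk with one linear head per task (hypothetical layout)."""

    def __init__(self, num_categories: int = 10, num_attributes: int = 15):
        super().__init__()
        # Shared representation: gradients from both tasks flow into these layers.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Task-specific heads that read the same shared features.
        self.category_head = nn.Linear(32, num_categories)
        self.attribute_head = nn.Linear(32, num_attributes)

    def forward(self, images):
        shared = self.trunk(images)
        return self.category_head(shared), self.attribute_head(shared)


model = MultiTaskNet()
category_logits, attribute_logits = model(torch.randn(4, 3, 64, 64))
```

Both heads return raw logits; the trunk here is deliberately small and could be swapped for a deeper backbone without changing the two-head structure.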
Deep Fashion
- Deep Fashion is a large-scale clothing dataset from The Chinese University of Hong Kong.
- The dataset has over 800K images (different angles and different scenes).
• Each image in the dataset is labeled with:
1. A category, 1 of 50 (multi-class)
2. Attributes, drawn from 1,000 (multi-label)
3. Bounding box
4. Landmarks
Category: 0 (dress); Attributes: floral, maxi
- 10 categories were selected from the source dataset, for a total of 55,845 images.
- 15 attributes were selected to compose this dataset.
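As a rough illustration of how the two kinds of labels might be represented for training, the category can be stored as a single integer index and the attributes as a multi-hot vector. The vocabulary below is hypothetical: only "floral" and "maxi" appear in this document, the other names are placeholders, and the real vector would have 15 entries.

```python
# Hypothetical attribute vocabulary: only "floral" and "maxi" come from the
# assignment text; the remaining names are placeholders for illustration.
ATTRIBUTE_NAMES = ["floral", "maxi", "striped", "sleeveless"]


def encode_attributes(names, vocabulary=ATTRIBUTE_NAMES):
    """Return a multi-hot list: 1 where the attribute is present, else 0."""
    present = set(names)
    return [1 if name in present else 0 for name in vocabulary]


# The "dress" example: one class index plus a multi-hot attribute vector.
category_label = 0
attribute_label = encode_attributes(["floral", "maxi"])
```

The integer index suits a multi-class loss, while the multi-hot vector lets each attribute be treated as its own yes/no target.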
Your task
• Build a deep network (possibly starting from a pre-trained one) that predicts the category and attributes of an item simultaneously (multi-tasking).
• The output has two parts:
• Category (multi-class classification):
• Each image is classified into exactly 1 of 10 categories
• Attribute (multi-label classification):
• Each image can be labeled with some of the 15 attributes (possibly more than one)
• Consider the choice of activation and loss function for each output.
• Note: DO NOT build two separate models.
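One common pairing for these two output types, shown here as a sketch rather than the required answer, is softmax cross-entropy for the mutually exclusive categories and per-attribute sigmoid with binary cross-entropy for the attributes. The random tensors stand in for model outputs, and the equal loss weighting and the 0.5 threshold are assumptions to tune.

```python
import torch
from torch import nn

torch.manual_seed(0)
batch_size, num_categories, num_attributes = 4, 10, 15

# Stand-ins for the two outputs of a multi-task model (raw logits).
category_logits = torch.randn(batch_size, num_categories, requires_grad=True)
attribute_logits = torch.randn(batch_size, num_attributes, requires_grad=True)

# Category targets are class indices; attribute targets are multi-hot floats.
category_targets = torch.randint(0, num_categories, (batch_size,))
attribute_targets = torch.randint(0, 2, (batch_size, num_attributes)).float()

# CrossEntropyLoss applies log-softmax internally: classes compete.
category_loss = nn.CrossEntropyLoss()(category_logits, category_targets)
# BCEWithLogitsLoss applies a sigmoid per attribute: attributes are independent.
attribute_loss = nn.BCEWithLogitsLoss()(attribute_logits, attribute_targets)

# Assumption: equal weighting; the relative weight is a tunable hyperparameter.
attribute_weight = 1.0
total_loss = category_loss + attribute_weight * attribute_loss
total_loss.backward()

# At inference: argmax for the category, thresholded sigmoid for attributes.
predicted_category = category_logits.argmax(dim=1)
predicted_attributes = (torch.sigmoid(attribute_logits) > 0.5).int()
```

Summing the two losses into one scalar is what lets a single backward pass update the shared layers for both tasks.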
Evaluation
• Category
• Metric: Accuracy
• Submission format
• Attribute
• Metric: Mean F1-Score
• Submission format
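The two metrics can be sketched in plain Python as follows. The assignment does not say whether the F1 mean is taken over samples or over attributes; this sketch assumes a per-sample F1 averaged over all samples, and it also assumes that two empty attribute sets score 1.0. The helper names and the "striped" attribute are hypothetical.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted category matches the true one."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def sample_f1(true_set, predicted_set):
    """F1 between the true and predicted attribute sets of one sample."""
    true_set, predicted_set = set(true_set), set(predicted_set)
    if not true_set and not predicted_set:
        return 1.0  # Assumption: two empty sets count as perfect agreement.
    overlap = len(true_set & predicted_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted_set)
    recall = overlap / len(true_set)
    return 2 * precision * recall / (precision + recall)


def mean_f1(true_sets, predicted_sets):
    """Average of the per-sample F1 scores (assumed averaging scheme)."""
    scores = [sample_f1(t, p) for t, p in zip(true_sets, predicted_sets)]
    return sum(scores) / len(scores)


category_accuracy = accuracy([0, 3, 7, 7], [0, 3, 2, 7])
attribute_mean_f1 = mean_f1(
    [{"floral", "maxi"}, {"striped"}],
    [{"floral"}, {"striped"}],
)
```

In this toy case three of four categories match, and the first sample misses one of its two attributes, so its F1 is lower than the second sample's perfect score.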
Hints from my 2020 self (important): https://hackmd.io/@teacher144123/HyfKB639w
Things you cannot do
- You cannot submit results predicted by others.
- You cannot copy trained models from others.
- You cannot copy code from others, the internet, GitHub …
- You cannot collect more images to train your model in order to boost performance.
- You cannot use the weights of a pre-trained model.
Any violation will result in a score of 0!