NSYSU Assignment 5-Image Captioning Solved

35.00 $

Category:

Description

5/5 - (1 vote)

Assignment #5

Image Captioning

Overview

  • In the previous assignments, we implemented multiple image classification tasks.
  • In this assignment, you will design and train a neural network which combines CNN and RNN to process an input image, then output a sequence that describe the image.
  • You are free to use pre-trained models like ResNet or LSTM as your backbone structure.

Image Captioning

Image captioning is an interdisciplinary research problem that stands between computer vision and natural language processing.

Flickr8k Dataset

• Flickr8k-Images-Captions
• CollectedbyAlexanderMamaev.
• Sentence-basedimagedescriptionandsearch • Consistingof8,091imagesthatareeach

paired with five different captions

A child in a pink dress is climbing up a set of stairs in an entry way .

Assignment #5 Dataset

8091 imags

captions

Your task

  • We have code skeleton for you guys.
  • https://colab.research.google.com/drive/1E96yjndJyBTAEEcgSth-

    RyAVqd4H1WcW?usp=sharing

  • Design a convolutional neural network to do image captioning.
  • The images provided are of different resolutions. You’ll need to resize the images
  • To get a high accuracy, you’ll need to experiment with different filter sizes, different number of layers, and other design principles discussed in class to figure out a network architecture that works best.
  • You’ll also need to try data augmentation, dropout, batch normalization as well as different optimizers and other tricks to boost performance.

into a fixed size of your own choice.

Things you cannot do

• You cannot copy trained models from others.
• You cannot copy a whole page of code from the Internet.

Any violation will result in no points!

 

  • Assignment5_Image_Captioning-pxtfl8.zip