CMU 11-485 Homework 3 Part 1: The Rise of RNN

  • You are required to do this assignment in the Python (version 3) programming language. Do not use any auto-differentiation toolboxes (PyTorch, TensorFlow, Keras, etc.); you are only permitted, and recommended, to vectorize your computation using the Numpy library.
  • We recommend that you look through all of the problems before attempting the first problem. However, we do recommend you complete the problems in order, as the difficulty increases and questions often rely on the completion of previous questions.
  • If you haven't done so already, learn to use pdb to debug your code effectively.

Introduction

In this assignment, you will continue to develop your own version of PyTorch, which is of course called MyTorch (still a brilliant name; a master stroke. Well done!).

Homework Structure

Below is a list of files that are directly relevant to hw3.

IMPORTANT: First, copy the highlighted files/folders from the HW3P1 handout over to the corresponding folders that you used in hw1 and hw2.

NOTE: We recommend you make a backup of your hw3 files before copying everything over, just in case you break code or want to revert back to an earlier version.

Next, copy and paste the following code stubs from hw3/stubs.py into the correct files.
  1. Copy Slice(Function) into nn/functional.py.
  2. Copy Cat(Function) into nn/functional.py.
  3. Copy unsqueeze(), __getitem__() and __len__() into tensor.py. These are methods for class Tensor.
  4. Copy function cat into tensor.py. This is an independent function and not a class method.

NOTE: You may need to define __pow__, which implicitly calls the Pow(Function), if you haven't done so already. This should be able to handle int exponents. Check the HW3P1 FAQ for help.

0.1       Running/Submitting Code

This section covers how to test code locally and how to create the final submission.

0.1.1         Running Local Autograder

Run the command below to calculate scores and test your code locally.

./grade.sh 3

If this doesn’t work, converting line-endings may help:

sudo apt install dos2unix
dos2unix grade.sh

./grade.sh 3

If all else fails, you can run the autograder manually with this:

python3 ./autograder/hw3_autograder/runner.py

0.1.2         Running the Sandbox

We’ve provided sandbox.py: a script to test and easily debug basic operations and autograd.

Note: We will not provide new sandbox methods for this homework. You are required to write your own from now onwards.

python3 sandbox.py

0.1.3         Submitting to Autolab

Note: You can submit to Autolab even if you’re not finished yet. You should do this early and often, as it guarantees you a minimum grade and helps avoid last-minute problems with Autolab.

Run this script to gather the needed files into a handin.tar file:

./create_tarball.sh

If this crashes (with some message about a hw4 folder), use dos2unix on this file too.

You can now upload handin.tar to Autolab.

1           RNN

In mytorch/nn/rnn.py we will implement a full-fledged Recurrent Neural Network module with the ability to handle variable length inputs in the same batch.

1.1         RNN Unit

Follow the starter code available in mytorch/nn/rnn.py to get a better sense of the various attributes of the RNNUnit.

Figure 1: The computation flow for the RNN Unit forward.

h′t = tanh(Wihxt + bih + Whhht−1 + bhh)                                                            (1)

The equation you should follow is given in equation 1.

You are required to implement the forward method of the RNNUnit module. The description of the inputs and expected outputs is specified below:

Inputs

  • x (effective batch size, input size)
    • Input at the current time step.
    • NOTE: For now, interpret effective batch size the same as regular batch size. The difference will become apparent later in the homework. For the definition of effective batch size, check the Appendix and/or Figure 5
  • h (effective batch size, hidden size)
    • Hidden state generated by the previous time step, ht−1

Outputs

  • h prime: (effective batch size, hidden size)

– New hidden state generated at the current time step, h′t

NOTE: As a matter of convention, if you apply certain reshape/transpose operations while using self.weight_ih, then please do the same for self.weight_hh. This is important to note because self.weight_hh is square, and its shape alone is therefore open to multiple interpretations on how to use it.
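To make equation (1) concrete, here is a minimal NumPy-only sketch of the forward computation. It is not the MyTorch implementation itself; the weight shapes follow the PyTorch convention of weight_ih being (hidden size, input size) and weight_hh being (hidden size, hidden size), which is an assumption, so transpose accordingly if your stored shapes differ.

import numpy as np

def rnn_unit_forward_numpy(x, h, W_ih, b_ih, W_hh, b_hh):
    # Equation (1): h't = tanh(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh)
    # x : (effective_batch_size, input_size), h : (effective_batch_size, hidden_size)
    return np.tanh(x @ W_ih.T + b_ih + h @ W_hh.T + b_hh)

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
batch, in_size, hid_size = 4, 3, 5
x = rng.standard_normal((batch, in_size))
h = rng.standard_normal((batch, hid_size))
W_ih = rng.standard_normal((hid_size, in_size))
W_hh = rng.standard_normal((hid_size, hid_size))
b_ih = rng.standard_normal(hid_size)
b_hh = rng.standard_normal(hid_size)
print(rnn_unit_forward_numpy(x, h, W_ih, b_ih, W_hh, b_hh).shape)  # (4, 5)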

1.2         Detour 1: Cat, Slice, Unsqueeze

In this section we implement some necessary functions which will prove to be extremely handy while completing the rest of the homework, namely Cat, Slice and Unsqueeze.

1.2.1         Cat (7 points)

Concatenates the given sequence of tensors in the given dimension. All tensors must either have the same shape (except in the concatenating dimension) or be empty. Please refer to the PyTorch documentation for a better understanding. If you are not into documentation, then please refer back to Recitation 0, where this was covered as well.

First implement the corresponding Cat class in mytorch/nn/functional.py. This will be a subclass of the Function class.

Next implement a helper function cat in mytorch/tensor.py to call the concatenation operation on a list of sequences. This should in turn correctly call the corresponding Function sub-class. Below is a description of the required inputs and outputs.

Inputs

  • seq: list of tensors
    • The list basically contains the sequences we want to concatenate
  • dim: (int, default=0)
    • The dimension along which we are supposed to concatenate the tensors in the list seq

Outputs

  • Tensor:
    • The concatenated tensor

NOTE: You don’t need to add anything to the Tensor class in mytorch/tensor.py with respect to the Cat operation.
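As a rough guide to what Cat has to do in both directions, here is a plain-NumPy sketch. The function names are illustrative only; your actual implementation lives in the Cat(Function) subclass and must operate on Tensors so that autograd works.

import numpy as np

def cat_forward(arrays, dim=0):
    # Forward: concatenate the underlying ndarrays along `dim`.
    return np.concatenate(arrays, axis=dim)

def cat_backward(grad_output, arrays, dim=0):
    # Backward: split the incoming gradient back into chunks whose
    # sizes along `dim` match the original inputs, one chunk per input.
    sizes = [a.shape[dim] for a in arrays]
    boundaries = np.cumsum(sizes)[:-1]
    return np.split(grad_output, boundaries, axis=dim)

a, b = np.ones((2, 3)), np.zeros((4, 3))
out = cat_forward([a, b], dim=0)                      # shape (6, 3)
grads = cat_backward(np.ones_like(out), [a, b], dim=0)
print([g.shape for g in grads])                       # [(2, 3), (4, 3)]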

1.2.2         Slice (5 points)

Despite being worth only 5 points, implementing this operation takes your Tensor class to a whole new level. With this operation up and running, you can index your Tensor class just like you index Numpy arrays/PyTorch tensors, while autograd takes charge of calculating the required gradients (assuming you implemented the backward correctly).

First implement the corresponding Slice class in mytorch/nn/functional.py. This will be a subclass of the Function class.

Next add the __getitem__ method to your Tensor class in mytorch/tensor.py, which in turn calls the Slice Function sub-class.

HINT

The implementation of a slicing operation from scratch may appear to be a daunting task, but we will employ a cool trick to get this done in just a few lines of code. Whenever we try to slice a segment of a given tensor using the [] notation, Python creates a key (depending on what you provide within []) and passes this to the __getitem__ method. This key is used by the class to provide the appropriate result to the user. The key can be an integer, a Python slice object, a list, etc., depending on how we slice our object. This is the principle implemented in Numpy for slicing ndarrays, and we will leverage the same fact to make our task easy. Once we create a __getitem__ method for our Tensor class, every time we slice our tensors the __getitem__ method will be invoked with the appropriate key. Given that the intention of slicing our tensor is to get the appropriate segment of the underlying ndarray (the data attribute) as a Tensor object, can you use this key to complete the task?
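The hint above boils down to a few lines. A NumPy-level sketch of the two directions might look like this (illustrative names only, not the required API):

import numpy as np

def slice_forward(data, key):
    # Forward: index the underlying ndarray with the key Python built for us.
    return data[key]

def slice_backward(grad_output, input_shape, key):
    # Backward: only the selected positions receive gradient;
    # everything that was not selected gets zero.
    grad = np.zeros(input_shape, dtype=grad_output.dtype)
    grad[key] += grad_output
    return grad

x = np.arange(12.0).reshape(3, 4)
y = slice_forward(x, (slice(0, 2), 1))                # same as x[0:2, 1]
gx = slice_backward(np.ones_like(y), x.shape, (slice(0, 2), 1))
print(y, gx.sum())                                    # [1. 5.] 2.0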

1.2.3         Unsqueeze (3 points)

Returns a new tensor with a dimension of size one inserted at the specified position. Please refer to the PyTorch documentation for a better understanding. If you are not into documentation, then please refer back to Recitation 0, where this was covered as well.

Add the unsqueeze method to your Tensor class in mytorch/tensor.py.

HINT: Why is this only worth 3 points? Well, if you look closely, you might be able to use a function you implemented previously without any extra effort. The name of the Numpy function you might need begins with ’R’.
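In NumPy terms, the hint amounts to something like the following sketch; your Tensor-level unsqueeze should of course go through your existing reshape operation so autograd keeps working.

import numpy as np

def unsqueeze_numpy(data, dim=0):
    # Insert a size-1 axis at position `dim` purely via a reshape.
    new_shape = data.shape[:dim] + (1,) + data.shape[dim:]
    return data.reshape(new_shape)

x = np.arange(6).reshape(2, 3)
print(unsqueeze_numpy(x, 0).shape)  # (1, 2, 3)
print(unsqueeze_numpy(x, 1).shape)  # (2, 1, 3)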

1.3         Detour 2: pack_sequence, unpack_sequence

What’s all the fuss about handling variable length samples in a batch?

This section is devoted to the construction of objects of class PackedSequence which help us create batches in the context of RNNs.

As you might have guessed by now, handling variable length sequences in a batch requires far more machinery than what exists in a simple RNN. In this section we will create all the utility functions that will help us handle variable length sequences in a single batch.

Before we proceed let’s understand an important class that packages our packed sequences in a form which makes it easier for us to deal with them. This will also help you better understand why handling variable length sequences in a batch can be a tedious task.

Refer to the Appendix for more details.

Please use the Tensor Slicing and Cat operations you implemented in the previous sections while implementing these functions.

1.3.1         pack_sequence (15 points)

In mytorch/nn/util.py implement the method pack_sequence.

This function, as the name suggests, packs a list of variable length Tensors into an object of class PackedSequence. A list of tensors is given (as shown in Figure 2) which is to be converted into a PackedSequence object (detailed class description below). This object holds the packed 2d tensor which is later fed into the RNN module for training and testing. Refer to Figures 2, 3 and 4 to better understand what pack_sequence does.

Figure 2: List of tensors we want to pack

Figure 3: First we sort the list in descending order based on the number of timesteps in each sample

Figure 4: Final Packed 2d Tensor

Class: PackedSequence

Attributes

  • data
    • The actual packed tensor. In the context of this homework this is a 2D tensor. Refer to 4
  • sorted_indices
    • 1d ndarray (numpy n-dimensional array) of integers containing the indices of the elements in the original list when they are sorted based on the number of time steps in each sample. Refer to Figure 2 and Figure 3
  • batch_sizes
    • 1d ndarray of integers where the ith element represents the number of samples in the original list having at least (i+1) timesteps. Refer to Figure 4 for more clarity.

Function: pack_sequence

Inputs

  • seq: list of tensors

– The list contains tensors representing individual instances, each having a variable length.

Outputs

  • PackedSequence:

– The PackedSequence object constructed from the input list (see the sketch below).
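The following NumPy-only sketch shows the packing logic on raw ndarrays, matching Figures 2 to 4. The attribute names mirror the description above; the real pack_sequence must do the equivalent with your Tensor Slice and Cat operations so that gradients flow.

import numpy as np

def pack_sequence_numpy(seqs):
    # seqs: list of (T_i, K) arrays. Returns (data, sorted_indices, batch_sizes).
    # Sort indices by number of timesteps, longest first (Figure 3).
    sorted_indices = np.argsort([-s.shape[0] for s in seqs], kind="stable")
    ordered = [seqs[i] for i in sorted_indices]

    max_T = ordered[0].shape[0]
    rows, batch_sizes = [], []
    for t in range(max_T):
        # Every sample with at least t+1 timesteps contributes its t-th row.
        step = [s[t] for s in ordered if s.shape[0] > t]
        batch_sizes.append(len(step))
        rows.extend(step)

    data = np.stack(rows)                      # packed 2d tensor (Figure 4)
    return data, sorted_indices, np.array(batch_sizes)

seqs = [np.random.randn(T, 2) for T in (3, 1, 2)]
data, idx, bs = pack_sequence_numpy(seqs)
print(data.shape, idx, bs)                     # (6, 2) [0 2 1] [3 2 1]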

1.3.2         unpack_sequence

In mytorch/nn/util.py implement the method unpack_sequence.

This function unpacks a given PackedSequence object into a list of tensors that were used to create it. This is also a part of the class PackedSequence.

Function: unpack_sequence

Inputs

  • ps: PackedSequence

Outputs

  • list of Tensors:
    • The original list of tensors that was used to create the PackedSequence, in their original order (see the sketch below).
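Continuing the NumPy sketch from pack_sequence above, unpacking walks the packed rows one timestep block at a time and then undoes the sort. Again, this is illustrative only; your version operates on Tensors and the PackedSequence attributes.

import numpy as np

def unpack_sequence_numpy(data, sorted_indices, batch_sizes):
    # Rebuild each (sorted) sequence by collecting its row from every
    # timestep block it participates in.
    num_seqs = int(batch_sizes[0])
    pieces = [[] for _ in range(num_seqs)]
    offset = 0
    for bs in batch_sizes:
        for j in range(bs):
            pieces[j].append(data[offset + j])
        offset += bs
    sorted_seqs = [np.stack(p) for p in pieces]

    # Undo the descending-length sort so the original order comes back.
    out = [None] * num_seqs
    for pos, orig_idx in enumerate(sorted_indices):
        out[orig_idx] = sorted_seqs[pos]
    return out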

1.4         TimeIterator

In mytorch/nn/rnn.py implement the forward method for the TimeIterator.

For a given input, this class iterates through time by processing the entire sequence of timesteps. It can be thought of as representing a single layer for a given basic unit (interpret this as RNNUnit for now) which is applied at each time step.

Working of forward method in TimeIterator

  • This module’s forward method is tasked with receiving a batch of samples packed in the PackedSequence form.
  • The method runs the section of the input corresponding to a single time-step across the batches through an RNNUnit.
  • The hidden state returned by the RNNUnit is collected, and then the section of the input corresponding to the next time-step across the batches is fed through the RNNUnit along with the previously collected hidden state (from the last time step) to generate a new hidden state.
  • This process is done iteratively till all time-steps are exhausted for each sample in the batch.
  • Follow Figures 5, 6, 7 and 8 to get a better understanding of how the TimeIterator processes a given input PackedSequence. A sketch of this loop is given after the inputs and outputs below.

The inputs and outputs of the forward method are given below.

Class: TimeIterator, forward method

Inputs

  • input: PackedSequence
  • hidden: Tensor (batch size,hidden size)

Outputs

  • PackedSequence: hidden states at each timestep for each sample, packaged as a packed sequence
  • Tensor (batch size, hidden size): This is the hidden state generated by the last time step for each sample, joined together. Samples are ordered in descending order based on number of timesteps. This is a slight deviation from PyTorch.

Finally, in mytorch/nn/rnn.py create a class RNN as a sub-class of TimeIterator which instantiates the base class with basic_unit = RNNUnit. Please look at the code for more details.
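As a rough picture of the loop the forward method has to run, here is a NumPy sketch that consumes the packed data and batch sizes produced by the pack_sequence sketch above. step_fn stands in for an RNNUnit forward; the zero initial hidden state is an assumption, so check the starter code for the actual behaviour when hidden is not provided.

import numpy as np

def time_iterator_forward_numpy(data, batch_sizes, step_fn, hidden0=None, hidden_size=None):
    # data        : packed 2D array (total_steps, input_size)
    # batch_sizes : effective batch size at each timestep (non-increasing)
    # step_fn     : callable (x_t, h_t) -> h_next, e.g. an RNNUnit forward
    if hidden0 is None:
        hidden0 = np.zeros((int(batch_sizes[0]), hidden_size))  # assumed default
    hiddens = []
    last_hidden = [None] * int(batch_sizes[0])
    h = hidden0
    offset = 0
    for bs in batch_sizes:
        x_t = data[offset:offset + bs]     # current effective batch (Figures 5-7)
        h = step_fn(x_t, h[:bs])           # only the samples that are still active
        hiddens.append(h)
        for j in range(bs):
            last_hidden[j] = h[j]          # newest hidden state seen for sample j
        offset += bs
    packed_hiddens = np.concatenate(hiddens, axis=0)
    final_hidden = np.stack(last_hidden)   # one row per sample, sorted by length
    return packed_hiddens, final_hidden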

Figure 5: Iteration 0 for the TimeIterator

Figure 6: Iteration 1 for the TimeIterator

Figure 7: Iteration 2 for the TimeIterator

Figure 8: The final output from the TimeIterator

2           GRU

In this section we will explore the world of GRU. We will heavily use the machinery built for RNN to make our lives easier and create a full-fledged working GRU.

2.1         GRU Unit

In mytorch/nn/gru.py implement the forward pass for a GRUUnit (though we follow a slightly different naming convention than the PyTorch documentation). The equations for a GRU cell are the following:

rt = σ(Wirxt + bir + Whrht−1 + bhr)                (2)
zt = σ(Wizxt + biz + Whzht−1 + bhz)                (3)
nt = tanh(Winxt + bin + rt ⊗ (Whnht−1 + bhn))      (4)
ht = (1 − zt) ⊗ nt + zt ⊗ ht−1                     (5)

Please refer to (and use) the GRUUnit class attributes defined in the __init__ method. You are not expected to define any extra attributes for a working implementation of this class.

The inputs to the GRUUnit forward method are x and h, represented as xt and ht−1 in the equations above. These are the inputs at time t.

The output of the forward method is ht in the equations above.

You are required to implement the forward method of this module. The description of the inputs and expected outputs is specified below:

Inputs

  • x (effective batch size, input size)
    • Input at the current time step.
    • For the definition of effective batch size, check the Appendix and/or Figure 5
  • h (effective batch size, hidden size)
    • Hidden state generated by the previous time step, ht−1

Outputs

  • h prime: (effective batch size, hidden size)

– New hidden state generated at the current time step, h′t
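For a concrete reading of equations (2) to (5), here is a NumPy sketch of a single GRU step. The weight/bias names and the (hidden size, input size) shapes are assumptions in the spirit of the earlier RNNUnit sketch, so map them onto the actual attributes defined in __init__; ⊗ is elementwise multiplication.

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_unit_forward_numpy(x, h,
                           W_ir, b_ir, W_iz, b_iz, W_in, b_in,
                           W_hr, b_hr, W_hz, b_hz, W_hn, b_hn):
    r = sigmoid(x @ W_ir.T + b_ir + h @ W_hr.T + b_hr)        # eq. (2), reset gate
    z = sigmoid(x @ W_iz.T + b_iz + h @ W_hz.T + b_hz)        # eq. (3), update gate
    n = np.tanh(x @ W_in.T + b_in + r * (h @ W_hn.T + b_hn))  # eq. (4), candidate state
    return (1.0 - z) * n + z * h                               # eq. (5), new hidden state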

2.2         GRU

Finally, in mytorch/nn/gru.py create a class GRU as a sub-class of TimeIterator which instantiates the base class with basic_unit = GRUUnit. Please look at the code for more details.

Appendix

A       Glossary

  • effective batch: the ith effective batch refers to the set of ith timesteps from each sample that are simultaneously fed to the RNNUnit in the (i − 1)th iteration inside the TimeIterator
  • effective batch size: the number of samples in an effective batch. The effective batch size of the ith effective batch is equal to the number of samples containing at least (i+1) timesteps

B What’s all the fuss about handling variable length samples in one batch?

B.1        Importance of batching in Deep Learning

When training a neural network, we stack together a number of inputs (a.k.a. batching) and pass them together for a single training iteration. From a computational standpoint this helps us make the most of GPUs by parallelizing the computation (done for you by frameworks such as PyTorch). Moreover, ideas like batch normalization fundamentally depend on the existence of batches instead of single samples for training. Therefore there is no denying the importance of batching in the deep learning world.

B.2          Batching for MLPs and CNNs

Life in a world with only MLPs and CNNs was simpler from a batching perspective. Each input sample had exactly the same shape. These input samples in the form of tensors (each with the same shape) could then be stacked together along a new dimension to create a higher-dimensional tensor which then becomes our batch. For example consider a simple 1-D input where each sample has 20 features. We now want to create a batch of 256 such samples. Each sample has a shape (20,) which are then stacked along a new dimension (this now becomes dimension 0) to provide a batch of shape (256,20) where the first dimension represents number of samples in a batch while the second dimension represents the number of features.

B.3        Batching in RNNs

With RNNs, this becomes very tricky. Each sample in this context has a temporal dimension (at least in the simplest case). The number of features at each time step remains fixed across all samples; let this be K for the purpose of our discussion. Therefore each sample i has Ti time steps, where at each time step we have K features. The ith sample can then be represented by a tensor of shape (Ti, K). Given that the first dimension is of variable length, there is no simple way for us to create batches as we did for MLPs and CNNs. One might argue for fixed time-step inputs, but that severely limits the power of RNNs/GRUs/LSTMs and reduces them to a large fixed MLP. Therefore batching in RNNs requires a special setup, which we provide via PackedSequences. The shape problem is illustrated in the short sketch below.
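A tiny NumPy experiment makes the contrast with section B.2 concrete (the 256 samples of 20 features come from the example above; the sequence lengths and K = 8 are made up for illustration):

import numpy as np

# MLP/CNN case: fixed-shape samples stack cleanly into a (256, 20) batch.
samples = [np.random.randn(20) for _ in range(256)]
batch = np.stack(samples)
print(batch.shape)                      # (256, 20)

# RNN case: variable-length (T_i, K) sequences cannot simply be stacked.
seqs = [np.random.randn(T, 8) for T in (5, 3, 7)]
try:
    np.stack(seqs)
except ValueError as e:
    print("stacking failed:", e)        # shapes differ along the time axis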
