HW1: Transfer Learning for the Birds
Goals
We spent Week 2 learning about Transfer Learning, including both its positive aspects (the "Astounding Baseline" paper) and its limitations ("Does ImageNet transfer to real-world data?").
In this HW1, you'll apply Transfer Learning to a real dataset, and wrestle with several questions:
Problem 1: For a specific target classification task of interest, would we rather have a source model trained on a "generic" dataset like ImageNet1k, or a smaller dataset related to our target task?
Problem 2: What are the tradeoffs between fine-tuning just the last layer (aka "linear probing") and fine-tuning a few more layers? Can we compose these to do better?
Starter Code and Provided Data
You can find the starter code and our provided "BirdSnap-10" dataset in the course GitHub repository here:
Two ways to run the experiments
Option 1: Google Colab cloud-based notebook environment
To get started quickly on Colab:
- Follow this link:
- Use "Ctrl-S" or similar to copy that notebook into your Google Drive to save progress
Here's a quick video demo of how to set up Colab.
Option 2: Your own environment
You can find .yml files specifying the required Python packages in the repo.
You'll be responsible for getting things working yourself.
Background
A wildlife conservation organization has reached out for help to develop an automated bird species classifier. Their goal is to classify 10 specific bird species critical to biodiversity conservation. Unfortunately, large datasets for these birds are difficult to acquire.
You have been provided:
- a `train` dataset, to be used for all model development (training and validation)
- a `test` dataset, to be used only to evaluate model generalization
You can obtain these images by unpacking `birdsnap10_224x224only.zip`.
These images come from the BirdSnap dataset, a public dataset released by Berg et al. in 2014 (see the paper release page).
You will explore 4 possible pretrained models provided by pytorchcv, which differ across two key axes:
- Model architecture
- ResNet-10, with ~5M parameters
- ResNet-26, with ~17M parameters
- Source Dataset
- ImageNet-1k, a large and diverse dataset containing over 1 million images from 1000 classes
- CUB-200-2011, specialized to birds, containing about 11,000 images of 200 bird species
We have tried our best to construct this task so that your target BirdSnap-10 dataset has no class overlap with the species in CUB-200-2011, and also no overlap with classes in ImageNet-1k (which does contain several bird classes).
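For orientation, here is a minimal sketch of how pretrained backbones can be loaded via pytorchcv's get_model helper. The starter code already handles this for you, and the CUB-pretrained model names shown in the comments are assumptions you should verify against pytorchcv's model list.

```python
# Minimal sketch of loading pretrained backbones with pytorchcv.
# The starter code handles this already; this is only for orientation.
from pytorchcv.model_provider import get_model as ptcv_get_model

# ImageNet-1k pretrained ResNet backbones
resnet10_in1k = ptcv_get_model("resnet10", pretrained=True)
resnet26_in1k = ptcv_get_model("resnet26", pretrained=True)

# Hypothetical names for the CUB-pretrained variants (verify against pytorchcv's model list):
# resnet10_cub = ptcv_get_model("resnet10_cub", pretrained=True)
# resnet26_cub = ptcv_get_model("resnet26_cub", pretrained=True)
```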
Problem 1: Should Source Models Specialize or Generalize?
Your goal in Problem 1 is to compare the 4 possible source models (each combination of 2 source datasets x 2 architectures) using last-layer fine-tuning.
Tasks for Code Implementation
Open models.py and examine the PTNetForBirdSnap10 class, which defines a PyTorch neural net module that combines a pretrained backbone with a simple linear classification head for the 10-class BirdSnap data.
CODE 1(i): Edit the setup_trainable_params method so that, given a desired number of layers n as an int, the last n layers are set to trainable (accepting gradient updates in the PyTorch computation graph) and all other parameters are frozen (not accepting gradient updates). Hint: you'll need to edit the boolean property requires_grad of each parameter tensor.
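As a rough illustration of the requires_grad pattern (not the exact structure of PTNetForBirdSnap10, whose layer grouping may differ), a hedged sketch might look like this:

```python
# Hedged sketch of freezing/unfreezing parameters via requires_grad.
# Assumption: the model's top-level child modules appear in depth order,
# so "last n layers" means the last n entries of model.children().
import torch.nn as nn

def setup_trainable_params(model: nn.Module, n_trainable_layers: int) -> None:
    # Freeze everything first: no gradients will be computed for these parameters.
    for param in model.parameters():
        param.requires_grad = False
    # Unfreeze only the parameters belonging to the last n layers.
    layers = list(model.children())
    for layer in layers[-n_trainable_layers:]:
        for param in layer.parameters():
            param.requires_grad = True
```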
Now open train.py, which defines a function for performing training on our target BirdSnap data. This function works whether we are updating just the last layer or several layers.
CODE 1(ii): Edit the train_model method to compute the cross-entropy loss in two places: first, for the current batch of training data, inside the tr_loader loop; second, for the current batch of validation data, inside the va_loader loop.
The way you take averages differs a bit between the two (see the sketch after this list):
- Train: you want a per-example average over the current batch, as a fast yet unbiased estimator for computing the loss/gradients.
- Valid: you want the true per-example average over the full validation dataset.
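Here is a minimal sketch of the two averaging conventions, written as standalone helpers; the model and data loaders are assumed to be the ones provided by the starter code, and the helper names are just illustrative.

```python
# Hedged sketch of the two cross-entropy averaging conventions (helper names are illustrative).
import torch
import torch.nn.functional as F

def batch_train_loss(model, x_batch, y_batch):
    """Per-example mean over the CURRENT batch: a fast, unbiased estimate used for gradients."""
    logits = model(x_batch)
    return F.cross_entropy(logits, y_batch, reduction='mean')

def full_valid_loss(model, va_loader):
    """True per-example mean over the FULL validation set."""
    total_loss, total_examples = 0.0, 0
    model.eval()
    with torch.no_grad():
        for x_batch, y_batch in va_loader:
            logits = model(x_batch)
            # Accumulate a per-batch SUM so unequal batch sizes are weighted correctly.
            total_loss += F.cross_entropy(logits, y_batch, reduction='sum').item()
            total_examples += y_batch.shape[0]
    return total_loss / total_examples
```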
Next, implement strategies to avoid overfitting, like L2 penalization of weight magnitudes for the last layer.
CODE 1(iii): Edit train_model to add an L2 penalty on the weights only (not the biases) of the last layer (the "classification head"). For our provided model, you can use the dict model.trainable_params to access a tensor by its name as the key. The last-layer weights are named 'output.weight'. Hint: to compute the L2 magnitude, think sum of squares.
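A minimal sketch of the penalty term, assuming model.trainable_params behaves as described above (the helper name and hyperparameter name are illustrative):

```python
# Hedged sketch of an L2 penalty on the classification-head weights only.
import torch

def l2_penalty_on_head(model, l2pen):
    """Sum-of-squares penalty on the last-layer weights (biases are excluded)."""
    # Assumption: model.trainable_params maps parameter names to tensors.
    head_weights = model.trainable_params['output.weight']
    return l2pen * torch.sum(head_weights ** 2)

# Inside the training loop, the total training loss would then be roughly:
#   loss = batch_cross_entropy + l2_penalty_on_head(model, l2pen)
```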
CODE 1(iv): Edit train_model to add early-stopping functionality. We track the number of consecutive epochs that the validation cross-entropy loss gets worse. Once this count exceeds a threshold, we revert the model to its previous state (the one that gave the best validation-set cross-entropy) and return that model.
Finally, write code to evaluate test set accuracy.
CODE 1(v): Edit eval_acc (defined in the body of hw1.ipynb) to measure a model's accuracy (defined as the fraction of examples that are correctly predicted, from 0.0 to 1.0, higher is better) on the provided test set.
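A minimal sketch of such an accuracy function, assuming a loader that yields (images, labels) batches:

```python
# Hedged sketch of test-set accuracy evaluation.
import torch

def eval_acc(model, data_loader):
    """Fraction of correctly predicted examples, between 0.0 and 1.0 (higher is better)."""
    model.eval()
    n_correct, n_total = 0, 0
    with torch.no_grad():
        for x_batch, y_batch in data_loader:
            logits = model(x_batch)
            preds = torch.argmax(logits, dim=1)     # predicted class = index of the largest logit
            n_correct += (preds == y_batch).sum().item()
            n_total += y_batch.shape[0]
    return n_correct / n_total
```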
Tasks for Experiment Execution
Now, step through the provided notebook hw1.ipynb to complete the following:
EXPERIMENT 1(i): First, do last-layer fine-tuning of ResNet10 using the ImageNet1k pretrained model, on the available training/validation sets. Use the provided train/valid data loaders (don't change batch_size or other settings). Monitor train and validation metrics, and make your own plots of these metrics as needed. Find a suitable learning rate and L2-regularization strength to minimize over/under-fitting.
EXPERIMENT 1(ii): Repeat step 1(i) above for the other combinations of architecture and source dataset. Using the code's train/validation metrics tracked over epochs, find a setting of hyperparameters (n_epochs, lr, l2 penalty, seed) that seems to deliver reasonable heldout performance without too much over/under-fitting.
We recommend saving intermediate results to a .pkl file or similar (see hw1.ipynb), so it is easy to plot later without redoing experiments.
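For example, a minimal save/load pattern with pickle might look like the sketch below (the file name and dictionary contents are just placeholders):

```python
# Minimal sketch of caching intermediate results to disk so plots can be remade cheaply.
import pickle

results = {'arch': 'resnet10', 'src': 'imagenet1k', 'va_loss_per_epoch': [2.1, 1.4, 0.9]}

with open('problem1_results.pkl', 'wb') as f:   # placeholder file name
    pickle.dump(results, f)

with open('problem1_results.pkl', 'rb') as f:
    results = pickle.load(f)
```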
Tasks for Report Writing
In your submitted report, include the following:
FIGURE 1(a): Plot loss-vs-epoch for your best runs of (ResNet10, ImageNet1k) and (ResNet10, CUB). Use the style provided in the Figure1a block in hw1.ipynb. Include this figure in your report, with a caption that summarizes the major takeaway messages (did you see major overfitting? was strong L2-penalization needed? was early stopping beneficial?).
FIGURE 1(b): Make a target-task-accuracy vs. source-task-accuracy plot, like the main figure in the Fang et al. paper from day04. Use the style provided in the Figure1b block in hw1.ipynb. Include this figure in your report, with a caption that summarizes the major takeaway messages of your results (which source dataset is better for our target task? which architecture is better?). Try to reason about why, given your knowledge from the readings.
Problem 2: LP then FT
To keep things simple, we'll fix ('ResNet10', 'ImageNet1k') for the arch and source dataset throughout Problem 2. Be sure you're only using this configuration.
Your goal in Problem 2 is to implement the LP-then-FT method of Kumar et al. (from our day04 readings). That is, you'll do:
First stage: LP (call train.train_model with n_trainable_layers=1).
- You can reuse the best hyperparameters from Problem 1 above.
Second stage: FT (call train.train_model with n_trainable_layers=3). Be sure to initialize from the model produced by stage one. You'll have 3 trainable layers (not just 1), so much more flexibility but also more potential to overfit.
- You'll need to tune lr / l2penalty / n_epochs to be sure you are fitting reasonably.
Tasks for Code Implementation
CODE 2(i): Edit your hw1.ipynb notebook to implement two-phase training. In the first phase, again set n_trainable_layers=1 and use exactly the lr/l2penalty/seed combinations that worked well in Problem 1. In the second phase, you'll want to consider different settings of lr/l2penalty.
To make things quick, you can run the first phase just once, yielding a good LPmodel, and then tune the hyperparameters of the second phase using copy.deepcopy(LPmodel) to get a copy of the model to train for each hyperparameter config, while leaving the original LPmodel unchanged.
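A hedged sketch of this two-phase pattern is below; the train.train_model keyword arguments, return values, and candidate hyperparameter values shown here are assumptions based on the text above, not the exact starter-code signature.

```python
# Hedged sketch of LP-then-FT with hyperparameter tuning in the second phase.
# Assumption: train.train_model accepts these keyword args and returns (model, best_va_loss).
import copy
import train  # course-provided module

# Phase 1: linear probing (LP), run once with the hyperparameters found in Problem 1.
LPmodel, _ = train.train_model(model, n_trainable_layers=1, lr=0.01, l2pen=0.001, n_epochs=50)

# Phase 2: fine-tuning (FT) the last 3 layers, trying several candidate settings.
best_va_loss, best_model = float('inf'), None
for lr in [1e-3, 1e-4]:
    for l2pen in [0.0, 1e-3, 1e-2]:
        candidate = copy.deepcopy(LPmodel)          # fresh copy; LPmodel itself stays unchanged
        candidate, va_loss = train.train_model(candidate, n_trainable_layers=3,
                                               lr=lr, l2pen=l2pen, n_epochs=30)
        if va_loss < best_va_loss:
            best_va_loss, best_model = va_loss, candidate
```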
Tasks for Experiment Execution
EXPERIMENT 2(i): Run a small-scale hyperparameter search (either manually or systematically), aiming to find a configuration of lr/l2penalty for the second phase that delivers the best possible validation performance. Don't spend more than about an hour.
EXPERIMENT 2(ii): Compute the test-set accuracy for both the phase 1 and phase 2 "best" models, using eval_acc.
Tasks for Report Writing
In your submitted report, include the following:
FIGURE 2(a): Plot loss/error-vs-epoch curves in two panels (left = LP phase, right = FT phase), using the style provided in the Figure2a block in hw1.ipynb. Aim to show the best run from experiment 2(i), where ideally there is obvious continuity between the LP and FT phases when reading across the plot (e.g., validation loss doesn't immediately jump away from the values seen at the end of the LP phase). Include this figure in your report, with a caption that summarizes the major takeaway messages: was your implementation successful?
SHORT ANSWER 2(b): Report the final test-set accuracy for both LP and LP-then-FT. Reflect on any differences.
Problem 3: Conceptual Questions
Short Answer 3a
Provide a math formula for computing the complete loss used to train models here, including the cross-entropy and the L2-penalty terms.
Notation:
- B : total number of examples in the current batch, indexed by i
- C : total number of classes for the target task
- y_i : int indicator of the class label for example i
- z_i : vector of logits for example i
- W : matrix of weight parameters of the last layer
- b : vector of bias parameters of the last layer
You can only use basic math functions (log, sum, exp). Be sure to clearly define the assumed size of each vector/matrix, using the actual values in the code you implemented in Problem 1.