CS152 L3D: Learning from Limited Labeled Data
HW1: Transfer Learning for the Birds

Goals

We spent Week 2 learning about Transfer Learning, including both its positive aspects (the "Astounding Baseline" paper) and its limitations ("Does progress on ImageNet transfer to real-world datasets?").

In this HW1, you'll apply Transfer Learning to a real dataset and wrestle with several questions:

  • Problem 1: For a specific target classification task of interest, would we rather have a source model trained on a "generic" dataset like ImageNet1k, or one trained on a smaller dataset related to our target task?

  • Problem 2: What are the tradeoffs between fine-tuning just the last layer (aka "linear probing") and fine-tuning a few more layers? Can we compose these to do better?

Starter Code and Provided Data

Option 1: Google Colab cloud-based notebook environment

Option 2: Your own environment

Background

A wildlife conservation organization has reached out for help to develop an automated bird species classifier. Their goal is to classify 10 specific bird species critical to biodiversity conservation. Unfortunately, large datasets for these birds are difficult to acquire.

You have been provided:

  • a `train` dataset, to be used for all model development (training and validation)
  • a `test` dataset, to be used only to evaluate model generalization

You can obtain these images by unpacking `birdsnap10_224x224only.zip`.

These images come from the Birdsnap dataset, a public dataset released by Berg et al. in 2014 (see the paper's release page).

You will explore 4 possible pretrained models provided by pytorchcv, which differ across two key axes:

  • Model architecture
    • ResNet-10, with ~5M parameters
    • ResNet-26, with ~17M parameters
  • Source Dataset
    • ImageNet-1k, large and diverse data containing >1 million images from 1000 classes
    • CUB-200-2011, specialized to birds, containing about 11,000 images of 200 bird species

We have tried our best to construct this task so that your target BirdSnap-10 dataset has no class overlap with the species in CUB-200-2011, and also no overlap with the classes in ImageNet-1k (which does contain several bird classes).

Problem 1: Should Source Models Specialize or Generalize?

Your goal in Problem 1 is to compare the 4 possible source models (each combination of 2 datasets × 2 architectures) at last-layer fine-tuning.

Tasks for Code Implementation

Open models.py and examine the PTNetForBirdSnap10 class, which defines a PyTorch neural net module that uses a pretrained backbone combined with a simple linear classification head for the 10-class BirdSnap data.

CODE 1(i): Edit the setup_trainable_params method so that, given a desired number of layers n as an int, the last n layers are set to trainable (accepting gradient updates in the PyTorch computation graph) and all other parameters are frozen (not accepting gradient updates). Hint: you'll need to edit the boolean property requires_grad of each parameter tensor.
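A minimal sketch of the freezing logic, assuming each parameter tensor counts as one "layer" (the starter code may instead group a layer's weight and bias together; check models.py before copying this pattern):

```python
import torch

def setup_trainable_params(model: torch.nn.Module, n_trainable_layers: int):
    """Freeze everything, then unfreeze the last n parameter tensors.

    NOTE: a sketch only. The real method lives on PTNetForBirdSnap10 in
    models.py; this version treats each parameter tensor independently.
    """
    params = list(model.parameters())
    for p in params:
        p.requires_grad = False            # frozen: excluded from gradient updates
    for p in params[-n_trainable_layers:]:
        p.requires_grad = True             # trainable: included in gradient updates
```

With n_trainable_layers=1 this unfreezes only the final tensor; depending on how the starter code counts "layers," you may need to unfreeze weight/bias pairs together.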

Now open train.py, which defines a function for performing training on our target BirdSnap data. This function works whether we are updating just the last layer or more layers.

CODE 1(ii): Edit the train_model method to compute the cross-entropy loss in two places: first, for the current batch of train data, inside the tr_loader loop; second, for the current batch of validation data, inside the va_loader loop.

The way you take averages differs a bit (see the sketch after this list):

  • Train: you want a per-example average over the current batch, as a fast yet unbiased estimator for computing loss/gradients.
  • Valid: you want the true per-example average over the full validation dataset.
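For concreteness, here is a hedged sketch of how the two averages could be computed in a standard PyTorch loop; the real train_model in train.py will differ in names and structure:

```python
import torch
import torch.nn.functional as F

def run_one_epoch(model, tr_loader, va_loader, optimizer, device):
    """One epoch of training plus a full validation pass; returns val loss."""
    model.train()
    for x, y in tr_loader:
        x, y = x.to(device), y.to(device)
        logits = model(x)
        loss = F.cross_entropy(logits, y)   # per-example mean over this batch
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    model.eval()
    total_loss, total_n = 0.0, 0
    with torch.no_grad():
        for x, y in va_loader:
            x, y = x.to(device), y.to(device)
            # Accumulate a sum so the final average covers the full val set,
            # even when the last batch is smaller than the others.
            total_loss += F.cross_entropy(model(x), y, reduction='sum').item()
            total_n += y.shape[0]
    return total_loss / total_n             # true per-example average
```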

Next, implement strategies to avoid overfitting, such as L2 penalization of the weight magnitudes of the last layer.

CODE 1(iii): Edit train_model to add an L2 penalty loss on the weights only (not the biases) of the last layer (the "classification head"). For our provided model, you can use the dict model.trainable_params to access a tensor using its name as the key. The last-layer weights are named 'output.weight'. Hint: to compute the L2 magnitude, think sum of squares.
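As an illustration, a minimal sketch of the penalty term; loss_xent and l2pen are assumed names, not the starter code's:

```python
import torch

# Inside train_model's batch loop (l2pen and loss_xent are assumed names):
w = model.trainable_params['output.weight']   # last-layer weights, per handout
l2_term = l2pen * torch.sum(torch.square(w))  # L2 magnitude = sum of squares
loss = loss_xent + l2_term                    # biases deliberately excluded
```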

CODE 1(iv): Edit train_model to add early-stopping functionality. Track the number of consecutive epochs that the validation cross-entropy loss gets worse. Once this exceeds a threshold, revert the model to its previous state (the one that gave the best val-set cross-entropy) and return that model.
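A minimal early-stopping sketch, reusing the hypothetical run_one_epoch from the 1(ii) sketch above; the patience threshold and bookkeeping are assumptions:

```python
import copy

def train_with_early_stopping(model, tr_loader, va_loader, optimizer,
                              device, n_epochs=100, patience=5):
    """Stop once val loss has worsened for `patience` consecutive epochs,
    then restore the checkpoint with the best val loss. A sketch only;
    the real train_model tracks more metrics."""
    best_va = float('inf')
    best_state = copy.deepcopy(model.state_dict())
    n_worse = 0
    for epoch in range(n_epochs):
        va_loss = run_one_epoch(model, tr_loader, va_loader, optimizer, device)
        if va_loss < best_va:
            best_va, n_worse = va_loss, 0
            best_state = copy.deepcopy(model.state_dict())  # remember best
        else:
            n_worse += 1
            if n_worse > patience:         # threshold exceeded: give up
                break
    model.load_state_dict(best_state)      # revert to best-val-loss state
    return model
```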

Finally, write code to evaluate test-set accuracy.

CODE 1(v): Edit eval_acc (defined in the body of hw1.ipynb) to measure a model's accuracy (defined as the fraction of examples that are correctly predicted, from 0.0 to 1.0, higher is better) on the provided test set.
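One plausible implementation, assuming eval_acc receives the model and a standard (image, label) test loader; match the notebook's actual signature:

```python
import torch

def eval_acc(model, loader, device='cpu'):
    """Fraction of examples whose argmax prediction matches the label."""
    model.eval()
    n_correct, n_total = 0, 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            yhat = model(x).argmax(dim=1)          # predicted class per example
            n_correct += (yhat == y).sum().item()
            n_total += y.shape[0]
    return n_correct / n_total                     # in [0.0, 1.0], higher is better
```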

Tasks for Experiment Execution

Now, step through the provided notebook hw1.ipynb to achieve the following:

EXPERIMENT 1(i): First, do last-layer fine-tuning of ResNet10 using the ImageNet1k pretrained model, on the available training/validation sets. Use the provided train/valid data loaders (don't change batch_size or other settings). Monitor train and validation metrics, making your own plots of these metrics as needed. Find a suitable learning rate and L2-regularization strength to minimize over/under-fitting.

EXPERIMENT 1(ii): Repeat step 1(i) above for the other combinations of architecture and source dataset. Using the code's computed train/validation metrics tracked over epochs, find a setting of hyperparameters (n_epochs, lr, l2 penalty, seed) that seems to deliver reasonable heldout performance without too much over/under-fitting.
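One hedged way to organize this sweep is sketched below; the '_cub' pytorchcv model names and the bookkeeping are guesses, so verify them against pytorchcv's documented model list and the starter code:

```python
from pytorchcv.model_provider import get_model as ptcv_get_model

# Mapping from (architecture, source dataset) to a pytorchcv model name.
# The '_cub' names are assumptions about pytorchcv's CUB-200-2011 zoo.
combos = {
    ('resnet10', 'imagenet1k'): 'resnet10',
    ('resnet26', 'imagenet1k'): 'resnet26',
    ('resnet10', 'cub'):        'resnet10_cub',
    ('resnet26', 'cub'):        'resnet26_cub',
}
results = {}
for (arch, src), ptcv_name in combos.items():
    backbone = ptcv_get_model(ptcv_name, pretrained=True)
    # ... wrap in PTNetForBirdSnap10, train as in 1(i), then record
    # whatever metrics you'll need for Figure 1(b)
    results[(arch, src)] = {'va_loss_history': None, 'test_acc': None}
```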

We recommend saving intermediate results to a .pkl file or similar (see hw1.ipynb), so it is easy to plot later without redoing experiments.
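For example (the filename and dict layout here are arbitrary):

```python
import pickle

# Save a results dict so later plotting doesn't require re-running training
with open('hw1_problem1_results.pkl', 'wb') as f:
    pickle.dump(results, f)

# Later, e.g. in the plotting cells:
with open('hw1_problem1_results.pkl', 'rb') as f:
    results = pickle.load(f)
```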

Tasks for Report Writing

In your submitted report, include the following:

FIGURE 1(a): Plot loss-vs-epoch for your best runs of (ResNet10, ImageNet1k) and (ResNet10, CUB). Use the style provided in the Figure1a block in hw1.ipynb. Include this figure in your report, with a caption that summarizes the major takeaway messages (did you see major overfitting? was strong L2 penalization needed? was early stopping beneficial?).

FIGURE 1(b): Make a target-task-accuracy vs. source-task-accuracy plot, like the main figure in the Fang et al. paper from day04. Use the style provided in the Figure1b block in hw1.ipynb. Include this figure in your report, with a caption that summarizes the major takeaway messages of your results (which source dataset is better for our target task? which architecture is better?). Try to reason about why, given your knowledge from the readings.

Problem 2: LP then FT

To keep things simple, we'll fix ('ResNet10', 'ImageNet1k') as the architecture and source dataset throughout Problem 2. Be sure you're only using this configuration.

Your goal in Problem 2 is to implement the LP-then-FT method of Kumar et al. (from our day04 readings). That is, you'll do:

  • First stage of LP (call train.train_model with n_trainable_layers=1).

    • You can reuse the best hyperparameters from Problem 1 above.
  • Second stage of FT (call train.train_model with n_trainable_layers=3). Be sure to initialize from the model produced by stage one. You'll have 3 trainable layers (not just 1), so much more flexibility but also more potential to overfit.

    • You'll need to tune lr / l2penalty / n_epochs to be sure you are fitting reasonably.

Tasks for Code Implementation

CODE 2(i): Edit your hw1.ipynb notebook to implement two-phase training. In the first phase, again set n_trainable_layers=1 and use exactly the lr/l2penalty/seed combinations that worked well in Problem 1. In the second phase, you'll want to consider different settings of lr/l2penalty.

To make things quick, you can run the first phase just once, yielding a good LPmodel, and then tune the hyperparameters of the second phase by using copy.deepcopy(LPmodel) to get a fresh copy of the model to train for each hyperparameter config, while leaving the original LPmodel unchanged.
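A sketch of that two-phase loop, assuming train.train_model returns the trained model and its best validation loss (the keyword names and grids below are assumptions; adapt to train.py's actual signature):

```python
import copy
import train   # the provided train.py

# Phase 1 (LP): run once with your best Problem 1 hyperparameters.
LPmodel, _ = train.train_model(model, n_trainable_layers=1,
                               lr=best_lr, l2pen=best_l2pen)

# Phase 2 (FT): tune lr/l2penalty on deep copies, leaving LPmodel untouched.
best_va, best_FTmodel = float('inf'), None
for lr in [1e-4, 3e-4, 1e-3]:              # example grid, not a prescription
    for l2pen in [0.0, 1e-4, 1e-2]:
        candidate = copy.deepcopy(LPmodel)  # fresh copy per config
        candidate, va_loss = train.train_model(candidate, n_trainable_layers=3,
                                               lr=lr, l2pen=l2pen)
        if va_loss < best_va:
            best_va, best_FTmodel = va_loss, candidate
```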

Tasks for Experiment Execution

EXPERIMENT 2(i): Run a small-scale hyperparameter search (either manually or systematically), aiming to find a configuration of lr/l2penalty for the second phase that delivers the best possible validation performance. Don't spend more than about an hour.

EXPERIMENT 2(ii): Compute the test-set accuracy for both the phase-1 and phase-2 "best" models, using eval_acc.

Tasks for Report Writing

In your submitted report, include the following:

FIGURE 2(a): Plot loss/error-vs-epoch curves in two panels (left = LP phase, right = FT phase), using the style provided in the Figure2a block in hw1.ipynb. Aim to show the best run from Experiment 2(i), where ideally, reading across the plot, there is obvious continuity between the LP and FT phases (e.g., the validation loss doesn't immediately jump away from the values seen at the end of the LP phase). Include this figure in your report, with a caption that summarizes the major takeaway messages: was your implementation successful?

SHORT ANSWER 2(b): Report the ultimate test-set accuracy for both LP and LP-then-FT. Reflect on any differences.

Problem 3: Conceptual Questions

SHORT ANSWER 3(a)

Provide a math formula for computing the complete loss used to train models here, including both the cross-entropy and the L2-penalty terms.

Notation:

  • B : total number of examples in the current batch, indexed by i
  • C : total number of classes for the target task
  • y_i : int indicator of the class label for example i
  • z_i : vector of logits for example i
  • w : matrix of weight parameters of the last layer
  • b : vector of bias parameters of the last layer

You may only use basic math functions (log, sum, exp). Be sure to clearly define the assumed size of each vector/matrix, using the actual values in the code you implemented in Problem 1.
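For orientation only, here is a sketch of one formulation consistent with this notation, writing D for the backbone feature dimension and lambda for the L2 strength (neither symbol is defined above, so they are assumptions); make sure your submitted formula matches your actual Problem 1 code:

```latex
% One possible formulation (a sketch, not necessarily the graded answer).
% Assumes w is C x D, b has length C, and z_i = w h_i + b for D-dim features h_i.
\mathcal{L}(w, b) \;=\;
  \frac{1}{B} \sum_{i=1}^{B}
    \left( -z_{i, y_i} \;+\; \log \sum_{c=1}^{C} \exp\!\left( z_{i,c} \right) \right)
  \;+\; \lambda \sum_{c=1}^{C} \sum_{d=1}^{D} w_{c,d}^{2}
```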
