Deep Learning from Scratch - Week 6


Course starts soon.


Quiz


We will now start with a quiz based on the first week's material.

You have 6 minutes to answer the quiz.

The quiz link:
Quiz Link
It will be posted in Mattermost and in the Zoom chat.

Quick project overview

How to setup a project

Even though each project is different, they share some common points.
A typical skeleton for a project could look like this:

  • Read about what has been done in this direction.
  • Decide on the tools you want to use (everybody in the team).
  • Find a suitable and well-structured dataset. Pre-process if needed.
  • Start testing a very simple model. Start small, and increase complexity gradually.
  • Once parameters are tweaked, check the model against a baseline.
  • Try to understand and justify your choices. Why this architecture?
  • Prepare a presentation to show what you have done.

Timeline of a project

Start: 17 May → End: 21/28 June
Duration: ~5–6 weeks

Suggestion (not mandatory, of course):

17.05 - 07.06: Project choice, dataset pre-processing, maybe first simple model and objective

07.06: Peer review session, each group presents its status to another group

08.06 - 21.06: Architecture, fine-tuning, preparing presentation

21/28.06: Presentation of the projects

Things to know:

  • The project is necessary to get the credits and the opencampus certificate.
  • The project is not graded.
  • At least once during the project, we can have an extra session with your group to discuss status and issues.
  • External guests may be invited at the final presentation.

Creation of the groups:

We now split into smaller groups, starting from the people who already have a project in mind, and try to form the final groups.

QUIZ (15 mins)


1. What does having high bias mean? How is this related to the bias parameter b?
2. Dropout consists of setting a random subset of neurons to 0. Why don't we simply remove them and use fewer neurons? Is dropout used during training, during testing, or in both phases?
3. Dropout and regularization are techniques to reduce the usage of neurons. So why do we keep building bigger networks with more neurons, only to apply techniques that shut them down? Wouldn't it be easier to just use smaller networks?
4. If you are training a neural network that tends to overfit, would you choose dropout or regularization?
5. How do you choose which part of the dataset goes into the training and which one into the test or dev set?

DISCUSSION AND ANSWERS


Dropout vs Regularization

TL;DR
They are similar techniques with similar results; dropout seems to work slightly better for larger networks.
Long Answer
There is even a full paper about this (for one-layer networks only).
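To make the comparison concrete, here is a minimal numpy sketch of the two techniques (my own illustration, not the course's assignment code): inverted dropout, which is active only during training, and an L2 weight penalty added to the loss. The `keep_prob` and `lam` values are arbitrary examples.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(a, keep_prob=0.8, training=True):
    """Inverted dropout: zero out neurons at random during training and
    scale the survivors by 1/keep_prob, so the expected activation is
    unchanged. At test time the layer is a no-op."""
    if not training:
        return a  # no neurons are dropped at test time
    mask = rng.random(a.shape) < keep_prob
    return a * mask / keep_prob

def l2_penalty(weights, lam=1e-3):
    """L2 regularization: adds lam/2 * sum of squared weights to the
    loss, pushing all weights toward (but not exactly to) zero."""
    return 0.5 * lam * sum(np.sum(W ** 2) for W in weights)

a = np.ones((4, 5))                        # toy activations
print(dropout_forward(a))                  # entries are 0 or 1/0.8 = 1.25
print(dropout_forward(a, training=False))  # unchanged at test time
print(l2_penalty([np.ones((2, 2))], lam=0.1))  # 0.5 * 0.1 * 4 = 0.2
```

Note how dropout answers quiz question 2 as well: the neurons are not removed, they are only masked per training step, and every neuron is present again at test time.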

PAPER OF THE WEEK

Generative Adversarial Nets. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio.
NIPS'14: Proceedings of the 27th International Conference on Neural Information Processing Systems, Volume 2, December 2014, pages 2672–2680.

Quotation

In the proposed adversarial nets framework, the generative model is pitted against an adversary: a discriminative model that learns to determine whether a sample is from the model distribution or the data distribution. The generative model can be thought of as analogous to a team of counterfeiters, trying to produce fake currency and use it without detection, while the discriminative model is analogous to the police, trying to detect the counterfeit currency. Competition in this game drives both teams to improve their methods until the counterfeits are indistinguishable from the genuine articles.
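The counterfeiter/police analogy corresponds to the minimax game over the value function V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))] from the paper. The numpy sketch below (with a hypothetical fixed logistic discriminator, not the paper's trained networks) just evaluates V on toy 1-D data: a "bad counterfeiter" whose samples look nothing like the real distribution, and a "perfect counterfeiter" whose samples match it.

```python
import numpy as np

rng = np.random.default_rng(1)

def discriminator(x, w=2.0, b=-1.0):
    """Toy fixed discriminator (logistic regression): outputs the
    probability that a sample x is real rather than generated."""
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

def gan_value(real, fake):
    """GAN value function V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))].
    D tries to maximize it; G tries to minimize it."""
    return (np.mean(np.log(discriminator(real)))
            + np.mean(np.log(1.0 - discriminator(fake))))

real = rng.normal(3.0, 0.5, size=1000)       # "genuine currency": N(3, 0.5)
fake_bad = rng.normal(0.0, 0.5, size=1000)   # untrained counterfeiter
fake_good = rng.normal(3.0, 0.5, size=1000)  # perfect counterfeiter

# This D separates the bad generator easily (higher V), but is fooled by
# the perfect one, where D(x) and D(G(z)) agree and V drops.
print(gan_value(real, fake_bad))
print(gan_value(real, fake_good))
```

In the full framework both players are neural networks trained by alternating gradient steps on V; the sketch only shows why a better generator pushes the value down for a given discriminator.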

EXERCISE (15-20 mins)

We go through the programming assignments that were planned for this week.

For the next week

  • Finish the second week of the course
  • Do the Programming Assignment on Optimization