Papers I Read: Notes and Summaries

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

Introduction

  • Conditional computation is a technique to increase a model’s capacity (without...
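
The core mechanism is a trainable gating network that activates only a few experts per example. Below is a minimal PyTorch sketch of such top-k gating; the shapes, the routing loop, and the use of `nn.Linear` experts are illustrative assumptions, and the paper additionally uses noisy gating and load-balancing losses not shown here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Sparsely-gated mixture-of-experts layer, sketched: a gating
    network picks the top-k experts per example; only those run."""
    def __init__(self, d_model, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Linear(d_model, d_model) for _ in range(n_experts)])
        self.gate = nn.Linear(d_model, n_experts)  # gating network
        self.k = k

    def forward(self, x):  # x: (batch, d_model)
        top_logits, top_idx = self.gate(x).topk(self.k, dim=-1)
        weights = F.softmax(top_logits, dim=-1)  # renormalize kept gates
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e
                if mask.any():
                    # route only the selected examples through expert e
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```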


Gradient Surgery for Multi-Task Learning

Introduction

  • The paper hypothesizes that the main optimization challenges in multi-task learning arise because of...
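
The paper's remedy, PCGrad, alters conflicting gradients before they are combined. A minimal sketch of that projection step, assuming each task's gradient has already been flattened into a single vector:

```python
import random
import torch

def pcgrad(task_grads):
    """PCGrad, sketched: if two task gradients conflict (negative
    inner product), project one onto the normal plane of the other.
    task_grads: list of flattened gradient vectors, one per task."""
    projected = []
    for i, g_i in enumerate(task_grads):
        g = g_i.clone()
        others = [g_j for j, g_j in enumerate(task_grads) if j != i]
        random.shuffle(others)  # the paper samples tasks in random order
        for g_j in others:
            dot = torch.dot(g, g_j)
            if dot < 0:  # conflicting directions
                g = g - (dot / g_j.norm() ** 2) * g_j
        projected.append(g)
    return torch.stack(projected).sum(dim=0)  # combined update direction
```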


GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks

Introduction

  • The paper proposes GradNorm, a gradient normalization algorithm that improves multi-task...
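
At its core, GradNorm tunes the per-task loss weights so that every task's gradient norm is pulled toward a common, rate-adjusted target. A rough sketch of one such update, assuming a single shared weight tensor and illustrative values for `alpha` and `lr`:

```python
import torch

def gradnorm_update(losses, initial_losses, task_weights, shared_param,
                    alpha=1.5, lr=0.025):
    """One GradNorm update of the task weights w_i (illustrative sketch).
    losses: per-task losses L_i(t); initial_losses: L_i(0) recorded at
    the start; task_weights: learnable 1-D tensor of w_i;
    shared_param: weight tensor of the last shared layer."""
    # per-task gradient norms G_i = ||grad_W(w_i * L_i)||
    norms = torch.stack([
        torch.autograd.grad(w * L, shared_param,
                            retain_graph=True, create_graph=True)[0].norm()
        for w, L in zip(task_weights, losses)])
    # relative inverse training rates r_i = (L_i(t)/L_i(0)) / mean(...)
    ratios = torch.stack([L.detach() / L0
                          for L, L0 in zip(losses, initial_losses)])
    r = ratios / ratios.mean()
    # pull every G_i toward the common target mean(G) * r_i^alpha
    target = (norms.mean() * r ** alpha).detach()
    gradnorm_loss = (norms - target).abs().sum()
    grad_w = torch.autograd.grad(gradnorm_loss, task_weights)[0]
    with torch.no_grad():
        task_weights -= lr * grad_w
        task_weights *= len(losses) / task_weights.sum()  # keep sum(w_i) = T
    return task_weights
```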


TaskNorm: Rethinking Batch Normalization for Meta-Learning

Introduction

  • Meta-learning techniques are shown to benefit from the use of deep...
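
One variant the paper proposes, TaskNorm-I, normalizes each input with a blend of moments pooled over the task's context (support) set and instance-level moments, where the blend weight depends on the context-set size. A sketch, assuming image-shaped inputs and learned scalar tensors `a` and `b`, with the affine scale and shift omitted:

```python
import torch

def tasknorm_i(x, context, a, b, eps=1e-5):
    """TaskNorm-I style normalization, sketched.
    x, context: (batch, channels, H, W); a, b: learned scalar tensors."""
    mu_c = context.mean(dim=(0, 2, 3), keepdim=True)   # context moments
    var_c = context.var(dim=(0, 2, 3), keepdim=True, unbiased=False)
    mu_i = x.mean(dim=(2, 3), keepdim=True)            # instance moments
    var_i = x.var(dim=(2, 3), keepdim=True, unbiased=False)
    # blend weight grows with the size of the context set
    alpha = torch.sigmoid(a * context.shape[0] + b)
    mu = alpha * mu_c + (1 - alpha) * mu_i
    var = alpha * var_c + (1 - alpha) * var_i
    return (x - mu) / torch.sqrt(var + eps)
```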


Averaging Weights Leads to Wider Optima and Better Generalization

Introduction

  • The paper proposes the Stochastic Weight Averaging (SWA) procedure for improving the...
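
The procedure itself is simple: keep training with a suitable learning-rate schedule and maintain a running average of the weights that SGD visits. A compact sketch, where `loader`, `loss_fn`, and `swa_start` are placeholders for a real training setup; recent PyTorch also ships `torch.optim.swa_utils` for this.

```python
import copy
import torch

def train_with_swa(model, loader, optimizer, loss_fn, epochs, swa_start):
    """SWA, sketched: average the weights visited after epoch swa_start."""
    swa_model = copy.deepcopy(model)
    n_averaged = 0
    for epoch in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        if epoch >= swa_start:
            n_averaged += 1
            with torch.no_grad():
                # running average: w_swa <- (w_swa * (n - 1) + w) / n
                for p_swa, p in zip(swa_model.parameters(),
                                    model.parameters()):
                    p_swa.mul_(n_averaged - 1).add_(p).div_(n_averaged)
    return swa_model  # BatchNorm statistics must be recomputed afterward
```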


Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions

Introduction

  • The paper explores the connections between the concepts of a single...


When to use parametric models in reinforcement learning?

Introduction

  • The paper compares replay-based approaches with model-based approaches in reinforcement learning...


Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning

Introduction

  • The paper proposes a technique for improving the generalization ability of...
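
The technique perturbs the agent's observations with a randomly initialized network that keeps being re-drawn during training, so the policy learns features invariant to low-level visual statistics. A sketch, where the single 3x3 convolution and the Xavier init are assumptions:

```python
import torch
import torch.nn as nn

def randomize_observation(obs):
    """Network randomization, sketched: pass observations through a
    freshly re-initialized random convolution.
    obs: (batch, channels, height, width) observation tensor."""
    channels = obs.shape[1]
    rand_conv = nn.Conv2d(channels, channels, kernel_size=3,
                          padding=1, bias=False)
    nn.init.xavier_normal_(rand_conv.weight)  # re-drawn at every call
    with torch.no_grad():
        return rand_conv(obs)
```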


On the Difficulty of Warm-Starting Neural Network Training

Introduction

  • The paper considers learning scenarios where the training data is available...
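
The paper's proposed remedy is the shrink-and-perturb trick: scale the warm-start weights toward zero and add a little noise before training on the newly arrived data. A sketch, with `lam` and `sigma` as assumed hyperparameter values:

```python
import torch

def shrink_and_perturb(model, lam=0.4, sigma=0.01):
    """Shrink-and-perturb, sketched: shrink every warm-start weight
    toward zero and add a small amount of Gaussian noise."""
    with torch.no_grad():
        for p in model.parameters():
            p.mul_(lam).add_(sigma * torch.randn_like(p))
```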


Supervised Contrastive Learning

Introduction

  • The paper builds on the prior work on self-supervised contrastive learning...
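
Its key change is to treat every sample sharing the anchor's label as a positive, rather than only augmentations of the anchor. A sketch of the resulting SupCon loss, assuming a batch of embeddings `z`, integer `labels`, and an assumed temperature `tau`:

```python
import torch
import torch.nn.functional as F

def supcon_loss(z, labels, tau=0.1):
    """Supervised contrastive (SupCon) loss, sketched.
    z: (n, d) embeddings; labels: (n,) class ids; tau: temperature."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                         # pairwise similarities
    not_self = ~torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos = (labels[:, None] == labels[None, :]) & not_self  # positives
    # log-softmax over every non-self pair for each anchor
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(~not_self, float('-inf')), dim=1, keepdim=True)
    # average over each anchor's positives, then over anchors
    mean_log_prob_pos = (log_prob.masked_fill(~pos, 0.0).sum(1)
                         / pos.sum(1).clamp(min=1))
    return -mean_log_prob_pos.mean()
```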