Batch Normalization (BN) is a technique used to improve the training of deep neural networks by normalizing the activations across the current batch to have zero mean and unit variance. However, its effectiveness diminishes as the batch size becomes small, because the batch statistics are then estimated inaccurately. This article explores the nuances, complexities, and current challenges of batch normalization, as well as recent research and practical applications.
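To make the normalization step concrete, here is a minimal NumPy sketch of the training-time BN computation for convolutional activations; the function name, shapes, and epsilon value are illustrative choices, not taken from any particular paper.

```python
import numpy as np

def batch_norm_2d(x, gamma, beta, eps=1e-5):
    """Normalize a batch of feature maps x with shape (N, C, H, W).

    Mean and variance are computed per channel over the (N, H, W)
    dimensions, then a learnable scale (gamma) and shift (beta) are applied.
    """
    mean = x.mean(axis=(0, 2, 3), keepdims=True)   # shape (1, C, 1, 1)
    var = x.var(axis=(0, 2, 3), keepdims=True)     # shape (1, C, 1, 1)
    x_hat = (x - mean) / np.sqrt(var + eps)        # zero mean, unit variance
    return gamma * x_hat + beta

# Hypothetical usage: a batch of 8 images with 16 channels
x = np.random.randn(8, 16, 32, 32)
gamma = np.ones((1, 16, 1, 1))
beta = np.zeros((1, 16, 1, 1))
y = batch_norm_2d(x, gamma, beta)
```

At inference time, frameworks replace the per-batch mean and variance with running averages accumulated during training, which is omitted from this sketch.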
Extended Batch Normalization (EBN) is a method proposed to address the issue of small batch sizes. Like BN, EBN computes the mean along the (N, H, W) dimensions, but it computes the standard deviation along the (N, C, H, W) dimensions, enlarging the number of samples from which the standard deviation is estimated. This approach has been shown to alleviate the problems BN faces with small batch sizes while achieving performance close to that of BN with large batch sizes.
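A rough sketch of this idea is shown below. The parameter shapes and the exact epsilon handling are assumptions for illustration; see the EBN paper in the further reading list for the precise formulation.

```python
import numpy as np

def extended_batch_norm_2d(x, gamma, beta, eps=1e-5):
    """Sketch of Extended Batch Normalization for x of shape (N, C, H, W).

    The mean is still computed per channel over (N, H, W), as in BN,
    but a single standard deviation is computed over all of (N, C, H, W),
    enlarging the sample count behind the variance estimate.
    """
    mean = x.mean(axis=(0, 2, 3), keepdims=True)   # per-channel mean, shape (1, C, 1, 1)
    std = x.std()                                   # one scalar over (N, C, H, W)
    x_hat = (x - mean) / (std + eps)
    return gamma * x_hat + beta
```

Because the standard deviation pools over N*C*H*W values instead of N*H*W per channel, its estimate stays more stable when N is tiny, which is the failure mode of standard BN.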
Recent research has also explored the impact of batch structure on the behavior of deep convolutional networks. Balanced batches, in which each batch contains one image per class, have been shown to improve network performance. Modality Batch Normalization (MBN) is another proposed method; it normalizes each modality sub-mini-batch separately, reducing distribution gaps and boosting the performance of Visible-Infrared cross-modality person re-identification (VI-ReID) models.
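The per-modality normalization in MBN can be sketched as follows. The boolean mask argument, function name, and shapes are hypothetical conveniences for illustration, and the sketch assumes both modalities are present in every batch.

```python
import numpy as np

def modality_batch_norm_2d(x, modality_mask, gamma, beta, eps=1e-5):
    """Sketch of per-modality normalization for a mixed visible/infrared batch.

    x has shape (N, C, H, W); modality_mask is a boolean array of shape (N,)
    that is True for one modality and False for the other. Each modality
    sub-mini-batch is normalized with its own batch statistics.
    """
    out = np.empty_like(x)
    for mask in (modality_mask, ~modality_mask):
        sub = x[mask]                                    # one modality sub-batch
        mean = sub.mean(axis=(0, 2, 3), keepdims=True)
        var = sub.var(axis=(0, 2, 3), keepdims=True)
        out[mask] = (sub - mean) / np.sqrt(var + eps)
    return gamma * out + beta
```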
Practical applications of batch normalization include image classification, object detection, and semantic segmentation. Research has also produced batch-independent alternatives: Filter Response Normalization (FRN), for instance, combines normalization with an activation function and operates on each activation channel of each batch element independently, eliminating the dependency on other batch elements. FRN has been reported to outperform BN and other alternatives in various settings across all batch sizes.
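Because FRN uses only per-sample, per-channel statistics, it can be sketched in a few lines. The epsilon value and parameter shapes are assumptions; the FRN paper pairs the normalization with a thresholded activation, which the last line reflects.

```python
import numpy as np

def filter_response_norm_2d(x, gamma, beta, tau, eps=1e-6):
    """Sketch of Filter Response Normalization with a thresholded activation.

    Statistics are computed per channel of each batch element over the
    spatial dimensions only, so no other batch elements are involved.
    """
    nu2 = np.mean(x ** 2, axis=(2, 3), keepdims=True)   # shape (N, C, 1, 1)
    x_hat = x / np.sqrt(nu2 + eps)
    y = gamma * x_hat + beta                              # learnable scale and shift
    return np.maximum(y, tau)                             # thresholded linear unit
```

Note that nu2 is a mean of squares rather than a variance: no mean is subtracted, which is why the learned threshold tau is needed in the activation.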
In conclusion, batch normalization is a crucial technique in training deep neural networks, with ongoing research addressing its limitations and challenges. By understanding and implementing these advancements, developers can improve the performance of their machine learning models across various applications.

Batch Normalization
Batch Normalization Further Reading
1. Extended Batch Normalization http://arxiv.org/abs/2003.05569v1 Chunjie Luo, Jianfeng Zhan, Lei Wang, Wanling Gao
2. Batch Normalization and the impact of batch structure on the behavior of deep convolution networks http://arxiv.org/abs/1802.07590v1 Mohamed Hajaj, Duncan Gillies
3. Bridging the Distribution Gap of Visible-Infrared Person Re-identification with Modality Batch Normalization http://arxiv.org/abs/2103.04778v1 Wenkang Li, Qi Ke, Wenbin Chen, Yicong Zhou
4. Four Things Everyone Should Know to Improve Batch Normalization http://arxiv.org/abs/1906.03548v2 Cecilia Summers, Michael J. Dinneen
5. Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks http://arxiv.org/abs/1911.09737v2 Saurabh Singh, Shankar Krishnan
6. Batch Kalman Normalization: Towards Training Deep Neural Networks with Micro-Batches http://arxiv.org/abs/1802.03133v2 Guangrun Wang, Jiefeng Peng, Ping Luo, Xinjiang Wang, Liang Lin
7. Cross-Iteration Batch Normalization http://arxiv.org/abs/2002.05712v3 Zhuliang Yao, Yue Cao, Shuxin Zheng, Gao Huang, Stephen Lin
8. Batch Layer Normalization, A new normalization layer for CNNs and RNN http://arxiv.org/abs/2209.08898v1 Amir Ziaee, Erion Çano
9. Rethinking Normalization and Elimination Singularity in Neural Networks http://arxiv.org/abs/1911.09738v1 Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille
10. Proxy-Normalizing Activations to Match Batch Normalization while Removing Batch Dependence http://arxiv.org/abs/2106.03743v6 Antoine Labatie, Dominic Masters, Zach Eaton-Rosen, Carlo Luschi

Batch Normalization Frequently Asked Questions
What is the purpose of batch normalization?
Batch Normalization (BN) is a technique used to improve the training of deep neural networks. Its primary purpose is to normalize the activations across the current batch to have zero mean and unit variance. This normalization helps reduce the internal covariate shift, which is the change in the distribution of layer inputs during training. By mitigating this shift, batch normalization accelerates the training process and allows the use of higher learning rates, ultimately leading to better model performance.
What are the advantages of batch normalization?
Batch normalization offers several advantages in training deep neural networks:
1. Faster convergence: By normalizing the activations, BN reduces the internal covariate shift, allowing the model to converge faster during training.
2. Higher learning rates: BN enables the use of higher learning rates without the risk of divergence, further speeding up the training process.
3. Regularization effect: BN introduces a slight regularization effect, which can help reduce overfitting in some cases.
4. Improved gradient flow: BN helps improve the gradient flow through the network, making it easier to train deeper models.
5. Reduced dependency on initialization: With BN, the model becomes less sensitive to the initial weights, making the training process more robust.
Why is batch normalization used in Convolutional Neural Networks (CNN)?
Batch normalization is used in Convolutional Neural Networks (CNN) to address the internal covariate shift problem, which occurs when the distribution of layer inputs changes during training. This shift can slow down the training process and make it difficult to train deep CNNs. By normalizing the activations across the current batch, BN helps in stabilizing the training process, allowing for faster convergence, higher learning rates, and improved model performance.
What is the difference between batch normalization and normalization?
Normalization is a general term that refers to the process of scaling data to a standard range, typically with zero mean and unit variance. It is a preprocessing step applied to input data before feeding it into a machine learning model. On the other hand, batch normalization is a specific technique used during the training of deep neural networks. It normalizes the activations across the current batch at each layer of the network, reducing the internal covariate shift and improving the training process.
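To illustrate the distinction, the snippet below contrasts a one-off standardization of input features with the per-batch normalization a BN layer performs during training; the data and shapes are made up for illustration.

```python
import numpy as np

# Data normalization: a one-off preprocessing step on the raw inputs.
X = np.random.randn(1000, 20) * 5.0 + 3.0                 # hypothetical feature matrix
X_std = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)     # zero mean, unit variance per feature

# Batch normalization: recomputed inside the network for every mini-batch,
# at every BN layer, with learnable scale and shift (see batch_norm_2d above).
batch = np.random.randn(8, 16, 32, 32)                    # activations at some layer
mean = batch.mean(axis=(0, 2, 3), keepdims=True)
var = batch.var(axis=(0, 2, 3), keepdims=True)
batch_hat = (batch - mean) / np.sqrt(var + 1e-5)
```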
How does Extended Batch Normalization (EBN) address the issue of small batch sizes?
Extended Batch Normalization (EBN) is a method proposed to address the issue of small batch sizes in batch normalization. EBN computes the mean along the (N, H, W) dimensions, similar to BN, but computes the standard deviation along the (N, C, H, W) dimensions. This enlarges the number of samples from which the standard deviation is estimated, alleviating the inaccurate batch statistics that plague BN at small batch sizes while achieving performance close to that of BN with large batch sizes.
What is Modality Batch Normalization (MBN), and how does it improve performance in cross-modality tasks?
Modality Batch Normalization (MBN) is a method that normalizes each modality sub-mini-batch separately during the training process. By reducing the distribution gaps between different modalities, MBN boosts the performance of cross-modality tasks, such as Visible-Infrared cross-modality person re-identification (VI-ReID) models. This approach helps in better handling the variations in data distribution across different modalities, leading to improved model performance.
What is Filter Response Normalization (FRN), and how does it compare to batch normalization?
Filter Response Normalization (FRN) is a combination of a normalization and an activation function that operates on each activation channel of each batch element independently, eliminating the dependency on other batch elements. Unlike batch normalization, which normalizes activations across the current batch, FRN normalizes each channel of each sample using only that sample's own statistics. This makes FRN insensitive to batch size and allows it to outperform BN and other alternatives in various settings across all batch sizes.
In which practical applications can batch normalization be used?
Batch normalization can be used in various practical applications of deep learning, including:
1. Image classification: BN helps improve the training process and performance of image classification models, such as CNNs.
2. Object detection: BN can be used in object detection models, like Faster R-CNN and YOLO, to improve training stability and accuracy.
3. Semantic segmentation: BN is beneficial in semantic segmentation tasks, where it helps in training deeper models with better performance.
4. Natural language processing: BN can also be applied to recurrent neural networks (RNNs) and transformers in NLP tasks to improve training and model performance.