Activation algorithms are the gates that determine, at each node in the net, whether and to what extent to transmit the signal the node has received from the previous layer.
A combination of weights (coefficients) and a bias acts on the input data from the previous layer to determine whether that signal surpasses a given threshold and is deemed significant.
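As a minimal sketch of this gating behavior, the following (hypothetical) `node_output` function computes the weighted sum of a node's inputs plus a bias, then passes that sum through a sigmoid activation, which squashes strongly negative sums toward 0 (signal suppressed) and strongly positive sums toward 1 (signal transmitted):

```python
import math

def node_output(inputs, weights, bias):
    """Compute one node's output: weighted sum plus bias,
    gated by a sigmoid activation function."""
    # Weighted sum of the incoming signals, shifted by the bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Sigmoid squashes z into (0, 1): near 0 when z is very
    # negative, near 1 when z is very positive.
    return 1.0 / (1.0 + math.exp(-z))

# Inputs that exactly cancel give z = 0, so the gate is "half open":
print(node_output([1.0, 2.0], [0.5, -0.25], 0.0))  # → 0.5
```

The sigmoid shown here is just one common choice of activation; others (ReLU, tanh) gate the signal with different shapes but the same weighted-sum-plus-bias machinery.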
Batch normalization does what it says: it normalizes mini-batches as they’re fed into a neural-network layer.
Batch normalization offers two potential benefits: it can accelerate learning, because it permits higher learning rates, and it also regularizes that learning.
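A minimal sketch of the core computation, assuming a NumPy array where rows are samples and columns are features: each feature is shifted to zero mean and scaled to unit variance across the mini-batch, then rescaled by learnable parameters (here passed in as simple `gamma` and `beta` arguments rather than learned):

```python
import numpy as np

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a mini-batch feature-wise, then scale and shift.

    In a real network, gamma and beta are learned parameters;
    eps guards against division by zero for constant features.
    """
    mean = batch.mean(axis=0)            # per-feature mean over the batch
    var = batch.var(axis=0)              # per-feature variance over the batch
    normalized = (batch - mean) / np.sqrt(var + eps)
    return gamma * normalized + beta
```

After this transform, each feature in the mini-batch has (approximately) zero mean and unit variance, which keeps the inputs to the next layer in a well-behaved range regardless of how the previous layer's outputs drift during training.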
For example, suppose you’re sitting in your quiet suburban home and you hear something that sounds like a lion roaring.