We apply an activation function f(x) to make the network more powerful: it gives the network the ability to learn complex patterns from data and to represent arbitrary non-linear functional mappings between inputs and outputs. Without a non-linear activation, a stack of layers collapses into a single linear transformation; with one, the network can generate non-linear mappings from inputs to outputs.
Another important property of an activation function is that it should be differentiable. This is required for backpropagation: while propagating backwards through the network we compute the gradients of the error (loss) with respect to the weights, and then update the weights using gradient descent or another optimization technique to reduce the error. A minimal sketch of this idea follows.
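To illustrate why the derivative matters, here is a rough NumPy sketch of a single sigmoid neuron trained with gradient descent on toy data. The data, learning rate, and variable names are illustrative assumptions, not from the post; the point is that the activation's derivative appears in the chain rule when the gradient flows backwards.

```python
import numpy as np

# Illustrative toy example, not the post's code.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)              # f'(x) = f(x) * (1 - f(x))

# Single neuron: y_hat = sigmoid(w * x + b)
x = np.array([0.5, 1.5, 2.0])
y = np.array([0.0, 1.0, 1.0])
w, b, lr = 0.1, 0.0, 0.5              # assumed initial weights and learning rate

for _ in range(1000):
    z = w * x + b                     # forward pass
    y_hat = sigmoid(z)
    # Mean squared error loss; its gradient flows back through the
    # activation's derivative -- this is why f must be differentiable.
    dloss_dyhat = 2.0 * (y_hat - y) / len(x)
    dloss_dz = dloss_dyhat * sigmoid_derivative(z)
    w -= lr * np.sum(dloss_dz * x)    # gradient descent update
    b -= lr * np.sum(dloss_dz)
```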
The most popular types of activation functions (each is sketched in code after the list):
Sigmoid or Logistic - Outputs values between 0 and 1. Returns f(x) = 1 / (1 + exp(-x))
TanH (Hyperbolic Tangent) - Outputs values between -1 and 1. Returns f(x) = tanh(x)
ArcTan - Outputs values between -π/2 and π/2 (roughly -1.57 to 1.57). Returns f(x) = arctan(x)
ReLU (Rectified Linear Unit) - Outputs values between 0 and x, i.e. unbounded above. Returns f(x) = max(0, x). It has become very popular in the past couple of years and has been reported to converge about six times faster than tanh. Its limitation is that it is typically used only within the hidden layers of a neural network model.
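As a quick reference, here is a minimal NumPy sketch of the four activations listed above together with their derivatives (the derivatives are what backpropagation actually uses). The function names are just illustrative choices.

```python
import numpy as np

# Sketches of the four activations above and their derivatives.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))      # range (0, 1)

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh(x):
    return np.tanh(x)                     # range (-1, 1)

def tanh_prime(x):
    return 1.0 - np.tanh(x) ** 2

def arctan(x):
    return np.arctan(x)                   # range (-pi/2, pi/2)

def arctan_prime(x):
    return 1.0 / (1.0 + x ** 2)

def relu(x):
    return np.maximum(0.0, x)             # range [0, inf)

def relu_prime(x):
    return (x > 0).astype(float)

# Example: apply each activation to the same inputs.
z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (sigmoid, tanh, arctan, relu):
    print(f.__name__, f(z))
```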