A neural network is a powerful data modeling tool that is able to capture and represent complex input/output relationships. The motivation for the development of neural network technology stemmed from the desire to develop an artificial system that could perform "intelligent" tasks similar to those performed by the human brain. Neural networks resemble the human brain in the following two ways:
- A neural network acquires knowledge through learning
- A neural network's knowledge is stored within inter-neuron connection strengths known as synaptic weights
The Neural Network algorithm is an artificial intelligence technique that explores more possible data relationships than many other algorithms. Because it is such a thorough technique, it usually processes more slowly than other classification algorithms.
A neural network consists of basic units modeled after biological neurons. Each unit has many inputs that it combines into a single output value. The units are connected together, so the outputs of some units are used as inputs to other units. The network can have one or more middle layers called hidden layers. The simplest type is the feed-forward network (pictured), where information flows in only one direction through the network, from the inputs to the outputs. Feed-forward networks contain no cycles.
As mentioned, a unit combines its inputs into a single output value. This combination is called the unit's activation function. Why should this combination be non-linear? Consider the human ear: it can function near a working jet engine, yet if it were only ten times more sensitive, you would be able to hear a single molecule hitting the eardrum. In other words, the ear's response is non-linear: the perceived difference between intensities of 0.01 and 0.02 is comparable with the difference between 100 and 200. In biology, there are many types of non-linear behavior.
Thus, an activation function has two parts. The first part is the combination function, which merges all of the inputs into a single value (a weighted sum, for example). The second part is the transfer function, which maps the value of the combination function to the output value of the unit. A linear transfer function would produce nothing more than a linear regression; to capture non-linear relationships, transfer functions are typically S-shaped, like the sigmoid function:
Sigmoid(x) = 1 / (1 + e^(-x)).
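To make the two parts concrete, here is a minimal Python sketch of a single unit that uses a weighted sum as its combination function and the sigmoid above as its transfer function. The input values, weights, and bias are made-up illustrative numbers, not values produced by any particular tool.

```python
import math

def sigmoid(x):
    """Transfer function: squashes any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(inputs, weights, bias):
    """Combination function (weighted sum plus bias) followed by the transfer function."""
    combination = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(combination)

# Made-up illustrative values: a single unit with three inputs.
print(unit_output([0.5, -1.0, 2.0], [0.4, 0.1, -0.3], bias=0.2))
```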
A single hidden layer is generally sufficient, so the Neural Network algorithm always uses at most one hidden layer (or zero, in which case the model is equivalent to Logistic Regression).
The Neural Network algorithm uses the hyperbolic tangent activation function in the hidden layer and the sigmoid function in the output layer. You can see a neural network with a single hidden layer in the following picture.
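As a rough illustration of that structure, the following Python sketch runs one forward pass through a network with a single hidden layer, applying tanh to the hidden units and the sigmoid to the output unit. The layer sizes and weight values are arbitrary assumptions chosen for the example, not values the algorithm would actually learn.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: tanh in the hidden layer, sigmoid in the output layer."""
    hidden = [math.tanh(sum(i * w for i, w in zip(inputs, ws)) + b)
              for ws, b in zip(hidden_weights, hidden_biases)]
    combination = sum(h * w for h, w in zip(hidden, output_weights)) + output_bias
    return sigmoid(combination)

# Arbitrary example values: 2 inputs, 3 hidden units, 1 output.
hidden_weights = [[0.5, -0.2], [0.1, 0.8], [-0.6, 0.4]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [0.7, -0.3, 0.5]
print(forward([1.0, 0.5], hidden_weights, hidden_biases, output_weights, output_bias=0.2))
```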
Training a neural network is the process of finding the best weights for the inputs of each of the units. The backpropagation process, sketched in code after this list, does the following:
- Gets a training example and calculates outputs
- Calculates the error – the difference between the calculated and the expected (known) result
- Adjusts the weights to minimize the error
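The following Python sketch shows one way such a training loop can look, assuming the same tanh/sigmoid architecture as above, a squared-error measure, and plain gradient descent. The learning rate, epoch count, and XOR training data are illustrative assumptions, not the algorithm's actual settings. Each pass over an example performs exactly the three steps listed above.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(examples, n_inputs, n_hidden, learning_rate=0.5, epochs=2000):
    """Backpropagation sketch: tanh hidden layer, sigmoid output, squared error."""
    rnd = random.Random(0)
    W = [[rnd.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    V = [rnd.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    c = 0.0
    for _ in range(epochs):
        for x, target in examples:
            # 1. Get a training example and calculate the output (forward pass).
            h = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + bj)
                 for ws, bj in zip(W, b)]
            o = sigmoid(sum(v * hj for v, hj in zip(V, h)) + c)
            # 2. Calculate the error: difference between calculated and expected result.
            delta_o = (o - target) * o * (1.0 - o)
            delta_h = [delta_o * V[j] * (1.0 - h[j] ** 2) for j in range(n_hidden)]
            # 3. Adjust the weights to reduce the error (gradient descent step).
            for j in range(n_hidden):
                V[j] -= learning_rate * delta_o * h[j]
                b[j] -= learning_rate * delta_h[j]
                for i in range(n_inputs):
                    W[j][i] -= learning_rate * delta_h[j] * x[i]
            c -= learning_rate * delta_o
    return W, b, V, c

# Illustrative data only: the XOR problem, a classic non-linear classification task.
xor = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
weights = train(xor, n_inputs=2, n_hidden=3)
```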
As with the Decision Trees algorithm, you can use the Neural Network algorithm for classification and prediction. However, interpreting the Neural Network algorithm's results is somewhat more complex than interpreting the Decision Trees algorithm's results. Consequently, the Decision Trees algorithm is more popular.