The Essence of Neural Networks

Amruta Mulay
6 min read · Jan 2, 2021

Abstract

The world contains enormous amounts of complex, unstructured data, and extracting useful information from such compound structures is difficult for humans on their own. If we design a mathematical model that recreates the way a human brain analyzes and works, we obtain a powerful computing system that can handle such complicated data. In this article we aim to understand the essence of how neural networks work: they employ a chain of algorithms that try to comprehend the intrinsic relationships within a set of data through procedures that imitate the way a human brain works. Based on this discussion, further proposed implementations and future scope are outlined to extend their use across different lines of business so that the technology is used at its best.

What is a Neural Network?

We define neural networks as a sequence of algorithms that attempt to recognize the hidden relationships in a set of data by imitating the way a human brain perceives and operates based on logical reasoning. They can adapt to evolving inputs and generate the best possible outcome without the output criteria having to be redesigned. They are capable of everything from translating text to identifying faces, from recognizing speech to reading handwriting, and from controlling robots to much more than we can possibly imagine.

Figure-1: A simple neural network with three input nodes, two hidden nodes, and an output layer.

Working of a Neural Network

A neural network usually comprises multiple layers; we prefer multiple layers because each layer performs a different transformation on the data fed as input. The input layer is the first layer and is responsible for picking up the input signals, which are then passed on to the next layers. Every node in the input layer denotes a separate feature of the given piece of data. Each input is multiplied by the weight assigned to that feature and fed into the next layer. All the complicated calculations and feature-extraction steps are implemented in this next layer, also termed the hidden layer, which is responsible for finding the hidden characteristics of the input signals and uncovering unknown information. There is usually more than one hidden layer, which improves the performance of the model. The last layer is designated the output layer, which yields the final result.

In Figure-1, the first layer, i.e. the input layer, comprises three nodes that take in the input data. This data is passed to the next layer, the hidden layer, containing two nodes that are responsible for feature extraction and other complex calculations. Finally, the output of the hidden layer is fed as input to the output layer, which holds the final result computed by the previous layers.
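To make this concrete, here is a minimal sketch of a forward pass through a network shaped like the one in Figure-1, assuming three input nodes, two hidden nodes, and a single output node; the weights, biases, and the choice of a sigmoid activation are illustrative and not taken from the article.

```python
import numpy as np

def sigmoid(z):
    # Squashes any weighted sum into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 0.1, 0.9])          # one input sample with three features

W1 = np.array([[0.2, -0.4],
               [0.7,  0.1],
               [-0.5, 0.3]])           # weights from 3 input nodes to 2 hidden nodes
b1 = np.array([0.1, -0.2])             # hidden-layer bias

W2 = np.array([[0.6],
               [-0.3]])                # weights from 2 hidden nodes to 1 output node
b2 = np.array([0.05])                  # output-layer bias

hidden = sigmoid(x @ W1 + b1)          # weighted sum + bias, then activation
output = sigmoid(hidden @ W2 + b2)     # final result from the output layer
print(output)
```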

Role of Hidden Layers

It is clear from Figure-1 that the hidden layers play the main role in processing the information; they act as an interface between the input and the output layers. The main calculation involved is the weighted sum of the inputs and their assigned weights. A fixed bias is added to this sum, and a suitable activation function is applied to the result. There can be multiple hidden layers: the more hidden layers there are, the more time the network takes to produce the desired output, but additional hidden layers also facilitate solving more complex problems. It is therefore best to choose an optimal number of hidden layers and of nodes within them. Experimental analysis suggests that the ideal number of nodes within a hidden layer can be found as follows:

There is a high possibility of overfitting: the analysis may fit the given set of data so well that it fails to fit future observations, reducing the model's ability to generalize. This is where the 'Factor' variable in the equation comes into action; it is a number ranging from 1 to 10 that is used to prevent overfitting of the data.
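The equation itself is not reproduced above, so the sketch below assumes the commonly cited rule of thumb that matches this description: the number of hidden nodes is the number of training samples divided by the 'Factor' times the sum of input and output nodes.

```python
def ideal_hidden_nodes(n_samples, n_inputs, n_outputs, factor=2):
    # Assumed heuristic: samples / (Factor * (input nodes + output nodes)),
    # with Factor between 1 and 10 to guard against overfitting.
    return n_samples // (factor * (n_inputs + n_outputs))

# Example: 1,000 training samples, 3 input features, 1 output node, Factor = 5
print(ideal_hidden_nodes(1000, 3, 1, factor=5))   # -> 50
```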

Activation Functions in Neural Networks

Activation functions are the mathematical functions in a neural network that help determine its output. Each neuron is connected to such a function, which verifies whether the neuron should be activated or not, i.e. whether the input that neuron provides is relevant to the model's prediction. Normalizing the output of each neuron is one of the key features of an activation function: outputs are mapped into an interval such as [0, 1] or [-1, 1]. Another important characteristic is computational efficiency. Training a network uses a technique termed back-propagation, which places an additional computational strain on the network, so the activation function must be efficient enough to be evaluated across millions of neurons for each data sample.

Figure-2
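As a minimal sketch of the normalization property mentioned above, using sigmoid and tanh as standard examples (the specific functions shown in the article's figures may differ):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

weighted_sums = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(sigmoid(weighted_sums))   # every value lands in [0, 1]
print(np.tanh(weighted_sums))   # every value lands in [-1, 1]
```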

In the absence of activation functions, a neural network would just be a linear regression model, which is simple to solve but restrictive when evaluating more complex problems such as image categorization or language translation. We therefore use activation functions to introduce non-linear transformations so that the network can solve problems involving higher-degree relationships (a short sketch after the table below makes this concrete). The following table describes the different types of activation functions that we may use when building neural networks.

Figure-3: Types of activation functions.
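To make the earlier point about linearity concrete, the following sketch (not part of the original figures) shows that stacking layers without an activation function collapses into a single linear map, whereas inserting a non-linearity such as ReLU does not:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 2))
x = rng.normal(size=(5, 3))                      # five samples, three features

no_activation = (x @ W1) @ W2                    # two "layers" with no activation
single_layer  = x @ (W1 @ W2)                    # one layer with the combined weights
print(np.allclose(no_activation, single_layer))  # True: the extra layer adds nothing

relu = lambda z: np.maximum(z, 0)
with_activation = relu(x @ W1) @ W2              # ReLU between the layers
print(np.allclose(with_activation, single_layer))  # False: the network is now non-linear
```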

Advantages of Neural Networks

  • Better fault tolerance: Even if one or more neurons in the network become corrupted, the generated result is not affected. This makes the network good at tolerating faults.
  • Capable of working with incomplete knowledge: After training, a neural network can still produce output when part of the input information is missing; the drop in performance depends on how important the missing information is.
  • Parallel processing: Due to their enormous computational power, neural networks are able to perform multiple operations at a time.
  • Information is stored on the entire network: Data and information are distributed across the network, so even if bits and pieces of information go missing, the whole network is still able to operate.
  • Ability to train the device: The main characteristic of neural networks is that they learn from the events presented to them and decide accordingly. They thus train the machine to act or respond in a specific way based on past learning of similar events.
  • Distributed memory: To get the desired output, the neural network model must learn from the selected examples and be able to reproduce them; the model improves in proportion to the samples that are selected.
