Neural Networks Regression¶
The Neural Networks Regression task creates a feed-forward neural network (NN). The NN model consists of a certain number of elementary perceptrons organized in layers. Each perceptron returns a sigmoidal function (activation function) that depends on a weighted sum of its input. A back-propagation algorithm is used to find the optimal weight vector for each perceptron. The output of the task is a model consisting of a weight matrix that can be used by an Apply Model task to perform the NN prediction on a set of examples.
The Neural Networks Regression task is divided into four tabs:
The Options tab, where users can choose the attribute they will work on and with.
The Monitor tab, where users can view the properties of the new clusters.
The Weights tab, where users can view the weight matrix for each hidden layer.
The Results tab, where users can visualize the results.
The Options tab¶
The Options tab is divided into two tabs: the Basic tab and the Advanced tab.
In the Basic tab, users will find the following structure:
Available attributes, where users will find the dataset’s attributes. It is always displayed on both Basic and Advanced tabs.
Attributes drop area, where users can drag the chosen attributes.
Customization pane, where users are able to customize different options.
Available Attributes
Within this section, users will find a list of all the dataset’s attributes. To search for a specific attribute, use the lens icon at the top right of the pane. Users also have the option of sorting the attributes according to their preferences. They can choose from a drop-down list of the Order by option:
Attribute
Name
Type
Ignored
Role
Attribute drop area
The attribute drop area is divided into two panes:
The Input attributes pane, where users can drag and drop the input attributes they want to use to create the classification of data. This operation can be done via a Manual list (users need to manually drag & drop the selected attributes onto the pane) or via a Filtered list.
The Output attributes pane, where users can drag and drop the input attributes they want to use to build the network. This operation can be done via a Manual list (users need to manually drag & drop the selected attributes onto the pane) or via a Filtered list.
Customization panes
Within this pane, users can set the following options:
Number of hidden layers: the number of layers in a network.
Neurons for each hidden layer: the number of neurons in each hidden layer. This option is enabled only if the value specified for the Number of hidden layers option is greater than zero.
Learning rate: the coefficient of the gradient descent.
Momentum term: the momentum of the back-propagation algorithm.
Normalization for input attributes: the type of normalization to use when treating ordered (discrete or continuous) variables. Available values are:
None
Attribute
Normal
Minmax [0.1]
Minmax [-1,1]
Normalization for output attributes: the method for normalizing output variables. Available values are:
None
Attribute
Normal
Minmax [0.1]
Minmax [-1,1]
In the Advanced tab, users can set the following options:
Maximum number of iterations: it indicates the (overall) maximum number of iterations for the gradient descent algorithm.
Number of trials: it indicates the number of trials (i.e. the repeated execution of back-propagation algorithm, with different starting conditions).
Maximum number of iterations for any trial: it indicates the maximum number of iterations allowed for the gradient descent algorithm for each trial.
Initialize random generator with seed: if selected, a seed which defines the starting point in the sequence, is used during random generation operations.
Use systemic activation function: if selected, it uses an activation function in the range (-1,1) instead of (0,1).
Append results: if selected, the results of this computation are appended to the dataset, otherwise they replace the results of previous computations.
Aggregate data before processing: if selected, identical patterns will be aggregated and considered as a single pattern during the evaluation process.
The Monitor tab¶
Within this tab, users can view the temporal evolution of some quantities related to the network optimization.
This tab is divided itself into three other tabs:
The Error Tab
The Gradient Tab
Error tab
Within this tab, users will see two plots. The plot above shows the mean errors on the training set. The neural network gradually changes the weights (i.e. the parameters of the NN) as it progresses, so that the value decreases. The plot below shows the maximum error; it takes the highest error (record) and monitors this error.
On the X axis: users will see the number of iterations.
On the Y axis: users will see the value of error.
Gradient tab
The gradient is a parameter used to indicate how the NN stabilizes. The lower it is, the more stabilized the parameters become.
On the X axis: users will see the number of iterations.
On the Y axis: users will see the value of gradient.
The Weights tab¶
This tab helps users to view the weight matrix for each hidden layer. The layer to be displayed can be selected from the Select a Layer drop-down list.
The Results tab¶
Within this tab, users can visualize a summary of the computation.
This tab is divided into two panes:
General Info
Within this pane, users can find the following information:
Task Label
Elapsed time (sec)
Result Quantities
Within this pane, users can set and configure the following options:
Average number of iterations per trial
Final value of gradient
Number of layers
Number of neuronal units
Number of samples
Number of trials
Number of weights (included biases)
Total number of iterations
These checkboxes are checked by default.
Example¶
The following example uses the Adult dataset.
After having imported the dataset with the Import from Text File task, split the dataset into test and training sets (30% test and 70% training) with the Split Data task.
Add a Neural Networks Regression task to the flow.
Configure the following options and attributes in the Basic tab:
Input attributes: age, workclass, education, relationship, race.
Output attributes: hours-per-week.
Number of hidden layers: 1
Neurons for each hidden layer: 5
Configure the following options in the Advanced tab:
Maximum number of iterations: 500
Maximum number of iterations for any trial: 500
Save and compute the task.
The results can be checked in the Monitor tab of the NN task. Within this tab, users can visualize the values of Error, and Gradient by clicking on the corresponding tab.
The Weights tab displays the weights obtained through the optimization procedure. The layer to be visualized can be selected in the Select a layer drop-down list: each row corresponds to a neuron of that layer and contains the weights relative to each input attribute.
Add an Apply Model task to the NN task. The forecast produced can be analyzed by right-clicking the Apply Model task and selecting Take a look.
In the data table, the following columns showing the results of the NN computation have been added:
the NN output forecast: pred(hours-per-week)
the error, i.e. the difference between the real output and the predicted output; if the actual output is missing, the field is left empty: err(hours-per-week)