Classification tasks¶
Classification tasks integrate classification algorithms and can be used to solve supervised learning problems.
An algorithm is a sequence of steps, triggered by an initial action, which reacts to changing circumstances and produces an output or a decision.
Algorithms play a crucial role in machine learning, where the concept of learning refers to the ability of a system to autonomously learn and improve from experience.
Rulex Platform provides a full range of classification tasks, where users can set the constraints easily.
Before starting the analysis, we recommend splitting the data into two parts:
- a training set, used to identify patterns in the data and to create a set of rules which can be used to make predictions on data. The training set is usually made up of 70-80% of the available data.
- a test set, used to test the model and to confirm its accuracy. It is usually made up of the remaining 20-30% of the available data.
Some modeling methods may also require a validation set. Creating one isn’t mandatory, as its patterns are used only for internal validation by those methods.
You can divide the dataset with the Data Manager or with the Split Data task.
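To make the proportions concrete, here is a minimal sketch of a 70/30 split in plain Python. It is a generic illustration only, not the behavior of the Data Manager or Split Data task; the function name `split_dataset`, its parameters and the sample rows are assumptions introduced for this example.

```python
import random


def split_dataset(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into a training set and a test set.

    Hypothetical helper for illustration; it does not reproduce any Rulex task.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                   # copy, so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))                     # stand-in for 100 dataset rows
train_set, test_set = split_dataset(rows, train_fraction=0.7)
print(len(train_set), len(test_set))        # prints: 70 30
```

Shuffling before cutting avoids a split that simply follows the original row order, and the fixed seed keeps the same rows in each set across runs.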
If you want to apply the model to the data, remember to add an Apply Model task after the chosen classification task.
Note
If the output role has been established in the Output attributes drop area of the classification task, the role is not modified at dataset level.
This means that this role is not taken into account in any subsequent task.
On the other hand, if the output role has been established in a Data Manager task, the role will be taken into account in any subsequent task.
Supervised learning¶
In supervised learning problems, an output attribute is present in the dataset, and the aim of the analysis is to derive a model which describes the relationship between this attribute and all the other input attributes of the dataset.
Classification problems aim to determine the class or category to which each pattern in a dataset belongs, based on its input attributes.
The output attribute that defines the target class or category is a nominal attribute. For example, a classification problem may predict whether car sales will increase, decrease or remain stable (3 possible outcomes) over the next 12 months.
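As a rough illustration of a classification problem with a nominal output, the sketch below trains on a few labelled patterns and measures accuracy on a separate test set. Everything in it is an assumption made for the example: the data are invented, the attribute names are hypothetical, and the nearest-neighbour method is a generic stand-in rather than the algorithm used by any Rulex classification task.

```python
from collections import Counter

# Invented toy data: (marketing_spend, price_change) -> sales trend.
# The output attribute is nominal, with three possible classes.
train = [
    ((9.0, -2.0), "increase"),
    ((8.0, -1.0), "increase"),
    ((5.0, 0.0), "stable"),
    ((4.5, 0.5), "stable"),
    ((1.0, 3.0), "decrease"),
    ((2.0, 2.5), "decrease"),
]
test = [
    ((8.5, -1.5), "increase"),
    ((5.2, 0.2), "stable"),
    ((1.5, 2.8), "decrease"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two tuples of input attributes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(inputs, training_set, k=1):
    """Return the majority class among the k nearest training patterns."""
    nearest = sorted(
        training_set, key=lambda row: squared_distance(inputs, row[0])
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Predict the class of each test pattern and compare with the known output.
predictions = [predict(inputs, train) for inputs, _ in test]
accuracy = sum(p == label for p, (_, label) in zip(predictions, test)) / len(test)
print(predictions, accuracy)
```

The test patterns play no part in building the model, so the accuracy computed on them gives an estimate of how the model behaves on data it has not seen.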
Classification tasks layout¶
Rulex Factory’s classification tasks share a common layout. They are usually made up of three tabs, each with a specific aim:
- the Options tab, which can be divided into two tabs: the Basic tab and the Advanced tab. The options contained in each tab vary according to the chosen task. In the Basic tab, or in the Options tab if the Advanced tab is not provided for the corresponding task, you will find a common layout which consists of:
the Attributes list
the configuration options, which consist of the checkboxes, drop-down lists and number fields needed to customize the analysis.
You will find more about the Options tab and the available configurations on each task’s page.
- the Monitor tab: here you will find the corresponding plot for each task, which can be viewed only after the computation has finished.
- the Results tab: here you will find computation information, which varies according to the chosen task. It is divided into two sections:
the General info section, where you will find general information.
the Result quantities section, where you will find detailed quantities. The information provided here varies according to the chosen task.
See also
You will find all details regarding options, plots and results obtained on the corresponding task’s page.