Classification tasks

Classification tasks integrate classification algorithms and can be used to solve supervised learning problems.

An algorithm is a sequence of steps, triggered by an initial action, which reacts to changing circumstances and produces an output or a decision.

Algorithms play a crucial role in machine learning, where learning refers to the ability of a system to autonomously learn and improve from experience.

Rulex Platform provides a full range of classification tasks, where users can set the constraints easily.

Before starting the analysis, we recommend splitting the data into two parts:

- a training set, used to identify patterns in the data and to create a set of rules that can be used to make predictions. The training set usually contains 70-80% of the available data.
- a test set, used to test the model and confirm its accuracy. It usually contains the remaining 20-30% of the available data.

In some cases you may also need to set aside a validation set. This is not mandatory, as validation patterns are used only for internal validation by some modeling methods.

You can divide the dataset with the Data Manager or with the Split Data task.
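
For readers who want to see the idea outside Rulex Platform, the split described above can be sketched in a few lines of plain Python. This is only an illustration and not how the Data Manager or the Split Data task is implemented: the function name `split_dataset`, its parameters, and the sample data are all assumptions made for this example.

```python
import random


def split_dataset(rows, train_fraction=0.7, validation_fraction=0.0, seed=0):
    """Shuffle rows and split them into training, validation and test sets.

    Hypothetical helper for illustration only; not part of Rulex Platform.
    """
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    if validation_fraction < 0 or train_fraction + validation_fraction >= 1:
        raise ValueError("fractions must leave some data for the test set")

    # Shuffle a copy so the caller's data is left untouched.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_fraction)
    n_validation = round(len(shuffled) * validation_fraction)

    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return train, validation, test


# Made-up data: 100 row identifiers.
rows = list(range(100))

# 70% training, 30% test, no validation set.
train, validation, test = split_dataset(rows, train_fraction=0.7)
print(len(train), len(validation), len(test))

# 70% training, 10% validation, 20% test.
train_v, validation_v, test_v = split_dataset(
    rows, train_fraction=0.7, validation_fraction=0.1
)
print(len(train_v), len(validation_v), len(test_v))
```

The rows are shuffled before splitting so that any ordering in the original data does not end up concentrated in one of the sets; the fixed seed only makes the example repeatable.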

If you want to apply the model to the data, remember to add an Apply Model task after the chosen classification task.

Supervised learning

In supervised learning problems an output attribute is present in the dataset, and the target of the analysis is to derive a model which describes the relationship between this attribute and the other input attributes in the dataset.

Classification problems aim to determine the class or category to which each pattern in a dataset belongs, based on its input attributes.

The output attribute that defines the target class or category is a nominal attribute. For example, a classification problem may predict whether car sales will increase, decrease or remain stable (3 possible outcomes) over the next 12 months.
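
To make the roles of input attributes, nominal output attribute, training set and test set concrete, here is a minimal Python sketch based on the car sales example. It uses a deliberately simple classifier that predicts the most frequent class for each value of a single input attribute. This is not one of the algorithms provided by the Rulex classification tasks; the records, attribute names and function names are all invented for illustration.

```python
from collections import Counter, defaultdict

# Invented toy records: (input attributes, nominal output attribute).
training_set = [
    ({"last_quarter": "up", "fuel_price": "low"}, "increase"),
    ({"last_quarter": "up", "fuel_price": "high"}, "increase"),
    ({"last_quarter": "down", "fuel_price": "high"}, "decrease"),
    ({"last_quarter": "down", "fuel_price": "low"}, "decrease"),
    ({"last_quarter": "flat", "fuel_price": "low"}, "stable"),
    ({"last_quarter": "flat", "fuel_price": "high"}, "stable"),
    ({"last_quarter": "up", "fuel_price": "low"}, "increase"),
]
test_set = [
    ({"last_quarter": "down", "fuel_price": "low"}, "decrease"),
    ({"last_quarter": "flat", "fuel_price": "high"}, "stable"),
    ({"last_quarter": "up", "fuel_price": "high"}, "increase"),
]


def fit_majority_by_attribute(records, attribute):
    """Learn the most frequent output class for each value of one attribute."""
    counts = defaultdict(Counter)
    overall = Counter()
    for inputs, label in records:
        counts[inputs[attribute]][label] += 1
        overall[label] += 1
    rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    default = overall.most_common(1)[0][0]  # used for unseen attribute values
    return rules, default


def apply_model(model, inputs, attribute):
    """Predict the class of one record using the learned rules."""
    rules, default = model
    return rules.get(inputs[attribute], default)


# Train on the training set, then apply the model to the test set.
model = fit_majority_by_attribute(training_set, "last_quarter")
predictions = [apply_model(model, inputs, "last_quarter") for inputs, _ in test_set]
correct = sum(pred == label for pred, (_, label) in zip(predictions, test_set))
accuracy = correct / len(test_set)

print(model[0])
print(predictions, accuracy)
```

The separation between fitting and applying mirrors the workflow described above, where a classification task builds the model and a separate Apply Model task uses it to make predictions on data.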

Classification tasks layout

Rulex Factory’s classification tasks have a common layout. They usually consist of three tabs, each with a specific purpose:

- the Options tab, which may be divided into two tabs: the Basic tab and the Advanced tab. The options contained in each tab vary according to the chosen task. The Basic tab (or the Options tab itself, if the task has no Advanced tab) has a common layout, which consists of:
  - the Attributes list
  - the attribute drop area
  - the configuration options: checkboxes, drop-down lists and number fields used to customize the analysis.

  You will find more about the Options tab and the available configurations on each task’s page.
- the Monitor tab, which shows the plot for each task. The plot can be viewed both during the computation and after it has finished.
- the Results tab, which shows information on the computation, which varies according to the chosen task. It is divided into two sections:
  - the General info section, which contains general information.
  - the Result quantities section, which contains detailed quantities. The information provided here varies according to the chosen task.