Pre-Processing tasks

Because it is frequently necessary to operate on the structure or contents of datasets before creating a predictive model, Rulex Factory provides a set of tasks for pre-processing your data.

For example, it may be necessary to reshape some of the datasets in the flow before merging them into a single table, to convert attributes to more manageable data types, or to clean up the dataset by removing outliers or attributes that could cause confusion in the final model.

Rulex Factory provides multiple tasks, each one corresponding to a specific pre-processing operation.

This category can be divided into four subgroups: the Transforming, Reshaping, Merging, and Cleaning tasks.

  • The Transforming tasks transform the dataset’s attributes, for example by dividing their values into a finite set of intervals, or by defining temporal windows for them.

  • The Reshaping tasks rearrange the contents of datasets into new rows or new columns, or convert rows into columns and vice versa.

  • The Merging tasks join datasets by their rows, or concatenate tables by their columns.

  • The Cleaning tasks clean datasets by removing outliers or attributes that could cause confusion in the resulting model.
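
The four subgroups above can be pictured with a small sketch in plain Python. This is only an illustration of the kinds of operations involved, not Rulex Factory code: the function names, the sample data, the cut points and the missing-value threshold are all hypothetical and were chosen for the example.

```python
# Illustrative only: plain-Python analogues of the four subgroups.
# None of these names are Rulex Factory tasks or APIs.

rows = [
    {"id": 1, "age": 23, "city": "Genoa", "note": None},
    {"id": 2, "age": 47, "city": "Milan", "note": None},
    {"id": 3, "age": 65, "city": "Genoa", "note": "vip"},
]

def discretize(value, cut_points):
    """Transforming: map a numeric value to the index of its interval."""
    return sum(value >= c for c in cut_points)

def transpose(table):
    """Reshaping: convert a list of row dicts into a dict of columns."""
    return {key: [row[key] for row in table] for key in table[0]}

def join_on(left, right, key):
    """Merging: inner-join two tables on a shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

def drop_sparse(table, max_missing=0.5):
    """Cleaning: drop attributes whose share of missing values exceeds
    max_missing (an assumed criterion, chosen for this sketch)."""
    n = len(table)
    keep = [k for k in table[0]
            if sum(row[k] is None for row in table) / n <= max_missing]
    return [{k: row[k] for k in keep} for row in table]

age_bins = [discretize(r["age"], cut_points=[30, 60]) for r in rows]
columns = transpose(rows)
scores = [{"id": 1, "score": 0.8}, {"id": 3, "score": 0.4}]
merged = join_on(rows, scores, key="id")
cleaned = drop_sparse(rows)
```

Here `age_bins` assigns each age to one of three intervals, `columns` holds the same data organized by attribute, `merged` keeps only the rows present in both tables, and `cleaned` no longer contains the mostly empty `note` attribute.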


Tasks layout

Because this task category covers different pre-processing operations, the tasks have been designed so that all the required settings can be configured in the Options tab. This tab is usually divided into two main panes: one contains the Available attributes list (more information can be found here), and the other contains the customization options for the chosen task.

The only task whose Options tab is divided into two secondary tabs is the Fill/Clean task: because it offers two different operations on data, the corresponding options are placed in the Fill and Clean tabs.