Convert Structure to Dataset
The Convert Structure to Dataset task transforms one of Rulex Factory’s data structures into a dataset.
Users may want to convert a structure for several reasons, for example:
Analyze the dataset using a Data Manager for detailed insights. For example, users can filter rules through a formula to identify only those rules that have a specific coverage area.
Export the resulting dataset to an external destination, such as a file or a database table.
Produce a dataset that contains all the information included in a model, such as the weights of a neural network or the coefficients of a linear regression.
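The filtering use case can be pictured outside Rulex Factory with a short Python sketch. It is only an illustration: the column names (`rule_id`, `condition`, `output`, `covering`) and the values are invented, and the real output columns of the task may differ.

```python
# Hypothetical rows, as they might look after converting a Rules structure
# into a dataset with one row per rule. Names and values are invented.
rules_dataset = [
    {"rule_id": 1, "condition": "age > 40", "output": ">50K", "covering": 0.32},
    {"rule_id": 2, "condition": "age <= 25", "output": "<=50K", "covering": 0.11},
    {"rule_id": 3, "condition": "hours-per-week > 45", "output": ">50K", "covering": 0.27},
]


def filter_by_covering(rows, minimum):
    """Keep only the rules whose covering is at least `minimum`."""
    return [row for row in rows if row["covering"] >= minimum]


if __name__ == "__main__":
    for row in filter_by_covering(rules_dataset, 0.25):
        print(row["rule_id"], row["condition"], row["covering"])
```

Inside Rulex Factory the same selection would be expressed as a formula in a Data Manager task, applied to the dataset produced by this task.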
The task has a single tab, the Options tab.
The Options tab
The Options tab is made of two panes:
Structure pane, where users can select the structure which will be converted to a dataset.
Configuration pane, which is displayed after a structure has been chosen.
Structure pane
This pane contains a drop-down list for choosing the data structure to convert.
Available options are:
Association rules
Auto regressive models
Clusters
Cluster labels
Discretization cutoffs
Frequent itemsets
Frequent sequences
Monitor
Results
Rules
Models
Pca eigenvectors
Relevances
Configuration pane
As noted above, the Configuration pane is displayed once a structure has been chosen.
For every structure except Rules, the pane shows the message “No parameters need to be set for this task: just compute the task by right-clicking it and selecting Compute > Compute selected”.
If the Rules structure has been chosen, the Configuration pane is populated with the following options:
- Dataset format: the format for the output dataset. The possible values are:
One row for each rule (default): the resulting table contains a row for each rule.
One row for each term: the resulting table contains a row for each condition attribute value within each rule.
One row for each condition: the resulting table contains a row for each condition attribute within each rule.
- Conditions on ordered attributes format: the required format for the conditions on ordered attributes. The possible values are:
a<x<b (default): values will always be displayed with greater than or less than indicators.
x>a, x in [a,b]: values will be displayed with greater than or less than indicators, or as a range when possible.
x in [0,Inf], x in [a,b]: values will always be displayed as a range, with an infinite bound (Inf) used on any side where the condition has no finite limit.
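To make the two Rules options more concrete, here is a minimal Python sketch of how they could shape the output. It is not Rulex code. The rule, the column names, the style names (`inequality`, `mixed`, `range`), the inclusivity of the bounds and the use of `-Inf` for a missing lower bound are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical rule:
# IF age in (20, 30] AND occupation in {Sales, Tech-support} THEN income = >50K
RULE = {
    "rule_id": 1,
    "output": "income = >50K",
    "conditions": {"age": ["(20,30]"], "occupation": ["Sales", "Tech-support"]},
}


def one_row_per_rule(rule):
    """Dataset format 'One row for each rule': a single row for the whole rule."""
    return [{"rule_id": rule["rule_id"],
             "n_conditions": len(rule["conditions"]),
             "output": rule["output"]}]


def one_row_per_condition(rule):
    """One row for each condition attribute in the rule."""
    return [{"rule_id": rule["rule_id"], "attribute": attr, "values": ", ".join(values)}
            for attr, values in rule["conditions"].items()]


def one_row_per_term(rule):
    """One row for each value of each condition attribute in the rule."""
    return [{"rule_id": rule["rule_id"], "attribute": attr, "value": value}
            for attr, values in rule["conditions"].items()
            for value in values]


def format_condition(name: str, low: Optional[float], high: Optional[float],
                     style: str) -> str:
    """Render a condition on an ordered attribute; None means 'no bound'."""
    if low is None and high is None:
        raise ValueError("at least one bound is required")
    if style in ("inequality", "mixed"):
        if low is None:
            return f"{name} <= {high}"
        if high is None:
            return f"{name} > {low}"
        if style == "inequality":
            return f"{low} < {name} <= {high}"
        return f"{name} in [{low},{high}]"  # mixed: range only when both bounds exist
    if style == "range":
        lo = "-Inf" if low is None else low
        hi = "Inf" if high is None else high
        return f"{name} in [{lo},{hi}]"
    raise ValueError(f"unknown style: {style}")


if __name__ == "__main__":
    for layout in (one_row_per_rule, one_row_per_condition, one_row_per_term):
        print(layout.__name__, len(layout(RULE)))
    for style in ("inequality", "mixed", "range"):
        print(style, "|", format_condition("age", 20, None, style))
```

Under these assumptions the same rule produces a different number of rows in each dataset format, and a one-sided condition is written as an inequality in the first two condition formats but as a range ending in Inf in the third.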
Example
The following example uses the Adult dataset.
After having created a structure using the Standard Clustering task, add a Convert Structure to Dataset task to the flow.
In the Select the structure drop-down list, choose Clusters, then save and compute the task.
The output dataset contains the results in tabular format, matching what is shown in the Clusters tab of the Standard Clustering task.