Reshape to Long
The Reshape to Long pivoting operation reshapes a dataset by transforming a set of similar attributes into a new pair of attributes: one containing the name of the original attribute, which is removed, and one containing its value.
Consequently, the number of attributes in the dataset decreases and the number of rows increases accordingly.
Rulex Factory’s reshaping operations are similar to creating pivot tables in Microsoft Excel, where you create a new table with the structure you require by summarizing the data from the original table. However, the Reshape to Long task reshapes the original table without creating a second one. The new structure is produced automatically: the number of rows and columns increases or decreases, and some rows are repeated where necessary to fit the new shape.
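The transformation described above can be illustrated outside the product with a minimal Python sketch. It is not Rulex Factory code: the function `reshape_to_long`, the sample table, and the output column names `name` and `value` are all hypothetical and serve only to show how each selected attribute of each row becomes a (name, value) row.

```python
# Hypothetical wide table: one row per student, one attribute per subject.
wide_rows = [
    {"student": "Ann", "math": 28, "physics": 25, "chemistry": 30},
    {"student": "Bob", "math": 22, "physics": 27, "chemistry": 24},
]


def reshape_to_long(rows, key, attributes, name_col="name", value_col="value"):
    """Turn every selected attribute of every row into its own (name, value) row."""
    long_rows = []
    for row in rows:
        for attr in attributes:
            long_rows.append(
                {key: row[key], name_col: attr, value_col: row[attr]}
            )
    return long_rows


long_rows = reshape_to_long(wide_rows, "student", ["math", "physics", "chemistry"])

# 2 rows x 4 attributes become 6 rows x 3 attributes.
for row in long_rows:
    print(row)
```

The key attribute (`student` here) is repeated on every generated row, which is why the table gets longer while losing attributes.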
The task consists of a single tab, the Options tab, which needs to be configured according to your needs. You will find more information on how to configure it in the next paragraph.
The Options tab
The Options tab is divided into two main areas. On the left is the Attributes list, where you can uncheck the attributes you want to remove (for more information, see the corresponding page). On the right are all the available configuration options.
The configuration options available are:
The Attributes to be transformed in long format: drag and drop the attributes that will become a single column in the resulting data table. Instead of dragging and dropping attributes manually, you can define them via a filtered list.
The Number of long attributes in the final table: specify how many columns the combined values should be spread over. By default, the values of all the Attributes to be transformed in long format are inserted in a single column (with an extra column specifying which attribute each value refers to).
The Contiguous attributes in a group belong to the same final long attribute: if selected, contiguous attributes are placed in the same final column; otherwise, attributes are assigned to the final columns alternately.
The Keep at least one row for each key: if selected, at least one row is displayed for each key, even if it contains only missing values.
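One possible reading of the last three options is sketched below in plain Python. This is an interpretation, not the product's implementation. The helper names `split_into_long_groups` and `filter_missing` are hypothetical, and two behaviors are assumptions: that the selected attributes divide evenly among the final long attributes, and that rows whose value is missing are otherwise dropped (inferred from the wording of the Keep at least one row for each key option).

```python
def split_into_long_groups(attributes, n_long, contiguous=True):
    """Assign the selected attributes to n_long final long attributes.

    contiguous=True:  neighbouring attributes share a final column.
    contiguous=False: attributes are dealt out to the columns in turn.
    """
    # Assumption: the attributes divide evenly among the final columns.
    if n_long < 1 or len(attributes) % n_long != 0:
        raise ValueError("attribute count must be a multiple of n_long")
    size = len(attributes) // n_long
    if contiguous:
        return [attributes[i * size:(i + 1) * size] for i in range(n_long)]
    return [attributes[i::n_long] for i in range(n_long)]


def filter_missing(long_rows, key, value_col, keep_one_row_per_key=False):
    """Drop rows with a missing value (assumed default), optionally
    keeping one placeholder row for keys that have no values at all."""
    keys_with_values = {r[key] for r in long_rows if r[value_col] is not None}
    result, placeholder_added = [], set()
    for r in long_rows:
        if r[value_col] is not None:
            result.append(r)
        elif (keep_one_row_per_key
              and r[key] not in keys_with_values
              and r[key] not in placeholder_added):
            result.append(r)
            placeholder_added.add(r[key])
    return result


# Attributes listed group by group: use the contiguous option.
contiguous_groups = split_into_long_groups(
    ["qty_1", "qty_2", "price_1", "price_2"], 2, contiguous=True
)
# Attributes listed interleaved: use the alternating option.
alternating_groups = split_into_long_groups(
    ["qty_1", "price_1", "qty_2", "price_2"], 2, contiguous=False
)
# Both calls give [["qty_1", "qty_2"], ["price_1", "price_2"]].
```

Under this reading, the choice between the two grouping modes depends only on how the attributes are ordered in the selection: grouped together or interleaved.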
Example
After importing a file onto the stage, drag a Reshape to Long task onto the stage and link it to the source task.
Double-click on the task to open it and set the reshaping parameters. In the example here, we want our dataset to be reshaped to two columns, one with the order ID, and one with the products. Select all the Item attributes and drag them onto the Attributes to be transformed in long format area. Save and compute the task.
To check the results, add a Data Manager, link it to the Reshape to Long task, and double-click it. As you can see, the selected attributes have been transformed into the following columns:
long_1, which contains the labels of the items (i.e. item 1, item 2, and so on);
wide_1, which contains the values.
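For readers who want to reproduce a comparable result outside Rulex Factory, the sketch below uses the pandas `melt` function as a rough analogue. The order data and the attribute names `Order ID`, `Item 1`, `Item 2` and `Item 3` are invented for illustration; the output columns are named `long_1` and `wide_1` only to mirror the result described above, and the exact layout produced by the task may differ.

```python
import pandas as pd

# Hypothetical order data; values and attribute names are illustrative only.
orders = pd.DataFrame({
    "Order ID": [101, 102],
    "Item 1": ["bread", "milk"],
    "Item 2": ["butter", "eggs"],
    "Item 3": ["jam", "flour"],
})

# Select every Item attribute, as done by drag and drop in the task.
item_columns = [c for c in orders.columns if c.startswith("Item")]

long_orders = (
    orders.melt(
        id_vars=["Order ID"],      # key attribute, repeated on every row
        value_vars=item_columns,   # attributes to be transformed
        var_name="long_1",         # label of the original attribute
        value_name="wide_1",       # value of the original attribute
    )
    .sort_values("Order ID", kind="stable")  # group rows by order
    .reset_index(drop=True)
)

print(long_orders)
```

Each order now occupies three rows, one per item, with the order ID repeated on each of them.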