Bridge tasks

Bridge tasks allow users to run external scripts on their data in Rulex Factory. This is useful when users already have statistical algorithms they want to integrate into their flow, or when an operation requires a specific programming language.

The following scripting languages are supported in Rulex Factory:

R
Python

In the Bridge task family, the following tasks are provided:

R Bridge: applies R scripts to data elaborated in Rulex Factory.
R Import Bridge: imports data from R scripts into Rulex Factory.
Python Bridge: applies Python scripts to data elaborated in Rulex Factory.
Python Import Bridge: imports data from Python scripts into Rulex Factory.

See also

In Python, data is formatted as a dictionary that maps each column name (key) to the list of values in that column (value), for example: "age": [39, 45, ...], "workclass": ["Private", "State-gov", ...]
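As a minimal sketch of the dictionary format described above, the snippet below builds such a dictionary by hand and reads one row back out of it. The values come from the example on this page; the helper name `row_at` is a hypothetical illustration and not part of Rulex Factory.

```python
# Column-oriented data: each key is a column name, each value is the
# list of that column's entries (values taken from the example above).
data = {
    "age": [39, 45],
    "workclass": ["Private", "State-gov"],
}


def row_at(columns, index):
    """Return one row as a {column name: value} dictionary.

    `row_at` is a hypothetical helper for illustration only.
    """
    return {name: values[index] for name, values in columns.items()}


# Every column is expected to hold the same number of values.
lengths = {len(values) for values in data.values()}
assert len(lengths) == 1, "all columns should have the same length"

first_row = row_at(data, 0)
print(first_row)
```

Because the layout is column-oriented, reading a single row means picking the same index from every list, which is what `row_at` does.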

Warning

To use these tasks, it is necessary to install:

Python 3 (with the packages “ipython” and “pandas”) to use the Python Bridge and the Python Import Bridge tasks with the Python interpreter.
R, r-essentials, r-base, r-hash to use the R Bridge and the R Import Bridge tasks.
Miniconda to use all the bridge tasks with the Conda interpreter.

Python 3.10 or higher is strongly recommended. After installing Miniconda, check that Python 3.10 (or a newer version) is installed on the machine where Rulex Platform is running. If it is not, install it manually.
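The check described above can be sketched as a short script run with the interpreter that the bridge will use. This is only an illustration, not a Rulex tool: the function names are hypothetical, and the script only verifies the Python version and whether the two required packages can be found.

```python
import importlib.util
import sys

# Minimum version recommended on this page.
MIN_PYTHON = (3, 10)

# Import names of the required packages. Note that the "ipython"
# distribution is imported as "IPython".
REQUIRED_MODULES = ("IPython", "pandas")


def python_is_recent_enough(version_info=sys.version_info):
    """Return True if the given interpreter version is at least 3.10."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def missing_modules(modules=REQUIRED_MODULES):
    """Return the modules from `modules` that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version OK:", python_is_recent_enough())
    print("Missing modules:", missing_modules())
```

An empty list of missing modules together with a positive version check suggests the interpreter meets the prerequisites listed in this warning.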

Bridge tasks layout

Once opened on the interface, all the tasks belonging to the Bridge family have the same layout.

The Python Bridge and R Bridge tasks allow users to perform statistical calculations using Python/R scripts on data, and either overwrite the original dataset with the output results, or create a new dataset.

The Python Import Bridge and R Import Bridge tasks allow users to import data directly from a Python/R script, either by entering the script directly into the task, or by referencing an external script file.

All the tasks are made of one tab only, the Options tab.

In the Options tab, users can distinguish between three areas:

The configuration area, located at the top of the task window, contains the options required to define the characteristics of the Python/R run.
The language toolbar, located to the right of the configuration options, contains buttons that perform a series of operations on language blocks. More details are provided in the corresponding section.
The language console, located below the configuration options, is a pane made of language blocks, where users can add rows to write Python/R code and scripts, or select a script file to execute, depending on the options set in the configuration area. More details can be found in the corresponding section.

See also

Depending on the selected language, either Python or R, the code or script to be executed in the language console is organized as multiple language blocks, each containing either Python or R code.

The configuration area

The configuration area allows users to set up the Python/R task execution and the type of interpreter used during the computation. It also contains some buttons that act on the language console pane.

The following options are provided:

Execute: choose whether Code or a Script will be run by clicking on the switch button. If Code has been chosen, the interactive area is populated with rows where users can type the Python/R code. If Script has been chosen, an area where users can upload their script file appears. This area has the same layout as the location controller in the Import task family.
Interpreter: select the interpreter (Python/R or Conda) which will be used in the bridge connection by clicking on the switch button. If the change is performed while the bridge is connected, the current bridge will be disconnected. A pop-up window asks users to confirm that they want to disconnect the current bridge.
Options window button: by clicking on this pencil-shaped button, users can open the Options menu, whose options change according to the chosen interpreter. More information can be found in the corresponding section.
Connect Bridge: clicking this button connects to the chosen interpreter and activates the first four buttons of the language block options. This button must be clicked every time users need to debug the code written in the code area. Once the connection is completed, the button becomes greyed out and its name changes to Connected.

Click on the pencil icon to open the configuration menu. This menu can be used to set up the Python/R/Conda interpreters, and must be set up while configuring the task for the first time. After the options listed below have been set, click Install to install the Python/R/Conda packages.

See also

This menu must also be used to add new Python/R/Conda installations which are not available on the chosen interpreter.

If Python has been chosen as the interpreter, the window contains two options only, the Interpreter and the Requirements fields. Here, users need to specify the paths to, respectively, the Python/R interpreter (in .exe format) and the file containing the requirements (in .txt format).

The paths can be typed in the corresponding text fields, or users can select them from their machine by clicking on the Select button (standalone version). In the Cloud/Server versions, the path in the Interpreter field can only be typed manually.

After the fields have been filled, the Install button activates, and must be clicked to install the interpreter.
If Conda has been chosen as the interpreter, the window contains three options: the Interpreter, the Environment file and the Environment fields. In the first two, users need to specify the paths to, respectively, the Conda interpreter (in .exe format) and the file containing the configurations (in .yaml format).

The Environment field is a drop-down list, where users can choose among the existing environments in the Conda installation, or create a new one by clicking on the Create User Environment button. By default, the generated environment is named after the machine’s username.

Note

If a standalone installation is being used, the user environment name will be RulexDesktop-username.

After the Options window has been set, click Test connection to test if the connection will be performed correctly. Click Close to close the window without saving the changes.
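The requirements file mentioned above for the Python interpreter can be inspected before clicking Install. The sketch below assumes the file follows the usual pip requirements layout (one package per line, an optional version specifier, `#` comments), which this page does not spell out. The function names are hypothetical and not part of Rulex Factory.

```python
import re
from importlib import metadata

# Leading package name of a requirement line such as "pandas>=1.5".
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def parse_requirement_names(text):
    """Extract package names from requirements-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        match = _NAME.match(line)
        if match:
            names.append(match.group(0))
    return names


def not_installed(names):
    """Return the names with no installed distribution in this interpreter."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


sample = """\
# packages named in the warning at the top of this page
ipython
pandas
"""
print(parse_requirement_names(sample))
```

To check a real file, pass its contents to `parse_requirement_names` and the resulting names to `not_installed`, using the same interpreter that is entered in the Interpreter field.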

Language toolbar

In the language toolbar the following buttons are provided:

Run the kernel: it runs the code/script written on the selected language block(s).
Interrupt the kernel: it stops the bridge connection.
Restart the kernel: it stops and restarts the bridge connection.
Run all the code: it stops and restarts the bridge connection, then it runs all the code/script.
Add a cell at the beginning: it adds a row on top of the existing ones.
Add a cell at the end: it adds a row at the bottom of the existing ones.
Cut: it cuts the selected language block(s).
Copy: it copies the selected language block(s).
Paste: it pastes the selected language block(s).

The toolbar can be hidden by clicking on the right arrow present at the end of the toolbar itself.

Language console pane

The language console pane layout changes depending on whether it is running Code or a Script.

As mentioned above, when a Script is run, the language console pane displays a location controller which is the same as the one in the Import task family. More information can be found in the corresponding page.

When Code is run, individual language blocks are displayed, where users can type their code.

Each language block can be selected by clicking on it. Multiple selection can be performed by clicking with the mouse on the chosen language blocks, while pressing SHIFT on the keyboard. The selected language blocks will be highlighted.

An arrow is present on the right end of each language block. Clicking it reveals a series of buttons which allow users to perform specific operations:

Show description (Ctrl+Left)/Show Code: displays the description of the selected language block(s) if the code is visualized, or displays the code if the description window is visualized. The description is made of a title and/or an image which helps identify the code written in the block. To add the image, click on the greyed out placeholder image and upload it. To edit the title, click on the Title placeholder: an editing window appears, with the same options available in the documentation editor window. To know more about these options, go to the corresponding page.
Copy the selected cells and paste them below the selection: duplicates the selected language block(s).
Move cells up (Ctrl+Up): moves the selected language block(s) up.
Move cells down (Ctrl+Down): moves the selected language block(s) down.
Insert a cell above (Ctrl+K): adds a cell above the current one.
Insert a cell below (Ctrl+B): adds a cell below the current one.
Delete cells (Del): deletes the selected language block(s).

Once the buttons are shown, click on the right arrow icon to hide them again.

See also

When the language block description is visualized, an icon explaining the code status is displayed at the right end of the corresponding language block. The icon indicates if the code is ready to be run, if it has been run or if there was an error in running it. Each area can be collapsed by clicking on the status bar present on the left side of any language block. The color of the status bar indicates the computational status of the current block:

Purple means a language block that is ready but not yet computed.
Green means a language block that has already been executed.
Red means a language block that returned an error.
Yellow means a language block that returned a warning.

The color of the status bar is visible only when the language block row is selected. You can select any number of language block rows by simply clicking on them, optionally holding the Ctrl or Shift keys for a multiple or range selection.

Hint

The input dataset is converted to a pandas DataFrame in Python (a data frame in R) and automatically stored in the task as r_dataset. So, when writing code that refers to the source dataset in Python Bridge and R Bridge tasks, the dataset must be referenced as r_dataset. In the Python Import Bridge and R Import Bridge tasks, the dataframe is empty.

The flow variables are received automatically by the task. When writing code that refers to the flow variables in Python Bridge and R Bridge tasks, they must be referenced as r_vars.
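As a hedged illustration of the two names above, the sketch below filters r_dataset using a value read from r_vars. Outside Rulex Factory neither name exists, so the snippet defines stand-ins for both. The structure of r_vars is an assumption (it is treated here as a dictionary keyed by variable name, which this page does not confirm), and the variable name "min_age" is hypothetical. How the result is handed back to the flow depends on the task options and is not shown.

```python
import pandas as pd

# Stand-ins so the sketch runs outside Rulex Factory. Inside a Python
# Bridge task, r_dataset and r_vars are provided automatically.
r_dataset = pd.DataFrame(
    {"age": [39, 45], "workclass": ["Private", "State-gov"]}
)
r_vars = {"min_age": 40}  # assumed dict-like; "min_age" is hypothetical

# Filter the input rows using a threshold read from the flow variables.
threshold = r_vars["min_age"]
filtered = r_dataset[r_dataset["age"] >= threshold].reset_index(drop=True)

print(filtered)
```

Inside an actual Python Bridge task the two stand-in assignments would be dropped, since the task supplies r_dataset and r_vars itself.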