Changes between Rulex 4 and Rulex Factory
This page summarizes the changes between the company's previous software, Rulex 4, and Rulex Factory, the new component of Rulex Platform.
The focus is only on differences in options and computation behavior. The user experience has been completely redesigned: Rulex Factory is in fact new software that has its roots in Rulex 4 and is part of a larger project called Rulex Platform.
This list is meant as a guide for migrating from the old Rulex 4 process to the new Factory flow. The migration is designed to be effortless and is performed automatically by the Rulex Factory flow import routine.
However, some critical situations need to be understood and solved directly in Rulex 4 before the flow is imported into the new software.
These situations are listed in the section Changes not mitigated or corrected by the converter.
Changes managed automatically by import converter
The modifications listed here are required by Rulex Factory and are applied automatically by the process-flow converter used in the import operation when a `prcx` file is selected:
- In Rulex 3.2, the functions `currdate` and `tostring` were available to evaluate the current date and to cast the result to string (an intermediate attribute type, different from nominal, which has been removed in Rulex 4). While Rulex 4 exceptionally allows the use of these two functions in the `procvars` option of module tasks (Execute Process File task or Rulex Process File Source task), in the import operation Rulex Factory converts them to the `currDate` and `cast` functions, respectively.
- In Rulex Factory, the comment lines before any history rows (the ones starting with `//DATASET OPERATION`) are used to automatically build the visual description. Therefore, their presence is mandatory. Since some operations in Rulex 4 did not have these comment lines, the converter adds them when needed. Namely:
  - `//DATASET OPERATION\n//REMOVE ROW` is added to the Rulex 3.2 `removeRow` operation.
  - `//DATASET OPERATION\n` or `//RULESET OPERATION\n` is added where missing.
- In Rulex 4, the data structure for rules is referred to as rules in the history. The Rulex Factory converter changes it to ruleset, to align it with the dataset naming that Rulex 4 already used for data.
- In Rulex 4, the status of the history lines is not reset when the task is disconnected from its source task. This may lead to misleading results when Import from Task tasks are combined with the computation of only part of the history. Therefore, in Rulex Factory the status is always updated when a computed task is disconnected. During the conversion from prcx to rfl, the status of the history lines belonging to tasks that have not been computed is set to ready.
- The following options have been removed, as they are mostly unused Rulex 3.2 options or Bridge options that are no longer useful in the new configuration of the bridge tasks:
compat32
filenamefrc
delimfrc
alertstart
alertend
alerterror
alertrecipients
alerterrortype
alertduration
alertdurationtn
debugfilename
language
rhostname
report
rportaux
routputname
rinputname
scriptfromfile
rulboundselrows
The converter deletes these options during the import operation, if they are present.
- In Rulex 4, graphical drop-down menu options are sometimes associated with string values (for example, the `uri` option in import/export tasks) and sometimes with numbers representing the position of the entry in the list. In the latter case, if the list has two entries, the binary values `False`/`True` are also permitted, since they are automatically converted to the integers `0`/`1`. Consequently, adding a new entry to the list can change the position numbers and thus affect the execution of existing processes. For this reason, every drop-down menu option in Rulex Factory is now associated with a list of string values. The converter automatically changes the options according to the table below:

| Option | Value mapping |
|---|---|
| `byname` | 0 → `position`, 1 → `name` |
| `cattype` | 0 → `inner`, 1 → `outer` |
| `oplist` | 0 → `X`, 1 → `=`, 2 → `!=`, 3 → `<`, 4 → `<=`, 5 → `>`, 6 → `>=`, 17 → `substr`, 18 → `superstr`, 19 → `not_substr`, 20 → `not_superstr`, 21 → `begin`, 22 → `is_begin`, 23 → `not_begin`, 24 → `is_not_begin`, 25 → `end`, 26 → `is_end`, 27 → `not_end`, 28 → `is_not_end`, 29 → `damerau_levenshtein_<=`, 30 → `levenshtein_<=`, 31 → `hamming_<=`, 32 → `long_common_substr_<=`, 33 → `damerau_levenshtein_>`, 34 → `levenshtein_>`, 35 → `hamming_>`, 36 → `long_common_substr_>`, 37 → `is_anagram`, 38 → `is_word`, 39 → `include_word`, 40 → `primary_phonetic`, 41 → `secondary_phonetic`, 42 → `some_phonetic_common`, 43 → `both_phonetic_common`, 44 → `is_included`, 45 → `included` |
| `ordimptype` | 0 → `fixed`, 1 → `mean`, 2 → `median`, 3 → `mode`, 4 → `minimum`, 5 → `maximum`, 6 → `minimumchange` |
| `timeimptype` | 0 → `fixed`, 1 → `mean`, 2 → `median`, 3 → `mode`, 4 → `minimum`, 5 → `maximum`, 6 → `minimumchange` |
| `incrdecr` (see the note after this table) | -1 → `decrement`, 1 → `increment` |
| `jointype` | 0 → `inner`, 1 → `louter`, 2 → `router`, 3 → `outer`, 4 → `lcomplement`, 5 → `rcomplement`, 6 → `complement` |
| `mergetype` | 0 → `nofill`, 1 → `left`, 2 → `right` |
| `misspolicy` | 0 → `normal`, 1 → `always`, 2 → `never` |
| `inpdisctype` | 0 → `incremental`, 1 → `entropy`, 2 → `chi`, 3 → `width`, 4 → `frequency`, 5 → `roc` |
| `outdisctype` | 0 → `width`, 1 → `frequency` |
| `distmethod` | 0 → `euclidean`, 2 → `euclidean-norm`, 3 → `manhattan`, 4 → `manhattan-norm`, 5 → `pearson` |
| `evaldistmethod` | 0 → `euclidean`, 2 → `euclidean-norm`, 3 → `manhattan`, 4 → `manhattan-norm`, 5 → `pearson` |
| `normtype` | 0 → `nonorm`, 1 → `attribute`, 2 → `normal`, 3 → `minmax01`, 4 → `minmax-11` |
| `timeunit` | 0 → `second`, 1 → `minute`, 2 → `hour`, 3 → `day`, 4 → `week`, 5 → `month`, 6 → `quarter`, 7 → `year` |
| `impurtype` | 0 → `entropy`, 1 → `gini`, 2 → `error` |
| `centroidtype` | 0 → `means`, 1 → `median`, 2 → `medoids` |
| `kmeanstype` | 0 → `standard`, 1 → `incremental`, 2 → `error` |
| `arsmoothfunc` | 0 → `log`, 1 → `box`, 2 → `nosmooth` |
| `shuffletype` | 0 → `noshuffle`, 1 → `reshuffle`, 2 → `keepshuffle` |
| `rulewide` | 0 → `term`, 1 → `condition`, 2 → `eule` |
| `ruleinterval` | 0 → `operator`, 1 → `mix`, 2 → `interval` |
| `treepruning` | 0 → `no`, 1 → `complexity`, 2 → `reduced`, 3 → `pessimistic` |
| `treeusemissing` | 0 → `average`, 1 → `remove`, 2 → `include` |
| `adefiltmode` | 0 → `maximal`, 1 → `closed`, 2 → `confidence` |
| `appenddata` | 0 → `dropinsert`, 1 → `appendinsert`, 2 → `update`, 3 → `updateinsert`, 4 → `delete` |
| `nomimptype` | 0 → `fixed`, 1 → `mode` |
| `allonly` | 0 → `allbut`, 1 → `only` |
| `firstlast` | 0 → `first`, 1 → `last` |
| `svmkernel` | 0 → `linear`, 1 → `polynomial`, 2 → `radial`, 3 → `sigmoid` |
| `assigntype` | 0 → `random`, 1 → `smart`, 2 → `weight` |
| `fulldeploy` | 0 → `all`, 1 → `requested`, 2 → `fair` |
| `anomaly` | 0 → `one-class`, 1 → `anomaly` |
| `ordroll` | 1 → `minimum`, 2 → `maximum`, 3 → `summation`, 4 → `average`, 5 → `median`, 6 → `mode`, 7 → `standdev`, 8 → `absdev` |
| `nomroll` | 1 → `minimum`, 2 → `maximum`, 3 → `summation`, 4 → `average`, 5 → `median`, 6 → `mode`, 7 → `standdev`, 8 → `absdev` |
| `svmtype` | If the task is an Svm classification task: 0 → `c_svc`, 1 → `nu_svc`. If the task is an Svm regression task: 0 → `epsilon_svr`, 1 → `nu_svr` |
The converter automatically applies the table above to change the options coming from Rulex 4 during the flow import operation.
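The lookup that the table implies can be sketched in Python as follows. This is only an illustration, not the converter's actual code: `VALUE_MAPS` holds a small subset of the table, `convert_option` is a hypothetical helper name, and raising an error on an unmapped number is an assumption.

```python
# Illustrative subset of the mapping table above (not the full list).
VALUE_MAPS = {
    "byname": {0: "position", 1: "name"},
    "jointype": {
        0: "inner", 1: "louter", 2: "router", 3: "outer",
        4: "lcomplement", 5: "rcomplement", 6: "complement",
    },
    "incrdecr": {-1: "decrement", 1: "increment"},
}


def convert_option(name, value):
    """Hypothetical helper: map a Rulex 4 numeric drop-down value to its string."""
    mapping = VALUE_MAPS.get(name)
    if mapping is None or isinstance(value, str):
        # Option is not a mapped drop-down, or it already holds a string value.
        return value
    key = int(value)  # False/True are treated as 0/1, as in Rulex 4
    if key not in mapping:
        # Assumption: unknown positions are rejected rather than passed through.
        raise ValueError(f"unsupported value {value!r} for option {name!r}")
    return mapping[key]


print(convert_option("byname", True))   # name
print(convert_option("jointype", 3))    # outer
```

Keying the lookup on the option name rather than on the position alone matters because the same number means different things for different options, for example `0` is `position` for `byname` but `inner` for `jointype`.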
Attention
The only option that still keeps its numerical values (even though it is displayed as a drop-down menu in Rulex Factory) is `winauth` in the Import from Database task and the Conditional Import from Database task. This option still accepts `0`/`1` or `False`/`True` as possible values. This choice was made to reduce the transition effort, especially when dealing with possible runtime parametrization.
Note
In the `incrdecr` option of module tasks (Execute Process File task or Rulex Process File Source task), the use of numbers greater than `1` or lower than `-1`, which led to inconsistent behavior even in Rulex 4, is no longer allowed.
- The following options have been renamed in the transition from Rulex 4 to Rulex Factory:
rcode → scriptcode
rcommand → exepath
- In Rulex 4, there are conflicting options for import tasks (for example `filename`/`filelist`). The list version is used if it is set, overriding the value of the name version. For this reason, in Rulex Factory the `filename`, `sheetname`, `query` and `tablename` options have been removed and only the `filelist`, `sheetlist`, `querylist` and `tablelist` options are available. When needed, the converter moves the value of the `filename`, `sheetname`, `query` or `tablename` entry into the corresponding list option, turning it into a single-element list.
- In Rulex Factory, macro code has completely changed from Rulex 4. In Rulex 4, it was written as a list of internal command codes (representing the socket language between the Rulex 4 Client and the Rulex 4 Server). In Rulex Factory, macro code is made of CLI/API commands and is aligned with the new Rulex Platform API service. The converter automatically translates the old Rulex 4 commands into the corresponding Rulex Platform ones.
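The move from the single-value import options (`filename`, `sheetname`, `query`, `tablename`) to their list counterparts, described above, can be sketched as follows. This is a minimal illustration under stated assumptions: task options are modeled as a plain dictionary and `move_to_list_options` is a hypothetical name, neither of which comes from Rulex itself.

```python
# Removed single-value options and the list options that replace them (from the text).
SINGLE_TO_LIST = {
    "filename": "filelist",
    "sheetname": "sheetlist",
    "query": "querylist",
    "tablename": "tablelist",
}


def move_to_list_options(options):
    """Hypothetical helper: return a copy of `options` without single-value entries."""
    converted = dict(options)
    for single, plural in SINGLE_TO_LIST.items():
        if single not in converted:
            continue
        value = converted.pop(single)
        # In Rulex 4 the list version wins when it is set, so an existing
        # non-empty list is kept and the single value is simply dropped.
        if value and not converted.get(plural):
            converted[plural] = [value]
    return converted


print(move_to_list_options({"filename": "sales.csv"}))
# {'filelist': ['sales.csv']}
```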
Changes mitigated by import converter through value modifications
The modifications listed here are required by Rulex Factory and are applied automatically by the process-flow converter through ad-hoc code routines, which cannot be re-executed directly from Rulex Factory. They should be left untouched and never replicated anywhere else in the Rulex Factory flow.
- Comparison operations (`==`, `!=`, `<=`, `>=`) executed in formulas or in `ifelse` functions on nominal columns return different results on `None` entries in Rulex 4 and in Rulex Factory. The differences are mitigated during the conversion from Rulex 4 by enclosing these operations in a backward-compatibility function, `rns`, which restores the Rulex 4 behavior.
Attention
The function `rns` must NOT be used in any newly created code, as the new behavior of Rulex Factory is strongly recommended.
- Conditions in Rulex Factory have been greatly expanded in what they can do. Operations can now be performed directly in condition code, avoiding the need to create many additional support columns. However, as a side effect, conditions received as input by the Convert Ruleset to Dataset task must now be more specific to avoid conflicts between the old and the new structure. For example, while the condition code `PROD_SIZE in {3/Midi}` is accepted in Rulex 4, in Rulex Factory the string `3/Midi` must be surrounded by quotes to specify that it is a string. The converter takes care of these occurrences by writing the condition in the following form: `PROD_SIZE in {"3/Midi"}`.
- In Rulex 4, setting the `procvar` option to modify one specific variable in a module evaluation leads to a final option reporting the whole list of variables, not only the modified ones. To mitigate this effect, the converter modifies all the `procvar` options to delete any entry that has the same value in the module task and in the parent flow, keeping the ones that are not present in the parent flow.
- In Rulex Factory, when more than one long attribute is selected to be expanded, the Reshape to Wide task creates all the new columns in the table position of the original long attribute. In Rulex 4, instead, when there is more than one long attribute, the new columns are inserted at the end of the whole table, regardless of the position number or the order of the outputs. A dedicated flag has been added to the task in Rulex Factory, and the converter sets its value to ensure that the imported task behaves as in Rulex 4.
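The `procvar` pruning rule described above (drop the entries that are equal in the module task and in the parent flow, keep those that are absent from the parent flow) amounts to a dictionary filter. The sketch below assumes that variables can be modeled as name/value dictionaries; `prune_procvar` is a hypothetical name, not a Rulex function.

```python
def prune_procvar(procvar, parent_vars):
    """Hypothetical helper: keep only procvar entries that differ from the parent flow."""
    return {
        name: value
        for name, value in procvar.items()
        # Keep the entry if the parent flow lacks it or holds a different value.
        if name not in parent_vars or parent_vars[name] != value
    }


procvar = {"year": 2020, "region": "EU", "limit": 10}
parent = {"year": 2020, "region": "US"}
print(prune_procvar(procvar, parent))
# {'region': 'EU', 'limit': 10}
```

Here `year` is removed because the module task merely repeats the parent value, `region` stays because it is a real override, and `limit` stays because the parent flow does not define it.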
- In Rulex Factory, the module execution operation requires an rfl file as entry point, while Rulex 4 requires a prcx file. In some situations, the option values of the module task are defined by using the Runtime Variables task, which makes updating all the options from `.prcx` to `.rfl` more error prone. For this reason, a dedicated flag has been introduced in the Runtime Variables task; the converter sets it to `True` to ensure that all `.prcx` extensions are converted into `.rfl` at runtime, during the execution of the task itself.
Warning
This dedicated flag does NOT convert the module itself at runtime, but only the option value. The `.rfl` version is expected to be already present in the same location as the original `.prcx` version.
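As a rough illustration of what the extension conversion means for a single option value, the sketch below rewrites a trailing `.prcx` into `.rfl`. It is not the Runtime Variables task implementation: `prcx_to_rfl` is a hypothetical name, and both the case-insensitive match and the restriction to the end of the string are assumptions.

```python
import re

# Assumption: only a trailing extension is rewritten, ignoring letter case.
_PRCX_SUFFIX = re.compile(r"\.prcx$", re.IGNORECASE)


def prcx_to_rfl(value):
    """Hypothetical helper: rewrite a trailing .prcx extension into .rfl."""
    if not isinstance(value, str):
        return value  # non-string option values are left untouched
    return _PRCX_SUFFIX.sub(".rfl", value)


print(prcx_to_rfl("modules/clean.prcx"))  # modules/clean.rfl
```

As the warning above states, only the option value changes: the file `modules/clean.rfl` itself must already exist next to the original `.prcx` file.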
- In Rulex Factory, the value of the `process` option that specifies the current flow in the Import from Task task has been changed from `-- THIS --` to `__this__`. When importing a `prcx`, the converter changes every `process` option of an Import from Task task with value `-- THIS --` into `__this__`.
- Rulex 4 allowed values like `@var` in the `loopvar` option and took `@var` itself as the iterator instead of its value, while Rulex Platform correctly uses the value of `@var`. This difference is mitigated by converting this particular option when importing a `prcx`.
- In Rulex Factory, when importing a .prcx file, the `hard reset` macros are converted into `soft reset` macros.
- In Rulex Factory, when importing a .prcx file, the `defaultcost` option of the Network Optimizer task is set to 1 instead of its default value -1, to keep the behavior of the Network Optimizer task in Rulex Platform equal to the one in Rulex 4.
Changes not mitigated or corrected by the converter
The remaining differences between Rulex 4 and Rulex Factory have been assessed as having low or minor impact. Therefore, they are not mitigated by the converter and may require a manual modification of the imported flow.
- The module execution has been moved from Python (used in Rulex 4) to GOLD in Rulex Factory, aligning the two module tasks (Execute Process File task and Rulex Process File Source task) with the rest of the tasks during the execution process. In Rulex 4, all tasks except the module ones required option values matching their assigned type; the casting operations performed in Python by the module tasks allowed them to work even when the provided option values were not of the correct type. Rulex Factory corrects this inconsistency, so this is no longer possible. However, the impact of this change has been assessed as low/minimal.
- When a module is executed in Rulex 4, the process variables used in the module flow have the following priority:
  1. Parent workflow
  2. Procvar option of the module task
  3. Module workflow variables

  This makes the `procvar` option meaningless in most cases. In Rulex Factory, the priority has been changed to the following:
  1. Procvar option of the module task
  2. Parent flow
  3. Module flow variables

  The impact has been assessed as minimal, since the `procvar` option is not used frequently in Rulex 4.
- In Rulex Factory, while executing modules, the selected `loopvar` must be defined and then selected only in the parent flow, and no longer only in the module, as required in Rulex 4. This is related to the disaster recovery policy set in Rulex Factory for module computation, which now executes the loop iteration in a completely different thread. This allows the system to recover even if a hard crash due to memory consumption occurs. The change only affects the iteration of an unclear loop and its impact has been estimated as minimal.
- In Rulex Factory, when an Import from Task task is used in a module and its Flow option is set to __this__, the flow taken into account is the module itself, that is, the flow where the Import from Task task is located. In Rulex 4, instead, selecting __this__ as the Flow option made the software take the parent flow into account.
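The two priority orders for module variables listed above can be compared with a small Python sketch. It assumes that each source of variables is a name/value dictionary and uses `collections.ChainMap`, where earlier mappings take precedence; the function names are hypothetical.

```python
from collections import ChainMap


def resolve_rulex4(parent, procvar, module):
    """Rulex 4 order: parent workflow, then procvar option, then module variables."""
    return dict(ChainMap(parent, procvar, module))


def resolve_factory(parent, procvar, module):
    """Rulex Factory order: procvar option, then parent flow, then module variables."""
    return dict(ChainMap(procvar, parent, module))


parent = {"year": 2020}
procvar = {"year": 2024}
module = {"year": 1999, "mode": "fast"}

print(resolve_rulex4(parent, procvar, module)["year"])   # 2020
print(resolve_factory(parent, procvar, module)["year"])  # 2024
```

With the Rulex 4 order, the `procvar` value for `year` is ignored whenever the parent flow defines the same variable, which is why the option has little effect there; with the Rulex Factory order the explicit `procvar` override wins.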
- In Rulex 4, process variables are calculated before every task, leading to an evaluation that in some situations depends on the running time (execution time) of the current process. In Rulex Factory, to make the whole system more deterministic, the flow variables are evaluated only at the beginning of each computation and then updated only by a Runtime Variables task. This also improves performance. The impact is limited to functions depending on external factors, such as the `currDatetime` function. However, in all the studied cases the use of this function is affected, and not supported, by the Rulex 4 behavior. In Rulex Factory, these cases become more deterministic without any change in the meaningful part of the flow.
- Starting from `1.1.2-21`, the Network Optimizer task has been completely rewritten in Rulex Factory to introduce native Priority management. The modification of the optimization routine may lead to differences compared to previous versions, even with the same input data and options. However, all these differences lead to the same or better cost function values, achieving an overall better optimization.
- Starting from `1.1.2-21`, the Network Optimizer task raises an error both when the Destination value doesn't match the corresponding Destination Quantity value and when the Cost value related to a Source and Destination pair doesn't match.
- Starting from `1.1.2-21`, in the Import from Text File task, when a text delimiter option has been defined, the system recognizes the attributes whose values are enclosed in text delimiters as nominal.
- In the Import from Excel File task, percentage columns defined in MS Excel files are recognized as percentage attributes. In Rulex 4, instead, percentage columns in MS Excel files were imported as continuous attributes.
- Starting from `1.2.1-27`, the Import from Text File task has been modified to fix a general inconsistency of Rulex 4 in the management of empty lines at the beginning of the file. If the `nameline` and `typeline` options are set, they must be modified to match the new empty line count. However, the new empty line count is now aligned with what the user sees in a normal text editor.
- In Rulex 4, neither the soft reset nor the hard reset operation deleted data from the database. In Rulex Factory, the hard reset operation now erases data from the database, while the soft reset operation behaves as in Rulex 4.
- In Rulex 4, when exporting an MS Excel file, continuous values were converted to integers upon export, while in Rulex Factory they are exported keeping the continuous attribute type.
- In Rulex Factory, when a column was cast from continuous to nominal in the Data Manager, the software applied a round function, while if the cast operation was performed in other tasks, it applied a floor function. Starting from Rulex Factory `1.2.1-144`, when a column is converted from continuous to nominal, the software applies a round function, no matter which task performs the cast operation.
- Starting from `1.2.2-12`, the database name is no longer mandatory when setting up a connection to Spark and Databricks.
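To make the round versus floor difference in the continuous-to-nominal cast mentioned above concrete, the following Python lines compare the two functions on a few sample values. This only illustrates the numeric step, not Rulex code; the sample values deliberately avoid exact `.5` ties, since Python's `round` resolves ties to the nearest even integer and the tie rule used by Rulex is not stated in this page.

```python
import math

values = [2.2, 2.7, 3.9]

# Floor always moves down to the nearest integer below the value.
floored = [math.floor(v) for v in values]
# Round moves to the nearest integer.
rounded = [round(v) for v in values]

print(floored)  # [2, 2, 3]
print(rounded)  # [2, 3, 4]
```

Any value whose fractional part is above one half therefore ends up in a different nominal category depending on which function is applied, which is the discrepancy removed from `1.2.1-144` onward.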