The GOLD language

GOLD is Rulex's underlying proprietary language, and it sits behind every operation performed on Rulex Platform.

Rulex Platform is a self-coding platform: it writes GOLD code for every operation performed on the WYSIWYG interface.

However, some operations can be customized by users by writing short pieces of code. In particular:

This page is meant to be a general description of GOLD’s syntax rules, which must be followed when customizing the operations listed above.


GOLD base types

Like all programming languages, GOLD uses variables and constants to define its operations. A variable name in GOLD is a continuous sequence of letters (uppercase or lowercase), separated by __ where necessary.

Important

GOLD is always case-sensitive and 1-based: the first element of every vector or list has index 1.

The base types of GOLD variables and constants are the following:

  • string or nominal

  • binary

  • integer

  • continuous

  • percentage

  • currency

  • date

  • week

  • month

  • quarter

  • datetime

  • time

Only the first four can be directly defined in GOLD code as external constants:

  • Strings are expressed as a sequence of characters enclosed in " (double quotes) or ' (single quotes). Within a string, the quote character can be used if it is escaped with the \ character. The \ character is also the escape character for special characters, such as the \n newline character. A literal \ character must be written as \\ to be encoded correctly.

  • Continuous numbers are expressed by using . as the decimal separator.

    Note

    1 is read as an integer number while 1. is read as a continuous number.

  • Binary values can be inserted by using the following GOLD constants: True or true, False or false.

  • None or null constants represent the missing value; they are the only values in GOLD which can be cast to every GOLD base type.

  • "" or '' empty strings are automatically converted to the None constant by the GOLD interpreter.

The remaining GOLD base types can only be created by using GOLD functions or through an explicit cast. A cast can be performed by using the native function cast or through the constructor function of each GOLD base type, which has the same name as the type itself.

Tip

The native function type returns the type name as a string when applied to any GOLD variable or constant.

On top of these GOLD base types, there are also derived language structures. The following language structures are available:

  • GOLD Groups

  • GOLD Objects

  • GOLD Lists

  • GOLD Dictionaries

  • GOLD Functions

  • GOLD Classes

Groups, functions and classes are more advanced elements: as their use is not needed for Rulex Platform configuration, their description is omitted here.

The use of GOLD functions is described in the dedicated section, without covering the syntax for defining them.

GOLD objects are higher dimensional structures, such as vectors, matrices or tensors, built by combining constants or scalars of the twelve GOLD base types. They are expressed by using square brackets [], and their nesting depth corresponds to their dimension. Some examples are:

  • Homogeneous vectors, such as: [1,2,3] or ["Cleveland", "Minneapolis", "Portland"].

  • Heterogeneous matrices, such as: [[1,2,3],["Cleveland", "Minneapolis", "Portland"], [True, False, True]].

Note

GOLD objects of dimension higher than 1 are always rectangular. If some rows do not have the same number of columns, the shorter ones are extended, and the new positions are filled with None.
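
The rectangular rule can be illustrated with a minimal Python sketch (Python, not GOLD code). The helper name make_rectangular is hypothetical, and padding at the end of each short row is an assumption, since the text only states that the new values are filled with None.

```python
def make_rectangular(rows):
    """Pad shorter rows with None so that every row has the same length."""
    # The target width is the length of the longest row (0 for an empty input).
    width = max((len(row) for row in rows), default=0)
    # Assumption: missing positions are appended at the end of each short row.
    return [list(row) + [None] * (width - len(row)) for row in rows]


ragged = [[1, 2, 3], ["Cleveland", "Minneapolis"], [True]]
print(make_rectangular(ragged))
```

In this model, Python's None plays the role of the GOLD None constant.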

Objects are memory-optimized structures; they are used in all big data operations. Through GOLD groups (which can be created only by using objects as internal terms), they are also the key to Rulex Platform's astonishing performance.

GOLD lists are collectors of variables. They are mainly used to assign a list of variables to a single function input parameter. They are not memory-optimized, but since a list can itself be assigned to a GOLD variable, a list of lists is allowed. Lists are delimited by () round brackets; some examples are:

  • (["Dad", "Grandpa", "Mum"], 1, [["1970-1-23", "2023-2-14"], ["1890", "1670-4-25"]]) aggregates objects of different sizes.

  • (mean, sum, leaf) is a list made of GOLD functions.

GOLD dictionaries are key-value tables which help users in search operations. They are delimited by {} curly brackets, with a colon separating each key from its value, as in {k:v}. Some examples:

  • {"Portland":"Oregon", "Cleveland":"Ohio"}

  • {1:True, 0:False, None:"Undefined"}


GOLD operators

In GOLD there are several operators which allow users to combine the provided constants and variables. In particular:

  • >=: greater than or equal to comparison operator

  • <=: lower than or equal to comparison operator

  • ==: equal to comparison operator

  • !=: not equal to comparison operator

  • is: is comparison operator (see the note below for the difference between it and the == operator)

  • is not or not is: not is comparison operator (see the note below for the difference between it and the != operator)

  • and: logical and operator

  • or: logical or operator

  • not: logical not operator

  • in or isin: in operator to state if a member is contained in an object or a list.

  • +: the binary sum operator

  • -: the binary minus operator

  • *: the binary product operator

  • /: the binary division operator

  • %: the binary modulo operator

  • ^: the binary power operator

  • :: the range operator, allowing users to define objects or lists by indicating their first value, their last value and, optionally, the step between consecutive values. Some examples are:
    • [3:5] = [3,4,5]

    • (2:6:2) = (2,4,6)

    • [5:3:-1] = [5,4,3]
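
The three range examples above suggest that both ends of a range are included. The following Python sketch (not GOLD code) models that reading; the function name gold_range is hypothetical, and the first:last:step argument order is inferred from the examples.

```python
def gold_range(first, last, step=1):
    """Return first, first + step, ... up to and including last."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Python's range excludes its stop value, so move the stop one unit
    # further in the direction of travel to make the last value inclusive.
    stop = last + (1 if step > 0 else -1)
    return list(range(first, stop, step))


print(gold_range(3, 5))      # [3, 4, 5]
print(gold_range(2, 6, 2))   # [2, 4, 6]
print(gold_range(5, 3, -1))  # [5, 4, 3]
```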

Note

None follows a specific algebra when used with binary operators:

  1. If the operation involves strings, None is automatically converted to the "" empty string and treated as the empty string for the rest of the operation.

  2. If the operation involves any other type and one of the two terms is None, the result is None.

  3. The is and not is operators are an exception to the previous rule: since they are binary comparison operators, they return only True or False. This is the main difference between them and the == and != operators, which return None when operating on None.
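
The three rules can be modeled with a short Python sketch (not GOLD code). The function names are hypothetical, the comparison helpers are only meant for non-string operands, and the result of comparing None with None through is (True here) is an assumption not stated in the text.

```python
def gold_add(a, b):
    """Model of the GOLD + operator in the presence of None."""
    if isinstance(a, str) or isinstance(b, str):
        # Rule 1: with strings, None behaves as the empty string.
        return ("" if a is None else a) + ("" if b is None else b)
    if a is None or b is None:
        # Rule 2: with any other type, None propagates to the result.
        return None
    return a + b


def gold_eq(a, b):
    """Model of ==: returns None as soon as one operand is None."""
    if a is None or b is None:
        return None
    return a == b


def gold_is(a, b):
    """Model of is: always returns True or False (rule 3)."""
    if a is None or b is None:
        # Assumption: two missing values are considered the same.
        return a is None and b is None
    return a == b


print(gold_add("Port", None), gold_add(1, None), gold_eq(1, None), gold_is(1, None))
```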


Indexing objects

Indexing operations allow users to select a subset of an object and to reduce its dimension, if necessary. The operator used for indexing is the [] square brackets.

Three types of GOLD structures can be inserted inside the square brackets:

  • scalar integers;

  • vectors of integers or binaries, i.e. homogeneous objects of dimension 1 of integers or binaries;

  • lists made of terms belonging to the previous types.

The effect differs according to the type used. First, users can use binary vectors to perform the selection, which is equivalent to providing the integer indexes of the True positions.

a = ["Iowa", "Wisconsin", "Tennessee"]
a[[True, False, True]] -> a[[1,3]] = ["Iowa", "Tennessee"]

Please note, in the example above, the presence of two nested [] square brackets: one for the indexing operator, and one for the object definition. The 1-based nature of GOLD indexing can also be noticed.

Since binary vectors reduce to integer indexes, the examples in the rest of the section use integers only.
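
The equivalence between a binary vector and its integer indexes can be sketched in Python (not GOLD code). The helper which mirrors the GOLD function of the same name described later in this page, while select is a hypothetical name; both are Python stand-ins written only for illustration.

```python
def which(mask):
    """Return the 1-based positions of the True entries of a binary vector."""
    return [pos for pos, flag in enumerate(mask, start=1) if flag is True]


def select(vec, indexes):
    """Pick the elements of vec at the given 1-based indexes, in that order."""
    return [vec[i - 1] for i in indexes]


a = ["Iowa", "Wisconsin", "Tennessee"]
mask = [True, False, True]
print(which(mask))             # [1, 3]
print(select(a, which(mask)))  # ['Iowa', 'Tennessee']
```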

A scalar index reduces by one the dimension of the object it is applied to. Applied to a vector, it returns a scalar:

a = ["Iowa", "Wisconsin", "Tennessee"]
a[1] = "Iowa"

but

a = ["Iowa", "Wisconsin", "Tennessee"]
a[[1]] = ["Iowa"]

since [1] is a vector object with a single element, while 1 is a scalar.

A vector index selects the subset of the object corresponding to the provided indexes, and it maintains the order of the indexes provided by the user.

a = ["Iowa", "Wisconsin", "Tennessee"]
a[[3,1,2]] = ["Tennessee", "Iowa", "Wisconsin"]

a = [[1,2,3],[4,5,6],[7,8,9]]
a[[1,3]] = [[1,2,3],[7,8,9]]
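
The difference between scalar and vector indexes can be modeled in Python (not GOLD code) as follows. The function name index is hypothetical, and only positive integer indexes are handled in this sketch.

```python
def index(obj, key):
    """1-based indexing of a nested list.

    An integer key returns the selected element, dropping one dimension.
    A list key returns a list of the selected elements, keeping the dimension.
    """
    if isinstance(key, int):
        return obj[key - 1]
    return [obj[i - 1] for i in key]


states = ["Iowa", "Wisconsin", "Tennessee"]
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(index(states, 1))         # Iowa
print(index(states, [1]))       # ['Iowa']
print(index(states, [3, 1, 2])) # ['Tennessee', 'Iowa', 'Wisconsin']
print(index(matrix, [1, 3]))    # [[1, 2, 3], [7, 8, 9]]
```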

Tip

  1. Negative indexes can be used to select all elements except the ones at the provided indexes:

    a = ["Iowa", "Wisconsin", "Tennessee"]
    a[[-1]] = ["Wisconsin", "Tennessee"]

  2. The range operator : can be used to select a slice of the object. The first number and the last number can be omitted: they are then inferred from the object being indexed.

    a = ["Iowa", "Wisconsin", "Tennessee", "Delaware", "Texas", "California"]
    a[[2:4]] = ["Wisconsin", "Tennessee", "Delaware"]
    a[[3:]] = ["Tennessee", "Delaware", "Texas", "California"]
    a[[:5]] = ["Iowa", "Wisconsin", "Tennessee", "Delaware", "Texas"]
    a[[:]] = ["Iowa", "Wisconsin", "Tennessee", "Delaware", "Texas", "California"]
    a[[::2]] = ["Iowa", "Tennessee", "Texas"]
    

    Please note even in this case the presence of two nested [] square brackets: one for the indexing operator, and one for the object definition.
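
The two tips above can be modeled with a Python sketch (not GOLD code). The helper names exclude and range_select are hypothetical, and only positive steps are handled; the defaults for the omitted first and last values are inferred from the examples.

```python
def exclude(vec, negative_indexes):
    """Keep every element except those at the 1-based positions given as negatives."""
    dropped = {-i for i in negative_indexes}
    return [x for pos, x in enumerate(vec, start=1) if pos not in dropped]


def range_select(vec, first=None, last=None, step=1):
    """Select an inclusive 1-based range; omitted ends default to the vector ends."""
    first = 1 if first is None else first
    last = len(vec) if last is None else last
    return [vec[i - 1] for i in range(first, last + 1, step)]


a = ["Iowa", "Wisconsin", "Tennessee", "Delaware", "Texas", "California"]
print(exclude(a[:3], [-1]))     # ['Wisconsin', 'Tennessee']
print(range_select(a, 2, 4))    # ['Wisconsin', 'Tennessee', 'Delaware']
print(range_select(a, step=2))  # ['Iowa', 'Tennessee', 'Texas']
```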

Lists are used to index multiple dimensions in a single operation:

a = [[1,2,3],[4,5,6],[7,8,9]]
a[(1,2)] = 2
a[([1,2],[2,3])] = [[2,3], [5,6]]
a[([:], 3)] = [[3],[6],[9]]

Tip

The list delimiter () can be omitted in the indexing operator:

a = [[1,2,3],[4,5,6],[7,8,9]]
a[1,2] = 2
a[[1,2],[2,3]] = [[2,3], [5,6]]
a[[:], 3] = [[3],[6],[9]]
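
Indexing two dimensions at once can be sketched in Python (not GOLD code). The function name index2d is hypothetical. Only the scalar/scalar and vector/vector cases are modeled: the sketch does not attempt to reproduce the result shape when a vector and a scalar are mixed.

```python
def index2d(matrix, rows, cols):
    """1-based indexing of a nested list along rows and columns in one call."""
    if isinstance(rows, int) and isinstance(cols, int):
        # Two scalars drop both dimensions and return a single value.
        return matrix[rows - 1][cols - 1]
    if isinstance(rows, list) and isinstance(cols, list):
        # Two vectors keep both dimensions and return a sub-matrix.
        return [[matrix[r - 1][c - 1] for c in cols] for r in rows]
    raise NotImplementedError("mixed scalar/vector keys are not modeled in this sketch")


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(index2d(a, 1, 2))            # 2
print(index2d(a, [1, 2], [2, 3]))  # [[2, 3], [5, 6]]
```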

As a final comment, the [] operator is also used to index lists, in precisely the same way as just presented, and to pick the value associated with a particular key in GOLD dictionaries.

l = ("Iowa", mean, 2)
l[[1,3]] = ("Iowa", 2)

d = {"Illinois": "Chicago", "Oregon":"Portland"}
d["Illinois"] = "Chicago"

Calling GOLD functions

Whatever cannot be done with base operators can be done in GOLD through GOLD functions. Rulex Platform provides a whole set of functions which can be used generally and in Data Manager tasks’ formulas. The complete list of functions which can be used in Data Manager tasks is available here. All of them can also be applied to standard objects when working with the parametric option visualization. Here are some important extra base functions, which can be used only outside the Data Manager tasks:

  • dim(vec): it returns the dimension of the object vec.

  • type(var): it returns the type of GOLD variable var as a string.

  • len(vec, dim=None): it returns the length of the object vec at the dimension dim. By default, it returns the most external dimension.

  • which(vecbin): it returns the indexes of the positions of the binary vector object vecbin that are True.

  • sort(vec, dim = 1, ascending = True): it returns the sorted indexes of the vector vec looking at the dimension dim in ascending order if ascending is True or descending order if it is False.

  • tempPath(): it returns the temporary folder of the user’s system.

  • userPath(): it returns the user folder of the user’s system.

  • uuid(): it generates and returns a UUID of the user’s system.

  • print(string = "%", vars = None, tostring = False, tojson = False, quotech = '\"', dateasfunc=False, dateasstring=False): it prints string, replacing each % character with the corresponding content of the list vars, on the standard output, or writes it to a string if tostring is True. The parameters following tostring customize how the various GOLD base types, such as binaries or dates, are printed.

  • flat(matrix): it converts an object of a dimension greater than 1 into a single vector. Similar to the flatten python function.

  • cat(var1, var2...): it concatenates vector objects into a single vector.

  • union(var1, var2...): same as cat, but taking only distinct values.

  • transpose(matrix): it transposes the object matrix of dimension 2.

  • getFunction(name, instance=None): it returns the GOLD function associated with the name name. If instance is provided (a GOLD class instance), the method of the instance called name is returned instead.

  • intersect(var1, var2...): it returns the values present in all the vectors.
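
The behavior described for some of these functions can be approximated with Python stand-ins (not GOLD code, and not the actual implementations). The name sort_indexes is hypothetical and ignores the dim parameter; the order of the values returned by union and intersect (first occurrence) is an assumption.

```python
def flat(matrix):
    """Flatten a nested list of any depth into a single vector."""
    out = []
    for item in matrix:
        if isinstance(item, list):
            out.extend(flat(item))
        else:
            out.append(item)
    return out


def cat(*vectors):
    """Concatenate vectors into a single vector."""
    return [x for vec in vectors for x in vec]


def union(*vectors):
    """Like cat, but each distinct value is kept only once."""
    seen = []
    for x in cat(*vectors):
        if x not in seen:
            seen.append(x)
    return seen


def intersect(*vectors):
    """Return the distinct values present in all the vectors."""
    first, *rest = vectors
    out = []
    for x in first:
        if x not in out and all(x in vec for vec in rest):
            out.append(x)
    return out


def sort_indexes(vec, ascending=True):
    """Return the 1-based indexes that put vec in sorted order."""
    positions = range(1, len(vec) + 1)
    return sorted(positions, key=lambda i: vec[i - 1], reverse=not ascending)


print(flat([[1, 2], [3, 4]]))        # [1, 2, 3, 4]
print(union([1, 2], [2, 3]))         # [1, 2, 3]
print(intersect([1, 2, 3], [2, 3]))  # [2, 3]
print(sort_indexes([30, 10, 20]))    # [2, 3, 1]
```

Note that the sort stand-in returns indexes rather than sorted values, following the description of sort above.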

GOLD functions can be called by using positional parameters or keywords:

  1. Positional parameters are expressed by inserting the variable in the position it has in the function definition.

  2. Keywords are expressed as <name-parameter>=<value-provided>.

Unpacking parameters

Parameters can be unpacked in a GOLD function call by using lists and dictionaries. The unpacking syntax is highlighted in the following code block:

l = ([1,2,3], [4,5,6], [7,8,9])
cat(*l) = [1,2,3,4,5,6,7,8,9]

d = {"vec": [1,2,3], "ascending": False}
sort(**d) = [3,2,1]

In the first example, the list is unpacked by putting its first member in the first positional parameter, its second member in the second one, and so on. In the second example, the dictionary is unpacked by assigning the value [1,2,3] to the parameter vec and the value False to the parameter ascending.

Tip

If the forms *args or **kwargs are present in a GOLD function definition, the function can be called with an arbitrary number of positional parameters or with an arbitrary number of keywords. Internally, they are treated as a List named args or a Dictionary named kwargs. This mechanism is referred to as the GOLD function packing feature.
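
Python uses the same * and ** notation, so unpacking and packing can be illustrated with a Python sketch (not GOLD code). The function show_call is hypothetical: it only returns what it receives, which makes the equivalence between an unpacked call and an explicit call visible. In Python the packed positional parameters form a tuple, which here plays the role of the GOLD List.

```python
def show_call(*args, **kwargs):
    """Packing: positionals are collected in args, keywords in kwargs."""
    return args, kwargs


l = ([1, 2, 3], [4, 5, 6])
d = {"vec": [1, 2, 3], "ascending": False}

# Unpacking a list is the same as writing its members as positional parameters.
print(show_call(*l) == show_call([1, 2, 3], [4, 5, 6]))               # True

# Unpacking a dictionary is the same as writing its entries as keywords.
print(show_call(**d) == show_call(vec=[1, 2, 3], ascending=False))    # True
```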


Rulex Platform Attributes, Variables and Cells in GOLD

In many places, users want to refer to Code or Vault Rulex Platform variables in their GOLD code parametrization. These variables are stored in GOLD as global variables, whose names are written by adding the prefix @ to the standard GOLD variable name.

Therefore, in any GOLD code written in Rulex Platform users can perform operations using Rulex Platform variables, by simply writing @<variable-name>.

Moreover, in the Data Manager formula toolbar, users need an easy way to refer to dataset attributes in the Data tab or to sheet cells in the Sheet tab.

To refer to attributes, users can use the $<name-attribute> or the $$<name-attribute> shortcuts. To refer to cells, users can use the #<name-cell> or the ##<name-cell> shortcuts. In both cases, the GOLD behavior of the two syntaxes is the same, as the difference is only at the Rulex Platform interface level.

The attribute’s name is expressed as a standard string. Cells can be expressed using one of the following formats:

  • Microsoft Excel-like: A1.

  • Row-column form: R1C1.
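
The relation between the two cell formats can be illustrated with a Python sketch (not GOLD code). The function name a1_to_r1c1 is hypothetical, and the sketch assumes the usual spreadsheet conventions: letters denote the column (A = 1, Z = 26, AA = 27), and the number denotes the row.

```python
import re


def a1_to_r1c1(ref):
    """Convert an Excel-like reference such as 'B3' to the row-column form 'R3C2'."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not an A1-style reference: {ref!r}")
    letters, row = match.groups()
    # Column letters are read as a base-26 number with digits A=1 ... Z=26.
    col = 0
    for ch in letters.upper():
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return f"R{row}C{col}"


print(a1_to_r1c1("A1"))  # R1C1
print(a1_to_r1c1("B3"))  # R3C2
```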

Warning

These shortcuts are available only in the Data Manager formula toolbar. They cannot be used in the parametric option visualization.

These shortcuts do not automatically return the content of the column or of the cell as standard GOLD objects. A dedicated GOLD function needs to be called on them to convert them into standard GOLD objects. The function is:

  • extractSymbol(shortcut): it returns the shortcut expression as a standard GOLD object representing the column vector or the scalar cell.

This function is called automatically in the background for all the Data Manager functions listed in this page. However, users who want to call one of the functions of the previous list on one of these shortcuts need to call extractSymbol explicitly.

Shortcuts can be composed to form lists or ranges. However, since they are not of a GOLD base type, the result will always be a GOLD list. Use the extractSymbol function described above to convert the list into the final GOLD object. Examples are:

($"age", $"workclass")
($"age":$"workclass")
(#"A1", #"B3")
(#"A1": #"B3")

Please note the () round brackets, which define the final GOLD list object.


Condition and Rule Parser

In the Rulex Platform Rule Manager and in the Query Manager of the Data Manager, users have to write rule conditions. Rules and conditions are critical in Rulex Platform, so effort has been made to make their syntax simpler for end users and more effective.

This means that conditions are not expressed directly in GOLD, but in a slightly different syntax which simplifies their writing and their behavior.

This section highlights the main syntax differences between Rulex Platform condition language and GOLD language:

  1. Quotes around strings in conditions can be omitted if no special characters are present and if the string does not conflict with any attribute name of the underlying structure. They must be used in ambiguous situations.

  2. Quotes and $ shortcuts around an attribute name in a condition can be omitted if no special characters are present and the name coincides with an attribute actually present in the underlying structure. They must be used in ambiguous situations.

  3. Vectors can be expressed with the {} delimiter as well as with the standard [] delimiter.

  4. The == and is operators are equivalent in conditions: both return False for None. The != and not is operators remain distinct. The reason is that the difference between the first two is meaningless in query execution, where None is equivalent to False.

  5. The >= and <= operators behave as the or operator applied to > (respectively <) and the is operator, for the same reason as in the previous point.

  6. Functions can be applied to attribute names with the syntax $"month(<attribute-name>)" instead of the standard month($"<attribute-name>"). This permits the omission of quotes and $ even when a simple function is applied.
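
Points 4 and 5 can be modeled with a Python sketch (not GOLD code and not the Rulex condition parser). The function names cond_is and cond_ge are hypothetical, and treating two missing values as matching is an assumption not stated in the text.

```python
def cond_is(a, b):
    """Condition-style == / is: the result is always True or False."""
    if a is None or b is None:
        # Assumption: a missing value matches only another missing value.
        return a is None and b is None
    return a == b


def cond_ge(a, b):
    """Condition-style >=: (a > b) or (a is b), where a None comparison counts as False."""
    greater = False if a is None or b is None else a > b
    return greater or cond_is(a, b)


ages = [5, None, 7, 6]
# A missing value never satisfies the comparison, so its row is filtered out.
print([x for x in ages if cond_ge(x, 6)])  # [7, 6]
```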