Math and trigonometric functions

Math and trigonometric functions perform mathematical operations on the selected columns.

The following functions are available:


abs

The abs function returns the absolute value of each value of the column.

Parameters

abs(column)

Parameter

Description

column

The attribute used to evaluate its absolute value. Its type must be numeral. The column parameter is mandatory.

Example - abs(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to convert the loudness attribute’s values to their absolute value.

  • Add a new column to the dataset and type the following formula:abs($"loudness").

  • As you can see, the column has been filled with the absolute value of the loudness attribute.


acos

The acos function returns the arccosine value of each value of the column.

Parameters

acos(column)

Parameters

Description

column

The attribute from which we want to retrieve the arccosine value. Its type must be numeral. The column parameter is mandatory.

Example - acos(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to calculate the arccosine of the valence attribute.

  • Add a new column, and type the following formula: acos($"valence").

  • As you can see, the column has been filled with the arccosine values of the valence attribute’s values.


acosh

The acosh function returns the hyperbolic arccosine of each value of the column.

Parameters

acosh(column)

Parameter

Description

column

The attribute from which we want to retrieve the hyperbolic arccosine value. Its type must be numeral. The column parameter is mandatory.

Example - acosh(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to calculate the hyperbolic arccosine of the tempo attribute.

  • Add a new column, and type the following formula: acosh($"tempo").

  • As you can see, the column has been filled with the hyperbolic arccosine values of the tempo attribute’s values.


asin

The asin function returns the arcsine of each value of the column.

Parameters

asin(column)

Parameters

Description

column

The attribute from which we want to retrieve the arccosine value. Its type must be numeral. The column parameter is mandatory.

Example - asin(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to retrieve the arcsine of each value of the valence column.

  • Add a new column, and type the following formula: asin($"valence").

  • As you can see, the column has been filled with the arcsine values of the valence attribute’s values.


asinh

The asinh function returns the hyperbolic arcsine of each value of the column.

Parameters

asinh(column)

Parameter

Description

column

The attribute from which we want to retrieve the arcsine value. Its type must be numeral. The column parameter is mandatory.

Example - asinh(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to retrieve the hyperbolic arcsine of each value of the valence attribute.

  • Add a new column, and type the following formula: asinh($"valence").

  • As you can see, the column has been filled with the hyperbolic arcsine values of the valence attribute.


atan

The atan function returns the arctangent of each value of the column.

Parameters

atan(column)

Parameter

Description

column

The attribute from which we want to retrieve the arctangent value. Its type must be numeral. The column parameter is mandatory.

Example - atan(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to calculate the arctangent of the valence attribute.

  • Add a new attribute, and type the following formula: atan($"valence").

  • As you can see, the column has been filled with the arctangent values of the valence attribute’s values.


atanh

The atanh function returns the hyperbolic arctangent of each value of the column.

Parameters

atanh(column)

Parameter

Description

column

The column from which we want to retrieve the hyperbolic arctangent values. Its type must be numeral. The column parameter is mandatory.

Example - atanh(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to calculate the hyperbolic arctangent of the valence column.

  • Add a new attribute, then type the following formula: atanh($"valence").

  • As you can see, the column has been filled with the hyperbolic arctangent values of the valence attribute’s values.


baseConv

The baseConv function converts a base 10 integer, or a string that corresponds to an integer, to a different base. Optional parameters allow the user to apply a 2-complement code (if set to True) on the input and/or on the output value.

Parameters

baseConv(column, basein, baseout, compflagin, compflagout)

Parameter

Description

column

The attribute to be converted. Its type must be numeral. The column parameter is mandatory.

basein

The column’s base. The basein parameter is mandatory.

baseout

The result’s base. The baseout parameter is mandatory.

compflagin

If it is False, or not specified, a 2-complement code is not applied to the original column, while if it is True, a 2-complement code is applied to the original column.

compflagout

If it is False, or not specified, a 2-complement code is not applied to the result, while if it is True, a 2-complement code is applied to the original column.

Example - baseConv(column, basein, baseout)

The following example uses the MT functions dataset.

  • In this example, we want to change the base (base-10) of the Var_1 attribute’s values to base-2 values.

  • In the Var_2 attribute, type the following formula: baseConv($"Var_1",10,2).

  • The values in Var_1 have been converted in binary form and are displayed in Var_2.


ceil

The ceil function returns each value of the column, rounded up to its next bigger integer unit.

Parameters

ceil(column)

Parameter

Description

column

The attribute to round up. Its type must be numeral. The column parameter is mandatory.

Example - ceil(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to round up the data in the tempo column.

  • Add a new attribute, then type the following formula: ceil($"Tempo")


cos

The cos function returns the cosine of each row of the column.

Parameters

cos(column)

Parameter

Description

column

The attribute used to evaluate its cosine. Its type must be numeral. The column parameter is mandatory.

Example - cos(column)

The following example uses the Bike Sales dataset.

  • In this example, we want to retrieve the cosine of the Revenue attribute.

  • Add a new column, and type the following formula: cos($"Revenue")

  • As you can see, the attribute has been filled with the cosine of the Revenue column’s values.


cosh

The cosh function returns the hyperbolic cosine of each row of the column.

Parameters

cosh(column)

Parameter

Description

column

The attribute from which we want to retrieve the hyperbolic cosine. Its type must be numeral. The column parameter is mandatory.

Example - cosh(column)

The following example uses the Bike Sales dataset.

  • In this example, we want to retrieve the hyperbolic cosine of the Revenue attribute.

  • Add a new column, and type the following formula: cosh($"Revenue")

  • As you can see, the attribute has been filled with the hyperbolic cosine of the Revenue attribute’s values.


cumProd

The cumProd function returns the cumulative product of the column, evaluated within groups defined by the group parameter if required.

Parameters

cumProd(column, group)

Parameter

Description

column

The attribute used to evaluate its cumulative product. Its type must be numeral. The column parameter is mandatory.

group

It allows you to group the results by a certain attribute’s values. The group parameter can also be a list of attributes.

Example - cumprod(column)

The following example uses the Adult dataset.

  • In this example, we want to retrieve the cumulative product of the education-num attribute.

  • Add a new attribute, then type the following formula: cumProd($"education-num")

  • Rulex Platform attributes’ memory supports numbers till 5 billion of billions (namely, 5.000.000.000.000.000.000). If the cumProd value exceeds this amount, then negative values or not logical values are returned.


cumSum

The cumSum function returns the cumulative sum of the column, which is a sequence of partial sums of all the values in the rows before the current one, evaluated within groups defined by the group parameter if required.

Parameters

cumSum(column, group)

Parameter

Description

column

The attribute used to evaluate the cumulative sum. Its type must be numeral. The column parameter is mandatory.

group

It allows you to group the results by a certain column’s values.

Example - cumsum(column)

The following example uses the Bike Sales dataset.

  • In this example, we want to calculate the cumulative sum of the Revenue attribute.

  • Add a new column and type the following formula: cumSum($"Revenue")

  • Moving on during our analysis, we would like to group the cumulative sum by the Country values, so the formula will become: cumSum($"Revenue",$"Country")

  • The results can be read as follows: * The cumulative sum of the Revenue in Canada up to row 1 is 950. * The cumulative sum of the Revenue in Canada up to row 2 is 1900. * The cumulative sum of the Revenue in Australia up to row 3 is 2401, the same value of Revenue, as it is the first occurrence for the Australia group. * The cumulative sum of the Revenue in Australia up to row 4 (second occurrence of the Australia group) is 4489.


exp

The exp function returns the exponential of each value of the column.

Parameters

exp(column)

Parameter

Description

column

The attribute used to evaluate the exponential. Its type must be numeral. The column parameter is mandatory.

Example - exp(column)

The following example uses the Bike Sales dataset.

  • In this example, we want to retrieve the exponential of the Day attribute.

  • Add a new attribute, then type the following formula: exp($"Day")

  • As you can see, the column has been filled with the exponential in the corresponding row.

  • Rulex Platform attributes’ memory supports numbers till 5 billion of billions (namely, 5.000.000.000.000.000.000). If the exp value exceeds this amount, then negative values or not logical values are returned.


floor

The floor function returns each value of the column, rounded down to the next smaller integer unit of the column.

Parameters

floor(column)

Parameter

Description

column

The attribute to round down. Its type must be numeral. The column parameter is mandatory.

Example - floor(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to round down the Tempo attribute.

  • Add a new column, and write the following formula: floor($"tempo")


isInteger

The isInteger function checks whether the data type of the attribute specified in the string parameter is an integer.

Parameters

isInteger(string, binary)

Parameter

Description

string

The nominal attribute to be tested. The string parameter is mandatory.

binary

If the optional binary parameter is True, or not specified, results will be provided as boolean values (True / False), while if it is False, results will be provided in binary form (1 / 0). The binary parameter values (True / False) are case-sensitive.

Note

Integer type valid formats (if the values aren’t this format, the function will return a False or 0).
  • 45;

  • -44

Example - isInteger(column)

The following example uses the Bike Sales dataset.

  • In this example, we want to check if the Age_Group attribute is an Integer attribute.

  • We add a new attribute and write the following formula: isInteger($"Age_Group")

  • We didn’t specify the binary parameter, as we want our results to be displayed in boolean values.


log

The log function returns the natural logarithm of each row of the column.

Parameters

log( column )

Parameter

Description

column

The attribute used to evaluate the natural logarithm. Its type must be numeral.

Example - log(column)

The following example uses the Bike Sales dataset.

  • In this example, we want to retrieve the logarithm of the Customer_Age attribute.

  • Add a new attribute, then type the following formula: log($"Customer_Age")

  • As you can see, the attribute has been filled with the logarithm of the Customer_Age attribute’s values.


log10

The log10 function returns the logarithm (to the base 10) of each row of the column.

Parameters

log10(column)

Parameter

Description

column

The attribute used to evaluate the logarithm to the base 10. Its type must be numeral. The column parameter is mandatory.

Example - log10(column)

The following example uses the Bike Sales dataset.

  • In this example, we want to calculate the logarithm to the base 10 of the Customer_Age attribute.

  • Add a new attribute, then type the following formula: log10($"Customer_Age")


prod

The prod function returns the product of the column, evaluated within groups defined by the group parameter if required.

Parameters

prod(column, group)

Parameter

Description

column

The attribute used to evaluate the product. Multiple attributes can be specified, as long as they are enclosed in double brackets, i.e. column((column1, column2)). The column parameter is mandatory.

group

The attribute by which you want to group your results. Multiple attributes can be specified, as long as they are enclosed in double brackets, i.e. prod(column, (group1,group2)).

Example - prod(column)

The following example uses the MT functions dataset.

  • In this example, we want to evaluate the product of all the values within the Var_2 attribute, grouped by the Var_1 attribute’s values.

  • Add a new attribute, then type the following formula: prod($"Var_2",$"Var_1")

  • The results can be read as follows: * For 1 as Var_1 value, the product of Var_2 is 70368744177664 * For 2 as Var_1 value, the product of Var_2 is 18014398509481984


rand

The rand function returns a random column with the specified number of elements. If the number of elements is specified, a random column is created with n (n=number of rows) elements.

Parameters

rand(n,seed)

Parameter

Description

n

The number of rows to fill with random values. Its default value corresponds to the number of rows of the dataset.

seed

A base value, used to generate random values. It makes the random generation deterministic.

Example - rand(n, seed)

  • In this example, we want to add a new column and to fill only 10 rows of it, with 5 as the base value.

  • Add a new column, then type the following formula: rand(10,5)


randGauss

The randGauss function returns a normally distributed random column with the specified number of elements. If the number of elements is specified, a random column with n (=number of rows) elements is created.

Parameters

randGauss(n, seed, mean, stddev)

Parameter

Description

n

The number of rows to fill with random values. Its default value corresponds to the number of rows of the dataset.

seed

A base value, used to generate random values. It makes the random generation deterministic.

mean

0 as default, it is the normal distribution’s mean.

stddev

1 as default, it is the normal distribution’s standard deviation.

Example - randGauss(n, seed)

  • In this example, we want to add a new column and to fill the first 20 rows with a seed of 2, using the randGauss function.

  • Add a new column and type the following formula: randGauss(20,2)

  • No mean and stddev have been specified.


round

The round function returns each value of the column, of continuous type, rounded up or down, depending on their last digits:

  • If the last digit considered is less than 5, then the value is rounded down;

  • If the last digit considered is more than or equal to 5, then the value is rounded up.

Parameters

round(column)

Parameter

Description

column

The attribute used to evaluate the nearest integer value. Its type must be numeral. The column parameter is mandatory.

Example - round(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to add a column which contains the round of the writing score attribute’s values.

  • Add a new column and type the following formula: round($"tempo")


sign

The sign function returns the sign of each value of the column.

  • -1 is returned if the value is negative, while

  • 1 is returned if the value is positive.

Parameters

sign(column)

Parameter

Description

column

The attribute used to evaluate the sign. Its type must be numeral or nominal. The column parameter is mandatory.

Example - sign(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to retrieve the sign of the loudness attribute.

  • Add a new attribute and type the following formula: sign($"loudness")

  • 1 is returned if the loudness value sign in the same row is positive

  • -1 is returned if the loudness value sign in the same row is negative


sin

The sin function returns the sine of each row of the column.

Parameters

sin(column)

Parameter

Description

column

The attribute used to evaluate the sine. Its type must be numeral. The column parameter is mandatory.

Example - sin(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to fill a new column with the sine of the key attribute.

  • Add a new column, then type the following formula: sin($"key")


sinh

The sinh function returns the hyperbolic sine of each row of the column.

Parameters

sinh(column)

Parameter

Description

column

The attribute used to retrieve the hyperbolic sine. Its type must be numeral. The column parameter is mandatory.

Example - sinh(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to fill a new column with the hyperbolic sine of the key attribute.

  • Add a new column, then type the following formula: sinh($"key")


sqrt

The sqrt function returns the square root of each value of the column.

Parameters

sqrt(column)

Parameter

Description

column

The attribute used to evaluate the square root. Its type must be numeral.

Example - sqrt(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to fill a new column with the square root of the tempo attribute.

  • Add a new column, then write the following formula: sqrt($"tempo")


sum

The sum function returns the sum of the column, evaluated within groups defined by the group parameter if required.

Parameters

sum(column, group)

Parameter

Description

column

The attribute used to evaluate the sum. Multiple attributes can be specified, as long as they are enclosed in double brackets, i.e. column((column1, column2)). In this occurrence, the sum will be calculated by row. The column parameter is mandatory.

group

The attribute by which you want to group your results. If this parameter is a list, the column parameter can’t be a list.

Example - sum(column)

The following example uses the Students Performance dataset.

  • In this example, we want to sum the values of the writing score attribute.

  • Add a new attribute, then type the following formula: sum($"writing score")

  • The attribute has been filled with the sum of all the values in the writing score attribute.

Example - sum(column, group)

The following example uses the Students Performance dataset.

  • In this example, we want our results to be grouped by the test preparation course attribute’s values.

  • The formula becomes: sum($"writing score",$"test preparation course")

  • The results can be read as follows: * The sum of the writing score for those who took no preparation course (’None‘) is 41412. * The sum of the writing score for those who completed the preparation course is 26642.

Example - sum((column1, column2))

The following example uses the Students Performance dataset.

  • When the column value is a list, the formula considers the single rows of the attributes, so it sums considering the chosen attributes in rows.

  • For example, if we want to know the sum of the math score, writing score, reading score of the student in row 1, the formula becomes:sum(($"math score",$"writing score",$"reading score"))

  • The results can be read as follows: * The student in row 1, female, group B, parental level of education bachelor’s degree, standard lunch and no test preparation course has an overall score of 218. * The student in row 4, male, group A, parental level of education associate’s degree, free/reduced lunch and no test preparation course has an overall score of 148.


tan

The tan function returns the tangent of each row of the column.

Parameters

tan(column)

Parameter

Description

column

The attribute used to evaluate the tangent. Its type must be numeral. The column parameter is mandatory.

Example - tan(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to calculate the tangent of the tempo attribute.

  • Add a new column, and type the following formula: tan($"tempo")


tanh

The tanh function returns the hyperbolic tangent of each row of the column.

Parameters

tanh(column)

Parameter

Description

column

The attribute used to evaluate the hyperbolic tangent. Its type must be numeral. The column parameter is mandatory.

Example - tanh(column)

The following example uses the Top Hits Spotify dataset.

  • In this example, we want to retrieve the hyperbolic tangent of the key attribute.

  • Add a new attribute, then type the following formula: tanh($"key")