Math and trigonometric functions¶
Math and trigonometric functions perform mathematical operations on the selected columns.
The following functions are available:
abs¶
The abs function returns the absolute value of each value of the column.
Parameters
abs(column)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate its absolute value. Its type must be numeral. The column parameter is mandatory. |
Example - abs(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to convert the loudness attribute’s values to their absolute value.
Add a new column to the dataset and type the following formula:
abs($"loudness")
.As you can see, the column has been filled with the absolute value of the loudness attribute.
acos¶
The acos function returns the arccosine value of each value of the column.
Parameters
acos(column)
Parameters |
Description |
---|---|
column |
The attribute from which the arccosine value will be retrieved. Its type must be numeral. The column parameter is mandatory. |
Example - acos(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to calculate the arccosine of the valence attribute.
Add a new column, and type the following formula:
acos($"valence")
.As you can see, the column has been filled with the arccosine values of the valence attribute’s values.
acosh¶
The acosh function returns the hyperbolic arccosine of each value of the column.
Parameters
acosh(column)
Parameter |
Description |
---|---|
column |
The attribute from which the hyperbolic arccosine value will be retrieved. Its type must be numeral. The column parameter is mandatory. |
Example - acosh(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to calculate the hyperbolic arccosine of the tempo attribute.
Add a new column, and type the following formula:
acosh($"tempo")
.As you can see, the column has been filled with the hyperbolic arccosine values of the tempo attribute’s values.
asin¶
The asin function returns the arcsine of each value of the column.
Parameters
asin(column)
Parameters |
Description |
---|---|
column |
The attribute from which the arccosine value will be retrieved. Its type must be numeral. The column parameter is mandatory. |
Example - asin(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to retrieve the arcsine of each value of the valence column.
Add a new column, and type the following formula:
asin($"valence")
.As you can see, the column has been filled with the arcsine values of the valence attribute’s values.
asinh¶
The asinh function returns the hyperbolic arcsine of each value of the column.
Parameters
asinh(column)
Parameter |
Description |
---|---|
column |
The attribute from which the arcsine value will be retrieved. Its type must be numeral. The column parameter is mandatory. |
Example - asinh(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to retrieve the hyperbolic arcsine of each value of the valence attribute.
Add a new column, and type the following formula:
asinh($"valence")
.As you can see, the column has been filled with the hyperbolic arcsine values of the valence attribute.
atan¶
The atan function returns the arctangent of each value of the column.
Parameters
atan(column)
Parameter |
Description |
---|---|
column |
The attribute from which the arctangent value will be retrieved. Its type must be numeral. The column parameter is mandatory. |
Example - atan(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to calculate the arctangent of the valence attribute.
Add a new attribute, and type the following formula:
atan($"valence")
.As you can see, the column has been filled with the arctangent values of the valence attribute’s values.
atanh¶
The atanh function returns the hyperbolic arctangent of each value of the column.
Parameters
atanh(column)
Parameter |
Description |
---|---|
column |
The column from which the hyperbolic arctangent values will be retrieved. Its type must be numeral. The column parameter is mandatory. |
Example - atanh(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to calculate the hyperbolic arctangent of the valence column.
Add a new attribute, then type the following formula:
atanh($"valence")
.As you can see, the column has been filled with the hyperbolic arctangent values of the valence attribute’s values.
baseConv¶
The baseConv function converts a base 10 integer, or a string that corresponds to an integer, to a different base. Optional parameters allow the user to apply a 2-complement code (if set to True) on the input and/or on the output value.
Parameters
baseConv(column, basein, baseout, compflagin, compflagout)
Parameter |
Description |
---|---|
column |
The attribute to be converted. Its type must be numeral. The column parameter is mandatory. |
basein |
The column’s base. The basein parameter is mandatory. |
baseout |
The result’s base. The baseout parameter is mandatory. |
compflagin |
If it is |
compflagout |
If it is |
Example - baseConv(column, basein, baseout)
The following example uses the MT functions dataset.
In this example, we want to change the base (base-10) of the Var_1 attribute’s values to base-2 values.
In the Var_2 attribute, type the following formula:
baseConv($"Var_1",10,2)
.The values in Var_1 have been converted in binary form and are displayed in Var_2.
ceil¶
The ceil function returns each value of the column, rounded up to its next bigger integer unit.
Parameters
ceil(column)
Parameter |
Description |
---|---|
column |
The attribute to round up. Its type must be numeral. The column parameter is mandatory. |
Example - ceil(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to round up the data in the tempo column.
Add a new attribute, then type the following formula:
ceil($"Tempo")
cos¶
The cos function returns the cosine of each row of the column.
Parameters
cos(column)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate its cosine. Its type must be numeral. The column parameter is mandatory. |
Example - cos(column)
The following example uses the Bike Sales dataset.
In this example, we want to retrieve the cosine of the Revenue attribute.
Add a new column, and type the following formula:
cos($"Revenue")
As you can see, the attribute has been filled with the cosine of the Revenue column’s values.
cosh¶
The cosh function returns the hyperbolic cosine of each row of the column.
Parameters
cosh(column)
Parameter |
Description |
---|---|
column |
The attribute from which we want to retrieve the hyperbolic cosine. Its type must be numeral. The column parameter is mandatory. |
Example - cosh(column)
The following example uses the Bike Sales dataset.
In this example, we want to retrieve the hyperbolic cosine of the Revenue attribute.
Add a new column, and type the following formula:
cosh($"Revenue")
As you can see, the attribute has been filled with the hyperbolic cosine of the Revenue attribute’s values.
cumProd¶
The cumProd function returns the cumulative product of the column, evaluated within groups defined by the group parameter if required.
Parameters
cumProd(column, group)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate its cumulative product. Its type must be numeral. The column parameter is mandatory. |
group |
It allows you to group the results by a certain attribute’s values. The group parameter can also be a list of attributes. |
Example - cumprod(column)
The following example uses the Adult dataset.
In this example, we want to retrieve the cumulative product of the education-num attribute.
Add a new attribute, then type the following formula:
cumProd($"education-num")
Rulex Platform attributes’ memory supports numbers till 5 billion of billions (namely, 5.000.000.000.000.000.000). If the cumProd value exceeds this amount, then negative values or not logical values are returned.
cumSum¶
The cumSum function returns the cumulative sum of the column, which is a sequence of partial sums of all the values in the rows before the current one, evaluated within groups defined by the group parameter if required.
Parameters
cumSum(column, group)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the cumulative sum. Its type must be numeral. The column parameter is mandatory. |
group |
It allows you to group the results by a certain column’s values. |
Example - cumsum(column)
The following example uses the Bike Sales dataset.
In this example, we want to calculate the cumulative sum of the Revenue attribute.
Add a new column and type the following formula:
cumSum($"Revenue")
Moving on during our analysis, we would like to group the cumulative sum by the Country values, so the formula will become:
cumSum($"Revenue",$"Country")
The results can be read as follows: * The cumulative sum of the Revenue in Canada up to row 1 is 950. * The cumulative sum of the Revenue in Canada up to row 2 is 1900. * The cumulative sum of the Revenue in Australia up to row 3 is 2401, the same value of Revenue, as it is the first occurrence for the Australia group. * The cumulative sum of the Revenue in Australia up to row 4 (second occurrence of the Australia group) is 4489.
exp¶
The exp function returns the exponential of each value of the column.
Parameters
exp(column)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the exponential. Its type must be numeral. The column parameter is mandatory. |
Example - exp(column)
The following example uses the Bike Sales dataset.
In this example, we want to retrieve the exponential of the Day attribute.
Add a new attribute, then type the following formula:
exp($"Day")
As you can see, the column has been filled with the exponential in the corresponding row.
Rulex Platform attributes’ memory supports numbers till 5 billion of billions (namely, 5.000.000.000.000.000.000). If the exp value exceeds this amount, then negative values or not logical values are returned.
floor¶
The floor function returns each value of the column, rounded down to the next smaller integer unit of the column.
Parameters
floor(column)
Parameter |
Description |
---|---|
column |
The attribute to round down. Its type must be numeral. The column parameter is mandatory. |
Example - floor(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to round down the Tempo attribute.
Add a new column, and write the following formula:
floor($"tempo")
isInteger¶
The isInteger function checks whether the data type of the attribute specified in the string parameter is an integer.
Parameters
isInteger(string, binary)
Parameter |
Description |
---|---|
string |
The nominal attribute to be tested. The string parameter is mandatory. |
binary |
If the optional binary parameter is |
Note
- Integer type valid formats (if the values aren’t this format, the function will return a
False
or0
). 45;
-44
Example - isInteger(column)
The following example uses the Bike Sales dataset.
In this example, we want to check if the Age_Group attribute is an Integer attribute.
We add a new attribute and write the following formula:
isInteger($"Age_Group")
We didn’t specify the binary parameter, as we want our results to be displayed in boolean values.
log¶
The log function returns the natural logarithm of each row of the column.
Parameters
log( column )
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the natural logarithm. Its type must be numeral. |
Example - log(column)
The following example uses the Bike Sales dataset.
In this example, we want to retrieve the logarithm of the Customer_Age attribute.
Add a new attribute, then type the following formula:
log($"Customer_Age")
As you can see, the attribute has been filled with the logarithm of the Customer_Age attribute’s values.
log10¶
The log10 function returns the logarithm (to the base 10) of each row of the column.
Parameters
log10(column)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the logarithm to the base 10. Its type must be numeral. The column parameter is mandatory. |
Example - log10(column)
The following example uses the Bike Sales dataset.
In this example, we want to calculate the logarithm to the base 10 of the Customer_Age attribute.
Add a new attribute, then type the following formula:
log10($"Customer_Age")
prod¶
The prod function returns the product of the column, evaluated within groups defined by the group parameter if required.
Parameters
prod(column, group)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the product. Multiple attributes can be specified, as long as they are enclosed in double brackets, i.e. |
group |
The attribute by which you want to group your results. Multiple attributes can be specified, as long as they are enclosed in double brackets, i.e. |
Example - prod(column)
The following example uses the MT functions dataset.
In this example, we want to evaluate the product of all the values within the Var_2 attribute, grouped by the Var_1 attribute’s values.
Add a new attribute, then type the following formula:
prod($"Var_2",$"Var_1")
The results can be read as follows: * For 1 as Var_1 value, the product of Var_2 is 70368744177664 * For 2 as Var_1 value, the product of Var_2 is 18014398509481984
rand¶
The rand function returns a random column with the specified number of elements. If the number of elements is specified, a random column is created with n (n=number of rows) elements.
Parameters
rand(n,seed)
Parameter |
Description |
---|---|
n |
The number of rows to fill with random values. Its default value corresponds to the number of rows of the dataset. |
seed |
A base value, used to generate random values. It makes the random generation deterministic. |
Example - rand(n, seed)
In this example, we want to add a new column and to fill only 10 rows of it, with 5 as the base value.
Add a new column, then type the following formula:
rand(10,5)
randGauss¶
The randGauss function returns a normally distributed random column with the specified number of elements. If the number of elements is specified, a random column with n (=number of rows) elements is created.
Parameters
randGauss(n, seed, mean, stddev)
Parameter |
Description |
---|---|
n |
The number of rows to fill with random values. Its default value corresponds to the number of rows of the dataset. |
seed |
A base value, used to generate random values. It makes the random generation deterministic. |
mean |
0 as default, it is the normal distribution’s mean. |
stddev |
1 as default, it is the normal distribution’s standard deviation. |
Example - randGauss(n, seed)
In this example, we want to add a new column and to fill the first 20 rows with a seed of 2, using the randGauss function.
Add a new column and type the following formula:
randGauss(20,2)
No mean and stddev have been specified.
round¶
The round function returns each value of the column, of continuous type, rounded up or down, depending on their last digits:
If the last digit considered is less than 5, then the value is rounded down;
If the last digit considered is more than or equal to 5, then the value is rounded up.
Parameters
round(column)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the nearest integer value. Its type must be numeral. The column parameter is mandatory. |
Example - round(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to add a column which contains the round of the writing score attribute’s values.
Add a new column and type the following formula:
round($"tempo")
sign¶
The sign function returns the sign of each value of the column.
-1 is returned if the value is negative, while
1 is returned if the value is positive.
Parameters
sign(column)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the sign. Its type must be numeral or nominal. The column parameter is mandatory. |
Example - sign(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to retrieve the sign of the loudness attribute.
Add a new attribute and type the following formula:
sign($"loudness")
1 is returned if the loudness value sign in the same row is positive
-1 is returned if the loudness value sign in the same row is negative
sin¶
The sin function returns the sine of each row of the column.
Parameters
sin(column)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the sine. Its type must be numeral. The column parameter is mandatory. |
Example - sin(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to fill a new column with the sine of the key attribute.
Add a new column, then type the following formula:
sin($"key")
sinh¶
The sinh function returns the hyperbolic sine of each row of the column.
Parameters
sinh(column)
Parameter |
Description |
---|---|
column |
The attribute used to retrieve the hyperbolic sine. Its type must be numeral. The column parameter is mandatory. |
Example - sinh(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to fill a new column with the hyperbolic sine of the key attribute.
Add a new column, then type the following formula:
sinh($"key")
sqrt¶
The sqrt function returns the square root of each value of the column.
Parameters
sqrt(column)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the square root. Its type must be numeral. |
Example - sqrt(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to fill a new column with the square root of the tempo attribute.
Add a new column, then write the following formula:
sqrt($"tempo")
sum¶
The sum function returns the sum of the column, evaluated within groups defined by the group parameter if required.
Parameters
sum(column, group)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the sum. Multiple attributes can be specified, as long as they are enclosed in double brackets, i.e. column((column1, column2)). In this occurrence, the sum will be calculated by row. The column parameter is mandatory. |
group |
The attribute by which you want to group your results. If this parameter is a list, the column parameter can’t be a list. |
Example - sum(column)
The following example uses the Students Performance dataset.
In this example, we want to sum the values of the writing score attribute.
Add a new attribute, then type the following formula:
sum($"writing score")
The attribute has been filled with the sum of all the values in the writing score attribute.
Example - sum(column, group)
The following example uses the Students Performance dataset.
In this example, we want our results to be grouped by the test preparation course attribute’s values.
The formula becomes:
sum($"writing score",$"test preparation course")
The results can be read as follows: * The sum of the writing score for those who took no preparation course (’
None
‘) is 41412. * The sum of the writing score for those who completed the preparation course is 26642.
Example - sum((column1, column2))
The following example uses the Students Performance dataset.
When the column value is a list, the formula considers the single rows of the attributes, so it sums considering the chosen attributes in rows.
For example, if we want to know the sum of the math score, writing score, reading score of the student in row 1, the formula becomes:
sum(($"math score",$"writing score",$"reading score"))
The results can be read as follows: * The student in row 1, female, group B, parental level of education bachelor’s degree, standard lunch and no test preparation course has an overall score of 218. * The student in row 4, male, group A, parental level of education associate’s degree, free/reduced lunch and no test preparation course has an overall score of 148.
tan¶
The tan function returns the tangent of each row of the column.
Parameters
tan(column)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the tangent. Its type must be numeral. The column parameter is mandatory. |
Example - tan(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to calculate the tangent of the tempo attribute.
Add a new column, and type the following formula:
tan($"tempo")
tanh¶
The tanh function returns the hyperbolic tangent of each row of the column.
Parameters
tanh(column)
Parameter |
Description |
---|---|
column |
The attribute used to evaluate the hyperbolic tangent. Its type must be numeral. The column parameter is mandatory. |
Example - tanh(column)
The following example uses the Top Hits Spotify dataset.
In this example, we want to retrieve the hyperbolic tangent of the key attribute.
Add a new attribute, then type the following formula:
tanh($"key")