Graph functions

Graph functions perform operations on directed graphs.

The following functions are available: connComp, leaf, leafDistance, root, and rootDistance.

Note

In the rest of the page we are going to refer to a link from one node to another as an edge.


connComp

This function operates on directed graphs. For each node, it returns the subgroup that the node belongs to.

A node is any value that appears in the parent or son attribute of the graph.

In other words, the connComp function identifies which subgroups of nodes are interconnected.

Parameters

connComp(parent, son, group)

Parameter

Description

parent

The attribute containing the parent nodes of a directed graph. The parent parameter is mandatory.

son

The attribute containing the son nodes of a directed graph. The son parameter is mandatory.

group

The attribute by which you want to further group results. The group parameter can also be defined as a list: connComp($"parent",$"son",($"group1", $"group2"))

Example - connComp(parent, son)

The following example uses the connComp dataset.

  • In this example, we want to retrieve the connected component of this graph.

  • We initially had two attributes that define the parent-son relation. We created a new attribute, ConnComp, and applied the connComp function to it, specifying which attribute is the parent and which is the son.

  • So, the formula would be: connComp($"Parent",$"Son").

  • In the ConnComp attribute, the function has identified the subgroups, so we now know that there are 3 subgroups and which subgroup each node belongs to.
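The grouping described in this example can be illustrated outside the product with a short Python sketch. It is not the product's implementation: the function name conn_comp, the sample edges, and the choice to ignore edge direction when deciding which nodes are interconnected are all assumptions made for illustration.

```python
def conn_comp(edges):
    """Label each node with the number of the subgroup it belongs to.

    `edges` is a list of (parent, son) pairs. Edge direction is ignored
    when grouping, which is an assumption about how connComp behaves.
    """
    parent_of = {}  # union-find forest: node -> representative node

    def find(node):
        parent_of.setdefault(node, node)
        while parent_of[node] != node:
            parent_of[node] = parent_of[parent_of[node]]  # path halving
            node = parent_of[node]
        return node

    # Merge the subgroups of the two ends of every edge.
    for parent, son in edges:
        root_a, root_b = find(parent), find(son)
        parent_of[root_a] = root_b

    # Number the subgroups in order of first appearance.
    labels = {}
    result = {}
    for node in parent_of:
        representative = find(node)
        labels.setdefault(representative, len(labels) + 1)
        result[node] = labels[representative]
    return result


# Hypothetical parent/son pairs forming three separate subgroups.
edges = [("A", "B"), ("B", "C"), ("D", "E"), ("F", "G"), ("G", "H")]
components = conn_comp(edges)
print(components)
```

Nodes that share a label belong to the same subgroup, which mirrors what the ConnComp attribute shows for each row.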


leaf

The leaf function returns the corresponding leaf for each node of the son attribute. The leaf is the very last node of the branch in a directed graph.

Parameters

leaf(parent, son, group, whichpath, separator, weights, operator)

Parameter

Description

parent

The attribute containing the parent nodes of a directed graph. The parent parameter is mandatory.

son

The attribute containing the son nodes of a directed graph. The son parameter is mandatory.

group

The attribute by which you want to further group results.
The group parameter can also be defined as a list: leaf($"parent",$"son",($"group1", $"group2"))

whichpath

When defined, it allows the user to choose which path is to be considered:

  • the shortest one - whichpath = "minimum",
  • the longest one - whichpath = "maximum", or
  • all the paths - whichpath = "all"; in this case, the leaves are concatenated in a single string.
If no path is specified, the leaf function uses "minimum" by default.

separator

It specifies the separator to use when concatenating different leaves. The default separator is "-". This parameter can only be used when whichpath = "all".

weights

The attribute defining the length of the edge.

operator

It defines how to combine the weights along the path to the leaf. The default operator is sum, which adds the weights together; if the operator is prod, the weights are multiplied instead. This parameter can only be used if the weights parameter is specified.

Example - leaf(parent, son)

The following example uses the Master data dataset.

  • In this example, we want to retrieve the corresponding leaf for each node.

  • In the leaf function, we need to define which attribute is the parent, and which is the son.

  • So, the formula would be: leaf($"LocFr",$"LocTo").

  • The result of the leaf function shows the corresponding leaf for each value in the son attribute.
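As a rough illustration of what "the corresponding leaf" means, the following Python sketch follows the edges from each node until it reaches a node with no outgoing edge, choosing the path with the fewest edges (the "minimum" behavior). The function name leaves and the location names are hypothetical, and the sketch makes no claim about how the leaf function is implemented internally.

```python
from collections import defaultdict, deque


def leaves(edges):
    """Map each node to the leaf reached by the path with the fewest edges.

    `edges` is a list of (parent, son) pairs. A leaf is a node that never
    appears as a parent, so it has no outgoing edge.
    """
    children = defaultdict(list)
    nodes = set()
    for parent, son in edges:
        children[parent].append(son)
        nodes.update((parent, son))

    def nearest_leaf(start):
        # Breadth-first search: the first leaf found is the closest one.
        queue = deque([start])
        seen = {start}
        while queue:
            node = queue.popleft()
            if not children.get(node):
                return node
            for child in children[node]:
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
        return None  # only reachable if the node sits on a cycle

    return {node: nearest_leaf(node) for node in sorted(nodes)}


# Hypothetical supply chain: Plant -> Hub -> DC1 and Hub -> Depot -> DC2.
location_edges = [("Plant", "Hub"), ("Hub", "DC1"), ("Hub", "Depot"), ("Depot", "DC2")]
leaf_of = leaves(location_edges)
print(leaf_of)
```

In this sample, Plant and Hub can reach two leaves, and the shortest path leads to DC1. A leaf is its own leaf.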


leafDistance

The leafDistance function calculates the distance, in terms of number of edges, of each node of the son attribute from its leaf. A leaf is the very last node of the branch.

Parameters

leafDistance(parent, son, group, whichpath, separator, weights, operator)

Parameter

Description

parent

The attribute containing the parent nodes of a directed graph. The parent parameter is mandatory.

son

The attribute containing the son nodes of a directed graph. The son parameter is mandatory.

group

The attribute by which you want to further group results.
The group parameter can also be defined as a list: leafDistance($"parent",$"son",($"group1", $"group2"))

whichpath

When defined, it allows the user to choose which path is to be considered:

  • the shortest one - whichpath = "minimum",
  • the longest one - whichpath = "maximum", or
  • all the paths - whichpath = "all"; in this case, the distances are concatenated in a single string.
If no path is specified, the leafDistance function uses "minimum" by default.

separator

It specifies the separator to use when concatenating different distances. The default separator is "-". This parameter can only be used when whichpath = "all".

weights

The attribute defining the length of the edge.

operator

It defines how to combine the weights along the path to the leaf. The default operator is sum, which adds the weights together; if the operator is prod, the weights are multiplied instead. This parameter can only be used if the weights parameter is specified.

Example - leafDistance(parent, son)

The following example uses the Master data dataset.

  • In this example, we have a supply chain dataset, and we want to calculate the distance from each location (LocFr attribute) to the final customer-facing distribution center (LocTo attribute).

  • We can use the leafDistance function which calculates how many steps are to be taken to reach the leaf of a directed graph.

  • We need to define the parent and son parameters, which are respectively the LocFr and LocTo attributes.

  • So the formula would be: leafDistance($"LocFr",$"LocTo").

Example - leafDistance(parent, son, group)

The following example uses the Master data dataset.

  • In this other example, we want to calculate the distance from the location to the final customer-facing distribution center for each product.

  • If the location is a customer-facing distribution center, the result will be 0.

  • To achieve this goal we need to use the leafDistance function, define the parent and son parameters, which are respectively the LocFr and LocTo attributes, and group the results by Product.

  • So the formula would be: leafDistance($"LocFr",$"LocTo",$"Product").

  • In this way, the leafDistance function calculates, for each product, the number of steps from each location to its final distribution center.
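The effect of the group parameter can be sketched in Python as follows. The sketch treats every group as its own graph and counts the edges on the shortest path to a leaf. The function name leaf_distance, the sample rows, and the result layout keyed by (group, node) are assumptions for illustration, and the graph is assumed to be acyclic.

```python
from collections import defaultdict


def leaf_distance(rows):
    """Count the edges from each node to its nearest leaf, per group.

    `rows` is a list of (parent, son, group) triples. Every group is
    treated as an independent, acyclic graph.
    """
    children = defaultdict(lambda: defaultdict(list))
    nodes = defaultdict(set)
    for parent, son, group in rows:
        children[group][parent].append(son)
        nodes[group].update((parent, son))

    def distance(group, node, cache):
        if node not in cache:
            kids = children[group].get(node, [])
            # A leaf has distance 0; otherwise take the closest child.
            cache[node] = 0 if not kids else 1 + min(
                distance(group, kid, cache) for kid in kids
            )
        return cache[node]

    result = {}
    for group, members in nodes.items():
        cache = {}
        for node in members:
            result[(group, node)] = distance(group, node, cache)
    return result


# Hypothetical rows: product P1 ships via a hub, product P2 ships directly.
rows = [("Plant", "Hub", "P1"), ("Hub", "DC", "P1"), ("Plant", "DC", "P2")]
steps = leaf_distance(rows)
print(steps)
```

The same location can therefore have a different distance for each product: in this sample, Plant is two steps from the distribution center for P1 and one step for P2, while the distribution center itself is at distance 0.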


root

The root function returns the corresponding root of a node. The root is the very first node of the branch.

Parameters

root(parent, son, group, whichpath, separator, weights, operator)

Parameter

Description

parent

The attribute containing the parent nodes of a directed graph. The parent parameter is mandatory.

son

The attribute containing the son nodes of a directed graph. The son parameter is mandatory.

group

The attribute by which you want to further group results. The group parameter can also be defined as a list: root($"parent",$"son",($"group1", $"group2"))

whichpath

When defined, it allows the user to choose which path is to be considered:

  • the shortest one - whichpath = "minimum",
  • the longest one - whichpath = "maximum", or
  • all the paths - whichpath = "all"; in this case, the roots are concatenated in a single string.
If no path is specified, the root function uses "minimum" by default.

separator

It specifies the separator to use when concatenating different roots. The default separator is "-". This parameter can only be used when whichpath = "all".

weights

The attribute defining the length of the edge.

operator

It defines how to combine the weights along the path to the root. The default operator is sum, which adds the weights together; if the operator is prod, the weights are multiplied instead. This parameter can only be used if the weights parameter is specified.

Example - root(parent, son)

The following example uses the BOMs dataset.

  • In this example, we want to retrieve the corresponding root of the nodes defined in the son attribute.

  • In the root function, we need to define which attribute is the parent, and which is the son.

  • This is a Bill of Material (BOM), and the parent-son relationship is defined respectively by the ParentComponentID and ComponentID attributes.

  • So, the formula is: root($"ParentComponentID",$"ComponentID").

  • Here we can see the results - the root function has retrieved the root for each son node.

Example - root(parent, son, group)

The following example uses the BOMs dataset.

  • In this other example, we want to retrieve the root of the nodes defined in the son attribute, and group the results by the Quantity attribute.

  • We’re still using the BOM dataset, so the parent-son relationship is respectively defined by the ParentComponentID and ComponentID.

  • So, the resulting formula is: root($"ParentComponentID", $"ComponentID", $"Quantity").

  • Here are the results that have been grouped by the Quantity attribute.
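To make the whichpath and separator parameters more concrete, here is a hedged Python sketch of a root lookup for a node that has more than one parent. The function name roots and the component names are hypothetical, and the way the "all" case removes duplicates and sorts the roots before joining them is an assumption, not documented behavior.

```python
from collections import defaultdict


def roots(edges, whichpath="minimum", separator="-"):
    """Map each node to its root, following the chosen upward path.

    `edges` is a list of (parent, son) pairs in an acyclic graph.
    A root is a node that never appears as a son.
    """
    parents = defaultdict(list)
    nodes = set()
    for parent, son in edges:
        parents[son].append(parent)
        nodes.update((parent, son))

    def paths_to_roots(node):
        """Yield (root, number_of_edges) for every upward path from node."""
        ups = parents.get(node)
        if not ups:
            yield node, 0
            return
        for up in ups:
            for root, length in paths_to_roots(up):
                yield root, length + 1

    result = {}
    for node in sorted(nodes):
        found = list(paths_to_roots(node))
        if whichpath == "all":
            # Assumption: duplicates are dropped and roots are sorted.
            result[node] = separator.join(sorted({root for root, _ in found}))
        elif whichpath == "maximum":
            result[node] = max(found, key=lambda item: item[1])[0]
        else:
            result[node] = min(found, key=lambda item: item[1])[0]
    return result


# Hypothetical BOM: Spoke is used by Wheel (part of Bike) and directly by Cart.
bom = [("Bike", "Wheel"), ("Wheel", "Spoke"), ("Cart", "Spoke")]
print(roots(bom))
print(roots(bom, whichpath="maximum"))
print(roots(bom, whichpath="all"))
```

In this sample, Spoke reaches Cart in one step and Bike in two, so the shortest path gives Cart, the longest gives Bike, and "all" joins both roots with the separator.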


rootDistance

The rootDistance function calculates the distance, in terms of number of edges, of each node of the son attribute from its root. A root is the very first node of the branch.

Parameters

rootDistance(parent, son, group, whichpath, separator, weights, operator)

Parameter

Description

parent

The attribute containing the parent nodes of a directed graph. The parent parameter is mandatory.

son

The attribute containing the son nodes of a directed graph. The son parameter is mandatory.

group

The attribute by which you want to further group results. The group parameter can also be defined as a list: rootDistance($"parent",$"son",($"group1", $"group2"))

whichpath

When defined, it allows the user to choose which path is to be considered:

  • the shortest one - whichpath = "minimum",
  • the longest one - whichpath = "maximum", or
  • all the paths - whichpath = "all"; in this case, the distances are concatenated in a single string.
If no path is specified, the rootDistance function uses "minimum" by default.

separator

It specifies the separator to use when concatenating different distances. The default separator is "-". This parameter can only be used when whichpath = "all".

weights

The attribute defining the length of the edge.

operator

It defines how to combine the weights along the path to the root. The default operator is sum, which adds the weights together; if the operator is prod, the weights are multiplied instead. This parameter can only be used if the weights parameter is specified.

Example - rootDistance(parent, son)

The following example uses the BOMs dataset.

  • In this example, we want to retrieve the distance of each node from the root of the directed graph. This is a Bill of Material (BOM), and the parent-son relationship is defined respectively by the ParentComponentID and ComponentID attributes.

  • As each row has not only information on the component itself, but also on its parent component, we can use the rootDistance function which calculates the distance of each component from the finished product.

  • We add a new attribute, and define the rootDistance function specifying which attribute is the parent and which is the son.

  • So, the formula is: rootDistance($"ParentComponentID",$"ComponentID")

  • Here the rootDistance function has calculated how many parent levels each node has. For the finished product in row 1, the distance is 0 because it is already at the top of the hierarchy of the graph.

  • The other components show all the intermediate levels up to 3, which is the maximum distance from the finished product (the root) of this directed graph.
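The level counting described in this example can be sketched in Python as follows. The function name root_distance and the component names are hypothetical. The sketch assumes that every component has at most one parent and that the graph contains no cycles, which keeps it short but is not a stated property of the function.

```python
def root_distance(edges):
    """Count the edges between each node and its root.

    `edges` is a list of (parent, son) pairs. Each son is assumed to
    have exactly one parent, and the graph is assumed to be acyclic.
    """
    parent_of = {son: parent for parent, son in edges}
    nodes = {node for edge in edges for node in edge}

    def depth(node):
        steps = 0
        while node in parent_of:  # climb until a node has no parent
            node = parent_of[node]
            steps += 1
        return steps

    return {node: depth(node) for node in nodes}


# Hypothetical BOM with four levels below the finished product.
bom_edges = [("Bike", "Frame"), ("Bike", "Wheel"), ("Wheel", "Rim"), ("Rim", "Nipple")]
levels = root_distance(bom_edges)
print(levels)
```

The finished product gets distance 0, its direct components get 1, and so on down to the deepest component at distance 3.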

Example - rootDistance(parent, son, weights=weights, operator=operator)

The following example uses the BOMs dataset.

  • In this other example, we want to calculate the total quantity of each component needed to build the final product.

  • The Quantity attribute states how many units of a component are needed to build its parent component, but not how many are needed for the whole finished product.

  • To achieve this goal, we need to define the rootDistance function as follows:
      • specify which attributes are the parent and the son;
      • add the Quantity attribute as the weights parameter, as it defines the length of each edge;
      • set the operator parameter to prod, so that the weights along the path are multiplied together.

  • We add a new attribute and define the rootDistance function.

  • The resulting formula is: rootDistance($"ParentComponentID", $"ComponentID", weights=$"Quantity", operator="prod").

  • The new attribute contains, for each component, the total quantity needed to build the finished product.
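The arithmetic behind the weights and operator parameters can be illustrated with a small Python sketch. The function name root_distance_weighted, the component names, and the quantities are hypothetical. The sketch assumes one parent per component, and the value it returns for a path with no edges (0 for sum, 1 for prod) is a property of the sketch rather than documented behavior.

```python
import math


def root_distance_weighted(rows, operator="sum"):
    """Combine the edge weights on the path from each son up to its root.

    `rows` is a list of (parent, son, weight) triples. Each son is assumed
    to have exactly one parent, and the graph is assumed to be acyclic.
    """
    up = {son: (parent, weight) for parent, son, weight in rows}
    combine = math.prod if operator == "prod" else sum

    def total(node):
        weights = []
        while node in up:  # climb towards the root, collecting weights
            node, weight = up[node]
            weights.append(weight)
        return combine(weights)

    return {son: total(son) for son in up}


# Hypothetical BOM: a bike needs 2 wheels, and each wheel needs 32 spokes.
bom_rows = [("Bike", "Wheel", 2), ("Wheel", "Spoke", 32)]
totals = root_distance_weighted(bom_rows, operator="prod")
print(totals)
```

With operator set to prod, the quantities along the path are multiplied, so one bike needs 2 x 32 = 64 spokes. With the default sum, the same path would give 2 + 32 = 34, which is not meaningful for quantities and is why prod is the operator used in this example.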