Statistics configuration panels¶
Whenever you want to evaluate statistics on an attribute of the underlying dataset in the Statistic Manager, you first need to drag and drop the desired input attributes and then select the class of statistics you want to compute.
You can then customize the proposed list of statistics by adding a statistic that is not included by default, or by setting some computation options for the operation.
All these customizations are performed in the Statistics configuration panel, which opens when you click the Pencil icon located at the right of each row in the Statistics area.
Since these panels differ according to the selected statistic class, one section is dedicated to each class. Every panel has two tabs:
Statistics tab: where the statistics to evaluate are chosen.
Options tab: where the computation options for the evaluation are set.
Single statistics¶
The statistics available in the Statistics tab for this subclass are listed below (default ones in bold):
- Sample size, stat category which includes:
Number of total valid samples
- Descriptive, location and central tendency measures, stat category which includes:
Number of distinct values
Number of missing values
Minimum value
Index of minimum element
Maximum value
Index of maximum element
Sum value
Absolute sum value
Product value
Absolute product value
Mean value
Absolute mean value
Geometric mean value
Geometric absolute mean value
Harmonic mean value
Harmonic absolute mean value
Mode value
Number of mode elements
Index of mode element
Median value
Lower quartile
Upper quartile
Lower whisker for box plot
Upper whisker for box plot
- Dispersion and heterogeneity measures, stat category which includes:
Range of values
Interquartile range
Standard error of mean
Standard deviation
Standard error of standard deviation
Variance
Standard error of variance
Coefficient of variation
Mean absolute deviation
Median absolute deviation
Pietra index
Entropy
Normalized entropy
Gini coefficient
Normalized Gini coefficient
- Concentration measures, stat category which includes:
Gini concentration index
- Symmetry and shape measures, stat category which includes:
Skewness value
Standard error of skewness
Kurtosis value
Standard error of kurtosis
You can check or uncheck a single statistic to add it to or remove it from the computed list. You can also check or uncheck a whole stat category to add or remove all of its entries at once.
For this subclass, the only computation option that can be specified is:
Statistics on integer variables are continuous: a checkbox controlling whether the result of a statistic evaluated on an integer attribute is converted to an integer as well (option unchecked) or kept as a continuous value (option checked).
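As a rough illustration of a few of the statistics listed above and of the effect of the integer option, the following Python sketch uses only the standard library. It is not the Statistic Manager's implementation: the function name `single_stats`, the parameter `integer_as_continuous` and the choice of rounding to the nearest integer when the option is unchecked are assumptions made for this example.

```python
import statistics


def single_stats(values, integer_as_continuous=True):
    """Compute a handful of single statistics on a list of numbers.

    `integer_as_continuous` is a hypothetical stand-in for the
    "Statistics on integer variables are continuous" checkbox.
    """
    median = statistics.median(values)
    result = {
        "mean": statistics.fmean(values),
        "median": median,
        "range": max(values) - min(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation
        # Median absolute deviation: median of |x - median|.
        "median_abs_dev": statistics.median([abs(x - median) for x in values]),
    }
    all_integers = all(isinstance(x, int) for x in values)
    if all_integers and not integer_as_continuous:
        # Assumption: conversion to integer is done by rounding.
        result = {name: round(value) for name, value in result.items()}
    return result


continuous = single_stats([1, 2, 4, 7])
as_integer = single_stats([1, 2, 4, 7], integer_as_continuous=False)
```

With the option checked, the mean of `[1, 2, 4, 7]` is reported as `3.5`. With the option unchecked, this sketch reports it as the integer `4`.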
Values, frequencies and quantiles¶
The statistics available in the Statistics tab for this subclass are listed below (default ones in bold):
- Sample size, stat category which includes:
Number of total valid samples
- Frequencies indicators, stat category which includes:
Distinct values
Absolute frequencies
Relative frequencies
Cumulative frequencies
Partial sums
Partial means
Lorenz Curve
Pietra Curve
Generic quantiles
Rank
You can check or uncheck a single statistic to add it to or remove it from the computed list. You can also check or uncheck a whole stat category to add or remove all of its entries at once.
For this subclass, you can specify the following options:
Statistics on integer variables are continuous: a checkbox controlling whether the result of a statistic evaluated on an integer attribute is converted to an integer as well (option unchecked) or kept as a continuous value (option checked).
Value for generic quantiles: the number of quantiles to use when generic quantiles are requested in the frequency evaluation.
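To make the frequency indicators more concrete, here is a minimal Python sketch of distinct values, absolute, relative and cumulative frequencies, plus quantile cut points. The function names are hypothetical, and the interpolation used by `statistics.quantiles` is not necessarily the one applied by the Statistic Manager.

```python
from collections import Counter
from itertools import accumulate
import statistics


def frequency_table(values):
    """Return distinct values with absolute, relative and cumulative frequencies."""
    counts = Counter(values)
    distinct = sorted(counts)
    absolute = [counts[v] for v in distinct]
    total = sum(absolute)
    relative = [a / total for a in absolute]
    cumulative = list(accumulate(relative))  # running sum of relative frequencies
    return distinct, absolute, relative, cumulative


def generic_quantiles(values, n_quantiles):
    """Return the n_quantiles - 1 cut points splitting the data into equal groups.

    `n_quantiles` plays the role of the "Value for generic quantiles" option.
    """
    return statistics.quantiles(values, n=n_quantiles)


distinct, absolute, relative, cumulative = frequency_table(list("abacab"))
cut_points = generic_quantiles([1, 2, 3, 4, 5, 6, 7], n_quantiles=4)
```

Here `distinct` is `['a', 'b', 'c']` with absolute frequencies `[3, 2, 1]`, and asking for 4 quantiles yields 3 cut points.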
Correlation/Covariance¶
The statistics available in the Statistics tab for this subclass are listed below (default ones in bold):
- Sample size, stat category which includes:
Number of total valid samples
- Pearson correlation coefficient, stat category which includes:
r-value of Pearson coefficient
P-value of Pearson coefficient
- Spearman Correlation Coefficient, stat category which includes:
ρ-value for Spearman coefficient
P-value for Spearman coefficient
- Kendall Tau, stat category which includes:
τ-value for Kendall Tau
P-value for Kendall Tau
- Simple regression coefficient, stat category which includes:
β-value for Simple regression coefficient
P-value for Simple regression coefficient
You can check or uncheck a single statistic to add it to or remove it from the computed list. You can also check or uncheck a whole stat category to add or remove all of its entries at once.
For this subclass, no computation options are present.
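The relationship between some of these coefficients can be sketched in a few lines of Python. The example below computes the Pearson r-value and a simple regression slope from the same sums of squares, and obtains the Spearman ρ-value as the Pearson coefficient of the ranks. Function names are hypothetical, and treating the β-value as the slope of the second attribute regressed on the first is an assumption; p-values are not covered.

```python
import math


def pearson_and_slope(x, y):
    """Return (r, beta): Pearson coefficient and slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy), sxy / sxx


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson coefficient computed on the ranks."""
    rho, _ = pearson_and_slope(average_ranks(x), average_ranks(y))
    return rho


r, beta = pearson_and_slope([1, 2, 3, 4], [2, 4, 6, 8])
rho = spearman_rho([1, 2, 3, 4], [1, 4, 9, 16])
```

The second call shows why the two coefficients can differ: the relation between the values is monotonic but not linear, so the Spearman coefficient is 1 even though the Pearson coefficient of the raw values would be lower.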
Cross tabulation statistics¶
The statistics available in the Statistics tab for this subclass are listed below (default ones in bold):
- Sample size, stat category which includes:
Number of total valid samples
- Contingency tables, stat category which includes:
Contingency table
Expected contingency table
- Statistical test, stat category which includes:
Pearson χ square
P-value for Pearson χ square
You can check or uncheck a single statistic to add it to or remove it from the computed list. You can also check or uncheck a whole stat category to add or remove all of its entries at once.
For this subclass, the only computation option that can be specified is:
Use missing values: controls whether missing values are considered during the evaluation of the statistics.
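The following Python sketch shows how a contingency table, its expected counterpart under independence and the Pearson χ square statistic relate to each other. It is an illustration only: the function name `cross_tab`, the `use_missing` parameter, the `<missing>` label and the decision to treat a missing value as a category of its own when the option is enabled are all assumptions, and attribute values are assumed to be strings.

```python
from collections import Counter

MISSING = "<missing>"  # hypothetical label standing for a missing value


def cross_tab(xs, ys, use_missing=False):
    """Return (observed, expected, chi2) for two categorical attributes."""
    pairs = list(zip(xs, ys))
    if use_missing:
        # Assumption: a missing value becomes a category of its own.
        pairs = [tuple(MISSING if v is None else v for v in p) for p in pairs]
    else:
        # Otherwise drop every pair containing a missing value.
        pairs = [p for p in pairs if None not in p]
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    n = len(pairs)
    observed = [[counts[(r, c)] for c in cols] for r in rows]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    # Expected count under independence: row total * column total / n.
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return observed, expected, chi2


observed, expected, chi2 = cross_tab(["a", "a", "b", "b"], ["x", "x", "y", "y"])
```

In this small example the two attributes are perfectly associated: the observed table is `[[2, 0], [0, 2]]`, every expected count is `1.0`, and the statistic equals `4.0`.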
ROC Curve¶
The statistics available in the Statistics tab for this subclass are listed below (default ones in bold):
- Sample size, stat category which includes:
Number of valid positive samples
Number of valid negative samples
Number of total valid samples
- ROC curve (scalar), stat category which includes:
Area Under Curve
P-value of Area Under Curve
Standard Error of Area Under Curve
Point of maximum Youden index
Point closest to (0, 1)
Point of maximum accuracy
Point with specificity = sensitivity
- ROC curve (vector), stat category which includes:
AUC 95% confidence interval
1-Specificity
Sensitivity
Accuracies
Thresholds
Youden indices
Likelihood ratio -
Likelihood ratio +
You can check or uncheck a single statistic to add it to or remove it from the computed list. You can also check or uncheck a whole stat category to add or remove all of its entries at once.
For this subclass, you can specify the following computation options:
Statistics on integer variables are continuous: a checkbox controlling whether the result of a statistic evaluated on an integer attribute is converted to an integer as well (option unchecked) or kept as a continuous value (option checked).
Use target attribute: select this option to use the attributes in the Var_2/Target area as the ROC curve target.
Consider missing value as target with negative outcome
- Positive test for: this drop-down menu lets you select one of the following choices:
Greater Values
Lower Values
Automatic Selection
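The vector and scalar ROC statistics above are tightly linked, as the following Python sketch suggests. It builds one ROC point per threshold (1-Specificity, Sensitivity and Youden index) and integrates the curve with the trapezoidal rule to get the Area Under Curve. The function names are hypothetical, the sketch assumes the Greater Values choice (higher values indicate a positive test), and the integration rule used by the Statistic Manager is not stated in this page.

```python
def roc_points(scores, targets):
    """Return (threshold, 1 - specificity, sensitivity, Youden index) tuples.

    `targets` holds True for positive samples; a sample tests positive
    when its score is greater than or equal to the threshold.
    """
    positives = sum(targets)
    negatives = len(targets) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, t in zip(scores, targets) if s >= threshold and t)
        fp = sum(1 for s, t in zip(scores, targets) if s >= threshold and not t)
        sensitivity = tp / positives
        one_minus_specificity = fp / negatives
        # Youden index = sensitivity + specificity - 1.
        youden = sensitivity - one_minus_specificity
        points.append((threshold, one_minus_specificity, sensitivity, youden))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve, starting from (0, 0)."""
    curve = [(0.0, 0.0)] + [(x, y) for _, x, y, _ in points]
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(curve, curve[1:])
    )


points = roc_points([0.9, 0.4, 0.8, 0.3], [True, True, False, False])
auc = area_under_curve(points)
best = max(points, key=lambda p: p[3])  # point of maximum Youden index
```

In this example three of the four positive/negative pairs are ranked correctly, and the computed area is `0.75`.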
Test for independent samples¶
The statistics available in the Statistics tab for this subclass are listed below (default ones in bold):
- Sample size, stat category which includes:
Number of valid positive samples
Number of valid negative samples
Number of total valid samples
- Wilcoxon and Mann-Whitney test, stat category which includes:
Mann-Whitney U-value
Mann-Whitney normalized U-value
Wilcoxon R1-value
Wilcoxon Normalized R1-value
P-value of Wilcoxon test
- Kolmogorov-Smirnov test, stat category which includes:
KS value
P-value for KS test
- Student t-test, stat category which includes:
Student t-value
P-value for Student t-test
- Levene test, stat category which includes:
F-value for Levene test
P-value for Levene test
You can check or uncheck a single statistic to add it to or remove it from the computed list. You can also check or uncheck a whole stat category to add or remove all of its entries at once.
For this subclass, you can specify the following computation options:
Use target attribute: select this option to use the attributes in the Var_2/Target area as the target.
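As an illustration of the first test in the list, the sketch below computes a Mann-Whitney U-value by direct pair counting and derives a rank sum from it through the standard identity R1 = U + n1(n1 + 1)/2. The function names are hypothetical, and two points are assumptions: that the target attribute is what splits the values into the positive and negative samples, and that the Wilcoxon R1-value is the rank sum of the first sample. Normalized values and p-values are not covered.

```python
def split_by_target(values, targets):
    """Split values into (positive, negative) samples using a boolean target."""
    positive = [v for v, t in zip(values, targets) if t]
    negative = [v for v, t in zip(values, targets) if not t]
    return positive, negative


def mann_whitney_u(sample_a, sample_b):
    """U for sample_a: pairs (a, b) with a > b count 1, ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def rank_sum_from_u(u, n_a):
    """Rank sum of sample_a in the pooled data: R1 = U + n1 * (n1 + 1) / 2."""
    return u + n_a * (n_a + 1) / 2


positive, negative = split_by_target([5, 1, 6, 2, 7, 3], [True, False] * 3)
u = mann_whitney_u(positive, negative)
r1 = rank_sum_from_u(u, len(positive))
```

Here every positive value exceeds every negative one, so `u` is `9.0` (all 3 × 3 pairs) and `r1` is `15.0`, the sum of ranks 4, 5 and 6.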
Test for paired samples¶
The statistics available in the Statistics tab for this subclass are listed below (default ones in bold):
- Sample size, stat category which includes:
Number of total valid pairs
- Student t-test, stat category which includes:
Student t-value
P-value for Student t-test
- Wilcoxon test, stat category which includes:
W-value for Wilcoxon test
W-value for normalized Wilcoxon test
P-value for Wilcoxon test
Number of unequal pairs
You can check or uncheck a single statistic to add it to or remove it from the computed list. You can also check or uncheck a whole stat category to add or remove all of its entries at once.
For this subclass, no computation options are present.
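For paired samples, both tests work on the differences within each pair. The following Python sketch computes the paired Student t-value (mean difference divided by its standard error) together with the number of unequal pairs. The function name `paired_summary` is hypothetical, and the sketch assumes that both inputs have the same length and contain no missing values.

```python
import math
import statistics


def paired_summary(first, second):
    """Return (t_value, n_unequal_pairs) for two paired samples."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Pairs with a non-zero difference are the ones that count as unequal.
    n_unequal = sum(1 for d in diffs if d != 0)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    t_value = statistics.fmean(diffs) / standard_error
    return t_value, n_unequal


t_value, n_unequal = paired_summary([3, 5, 7, 9], [1, 2, 3, 4])
```

The t-value is undefined when all differences are identical, since the standard error is then zero; a real implementation would need to handle that case explicitly.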