abydos.stats package
The stats module defines functions for calculating various statistical data about linguistic objects.
Functions are provided for calculating the following means:
arithmetic mean (amean())
geometric mean (gmean())
harmonic mean (hmean())
quadratic mean (qmean())
contraharmonic mean (cmean())
logarithmic mean (lmean())
identric (exponential) mean (imean())
Seiffert's mean (seiffert_mean())
Lehmer mean (lehmer_mean())
Heronian mean (heronian_mean())
Hölder (power/generalized) mean (hoelder_mean())
arithmetic-geometric mean (agmean())
geometric-harmonic mean (ghmean())
arithmetic-geometric-harmonic mean (aghmean())
And for calculating:
midrange (midrange())
median (median())
mode (mode())
variance (var())
standard deviation (std())
Some examples of the basic functions:
>>> nums = [16, 49, 55, 49, 6, 40, 23, 47, 29, 85, 76, 20]
>>> amean(nums)
41.25
>>> aghmean(nums)
32.42167170892585
>>> heronian_mean(nums)
37.931508950381925
>>> mode(nums)
49
>>> std(nums)
22.876935255113754
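As a rough cross-check that does not depend on abydos, several of the values above can be reproduced with Python's standard statistics module. This is only a sketch: it assumes that std() is the population (not sample) standard deviation, which is what the value shown above corresponds to, and the midrange helper below is a hypothetical stand-in rather than the library's implementation.

```python
import statistics

nums = [16, 49, 55, 49, 6, 40, 23, 47, 29, 85, 76, 20]


def midrange(values):
    """Hypothetical helper: midpoint between the smallest and largest value."""
    return (min(values) + max(values)) / 2


mean = statistics.mean(nums)          # arithmetic mean: 495 / 12
most_common = statistics.mode(nums)   # 49 is the only repeated value
pop_std = statistics.pstdev(nums)     # population standard deviation
mid = midrange(nums)                  # (6 + 85) / 2
med = statistics.median(nums)         # mean of the two middle values, 40 and 47

print(mean, most_common, pop_std, mid, med)
```

The mean, mode, and population standard deviation computed this way agree with the amean(), mode(), and std() results shown above.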
Two pairwise functions are provided:
mean pairwise similarity (mean_pairwise_similarity()), which returns the mean similarity (using a supplied similarity function) among the items in a collection
pairwise similarity statistics (pairwise_similarity_statistics()), which returns the max, min, mean, and standard deviation of pairwise similarities between two collections
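The idea behind a mean pairwise similarity can be sketched in a few lines. The code below is illustrative only and is not the signature or pair enumeration used by mean_pairwise_similarity(): the names mean_pairwise and ratio_similarity are hypothetical, the similarity function is assumed to be difflib's ratio, and each unordered pair of items is assumed to be compared once.

```python
from difflib import SequenceMatcher
from itertools import combinations


def ratio_similarity(a, b):
    """Assumed similarity function: difflib's ratio, a value in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def mean_pairwise(collection, sim=ratio_similarity):
    """Average sim(a, b) over every unordered pair of items (hypothetical helper)."""
    pairs = list(combinations(collection, 2))
    if not pairs:
        raise ValueError("need at least two items")
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


words = ["night", "nights", "knight"]
print(mean_pairwise(words))
```

Any callable taking two items and returning a number can be supplied as sim, which is the property the description above relies on.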
The confusion table class (ConfusionTable) can be constructed in a number of ways:
four values, representing true positives, true negatives, false positives, and false negatives, can be passed to the constructor
a list or tuple with four values, representing true positives, true negatives, false positives, and false negatives, can be passed to the constructor
a dict with keys 'tp', 'tn', 'fp', 'fn', each assigned to the values for true positives, true negatives, false positives, and false negatives can be passed to the constructor
The ConfusionTable class has methods:
to_tuple()
extracts the ConfusionTable values as a tuple: (\(w\), \(x\), \(y\), \(z\))
to_dict()
extracts the ConfusionTable values as a dict: {'tp':\(w\), 'tn':\(x\), 'fp':\(y\), 'fn':\(z\)}
true_pos()
returns the number of true positives
true_neg()
returns the number of true negatives
false_pos()
returns the number of false positives
false_neg()
returns the number of false negatives
correct_pop()
returns the correct population
error_pop()
returns the error population
pred_pos_pop()
returns the predicted positive population
pred_neg_pop()
returns the predicted negative population
cond_pos_pop()
returns the condition positive population
cond_neg_pop()
returns the condition negative population
population()
returns the total population
precision()
returns the precision
precision_gain()
returns the precision gain
recall()
returns the recall
specificity()
returns the specificity
npv()
returns the negative predictive value
fallout()
returns the fallout
fdr()
returns the false discovery rate
accuracy()
returns the accuracy
accuracy_gain()
returns the accuracy gain
balanced_accuracy()
returns the balanced accuracy
informedness()
returns the informedness
markedness()
returns the markedness
pr_amean()
returns the arithmetic mean of precision & recall
pr_gmean()
returns the geometric mean of precision & recall
pr_hmean()
returns the harmonic mean of precision & recall
pr_qmean()
returns the quadratic mean of precision & recall
pr_cmean()
returns the contraharmonic mean of precision & recall
pr_lmean()
returns the logarithmic mean of precision & recall
pr_imean()
returns the identric mean of precision & recall
pr_seiffert_mean()
returns Seiffert's mean of precision & recall
pr_lehmer_mean()
returns the Lehmer mean of precision & recall
pr_heronian_mean()
returns the Heronian mean of precision & recall
pr_hoelder_mean()
returns the Hölder mean of precision & recall
pr_agmean()
returns the arithmetic-geometric mean of precision & recall
pr_ghmean()
returns the geometric-harmonic mean of precision & recall
pr_aghmean()
returns the arithmetic-geometric-harmonic mean of precision & recall
fbeta_score()
returns the \(F_{\beta}\) score
f2_score()
returns the \(F_2\) score
fhalf_score()
returns the \(F_{\frac{1}{2}}\) score
e_score()
returns the \(E\) score
f1_score()
returns the \(F_1\) score
f_measure()
returns the F measure
g_measure()
returns the G measure
mcc()
returns Matthews correlation coefficient
significance()
returns the significance
kappa_statistic()
returns the Kappa statistic
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.f1_score()
0.8275862068965518
>>> ct.mcc()
0.5367450401216932
>>> ct.specificity()
0.75
>>> ct.significance()
66.26190476190476
The ConfusionTable class also supports checking for equality with another ConfusionTable and casting to string with str():
>>> (ConfusionTable({'tp':120, 'tn':60, 'fp':20, 'fn':30}) ==
... ConfusionTable(120, 60, 20, 30))
True
>>> str(ConfusionTable(120, 60, 20, 30))
'tp:120, tn:60, fp:20, fn:30'
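To make the relationship between the four cell counts and the derived statistics concrete, the following sketch recomputes a few of them directly from the definitions given in the method reference. It is a minimal illustration, not the library's code; basic_rates is a hypothetical helper name.

```python
def basic_rates(tp, tn, fp, fn):
    """Return precision, recall, specificity, and F1 from the four cell counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, specificity, f1


precision, recall, specificity, f1 = basic_rates(120, 60, 20, 30)
print(specificity)    # 60 / 80 = 0.75
print(round(f1, 4))   # 0.8276
```

The specificity and \(F_1\) values agree with the ct.specificity() and ct.f1_score() results shown above.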
-
class abydos.stats.ConfusionTable(tp=0, tn=0, fp=0, fn=0)
Bases: object
ConfusionTable object.
This object is initialized by passing four integers (or a tuple, list, or dict of four integers) representing the cells of a confusion table: true positives, true negatives, false positives, and false negatives.
The object possesses methods for the calculation of various statistics based on the confusion table.
Initialize ConfusionTable.
- Parameters
tp (int or a tuple, list, or dict) -- True positives; If a tuple or list is supplied, it must include 4 values in the order [tp, tn, fp, fn]. If a dict is supplied, it must have 4 keys, namely 'tp', 'tn', 'fp', & 'fn'.
tn (int) -- True negatives
fp (int) -- False positives
fn (int) -- False negatives
- Raises
AttributeError -- ConfusionTable requires a 4-tuple when being created from a tuple.
Examples
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct == ConfusionTable((120, 60, 20, 30))
True
>>> ct == ConfusionTable([120, 60, 20, 30])
True
>>> ct == ConfusionTable({'tp': 120, 'tn': 60, 'fp': 20, 'fn': 30})
True
New in version 0.1.0.
-
accuracy()
Return accuracy.
Accuracy is defined as
\[\frac{tp + tn}{population}\]
Cf. https://en.wikipedia.org/wiki/Accuracy
- Returns
The accuracy of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.accuracy()
0.782608695652174
New in version 0.1.0.
-
accuracy_gain()
Return gain in accuracy.
The gain in accuracy is defined as
\[G(accuracy) = \frac{accuracy}{random~ accuracy}\]
Cf. https://en.wikipedia.org/wiki/Gain_(information_retrieval)
- Returns
The gain in accuracy of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.accuracy_gain()
1.4325259515570934
New in version 0.1.0.
-
actual_entropy()
Return the actual entropy.
Implementation based on https://github.com/Magnetic/proficiency-metric
- Returns
The actual entropy of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.actual_entropy()
0.6460905050608101
New in version 0.4.0.
-
balanced_accuracy()
Return balanced accuracy.
Balanced accuracy is defined as
\[\frac{sensitivity + specificity}{2}\]
Cf. https://en.wikipedia.org/wiki/Accuracy
- Returns
The balanced accuracy of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.balanced_accuracy()
0.775
New in version 0.1.0.
-
cond_neg_pop()
Return condition negative population.
- Returns
The condition negative population of the confusion table
- Return type
int
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.cond_neg_pop()
80
New in version 0.1.0.
-
cond_pos_pop()
Return condition positive population.
- Returns
The condition positive population of the confusion table
- Return type
int
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.cond_pos_pop()
150
New in version 0.1.0.
-
correct_pop()
Return correct population.
- Returns
The correct population of the confusion table
- Return type
int
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.correct_pop()
180
New in version 0.1.0.
-
d_measure()
Return \(D\)-measure.
\(D\)-measure is defined as
\[1-\frac{1}{\frac{1}{precision}+\frac{1}{recall}-1}\]
- Returns
The \(D\)-measure of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.d_measure()
0.2941176470588237
New in version 0.4.0.
-
dependency()
Return dependency.
Implementation based on https://github.com/Magnetic/proficiency-metric
- Returns
The dependency of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.dependency()
0.12618094145262454
New in version 0.4.0.
-
diagnostic_odds_ratio()
Return diagnostic odds ratio.
Diagnostic odds ratio is defined as
\[\frac{tp \cdot tn}{fp \cdot fn}\]
Cf. https://en.wikipedia.org/wiki/Diagnostic_odds_ratio
- Returns
The diagnostic odds ratio of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.diagnostic_odds_ratio()
12.0
New in version 0.4.0.
-
e_score(beta=1.0)
Return \(E\)-score.
This is Van Rijsbergen's effectiveness measure: \(E=1-F_{\beta}\).
Cf. https://en.wikipedia.org/wiki/Information_retrieval#F-measure
- Parameters
beta (float) -- The \(\beta\) parameter in the above formula
- Returns
The \(E\)-score of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.e_score()
0.17241379310344818
New in version 0.1.0.
-
error_pop()
Return error population.
- Returns
The error population of the confusion table
- Return type
int
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.error_pop()
50
New in version 0.1.0.
-
error_rate()
Return error rate.
Error rate is defined as
\[\frac{fp + fn}{population}\]
- Returns
The error rate of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.error_rate()
0.21739130434782608
New in version 0.4.0.
-
f1_score()
Return \(F_{1}\) score.
\(F_{1}\) score is the harmonic mean of precision and recall:
\[2 \cdot \frac{precision \cdot recall}{precision + recall}\]
Cf. https://en.wikipedia.org/wiki/F1_score
- Returns
The \(F_{1}\) score of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.f1_score()
0.8275862068965518
New in version 0.1.0.
-
f2_score()
Return \(F_{2}\) score.
The \(F_{2}\) score emphasizes recall over precision in comparison to the \(F_{1}\) score.
Cf. https://en.wikipedia.org/wiki/F1_score
- Returns
The \(F_{2}\) score of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.f2_score()
0.8108108108108109
New in version 0.1.0.
-
f_measure()
Return \(F\)-measure.
\(F\)-measure is the harmonic mean of precision and recall:
\[2 \cdot \frac{precision \cdot recall}{precision + recall}\]
Cf. https://en.wikipedia.org/wiki/F1_score
- Returns
The \(F\)-measure of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.f_measure()
0.8275862068965516
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the ConfusionTable.pr_hmean method instead.
-
fallout()
Return fall-out.
Fall-out is defined as
\[\frac{fp}{fp + tn}\]
AKA false positive rate (FPR)
Cf. https://en.wikipedia.org/wiki/Information_retrieval#Fall-out
- Returns
The fall-out of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.fallout()
0.25
New in version 0.1.0.
-
false_neg()
Return false negatives.
AKA Type II error
- Returns
The false negatives of the confusion table
- Return type
int
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.false_neg()
30
New in version 0.1.0.
-
false_omission_rate()
Return false omission rate (FOR).
FOR is defined as
\[\frac{fn}{tn + fn}\]
Cf. https://en.wikipedia.org/wiki/False_omission_rate
- Returns
The false omission rate of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.false_omission_rate()
0.3333333333333333
New in version 0.4.0.
-
false_pos()
Return false positives.
AKA Type I error
- Returns
The false positives of the confusion table
- Return type
int
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.false_pos()
20
New in version 0.1.0.
-
fbeta_score(beta=1.0)
Return \(F_{\beta}\) score.
\(F_{\beta}\) for a positive real value \(\beta\) "measures the effectiveness of retrieval with respect to a user who attaches \(\beta\) times as much importance to recall as precision" (van Rijsbergen 1979).
\(F_{\beta}\) score is defined as
\[(1 + \beta^2) \cdot \frac{precision \cdot recall}{(\beta^2 \cdot precision) + recall}\]
Cf. https://en.wikipedia.org/wiki/F1_score
- Parameters
beta (float) -- The \(\beta\) parameter in the above formula
- Returns
The \(F_{\beta}\) score of the confusion table
- Return type
float
- Raises
AttributeError -- Beta must be a positive real value
Examples
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.fbeta_score()
0.8275862068965518
>>> ct.fbeta_score(beta=0.1)
0.8565371024734982
New in version 0.1.0.
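The \(F_{\beta}\) definition can be evaluated directly from precision and recall. The sketch below is a standalone illustration under stated assumptions: f_beta is a hypothetical function name, and it raises ValueError for a non-positive beta, whereas the method documented here raises AttributeError.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta from precision and recall; beta must be positive."""
    if beta <= 0:
        raise ValueError("beta must be a positive real value")
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall for tp=120, tn=60, fp=20, fn=30.
p, r = 120 / 140, 120 / 150
for beta in (0.5, 1.0, 2.0):
    print(beta, round(f_beta(p, r, beta), 4))
```

With beta set to 0.5, 1, and 2, the values agree with the fhalf_score(), f1_score(), and f2_score() examples to within floating-point rounding.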
-
fdr()
Return false discovery rate (FDR).
False discovery rate is defined as
\[\frac{fp}{fp + tp}\]
Cf. https://en.wikipedia.org/wiki/False_discovery_rate
- Returns
The false discovery rate of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.fdr()
0.14285714285714285
New in version 0.1.0.
-
fhalf_score()
Return \(F_{0.5}\) score.
The \(F_{0.5}\) score emphasizes precision over recall in comparison to the \(F_{1}\) score.
Cf. https://en.wikipedia.org/wiki/F1_score
- Returns
The \(F_{0.5}\) score of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.fhalf_score()
0.8450704225352114
New in version 0.1.0.
-
fnr()
Return false negative rate.
False negative rate is defined as
\[\frac{fn}{tp + fn}\]
AKA miss rate
Cf. https://en.wikipedia.org/wiki/False_negative_rate
- Returns
The false negative rate of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> round(ct.fnr(), 8)
0.2
New in version 0.4.0.
-
g_measure()
Return \(G\)-measure.
\(G\)-measure is the geometric mean of precision and recall:
\[\sqrt{precision \cdot recall}\]
This is identical to the Fowlkes–Mallows (FM) index for two clusters.
Cf. https://en.wikipedia.org/wiki/F1_score#G-measure
Cf. https://en.wikipedia.org/wiki/Fowlkes%E2%80%93Mallows_index
- Returns
The \(G\)-measure of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.g_measure()
0.828078671210825
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the ConfusionTable.pr_gmean method instead.
-
igr()
Return information gain ratio.
Implementation based on https://github.com/Magnetic/proficiency-metric
Cf. https://en.wikipedia.org/wiki/Information_gain_ratio
- Returns
The information gain ratio of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.igr()
0.22019657299448012
New in version 0.4.0.
-
informedness()
Return informedness.
Informedness is defined as
\[sensitivity + specificity - 1\]
AKA Youden's J statistic ([You50])
AKA DeltaP'
Cf. https://en.wikipedia.org/wiki/Youden%27s_J_statistic
- Returns
The informedness of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.informedness()
0.55
New in version 0.1.0.
-
jaccard()
Return Jaccard index.
The Jaccard index of a confusion table is
\[\frac{tp}{tp+fp+fn}\]
- Returns
The Jaccard index of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.jaccard()
0.7058823529411765
New in version 0.4.0.
-
joint_entropy()
Return the joint entropy.
Implementation based on https://github.com/Magnetic/proficiency-metric
- Returns
The joint entropy of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.joint_entropy()
1.1680347446270396
New in version 0.4.0.
-
kappa_statistic()
Return κ statistic.
The κ statistic is defined as
\[\kappa = \frac{accuracy - random~ accuracy}{1 - random~ accuracy}\]
The κ statistic compares the performance of the classifier relative to the performance of a random classifier. \(\kappa\) = 0 indicates performance identical to random. \(\kappa\) = 1 indicates perfect predictive success. \(\kappa\) = -1 indicates perfect predictive failure.
- Returns
The κ statistic of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.kappa_statistic()
0.5344129554655871
New in version 0.1.0.
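The entry above does not spell out how the random accuracy is obtained. The sketch below assumes the usual chance-agreement term of Cohen's kappa, computed from the row and column totals of the table; under that assumption it reproduces the value in the example. The function name kappa is hypothetical.

```python
def kappa(tp, tn, fp, fn):
    """Kappa statistic from the four cells of a confusion table (sketch)."""
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    # Assumption: chance agreement from the marginal totals (Cohen's kappa).
    random_accuracy = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (accuracy - random_accuracy) / (1 - random_accuracy)


print(round(kappa(120, 60, 20, 30), 6))   # 0.534413
print(kappa(50, 50, 0, 0))                # perfect predictive success
print(kappa(0, 0, 50, 50))                # perfect predictive failure
```

The last two calls illustrate the boundary cases named in the description: 1 for a table with no errors and -1 for a balanced table with only errors.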
-
lift()
Return lift.
Implementation based on https://github.com/Magnetic/proficiency-metric
- Returns
The lift of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.lift()
1.3142857142857143
New in version 0.4.0.
-
markedness()
Return markedness.
Markedness is defined as
\[precision + npv - 1\]
AKA DeltaP
- Returns
The markedness of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.markedness()
0.5238095238095237
New in version 0.1.0.
-
mcc()
Return Matthews correlation coefficient (MCC).
The Matthews correlation coefficient is defined in [Mat75] as:
\[\frac{(tp \cdot tn) - (fp \cdot fn)}{\sqrt{(tp + fp)(tp + fn)(tn + fp)(tn + fn)}}\]
This is equivalent to the geometric mean of informedness and markedness, defined above.
Cf. https://en.wikipedia.org/wiki/Matthews_correlation_coefficient
- Returns
The Matthews correlation coefficient of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.mcc()
0.5367450401216932
New in version 0.1.0.
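The stated equivalence between MCC and the geometric mean of informedness and markedness can be checked numerically. The sketch below is a standalone illustration (the function name mcc here is a plain function, not the method), and it also evaluates the identity \(\chi^{2} = MCC^{2} \cdot n\) given in the significance() entry.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four cell counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator


tp, tn, fp, fn = 120, 60, 20, 30
informedness = tp / (tp + fn) + tn / (tn + fp) - 1   # recall + specificity - 1
markedness = tp / (tp + fp) + tn / (tn + fn) - 1     # precision + npv - 1

coefficient = mcc(tp, tn, fp, fn)
geometric = math.sqrt(informedness * markedness)
chi_squared = coefficient ** 2 * (tp + tn + fp + fn)

print(coefficient, geometric, chi_squared)
```

For this table both informedness and markedness are positive, so the square root is well defined; when both are negative the geometric mean matches MCC only in magnitude.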
-
mutual_information()
Return the mutual information.
Implementation based on https://github.com/Magnetic/proficiency-metric
Cf. https://en.wikipedia.org/wiki/Mutual_information
- Returns
The mutual information of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.mutual_information()
0.14738372372641576
New in version 0.4.0.
-
neg_likelihood_ratio()
Return negative likelihood ratio.
Negative likelihood ratio is defined as
\[\frac{1-recall}{specificity}\]
Cf. https://en.wikipedia.org/wiki/Likelihood_ratios_in_diagnostic_testing
- Returns
The negative likelihood ratio of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.neg_likelihood_ratio()
0.2666666666666666
New in version 0.4.0.
-
npv()
Return negative predictive value (NPV).
NPV is defined as
\[\frac{tn}{tn + fn}\]
AKA inverse precision
Cf. https://en.wikipedia.org/wiki/Negative_predictive_value
- Returns
The negative predictive value of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.npv()
0.6666666666666666
New in version 0.1.0.
-
phi_coefficient()
Return φ coefficient.
The \(\phi\) coefficient is defined as
\[\phi = \frac{tp \cdot tn - fp \cdot fn}{\sqrt{(tp + fp) \cdot (tp + fn) \cdot (tn + fp) \cdot (tn + fn)}}\]
- Returns
The φ coefficient of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.phi_coefficient()
0.5367450401216932
New in version 0.4.0.
-
population()
Return population, N.
- Returns
The population (N) of the confusion table
- Return type
int
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.population()
230
New in version 0.1.0.
-
pos_likelihood_ratio()
Return positive likelihood ratio.
Positive likelihood ratio is defined as
\[\frac{recall}{1-specificity}\]
Cf. https://en.wikipedia.org/wiki/Likelihood_ratios_in_diagnostic_testing
- Returns
The positive likelihood ratio of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pos_likelihood_ratio()
3.2
New in version 0.4.0.
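The two likelihood ratios and the diagnostic odds ratio are linked: the diagnostic odds ratio equals the positive likelihood ratio divided by the negative one. The sketch below, which uses a hypothetical likelihood_ratios helper rather than the class methods, checks this for the running example.

```python
def likelihood_ratios(tp, tn, fp, fn):
    """Return (positive, negative) likelihood ratios from the four cell counts."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = recall / (1 - specificity)
    lr_neg = (1 - recall) / specificity
    return lr_pos, lr_neg


lr_pos, lr_neg = likelihood_ratios(120, 60, 20, 30)
dor = (120 * 60) / (20 * 30)   # tp * tn / (fp * fn)

print(lr_pos, lr_neg, dor, lr_pos / lr_neg)
```

Up to floating-point rounding, the quotient of the two ratios equals the diagnostic odds ratio of 12.0 shown in the diagnostic_odds_ratio() example.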
-
pr_aghmean()
Return arithmetic-geometric-harmonic mean of precision & recall.
Iterates over arithmetic, geometric, & harmonic means until they converge to a single value (rounded to 12 digits), following the method described in [RaissouliLC09].
- Returns
The arithmetic-geometric-harmonic mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_aghmean()
0.8280786712108288
New in version 0.1.0.
-
pr_agmean()
Return arithmetic-geometric mean of precision & recall.
Iterates between arithmetic & geometric means until they converge to a single value (rounded to 12 digits).
Cf. https://en.wikipedia.org/wiki/Arithmetic-geometric_mean
- Returns
The arithmetic-geometric mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_agmean()
0.8283250315702829
New in version 0.1.0.
-
pr_amean()
Return arithmetic mean of precision & recall.
The arithmetic mean of precision and recall is defined as
\[\frac{precision + recall}{2}\]
Cf. https://en.wikipedia.org/wiki/Arithmetic_mean
- Returns
The arithmetic mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_amean()
0.8285714285714285
New in version 0.1.0.
-
pr_cmean()
Return contraharmonic mean of precision & recall.
The contraharmonic mean is
\[\frac{precision^{2} + recall^{2}}{precision + recall}\]
Cf. https://en.wikipedia.org/wiki/Contraharmonic_mean
- Returns
The contraharmonic mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_cmean()
0.8295566502463055
New in version 0.1.0.
-
pr_ghmean()
Return geometric-harmonic mean of precision & recall.
Iterates between geometric & harmonic means until they converge to a single value (rounded to 12 digits).
Cf. https://en.wikipedia.org/wiki/Geometric-harmonic_mean
- Returns
The geometric-harmonic mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_ghmean()
0.8278323841238441
New in version 0.1.0.
-
pr_gmean()
Return geometric mean of precision & recall.
The geometric mean of precision and recall is defined as:
\[\sqrt{precision \cdot recall}\]
Cf. https://en.wikipedia.org/wiki/Geometric_mean
- Returns
The geometric mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_gmean()
0.828078671210825
New in version 0.1.0.
-
pr_heronian_mean()
Return Heronian mean of precision & recall.
The Heronian mean of precision and recall is defined as
\[\frac{precision + \sqrt{precision \cdot recall} + recall}{3}\]
Cf. https://en.wikipedia.org/wiki/Heronian_mean
- Returns
The Heronian mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_heronian_mean()
0.8284071761178939
New in version 0.1.0.
-
pr_hmean()
Return harmonic mean of precision & recall.
The harmonic mean of precision and recall is defined as
\[\frac{2 \cdot precision \cdot recall}{precision + recall}\]
Cf. https://en.wikipedia.org/wiki/Harmonic_mean
- Returns
The harmonic mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_hmean()
0.8275862068965516
New in version 0.1.0.
-
pr_hoelder_mean(exp=2)
Return Hölder (power/generalized) mean of precision & recall.
The power mean of precision and recall is defined as
\[\sqrt[exp]{\frac{precision^{exp} + recall^{exp}}{2}}\]
for \(exp \ne 0\), and the geometric mean for \(exp = 0\)
Cf. https://en.wikipedia.org/wiki/Generalized_mean
- Parameters
exp (float) -- The exponent of the Hölder mean
- Returns
The Hölder mean for the given exponent of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_hoelder_mean()
0.8290638930598233
New in version 0.1.0.
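The generalized mean ties several of the other precision & recall means together: in its usual two-value form, exponent 1 gives the arithmetic mean, 2 the quadratic mean, -1 the harmonic mean, and the limit at 0 the geometric mean. The sketch below illustrates this with a hypothetical power_mean function; it is not the library's implementation.

```python
import math


def power_mean(x, y, exp=2):
    """Generalized (Hölder) mean of two positive numbers."""
    if exp == 0:
        return math.sqrt(x * y)   # limiting case: geometric mean
    return ((x ** exp + y ** exp) / 2) ** (1 / exp)


# Precision and recall for tp=120, tn=60, fp=20, fn=30.
p, r = 120 / 140, 120 / 150
print(power_mean(p, r, 1))    # arithmetic mean
print(power_mean(p, r, 2))    # quadratic mean
print(power_mean(p, r, -1))   # harmonic mean
print(power_mean(p, r, 0))    # geometric mean
```

Up to floating-point rounding, these four values match the pr_amean(), pr_qmean(), pr_hmean(), and pr_gmean() examples.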
-
pr_imean()
Return identric (exponential) mean of precision & recall.
The identric mean is: precision if precision = recall, otherwise
\[\frac{1}{e} \cdot \sqrt[precision - recall]{\frac{precision^{precision}}{recall^{recall}}}\]
Cf. https://en.wikipedia.org/wiki/Identric_mean
- Returns
The identric mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_imean()
0.8284071826325543
New in version 0.1.0.
-
pr_lehmer_mean(exp=2.0)
Return Lehmer mean of precision & recall.
The Lehmer mean is
\[\frac{precision^{exp} + recall^{exp}}{precision^{exp-1} + recall^{exp-1}}\]
Cf. https://en.wikipedia.org/wiki/Lehmer_mean
- Parameters
exp (float) -- The exponent of the Lehmer mean
- Returns
The Lehmer mean for the given exponent of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_lehmer_mean()
0.8295566502463055
New in version 0.1.0.
-
pr_lmean()
Return logarithmic mean of precision & recall.
The logarithmic mean is: 0 if either precision or recall is 0, the precision if they are equal, otherwise
\[\frac{precision - recall}{\ln(precision) - \ln(recall)}\]
Cf. https://en.wikipedia.org/wiki/Logarithmic_mean
- Returns
The logarithmic mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_lmean()
0.8282429171492667
New in version 0.1.0.
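The three cases in the definition of the logarithmic mean translate directly into code. The following is a minimal sketch with a hypothetical log_mean function, written from the description above rather than from the library source.

```python
import math


def log_mean(x, y):
    """Logarithmic mean of two non-negative numbers."""
    if x == 0 or y == 0:
        return 0.0            # defined as 0 when either value is 0
    if x == y:
        return x              # avoids 0 / 0 when the values coincide
    return (x - y) / (math.log(x) - math.log(y))


# Precision and recall for tp=120, tn=60, fp=20, fn=30.
p, r = 120 / 140, 120 / 150
print(log_mean(p, r))
```

For two distinct positive values the logarithmic mean lies between the geometric and arithmetic means, which is consistent with the pr_gmean(), pr_lmean(), and pr_amean() examples.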
-
pr_qmean()
Return quadratic mean of precision & recall.
The quadratic mean of precision and recall is defined as
\[\sqrt{\frac{precision^{2} + recall^{2}}{2}}\]
Cf. https://en.wikipedia.org/wiki/Quadratic_mean
- Returns
The quadratic mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_qmean()
0.8290638930598233
New in version 0.1.0.
-
pr_seiffert_mean()
Return Seiffert's mean of precision & recall.
Seiffert's mean of precision and recall is
\[\frac{precision - recall}{4 \cdot \arctan \sqrt{\frac{precision}{recall}} - \pi}\]
It is defined in [Sei93].
- Returns
Seiffert's mean of the confusion table's precision & recall
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pr_seiffert_mean()
0.8284071696048312
New in version 0.1.0.
-
precision()
Return precision.
Precision is defined as
\[\frac{tp}{tp + fp}\]
AKA positive predictive value (PPV)
Cf. https://en.wikipedia.org/wiki/Precision_and_recall
Cf. https://en.wikipedia.org/wiki/Information_retrieval#Precision
- Returns
The precision of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.precision()
0.8571428571428571
New in version 0.1.0.
-
precision_gain()
Return gain in precision.
The gain in precision is defined as
\[G(precision) = \frac{precision}{random~ precision}\]
Cf. https://en.wikipedia.org/wiki/Gain_(information_retrieval)
- Returns
The gain in precision of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.precision_gain()
1.3142857142857143
New in version 0.1.0.
-
pred_neg_pop()
Return predicted negative population.
- Returns
The predicted negative population of the confusion table
- Return type
int
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pred_neg_pop()
90
New in version 0.1.0.
Changed in version 0.4.0: renamed from test_neg_pop
-
pred_pos_pop()
Return predicted positive population.
- Returns
The predicted positive population of the confusion table
- Return type
int
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.pred_pos_pop()
140
New in version 0.1.0.
Changed in version 0.4.0: renamed from test_pos_pop
-
predicted_entropy()
Return the predicted entropy.
Implementation based on https://github.com/Magnetic/proficiency-metric
- Returns
The predicted entropy of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.predicted_entropy()
0.6693279632926457
New in version 0.4.0.
-
prevalence()
Return prevalence.
Prevalence is defined as
\[\frac{condition~ positive}{population}\]
Cf. https://en.wikipedia.org/wiki/Prevalence
- Returns
The prevalence of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.prevalence()
0.6521739130434783
New in version 0.4.0.
-
proficiency()
Return the proficiency.
Implementation based on https://github.com/Magnetic/proficiency-metric [SLaclavik15]
AKA uncertainty coefficient
Cf. https://en.wikipedia.org/wiki/Uncertainty_coefficient
- Returns
The proficiency of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.proficiency()
0.228116219897929
New in version 0.4.0.
-
recall()
Return recall.
Recall is defined as
\[\frac{tp}{tp + fn}\]
AKA sensitivity
AKA true positive rate (TPR)
Cf. https://en.wikipedia.org/wiki/Precision_and_recall
Cf. https://en.wikipedia.org/wiki/Sensitivity_(test)
Cf. https://en.wikipedia.org/wiki/Information_retrieval#Recall
- Returns
The recall of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.recall()
0.8
New in version 0.1.0.
-
significance()
Return the significance, \(\chi^{2}\).
Significance is defined as
\[\chi^{2} = \frac{(tp \cdot tn - fp \cdot fn)^{2} (tp + tn + fp + fn)}{(tp + fp)(tp + fn)(tn + fp)(tn + fn)}\]
Also: \(\chi^{2} = MCC^{2} \cdot n\)
Cf. https://en.wikipedia.org/wiki/Pearson%27s_chi-square_test
- Returns
The significance of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.significance()
66.26190476190476
New in version 0.1.0.
-
specificity()
Return specificity.
Specificity is defined as
\[\frac{tn}{tn + fp}\]
AKA true negative rate (TNR)
AKA inverse recall
Cf. https://en.wikipedia.org/wiki/Specificity_(tests)
- Returns
The specificity of the confusion table
- Return type
float
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.specificity()
0.75
New in version 0.1.0.
-
to_dict()
Cast to dict.
- Returns
The confusion table as a dict
- Return type
dict
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> import pprint
>>> pprint.pprint(ct.to_dict())
{'fn': 30, 'fp': 20, 'tn': 60, 'tp': 120}
New in version 0.1.0.
-
to_tuple()
Cast to tuple.
- Returns
The confusion table as a 4-tuple (tp, tn, fp, fn)
- Return type
tuple
Example
>>> ct = ConfusionTable(120, 60, 20, 30)
>>> ct.to_tuple()
(120, 60, 20, 30)
New in version 0.1.0.
-
abydos.stats.amean(nums)
Return arithmetic mean.
The arithmetic mean is defined as
\[\frac{\sum{nums}}{|nums|}\]
Cf. https://en.wikipedia.org/wiki/Arithmetic_mean
- Parameters
nums (list) -- A series of numbers
- Returns
The arithmetic mean of nums
- Return type
float
Examples
>>> amean([1, 2, 3, 4])
2.5
>>> amean([1, 2])
1.5
>>> amean([0, 5, 1000])
335.0
New in version 0.1.0.
-
abydos.stats.gmean(nums)
Return geometric mean.
The geometric mean is defined as
\[\sqrt[|nums|]{\prod\limits_{i} nums_{i}}\]
Cf. https://en.wikipedia.org/wiki/Geometric_mean
- Parameters
nums (list) -- A series of numbers
- Returns
The geometric mean of nums
- Return type
float
Examples
>>> gmean([1, 2, 3, 4])
2.213363839400643
>>> gmean([1, 2])
1.4142135623730951
>>> gmean([0, 5, 1000])
0.0
New in version 0.1.0.
-
abydos.stats.
hmean
(nums)[source]¶ Return harmonic mean.
The harmonic mean is defined as
\[\frac{|nums|}{\sum\limits_{i}\frac{1}{nums_i}}\]Following the behavior of Wolfram|Alpha: - If one of the values in nums is 0, return 0. - If more than one value in nums is 0, return NaN.
Cf. https://en.wikipedia.org/wiki/Harmonic_mean
- Parameters
nums (list) -- A series of numbers
- Returns
The harmonic mean of nums
- Return type
float
- Raises
ValueError -- hmean requires at least one value
Examples
>>> hmean([1, 2, 3, 4])
1.9200000000000004
>>> hmean([1, 2])
1.3333333333333333
>>> hmean([0, 5, 1000])
0
New in version 0.1.0.
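The definition and the zero-handling rules above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: harmonic_mean_sketch is a hypothetical name, and it follows only the two documented zero rules.

```python
import math


def harmonic_mean_sketch(nums):
    """Harmonic mean using the zero-handling convention described above."""
    if len(nums) < 1:
        raise ValueError('at least one value is required')
    zero_count = sum(1 for x in nums if x == 0)
    if zero_count == 1:
        return 0.0            # exactly one zero: the mean is taken to be 0
    if zero_count > 1:
        return float('nan')   # more than one zero: undefined
    # |nums| divided by the sum of reciprocals
    return len(nums) / sum(1 / x for x in nums)


print(harmonic_mean_sketch([1, 2, 3, 4]))
print(harmonic_mean_sketch([0, 5, 1000]))  # 0.0
```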
-
abydos.stats.agmean(nums, prec=12)
Return arithmetic-geometric mean.
Iterates between arithmetic & geometric means until they converge to a single value (rounded to prec digits, 12 by default).
Cf. https://en.wikipedia.org/wiki/Arithmetic-geometric_mean
- Parameters
nums (list) -- A series of numbers
prec (int) -- Digits of precision when testing convergence
- Returns
The arithmetic-geometric mean of nums
- Return type
float
Examples
>>> agmean([1, 2, 3, 4])
2.3545004777751077
>>> agmean([1, 2])
1.4567910310469068
>>> agmean([0, 5, 1000])
2.9753977059954195e-13
New in version 0.1.0.
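The iteration described above can be sketched as follows. This is an illustrative plain-Python version under stated assumptions, not the library's code: agm_sketch and max_iter are hypothetical names, inputs are assumed to be non-negative, and convergence is tested by rounding both means to prec digits.

```python
import math


def agm_sketch(nums, prec=12, max_iter=100):
    """Iterate arithmetic and geometric means until they agree to prec digits."""
    a = sum(nums) / len(nums)               # arithmetic mean of the inputs
    g = math.prod(nums) ** (1 / len(nums))  # geometric mean of the inputs
    for _ in range(max_iter):               # guard against non-convergence
        if round(a, prec) == round(g, prec):
            break
        # Replace the pair by its own arithmetic and geometric means.
        a, g = (a + g) / 2, math.sqrt(a * g)
    return a


print(agm_sketch([1, 2]))
```

Because the gap between the two means shrinks very quickly, a handful of iterations is normally enough for the default precision.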
-
abydos.stats.ghmean(nums, prec=12)
Return geometric-harmonic mean.
Iterates between geometric & harmonic means until they converge to a single value (rounded to prec digits, 12 by default).
Cf. https://en.wikipedia.org/wiki/Geometric-harmonic_mean
- Parameters
nums (list) -- A series of numbers
prec (int) -- Digits of precision when testing convergence
- Returns
The geometric-harmonic mean of nums
- Return type
float
Examples
>>> ghmean([1, 2, 3, 4])
2.058868154613003
>>> ghmean([1, 2])
1.3728805006183502
>>> ghmean([0, 5, 1000])
0.0
>>> ghmean([0, 0])
0.0
>>> ghmean([0, 0, 5])
nan
New in version 0.1.0.
-
abydos.stats.aghmean(nums, prec=12)
Return arithmetic-geometric-harmonic mean.
Iterates over arithmetic, geometric, & harmonic means until they converge to a single value (rounded to prec digits, 12 by default), following the method described in [RaissouliLC09].
- Parameters
nums (list) -- A series of numbers
prec (int) -- Digits of precision when testing convergence
- Returns
The arithmetic-geometric-harmonic mean of nums
- Return type
float
Examples
>>> aghmean([1, 2, 3, 4])
2.198327159900212
>>> aghmean([1, 2])
1.4142135623731884
>>> aghmean([0, 5, 1000])
335.0
New in version 0.1.0.
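A three-way iteration of this kind might look like the sketch below. It is an assumption-laden illustration rather than the library's implementation or the exact procedure of [RaissouliLC09]: agh_sketch and max_iter are hypothetical names, all inputs are assumed to be strictly positive, and the loop stops once the arithmetic and harmonic means (which bracket the geometric mean) agree to prec digits.

```python
import math


def agh_sketch(nums, prec=12, max_iter=200):
    """Iterate arithmetic, geometric and harmonic means until they agree."""
    n = len(nums)
    a = sum(nums) / n                  # arithmetic mean
    g = math.prod(nums) ** (1 / n)     # geometric mean
    h = n / sum(1 / x for x in nums)   # harmonic mean
    for _ in range(max_iter):          # guard against non-convergence
        # h <= g <= a always holds, so a == h implies all three agree.
        if round(a, prec) == round(h, prec):
            break
        a, g, h = (
            (a + g + h) / 3,
            (a * g * h) ** (1 / 3),
            3 / (1 / a + 1 / g + 1 / h),
        )
    return g


print(agh_sketch([1, 2]))
```

For two positive inputs the product of the arithmetic and harmonic means equals the square of the geometric mean, so the result stays at the geometric mean of the pair.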
-
abydos.stats.cmean(nums)
Return contraharmonic mean.
The contraharmonic mean is
\[\frac{\sum\limits_i x_i^2}{\sum\limits_i x_i}\]
Cf. https://en.wikipedia.org/wiki/Contraharmonic_mean
- Parameters
nums (list) -- A series of numbers
- Returns
The contraharmonic mean of nums
- Return type
float
Examples
>>> cmean([1, 2, 3, 4])
3.0
>>> cmean([1, 2])
1.6666666666666667
>>> cmean([0, 5, 1000])
995.0497512437811
New in version 0.1.0.
-
abydos.stats.imean(nums)
Return identric (exponential) mean.
The identric mean of two numbers x and y is x if x = y; otherwise it is
\[\frac{1}{e} \sqrt[x-y]{\frac{x^x}{y^y}}\]
Cf. https://en.wikipedia.org/wiki/Identric_mean
- Parameters
nums (list) -- A series of numbers
- Returns
The identric mean of nums
- Return type
float
- Raises
ValueError -- imean supports no more than two values
Examples
>>> imean([1, 2])
1.4715177646857693
>>> imean([1, 0])
nan
>>> imean([2, 4])
2.9430355293715387
New in version 0.1.0.
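The two-value formula above translates directly into Python. This sketch assumes both inputs are strictly positive and does not reproduce the library's handling of zeros (such as the nan result shown for [1, 0]); identric_mean_sketch is a hypothetical name.

```python
import math


def identric_mean_sketch(x, y):
    """Identric mean of two positive numbers, per the formula above."""
    if x == y:
        return x
    # (1/e) * (x^x / y^y)^(1 / (x - y))
    return (1 / math.e) * (x ** x / y ** y) ** (1 / (x - y))


print(identric_mean_sketch(1, 2))  # equals 4 / e
```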
-
abydos.stats.lmean(nums)
Return logarithmic mean.
The logarithmic mean of an arbitrarily long series is defined by http://www.survo.fi/papers/logmean.pdf as
\[\begin{split}L(x_1, x_2, \ldots, x_n) = (n-1)! \sum\limits_{i=1}^n \frac{x_i} {\prod\limits_{\substack{j = 1\\j \ne i}}^n \ln \frac{x_i}{x_j}}\end{split}\]
Cf. https://en.wikipedia.org/wiki/Logarithmic_mean
- Parameters
nums (list) -- A series of numbers
- Returns
The logarithmic mean of nums
- Return type
float
- Raises
ValueError -- No two values in the nums list may be equal
Examples
>>> lmean([1, 2, 3, 4])
2.2724242417489258
>>> lmean([1, 2])
1.4426950408889634
New in version 0.1.0.
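The n-value formula above can be evaluated term by term. The following is a sketch under the assumption that all inputs are positive and pairwise distinct; log_mean_sketch is a hypothetical name, not a library function.

```python
import math


def log_mean_sketch(nums):
    """Logarithmic mean of distinct positive numbers, per the formula above."""
    if len(set(nums)) != len(nums):
        raise ValueError('No two values in the nums list may be equal')
    n = len(nums)
    total = 0.0
    for i, x_i in enumerate(nums):
        # Product of ln(x_i / x_j) over all j != i
        denom = 1.0
        for j, x_j in enumerate(nums):
            if j != i:
                denom *= math.log(x_i / x_j)
        total += x_i / denom
    return math.factorial(n - 1) * total


print(log_mean_sketch([1, 2]))  # equals 1 / ln(2)
```

For two values the sum collapses to the familiar form (x - y) / (ln x - ln y).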
-
abydos.stats.qmean(nums)
Return quadratic mean.
The quadratic mean is defined as
\[\sqrt{\sum\limits_{i} \frac{nums_i^2}{|nums|}}\]
Cf. https://en.wikipedia.org/wiki/Quadratic_mean
- Parameters
nums (list) -- A series of numbers
- Returns
The quadratic mean of nums
- Return type
float
Examples
>>> qmean([1, 2, 3, 4])
2.7386127875258306
>>> qmean([1, 2])
1.5811388300841898
>>> qmean([0, 5, 1000])
577.3574860228857
New in version 0.1.0.
-
abydos.stats.heronian_mean(nums)
Return Heronian mean.
The Heronian mean is:
\[\frac{\sum\limits_{i, j}\sqrt{{x_i \cdot x_j}}} {|nums| \cdot \frac{|nums| + 1}{2}}\]
for \(j \ge i\)
Cf. https://en.wikipedia.org/wiki/Heronian_mean
- Parameters
nums (list) -- A series of numbers
- Returns
The Heronian mean of nums
- Return type
float
Examples
>>> heronian_mean([1, 2, 3, 4])
2.3888282852609093
>>> heronian_mean([1, 2])
1.4714045207910316
>>> heronian_mean([0, 5, 1000])
179.28511301977582
New in version 0.1.0.
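The double sum over pairs with j ≥ i (which includes each value paired with itself) can be written out as below. This is a plain-Python sketch for non-negative inputs; heronian_mean_sketch is a hypothetical name.

```python
import math


def heronian_mean_sketch(nums):
    """Heronian mean: sum of sqrt(x_i * x_j) over j >= i, divided by n(n+1)/2."""
    n = len(nums)
    total = 0.0
    for i in range(n):
        for j in range(i, n):  # j >= i, so the diagonal terms contribute x_i
            total += math.sqrt(nums[i] * nums[j])
    return total / (n * (n + 1) / 2)


print(heronian_mean_sketch([1, 2]))  # (1 + sqrt(2) + 2) / 3
```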
-
abydos.stats.hoelder_mean(nums, exp=2)
Return Hölder (power/generalized) mean.
The Hölder mean is defined as:
\[\sqrt[p]{\frac{1}{|nums|} \cdot \sum\limits_i{x_i^p}}\]
for \(p \ne 0\), and the geometric mean for \(p = 0\)
Cf. https://en.wikipedia.org/wiki/Generalized_mean
- Parameters
nums (list) -- A series of numbers
exp (numeric) -- The exponent of the Hölder mean
- Returns
The Hölder mean of nums for the given exponent
- Return type
float
Examples
>>> hoelder_mean([1, 2, 3, 4])
2.7386127875258306
>>> hoelder_mean([1, 2])
1.5811388300841898
>>> hoelder_mean([0, 5, 1000])
577.3574860228857
New in version 0.1.0.
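The definition above makes several other means special cases of the exponent: 2 gives the quadratic mean (which is why the default results match the qmean examples), 1 the arithmetic mean, -1 the harmonic mean, and 0 the geometric mean. A minimal sketch, assuming strictly positive inputs and using the hypothetical name power_mean_sketch:

```python
import math


def power_mean_sketch(nums, exp=2):
    """Hölder (power) mean of positive numbers for a given exponent."""
    n = len(nums)
    if exp == 0:
        # Limit case: geometric mean, computed via logarithms.
        return math.exp(sum(math.log(x) for x in nums) / n)
    return (sum(x ** exp for x in nums) / n) ** (1 / exp)


values = [1, 2, 3, 4]
for p in (2, 1, 0, -1):
    print(p, power_mean_sketch(values, p))
```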
-
abydos.stats.lehmer_mean(nums, exp=2)
Return Lehmer mean.
The Lehmer mean is
\[\frac{\sum\limits_i{x_i^p}}{\sum\limits_i{x_i^{p-1}}}\]
Cf. https://en.wikipedia.org/wiki/Lehmer_mean
- Parameters
nums (list) -- A series of numbers
exp (numeric) -- The exponent of the Lehmer mean
- Returns
The Lehmer mean of nums for the given exponent
- Return type
float
Examples
>>> lehmer_mean([1, 2, 3, 4])
3.0
>>> lehmer_mean([1, 2])
1.6666666666666667
>>> lehmer_mean([0, 5, 1000])
995.0497512437811
New in version 0.1.0.
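With the default exponent of 2 the Lehmer formula reduces to the contraharmonic mean, which is why the examples above match those of cmean; with an exponent of 1 it reduces to the arithmetic mean. A small sketch of the formula, using the hypothetical name lehmer_mean_sketch and assuming non-zero inputs:

```python
def lehmer_mean_sketch(nums, exp=2):
    """Lehmer mean: sum of x^p divided by sum of x^(p-1)."""
    return sum(x ** exp for x in nums) / sum(x ** (exp - 1) for x in nums)


print(lehmer_mean_sketch([1, 2, 3, 4]))         # 3.0 (contraharmonic mean)
print(lehmer_mean_sketch([1, 2, 3, 4], exp=1))  # 2.5 (arithmetic mean)
```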
-
abydos.stats.seiffert_mean(nums)
Return Seiffert's mean.
Seiffert's mean of two numbers x and y is
\[\frac{x - y}{4 \cdot \arctan \sqrt{\frac{x}{y}} - \pi}\]
It is defined in [Sei93].
- Parameters
nums (list) -- A series of numbers
- Returns
Seiffert's mean of nums
- Return type
float
- Raises
ValueError -- seiffert_mean supports no more than two values
Examples
>>> seiffert_mean([1, 2])
1.4712939827611637
>>> seiffert_mean([1, 0])
0.3183098861837907
>>> seiffert_mean([2, 4])
2.9425879655223275
>>> seiffert_mean([2, 1000])
336.84053300118825
New in version 0.1.0.
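The two-value formula above can be evaluated directly. This sketch assumes x ≥ 0 and y > 0, so it does not cover the [1, 0] example, and it returns x when x = y (the limit of the formula, added here as an assumption); seiffert_mean_sketch is a hypothetical name and the library may compute the value differently.

```python
import math


def seiffert_mean_sketch(x, y):
    """Seiffert's mean of two numbers, per the arctan formula above."""
    if x == y:
        return x  # assumption: use the limiting value when the inputs coincide
    return (x - y) / (4 * math.atan(math.sqrt(x / y)) - math.pi)


print(seiffert_mean_sketch(1, 2))
```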
-
abydos.stats.median(nums)
Return median.
With numbers sorted by value, the median is the middle value (if there is an odd number of values) or the arithmetic mean of the two middle values (if there is an even number of values).
Cf. https://en.wikipedia.org/wiki/Median
- Parameters
nums (list) -- A series of numbers
- Returns
The median of nums
- Return type
int or float
Examples
>>> median([1, 2, 3])
2
>>> median([1, 2, 3, 4])
2.5
>>> median([1, 2, 2, 4])
2
New in version 0.1.0.
-
abydos.stats.midrange(nums)
Return midrange.
The midrange is the arithmetic mean of the maximum & minimum of a series.
Cf. https://en.wikipedia.org/wiki/Midrange
- Parameters
nums (list) -- A series of numbers
- Returns
The midrange of nums
- Return type
float
Examples
>>> midrange([1, 2, 3])
2.0
>>> midrange([1, 2, 2, 3])
2.0
>>> midrange([1, 2, 1000, 3])
500.5
New in version 0.1.0.
-
abydos.stats.mode(nums)
Return the mode.
The mode of a series is the most common element of that series.
Cf. https://en.wikipedia.org/wiki/Mode_(statistics)
- Parameters
nums (list) -- A series of numbers
- Returns
The mode of nums
- Return type
int or float
Example
>>> mode([1, 2, 2, 3])
2
New in version 0.1.0.
-
abydos.stats.std(nums, mean_func=<function amean>, ddof=0)
Return the standard deviation.
The standard deviation of a series of values is the square root of the variance.
Cf. https://en.wikipedia.org/wiki/Standard_deviation
- Parameters
nums (list) -- A series of numbers
mean_func (function) -- A mean function (amean by default)
ddof (int) -- The delta degrees of freedom, subtracted from the number of values to give the divisor (0 by default)
- Returns
The standard deviation of the values in the series
- Return type
float
Examples
>>> std([1, 1, 1, 1])
0.0
>>> round(std([1, 2, 3, 4]), 12)
1.11803398875
>>> round(std([1, 2, 3, 4], ddof=1), 12)
1.290994448736
New in version 0.3.0.
-
abydos.stats.var(nums, mean_func=<function amean>, ddof=0)
Calculate the variance.
The variance (\(\sigma^2\)) of a series of numbers (\(x_i\)) with mean \(\mu\) and population \(N\) is:
\[\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2\]
Cf. https://en.wikipedia.org/wiki/Variance
- Parameters
nums (list) -- A series of numbers
mean_func (function) -- A mean function (amean by default)
ddof (int) -- The delta degrees of freedom, subtracted from \(N\) to give the divisor (0 by default)
- Returns
The variance of the values in the series
- Return type
float
Examples
>>> var([1, 1, 1, 1])
0.0
>>> var([1, 2, 3, 4])
1.25
>>> round(var([1, 2, 3, 4], ddof=1), 12)
1.666666666667
New in version 0.3.0.
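The examples above are consistent with ddof being subtracted from N in the divisor, so that ddof=0 gives the population variance and ddof=1 the sample variance. The sketch below works under that assumption, always uses the arithmetic mean, and uses the hypothetical names variance_sketch and std_sketch rather than the library's functions.

```python
import math


def variance_sketch(nums, ddof=0):
    """Sum of squared deviations from the mean, divided by (N - ddof)."""
    mean = sum(nums) / len(nums)
    return sum((x - mean) ** 2 for x in nums) / (len(nums) - ddof)


def std_sketch(nums, ddof=0):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance_sketch(nums, ddof))


print(variance_sketch([1, 2, 3, 4]))  # 1.25
print(round(std_sketch([1, 2, 3, 4], ddof=1), 12))
```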
-
abydos.stats.mean_pairwise_similarity(collection, metric=<function sim_levenshtein>, mean_func=<function hmean>, symmetric=False)
Calculate the mean pairwise similarity of a collection of strings.
Takes the mean of the pairwise similarity between each member of a collection, optionally in both directions (for asymmetric similarity metrics).
- Parameters
collection (list) -- A collection of terms or a string that can be split
metric (function) -- A similarity metric function
mean_func (function) -- A mean function that takes a list of values and returns a float
symmetric (bool) -- Set to True if all pairwise similarities should be calculated in both directions
- Returns
The mean pairwise similarity of a collection of strings
- Return type
float
- Raises
ValueError -- mean_func must be a function
ValueError -- metric must be a function
ValueError -- collection is neither a string nor iterable type
ValueError -- collection has fewer than two members
Examples
>>> round(mean_pairwise_similarity(['Christopher', 'Kristof',
... 'Christobal']), 12)
0.519801980198
>>> round(mean_pairwise_similarity(['Niall', 'Neal', 'Neil']), 12)
0.545454545455
New in version 0.1.0.
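The pairing logic described above can be sketched without the library. Everything here is illustrative: mean_pairwise_similarity_sketch and the toy metric shared_prefix_ratio are hypothetical names, the standard library's statistics.harmonic_mean stands in for hmean, and the toy metric stands in for sim_levenshtein.

```python
from itertools import combinations
from statistics import harmonic_mean


def shared_prefix_ratio(a, b):
    """Hypothetical toy metric: shared-prefix length over the longer length."""
    shared = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            break
        shared += 1
    return shared / max(len(a), len(b))


def mean_pairwise_similarity_sketch(collection, metric,
                                    mean_func=harmonic_mean, symmetric=False):
    """Mean of metric(a, b) over every unordered pair in the collection."""
    # A string is split on whitespace; any other iterable is used as-is.
    items = collection.split() if isinstance(collection, str) else list(collection)
    if len(items) < 2:
        raise ValueError('collection has fewer than two members')
    values = []
    for a, b in combinations(items, 2):
        values.append(metric(a, b))
        if symmetric:
            values.append(metric(b, a))  # also score the reverse direction
    return mean_func(values)


print(mean_pairwise_similarity_sketch(['Niall', 'Neal', 'Neil'], shared_prefix_ratio))
```

The symmetric flag only matters when metric(a, b) can differ from metric(b, a); for a symmetric metric it doubles every value and leaves the mean unchanged.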
-
abydos.stats.pairwise_similarity_statistics(src_collection, tar_collection, metric=<function sim_levenshtein>, mean_func=<function amean>, symmetric=False)
Calculate the pairwise similarity statistics of two collections of strings.
Calculate pairwise similarities among members of two collections, returning the maximum, minimum, mean (according to a supplied function, arithmetic mean, by default), and (population) standard deviation of those similarities.
- Parameters
src_collection (list) -- A collection of terms or a string that can be split
tar_collection (list) -- A collection of terms or a string that can be split
metric (function) -- A similarity metric function
mean_func (function) -- A mean function that takes a list of values and returns a float
symmetric (bool) -- Set to True if all pairwise similarities should be calculated in both directions
- Returns
The max, min, mean, and standard deviation of similarities
- Return type
tuple
- Raises
ValueError -- mean_func must be a function
ValueError -- metric must be a function
ValueError -- src_collection is neither a string nor iterable
ValueError -- tar_collection is neither a string nor iterable
Example
>>> tuple(round(_, 12) for _ in pairwise_similarity_statistics(
... ['Christopher', 'Kristof', 'Christobal'], ['Niall', 'Neal', 'Neil']))
(0.2, 0.0, 0.118614718615, 0.075070477184)
New in version 0.3.0.
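A rough sketch of the same idea without the library follows. The names pairwise_similarity_statistics_sketch and length_ratio are hypothetical, the toy metric stands in for sim_levenshtein, and the population standard deviation is taken from statistics.pstdev, which always centers on the arithmetic mean (a simplification if a different mean_func is supplied).

```python
from itertools import product
from statistics import mean, pstdev


def length_ratio(a, b):
    """Hypothetical toy metric: shorter length divided by longer length."""
    return min(len(a), len(b)) / max(len(a), len(b))


def pairwise_similarity_statistics_sketch(src, tar, metric,
                                          mean_func=mean, symmetric=False):
    """Return (max, min, mean, population std) of metric over src x tar."""
    values = []
    for s, t in product(src, tar):      # every source/target pairing
        values.append(metric(s, t))
        if symmetric:
            values.append(metric(t, s))  # also score the reverse direction
    return max(values), min(values), mean_func(values), pstdev(values)


stats = pairwise_similarity_statistics_sketch(
    ['ab', 'abcd'], ['abcd', 'abcdefgh'], length_ratio)
print(stats)
```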