diff --git a/docs/sources/CHANGELOG.md b/docs/sources/CHANGELOG.md
index c0fa2b804..d9d5b5167 100755
--- a/docs/sources/CHANGELOG.md
+++ b/docs/sources/CHANGELOG.md
@@ -21,6 +21,7 @@ The CHANGELOG for the current development version is available at
- The `mlxtend.evaluate.bootstrap_point632_score` now supports `fit_params`. ([#861](https://github.com/rasbt/mlxtend/pull/861))
- The `mlxtend/plotting/decision_regions.py` function now has a `contourf_kwargs` for matplotlib to change the look of the decision boundaries if desired. ([#881](https://github.com/rasbt/mlxtend/pull/881) via [[pbloem](https://github.com/pbloem)])
+- The new `mlxtend.frequent_patterns.metrics` module provides the **Kulczynski measure** and the **imbalance ratio** as `kulczynski_measure` and `imbalance_ratio`. ([#840](https://github.com/rasbt/mlxtend/issues/840))
##### Changes
diff --git a/docs/sources/user_guide/frequent_patterns/metrics.ipynb b/docs/sources/user_guide/frequent_patterns/metrics.ipynb
new file mode 100644
index 000000000..838774e6a
--- /dev/null
+++ b/docs/sources/user_guide/frequent_patterns/metrics.ipynb
@@ -0,0 +1,389 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Evaluating quality of Association Rules"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Overview"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A strong association rule may or may not be interesting for a specific application. Some measures have been developed to help evaluate association rules. `mlxtend` implements two such measures, Kulczynski Measure and Imbalance Ratio."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Kulczynski Measure:\n",
+ "\n",
+ "The Kulczynski measure $K_{A,B}$ can be interpreted as the average between the confidence that $A ⇒ B$ and the confidence that $B ⇒ A$\n",
+ "\n",
+ "The Kulczynski measure $K_{A,B} ∈ [0, 1]$ of the itemsets $A ⊆ I$ and\n",
+ "$B ⊆ I$ such that $A ∩ B = \\varnothing$ is given by\n",
+ "\n",
+ "$$K_{A,B} = \\frac{V_{A⇒B} + V_{B⇒A}}{2}$$\n",
+ "\n",
+ "$$K_{A,B} = \\frac{1}{2} \\Bigg[\\frac{sup(A \\cup B)}{sup(A)} + \\frac{sup(A \\cup B)}{sup(B)} \\Bigg]$$\n",
+ "\n",
+ "- If $K_{A,B} = 0$, then $A ⊆ T$ implies that $B \\nsubseteq T$ for any transaction $T$\n",
+ "- If $K_{A,B} = 1$, then $A ⊆ T$ implies that $B ⊆ T$ for any transaction $T$\n",
+ "- Note that the Kulczynski measure is symmetric: $K_{A,B} = K_{B,A}$"
+ ]
+ },
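+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a quick numerical illustration (with made-up support values, unrelated to the dataset used below): if $sup(A \\cup B) = 0.4$, $sup(A) = 0.5$, and $sup(B) = 0.8$, then $K_{A,B} = \\frac{1}{2}\\Big(\\frac{0.4}{0.5} + \\frac{0.4}{0.8}\\Big) = \\frac{0.8 + 0.5}{2} = 0.65$."
+ ]
+ },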
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Imbalance Ratio:\n",
+ "The imbalance ratio $I_{A,B}$ can be interpreted as the ratio between the absolute difference between the support count of $A$ and the support count of $B$ and the number of transactions that contain $A$, $B$, or both $A$ and $B$\n",
+ "- The imbalance ratio $I_{A,B} ∈ [0, 1]$ of the itemsets $A ⊆ I$ and $B ⊆ I$ is given by\n",
+ "\n",
+ "$$I_{A,B} =\\frac{|N_A − N_B|}{N_A + N_B − N_{A∪B}}$$\n",
+ "- If $I_{A,B} = 0$, then $A$ and $B$ have the same support\n",
+ "- If $I_{A,B} = 1$, then either $A$ or $B$ has zero support\n",
+ "- Note that the imbalance ratio is symmetric: $I_{A,B} = I_{B,A}$"
+ ]
+ },
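+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a quick numerical illustration (again with made-up counts): in a database of 10 transactions with $N_A = 5$, $N_B = 8$, and $N_{A \\cup B} = 4$, the imbalance ratio is $I_{A,B} = \\frac{|5 - 8|}{5 + 8 - 4} = \\frac{3}{9} = \\frac{1}{3}$."
+ ]
+ },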
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## References\n",
+ "\n",
+ "[1] Chapter 6 of J. Han, M. Kamber, J. Pei, “Data Mining: Concepts and Techniques”, 3rd edition, Elsevier/Morgan Kaufmann, 2012"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Example 1 -- Evaluate Kulczynski Measure of an Association rule:\n"
+ ]
+ },
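+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In the cell below we encode a small transaction dataset with `TransactionEncoder`, mine frequent itemsets with `apriori` (`min_support=0.6`), and generate strong rules with `association_rules` (minimum confidence of 0.7). The resulting rules DataFrame is then passed to `kulczynski_measure`."
+ ]
+ },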
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "
\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " | \n",
+ " antecedents | \n",
+ " consequents | \n",
+ " antecedent support | \n",
+ " consequent support | \n",
+ " support | \n",
+ " confidence | \n",
+ " lift | \n",
+ " leverage | \n",
+ " conviction | \n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " 0 | \n",
+ " (Eggs) | \n",
+ " (Kidney Beans) | \n",
+ " 0.8 | \n",
+ " 1.0 | \n",
+ " 0.8 | \n",
+ " 1.00 | \n",
+ " 1.00 | \n",
+ " 0.00 | \n",
+ " inf | \n",
+ "
\n",
+ " \n",
+ " 1 | \n",
+ " (Kidney Beans) | \n",
+ " (Eggs) | \n",
+ " 1.0 | \n",
+ " 0.8 | \n",
+ " 0.8 | \n",
+ " 0.80 | \n",
+ " 1.00 | \n",
+ " 0.00 | \n",
+ " 1.0 | \n",
+ "
\n",
+ " \n",
+ " 2 | \n",
+ " (Eggs) | \n",
+ " (Onion) | \n",
+ " 0.8 | \n",
+ " 0.6 | \n",
+ " 0.6 | \n",
+ " 0.75 | \n",
+ " 1.25 | \n",
+ " 0.12 | \n",
+ " 1.6 | \n",
+ "
\n",
+ " \n",
+ " 3 | \n",
+ " (Onion) | \n",
+ " (Eggs) | \n",
+ " 0.6 | \n",
+ " 0.8 | \n",
+ " 0.6 | \n",
+ " 1.00 | \n",
+ " 1.25 | \n",
+ " 0.12 | \n",
+ " inf | \n",
+ "
\n",
+ " \n",
+ " 4 | \n",
+ " (Milk) | \n",
+ " (Kidney Beans) | \n",
+ " 0.6 | \n",
+ " 1.0 | \n",
+ " 0.6 | \n",
+ " 1.00 | \n",
+ " 1.00 | \n",
+ " 0.00 | \n",
+ " inf | \n",
+ "
\n",
+ " \n",
+ " 5 | \n",
+ " (Onion) | \n",
+ " (Kidney Beans) | \n",
+ " 0.6 | \n",
+ " 1.0 | \n",
+ " 0.6 | \n",
+ " 1.00 | \n",
+ " 1.00 | \n",
+ " 0.00 | \n",
+ " inf | \n",
+ "
\n",
+ " \n",
+ " 6 | \n",
+ " (Yogurt) | \n",
+ " (Kidney Beans) | \n",
+ " 0.6 | \n",
+ " 1.0 | \n",
+ " 0.6 | \n",
+ " 1.00 | \n",
+ " 1.00 | \n",
+ " 0.00 | \n",
+ " inf | \n",
+ "
\n",
+ " \n",
+ " 7 | \n",
+ " (Eggs, Onion) | \n",
+ " (Kidney Beans) | \n",
+ " 0.6 | \n",
+ " 1.0 | \n",
+ " 0.6 | \n",
+ " 1.00 | \n",
+ " 1.00 | \n",
+ " 0.00 | \n",
+ " inf | \n",
+ "
\n",
+ " \n",
+ " 8 | \n",
+ " (Eggs, Kidney Beans) | \n",
+ " (Onion) | \n",
+ " 0.8 | \n",
+ " 0.6 | \n",
+ " 0.6 | \n",
+ " 0.75 | \n",
+ " 1.25 | \n",
+ " 0.12 | \n",
+ " 1.6 | \n",
+ "
\n",
+ " \n",
+ " 9 | \n",
+ " (Onion, Kidney Beans) | \n",
+ " (Eggs) | \n",
+ " 0.6 | \n",
+ " 0.8 | \n",
+ " 0.6 | \n",
+ " 1.00 | \n",
+ " 1.25 | \n",
+ " 0.12 | \n",
+ " inf | \n",
+ "
\n",
+ " \n",
+ " 10 | \n",
+ " (Eggs) | \n",
+ " (Onion, Kidney Beans) | \n",
+ " 0.8 | \n",
+ " 0.6 | \n",
+ " 0.6 | \n",
+ " 0.75 | \n",
+ " 1.25 | \n",
+ " 0.12 | \n",
+ " 1.6 | \n",
+ "
\n",
+ " \n",
+ " 11 | \n",
+ " (Onion) | \n",
+ " (Eggs, Kidney Beans) | \n",
+ " 0.6 | \n",
+ " 0.8 | \n",
+ " 0.6 | \n",
+ " 1.00 | \n",
+ " 1.25 | \n",
+ " 0.12 | \n",
+ " inf | \n",
+ "
\n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " antecedents consequents antecedent support \\\n",
+ "0 (Eggs) (Kidney Beans) 0.8 \n",
+ "1 (Kidney Beans) (Eggs) 1.0 \n",
+ "2 (Eggs) (Onion) 0.8 \n",
+ "3 (Onion) (Eggs) 0.6 \n",
+ "4 (Milk) (Kidney Beans) 0.6 \n",
+ "5 (Onion) (Kidney Beans) 0.6 \n",
+ "6 (Yogurt) (Kidney Beans) 0.6 \n",
+ "7 (Eggs, Onion) (Kidney Beans) 0.6 \n",
+ "8 (Eggs, Kidney Beans) (Onion) 0.8 \n",
+ "9 (Onion, Kidney Beans) (Eggs) 0.6 \n",
+ "10 (Eggs) (Onion, Kidney Beans) 0.8 \n",
+ "11 (Onion) (Eggs, Kidney Beans) 0.6 \n",
+ "\n",
+ " consequent support support confidence lift leverage conviction \n",
+ "0 1.0 0.8 1.00 1.00 0.00 inf \n",
+ "1 0.8 0.8 0.80 1.00 0.00 1.0 \n",
+ "2 0.6 0.6 0.75 1.25 0.12 1.6 \n",
+ "3 0.8 0.6 1.00 1.25 0.12 inf \n",
+ "4 1.0 0.6 1.00 1.00 0.00 inf \n",
+ "5 1.0 0.6 1.00 1.00 0.00 inf \n",
+ "6 1.0 0.6 1.00 1.00 0.00 inf \n",
+ "7 1.0 0.6 1.00 1.00 0.00 inf \n",
+ "8 0.6 0.6 0.75 1.25 0.12 1.6 \n",
+ "9 0.8 0.6 1.00 1.25 0.12 inf \n",
+ "10 0.6 0.6 0.75 1.25 0.12 1.6 \n",
+ "11 0.8 0.6 1.00 1.25 0.12 inf "
+ ]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import pandas as pd\n",
+ "from mlxtend.preprocessing import TransactionEncoder\n",
+ "from mlxtend.frequent_patterns import apriori, association_rules\n",
+ "from mlxtend.frequent_patterns import metrics\n",
+ "\n",
+ "dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],\n",
+ " ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],\n",
+ " ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],\n",
+ " ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],\n",
+ " ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]\n",
+ "\n",
+ "te = TransactionEncoder()\n",
+ "te_ary = te.fit_transform(dataset)\n",
+ "df = pd.DataFrame(te_ary, columns=te.columns_)\n",
+ "freq_items = apriori(df, min_support=0.6, use_colnames=True)\n",
+ "rules = association_rules(freq_items, metric=\"confidence\", min_threshold=0.7)\n",
+ "rules"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.875"
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "a = frozenset(['Onion'])\n",
+ "b = frozenset(['Kidney Beans', 'Eggs'])\n",
+ "metrics.kulczynski_measure(rules, a, b)"
+ ]
+ },
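+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The value 0.875 is simply the average of the two confidences reported in the rules table above: rule 11, (Onion) ⇒ (Eggs, Kidney Beans), has confidence 1.00, and rule 8, (Eggs, Kidney Beans) ⇒ (Onion), has confidence 0.75, so $K = (1.00 + 0.75)/2 = 0.875$."
+ ]
+ },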
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Example 2 -- Evaluate Imabalance Ratio of an Association rule:"
+ ]
+ },
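+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Note that `imbalance_ratio` operates on the frequent-itemsets DataFrame (`freq_items`) rather than on the rules DataFrame. For the itemsets used below, the support of {Onion} is 0.6, the support of {Kidney Beans, Eggs} is 0.8, and the support of {Onion, Kidney Beans, Eggs} is 0.6, so we expect $\\frac{|0.6 - 0.8|}{0.6 + 0.8 - 0.6} = \\frac{0.2}{0.8} = 0.25$, up to floating-point rounding."
+ ]
+ },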
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.2500000000000001"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "a = frozenset(['Onion'])\n",
+ "b = frozenset(['Kidney Beans', 'Eggs'])\n",
+ "metrics.imbalance_ratio(freq_items, a, b)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.2"
+ },
+ "orig_nbformat": 4
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/mlxtend/frequent_patterns/metrics.py b/mlxtend/frequent_patterns/metrics.py
new file mode 100644
index 000000000..74427a940
--- /dev/null
+++ b/mlxtend/frequent_patterns/metrics.py
@@ -0,0 +1,109 @@
+# mlxtend Machine Learning Library Extensions
+#
+# Functions to measure quality of association rules
+#
+# Author: Mohammed Niyas
+#
+# License: BSD 3 clause
+
+
+def kulczynski_measure(df, antecedent, consequent):
+ """Calculates the Kulczynski measure for a given rule.
+
+ Parameters
+ -----------
+ df : pandas DataFrame
+ pandas DataFrame of association rules
+ with columns ['antecedents', 'consequents', 'confidence']
+
+ antecedent : set or frozenset
+ Antecedent of the rule
+ consequent : set or frozenset
+ Consequent of the rule
+
+ Returns
+ ----------
+    The Kulczynski measure,
+    K(A,C) = (confidence(A->C) + confidence(C->A)) / 2, in the range [0, 1].
+ """
+ if not df.shape[0]:
+        raise ValueError('The input DataFrame `df` containing '
+                         'the association rules is empty.')
+
+ # check for mandatory columns
+ required_columns = ["antecedents", "consequents", "confidence"]
+ if not all(col in df.columns for col in required_columns):
+ raise ValueError(
+ "Dataframe needs to contain the\
+ columns 'antecedents', 'consequents' and 'confidence'"
+ )
+
+ # get confidence of antecedent to consequent rule
+ a_to_c = df[
+ (df["antecedents"] == antecedent) & (df["consequents"] == consequent)]
+ try:
+ a_to_c_confidence = a_to_c["confidence"].iloc[0]
+ except IndexError:
+ a_to_c_confidence = 0
+
+ # get confidence of consequent to antecedent rule
+ c_to_a = df[
+ (df["antecedents"] == consequent) & (df["consequents"] == antecedent)]
+ try:
+ c_to_a_confidence = c_to_a["confidence"].iloc[0]
+ except IndexError:
+ c_to_a_confidence = 0
+ return (a_to_c_confidence + c_to_a_confidence) / 2
+
+
+def imbalance_ratio(df, a, b):
+ """
+    Calculates the imbalance ratio for a given pair of itemsets.
+
+ Parameters
+ -----------
+ df : pandas DataFrame
+ pandas DataFrame of frequent itemsets
+ with columns ['support', 'itemsets']
+ a : set or frozenset
+ First itemset
+ b : set or frozenset
+ Second itemset
+
+ Returns
+ ----------
+    The imbalance ratio,
+    I(A,B) = |support(A) - support(B)| / (support(A) + support(B) - support(A ∪ B)),
+    in the range [0, 1].
+ """
+ if not df.shape[0]:
+ raise ValueError('The input DataFrame `df` containing '
+ 'the frequent itemsets is empty.')
+
+ # check for mandatory columns
+ if not all(col in df.columns for col in ["support", "itemsets"]):
+ raise ValueError("Dataframe needs to contain the\
+ columns 'support' and 'itemsets'")
+
+ # get support of a
+ try:
+ sA = df[df["itemsets"] == a].support.iloc[0]
+ except IndexError:
+ sA = 0
+
+ # get support of b
+ try:
+ sB = df[df["itemsets"] == b].support.iloc[0]
+ except IndexError:
+ sB = 0
+
+ # get support of a union b
+ try:
+ sAB = df[df["itemsets"] == a.union(b)].support.iloc[0]
+ except IndexError:
+ sAB = 0
+
+ try:
+ return abs(sA - sB) / (sA + sB - sAB)
+ except ZeroDivisionError:
+ return 0
diff --git a/mlxtend/frequent_patterns/tests/test_metrics.py b/mlxtend/frequent_patterns/tests/test_metrics.py
new file mode 100644
index 000000000..a439e16df
--- /dev/null
+++ b/mlxtend/frequent_patterns/tests/test_metrics.py
@@ -0,0 +1,89 @@
+import pandas as pd
+from mlxtend.preprocessing import TransactionEncoder
+from mlxtend.frequent_patterns import apriori, association_rules
+from mlxtend.frequent_patterns import metrics
+from numpy.testing import assert_raises as numpy_assert_raises
+
+
+dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
+ ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
+ ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
+ ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
+ ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]
+
+te = TransactionEncoder()
+te_ary = te.fit_transform(dataset)
+df = pd.DataFrame(te_ary, columns=te.columns_)
+df_freq_items_with_colnames = apriori(df, min_support=0.6, use_colnames=True)
+df_strong_rules = association_rules(
+ df_freq_items_with_colnames, metric="confidence", min_threshold=0.7)
+
+
+def test_kulczynski_measure_default():
+ a = frozenset(['Onion'])
+ b = frozenset(['Kidney Beans', 'Eggs'])
+ assert metrics.kulczynski_measure(df_strong_rules, a, b) == 0.875
+
+
+def test_kulczynski_measure_set():
+ a = set(['Onion'])
+ b = set(['Kidney Beans', 'Eggs'])
+ assert metrics.kulczynski_measure(df_strong_rules, a, b) == 0.875
+
+
+def test_kulczynski_measure_no_antecedent():
+ a = frozenset(['Laptop'])
+ b = frozenset(['Kidney Beans', 'Eggs'])
+ assert metrics.kulczynski_measure(df_strong_rules, a, b) == 0.0
+
+
+def test_kulczynski_measure_no_consequent():
+ a = frozenset(['Onion'])
+ b = frozenset(['Laptop'])
+ assert metrics.kulczynski_measure(df_strong_rules, a, b) == 0.0
+
+
+def test_kulczynski_measure_no_rule():
+ a = frozenset(['Onion'])
+ b = frozenset(['Kidney Beans', 'Eggs'])
+ numpy_assert_raises(
+ ValueError, metrics.kulczynski_measure, pd.DataFrame(), a, b)
+
+
+def test_imbalance_ratio_default():
+ a = frozenset(['Onion'])
+ b = frozenset(['Kidney Beans', 'Eggs'])
+ assert metrics.imbalance_ratio(
+ df_freq_items_with_colnames, a, b) == 0.2500000000000001
+
+
+def test_imbalance_ratio_set():
+ a = set(['Onion'])
+ b = set(['Kidney Beans', 'Eggs'])
+ assert metrics.imbalance_ratio(
+ df_freq_items_with_colnames, a, b) == 0.2500000000000001
+
+
+def test_imbalance_ratio_no_itemset_a():
+ a = frozenset([])
+ b = frozenset(['Laptop'])
+ assert metrics.imbalance_ratio(df_freq_items_with_colnames, a, b) == 0.0
+
+
+def test_imbalance_ratio_no_itemset_b():
+ a = frozenset(['Laptop'])
+ b = frozenset([])
+ assert metrics.imbalance_ratio(df_freq_items_with_colnames, a, b) == 0.0
+
+
+def test_imbalance_ratio_no_itemset_a_b():
+ a = frozenset([])
+ b = frozenset([])
+ assert metrics.imbalance_ratio(df_freq_items_with_colnames, a, b) == 0.0
+
+
+def test_imbalance_ratio_no_rule():
+ a = frozenset(['Onion'])
+ b = frozenset(['Kidney Beans', 'Eggs'])
+ numpy_assert_raises(
+ ValueError, metrics.imbalance_ratio, pd.DataFrame(), a, b)