Fuzzy Visual Encoding


Research implementation of Fuzzy Visual Encoding using Gustafson-Kessel (GK) and Gath-Geva (GG) fuzzy clustering for visual object category recognition. It replaces the standard k-means Bag-of-Features codebook with fuzzy clustering that learns a covariance matrix per cluster. The resulting soft membership vectors are designed to better model the ellipsoidal distribution of SIFT feature descriptors.

📄 Published: gupta12-icip.pdf (ICIP 2012) | Fuzzy Encoding using Gustafson-Kessel Algorithm.pdf


Key Idea

Standard Bag-of-Features assigns each SIFT descriptor to the nearest k-means cluster (hard assignment). This ignores uncertainty and cluster shape.

Fuzzy encoding computes soft membership to all clusters simultaneously:

K-means (hard assignment):        Gustafson-Kessel (fuzzy):

SIFT descriptor x                 SIFT descriptor x
     ↓                                 ↓
argmin_i ||x - v_i||²             u_i(x) = membership in cluster i
     ↓                            using Mahalanobis distance with
[0, 0, 1, 0, ..., 0]             adaptive covariance A_i per cluster
 (one-hot, K words)                    ↓
                                   [0.1, 0.6, 0.2, 0.05, ...]
                                   (fuzzy, captures shape + overlap)

Gustafson-Kessel learns an adaptive matrix A_i per cluster, allowing ellipsoidal cluster shapes fitted to the true geometry of the feature space. Gath-Geva extends this with full probabilistic (Gaussian) density estimation per cluster.
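
The contrast between hard and soft assignment can be sketched in a few lines of NumPy. This is an illustration only: it uses plain Euclidean distance with m = 2 (the FCM-style membership, not the GK one), and the function names `hard_assign` and `soft_assign` are assumptions made for this example, not part of the package.

```python
import numpy as np

def hard_assign(x, centres):
    """One-hot vector marking the nearest centre (k-means style)."""
    d2 = np.sum((centres - x) ** 2, axis=1)
    one_hot = np.zeros(len(centres))
    one_hot[np.argmin(d2)] = 1.0
    return one_hot

def soft_assign(x, centres, m=2.0, eps=1e-12):
    """Fuzzy memberships proportional to d^(-2/(m-1)), normalised to sum to 1."""
    d2 = np.sum((centres - x) ** 2, axis=1) + eps  # eps avoids division by zero
    w = d2 ** (-1.0 / (m - 1.0))
    return w / w.sum()

# Three toy centres in 2-d and one descriptor between them.
centres = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = np.array([0.4, 0.1])

hard = hard_assign(x, centres)
soft = soft_assign(x, centres)
print("hard:", hard)
print("soft:", soft.round(3))
```

The hard vector keeps only the winning cluster, while the soft vector keeps a non-zero weight for every cluster and still peaks at the same index.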


Pipeline

flowchart TB
    subgraph Extract["Feature Extraction"]
        IMG["Images"] --> DSIFT["Dense SIFT\n(128-d descriptors)"]
        DSIFT --> SAMPLE["Sample ~50k vectors\nfrom all categories"]
    end

    subgraph Dict["Dictionary Learning"]
        SAMPLE --> PCA["PCA\n128-d → 10-d"]
        PCA --> NORM["Range\nNormalize [0,1]"]
        NORM --> GK["Gustafson-Kessel\nFuzzy Clustering"]
        NORM --> GG["Gath-Geva\nFuzzy Clustering"]
        NORM --> FCM["Fuzzy C-Means\n(baseline)"]
        GK --> DICT["Fuzzy Dictionary\n(centres + covariance matrices)"]
        GG --> DICT
        FCM --> DICT
    end

    subgraph Encode["Image Encoding"]
        DSIFT2["Image SIFT"] --> PCA2["PCA project"]
        PCA2 --> EVAL["FuzzyDictionary.encode()\ncompute memberships u_i(x)"]
        EVAL --> AVG["Average over\nall descriptors"]
        AVG --> FVEC["Fuzzy vector\n(N_clusters,)"]
    end

    subgraph Classify["Classification"]
        DICT --> EVAL
        FVEC --> SVM["RBF SVM\nper category"]
        SVM --> MAP["mAP / Accuracy\nper category"]
    end
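
The encoding stage of the flowchart (PCA projection, range normalisation, membership computation, averaging) can be sketched with NumPy alone. The sketch below makes several assumptions: the descriptors are random stand-ins, the dictionary is a set of randomly picked centres rather than a GK/GG/FCM result, and memberships use Euclidean distance for brevity. `encode_image` is a hypothetical helper, not the package's `FuzzyDictionary.encode()`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for pooled 128-d SIFT descriptors (random data, illustration only).
pooled = rng.random((500, 128))

# PCA 128-d -> 10-d via SVD of the centred pooled sample.
mean = pooled.mean(axis=0)
_, _, vt = np.linalg.svd(pooled - mean, full_matrices=False)
components = vt[:10]                       # (10, 128)
proj = (pooled - mean) @ components.T      # (500, 10)

# Range-normalise each dimension to [0, 1] using the pooled sample's min/max.
lo, hi = proj.min(axis=0), proj.max(axis=0)

def normalise(z):
    return (z - lo) / (hi - lo)

# Placeholder dictionary: 8 centres drawn from the pooled sample.
# A real dictionary would come from GK / GG / FCM clustering.
centres = normalise(proj)[rng.choice(500, size=8, replace=False)]

def encode_image(descriptors, m=2.0, eps=1e-12):
    """Project, normalise, compute memberships, then average over descriptors."""
    z = normalise((descriptors - mean) @ components.T)
    d2 = ((z[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2) + eps
    w = d2 ** (-1.0 / (m - 1.0))
    u = w / w.sum(axis=1, keepdims=True)   # (n_descriptors, K), rows sum to 1
    return u.mean(axis=0)                  # (K,) image-level fuzzy vector

image_descriptors = rng.random((200, 128))
fvec = encode_image(image_descriptors)
print(fvec.shape, round(float(fvec.sum()), 6))
```

Because every row of memberships sums to 1, the averaged image vector also sums to 1, which keeps encodings comparable across images with different descriptor counts.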

Fuzzy Clustering Methods

| Method | Cluster shape | Distance | Distance function |
| --- | --- | --- | --- |
| K-means | Spherical | Euclidean | ‖x − vᵢ‖² |
| Fuzzy C-Means (FCM) | Spherical | Euclidean | ‖x − vᵢ‖² |
| Gustafson-Kessel (GK) | Ellipsoidal | Mahalanobis | (x − vᵢ)ᵀ Aᵢ (x − vᵢ) (Aᵢ adaptive per cluster) |
| Gath-Geva (GG) | Ellipsoidal | Probabilistic | Gaussian kernel with full covariance |

All fuzzy methods use the fuzziness exponent m=2, producing memberships u_i(x) ∈ [0,1] satisfying Σᵢ u_i(x) = 1.
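
To make the GK distance concrete, here is a small sketch that follows the textbook formulation: the norm-inducing matrix is Aᵢ = det(Fᵢ)^(1/n) Fᵢ⁻¹, where Fᵢ is the cluster's fuzzy covariance and the cluster volume is assumed fixed at 1. The helper names and the toy covariances are assumptions for illustration; the package's own implementation may differ in detail.

```python
import numpy as np

def gk_sq_distance(x, centre, cov):
    """Squared GK distance (x - v)^T A (x - v) with A = det(F)^(1/n) * inv(F)."""
    n = len(centre)
    A = np.linalg.det(cov) ** (1.0 / n) * np.linalg.inv(cov)
    diff = x - centre
    return float(diff @ A @ diff)

def memberships(sq_dists, m=2.0, eps=1e-12):
    """Fuzzy memberships from squared distances; they sum to 1."""
    w = (np.asarray(sq_dists) + eps) ** (-1.0 / (m - 1.0))
    return w / w.sum()

centres = [np.array([0.0, 0.0]), np.array([3.0, 0.0])]
covs = [np.array([[4.0, 0.0], [0.0, 0.25]]),  # cluster 0: elongated along x
        np.eye(2)]                             # cluster 1: spherical
x = np.array([1.6, 0.0])

d_euclid = [float(np.sum((x - v) ** 2)) for v in centres]
d_gk = [gk_sq_distance(x, v, F) for v, F in zip(centres, covs)]

u_euclid = memberships(d_euclid)
u_gk = memberships(d_gk)
print("Euclidean memberships:", u_euclid.round(3))
print("GK memberships:       ", u_gk.round(3))
```

The point x lies slightly closer to centre 1 in Euclidean terms, so the spherical methods favour cluster 1. Under the GK distance it falls along the long axis of cluster 0, so cluster 0 receives the larger membership.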


Results

Evaluated on standard computer vision benchmarks with dictionary sizes 16–512:

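| Dataset | K-means | FCM | GK | GG |
| --- | --- | --- | --- | --- |
| VOC 2006 | 0.38 | 0.40 | 0.44 | 0.43 |
| VOC 2010 | 0.33 | 0.35 | 0.39 | 0.38 |
| Caltech-101 | 0.47 | 0.49 | 0.54 | 0.53 |
| Caltech-256 | 0.29 | 0.31 | 0.35 | 0.34 |
| Scene-15 | 0.52 | 0.54 | 0.58 | 0.57 |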

Mean accuracy across categories, dictionary size 128.

Performance plots (per-category, per-dataset, dictionary size sensitivity) are in figures/.
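
For a quick read of the results table, the gap between GK and the k-means baseline can be computed directly from the reported numbers. The snippet below only re-uses the values listed above; it does not load or reproduce any experiment.

```python
# Values copied from the results table (dictionary size 128).
results = {
    "VOC 2006":    (0.38, 0.40, 0.44, 0.43),
    "VOC 2010":    (0.33, 0.35, 0.39, 0.38),
    "Caltech-101": (0.47, 0.49, 0.54, 0.53),
    "Caltech-256": (0.29, 0.31, 0.35, 0.34),
    "Scene-15":    (0.52, 0.54, 0.58, 0.57),
}

gains = {}
for name, (kmeans, fcm, gk, gg) in results.items():
    gains[name] = gk - kmeans
    print(f"{name:12s} GK - K-means = {gains[name]:+.2f} ({gains[name] / kmeans:+.1%})")

mean_gain = sum(gains.values()) / len(gains)
print(f"Mean absolute gain: {mean_gain:.3f}")
```

On these figures GK improves on k-means by 0.06 to 0.07 on every dataset, an average absolute gain of 0.062.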


Installation

Requires Python 3.10+. Uses uv for dependency management.

# Clone and install in editable mode with dev dependencies
git clone https://github.com/ashish-code/FuzzyVisualEncoding.git
cd FuzzyVisualEncoding

# Using uv (recommended)
uv sync --extra dev

# Or using pip
pip install -e ".[dev]"

Quick Start

Python API

from fuzzy_visual_encoding import FuzzyDictionary, cross_validate
import numpy as np

# X: (N, 128) SIFT feature matrix, y: (N,) integer category labels
X = np.load("sift_features.npy")
y = np.load("labels.npy")

# GK fuzzy dictionary: 128-d SIFT → PCA 10-d → 64 fuzzy clusters
fd = FuzzyDictionary(n_clusters=64, method="gk", n_pca=10, fuzziness=2.0)
Z = fd.fit_transform(X)    # (N, 64) fuzzy encoding

scores = cross_validate(Z, y, n_folds=10)
print(f"F1: {scores['f1_mean']:.3f} ± {scores['f1_std']:.3f}")

Command-line interface

# Encode features and evaluate with 10-fold CV
fuzzy-encode features.txt --method gk --n-clusters 64 --n-pca 10 --n-folds 10

# Compare all four methods
for method in kmeans fcm gk gg; do
    fuzzy-encode features.txt --method "$method" --n-clusters 64
done

Compare methods in Python

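import numpy as np
from fuzzy_visual_encoding import FuzzyDictionary, cross_validate

# Same inputs as in the Python API example above
X = np.load("sift_features.npy")   # (N, 128) SIFT feature matrix
y = np.load("labels.npy")          # (N,) integer category labels

for method in ["kmeans", "fcm", "gk", "gg"]:
    fd = FuzzyDictionary(n_clusters=64, method=method)
    Z = fd.fit_transform(X)        # (N, 64) encoding for this method
    s = cross_validate(Z, y)
    print(f"{method:8s}: F1 = {s['f1_mean']:.3f} ± {s['f1_std']:.3f}")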

Running Tests

# Run all 49 tests
pytest

# With coverage report
pytest --cov=fuzzy_visual_encoding --cov-report=term-missing

Repository Layout

FuzzyVisualEncoding/
├── src/
│   └── fuzzy_visual_encoding/        # Installable Python package
│       ├── __init__.py               # Public API exports
│       ├── algorithms.py             # GustafsonKessel, GathGeva, _fuzzy_cmeans
│       ├── dictionary.py             # FuzzyDictionary (fit / encode / fit_transform)
│       ├── classification.py         # fuzzy_classify(), cross_validate()
│       └── cli.py                    # fuzzy-encode console entry point
├── tests/
│   ├── test_algorithms.py            # 20 tests: shapes, normalisation, cluster separation
│   ├── test_dictionary.py            # 15 tests: all methods, PCA path, error guards
│   └── test_classification.py        # 14 tests: scores, ranges, reproducibility
├── scripts/
│   ├── pool_feature_vectors.py       # Pool per-category SIFT vectors for clustering
│   ├── extract_feature_vectors.py    # Write per-category descriptor .tab files
│   ├── extract_pca_features.py       # PCA-reduce pooled + per-category features
│   ├── plot_results.py               # Visualise accuracy (by category/dataset/dict size)
│   └── _legacy_*.py                  # Original 2011 Python 2 scripts (reference only)
├── figures/                          # Result PDFs (27 files)
├── matlab/                           # Original MATLAB source (13 files)
│   ├── CatFuzzyClass.m               # Main orchestrator
│   ├── calcFuzzyDict.m               # Dictionary creation (GK/GG/FCM)
│   ├── calcFuzzyCoeff.m              # Fuzzy coefficient computation
│   ├── compFuzzyClassPerf.m          # Classification evaluation
│   └── ... (9 more)
├── pyproject.toml                    # Build config (hatchling), uv deps, pytest settings
├── gupta12-icip.pdf                  # ICIP 2012 conference paper
└── Fuzzy Encoding for Visual Classification using Gustaffson-Kessel Algorithm.pdf

MATLAB → Python mapping

| MATLAB file | Python equivalent | Status |
| --- | --- | --- |
| GKclust(data, param) | algorithms.GustafsonKessel.fit() | ✅ Ported |
| GGclust(data, param) | algorithms.GathGeva.fit() | ✅ Ported |
| FCMclust(data, param) | algorithms._fuzzy_cmeans() | ✅ Ported |
| clusteval() | dictionary.FuzzyDictionary.encode() | ✅ Ported |
| calcFuzzyDict.m | dictionary.FuzzyDictionary.fit() | ✅ Ported |
| calcFuzzyCoeff.m | dictionary.FuzzyDictionary.encode() | ✅ Ported |
| compFuzzyCoeff.m | dictionary.FuzzyDictionary.encode() | ✅ Ported |
| compFuzzyClassPerf.m | classification.fuzzy_classify() | ✅ Ported |
| CatFuzzyClass.m | classification.cross_validate() | ✅ Ported |
| svmtrain(), svmpredict() | sklearn.svm.SVC | Replaced |
| clust_normalize() | sklearn.preprocessing.MinMaxScaler | Replaced |
| out_of_sample() | sklearn.decomposition.PCA.transform() | Replaced |

FuzzyClusteringToolbox (MATLAB): Original code required Bezdek's FuzzyClusteringToolbox (not redistributable). The Python port implements GK and GG from first principles using NumPy, matching the original algorithm papers.
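
As a rough idea of what a first-principles GK implementation involves, the sketch below alternates the standard update steps: weighted centres, fuzzy covariance matrices, the volume-normalised Mahalanobis distance, and the membership update. It is not the code in `algorithms.py`; the function name `gk_fit`, the random initialisation, the fixed iteration count, and the small regularisation term `reg` are all assumptions made for this example.

```python
import numpy as np

def gk_fit(X, n_clusters, m=2.0, n_iter=50, reg=1e-6, seed=0):
    """Minimal Gustafson-Kessel sketch: returns (centres, covariances, memberships)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)            # random initial fuzzy partition
    for _ in range(n_iter):
        W = U ** m                                # fuzzified weights, shape (n, c)
        centres = (W.T @ X) / W.sum(axis=0)[:, None]
        covs = np.empty((n_clusters, d, d))
        D2 = np.empty((n, n_clusters))
        for i in range(n_clusters):
            diff = X - centres[i]
            # Fuzzy covariance of cluster i, regularised to stay invertible.
            F = (W[:, i, None] * diff).T @ diff / W[:, i].sum()
            F += reg * np.eye(d)
            covs[i] = F
            # Norm-inducing matrix with unit cluster volume.
            A = np.linalg.det(F) ** (1.0 / d) * np.linalg.inv(F)
            D2[:, i] = np.einsum("nj,jk,nk->n", diff, A, diff)
        inv = np.maximum(D2, 1e-12) ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)  # membership update
    return centres, covs, U

# Two elongated toy blobs with different orientations.
rng = np.random.default_rng(1)
a = rng.normal(size=(100, 2)) * [2.0, 0.3]
b = rng.normal(size=(100, 2)) * [0.3, 2.0] + [6.0, 6.0]
X = np.vstack([a, b])

centres, covs, U = gk_fit(X, n_clusters=2)
print(centres.round(2))
print(U[:3].round(3))
```

A production version would normally add a convergence test on the change in U and guard against ill-conditioned covariance matrices in high dimensions, which is one reason the pipeline reduces SIFT to 10 dimensions before clustering.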


Datasets

| Dataset | Classes | Images | Download |
| --- | --- | --- | --- |
| Pascal VOC 2006 | 10 | ~5,000 | VOC Challenge |
| Pascal VOC 2010 | 20 | ~20,000 | VOC Challenge |
| Caltech-101 | 101 | ~9,000 | Caltech Vision Lab |
| Caltech-256 | 256 | ~30,000 | Caltech Vision Lab |
| Scene-15 | 15 | ~4,500 | Scene Understanding |

Path configuration: Point --root-dir in the scripts to your local data directory. The original scripts used hardcoded University of Surrey HPC paths (/vol/vssp/diplecs/ash/Data/); these have been replaced with CLI arguments.
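
A script that takes its data location from the command line might look roughly like this. Only the `--root-dir` flag comes from the text above; the `--dataset` flag, its default, and the example paths are hypothetical and shown purely to illustrate how a per-dataset directory could be derived.

```python
import argparse
from pathlib import Path

def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative data-path handling")
    parser.add_argument("--root-dir", type=Path, required=True,
                        help="Local directory holding the datasets")
    # Hypothetical flag, not confirmed by the repository's scripts.
    parser.add_argument("--dataset", default="VOC2006",
                        help="Sub-directory name for one dataset")
    return parser

# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--root-dir", "/data/vision", "--dataset", "Caltech101"])
dataset_dir = args.root_dir / args.dataset
print(dataset_dir)
```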


References

  • Gupta, A. (2012). Fuzzy Encoding for Visual Classification using Gustafson-Kessel Algorithm. ICIP 2012.
  • Gustafson, D.E., Kessel, W.C. (1979). Fuzzy Clustering with a Fuzzy Covariance Matrix. CDC.
  • Gath, I., Geva, A.B. (1989). Unsupervised Optimal Fuzzy Clustering. IEEE TPAMI.
  • Bezdek, J.C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Springer.

License

MIT — see LICENSE.
