thatipamula-jashwanth/smart-knn

SmartKNN

A modern, weighted nearest-neighbor learning algorithm with learned feature importance and adaptive neighbor search.

Overview

SmartKNN is a learning method in the k-nearest-neighbors (KNN) family of algorithms.

It is designed to address common limitations observed in classical KNN approaches, including:

  • uniform treatment of all features
  • sensitivity to noisy or weakly informative dimensions
  • limited scalability as dataset size grows

SmartKNN incorporates data-driven feature importance estimation, dimension suppression, and adaptive neighbor search strategies. Depending on dataset characteristics, it can operate using either a brute-force search or an approximate nearest-neighbor (ANN) backend, while exposing a consistent, scikit-learn–compatible API.

The method supports both regression and classification tasks and prioritizes robustness, predictive accuracy, and practical inference latency across a range of dataset sizes.
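To make the ideas of feature importance estimation and dimension suppression concrete, here is a minimal sketch in plain Python. It is not SmartKNN's implementation: the helper names (`univariate_mse`, `learn_weights`, `knn_predict`), the relevance score (one minus the normalized MSE of a one-feature linear fit), and the suppression threshold are all assumptions chosen for illustration.

```python
import math

def univariate_mse(x, y):
    """MSE of a least-squares line fitted to a single feature (illustrative helper)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxy = sum((v - mx) * (t - my) for v, t in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return sum((t - (my + slope * (v - mx))) ** 2 for v, t in zip(x, y)) / n

def learn_weights(X, y, threshold=0.2):
    """Hypothetical relevance score per feature, normalized, then thresholded to zero."""
    n = len(y)
    my = sum(y) / n
    var_y = sum((t - my) ** 2 for t in y) / n
    if var_y == 0:  # constant target: no feature can be ranked, fall back to uniform
        return [1.0 / len(X[0])] * len(X[0])
    raw = [max(0.0, 1.0 - univariate_mse(col, y) / var_y) for col in zip(*X)]
    total = sum(raw) or 1.0
    weights = [r / total for r in raw]
    # Dimension suppression: weak features are removed from the distance entirely.
    return [w if w >= threshold else 0.0 for w in weights]

def knn_predict(X_train, y_train, query, weights, k=3, eps=1e-9):
    """Brute-force regression: inverse-distance-weighted mean of the k nearest targets."""
    def dist(a, b):
        return math.sqrt(sum(w * (p - q) ** 2 for p, q, w in zip(a, b, weights)))
    nearest = sorted((dist(row, query), t) for row, t in zip(X_train, y_train))[:k]
    inv = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * t for w, (_, t) in zip(inv, nearest)) / sum(inv)

# Feature 0 determines the target; feature 1 is large-scale noise.
X_train = [[0.0, 50.0], [1.0, -40.0], [2.0, 30.0], [3.0, -20.0]]
y_train = [0.0, 10.0, 20.0, 30.0]
query = [1.0, 30.0]

weights = learn_weights(X_train, y_train)
uniform = knn_predict(X_train, y_train, query, weights=[1.0, 1.0], k=1)
learned = knn_predict(X_train, y_train, query, weights=weights, k=1)
```

With uniform weights the noisy feature dominates the distance and the nearest neighbor is the row with target 20. With the learned weights the noisy feature is suppressed, so the query matches the row that shares its value on feature 0 (target 10).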


Key Capabilities

  • Learned feature weighting (method configurable depending on task and dataset)
    • MSE relevance
    • Mutual Information
    • Random Forest importance
  • Automatic preprocessing
    • normalization
    • NaN / Inf handling
    • feature masking
  • Distance-weighted neighbor voting
  • Brute-force and ANN backends
    • designed to scale to large datasets (hardware and tuning dependent)
    • optional GPU-accelerated neighbor search
  • Vectorized NumPy with Numba acceleration
  • Scikit-learn–compatible API
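
The preprocessing and voting steps listed above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's code: the function names (`sanitize`, `standardize`, `mask_weights`, `vote`), the fill value for non-finite entries, and the masking threshold are hypothetical.

```python
import math
from collections import defaultdict

def sanitize(X, fill=0.0):
    """Replace NaN and +/-Inf entries with a fill value (the fill choice is an assumption)."""
    return [[v if math.isfinite(v) else fill for v in row] for row in X]

def standardize(X):
    """Scale each column to zero mean and unit variance; constant columns become 0."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, means)]
    return [[(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
            for row in X]

def mask_weights(weights, threshold):
    """Feature masking: zero out weights below a threshold."""
    return [w if w >= threshold else 0.0 for w in weights]

def vote(neighbors, eps=1e-9):
    """Distance-weighted vote: each (distance, label) pair adds 1 / (distance + eps)."""
    scores = defaultdict(float)
    for dist, label in neighbors:
        scores[label] += 1.0 / (dist + eps)
    return max(scores, key=scores.get)

X = sanitize([[1.0, float("nan")], [3.0, float("inf")]])
Z = standardize(X)
w = mask_weights([0.7, 0.05], threshold=0.1)
label = vote([(0.1, "a"), (1.0, "b"), (1.2, "b")])
```

In the last line a plain majority vote would pick "b" (two neighbors against one), while the distance-weighted vote picks "a" because its single neighbor is much closer to the query.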

Installation

pip install smart-knn

Documentation

Detailed documentation and design notes are maintained externally. This repository README is intentionally kept concise.


Examples

Runnable examples are available in the examples/ directory:

python examples/regression_example.py
python examples/classification_example.py

Benchmarks & CI

  • Comprehensive benchmark suites for regression and classification
  • GitHub Actions CI for tests and benchmarks
  • Reproducible, engineering-focused evaluation

Benchmark details are documented in benchmarks/README.md.


Project Status

  • SmartKNN v2 is stable
  • API is frozen for the v2.x series (backward-compatible improvements only)
  • Actively maintained
  • Open to research and engineering collaboration

License

SmartKNN is released under the MIT License. See LICENSE for details.
