    Repositories list

    • Gateway for refinery. Manages incoming requests and holds the workflow logic. Use the UI or the Python SDK to interact with the gateway.
      Python
      3124 Updated Dec 9, 2025
    • Weak supervision for refinery. Manages the integration of heuristics such as labeling functions, active learners or zero-shot classifiers. Uses the weak-nlp library for the actual integration logic and algorithms.
      Python
      1000 Updated Dec 9, 2025
    • Updater for refinery. Manages migration logic to new versions if required.
      Python
      1000 Updated Dec 9, 2025
    • TypeScript
      0100 Updated Dec 9, 2025
    • Tokenizer for refinery. Manages the creation and storage of spaCy tokens for text-based record attributes and supports multiple language models. It is used by the gateway.
      Python
      1100 Updated Dec 9, 2025
    • Neural search for refinery. Manages similarity search powered by Qdrant and outlier detection, both based on vector representations of the project records.
      Python
      1500 Updated Dec 9, 2025
    • Execution environment for the active learning module in refinery. Containerized function as a service to build active learning models using scikit-learn and sequence-learn.
      Python
      1012 Updated Dec 9, 2025
    • Execution environment for labeling functions in refinery. Containerized function as a service to execute user-defined Python scripts.
      Python
      1002 Updated Dec 9, 2025
    • TypeScript
      0001 Updated Dec 9, 2025
    • Embedder for refinery. Manages the creation of document- and token-level embeddings using the embedders library.
      Python
      1102 Updated Dec 9, 2025
    • Execution environment for attribute calculation in refinery. Containerized function as a service to build custom attributes derived from the original data.
      Python
      1012 Updated Dec 9, 2025
    • Evaluates whether a user has access to certain resources.
      Python
      2000 Updated Dec 9, 2025
    • TypeScript
      0000 Updated Dec 9, 2025
    • Data model for refinery. Manages entities and their access for multiple services, e.g. the gateway.
      Python
      1304 Updated Dec 9, 2025
    • TypeScript
      0000 Updated Dec 9, 2025
    • Submodule which contains the requirements of the different parent images of refinery.
      Python
      0101 Updated Dec 6, 2025
    • Defines the parent image for the Docker images of the refinery services that require torch (GPU).
      Shell
      0000 Updated Nov 27, 2025
    • Defines the parent image for the Docker images of the refinery services that require torch (CPU).
      Shell
      0000 Updated Nov 26, 2025
    • Defines parent image for the Docker images of the refinery services which provide an execution environment.
      Shell
      0000 Updated Nov 26, 2025
    • Defines parent image for the Docker images of the refinery services which require the integration of the model and the s3 submodule.
      Shell
      0000 Updated Nov 26, 2025
    • Defines parent image for the Docker images of the refinery services with the smallest set of requirements.
      Shell
      0000 Updated Nov 26, 2025
    • Scripts used for Kern AI CI/CD.
      Shell
      0011 Updated Nov 25, 2025
    • WebSocket module for refinery. Enables asynchronous notifications inside the application.
      Go
      1000 Updated Oct 20, 2025
    • JavaScript
      0000 Updated Sep 30, 2025
    • Dockerfile
      0001 Updated Sep 4, 2025
    • S3-related AWS and MinIO logic.
      Python
      1000 Updated Jul 21, 2025
    • embedders

      Public
      With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this include similarity search between texts, information extraction such as named entity recognition, or basic text classification.
      Python
      22111 Updated Jul 14, 2025
    • Gateway proxy for refinery. Manages incoming requests and forwards them to the gateway. Used by the Python SDK.
      Python
      2000 Updated Jul 10, 2025
    • bricks

      Public
      Open-source natural language enrichments at your fingertips.
      Python
      244617310 Updated Jan 14, 2025
    • refinery

      Public
      The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
      Python
      741.5k700 Updated Dec 9, 2024