Releases · google-ai-edge/mediapipe

30 Jan 19:49

dbcp1

v0.10.32

8317ba7

MediaPipe v0.10.32 Latest

Build changes

Enables ml drift metal delegate as inference calculator backend.
[mediapipe] support armv7 (32 bit) in mediapipe tasks
Do not assume canvas is BGRA in RenderToWebGpuCanvas.
Fix sampling logic in ImageToTensorConverterWebGpu.
Migrate GlShaderCalculator to API3.
Migrate gl_shader_calculator_test to use API3 builder.

Bazel changes

[mediapipe] version bump to 0.10.27
Dawn has completed these changes, so the old paths are no longer used.
Integrate tiny Juno inpainting graph into GenAiProcessor
Readme for API3
Add Resources::ResolveId to enable placeholder resource ids usage.
Web LLM: a few more small edits for Gemma3n
Include headers from global namespace
Add comment to Eigen version in WORKSPACE to remind about synchronization with TensorFlow's Eigen dependency
Migrating VisibilityCopyCalculator to API3.
Update from Bazel v6.5.0 to v7.4.1 and Protobuf v3.19.1 to v5.28.3. Other package versions in WORKSPACE are updated as well.
Fix for weight cache on Windows.
Create Selfie Segmentation Demo App for LiteRT NPU.
Add Any support for API3
Adding AudioBuffer support to web LLM Inference API to handle more audio input types for MM models
Provide API3 interface for PassThroughCalculator using newly added Any type.
Migrate MergeCalculator to API3 and newly introduced Any type.
pybind11 version and py_proto_library macro update.
Add test for PacketResamplerCalculator with a very short video.
Initial version of sync function runner for API3
Fix function runner error reporting.
[mediapipe] version bump
Migrate CombinedPredictionCalculator to API3
Clean up CombinedPredictionCalculator
Currently, wrapping a TextureFrame in a MediaPipe Packet assumes the texture is 8-bit RGBA. This patch allows specifying other texture formats to support common color formats like RGBA16F for HDR content.
Support timestamp bound updates in function runner.
Migrate TensorsToSegmentationCalculator to MediaPipe API3.
Add OneOf support for API3.
Provide inference calculator API3 interface.
Migrate LandmarksToMatrixCalculator to API3
Update MediaPipe OSS to C++20.
Add a flag to use fp16 activations in tests.
Migrate HandednessToMatrixCalculator to API3.
Update xnnpack version.
Use the new xnn_reduce_mean_squared reduction for the RMSNorm.
Migrate ImageToTensorCalculator to API3.
Consistently use MutexLock instead of manual locking/unlocking
Add ImageProcessingOptions to FaceDetector C API
Enable node names as compile time strings in OSS.
Migrate API3 nodes to use compile time string names.
Fall back to producer context in gpu_buffer.GetReadView
Document api3 GetOrDie / VisitOrDie
Update log for missing InferenceCalculatorXnnpack registration.
Add NodeName for non-generic calculator context.
Add ImageProcessingOptions support to FaceLandmarker C API.
Migrate FaceLandmarker C API to use MediaPipe Image
Update CombinedPredictionCalculator test to new Runner
Fix comment about when things die.
Migrate WebGpuShaderCalculator to MediaPipe API3.
Proto changes for Tiny Gemma on ml_drift
Enable API3 FunctionRunner for WEB
Bump MediaPipe version to 0.10.29.
Add CompareAndSaveImageOutputDynamic to compare to a dynamic golden instead of a file.
Bump MediaPipe version to 0.10.30.
Improve error message of graph validation, to include node calculator name
Refactor Hand Landmarker C API to use new MP Image
Add ExternalGlTextureSyncMode to require efficient synchronization.
Add an option to get a Packet for API3 OneOf input.
Migrate GpuBufferToImageFrameCalculator to API3.
Add support to pass a single visitor in VisitOrDie for OneOf inputs.
Add VisitAsPacketOrDie for OneOf inputs.
Add MediaPipe Tasks C API for AudioClassifier.
Ensure correct type of #api3 Packet.
Support fractional frame rates in MediaPipe video processing.
Add ImageProcessingOptions support to Object Detector C API.
Add ImageProcessingOptions support to MediaPipe PoseLandmarker C API.
Update object detector to apply ImageFrame C API
Update pose landmarker to apply ImageFrame C API
Migrate HandAssociationCalculator to MediaPipe API3.
Adding ImageProcessingOptions to image_classifier C API.
Adding ImageProcessingOptions to gesture_recognizer C API.
Migrate GestureRecognizer (C API) to MpImagePtr
Migrate ImageFrameToGpuBufferCalculator to API3
Set default thread_num to LiteRT::CPU delegate
Qualify Packet and MakePacket while in mediapipe::api3 namespace to avoid future collisions with api3::Packet / api3::MakePacket
Remove redundant empty parentheses from lambdas in MediaPipe API3.
Migrate ImageClassifier (C API) to MpImagePtr.
Update HandAssociationCalculatorTest to new Runner
Add ImageProcessingOptions support to Image Segmenter C API
Migrate TensorsToSegmentationCalculator to API3
Extend lifetime of Image data when MpImage is constructed from an existing MediaPipe Image
Allow GetData() calls for contiguous images
Migrate ImageSegmenter C API to MpImagePtr
Add GetLabels() to the ImageSegmenter C API
Generalize UnpackMediaSequenceCalculator's support for encoded media streams.
Clean up unused variables
Get rid of _with_options in favor of optional param (C & Python API)
Migrate Python ImageSegmenter to C API
Refactor Image Embedder C API to use MpImage
Simplify FunctionRunner template types.
Simplify setting options in HandAssociationCalculatorTest
Update MediaPipe C API vision task result callbacks to use MpStatus.
Offer attachments functionality from WebGPU service.
Enable creation of WebGPU service from explicitly provided wgpu::Device.
Remove unnecessary checks.
Enables RGBA input with RGB output.
Modify the Has{} generic function to check across the different value kinds of
Refactor TextClassifier C API to use MpStatus.
Update C API for TextEmbedder to use MpStatus
Update C API for LanguageDetector to return MpStatus
Add ImageProcessingOptions support to the MediaPipe ImageEmbedder C API.
Add ImageProcessingOptions to InteractiveSegmenter C API and migrate to MpImage
Fix counting pixels with different colors for image comparison tests
Remove redundant has_confidence_masks field from ImageSegmenterResult.
Migrate ImageClassifier C API to use MpStatus.
Refactor Face Detector C API to use MpStatus.
Update ImageSegmenter C API to return MP Status
Add support for XNNPACK's SLOW_CONSISTENT_ARITHMETIC flag
Update ImageEmbedder C API to return MP Status
Update InteractiveSegmenter C API to return MP Status
Get rid of _with_options in favor of optional param (C)
Refactor Gesture Recognizer C API to use MpStatus.
Update ObjectDetector C API to return MpStatus
Update HandLandmarker C API to return MP Status
Update PoseLandmarker C API to return MpStatus
Fix libmediapipe.so compilation on Windows
Remove no longer used Image types
Add side packets support for FunctionRunner.
Fix documentation.
Update bot assignees in bot_config.yml.
Added new R8 mode for GlShaderCalculator
Add visibility declarations for Windows
Centralize WebGPU header includes
Web Solutions: patch for importScripts error with modules in workers
Small cleanups in MP Task C++ segmentation graphs and ModelTaskGraph
Update AudioClassifier to retain error messages
Retain error messages in the Metadata API
Retain error messages in Language Detector
Update GestureRecognizer to retain error messages
Update FaceLandmarker C API to use new naming and return type convention
Retain error messages in TextEmbedder
Retain error messages in HandLandmarker
Retain error messages in ImageEmbedder
Retain Error Messages in Object Detector
Small cleanup of scheduler_queue
Refactor ImageClassifier C API to return error messages.
Allow empty Tensors in InferenceCalculator
Update ImageSegmenter to retain error messages
Retain error messages for PoseLandmarker
Retain error messages in MpImage
Allow packing of input streams into empty SequenceExample.
Retain error messages in Interactive Segmenter
Replace custom test macros with EXPECT_EQ/ASSERT_EQ
Add MpErrorFree to avoid missing function on Windows
Bump MP version to 0.10.31
Add Kotlin support to MediaPipe OSS repo
Prepare for functiongemma with MP web LLM API
Add experimental mapSync support in GetTexture2dData.
Fail with error at empty decoded image in OpenCVEncodedImageToImageFrameCalculator. Empty decoded image can happen if decoding fails.
Support option dependencies in mediapipe proto rules
Fix logging large one-dimensional vertical data
Adds GPU output support for category masks (copy only, result listener zero-copy case is not addressed yet)
Add wgpu::ExternalTexture support

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Android

Allow users to configure NPU delegate
Remove all references to subgraph reshaping which is enabled by default
Don't swallow Task exceptions for synchronous use cases
Update score thresholds for Java classifier/embedder tests
Restore default .so location
Add RegionOfInterest Proto to Java Protobuf list
Don't assume images are RGB
Allow users to configure the NPU delegate

iOS

Expose preferredBackends
Add stream cancellation support in swift API.
Add audio modality support to iOS GenAI inference.
Use wrapper type for RenderData
Remove MP AudioEmbedder

Javascript

Web LLM: basic .wav audio support for Gemma 3N
Web LLM: Small fix to multimodal error message strings
Add a test case to test empty packet inputs.
Migrate tasks tests to use common image test util.
Enab...

10 Jul 16:30

whhone

v0.10.26

80ae8af

MediaPipe v0.10.26

16 KB Page Size Support

All the latest Android packages on Google Maven now support the Android 16 KB page size.
0.10.26.1 also includes support for ARM v7 CPUs (32-bit).

Bazel changes

mediapipe task version bump
Introduce new variant of TFLiteModelLoader::LoadFromPath that allows specifying the mmap mode
Add DefaultSidePacketCalculator unit test under calculators/core
Add a test parameter to ignore pixels above diff limit
Add MaskOverlayCalculator unit test under calculators/image
Web LLM: make GetSizeInTokens work for first vision-capable models
Log invalid format in proto lite mode.
Needed in order to update Dawn to match the standard webgpu.h, here:
Migrate InverseMatrixCalculator to API3
Migrate WarpAffineCalculator to #mediapipe-api3 + introduce GetGenericContext
Migrate landmark_projection_calculator to API3
Inference calculator refactoring.
Adding int32-vector output to constant side-packet calculator
Switch usages of ShaderModuleWGSLDescriptor to ShaderSourceWGSL
[mediapipe] update to Android SDK and NDK 26 -> 28
Migrate world_landmark_projection_calculator to API3
Migrate landmarks_refinement_calculator to API3
[mediapipe] upgrade docker image to use JDK 21
Cleanup projection calculator from reliance on C++ variable scoping and add a new test case.
Update std::once_flag/call_once to absl::once_flag/call_once in OSS version

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Add audio modality support to LLM Inference API.
[mediapipe] update opencv dependency

10 Jul 16:26

whhone

v0.10.25

95fd272

MediaPipe v0.10.25

Bazel changes

Make ThreadPoolExecutorOptions callable from Java/Kotlin
Add contract validator for API3
API3 Extract reusable part of API2 graph builder.
Adding license headers.
Post release version bump
API3 calculators should default to timestamp offset 0 (the same default as in API2) for consistency and it should be possible to unset the default.
API3 graph, stream & side_packet
Add SRQ config option into TransformerParams proto.
Graph API2 API3 interop.
Adding basic vision+text testing capabilities to web LLM Inference API

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Drops duplicated Android manifest file.
Makes LlmTaskRunner internal.

Javascript

Web LLM: Minor refactoring to allow more usage of newer LLM code paths

20 May 19:23

dbcp1

v0.10.24

fbc45e1

MediaPipe v0.10.24

Build changes

Add FdFinishedFunc util to mediapipe
rename a config setting to BUILD_FOR_OSS
#mediapipe #ios remove custom cpp version (rely on the common cpp version set at build time)
Rely on the common cpp version set at build time.

Framework and core calculator improvements 

Update C++ Graph Builder to support source layers.
Bump MP version for release 0.10.23.
Add Back-Edge support in Graph builder.
Add a destructor to WebGpuAsyncFuture that correctly frees any pending future.
Add tools for logging Tensors, ImageFrames and cv::Mats
Add a utility for creating a view of a Tensor into an OpenCV Mat
Add WebGpuCreateRenderPipelineAsync utility.
[mediapipe] update documentation mentioning python versions
Bump MP version for release 0.10.24.
Add support for GemmaV2-2B via XNNPACK.
Remove obsolete checks that integer division rounds to zero.
Inline SafeIntStrongIntValidator::SanityCheck function
Debug logging: Fix and properly support logging RGBA images
Fix modules/face_detection documentation.
Add LogHalideBuffer variant for logging Halide buffers
Add support for GemmaV3-1B models using XNNPACK.
Correct documentation to reflect actual behavior.
Fix KleidiAI repository URL.
Removed usage of deprecated InitFromGraphWithTransforms.
Dynamically quantize inputs only once before projecting to queries, keys, and values.
add an enum option to spectrogram calculator to output frames with all channels instead of vector of matrices
Fix GlBufferView (bug: incomplete move constructor)
Don't recreate write views on the same internal-only-use tensor (which triggers error messages) and fix read/write view usages.
Support loading PackWeightsCache from a file descriptor
Update flag description to use correct name for input_token_limit
Reduce logging frequency for some warnings.
Allow header output for all resampling strategies.
Fix failing build: blaze --blazerc=/dev/null build //third_party/mediapipe/examples/ios/facedetectioncpu:FaceDetectionCpuApp.apple_binary --config=ios_arm64 --ios_minimum_os=12.0
Add std::vector output support to ConstantSidePacketCalculator
Avoid creating unused StatusRep objects on each CalculatorNode::ProcessNode call
Avoid creating multiple status reps on each mediapipe::tool::StatusStop() call
Add option to process timestamp bound for ImmediateMuxCalculator.

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Android

Move the callback registration into the InferenceSession.
Add updateSessionConfig and getSentencePieceProcessor APIs to Java interface.
Add getSessionOptions method to LlmInferenceSession.
This enables cloning for OpenCL-backed inference sessions
Adding support for prompt templates
Adds support to cancel async generation.
Expose the max number of images to process to unlock vision for multi-modal processing
Remove unnecessary chunk for add image API
Declare the dependency of the OpenCL libraries, so that clients don't have to.

iOS

Add vision modality support in swift API.
Moving skia conversion to LLM c lib.

Javascript

Remove artificial limits on maxBufferSize and maxStorageBufferBindingSize for LLM Inference on web.
Use different parameters (topk, temperature) for gemma3
Web LLM Inference: better error messaging for re-entry occurring from callback
Add toggle for allowing the forcing of float32 precision for LLM Inference on web

Python

Create a Packet containing a vector of ImageFrames. Get a list of ImageFrames from a Packet.
Remove unused parameter from a docstring.
Avoid unnecessary copy of ImageFrames.
Add extra settings (disallowing service default initialization) for the base solution and allow setting it from pose solution.
Create a script that runs the AI Edge Converter for all models in models.json
Support bundling additional .tflite models in .task
Enabling LoRA for Gemma3 conversions
Update llm bundler to put vision in .task

MediaPipe Dependencies

Update WASM files for 0.10.22 release

17 Mar 22:42

rtg0795

v0.10.22

c54c06d

MediaPipe v0.10.22

Build changes

[mediapipe] standardize import of androidx_annotation_annotation
[mediapipe] standardize import of androidx_appcompact
[mediapipe] standardize import of androidx_constraint_layout
[mediapipe] standardize import of androidx_core
[mediapipe] standardize import of androidx_legacy_legacy_support_v4
[mediapipe] delete unused 3p android_library androidx_material
[mediapipe] standardize import of androidx_recyclereview
[mediapipe] standardize import of camerax
Fix llm_engine_main build for DRISHTI_DISABLE_GPU=1

Framework and core calculator improvements

Updating Troubleshooting with VLOG info.
Update tensors_to_image_calculator.cc
Delegate memory-mapping the model file to the resource system
Add static helpers to timestamp classes
Remove use of designated initializers in tflite_model_loader.cc
Add support for INT64 in VectorIntToTensorCalculator.
Use renamed wgpu::ImageCopy* structures.
[mediapipe] improve mediapipe_java_proto_src_extractor
Bump MP version for release 0.10.22.
[mediapipe] improve maven artifact template
Add two_tap_fir_filter_calculator and update com_google_audio_tools revision.
Adds check to reject services with an empty shared_ptr
Adds check to ensure input tensors match model tensor size & type
Replace MapName with StaticMap in places where it's not important to use MapName
Make DelayedReleaser an "attachment" of the GlContext instance.
Utility functions that create RGB images for testing.
Avoids the sharing of GL contexts between nested mediapipe graphs.
Adds output stream stats to GraphRuntimeInfo
Move ImageFrames while splitting a vector of ImageFrames.
Add input stream to control zoom factor used in content_zooming_calculator.
Introduces GPU synchronization when accessing GetOpenGlBufferReadViews from a different OpenGL context than was used for the GetOpenGlBufferWriteView.
Adds documentation about graph runtime monitoring.
Use wgpu::ShaderSourceWGSL instead of wgpu::ShaderModuleWGSLDescriptor.
[mediapipe] restore mediapipe_aar.bzl
Add CreateWgslShader utility.
Update resource loading in WebGpuShaderCalculator to latest API.
Added prompt templates for session in C API

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Android

[mediapipe] clean up an unused target ":llm" in core
[mediapipe] correct the protobuf_lite dependency
[mediapipe] move llm jni from "core" to "genai"
[mediapipe] move llm proto from "core" to "genai"
[mediapipe] build genai tasks with exact dependencies
[mediapipe] create genai's specific ProgressListener and ErrorHandler
[mediapipe] build vision and image_generator tasks with exact dependencies
Don't use MediaPipeException in JNI layer
Make generateResponseAsync() return a ListenableFuture and add ProgressCallback to its arguments
Update JNI to enable litert CPU backend for LLM inference.
Delete engine when task is closed.

iOS

Add sequenceBatchSize option when setting up the inference engine.

Javascript

Fix DrawingUtils constructor failing in Web Workers
Change starting LoraModel ids from 0 to 1.
Add a function to determine what type of model (handwritten, converted) a file is
Fix tee not cancelling the parent stream when both children are cancelled
Distinguish between '.bin' and '.task' in createFrom*
Move streamToUint8Array from task runner lib to model loading utility lib, so the graph runner extensions would be able to utilize it.

MediaPipe Dependencies

Update WASM files for 0.10.21-rc.20250303 release
Update WASM files for 0.10.22 release

07 Feb 22:57

kalyan2789g

v0.10.21

cad7f3a

MediaPipe v0.10.21

Framework and core calculator improvements

Update tensors_to_image_calculator.cc
Fix incorrect name in ValidateRequiredSidePacketTypes status message.
Delegate memory-mapping the model file to the resource system
Add documentation for GpuOrigin::DEFAULT
Add multiclass nms options for object detector.
Add static helpers to timestamp classes
Add Dockerfiles to allow users to build their own wheels
Remove std::aligned_storage.
Nit: add details to "no implementation available" error message
Remove use of designated initializers in tflite_model_loader.cc
Add resample_time_series_calculator.

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Android

Make LLM classes non-final to support mocking.
Adds TopP parameter in the LLM Inference API.
Add CPU / GPU options in Java LLM Inference Task.
Do not require Proto types in public API.

Javascript

[Web LLM] Fix for duplicate timestamp issue that could occur when loading two LoRA models in immediate succession
Return error code and file error message in C API for both PredictSync and PredictAsync
Added isIdle function to check whether web LlmInference instance is ready for work.
Make the parameters for generateResponse optional.

Model Maker changes

Enable the option of exporting a model with a fixed batch size.
Use Optional[int] instead of int | None for pre 3.10 python

18 Dec 22:17

dbcp1

v0.10.20

0f80899

MediaPipe v0.10.20

Build changes

Add comments to explain how to configure OpenCV in the opencv_macos.BUILD file.
Add libc++_shared.so to MediaPipe Android examples.
Add linkstatic to OpenCV prebuilts

Framework and core calculator improvements

Fix ParseFromString() compilation issue in OSS
Fix all dead links
Add troubleshooting tip for unsupported XNNPACK flags during build
Add UniqueId::Dup.
Update XNNPACK to the latest commit hash
Format Workspace file
Add EglSync wrapper.
Update Bazel version to 6.5.0
Update sync_wait to support UniqueFd.
Fix GlContext includes
Add EglSyncPoint/CreateEglSyncPoint.
Bump MediaPipe version to 0.10.19.
More perfetto tracking for EglSync.
Patch for supporting WebGPU .deviceInfo during API migration.
Log the Tensor multi-write error message only once.
Enable GpuBufferStorageAhwb ASYNC usage for use case: AhwbView write -> GlTextureView read
Add IsSignaled function (the previous SyncWait for checking status triggers unnecessary StrFormat)
Add type information to error message when accessing an empty packet.
Add SharedFD type
Update SyncWait/IsSignaled to work with SharedFd.
Enable SharedFd usage in EglSync
Adding VLOG overrides - MediaPipe utilizes VLOG heavily, but it's not straightforward to enable when running an Android app. VLOG overrides make it relatively quick to enable VLOGs for various modules within MediaPipe.
Updating Troubleshooting with VLOG info.
Slice only the tokens which are needed for the next stage of the LLM pipeline.
Adds DebugInputStreamHandler.
Delete YUVImage copy and move operations
Adds GetGraphRuntimeInfo methods, which generate runtime debugging information about the state of InputStreams.
Add a sample script to run LLM inference on Android via the MediaPipe LLM inference engine.
Update bot_config.yml
Add option to set max sequence size in PackMediaSequenceCalculator instead of having it hard coded.
Update run_llm_inference.sh with recommended models.
Allow reading the input frame rate from the header in the input side stream and limiting the frame rate.
Add stream operator<< for TypeId
Extract native-to-UTF8 path string conversion; add FormatLastError()
Update comment in yuv_image.h
Introduce shadow_copy parameter to PathToResourceAsFile
Fix header includes after refactoring
Migrate away from status builders
Avoids the creation of two "default" GpuExecutor instances
Adds and integrates GraphRuntimeInfoLogger into CalculatorGraph.
nit: don't overwrite InitializeDefaultExecutor argument "use_application_thread"
Add Name() access to the source names in api2.
Add memory mapping and locking to file helpers
Fix Windows build
Fix Windows build, part 2
Support memory mapping in resources.
Bump MediaPipe version to 0.10.20.
Enable log message output for messages larger than 4096 bytes.
Add vision modality to the C API

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Android

Adds the canonical toBuilder method to the LlmInferenceOptions object.
Add Vision Modality to the MediaPipe LLM JNI Layer
Add vision modality to the Java LLM API
Remove unused Proto dependency

iOS

Fixed empty pose world landmarks in iOS holistic landmarker

Javascript

Improve logging to allow users to understand 1) which InferenceCalculator backend is used (without extra VLOG flags) and 2) when a model is loaded (including its size).
nits: Remove linter warnings, fix unused includes.

Python

Update the expected accuracy for text embedder test.
Remove the check for start and stop tokens in the LLM bundler.

Model Maker changes

Move tensorflow lite python calls to ai-edge-litert.

MediaPipe Dependencies

Update WASM files

07 Nov 23:34

dbcp1

v0.10.18

76e52c7

MediaPipe v0.10.18

Build changes

Follow the open-sourcing of WebGPU by open-sourcing one of its dependencies, third_party/emscripten
Add pillow, pyyaml, and requests to model_maker BUILD

Framework and core calculator improvements

Loading resources through calculator and subgraph contexts and configuring through kResourcesService.
Use std::make_unique
Moves OnDiskCacheHelper class into a separate file / compilation target
Pools: report buffer specs on failure, fix status propagation, fix includes
Open-Source MediaPipe's WebGPU helpers.
BatchMatMul uses transpose parameter.
Introduce Resource to represent a generic resource (file content, embedded/in-memory resource) for reading.
Bump up the version number to 0.10.16
Migrate from AdapterProperties to AdapterInfo
Migrate from Resource::ReadContents to Resources::Get (using ForEachLine where required)
Update Resources docs to mention ForEachLine (so devs don't fall back to ReadContents in such a case)
Adjust WebGPU device registration
Fix includes/copies/checks for BuildLabelMapFromFiles
Migrate to BuildLabelMapFromFiles.
Update Python version requirements in setup.py
Introduce Resources with mapping, so graphs can use placeholders instead of actual resource paths.
Remove Resources::ReadContents & add Resource::TryReleaseAsString.
Fix ports for multi side outputs.
Update solution android apps with explicit exported attribute.
Ensure kResourcesService is set before CalculatorGraph is initialized (otherwise subgraphs/nodes may get the wrong default resources).
Switch inference tests to ResourceProviderCalculator & update builder to refer to MODEL_RESOURCE.
Migrate modules to use ResourceProviderCalculator.
Support single tensor input in TensorsToImageCalculator
Migrate TfLiteModelLoader to use MP Resources.
Remove deprecated TfLiteModelLoader::LoadFromPath.
Fix for isIOS() platform util on worker and non-worker contexts
Support single tensor input in TensorsToSegmentationCalculator
Makes CalculatorContext::GetGraphServiceManager() private
BatchMatMul can handle cases where ndims != 4 and quantization
RmsNorm has an optional scale parameter.
Allowed variable audio packet size by setting num_samples to null.
Fix technically correct but confusing example in top level comments.
Removing ReturnType helper, since it's part of the standard now.
Update XNNPack to 9/24
Enable LoRA conversion support for Gemma2-2B
Improve warning when InferenceCalculator backends are not linked
Bump MediaPipe version to 0.10.17.
Update OpenCV to a version that compiles with C++ 17
Force xnnpack when CPU inference is enforced
Install PyBind before TensorFlow to get the MediaPipe version
Change MP version to 0.10.18
Add validation to LLM bundler.
Add alternative takePicture method to support custom thread executor.
Add CopySign op.
Add const Spec() method to OutputStreamManager.
Add support for converting SRGBA ImageFrame to YUVImage.
Add model configuration parameters for Gemma2-2B.
Add menu for the default demo app and option to Close processor/graph and Exit gracefully.
Add ngrammer, per layer embeddings and Relu1p5 fields to llm_params and update from Proto.
Add a special InMemory Resources (current use case is in tests, but may be needed for some simple things as well).
Add ResourceProviderCalculator (replacement for LocalFileContentsCalculator).
Add Resource support into TfliteModelCalculator.
Add a flag to set the default number of XNNPACK threads.

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Android

Initialize new members in LlmModelSettings
Create an implicit session for all requests to generateResponse()
Change session management so that all JNI calls come from the same thread.
Add Session API support to LLM Java API

iOS

Updated name of iOS audio classifier delegate
Fixed incorrect stream mode in iOS audio classifier options
Added method to ios audio task runner
Updated iOS audio classifier BUILD file
Fixed buffer length calculation in iOS MPPAudioData
Updated iOS audio data tests to fix issue in buffer length calculation
Revert "Added method for getting interleaved float32 pcm buffer from audio file"
Updated comments in iOS LlmInference
Dropped Refactored suffix for modified files in iOS genai
Updated documentation of LlmTaskRunner
Removed allocation of LlmInference Options
Updated the response generation queue to be serial in iOS LlmInference
Updated documentation of iOS LlmInference and LlmInference+Session
Fixed marking of response generation completed control flow in LlmInference+Session.
LlmInference.Options: remove unnecessary numOfSupportedLoraRanks parameter.
Add activation data type to LlmInference.Options.
Added more methods to iOS AVAudioPCMBuffer+TestUtils
Added few basic iOS audio classifier tests
Added options tests to iOS audio classifier
Added utils for AVAudioFile
Added test for score threshold to MPPAudioClassifierTests
Added constants in MPPAudioClassifierTests
Added close method to iOS audio classifier
Added iOS MPPAudioData test utils
Added stream mode tests for iOS audio classifier
Added iOS audio classifier to cocoapods build
Added audio record creation tests to MPPAudioClassifierTests
Added close method to MPPAudioEmbedder
Added iOS audio embedder tests
Added more utility methods to MPPAudioEmbedderTests
Added streams mode tests for iOS audio embedder
Added iOS audio embedder to cocoapods build
Added comments to MPPAudioClassifierTests
Added iOS audio embedder header and implementation
Added iOS audio classifier implementation file
Added method for getting interleaved float32 pcm buffer from audio file
Added refactored iOS LlmTaskRunner
Added iOS LlmSessionRunner
Added more errors to GenAiInferenceError
Added refactored LlmInference
Added iOS session runner to build files
Added extra safeguards for response context in LlmSessionRunner
Added LlmInference+Session.swift
Added documentation regarding session and inference life times to iOS LLM Inference
Fixed issue with iOS audio embedder result parsing
Fixed iOS audio embedder options processing
Fixed index error in AVAudioFile+TestUtils
Fixed audio classifier result processing in stream mode
Fixed error handling in MPPAudioData
Fixed microphone recording issues in iOS MPPAudioRecord
Fixed documentation of iOS Audio Record
Fixed iOS audio record and audio data tests by avoiding audio engine running state checks
Fixed iOS audio embedder result helpers
Fixed bug due to simultaneous response generation calls across sessions
Updated method signatures in iOS audio classifier tests
Fixed flow limiting in iOS audio classifier
Removed duplicate test from MPPAudioClassifierTests
Updated comments in AVAudioFile+TestUtils
Changed the name of iOS audio classifier async test helper
Update comment for LlmInference.Session.clone() method.
Marked inits unavailable in MPPFloatBuffer
Updated documentation of iOS audio record
Adds LlmInference.Metrics to provide key performance metrics (initialization time, response generation time) for LLM inference.
Removed unwanted imports from iOS audio data tests
Cleaned up iOS audio test utils BUILD file
Remove the activation data type from the Swift API. We don't expect users to set it directly.
Use seconds instead of milliseconds for latency metrics.

Javascript

Add comments to generateResponses method.
Migrate to ForEachLine to have a single source of truth for getting file contents lines.
Workaround for multi-output web LLM issue where last response can get corrupted when numResponses is odd.
Quick fix for wrong number of multi-outputs sometimes when streaming

Python

Add a flag in the converter config for generating fake weights. When it is set to true, all weights will be filled with zeros.
Update text embedder test to match the output after XNNPack upgrade.
Update remaining data in text embedder test to match the output after XNNPack upgrade.
Update the expected value of the text embedder test.
Add python pip deps to WORKSPACE
Fix pip_deps targets.

Model Maker changes

Undo dynamic sequence length for export_model api because it doesn't work with MediaPipe.
Replace mock with unittest.mock in model_maker tests.
Move tensorflow lite python calls to ai-edge-litert.
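
The move from the third-party mock package to unittest.mock usually comes down to changing the import, since the standard-library module offers the same MagicMock and patch helpers. A minimal sketch of a test written against unittest.mock follows; export_model and the writer object are hypothetical stand-ins for illustration and are not taken from the Model Maker sources.

```python
import unittest
from unittest import mock  # standard library; takes the place of `import mock`


def export_model(writer, path):
    """Hypothetical helper: asks `writer` to save a model, then returns the path."""
    writer.save(path)
    return path


class ExportModelTest(unittest.TestCase):

    def test_export_calls_writer_once(self):
        # MagicMock records every call made on it, so no real writer is needed.
        writer = mock.MagicMock()
        result = export_model(writer, "model.tflite")
        writer.save.assert_called_once_with("model.tflite")
        self.assertEqual(result, "model.tflite")
```

Saved as a test module, this can be run with `python -m unittest`.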

MediaPipe Dependencies

Update WASM files

30 Aug 19:09

ayushgdev

v0.10.15

e252e56

MediaPipe v0.10.15

Build changes

Fix unwanted dependency on GPU libraries.
Adds TwoTapFirFilterCalculator.
Add public visibility to graph_service headers.
Disable ASAN, TSAN and MSAN tests which take more than 10 minutes.

Framework and core calculator improvements

Update PointToForeign with an optional cleanup object.
Enable BeginLoopCalculator for move-only types (e.g. Tensor) without Packet::Consume usage and copyable types without copying unless it's a fundamental type.
Ensure proper release of resources in case of multiple AHWB reads.
Enables the configuration of GpuBufferPool options via GpuResources::Create();
Bugfix to correctly handle landmark projection in the non-square case.
Add utility to wait for a sync (represented by FD)
Change a RET_CHECK to RET_CHECK_EQ
KinematicPathSolver: Avoid overshooting target
Introduce GetDefaultGpuExecutor(GpuResources) to allow executing all calculators on MP GPU thread.
No destruction for static ahwb_usage_track_.
Unbind framebuffer in Affine Transformation Runner GL
Move/isolate ahwb_usage_track_ into tensor_ahwb
Guard ahwb_tensor_track_ with mutex.
Add SidePacketConnectionTest
Update C++ Graph Builder to support executors and support input/output stream handlers.
Rename Node::Input/OutputStreamHandler to Node::SetInput/OutputStreamHandler
Add Packet::Share() method in replacement of SharedPtrWithPacket() function.
Default to high-performance power preference hint for WebGL contexts. For some computers with dual GPUs (like MBP2019), this will more frequently give us the higher performance GPU, which is generally preferable for most of our use cases (realtime rendering and ML), since speed is more critical than power consumption. If necessary, the user can override this setting by requesting their canvas' WebGL context manually before initializing the graph.
Introduce input_scale parameter to SpectrogramCalculator.
Improve documentation of graph options
Add an option to PackMediaSequenceCalculator to add empty clip labels instead of ignoring them. This is useful when we want to distinguish processing errors from no-detections.
Updates language detection headers
Fix dangling error reporter pointer in memory mapped models
Fix for possible infinite stall using setOptions immediately before a loadLoraModel call.
Add relu1p5 op, abs op, Log op, mdspan and Lhs Broadcast Sub with test
Fix missing member move in Tensor class
Add support for single Tensor output streams for ImageToTensorCalculator.
Fix some compilation errors in WebGPU code. These changes are all minor.
Add single tensor output support to tensor_converter_calculator.
Replace QCHECK with ABSL_QCHECK and CHECK with ABSL_CHECK.
Fix a bug in TensorAHWB that triggers a crash with multiple delayed AHWB readers followed by a CPU reader.
Fixes an unnecessary allocation of GraphServiceManager in case it is adopted from the calculator context.
Fix triggering of DFATAL message.
Remove xnn_enable_avx512fp16=false from .bazelrc
Replace uses of TfLiteOperatorCreate with TfLiteOperatorCreateWithData
Compile with '--keep_going' in setup.py
Update ndk version so that our open source users get the best possible performance out of mediapipe.
Correct address of android ndk
Replace absl::make_unique with std::make_unique in tensor.cc and tensor_ahwb.cc.
LLM decode benchmarks fill the cache with a predefined number of tokens before starting decoding.
Add logic to drop the offending non-monotonically increasing timestamp in the MicrophoneHelper.
Make packet payload const.
Pass flag to indicate that consuming op may support prepacked GEMM.
Get timestamp from OpenCV VideoCapture after first frame is read.
Update XNNPack and cpuinfo
Update TensorFlow to 2024-07-18.
Remove deprecated TfLiteOperatorCreateWithData function
Add option to use shifted window in SpectrogramCalculator.
Move AhwbUsage struct and helper methods into a separate library.
Make fields in PacketGetter.Pair public.
The GraphProfiler may be destroyed before the task is executed in the executor.
Introduce flag in MicrophoneHelper to drop non-increasing timestamps.
llm_test - add batch size of 8 for BM_Llm_QCINT8/512/128
Add method to create MP Tensor from TfLite tensor specs
Refactors AHardwareBufferView class to be instantiated with a TensorAhwbUsage pointer.
Refactor LlmBuilder to have one graph
Add expected_seq_len param to ComputeLogits()
Fix mediapipe::file::Exists() for >2GB files on Windows.
Bump XNNPACK and KleidiAI versions.
Update MP demo app to acquire wake lock
Replace mediapipe::StatusOr with absl::StatusOr
Sync on ssbo_written_ before mapping an AHWB to a CpuReadView.
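
The two MicrophoneHelper entries in this list concern packets whose timestamps do not increase, which matters because MediaPipe expects timestamps on a stream to be strictly increasing. The filtering idea can be sketched as below. This is an illustration only, not the MicrophoneHelper implementation; the function name and the (timestamp, payload) tuple layout are assumptions made for the example.

```python
def drop_non_increasing(packets):
    """Keeps only packets whose timestamp is strictly greater than the last kept one.

    `packets` is an iterable of (timestamp, payload) tuples (an assumed layout).
    """
    kept = []
    last_ts = None
    for ts, payload in packets:
        if last_ts is not None and ts <= last_ts:
            # Offending packet: its timestamp repeats or goes backwards, so skip it.
            continue
        kept.append((ts, payload))
        last_ts = ts
    return kept


packets = [(0, "a"), (10, "b"), (10, "c"), (5, "d"), (20, "e")]
print(drop_non_increasing(packets))  # [(0, 'a'), (10, 'b'), (20, 'e')]
```

Comparing against the last kept timestamp, rather than the last seen one, ensures that a single out-of-order packet does not cause later valid packets to be dropped.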

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to other platforms.

Android

Bump targetSdkVersion to 34 throughout MediaPipe.

iOS

Updated documentation in iOS audio classifier
Added iOS holistic landmarker to vision framework build
Changed method name in MPPAudioClassifierResult
Added audio classifier options helpers
Added audio classifier result helpers
Added method to create audio record MPPAudioTaskRunner
Removed unused imports in MPPAudioTaskRunner
Added iOS audio embedder result, classifier result, classifier options, embedder options, embedder options helpers, classifier header and embedder result helpers
Add missing argument for num_draft_tokens.

Javascript

Set quantization bits for LoRA weight conversion to match those specified
Warn on adding packets to a closed input stream instead of silently dropping packets.
Enable experimental support for Chromium WGSL subgroups in LLM API, when available.
Support multi-response generation.

Python

Add prompt template to llm bundler.

Bug fixes

class_weights flag causes a crash for multiclass case

Model Maker changes

Rename old BinaryAUC metric to BinarySparseAUC (used by text_classifier) and create a new BinaryAUC metric which does not expect sparse inputs.
Allow configuration of num_parallel_calls and cycle_length in hparams
Improve python code format.
Use tf.io.gfile.GFile for writing metadata file in image classifier.
Change SparsePrecision metric to BinarySparsePrecision metric, and likewise SparseRecall to BinarySparseRecall, in the core library. We only care about these metrics in the binary case, so this change makes the metric class names more accurate for their intended usage.
Support multilabel model training in text classifier
Create and add metrics for multi-class case
Support a customized best model monitor for multiclass cases

MediaPipe Dependencies

Update WASM files

13 May 17:40

ayushgdev

v0.10.14

4cf89a7

MediaPipe v0.10.14

Framework and core calculator improvements

Expose Lora ranks.
Update C API documentation to make it clear that the callback is invoked multiple times
Do not free response in PredictAsync callback
Enable usage of DRISHTI_PROFILING from non mediapipe namespaces.
Add model type to ImageGeneratorOptions.
Allow casting Stream->Stream

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to other platforms.

iOS

Added iOS audio data tests
Removed unused methods in AVAudioPCMBufferTestUtils
Added read at offset tests to MPPAudioRecordTests
Renamed property in MPPAudioData
Added iOS Audio Packet Creator
Added iOS audio running mode
Added iOS Packet Creator
Added iOS audio task runner
Updated documentation of MPPAudioPacketCreator

Javascript

Allow models to be uploaded via ReadableStreamDefaultReader
Allow all tasks to use a ReadableStreamDefaultReader
Expose Web LoRA API.
Raise WebGPU errors to JavaScript.
Update GenAI Experimental README
Update GenAI README

Python

Fixed result_callback() argument

MediaPipe Dependencies

Flatbuffers upgrade to 24.3.7
Update TF and FlatBuffer dependency to latest.
