
Conversation

@loci-dev

Mirrored from ggml-org/llama.cpp#17504

Updated server-local-image-loading feature branch with latest master branch changes

This is the same changeset as ggml-org/llama.cpp#16874, updated to the latest changes in master; the branches had diverged too far, leaving the prior pull request stale.

@loci-agentic-ai

Explore the complete analysis inside the Version Insights

Performance Analysis Summary

Project: llama.cpp
PR #324: Server support for local image path loading
Comparison: Target version 34caec7c vs Baseline aab9b31c


Analysis Overview

This PR adds local file loading capability for multimodal models via file:// URLs with security controls. The implementation introduces two new command-line arguments (--allowed-local-media-path, --local-media-max-size-mb) and file validation logic in the server request processing path.


Key Findings

Performance-Critical Areas Impact

Argument Parsing Module (common/arg.cpp):
The observed performance regressions occur in lambda operators within common_params_parser_init, specifically in regex scanning operations. The most affected functions show response-time increases ranging from 4,159 µs to 72,070 µs, with per-call throughput deltas between 67 ns and 424 ns.

This PR adds two new argument parsing lambdas that perform filesystem operations (std::filesystem::canonical, std::filesystem::is_directory) rather than regex operations. These filesystem calls execute once during server initialization and are not on the critical inference path.
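
As a rough illustration, the one-time check such a lambda performs might look like the sketch below. This is not the PR's actual code — the function name and error handling are invented — but the std::filesystem::canonical / std::filesystem::is_directory sequence matches what the analysis describes:

```cpp
#include <filesystem>
#include <stdexcept>
#include <string>

// Hypothetical sketch of the one-time validation performed while parsing
// --allowed-local-media-path. Runs once at server startup, so its cost is
// off the inference critical path.
static std::filesystem::path resolve_allowed_media_root(const std::string & value) {
    std::error_code ec;
    // canonical() resolves symlinks and ".." segments and fails if the
    // path does not exist; this is what makes later prefix checks sound.
    const std::filesystem::path root = std::filesystem::canonical(value, ec);
    if (ec) {
        throw std::invalid_argument("allowed-local-media-path cannot be resolved: " + value);
    }
    if (!std::filesystem::is_directory(root)) {
        throw std::invalid_argument("allowed-local-media-path is not a directory: " + value);
    }
    return root;
}
```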

Inference and Tokenization Functions:
No changes were made to core inference functions (llama_decode, llama_encode, llama_tokenize, llama_model_load_from_file). The multimodal processing integration uses existing mtmd_default_marker() patterns without modifying token processing logic. Request-time file loading occurs during prompt preparation, before inference begins.
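
For context on where that request-time cost lands, a minimal sketch of the load step is shown below, assuming a plain scheme-strip-and-read approach; the function name and error strategy are hypothetical, not the PR's implementation:

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch of request-time loading: strip the file:// scheme,
// then read the image bytes once, before tokenization and decode begin.
// Because this happens during prompt preparation, it adds latency to
// request setup (milliseconds per image) but not to the decode loop.
// NOTE: the real server validates the path against the allowed root
// (see the security checks later in this summary) before reading.
static std::vector<unsigned char> load_local_media(const std::string & url) {
    static const std::string scheme = "file://";
    if (url.rfind(scheme, 0) != 0) {
        throw std::runtime_error("not a file:// URL: " + url);
    }
    const std::string path = url.substr(scheme.size());
    std::ifstream f(path, std::ios::binary);
    if (!f) {
        throw std::runtime_error("failed to open local media: " + path);
    }
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(f),
                                      std::istreambuf_iterator<char>());
}
```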

Tokens Per Second Impact

No measurable impact on inference throughput. The affected functions are in argument parsing and request preprocessing, not in the token generation loop. File loading operations (2-12 milliseconds per image) occur during request initialization, outside the inference cycle. Core tokenization and decode functions remain unchanged.

Power Consumption Analysis

Power consumption changes across binaries are minimal:

  • build.bin.llama-cvector-generator: 516 nanojoules decrease (0.185% reduction)
  • build.bin.llama-tts: 204 nanojoules decrease (0.072% reduction)
  • build.bin.llama-quantize: 16 nanojoules increase (0.040% increase)
  • build.bin.llama-bench: 2 nanojoules increase (0.004% increase)

All changes are below 0.2%, indicating no meaningful power consumption impact from this PR.

Code Changes Context

The implementation adds security-focused file access controls: path canonicalization prevents directory traversal, prefix matching enforces directory whitelisting, file type validation restricts to regular files, and size limits prevent resource exhaustion. These are one-time validation operations during request processing, not repeated during inference.
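
A minimal sketch of those four checks, assuming the allowed root was canonicalized once at startup (names, signature, and error reporting are hypothetical):

```cpp
#include <cstdint>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical sketch of the per-request security checks described above.
// allowed_root is assumed to be an already-canonicalized directory.
static bool is_local_media_allowed(const fs::path & allowed_root,
                                   const std::string & requested,
                                   std::uintmax_t max_size_bytes,
                                   std::string & err) {
    std::error_code ec;

    // 1. Canonicalization: resolve symlinks and "..", defeating traversal.
    const fs::path canonical = fs::canonical(requested, ec);
    if (ec) { err = "path does not exist or cannot be resolved"; return false; }

    // 2. Prefix match (directory whitelisting): the canonical path must
    //    sit under the allowed root.
    const std::string rel = canonical.lexically_relative(allowed_root).generic_string();
    if (rel.empty() || rel == ".." || rel.rfind("../", 0) == 0) {
        err = "path is outside the allowed media directory"; return false;
    }

    // 3. File-type validation: only regular files, no devices or sockets.
    if (!fs::is_regular_file(canonical, ec) || ec) {
        err = "not a regular file"; return false;
    }

    // 4. Size limit: reject oversized files to prevent resource exhaustion.
    const std::uintmax_t size = fs::file_size(canonical, ec);
    if (ec || size > max_size_bytes) {
        err = "file exceeds the configured size limit"; return false;
    }
    return true;
}
```

In practice the --local-media-max-size-mb value would presumably be converted to bytes (value * 1024 * 1024) before a check like step 4.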

The extreme percentage increases observed in argument parsing functions (up to 2,876,615%) reflect changes in regex pattern complexity across the broader codebase, not the filesystem operations introduced by this PR. The absolute time increases (in the microsecond range) occur during initialization, which runs once per server start, or during file validation, which runs once per request.

@loci-dev force-pushed the main branch 7 times, most recently from 92ef8cd to 7dd50b8 on November 26, 2025 at 16:10.