[Feature Request] Android GPU Acceleration Support (Vulkan/MLC-LLM) #371

@Fangwangye

Description

Describe the bug

Feature Request
Description
Currently, LLMUnity supports only CPU inference on Android: lines 671-674 of LLMLib.cs add only the "android" architecture, regardless of GPU settings. This results in very slow inference (2-3 tokens/sec) on mobile devices.
I would like to request Android GPU acceleration support using a Vulkan backend.
Current Behavior
The numGPULayers setting is visible in the Inspector but has no effect on Android.
PossibleArchitectures() in LLMLib.cs returns only the CPU architecture for Android:
csharp
else if (Application.platform == RuntimePlatform.Android)
{
    architectures.Add("android"); // Only CPU, no GPU option
}
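
To make the request concrete, here is a minimal, hypothetical sketch of how the Android branch could take numGPULayers into account. It is not LLMUnity's actual code: the class name, the method name `PossibleAndroidArchitectures`, and especially the architecture name "android-vulkan" are assumptions made for illustration, and no such GPU library is confirmed to ship with LLMUnity. The sketch is kept free of Unity dependencies so it can be run as a plain console program.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch only; names below are assumptions, not LLMUnity's API.
public static class ArchitectureSketch
{
    // Returns candidate library architectures for Android, in load order.
    // "android-vulkan" is an assumed name for a Vulkan-enabled build.
    public static List<string> PossibleAndroidArchitectures(int numGPULayers)
    {
        var architectures = new List<string>();
        if (numGPULayers > 0)
        {
            // Prefer the GPU build when the user asked for GPU layers.
            architectures.Add("android-vulkan");
        }
        // Always keep the CPU build as a fallback (current behavior).
        architectures.Add("android");
        return architectures;
    }

    public static void Main()
    {
        // prints "android"
        Console.WriteLine(string.Join(", ", PossibleAndroidArchitectures(0)));
        // prints "android-vulkan, android"
        Console.WriteLine(string.Join(", ", PossibleAndroidArchitectures(32)));
    }
}
```

Listing the GPU build first and the CPU build last would let devices without usable Vulkan support fall back to the existing CPU path, assuming the loader tries architectures in order.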

Steps to reproduce

No response

LLMUnity version

v3.0.0

Operating System

None

Metadata

Labels

bug
