[Feature Request] Android GPU Acceleration Support (Vulkan/MLC-LLM)

### Describe the bug

Feature Request
Description
Currently, LLMUnity supports only CPU inference on Android: `LLMLib.cs` (lines 671-674) adds only the "android" architecture, regardless of GPU settings. This results in very slow inference (2-3 tokens/sec) on mobile devices.
I would like to request **Android GPU acceleration support** using a Vulkan backend.
Current Behavior
- `numGPULayers` setting is visible in the Inspector but has no effect on Android
- `PossibleArchitectures()` in `LLMLib.cs` only returns the CPU architecture for Android:
```csharp
else if (Application.platform == RuntimePlatform.Android)
{
    architectures.Add("android"); // Only CPU, no GPU option
}
```
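
To illustrate the requested behavior, here is a minimal, self-contained sketch of how the Android branch might choose between a GPU and a CPU build. It is not LLMUnity code: the helper `AndroidArchitectures`, the `vulkanAvailable` flag, and the `"android-vulkan"` architecture name are all hypothetical and only show the intended fallback order.

```csharp
using System;
using System.Collections.Generic;

public static class ArchitectureSketch
{
    // Hypothetical helper (not part of LLMUnity): lists candidate Android
    // architectures in order of preference.
    public static List<string> AndroidArchitectures(int numGPULayers, bool vulkanAvailable)
    {
        var architectures = new List<string>();

        // Prefer a GPU build only when layers are offloaded and the caller
        // reports that Vulkan can be used on this device.
        if (numGPULayers > 0 && vulkanAvailable)
        {
            // Assumed name; no such build is confirmed to exist.
            architectures.Add("android-vulkan");
        }

        // The existing CPU-only architecture stays as the fallback.
        architectures.Add("android");
        return architectures;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", AndroidArchitectures(20, true)));  // android-vulkan, android
        Console.WriteLine(string.Join(", ", AndroidArchitectures(0, true)));   // android
        Console.WriteLine(string.Join(", ", AndroidArchitectures(20, false))); // android
    }
}
```

Keeping `"android"` last in the list means devices without usable Vulkan support would behave exactly as they do today.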

### Steps to reproduce

_No response_

### LLMUnity version

v3.0.0

### Operating System

None

[Feature Request] Android GPU Acceleration Support (Vulkan/MLC-LLM) #371