Regarding the slow performance issue of Android Chrome

Model: Llama 3.2, 3B, Q4F16_1
Phone: Pixel 8 Pro
Comparing the logs from the PC and the phone, throughput on the phone appears to be throttled: even a short response takes 20 seconds or more. Switching to a 1B/1.7B model (Qwen) gives approximately the same results.

PC (AMD-680M)
prefill_chunk_size = 1024
buffer_size_required_bytes = 64MB: [WebLLM Stats] prefill: 111.7527 tok/s, decoding: 18.0826 tok/s (Prompt: 300 / Gen: 157) [F16: ON]
buffer_size_required_bytes = 128MB: [WebLLM Stats] prefill: 111.7943 tok/s, decoding: 15.5302 tok/s (Prompt: 300 / Gen: 170) [F16: ON]
____

Pixel 8 Pro
prefill_chunk_size = 1024
buffer_size_required_bytes = 64MB: [WebLLM Stats] prefill: 5.3816 tok/s, decoding: 5.0826 tok/s (Prompt: 148 / Gen: 47) [F16: ON]
buffer_size_required_bytes = 128MB: [WebLLM Stats] prefill: 5.3792 tok/s, decoding: 5.0428 tok/s (Prompt: 148 / Gen: 51) [F16: ON]
buffer_size_required_bytes = 256MB: [WebLLM Stats] prefill: 5.3907 tok/s, decoding: 5.0690 tok/s (Prompt: 148 / Gen: 48) [F16: ON]
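
To put a number on the gap, the `[WebLLM Stats]` lines can be parsed and compared. The following is a minimal sketch, not part of WebLLM: the regular expression is inferred only from the log lines quoted in this report, and `parse_stats` is a hypothetical helper name. The two runs also used different prompts (300 vs. 148 tokens), so the ratios are only a rough indication.

```python
import re

# Pattern inferred from the log lines quoted in this report (an assumption,
# not an official WebLLM log format specification).
STATS_RE = re.compile(
    r"prefill:\s*([\d.]+)\s*tok/s,\s*decoding:\s*([\d.]+)\s*tok/s\s*"
    r"\(Prompt:\s*(\d+)\s*/\s*Gen:\s*(\d+)\)"
)


def parse_stats(line):
    """Extract throughput and token counts from one [WebLLM Stats] line."""
    match = STATS_RE.search(line)
    if match is None:
        raise ValueError(f"not a WebLLM stats line: {line!r}")
    prefill, decoding, prompt, gen = match.groups()
    return {
        "prefill": float(prefill),    # prefill throughput, tok/s
        "decoding": float(decoding),  # decoding throughput, tok/s
        "prompt": int(prompt),        # prompt length in tokens
        "gen": int(gen),              # generated tokens
    }


# 64MB runs copied from the logs in this report.
pc = parse_stats(
    "[WebLLM Stats] prefill: 111.7527 tok/s, decoding: 18.0826 tok/s "
    "(Prompt: 300 / Gen: 157) [F16: ON]"
)
phone = parse_stats(
    "[WebLLM Stats] prefill: 5.3816 tok/s, decoding: 5.0826 tok/s "
    "(Prompt: 148 / Gen: 47) [F16: ON]"
)

prefill_ratio = pc["prefill"] / phone["prefill"]
decode_ratio = pc["decoding"] / phone["decoding"]
print(f"prefill: PC is {prefill_ratio:.1f}x faster than the phone")
print(f"decoding: PC is {decode_ratio:.1f}x faster than the phone")
```

With these figures the PC is roughly 21x faster at prefill but only about 3.6x faster at decoding, so the phone's prefill rate (about the same as its decoding rate) is the more unusual number.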

prefill_chunk_size = 256 (WASM recompiled)
buffer_size_required_bytes = 256MB: [WebLLM Stats] prefill: 5.3828 tok/s, decoding: 5.0957 tok/s (Prompt: 148 / Gen: 49) [F16: ON]
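
The "20 seconds or more" observation is consistent with the reported throughput. Below is a rough sketch of the arithmetic, assuming the tok/s figures are averages over each phase and ignoring model loading and any other overhead; `estimate_seconds` is a hypothetical helper, not a WebLLM function.

```python
def estimate_seconds(prompt_tokens, gen_tokens, prefill_tps, decode_tps):
    """Estimate response time as prefill time plus decoding time.

    Assumes the reported tok/s values are averages over each phase
    (an assumption, not something the logs confirm).
    """
    prefill_s = prompt_tokens / prefill_tps
    decode_s = gen_tokens / decode_tps
    return prefill_s + decode_s


# Pixel 8 Pro, 64MB run from the logs above.
phone_total = estimate_seconds(148, 47, 5.3816, 5.0826)

# PC (AMD-680M), 64MB run from the logs above.
pc_total = estimate_seconds(300, 157, 111.7527, 18.0826)

print(f"Pixel 8 Pro: about {phone_total:.0f} s")
print(f"PC: about {pc_total:.0f} s")
```

Under these assumptions the phone needs about 37 seconds for a 148-token prompt and 47 generated tokens, most of it spent in prefill, while the PC handles a prompt twice as long and three times the output in about 11 seconds.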

Regarding the slow performance issue of Android Chrome #759
