
create model from local file #269

Open

maifeeulasad wants to merge 16 commits into ollama:main from maifeeulasad:modelfile-local-maifee

Conversation


@maifeeulasad commented Jan 16, 2026

closes #191, #194

There will be some issues with Ollama API compatibility, as mentioned in https://github.com/maifeeulasad/ollama-js/blob/0bff23ca558835a47c9191f603725b2227d0f309/src/index.ts#L62; we need to discuss this one further.

I will be writing:

  • tests
  • documentation
  • a stress-testing script, to validate whether this multi-step file upload approach is actually more efficient

 - calculate sha256
 - check existence
 - upload blob
 - create blob-to-file map
 - replace model file contents with blob references
 - multi-step approach instead of one big request
 - upload blobs first
 - validate files with hashing
 - send request to create model from the given data
 - multiple file upload supported
@maifeeulasad (Author)

@BruceMacD care to take a look into this one please?

@maifeeulasad (Author)

🎯 Stress Testing Results

| File                          | Size      | Avg Time | Min     | Max       | Total |
| ----------------------------- | --------- | -------- | ------- | --------- | ----- |
| mmproj-tinygemma3.gguf        | 1.01 MB   | 8.06ms   | 5.85ms  | 10.98ms   | 806ms |
| gte-small.Q2_K.gguf           | 24.08 MB  | 9.01ms   | 6.47ms  | 37.49ms   | 901ms |
| tinygemma3-Q8_0.gguf          | 45.04 MB  | 88.46ms  | 84.16ms | 103.03ms  | 8.85s |
| mxbai-embed-large-v1-f16.gguf | 638.58 MB | 29.33ms  | 16.33ms | 1102.73ms | 2.93s |

Key Findings:

  • Highly optimized blob caching: the 639MB file averaged only 29.33ms because blobs were cached after the first upload
  • Consistent performance: small variance (±5-10ms) across 100 iterations for the smaller files
  • One outlier: the 639MB file had one slow iteration (1102ms) versus the typical 16-29ms, likely the initial uncached upload

Parallel Upload Testing:

  • 4 files with parallel degree 8 completed in 104.70ms
  • Demonstrates efficient concurrent model creation
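One way to read "parallel degree 8": the create jobs run through a small bounded-concurrency pool. A generic sketch follows; the stress script's actual helpers are not shown in this thread, so the function name and shape here are assumptions.

```typescript
// Run async jobs with at most `degree` in flight at once, as in the
// PARALLEL_DEGREE=8 stress test (hypothetical helper, not the PR's code).
async function parallelLimit<T>(
  jobs: (() => Promise<T>)[],
  degree: number
): Promise<T[]> {
  const results: T[] = new Array(jobs.length);
  let next = 0;
  // Each worker repeatedly claims the next unclaimed job index.
  // Single-threaded JS makes `next++` safe without locks.
  async function worker(): Promise<void> {
    while (next < jobs.length) {
      const i = next++;
      results[i] = await jobs[i]();
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(degree, jobs.length) }, worker)
  );
  return results;
}
```

With 4 files and degree 8, all four creates start immediately, which matches the pool completing in roughly the time of the slowest single create.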

Combined Stress Test (400 operations):

  • Phase 1 (Sequential): 400 operations across 100 iterations
  • Phase 2 (Parallel): 138.63ms for 4 models
  • Phase 3 (Burst): 129.37ms for 4 models simultaneously
  • Memory efficiency: Started at 15.84MB, peaked at 28.07MB (+12.23MB), ended at 16.75MB
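The start/peak/end heap figures above can be sampled with Node's `process.memoryUsage()`; this is a sketch of one way to collect them, assuming a periodic sampler rather than whatever instrumentation the actual script uses.

```typescript
// Current heap usage in MB.
function heapMB(): number {
  return process.memoryUsage().heapUsed / (1024 * 1024);
}

// Run an async workload while sampling the heap every 10ms,
// reporting start, peak, and end heap (hypothetical helper).
async function withHeapStats<T>(
  fn: () => Promise<T>
): Promise<{ result: T; start: number; peak: number; end: number }> {
  const start = heapMB();
  let peak = start;
  const timer = setInterval(() => {
    peak = Math.max(peak, heapMB());
  }, 10);
  try {
    const result = await fn();
    peak = Math.max(peak, heapMB());
    return { result, start, peak, end: heapMB() };
  } finally {
    clearInterval(timer); // stop sampling even if fn() throws
  }
}
```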

💾 Memory Analysis

  • Maximum heap: 28.07MB during combined stress test
  • Extremely efficient, considering 808 model creation operations

⚡ Performance Comparison: ollama-js vs curl

| File      | Curl Time | ollama-js Time | Speedup      |
| --------- | --------- | -------------- | ------------ |
| 1.01 MB   | 26.63ms   | 1.36ms         | 19.6× faster |
| 24.08 MB  | 45.60ms   | 1.52ms         | 30× faster   |
| 45.04 MB  | 60.78ms   | 1.85ms         | 32.9× faster |
| 638.58 MB | 493.04ms  | 1.57ms         | 314× faster  |

Commands Used

  • ITERATIONS=100 PARALLEL_DEGREE=8 npx tsx examples/create-from-files/stess-test.ts
  • ollama list | awk 'NR>1 {print $1}' | grep '^stress-test-' | xargs -r ollama rm

@BruceMacD (Member) left a comment


Thanks for all your hard work submitting this, I appreciate it.

The file upload flow gets the big picture right: blob upload via streaming, then a JSON create request with digest references. Nice work on that. A few things to address.
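The create step of that flow amounts to referencing the uploaded blobs by digest in a `files` map on `POST /api/create`. A hedged sketch, with field names following the public Ollama create API; the PR's exact payload shape may differ:

```typescript
// Build the JSON body for /api/create: no file contents, only
// "sha256:..." digest references produced by the blob-upload step.
function buildCreateRequest(
  model: string,
  files: Record<string, string> // filename -> "sha256:..." digest
): { model: string; files: Record<string, string>; stream: boolean } {
  return { model, files, stream: false };
}

// Hypothetical usage against a local server:
async function createFromBlobs(
  host: string,
  model: string,
  files: Record<string, string>
): Promise<Response> {
  return fetch(`${host}/api/create`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildCreateRequest(model, files)),
  });
}
```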

@maifeeulasad (Author)

All issues have been addressed. Care to take another look please? @BruceMacD



Development

Successfully merging this pull request may close these issues.

Support Creating Ollama Model from Local Files

2 participants