Commit d4aef5c

AI: Rename vision.ApiRequestOptions to vision.ModelOptions

Signed-off-by: Michael Mayer <[email protected]>
Parent: 068d5db

File tree

12 files changed: +86 −74 lines changed


internal/ai/vision/README.md

Lines changed: 23 additions & 12 deletions

```diff
@@ -1,6 +1,6 @@
 ## PhotoPrism — Vision Package
 
-**Last Updated:** November 25, 2025
+**Last Updated:** December 2, 2025
 
 ### Overview
 
@@ -51,20 +51,29 @@ The `vision.yml` file is usually kept in the `storage/config` directory (overrid
 #### Model Options
 
-| Option            | Default                                                                                 | Description                                                                        |
-|-------------------|-----------------------------------------------------------------------------------------|------------------------------------------------------------------------------------|
-| `Temperature`     | engine default (`0.1` for Ollama; unset for OpenAI)                                     | Controls randomness; clamped to `[0,2]`. `gpt-5*` OpenAI models are forced to `0`. |
-| `TopP`            | engine default (`0.9` for some Ollama label defaults; unset for OpenAI)                 | Nucleus sampling parameter.                                                        |
-| `MaxOutputTokens` | engine default (OpenAI caption 512, labels 1024; Ollama label default 256)              | Upper bound on generated tokens; adapters raise low values to defaults.            |
-| `ForceJson`       | engine-specific (`true` for OpenAI labels; `false` for Ollama labels; captions `false`) | Forces structured output when enabled.                                             |
-| `SchemaVersion`   | derived from schema name                                                                | Override when coordinating schema migrations.                                      |
-| `Stop`            | engine default                                                                          | Array of stop sequences (e.g., `["\\n\\n"]`).                                      |
-| `NumThread`       | runtime auto                                                                            | Caps CPU threads for local engines.                                                |
-| `NumCtx`          | engine default                                                                          | Context window length (tokens).                                                    |
+The model `Options` adjust model parameters such as temperature, top-p, and schema constraints when using [Ollama](ollama/README.md) or [OpenAI](openai/README.md):
+
+| Option            | Default                                                                                 | Description                                                                             |
+|-------------------|-----------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|
+| `Temperature`     | engine default (`0.1` for Ollama)                                                       | Controls randomness with a value between `0.01` and `2.0`; not used for OpenAI's GPT-5. |
+| `TopK`            | engine default (model-specific)                                                         | Limits sampling to the top K tokens to reduce rare or noisy outputs.                    |
+| `TopP`            | engine default (`0.9` for some Ollama label defaults; unset for OpenAI)                 | Nucleus sampling; keeps the smallest token set whose cumulative probability ≥ `p`.      |
+| `MinP`            | engine default (unset unless provided)                                                  | Drops tokens whose probability mass is below `p`, trimming the long tail.               |
+| `TypicalP`        | engine default (unset unless provided)                                                  | Keeps tokens with typicality under the threshold; combine with TopP/MinP for flow.      |
+| `Seed`            | random per run (unless set)                                                             | Fix for reproducible outputs; unset for more variety between runs.                      |
+| `RepeatLastN`     | engine default (model-specific)                                                         | Number of recent tokens considered for repetition penalties.                            |
+| `RepeatPenalty`   | engine default (model-specific)                                                         | Multiplier >1 discourages repeating the same tokens or phrases.                         |
+| `NumPredict`      | engine default (Ollama only)                                                            | Ollama-specific max output tokens; synonymous intent with `MaxOutputTokens`.            |
+| `MaxOutputTokens` | engine default (OpenAI caption 512, labels 1024)                                        | Upper bound on generated tokens; adapters raise low values to defaults.                 |
+| `ForceJson`       | engine-specific (`true` for OpenAI labels; `false` for Ollama labels; captions `false`) | Forces structured output when enabled.                                                  |
+| `SchemaVersion`   | derived from schema name                                                                | Override when coordinating schema migrations.                                           |
+| `Stop`            | engine default                                                                          | Array of stop sequences (e.g., `["\\n\\n"]`).                                           |
+| `NumThread`       | runtime auto                                                                            | Caps CPU threads for local engines.                                                     |
+| `NumCtx`          | engine default                                                                          | Context window length (tokens).                                                         |
 
 #### Model Service
 
-Used for Ollama/OpenAI (and any future HTTP engines). All credentials and identifiers support `${ENV_VAR}` expansion.
+Configures the endpoint URL, method, format, and authentication for [Ollama](ollama/README.md), [OpenAI](openai/README.md), and other engines that perform remote HTTP requests:
 
 | Field | Default | Notes |
 |------------------------------------|------------------------------------------|------------------------------------------------------|
@@ -78,6 +87,8 @@ Used for Ollama/OpenAI (and any future HTTP engines). All credentials and identi
 | `FileScheme` | set by engine alias (`data` or `base64`) | Controls image transport. |
 | `Disabled`   | `false`                                  | Disable the endpoint without removing the model.     |
 
+> **Authentication:** All credentials and identifiers support `${ENV_VAR}` expansion. `Service.Key` sets `Authorization: Bearer <token>`; `Username`/`Password` injects HTTP basic authentication into the service URI when it is not already present.
+
 ### Field Behavior & Precedence
 
 - Model identifier resolution order: `Service.Model` → `Model` → `Name`. `Model.GetModel()` returns `(id, name, version)` where Ollama receives `name:version` and other engines receive `name` plus a separate `Version`.
```
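To make the renamed options concrete, a per-model `Options` block in `vision.yml` might look like the sketch below. This is a hypothetical fragment: the model name, the `Models:` list key, and all values are illustrative assumptions based on the field names documented in the README diff above, not shipped defaults.

```yaml
# Illustrative vision.yml fragment (hypothetical entry, not a shipped default):
Models:
  - Type: labels
    Name: example-vision-model   # placeholder model name
    Engine: ollama
    Options:
      Temperature: 0.1           # documented range 0.01–2.0
      TopP: 0.9
      Stop: ["\n\n"]
      NumCtx: 4096
```

Any field left out falls back to the engine defaults described in the table, per the merge behavior in `model.go`.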

internal/ai/vision/api_request.go

Lines changed: 1 addition & 38 deletions

```diff
@@ -32,43 +32,6 @@ const (
 	logDataTruncatedSuffix = "... (truncated)"
 )
 
-// ApiRequestOptions represents additional model parameters listed in the documentation.
-type ApiRequestOptions struct {
-	NumKeep          int      `yaml:"NumKeep,omitempty" json:"num_keep,omitempty"`
-	Seed             int      `yaml:"Seed,omitempty" json:"seed,omitempty"`
-	NumPredict       int      `yaml:"NumPredict,omitempty" json:"num_predict,omitempty"`
-	TopK             int      `yaml:"TopK,omitempty" json:"top_k,omitempty"`
-	TopP             float64  `yaml:"TopP,omitempty" json:"top_p,omitempty"`
-	MinP             float64  `yaml:"MinP,omitempty" json:"min_p,omitempty"`
-	TfsZ             float64  `yaml:"TfsZ,omitempty" json:"tfs_z,omitempty"`
-	TypicalP         float64  `yaml:"TypicalP,omitempty" json:"typical_p,omitempty"`
-	RepeatLastN      int      `yaml:"RepeatLastN,omitempty" json:"repeat_last_n,omitempty"`
-	Temperature      float64  `yaml:"Temperature,omitempty" json:"temperature,omitempty"`
-	RepeatPenalty    float64  `yaml:"RepeatPenalty,omitempty" json:"repeat_penalty,omitempty"`
-	PresencePenalty  float64  `yaml:"PresencePenalty,omitempty" json:"presence_penalty,omitempty"`
-	FrequencyPenalty float64  `yaml:"FrequencyPenalty,omitempty" json:"frequency_penalty,omitempty"`
-	Mirostat         int      `yaml:"Mirostat,omitempty" json:"mirostat,omitempty"`
-	MirostatTau      float64  `yaml:"MirostatTau,omitempty" json:"mirostat_tau,omitempty"`
-	MirostatEta      float64  `yaml:"MirostatEta,omitempty" json:"mirostat_eta,omitempty"`
-	PenalizeNewline  bool     `yaml:"PenalizeNewline,omitempty" json:"penalize_newline,omitempty"`
-	Stop             []string `yaml:"Stop,omitempty" json:"stop,omitempty"`
-	Numa             bool     `yaml:"Numa,omitempty" json:"numa,omitempty"`
-	NumCtx           int      `yaml:"NumCtx,omitempty" json:"num_ctx,omitempty"`
-	NumBatch         int      `yaml:"NumBatch,omitempty" json:"num_batch,omitempty"`
-	NumGpu           int      `yaml:"NumGpu,omitempty" json:"num_gpu,omitempty"`
-	MainGpu          int      `yaml:"MainGpu,omitempty" json:"main_gpu,omitempty"`
-	LowVram          bool     `yaml:"LowVram,omitempty" json:"low_vram,omitempty"`
-	VocabOnly        bool     `yaml:"VocabOnly,omitempty" json:"vocab_only,omitempty"`
-	UseMmap          bool     `yaml:"UseMmap,omitempty" json:"use_mmap,omitempty"`
-	UseMlock         bool     `yaml:"UseMlock,omitempty" json:"use_mlock,omitempty"`
-	NumThread        int      `yaml:"NumThread,omitempty" json:"num_thread,omitempty"`
-	MaxOutputTokens  int      `yaml:"MaxOutputTokens,omitempty" json:"max_output_tokens,omitempty"`
-	Detail           string   `yaml:"Detail,omitempty" json:"detail,omitempty"`
-	ForceJson        bool     `yaml:"ForceJson,omitempty" json:"force_json,omitempty"`
-	SchemaVersion    string   `yaml:"SchemaVersion,omitempty" json:"schema_version,omitempty"`
-	CombineOutputs   string   `yaml:"CombineOutputs,omitempty" json:"combine_outputs,omitempty"`
-}
-
 // ApiRequestContext represents a context parameter returned from a previous request.
 type ApiRequestContext = []int
 
@@ -84,7 +47,7 @@ type ApiRequest struct {
 	Url     string             `form:"url" yaml:"Url,omitempty" json:"url,omitempty"`
 	Org     string             `form:"org" yaml:"Org,omitempty" json:"org,omitempty"`
 	Project string             `form:"project" yaml:"Project,omitempty" json:"project,omitempty"`
-	Options *ApiRequestOptions `form:"options" yaml:"Options,omitempty" json:"options,omitempty"`
+	Options *ModelOptions      `form:"options" yaml:"Options,omitempty" json:"options,omitempty"`
 	Context *ApiRequestContext `form:"context" yaml:"Context,omitempty" json:"context,omitempty"`
 	Stream  bool               `form:"stream" yaml:"Stream,omitempty" json:"stream"`
 	Images  Files              `form:"images" yaml:"Images,omitempty" json:"images,omitempty"`
```

internal/ai/vision/engine.go

Lines changed: 1 addition & 1 deletion

```diff
@@ -36,7 +36,7 @@ type EngineDefaults interface {
 	SystemPrompt(model *Model) string
 	UserPrompt(model *Model) string
 	SchemaTemplate(model *Model) string
-	Options(model *Model) *ApiRequestOptions
+	Options(model *Model) *ModelOptions
 }
 
 // Engine groups the callbacks required to integrate a third-party vision service.
```

internal/ai/vision/engine_ollama.go

Lines changed: 3 additions & 3 deletions

```diff
@@ -78,20 +78,20 @@ func (ollamaDefaults) SchemaTemplate(model *Model) string {
 }
 
 // Options returns the Ollama service request options.
-func (ollamaDefaults) Options(model *Model) *ApiRequestOptions {
+func (ollamaDefaults) Options(model *Model) *ModelOptions {
 	if model == nil {
 		return nil
 	}
 
 	switch model.Type {
 	case ModelTypeLabels:
-		return &ApiRequestOptions{
+		return &ModelOptions{
 			Temperature: DefaultTemperature,
 			TopP:        0.9,
 			Stop:        []string{"\n\n"},
 		}
 	case ModelTypeCaption:
-		return &ApiRequestOptions{
+		return &ModelOptions{
 			Temperature: DefaultTemperature,
 		}
 	default:
```

internal/ai/vision/engine_openai.go

Lines changed: 3 additions & 3 deletions

```diff
@@ -80,19 +80,19 @@ func (openaiDefaults) SchemaTemplate(model *Model) string {
 }
 
 // Options returns default OpenAI request options for the model.
-func (openaiDefaults) Options(model *Model) *ApiRequestOptions {
+func (openaiDefaults) Options(model *Model) *ModelOptions {
 	if model == nil {
 		return nil
 	}
 
 	switch model.Type {
 	case ModelTypeCaption:
-		return &ApiRequestOptions{
+		return &ModelOptions{
 			Detail:          openai.DefaultDetail,
 			MaxOutputTokens: openai.CaptionMaxTokens,
 		}
 	case ModelTypeLabels:
-		return &ApiRequestOptions{
+		return &ModelOptions{
 			Detail:          openai.DefaultDetail,
 			MaxOutputTokens: openai.LabelsMaxTokens,
 			ForceJson:       true,
```

internal/ai/vision/engine_openai_test.go

Lines changed: 5 additions & 5 deletions

```diff
@@ -40,7 +40,7 @@ func TestOpenAIBuilderBuildCaptionDisablesForceJSON(t *testing.T) {
 		Type:    ModelTypeCaption,
 		Name:    openai.DefaultModel,
 		Engine:  openai.EngineName,
-		Options: &ApiRequestOptions{ForceJson: true},
+		Options: &ModelOptions{ForceJson: true},
 	}
 	model.ApplyEngineDefaults()
 
@@ -59,7 +59,7 @@ func TestApiRequestJSONForOpenAI(t *testing.T) {
 		Prompt:         "describe the scene",
 		Images:         []string{"data:image/jpeg;base64,AA=="},
 		ResponseFormat: ApiFormatOpenAI,
-		Options: &ApiRequestOptions{
+		Options: &ModelOptions{
 			Detail:          openai.DefaultDetail,
 			MaxOutputTokens: 128,
 			Temperature:     0.2,
@@ -111,7 +111,7 @@ func TestApiRequestJSONForOpenAIDefaultSchemaName(t *testing.T) {
 		Model:          "gpt-5-mini",
 		Images:         []string{"data:image/jpeg;base64,AA=="},
 		ResponseFormat: ApiFormatOpenAI,
-		Options: &ApiRequestOptions{
+		Options: &ModelOptions{
 			Detail:          openai.DefaultDetail,
 			MaxOutputTokens: 64,
 			ForceJson:       true,
@@ -254,7 +254,7 @@ func TestPerformApiRequestOpenAISuccess(t *testing.T) {
 		Model:          "gpt-5-mini",
 		Images:         []string{"data:image/jpeg;base64,AA=="},
 		ResponseFormat: ApiFormatOpenAI,
-		Options: &ApiRequestOptions{
+		Options: &ModelOptions{
 			Detail: openai.DefaultDetail,
 		},
 		Schema: json.RawMessage(`{"type":"object"}`),
@@ -299,7 +299,7 @@ func TestPerformApiRequestOpenAITextFallback(t *testing.T) {
 		Model:          "gpt-5-mini",
 		Images:         []string{"data:image/jpeg;base64,AA=="},
 		ResponseFormat: ApiFormatOpenAI,
-		Options: &ApiRequestOptions{
+		Options: &ModelOptions{
 			Detail: openai.DefaultDetail,
 		},
 		Schema: nil,
```

internal/ai/vision/model.go

Lines changed: 7 additions & 7 deletions

```diff
@@ -46,7 +46,7 @@ type Model struct {
 	SchemaFile string                `yaml:"SchemaFile,omitempty" json:"schemaFile,omitempty"`
 	Resolution int                   `yaml:"Resolution,omitempty" json:"resolution,omitempty"`
 	TensorFlow *tensorflow.ModelInfo `yaml:"TensorFlow,omitempty" json:"tensorflow,omitempty"`
-	Options    *ApiRequestOptions    `yaml:"Options,omitempty" json:"options,omitempty"`
+	Options    *ModelOptions         `yaml:"Options,omitempty" json:"options,omitempty"`
 	Service    Service               `yaml:"Service,omitempty" json:"service,omitempty"`
 	Path       string                `yaml:"Path,omitempty" json:"-"`
 	Disabled   bool                  `yaml:"Disabled,omitempty" json:"disabled,omitempty"`
@@ -334,12 +334,12 @@ func (m *Model) GetSource() string {
 
 // GetOptions returns the API request options, applying engine defaults on
 // demand. Nil receivers return nil.
-func (m *Model) GetOptions() *ApiRequestOptions {
+func (m *Model) GetOptions() *ModelOptions {
 	if m == nil {
 		return nil
 	}
 
-	var engineDefaults *ApiRequestOptions
+	var engineDefaults *ModelOptions
 	if defaults := m.engineDefaults(); defaults != nil {
 		engineDefaults = cloneOptions(defaults.Options(m))
 	}
@@ -348,7 +348,7 @@ func (m *Model) GetOptions() *ApiRequestOptions {
 	switch m.Type {
 	case ModelTypeLabels, ModelTypeCaption, ModelTypeGenerate:
 		if engineDefaults == nil {
-			engineDefaults = &ApiRequestOptions{}
+			engineDefaults = &ModelOptions{}
 		}
 		normalizeOptions(engineDefaults)
 		m.Options = engineDefaults
@@ -364,7 +364,7 @@ func (m *Model) GetOptions() *ApiRequestOptions {
 	return m.Options
 }
 
-func mergeOptionDefaults(target, defaults *ApiRequestOptions) {
+func mergeOptionDefaults(target, defaults *ModelOptions) {
 	if target == nil || defaults == nil {
 		return
 	}
@@ -402,7 +402,7 @@ func mergeOptionDefaults(target, defaults *ApiRequestOptions) {
 	}
 }
 
-func normalizeOptions(opts *ApiRequestOptions) {
+func normalizeOptions(opts *ModelOptions) {
 	if opts == nil {
 		return
 	}
@@ -412,7 +412,7 @@ func normalizeOptions(opts *ApiRequestOptions) {
 	}
 }
 
-func cloneOptions(opts *ApiRequestOptions) *ApiRequestOptions {
+func cloneOptions(opts *ModelOptions) *ModelOptions {
 	if opts == nil {
 		return nil
 	}
```
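The `mergeOptionDefaults` changes above only touch the type name, but the pattern they implement — clone the engine defaults, then let explicit per-model values win over zero-valued fields — is worth seeing in isolation. The sketch below uses a trimmed stand-in struct with three fields; it is a simplified illustration of the merge rule, not the package's actual implementation.

```go
package main

import "fmt"

// ModelOptions is a trimmed stand-in for the package's struct,
// limited to three fields so the merge rule is easy to follow.
type ModelOptions struct {
	Temperature float64
	TopP        float64
	Stop        []string
}

// mergeOptionDefaults fills only zero-valued fields of target from defaults,
// so explicit per-model settings always win over engine defaults.
func mergeOptionDefaults(target, defaults *ModelOptions) {
	if target == nil || defaults == nil {
		return
	}
	if target.Temperature == 0 {
		target.Temperature = defaults.Temperature
	}
	if target.TopP == 0 {
		target.TopP = defaults.TopP
	}
	if len(target.Stop) == 0 {
		// Copy the slice so the defaults are never aliased.
		target.Stop = append([]string(nil), defaults.Stop...)
	}
}

func main() {
	// The user configured only Temperature; engine defaults supply the rest.
	opts := &ModelOptions{Temperature: 0.7}
	mergeOptionDefaults(opts, &ModelOptions{Temperature: 0.1, TopP: 0.9, Stop: []string{"\n\n"}})
	fmt.Printf("temperature=%.1f top_p=%.1f stop=%q\n", opts.Temperature, opts.TopP, opts.Stop)
}
```

Cloning before merging (as `cloneOptions` does in the real code) matters because the engine defaults are shared across models and must not be mutated in place.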
Lines changed: 38 additions & 0 deletions

```diff
@@ -0,0 +1,38 @@
+package vision
+
+// ModelOptions represents additional model parameters listed in the documentation.
+type ModelOptions struct {
+	NumKeep          int      `yaml:"NumKeep,omitempty" json:"num_keep,omitempty"` // Ollama ↓
+	Seed             int      `yaml:"Seed,omitempty" json:"seed,omitempty"`
+	NumPredict       int      `yaml:"NumPredict,omitempty" json:"num_predict,omitempty"`
+	Temperature      float64  `yaml:"Temperature,omitempty" json:"temperature,omitempty"`
+	TopK             int      `yaml:"TopK,omitempty" json:"top_k,omitempty"`
+	TopP             float64  `yaml:"TopP,omitempty" json:"top_p,omitempty"`
+	MinP             float64  `yaml:"MinP,omitempty" json:"min_p,omitempty"`
+	TypicalP         float64  `yaml:"TypicalP,omitempty" json:"typical_p,omitempty"`
+	TfsZ             float64  `yaml:"TfsZ,omitempty" json:"tfs_z,omitempty"`
+	RepeatLastN      int      `yaml:"RepeatLastN,omitempty" json:"repeat_last_n,omitempty"`
+	RepeatPenalty    float64  `yaml:"RepeatPenalty,omitempty" json:"repeat_penalty,omitempty"`
+	PresencePenalty  float64  `yaml:"PresencePenalty,omitempty" json:"presence_penalty,omitempty"`
+	FrequencyPenalty float64  `yaml:"FrequencyPenalty,omitempty" json:"frequency_penalty,omitempty"`
+	Mirostat         int      `yaml:"Mirostat,omitempty" json:"mirostat,omitempty"`
+	MirostatTau      float64  `yaml:"MirostatTau,omitempty" json:"mirostat_tau,omitempty"`
+	MirostatEta      float64  `yaml:"MirostatEta,omitempty" json:"mirostat_eta,omitempty"`
+	PenalizeNewline  bool     `yaml:"PenalizeNewline,omitempty" json:"penalize_newline,omitempty"`
+	Stop             []string `yaml:"Stop,omitempty" json:"stop,omitempty"`
+	Numa             bool     `yaml:"Numa,omitempty" json:"numa,omitempty"`
+	NumCtx           int      `yaml:"NumCtx,omitempty" json:"num_ctx,omitempty"`
+	NumBatch         int      `yaml:"NumBatch,omitempty" json:"num_batch,omitempty"`
+	NumGpu           int      `yaml:"NumGpu,omitempty" json:"num_gpu,omitempty"`
+	MainGpu          int      `yaml:"MainGpu,omitempty" json:"main_gpu,omitempty"`
+	LowVram          bool     `yaml:"LowVram,omitempty" json:"low_vram,omitempty"`
+	VocabOnly        bool     `yaml:"VocabOnly,omitempty" json:"vocab_only,omitempty"`
+	UseMmap          bool     `yaml:"UseMmap,omitempty" json:"use_mmap,omitempty"`
+	UseMlock         bool     `yaml:"UseMlock,omitempty" json:"use_mlock,omitempty"`
+	NumThread        int      `yaml:"NumThread,omitempty" json:"num_thread,omitempty"`
+	MaxOutputTokens  int      `yaml:"MaxOutputTokens,omitempty" json:"max_output_tokens,omitempty"` // OpenAI ↓
+	Detail           string   `yaml:"Detail,omitempty" json:"detail,omitempty"`
+	ForceJson        bool     `yaml:"ForceJson,omitempty" json:"force_json,omitempty"`
+	SchemaVersion    string   `yaml:"SchemaVersion,omitempty" json:"schema_version,omitempty"`
+	CombineOutputs   string   `yaml:"CombineOutputs,omitempty" json:"combine_outputs,omitempty"`
+}
```
