
06 text default field test#15

Open
AlbaHerrerias wants to merge 153 commits into port-python-tests from
port-06-text-default-field-test

Conversation

@AlbaHerrerias

Overview

Testing recommendations

Related Issues or Pull Requests

Checklist

  • Code is written and works correctly
  • Changes are covered by tests
  • Any new configurable parameters are documented in rel/overlay/etc/default.ini
  • Documentation changes were made in the src/docs folder
  • Documentation changes were backported (separated PR) to affected branches

@hulkoba hulkoba changed the title Port 06 text default field test 06 text default field test Nov 14, 2025
@hulkoba hulkoba force-pushed the port-python-tests branch 8 times, most recently from c413a13 to 25f1f55 Compare November 19, 2025 14:21
big-r81 and others added 30 commits January 24, 2026 20:36
Changing the file mode doesn't work on Windows ACLs
with the settings from Unix file systems. The Erlang
VM doesn't set it on Windows, so those tests fail.
Exclude those tests when running on Windows.
put our AI/LLM disclaimer in the PR template
Previously, if users called the `_scheduler/docs` API at just the right moment,
when a job would appear in the replicator doc processor ets table as
`scheduled`, but it would not be in the replicator scheduler's ets table, users
would get function_clause errors like the following, along with a 500 HTTP
response:

```
req_err(2666435525) unknown_error : function_clause [
 <<"couch_replicator_httpd_util:update_db_name/1 L183">>,
 <<"couch_replicator_httpd:handle_scheduler_doc/3 L157">>,
 <<"chttpd:handle_req_after_auth/2 L432">>,
 ...
]
```

To fix it, explicitly handle this state as a transitional `pending` state,
returning all the information we have about the job in the replicator doc
processor.
`doc_lookup/3` could also return an `{ok, nil}` tuple if this is not the owner
node. We already expect it in the only place we call `doc_lookup/3` from, but
the spec didn't include that case, so we add it here.
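The race described above can be sketched as follows. This is an illustrative Python sketch with hypothetical names, not the actual Erlang handler code:

```python
# Sketch of how the /_scheduler/docs handler can treat a job that
# appears as `scheduled` in the doc-processor table but is missing
# from the scheduler's table. All names here are hypothetical.

def scheduler_doc_info(doc_processor_entry, scheduler_entry):
    """Return the job info to render, never raising during the race window."""
    if scheduler_entry is not None:
        # Normal case: the scheduler already knows about the job.
        return {"state": scheduler_entry["state"], **doc_processor_entry}
    # Transitional window: the doc processor marked the job `scheduled`,
    # but the scheduler has not picked it up yet. Report it as pending
    # with whatever information the doc processor has, instead of
    # crashing with a function_clause error and a 500 response.
    return {"state": "pending", **doc_processor_entry}
```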
In the Explain output, bookmarks for queries on `text` (Search)
indexes are added in their unpacked format to the options.  This
leads to at least two problems:

- it is inconsistent with how empty bookmarks are rendered for
  `json` indexes, where the string `"nil"` is used,

- for non-trivial bookmarks, the contents cannot be translated to a
  JSON representation by `jiffy:encode/1`, so it crashes.

Mitigate this issue by deferring the unpacking of bookmarks to
the execution phase, so that Explain is no longer affected by it.
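The idea of deferring the unpacking can be sketched like this (a hedged Python sketch with hypothetical names, not mango's actual API):

```python
# Keep the bookmark in its packed, JSON-safe form while building the
# query options, and only unpack it when the query actually executes.
import base64
import json

def pack_bookmark(term):
    # The packed form is an opaque string, safe for JSON encoding.
    return base64.b64encode(json.dumps(term).encode()).decode()

def unpack_bookmark(packed):
    return json.loads(base64.b64decode(packed))

def explain_options(packed_bookmark):
    # Explain renders the packed string as-is; empty bookmarks are
    # shown as the string "nil", matching json indexes. No unpacking,
    # so nothing can crash here.
    return {"bookmark": packed_bookmark if packed_bookmark else "nil"}

def execute(packed_bookmark):
    # Unpacking is deferred to the execution phase.
    return unpack_bookmark(packed_bookmark) if packed_bookmark else None
```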

Thanks @mojito317 for reporting this issue, and thanks @rnewson
for revisiting my original fix and guiding me about it.
…rk-formatting

fix(`mango`): formatting of `text` bookmarks in `_explain` output
Since we inspect the #full_doc_info{} record to skip deleted docs anyway, move
all the doc_id clauses to the `doc_fdi/3` callback to have one less callback to
worry about.

There is no performance penalty, as by the time we call `doc_id/3` we already
have the full `#full_doc_info{}` record opened anyway.
In apache#5625 we introduced a BTree cache for dbs, and since view files
also use BTrees, let's use a cache for them as well. This should help
with concurrent view queries.

To test the improvements, I used k6 for benchmarking with a 1M doc, q=16
db, and a view with 10M rows. With a 400 rps request arrival rate, I saw
more than a 50% improvement in average, p90, and p95 latencies:

  * Average:  14 -> 7 msec
  * Median:   12 -> 7 msec
  * P90       21 -> 10 msec
  * P95       27 -> 12 msec
  * Max       120 -> 77 msec

Since view results could have large kp node values due to custom
reduce functions, add a max term limit for cache entries. To avoid
adding yet another config value, use a small multiple of the BTree chunk
size. This should help avoid config values clashing if users raise the
chunk size but forget to raise their cache term limit (which would
suddenly render the cache unusable).

More detailed test results follow, along with a test at a higher depth of 4 to
show further improvements.
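The size guard described above can be sketched as follows. This is an illustrative Python sketch; the names and the multiple are assumptions, not the actual CouchDB code:

```python
# Derive the cache term limit from the BTree chunk size instead of
# adding a separate config value, and skip caching oversized terms.
CHUNK_SIZE_MULTIPLE = 4  # assumed factor; the real multiple may differ

def max_cacheable_size(chunk_size):
    return chunk_size * CHUNK_SIZE_MULTIPLE

def maybe_cache(cache, key, term, chunk_size):
    """Cache the node only if it is under the derived size limit."""
    if len(term) <= max_cacheable_size(chunk_size):
        cache[key] = term
        return True
    # Large kp nodes (e.g. from custom reduce output) are not cached.
    # Because the limit is derived, raising chunk_size automatically
    # raises the cache limit too, so the cache cannot silently become
    # unusable after a chunk-size change.
    return False
```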

```
$ BENCH_RATE=400 BENCH_DURATION=5m BENCH_VIEW_LIMIT=10 k6 run ./k6_couchdb_constant_arrival_view_query.js

    HTTP
    http_req_duration..............: avg=14.65ms min=2.08ms med=12.7ms  max=120.08ms p(90)=20.53ms p(95)=26.78ms
      { expected_response:true }...: avg=14.65ms min=2.08ms med=12.7ms  max=120.08ms p(90)=20.53ms p(95)=26.78ms
      { name:view_get }............: avg=14.65ms min=2.93ms med=12.7ms  max=120.08ms p(90)=20.53ms p(95)=26.78ms
    http_req_failed................: 0.00%  0 out of 120002
    http_reqs......................: 120002 399.891866/s
```

```
$ BENCH_RATE=400 BENCH_DURATION=5m BENCH_VIEW_LIMIT=10 k6 run ./k6_couchdb_constant_arrival_view_query.js

    HTTP
    http_req_duration..............: avg=7.45ms min=2.38ms med=6.9ms  max=77.78ms p(90)=10.21ms p(95)=11.5ms
      { expected_response:true }...: avg=7.45ms min=2.38ms med=6.9ms  max=77.78ms p(90)=10.21ms p(95)=11.5ms
      { name:view_get }............: avg=7.45ms min=2.38ms med=6.9ms  max=77.78ms p(90)=10.21ms p(95)=11.49ms
    http_req_failed................: 0.00%  0 out of 120004
    http_reqs......................: 120004 399.799018/s
```

```
$ BENCH_RATE=400 BENCH_DURATION=5m BENCH_VIEW_LIMIT=10 k6 run ./k6_couchdb_constant_arrival_view_query.js

    HTTP
    http_req_duration..............: avg=6.21ms min=2.22ms med=5.78ms max=77ms    p(90)=8.4ms  p(95)=9.5ms
      { expected_response:true }...: avg=6.21ms min=2.22ms med=5.78ms max=77ms    p(90)=8.4ms  p(95)=9.5ms
      { name:view_get }............: avg=6.21ms min=2.22ms med=5.78ms max=77ms    p(90)=8.4ms  p(95)=9.5ms
    http_req_failed................: 0.00%  0 out of 120003
    http_reqs......................: 120003 399.931891/s

```
- If there is no checkpoint and no `since_seq`, then replicate from scratch.
- If there is no checkpoint but `since_seq` is defined, then replicate with
  the `since_seq` field.
- If both checkpoint and `since_seq` exist, use the checkpoint to replicate.
- If the request includes `since_seq` field, the replication ID will be changed.
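The start-sequence selection rules listed above can be sketched as a small decision function. This is illustrative Python pseudocode with hypothetical names, not the replicator's actual API:

```python
def start_seq(checkpoint, since_seq):
    """Pick the sequence the replication should start from."""
    if checkpoint is not None:
        # An existing checkpoint always wins, even when since_seq
        # is also defined.
        return checkpoint
    if since_seq is not None:
        # No checkpoint yet: begin from the user-supplied since_seq.
        return since_seq
    # Neither is present: replicate from scratch.
    return 0
```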
Previously we didn't handle the case when the "infinity" value was explicitly
set in the config file. We only handled the case when it was the default value.
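The gist of the fix can be sketched like this (a hedged Python sketch with illustrative names, not the actual Erlang config code):

```python
def parse_timeout(value, default="infinity"):
    """Accept an explicit "infinity" in the config file, not just the
    default value. Returns None for no timeout, or an integer."""
    val = default if value is None else value.strip()
    if val == "infinity":
        return None  # no timeout
    return int(val)
```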
While debugging the processes-getting-stuck-in-hibernation bug [1], a few
benchmarks showed that hibernation can be pretty expensive: I saw a 20% or so
reduction in couch_work_queue latency once we stopped hibernating after every
single item insertion.

The Erlang docs warn about this [2]:

> Use this feature with care, as hibernation implies at least two garbage
> collections (when hibernating and shortly after waking up) and is not
> something you want to do between each call to a busy server.

In a few places, like `couch_work_queue` and `couch_db_updater`, we did
exactly that. However, since we added that, Erlang/OTP implemented a new
`gen_server` option, `{hibernate_after, Timeout}`, which triggers hibernation
after an idle period. That seems ideal for us: it keeps expensive hibernation
out of the main data path, as the docs warn us about, but once the server goes
idle we still get to run it to dereference any refc binaries.

Since we encountered the recent hibernation bug [1], also add an option to
disable it altogether, just to have a way to mitigate the issue when running on
OTP 27 and 28 before the fix is out.

[1] erlang/otp#10651
[2] https://www.erlang.org/doc/apps/stdlib/gen_server.html
[3] apache@d9eb87f
Send 404 for /_all_dbs and /_dbs_info with extra path parts
There are two main improvements:

1) Remove dynamic runtime reloading. That feature is no longer useful
after we stopped doing online relups, and instead became a potential landmine
that can lead to metrics unexpectedly failing if stats description files, or
disk access to them, change at runtime. The only minor benefit of the previous
behavior was for local development, coupled with online module
reloading; but that hardly justifies the complexity and the confusion resulting
from stats all of a sudden breaking in production.

2) Strengthen a few asserts during stats loading. Only tolerate missing stats
descriptions from applications, and stop on any other error. We don't want to
run with stats missing or unloadable, or constantly spewing "unknown metric"
errors.
…-arm

ci: temporarily disable freebsd-arm worker because it is too slow
…ct-index-test

Port Python test to Elixir: `12_use_correct_index_test`
