
Conversation

Reskov commented Jun 7, 2025

We should balance the connection count carefully:

  • This workload is both read and write intensive
  • A small table (10,000 rows) needs fewer connections than a large one would
  • min_size = max_size, so all connections are preallocated up front (see the sketch below)
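
For context, here is a minimal sketch of that pool setup, assuming asyncpg as the driver (the DSN and the size value are placeholders, not the benchmark's actual config):

import asyncpg

POOL_SIZE = 2  # hypothetical per-process value; see the sizing discussion below

async def make_pool():
    # min_size == max_size: every connection is opened up front, so no
    # connect latency is paid during the benchmark run itself.
    return await asyncpg.create_pool(
        dsn="postgres://user:pass@tfb-database/hello_world",  # placeholder DSN
        min_size=POOL_SIZE,
        max_size=POOL_SIZE,
    )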

Reskov commented Jun 7, 2025

@Dreamsorcerer I am not 💯 percent sure, but 1800 connections looks like too high a number for a table with only 10,000 rows. This change should speed up the updates at least, but I am not sure whether it will hurt read performance or not.

Dreamsorcerer commented

Interesting, I've no idea...

Reskov commented Jun 7, 2025

The request handler will be exercised with query counts of 1, 5, 10, 15, and 20.

Yeah, for the last update test /updates/20 we would have 1800*20 = 36000 pending row updates, which is much higher than the 10000 existing rows. I assume that is the reason. I propose starting with a small number like 1, 2 or 3 and measuring the performance impact. Afterwards we can bisect between 3 and the current maximum of 32, depending on the results.

So, for example: release with 3, then with (32+3) / 2 = 17; if that is worse, decrease to (17+3) / 2 = 10 in the next release; if better, increase to (32+17) / 2 = 24.
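
To make that concrete, here is a minimal sketch of the bisection (the measure() hook is hypothetical; in practice each step would be a separate release benchmarked on the real hardware):

def bisect_pool_size(low, high, measure):
    # Start from the small size and bisect towards the old maximum.
    best_size, best_score = low, measure(low)
    while high - low > 1:
        mid = (low + high) // 2          # e.g. (3 + 32) // 2 = 17
        score = measure(mid)
        if score > best_score:           # better throughput: search upwards
            best_size, best_score, low = mid, score, mid
        else:                            # worse: search downwards
            high = mid
    return best_size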

Dreamsorcerer commented

1800*20 = 36000

You said there were 56 CPUs, so there'd be 32 connections per process (1800 / 56 ≈ 32). So you're currently reducing it from 32 to 3.

Reskov commented Jun 7, 2025

Yes, correct. I suppose I should even decrease it to 2 instead...

The test environment is described here: https://www.techempower.com/benchmarks/#section=environment

Three homogeneous ProLiant DL360 Gen10 Plus equipped with Intel Xeon Gold 6330 CPU @ 2.00GHz (56 cores), 64 GB of memory, an enterprise SSD, and Mellanox Technologies MT28908 Family [ConnectX-6] 40Gbps Ethernet.

And from the PostgreSQL wiki suggestion:
https://wiki.postgresql.org/wiki/Number_Of_Database_Connections

A formula which has held up pretty well across a lot of benchmarks for years is that for optimal throughput the number of active connections should be somewhere near ((core_count * 2) + effective_spindle_count).

So in our case the suggested total is 56 * 2 = 112 active connections, with effective_spindle_count = 0 since all the data is cached. Split across 56 worker processes, that means the pool size should be set to 2 per process, as per the recommendation.
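
Spelled out as code (a sketch; the one-worker-per-core split and effective_spindle_count = 0 follow from the discussion above):

def suggested_pool_size(core_count, effective_spindle_count=0):
    # PostgreSQL wiki rule of thumb for total active connections.
    total = core_count * 2 + effective_spindle_count
    # One worker process per core, so split the budget evenly.
    return max(total // core_count, 1)

print(suggested_pool_size(56))  # -> 2 on the 56-core benchmark machine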

Reskov commented Jun 8, 2025

Local testing is not that meaningful, since the database, the application, and the test suite all share the same resources.
I've tried different pool sizes, and locally it looks like it doesn't matter much. Anyway, I want to check on the real environment and see how it goes.

Pool size = 2

queries

 Queries: 10 for query
 wrk -H 'Host: tfb-server' -H 'Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive' --latency -d 30 -c 32 --timeout 8 -t 6 "http://tfb-server:8080/queries/10"
---------------------------------------------------------
Running 30s test @ http://tfb-server:8080/queries/10
  6 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   817.80us  453.40us  11.74ms   84.55%
    Req/Sec     6.33k     0.96k   16.43k    66.63%
  Latency Distribution
     50%  704.00us
     75%    0.96ms
     90%    1.32ms
     99%    2.41ms
  1134242 requests in 30.10s, 502.75MB read
Requests/sec:  37682.98
Transfer/sec:     16.70MB

updates

Running 30s test @ http://tfb-server:8080/updates/10
  6 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.61ms    1.00ms  28.34ms   81.00%
    Req/Sec     3.21k   815.09     5.29k    62.67%
  Latency Distribution
     50%    1.45ms
     75%    2.07ms
     90%    2.73ms
     99%    4.18ms
  575038 requests in 30.00s, 254.89MB read
Requests/sec:  19165.28
Transfer/sec:      8.50MB

Pool size = 3

queries

  Queries: 10 for query
 wrk -H 'Host: tfb-server' -H 'Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive' --latency -d 30 -c 32 --timeout 8 -t 6 "http://tfb-server:8080/queries/10"
---------------------------------------------------------
Running 30s test @ http://tfb-server:8080/queries/10
  6 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.48ms    4.27ms 116.09ms   96.22%
    Req/Sec     5.98k     1.27k   16.38k    76.58%
  Latency Distribution
     50%  751.00us
     75%    0.99ms
     90%    1.40ms
     99%   20.20ms
  1072843 requests in 30.10s, 475.54MB read
Requests/sec:  35642.81
Transfer/sec:     15.80MB

updates

 Queries: 10 for update
 wrk -H 'Host: tfb-server' -H 'Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive' --latency -d 30 -c 32 --timeout 8 -t 6 "http://tfb-server:8080/updates/10"
---------------------------------------------------------
Running 30s test @ http://tfb-server:8080/updates/10
  6 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.51ms  715.22us  31.35ms   85.04%
    Req/Sec     3.38k   473.51     4.91k    70.11%
  Latency Distribution
     50%    1.42ms
     75%    1.76ms
     90%    2.17ms
     99%    3.46ms
  605289 requests in 30.01s, 268.30MB read
Requests/sec:  20171.31
Transfer/sec:      8.94MB

Pool size = 10

queries

 Queries: 10 for query
 wrk -H 'Host: tfb-server' -H 'Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive' --latency -d 30 -c 32 --timeout 8 -t 6 "http://tfb-server:8080/queries/10"
---------------------------------------------------------
Running 30s test @ http://tfb-server:8080/queries/10
  6 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   794.09us  455.20us  21.88ms   85.86%
    Req/Sec     6.46k   823.88    11.44k    68.77%
  Latency Distribution
     50%  722.00us
     75%    0.97ms
     90%    1.23ms
     99%    1.96ms
  1159625 requests in 30.10s, 514.08MB read
Requests/sec:  38525.56
Transfer/sec:     17.08MB

updates

Running 30s test @ http://tfb-server:8080/updates/10
  6 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.56ms    1.38ms  64.39ms   98.16%
    Req/Sec     3.39k   336.80     5.84k    75.15%
  Latency Distribution
     50%    1.43ms
     75%    1.73ms
     90%    2.06ms
     99%    3.90ms
  608145 requests in 30.10s, 269.56MB read
Requests/sec:  20204.33
Transfer/sec:      8.96MB

Original formula (gives max_size = 160 on my machine, since I have 8 cores: min(1800 / 8, 160) = 160; on the 56-core benchmark server it gives 32)

import multiprocessing

# Spread the global 1800-connection budget across workers, capped at 160 each.
max_size = min(1800 / multiprocessing.cpu_count(), 160)
max_size = max(int(max_size), 1)
min_size = max(int(max_size / 2), 1)

queries

Queries: 10 for query
 wrk -H 'Host: tfb-server' -H 'Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive' --latency -d 30 -c 32 --timeout 8 -t 6 "http://tfb-server:8080/queries/10"
---------------------------------------------------------
Running 30s test @ http://tfb-server:8080/queries/10
  6 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   803.99us  471.80us  22.90ms   87.33%
    Req/Sec     6.40k   581.02     8.10k    71.11%
  Latency Distribution
     50%  727.00us
     75%    0.97ms
     90%    1.24ms
     99%    2.02ms
  1147227 requests in 30.01s, 508.57MB read
Requests/sec:  38232.43
Transfer/sec:     16.95MB

updates

Running 30s test @ http://tfb-server:8080/updates/10
  6 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.55ms    1.05ms  41.89ms   94.92%
    Req/Sec     3.35k   343.25     4.79k    71.14%
  Latency Distribution
     50%    1.44ms
     75%    1.79ms
     90%    2.19ms
     99%    3.89ms
  601340 requests in 30.10s, 266.55MB read
Requests/sec:  19981.01
Transfer/sec:      8.86MB

msmith-techempower merged commit 99c3e0c into TechEmpower:master on Jun 9, 2025 (3 checks passed).
Dreamsorcerer commented

FYI, we may also see a performance increase from new releases of multidict, yarl, etc. We realised there's a performance hit in the pre-built wheels that is going to be fixed shortly.

Reskov commented Jun 20, 2025
