upsert performance degradation #9685
Replies: 2 comments
Hi @dimitryshamonin,

You didn't specify which version you're running; versions prior to v25.1 had some issues with posting list cache management. But regardless, the slowdown in mutation performance is not surprising as the node count in your graph increases. Each mutation must achieve Raft consensus within its group (majority acknowledgment plus WAL writes across replicas), so more nodes per group means more network round trips and disk syncs per proposal. Additionally, with more groups, mutations touching predicates on different groups require cross-group distributed transaction coordination, adding further latency. Because Dgraph guarantees strong consistency, this is the tradeoff.

I do see one thing in your cache settings that might move the needle: at the moment you're only giving the posting list cache 4GB (the first part of your 10,45,45 partitioning of 40GB).

Also, since you have the cores, I'd double the numcompactors. Sustained writes will stall if level 0 (L0) fills up faster than the compactors can drain it.

Hope this helps. And please post your results (good or bad) here so we can follow along.
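To make the two suggestions above concrete, here is an illustrative sketch of the alpha flags involved. The exact percentage split (50,25,25) and compactor count are assumptions for illustration, not recommendations, and this assumes the `--badger` superflag passes the `numcompactors` option through to Badger the way it does for other Badger options; check both against the docs for your Dgraph version.

```shell
# Sketch only: shift more of the 40GB cache budget to the posting list
# cache (the first number in the percentage triple, which is currently 10%
# = 4GB), and raise the Badger compactor count on a 24-core node.
dgraph alpha \
  --cache "size-mb=40960; percentage=50,25,25" \
  --badger "numcompactors=8"
```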
Hi @matthewmcneely,

We plan to run the next load tests this Friday; I'll update with the results. We have tried changing the cache distribution, but it does not seem to change much. We usually see about one hour of RPS gains, with spikes after predicate movement, and then constant degradation. Like this:

Strange that these timings are almost identical across more than 20 test uploads with different group/badger settings. The performance is fine for our task, but it would be good to know if you can squeeze some extra RPS in case of an SLA change.

P.S. We are using 25.3

Greetings!
In our project we handle around 300-400 million objects (around 20 keys each), about 50 GB of compressed data on disk.
We have a Dgraph cluster of 12 Alphas in 4 groups (each node: 24 CPU, 60+ GB RAM).
A data update takes about 5-6 hours (which is fine), however we notice heavy write performance degradation over the upload period (from ~20,000 objects/s to ~5,000):
Update itself is handled with upserts - batches of 500 objects in 8 parallel streams.
Single mutation looks like this:
Current alpha config:
However, tinkering with the settings does not seem to affect the RPS much.
The cluster logs don't contain anything interesting except compactions.
Is this behavior normal or are we missing something?