Tracking resource usage in Glusterd2
====================================

Tracing and profiling with `pprof` is intended for developers only. It can be
used for debugging memory leaks, excessive memory consumption, slow flows
through the code and so on. Users will normally not want profiling enabled on
their production systems. To investigate memory allocations in Glusterd2, the
profiling feature must be enabled. This can be done by adding
`"profiling": true` to the file passed with `--config`.
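
For illustration, assuming a TOML-formatted config file (the key name is taken
from the option above; check your installation for the exact file format and
location):

```
# excerpt of the file passed via --config (format assumed, adjust as needed)
profiling = true
```
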
Enabling profiling makes the standard Golang pprof endpoints available. For
memory allocations, `/debug/pprof/heap` is the most useful one.
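
For background, these endpoints come from the Go standard library. A minimal
standalone program (a sketch of the general mechanism only, not Glusterd2's
actual wiring; the port is arbitrary) exposes them like this:

```
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // side-effect import: registers the /debug/pprof/* handlers
)

func main() {
	// The blank import above is all that is needed; the default mux now
	// serves heap, goroutine, CPU and other profiles under /debug/pprof/.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```
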
Capturing a snapshot of the current allocations in Glusterd2 is
straightforward. On the node running Glusterd2, use the `go tool pprof`
command:
```
[root@fedora3 glusterd2]# go tool pprof http://localhost:24007/debug/pprof/heap
File: glusterd2
Build ID: 7a94c2e498445577aaf7f910d6ef1c3adc19d553
Type: inuse_space
Time: Nov 28, 2018 at 2:55pm (IST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)
```
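
The heap profile can also be saved and inspected offline, which is handy when
copying data off a busy node. A possible workflow (file paths are illustrative;
passing the matching binary to `pprof` provides the symbol information):

```
# fetch the same endpoint and store the profile
curl -o /tmp/heap.out http://localhost:24007/debug/pprof/heap
# analyze it later, together with the matching glusterd2 binary
go tool pprof /path/to/glusterd2 /tmp/heap.out
```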

When working with allocation sizes, it is helpful to report all sizes in
`megabytes`. By default the `unit` option is set to `auto`, which adds
human-readable B, kB and MB suffixes to the values; that is not useful for
sorting with scripts. So, set the `unit` to `megabytes` instead:

```
(pprof) unit=megabytes
```
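
The same option can also be supplied non-interactively, which is convenient in
scripts (this assumes the `-unit` and `-top` flags of your pprof version;
recent Go releases support both):

```
go tool pprof -unit=megabytes -top http://localhost:24007/debug/pprof/heap
```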

Now use the `top` command to see which functions account for most of the
allocated memory. The `flat` column shows memory allocated directly in a
function, while `cum` also includes allocations made by its callees:

```
(pprof) top
Showing nodes accounting for 330.75MB, 98.52% of 335.73MB total
Dropped 305 nodes (cum <= 1.68MB)
Showing top 10 nodes out of 14
      flat  flat%   sum%        cum   cum%
     326MB 97.10% 97.10%      326MB 97.10%  github.com/gluster/glusterd2/pkg/sunrpc.ReadFullRecord /root/work/src/github.com/gluster/glusterd2/pkg/sunrpc/record.go
    2.38MB  0.71% 97.81%     2.38MB  0.71%  github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/rafthttp.startStreamWriter /root/work/src/github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/rafthttp/stream.go
    2.38MB  0.71% 98.52%     4.77MB  1.42%  github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/rafthttp.startPeer /root/work/src/github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/rafthttp/peer.go
         0     0% 98.52%   326.01MB 97.10%  github.com/gluster/glusterd2/pkg/sunrpc.(*serverCodec).ReadRequestHeader /root/work/src/github.com/gluster/glusterd2/pkg/sunrpc/servercodec.go
         0     0% 98.52%     4.83MB  1.44%  github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).apply /root/work/src/github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/etcdserver/server.go
         0     0% 98.52%     4.83MB  1.44%  github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).applyAll /root/work/src/github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/etcdserver/server.go
         0     0% 98.52%     4.77MB  1.42%  github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).applyConfChange /root/work/src/github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/etcdserver/server.go
         0     0% 98.52%     4.83MB  1.44%  github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).applyEntries /root/work/src/github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/etcdserver/server.go
         0     0% 98.52%     4.83MB  1.44%  github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).run.func6 /root/work/src/github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/etcdserver/server.go
         0     0% 98.52%     4.83MB  1.44%  github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/pkg/schedule.(*fifo).run /root/work/src/github.com/gluster/glusterd2/vendor/github.com/coreos/etcd/pkg/schedule/schedule.go
(pprof)
```

Looking at the above output, we can see that the first entry accounts for
hundreds of MB while the other entries are only a fraction of that. This
indicates that `ReadFullRecord` is consuming far more memory than required. To
understand the problem more clearly, we can find out which line of
`ReadFullRecord` makes the allocation. To view that, use the `list` command:

```
(pprof) list ReadFullRecord
Total: 335.73MB
ROUTINE ======================== github.com/gluster/glusterd2/pkg/sunrpc.ReadFullRecord in /root/work/src/github.com/gluster/glusterd2/pkg/sunrpc/clientcodec.go
    8.11kB     8.11kB (flat, cum) 0.0024% of Total
         .          .      1:package sunrpc
         .          .      2:
         .          .      3:import (
    8.11kB     8.11kB      4:	"bytes"
         .          .      5:	"io"
         .          .      6:	"net"
         .          .      7:	"net/rpc"
         .          .      8:	"sync"
         .          .      9:
ROUTINE ======================== github.com/gluster/glusterd2/pkg/sunrpc.ReadFullRecord in /root/work/src/github.com/gluster/glusterd2/pkg/sunrpc/record.go
     326MB      326MB (flat, cum) 97.10% of Total
         .          .     96:func ReadFullRecord(conn io.Reader) ([]byte, error) {
         .          .     97:
         .          .     98:	// In almost all cases, RPC message contain only one fragment which
         .          .     99:	// is not too big in size. But set a cap on buffer size to prevent
         .          .    100:	// rogue clients from filling up memory.
     326MB      326MB    101:	record := bytes.NewBuffer(make([]byte, 0, maxRecordSize))
         .          .    102:	var fragmentHeader uint32
         .          .    103:	for {
         .          .    104:		// Read record fragment header
         .          .    105:		err := binary.Read(conn, binary.BigEndian, &fragmentHeader)
         .          .    106:		if err != nil {
(pprof)
```

The `record` buffer allocation is the culprit, and we need to look further into
the code to understand why it allocates this much and fix it.
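
One possible direction for a fix (a sketch only, not necessarily the change
made upstream) is to stop allocating `maxRecordSize` bytes up front on every
call and instead reuse large buffers through a `sync.Pool`. The constant below
is an illustrative stand-in for the real value defined in record.go:

```
package sunrpc

import (
	"bytes"
	"sync"
)

// Illustrative stand-in; the real constant is defined in record.go.
const maxRecordSize = 1 * 1024 * 1024

// recordPool hands out reusable buffers so that each call to ReadFullRecord
// does not allocate maxRecordSize bytes anew. Sketch only, not the actual fix.
var recordPool = sync.Pool{
	New: func() interface{} {
		return bytes.NewBuffer(make([]byte, 0, maxRecordSize))
	},
}

func getRecordBuffer() *bytes.Buffer {
	return recordPool.Get().(*bytes.Buffer)
}

// putRecordBuffer returns a buffer to the pool once the record's bytes have
// been copied out by the caller.
func putRecordBuffer(b *bytes.Buffer) {
	b.Reset()
	recordPool.Put(b)
}
```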