Skip to content

Performance advice needed #4647

@bertalanimre

Description

@bertalanimre

Hello All,

I am running a GlusterFS on 3 nodes and storing around 1 million file/folder in one folder. This is a storage folder for a laravel project and from time to time, we suffer mkdir and fopen issues within the application. I think, it will be performance issue with GlusterFS, but I an fairly new to this service and I am learning as I go. My initial thoughts are enabling the readdir-ahead for reading the files and the nl-cache options but I do not know if this is enough, or even I should try to adjust them and if yes, I do not know the metric. For around 1 million files and 80K+ in a single folder, what should be the best setting? For contex, here is my current gluster volume:

Option                                   Value                                  
------                                   -----                                  
cluster.lookup-unhashed                  on (DEFAULT)                           
cluster.lookup-optimize                  on (DEFAULT)                           
cluster.rmdir-optimize                   on (DEFAULT)                           
cluster.min-free-disk                    10% (DEFAULT)                          
cluster.min-free-inodes                  5% (DEFAULT)                           
cluster.rebalance-stats                  off (DEFAULT)                          
cluster.subvols-per-directory            (null) (DEFAULT)                       
cluster.readdir-optimize                 off (DEFAULT)                          
cluster.rsync-hash-regex                 (null) (DEFAULT)                       
cluster.extra-hash-regex                 (null) (DEFAULT)                       
cluster.dht-xattr-name                   trusted.glusterfs.dht (DEFAULT)        
cluster.randomize-hash-range-by-gfid     off (DEFAULT)                          
cluster.rebal-throttle                   normal (DEFAULT)                       
cluster.lock-migration                   off                                    
cluster.force-migration                  off                                    
cluster.local-volume-name                (null) (DEFAULT)                       
cluster.weighted-rebalance               on (DEFAULT)                           
cluster.switch-pattern                   (null) (DEFAULT)                       
cluster.entry-change-log                 on (DEFAULT)                           
cluster.read-subvolume                   (null) (DEFAULT)                       
cluster.read-subvolume-index             -1 (DEFAULT)                           
cluster.read-hash-mode                   1 (DEFAULT)                            
cluster.background-self-heal-count       8 (DEFAULT)                            
cluster.metadata-self-heal               off (DEFAULT)                          
cluster.data-self-heal                   off (DEFAULT)                          
cluster.entry-self-heal                  off (DEFAULT)                          
cluster.self-heal-daemon                 on (DEFAULT)                           
cluster.heal-timeout                     600 (DEFAULT)                          
cluster.self-heal-window-size            8 (DEFAULT)                            
cluster.data-change-log                  on (DEFAULT)                           
cluster.metadata-change-log              on (DEFAULT)                           
cluster.data-self-heal-algorithm         (null) (DEFAULT)                       
cluster.eager-lock                       on (DEFAULT)                           
disperse.eager-lock                      on (DEFAULT)                           
disperse.other-eager-lock                on (DEFAULT)                           
disperse.eager-lock-timeout              1 (DEFAULT)                            
disperse.other-eager-lock-timeout        1 (DEFAULT)                            
cluster.quorum-type                      auto                                   
cluster.quorum-count                     (null) (DEFAULT)                       
cluster.choose-local                     true (DEFAULT)                         
cluster.self-heal-readdir-size           1KB (DEFAULT)                          
cluster.post-op-delay-secs               1 (DEFAULT)                            
cluster.ensure-durability                on (DEFAULT)                           
cluster.consistent-metadata              no (DEFAULT)                           
cluster.heal-wait-queue-length           128 (DEFAULT)                          
cluster.favorite-child-policy            none (DEFAULT)                         
cluster.full-lock                        yes (DEFAULT)                          
cluster.optimistic-change-log            on (DEFAULT)                           
diagnostics.latency-measurement          off                                    
diagnostics.dump-fd-stats                off (DEFAULT)                          
diagnostics.count-fop-hits               off                                    
diagnostics.brick-log-level              INFO                                   
diagnostics.client-log-level             INFO                                   
diagnostics.brick-sys-log-level          CRITICAL (DEFAULT)                     
diagnostics.client-sys-log-level         CRITICAL (DEFAULT)                     
diagnostics.brick-logger                 (null) (DEFAULT)                       
diagnostics.client-logger                (null) (DEFAULT)                       
diagnostics.brick-log-format             (null) (DEFAULT)                       
diagnostics.client-log-format            (null) (DEFAULT)                       
diagnostics.brick-log-buf-size           5 (DEFAULT)                            
diagnostics.client-log-buf-size          5 (DEFAULT)                            
diagnostics.brick-log-flush-timeout      120 (DEFAULT)                          
diagnostics.client-log-flush-timeout     120 (DEFAULT)                          
diagnostics.stats-dump-interval          0 (DEFAULT)                            
diagnostics.fop-sample-interval          0 (DEFAULT)                            
diagnostics.stats-dump-format            json (DEFAULT)                         
diagnostics.fop-sample-buf-size          65535 (DEFAULT)                        
diagnostics.stats-dnscache-ttl-sec       86400 (DEFAULT)                        
performance.cache-max-file-size          0 (DEFAULT)                            
performance.cache-min-file-size          0 (DEFAULT)                            
performance.cache-refresh-timeout        1 (DEFAULT)                            
performance.cache-priority                (DEFAULT)                             
performance.io-cache-size                32MB (DEFAULT)                         
performance.cache-size                   32MB (DEFAULT)                         
performance.io-thread-count              16 (DEFAULT)                           
performance.high-prio-threads            16 (DEFAULT)                           
performance.normal-prio-threads          16 (DEFAULT)                           
performance.low-prio-threads             16 (DEFAULT)                           
performance.least-prio-threads           1 (DEFAULT)                            
performance.enable-least-priority        on (DEFAULT)                           
performance.iot-watchdog-secs            (null) (DEFAULT)                       
performance.iot-cleanup-disconnected-reqs off (DEFAULT)                          
performance.iot-pass-through             false (DEFAULT)                        
performance.io-cache-pass-through        false (DEFAULT)                        
performance.quick-read-cache-size        128MB (DEFAULT)                        
performance.cache-size                   128MB (DEFAULT)                        
performance.quick-read-cache-timeout     1 (DEFAULT)                            
performance.qr-cache-timeout             1 (DEFAULT)                            
performance.quick-read-cache-invalidation false (DEFAULT)                        
performance.ctime-invalidation           false (DEFAULT)                        
performance.flush-behind                 on (DEFAULT)                           
performance.nfs.flush-behind             on (DEFAULT)                           
performance.write-behind-window-size     1MB (DEFAULT)                          
performance.resync-failed-syncs-after-fsync off (DEFAULT)                          
performance.nfs.write-behind-window-size 1MB (DEFAULT)                          
performance.strict-o-direct              off (DEFAULT)                          
performance.nfs.strict-o-direct          off (DEFAULT)                          
performance.strict-write-ordering        off (DEFAULT)                          
performance.nfs.strict-write-ordering    off (DEFAULT)                          
performance.write-behind-trickling-writes on (DEFAULT)                           
performance.aggregate-size               128KB (DEFAULT)                        
performance.nfs.write-behind-trickling-writes on (DEFAULT)                           
performance.lazy-open                    yes (DEFAULT)                          
performance.read-after-open              yes (DEFAULT)                          
performance.open-behind-pass-through     false (DEFAULT)                        
performance.read-ahead-page-count        4 (DEFAULT)                            
performance.read-ahead-pass-through      false (DEFAULT)                        
performance.readdir-ahead-pass-through   false (DEFAULT)                        
performance.md-cache-pass-through        false (DEFAULT)                        
performance.write-behind-pass-through    false (DEFAULT)                        
performance.md-cache-timeout             1 (DEFAULT)                            
performance.cache-swift-metadata         false (DEFAULT)                        
performance.cache-samba-metadata         false (DEFAULT)                        
performance.cache-capability-xattrs      true (DEFAULT)                         
performance.cache-ima-xattrs             true (DEFAULT)                         
performance.md-cache-statfs              off (DEFAULT)                          
performance.xattr-cache-list              (DEFAULT)                             
performance.nl-cache-pass-through        false (DEFAULT)                        
network.frame-timeout                    1800 (DEFAULT)                         
network.ping-timeout                     42 (DEFAULT)                           
network.tcp-window-size                  (null) (DEFAULT)                       
client.ssl                               off                                    
network.remote-dio                       disable (DEFAULT)                      
client.event-threads                     2 (DEFAULT)                            
client.tcp-user-timeout                  0                                      
client.keepalive-time                    20                                     
client.keepalive-interval                2                                      
client.keepalive-count                   9                                      
client.strict-locks                      off                                    
network.tcp-window-size                  (null) (DEFAULT)                       
network.inode-lru-limit                  16384 (DEFAULT)                        
auth.allow                               *                                      
auth.reject                              (null) (DEFAULT)                       
transport.keepalive                      1                                      
server.allow-insecure                    on (DEFAULT)                           
server.root-squash                       off (DEFAULT)                          
server.all-squash                        off (DEFAULT)                          
server.anonuid                           65534 (DEFAULT)                        
server.anongid                           65534 (DEFAULT)                        
server.statedump-path                    /var/run/gluster (DEFAULT)             
server.outstanding-rpc-limit             64 (DEFAULT)                           
server.ssl                               off                                    
auth.ssl-allow                           *                                      
server.manage-gids                       off (DEFAULT)                          
server.dynamic-auth                      on (DEFAULT)                           
client.send-gids                         on (DEFAULT)                           
server.gid-timeout                       300 (DEFAULT)                          
server.own-thread                        (null) (DEFAULT)                       
server.event-threads                     2 (DEFAULT)                            
server.tcp-user-timeout                  42 (DEFAULT)                           
server.keepalive-time                    20                                     
server.keepalive-interval                2                                      
server.keepalive-count                   9                                      
transport.listen-backlog                 1024                                   
ssl.own-cert                             (null) (DEFAULT)                       
ssl.private-key                          (null) (DEFAULT)                       
ssl.ca-list                              (null) (DEFAULT)                       
ssl.crl-path                             (null) (DEFAULT)                       
ssl.certificate-depth                    (null) (DEFAULT)                       
ssl.cipher-list                          (null) (DEFAULT)                       
ssl.dh-param                             (null) (DEFAULT)                       
ssl.ec-curve                             (null) (DEFAULT)                       
transport.address-family                 inet                                   
performance.write-behind                 on                                     
performance.read-ahead                   off                                    
performance.readdir-ahead                off                                    
performance.io-cache                     off                                    
performance.open-behind                  on                                     
performance.quick-read                   on                                     
performance.nl-cache                     off                                    
performance.stat-prefetch                on                                     
performance.client-io-threads            off                                    
performance.nfs.write-behind             on                                     
performance.nfs.read-ahead               off                                    
performance.nfs.io-cache                 off                                    
performance.nfs.quick-read               off                                    
performance.nfs.stat-prefetch            off                                    
performance.nfs.io-threads               off                                    
performance.force-readdirp               true (DEFAULT)                         
performance.cache-invalidation           false (DEFAULT)                        
performance.global-cache-invalidation    true (DEFAULT)                         
features.uss                             off                                    
features.snapshot-directory              .snaps                                 
features.show-snapshot-directory         off                                    
features.tag-namespaces                  off                                    
network.compression                      off                                    
network.compression.window-size          -15 (DEFAULT)                          
network.compression.mem-level            8 (DEFAULT)                            
network.compression.min-size             1024 (DEFAULT)                         
network.compression.compression-level    1 (DEFAULT)                            
network.compression.debug                false (DEFAULT)                        
features.default-soft-limit              80% (DEFAULT)                          
features.soft-timeout                    60 (DEFAULT)                           
features.hard-timeout                    5 (DEFAULT)                            
features.alert-time                      86400 (DEFAULT)                        
features.quota-deem-statfs               off                                    
geo-replication.indexing                 off                                    
geo-replication.indexing                 off                                    
geo-replication.ignore-pid-check         off                                    
geo-replication.ignore-pid-check         off                                    
features.quota                           off                                    
features.inode-quota                     off                                    
features.bitrot                          disable                                
debug.trace                              off                                    
debug.log-history                        no (DEFAULT)                           
debug.log-file                           no (DEFAULT)                           
debug.exclude-ops                        (null) (DEFAULT)                       
debug.include-ops                        (null) (DEFAULT)                       
debug.error-gen                          off                                    
debug.error-failure                      (null) (DEFAULT)                       
debug.error-number                       (null) (DEFAULT)                       
debug.random-failure                     off (DEFAULT)                          
debug.error-fops                         (null) (DEFAULT)                       
features.read-only                       off (DEFAULT)                          
features.worm                            off                                    
features.worm-file-level                 off                                    
features.worm-files-deletable            on                                     
features.default-retention-period        120 (DEFAULT)                          
features.retention-mode                  relax (DEFAULT)                        
features.auto-commit-period              180 (DEFAULT)                          
storage.linux-aio                        off (DEFAULT)                          
storage.linux-io_uring                   off (DEFAULT)                          
storage.batch-fsync-mode                 reverse-fsync (DEFAULT)                
storage.batch-fsync-delay-usec           0 (DEFAULT)                            
storage.owner-uid                        -1 (DEFAULT)                           
storage.owner-gid                        -1 (DEFAULT)                           
storage.node-uuid-pathinfo               off (DEFAULT)                          
storage.health-check-interval            30 (DEFAULT)                           
storage.build-pgfid                      off (DEFAULT)                          
storage.gfid2path                        on (DEFAULT)                           
storage.gfid2path-separator              : (DEFAULT)                            
storage.reserve                          1 (DEFAULT)                            
storage.health-check-timeout             20 (DEFAULT)                           
storage.fips-mode-rchecksum              on                                     
storage.force-create-mode                0000 (DEFAULT)                         
storage.force-directory-mode             0000 (DEFAULT)                         
storage.create-mask                      0777 (DEFAULT)                         
storage.create-directory-mask            0777 (DEFAULT)                         
storage.max-hardlinks                    100 (DEFAULT)                          
features.ctime                           on (DEFAULT)                           
config.gfproxyd                          off                                    
cluster.server-quorum-type               off                                    
cluster.server-quorum-ratio              51                                     
changelog.changelog                      off (DEFAULT)                          
changelog.changelog-dir                  {{ brick.path }}/.glusterfs/changelogs (DEFAULT)
changelog.encoding                       ascii (DEFAULT)                        
changelog.rollover-time                  15 (DEFAULT)                           
changelog.fsync-interval                 5 (DEFAULT)                            
changelog.changelog-barrier-timeout      120                                    
changelog.capture-del-path               off (DEFAULT)                          
features.barrier                         disable                                
features.barrier-timeout                 120                                    
features.trash                           off (DEFAULT)                          
features.trash-dir                       .trashcan (DEFAULT)                    
features.trash-eliminate-path            (null) (DEFAULT)                       
features.trash-max-filesize              5MB (DEFAULT)                          
features.trash-internal-op               off (DEFAULT)                          
cluster.enable-shared-storage            disable                                
locks.trace                              off (DEFAULT)                          
locks.mandatory-locking                  off (DEFAULT)                          
cluster.disperse-self-heal-daemon        enable (DEFAULT)                       
cluster.quorum-reads                     no (DEFAULT)                           
client.bind-insecure                     (null) (DEFAULT)                       
features.timeout                         45 (DEFAULT)                           
features.failover-hosts                  (null) (DEFAULT)                       
features.shard                           off                                    
features.shard-block-size                64MB (DEFAULT)                         
features.shard-lru-limit                 16384 (DEFAULT)                        
features.shard-deletion-rate             100 (DEFAULT)                          
features.scrub-throttle                  lazy                                   
features.scrub-freq                      biweekly                               
features.scrub                           false (DEFAULT)                        
features.expiry-time                     120                                    
features.signer-threads                  4                                      
features.cache-invalidation              off                                    
features.cache-invalidation-timeout      60 (DEFAULT)                           
ganesha.enable                           off                                    
features.leases                          off                                    
features.lease-lock-recall-timeout       60 (DEFAULT)                           
disperse.background-heals                8 (DEFAULT)                            
disperse.heal-wait-qlength               128 (DEFAULT)                          
cluster.heal-timeout                     600 (DEFAULT)                          
dht.force-readdirp                       on (DEFAULT)                           
disperse.read-policy                     gfid-hash (DEFAULT)                    
cluster.shd-max-threads                  1 (DEFAULT)                            
cluster.shd-wait-qlength                 1024 (DEFAULT)                         
cluster.locking-scheme                   full (DEFAULT)                         
cluster.granular-entry-heal              on                                     
features.locks-revocation-secs           0 (DEFAULT)                            
features.locks-revocation-clear-all      false (DEFAULT)                        
features.locks-revocation-max-blocked    0 (DEFAULT)                            
features.locks-monkey-unlocking          false (DEFAULT)                        
features.locks-notify-contention         yes (DEFAULT)                          
features.locks-notify-contention-delay   5 (DEFAULT)                            
disperse.shd-max-threads                 1 (DEFAULT)                            
disperse.shd-wait-qlength                1024 (DEFAULT)                         
disperse.cpu-extensions                  auto (DEFAULT)                         
disperse.self-heal-window-size           32 (DEFAULT)                           
cluster.use-compound-fops                off                                    
performance.parallel-readdir             off                                    
performance.rda-request-size             131072                                 
performance.rda-low-wmark                4096 (DEFAULT)                         
performance.rda-high-wmark               128KB (DEFAULT)                        
performance.rda-cache-limit              10MB                                   
performance.nl-cache-positive-entry      false (DEFAULT)                        
performance.nl-cache-limit               10MB                                   
performance.nl-cache-timeout             60 (DEFAULT)                           
cluster.brick-multiplex                  disable                                
cluster.brick-graceful-cleanup           disable                                
glusterd.vol_count_per_thread            100                                    
cluster.max-bricks-per-process           250                                    
disperse.optimistic-change-log           on (DEFAULT)                           
disperse.stripe-cache                    4 (DEFAULT)                            
cluster.halo-enabled                     False (DEFAULT)                        
cluster.halo-shd-max-latency             99999 (DEFAULT)                        
cluster.halo-nfsd-max-latency            5 (DEFAULT)                            
cluster.halo-max-latency                 5 (DEFAULT)                            
cluster.halo-max-replicas                99999 (DEFAULT)                        
cluster.halo-min-replicas                2 (DEFAULT)                            
features.selinux                         on                                     
cluster.daemon-log-level                 INFO                                   
debug.delay-gen                          off                                    
delay-gen.delay-percentage               10% (DEFAULT)                          
delay-gen.delay-duration                 100000 (DEFAULT)                       
delay-gen.enable                          (DEFAULT)                             
disperse.parallel-writes                 on (DEFAULT)                           
disperse.quorum-count                    0 (DEFAULT)                            
features.sdfs                            off                                    
features.cloudsync                       off                                    
features.ctime                           on                                     
ctime.noatime                            on                                     
features.cloudsync-storetype             (null) (DEFAULT)                       
features.enforce-mandatory-lock          off                                    
config.global-threading                  off                                    
config.client-threads                    16                                     
config.brick-threads                     16                                     
features.cloudsync-remote-read           off                                    
features.cloudsync-store-id              (null) (DEFAULT)                       
features.cloudsync-product-id            (null) (DEFAULT)                       
features.acl                             enable                                 
feature.simple-quota-pass-through        true                                   
feature.simple-quota.use-backend         false                                  
cluster.use-anonymous-inode              yes                                    
rebalance.ensure-durability              on (DEFAULT)

Now as any reckless person, I would ask an AI agent to tell me what's wrong and I did. It also pointed me to these options and it also suggests turning on cache-invalidation and stat-prefetch (if I see it correctly, this is already enabled). But as before, I do not know the metric of these options and I do not know if simply enabling them is enough of I need to adjust other parameters.

Like; what is the nl-cache-positive-entry ? Or the nl-cache-limit? I rather ask humans who work with this service than blindly trust the AI. I am sure, you can understand.

Thank you in advance !

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions