
Suddenly 504 Gateway Time-out after half a year without any problems #14

@ghost

Description

We have a setup with about 30 users, calendar integration, and a few gigabytes of data that has been running for about half a year without any problems. Suddenly, the container has become very slow and responds with a 504 Gateway Time-out about 95% of the time.

For some reason the instance seems to be completely busy, as if it were under a DDoS attack (which it is not; I blocked all requests at the LB to verify). Sometimes the ECS task even gets restarted because the health checks also run into timeouts.

The only noticeable thing in the logs is a large number of entries like this:

{"reqId":"R88qYRLYFO3NKB6IneRT","level":3,"time":"2022-03-04T15:53:23+00:00","remoteAddr":"10.192.10.76","user":"xxxx","app":"PHP","method":"PUT","url":"/remote.php/dav/files/xxxx/abcd.csv","message":{"Exception":"Error","Message":"fclose(): supplied resource is not a valid stream resource at /var/www/html/3rdparty/icewind/streams/src/Wrapper.php#96","Code":0,"Trace":[{"function":"onError","class":"OC\\Log\\ErrorHandler","type":"::","args":[2,"fclose(): supplied resource is not a valid stream resource","/var/www/html/3rdparty/icewind/streams/src/Wrapper.php",96,[]]},{"file":"/var/www/html/3rdparty/icewind/streams/src/Wrapper.php","line":96,"function":"fclose","args":[null]},{"file":"/var/www/html/3rdparty/icewind/streams/src/CallbackWrapper.php","line":117,"function":"stream_close","class":"Icewind\\Streams\\Wrapper","type":"->","args":[]},{"function":"stream_close","class":"Icewind\\Streams\\CallbackWrapper","type":"->","args":[]},{"file":"/var/www/html/3rdparty/guzzlehttp/psr7/src/Stream.php","line":108,"function":"fclose","args":[null]},{"file":"/var/www/html/3rdparty/guzzlehttp/psr7/src/Stream.php","line":74,"function":"close","class":"GuzzleHttp\\Psr7\\Stream","type":"->","args":[]},{"function":"__destruct","class":"GuzzleHttp\\Psr7\\Stream","type":"->","args":[]}],"File":"/var/www/html/lib/private/Log/ErrorHandler.php","Line":92,"CustomMessage":"--"},"userAgent":"Mozilla/5.0 (Macintosh) mirall/3.3.2git (build 7106) (Nextcloud, osx-21.3.0 ClientArchitecture: x86_64 OsArchitecture: x86_64)","version":"21.0.1.1"}

What I already tried:

  • Increasing the task's compute resources to 8 GB memory and 4 vCPUs (previously 2 GB/1 vCPU)
  • Spawning 8 parallel instances (normally only 1 is running)
  • Blocking all requests at the LB
  • Deleting 100k small files
  • Restarting Redis

Database load seems normal.

I don't know what else I can do. Is it somehow possible to connect to the Nextcloud instance via SSH? Or do you have any other suggestions? What could be the problem here?
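For reference, one way to get more context out of the instance without shell access is to raise Nextcloud's log verbosity. A minimal config/config.php sketch, assuming the stock container layout ('loglevel', 'log_type', and 'logfile' are standard Nextcloud config options; 0 means debug):

```php
<?php
// Sketch: merge these keys into the existing $CONFIG array in
// config/config.php rather than replacing it. loglevel 0 = debug
// (default is 2 = warning); the logfile path assumes the stock
// container layout.
$CONFIG = array (
  'loglevel' => 0,
  'log_type' => 'file',
  'logfile'  => '/var/www/html/data/nextcloud.log',
);
```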
