feat: Use S3 node store with seaweedfs #3498

Open · wants to merge 18 commits into base: master

Conversation

BYK (Member) commented Dec 31, 2024

Note

This patch may or may not make it to the main branch, so please do not rely on it yet. You are, however, free to use it as a blueprint for your own custom S3 or S3-like variations.

Enables S3 node store using Garage and sentry-nodestore-s3 by @stayallive

This should alleviate all the issues stemming from (ab)using PostgreSQL as the node store.

  • We should implement the 90-day retention through S3 lifecycle options (see the sketch after this list): https://garagehq.deuxfleurs.fr/
  • We should find a good default for the node store size and make it configurable (currently hard-coded at 100G); no longer relevant
  • We should have a proper migration path for existing installs
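
A minimal sketch of what the lifecycle-based retention from the first checklist item could look like, assuming the chosen backend honors the standard S3 lifecycle API (Garage or SeaweedFS may or may not) and reusing the illustrative bucket name and credential placeholders that appear later in this PR:

```python
import boto3

# Illustrative client against the in-cluster S3 endpoint; the endpoint, region
# and credentials mirror the placeholders used elsewhere in this PR.
s3 = boto3.client(
    "s3",
    endpoint_url="http://garage:3900",
    region_name="garage",
    aws_access_key_id="<GARAGE_KEY_ID>",
    aws_secret_access_key="<GARAGE_SECRET_KEY>",
)

# Expire every object in the node store bucket 90 days after creation.
s3.put_bucket_lifecycle_configuration(
    Bucket="nodestore",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "nodestore-90-day-retention",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": 90},
            }
        ]
    },
)
```

If the backend does not apply lifecycle rules, retention has to be enforced externally; see the cleanup sketch later in the thread.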

codecov bot commented Dec 31, 2024

❌ 1 Tests Failed:

Tests completed: 3 | Failed: 1 | Passed: 2 | Skipped: 0
View the top 1 failed test(s) by shortest run time
_integration-test.test_01_basics::test_login
Stack Traces | 7.65s run time
    @contextlib.contextmanager
    def map_httpcore_exceptions() -> typing.Iterator[None]:
        global HTTPCORE_EXC_MAP
        if len(HTTPCORE_EXC_MAP) == 0:
            HTTPCORE_EXC_MAP = _load_httpcore_exceptions()
        try:
>           yield

../../../.local/lib/python3.12.../httpx/_transports/default.py:101:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../.local/lib/python3.12.../httpx/_transports/default.py:250: in handle_request
    resp = self._pool.handle_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../.local/lib/python3.12.../httpcore/_sync/connection_pool.py:256: in handle_request
    raise exc from None
../../../.local/lib/python3.12.../httpcore/_sync/connection_pool.py:236: in handle_request
    response = connection.handle_request(
../../../.local/lib/python3.12.../httpcore/_sync/connection.py:103: in handle_request
    return self._connection.handle_request(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../.local/lib/python3.12.../httpcore/_sync/http11.py:136: in handle_request
    raise exc
../../../.local/lib/python3.12.../httpcore/_sync/http11.py:106: in handle_request
    ) = self._receive_response_headers(**kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../.local/lib/python3.12.../httpcore/_sync/http11.py:177: in _receive_response_headers
    event = self._receive_event(timeout=timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../.local/lib/python3.12.../httpcore/_sync/http11.py:217: in _receive_event
    data = self._network_stream.read(
../../../.local/lib/python3.12.../httpcore/_backends/sync.py:126: in read
    with map_exceptions(exc_map):
.../usr/lib/python3.12/contextlib.py:158: in __exit__
    self.gen.throw(value)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

map = {<class 'TimeoutError'>: <class 'httpcore.ReadTimeout'>, <class 'OSError'>: <class 'httpcore.ReadError'>}

    @contextlib.contextmanager
    def map_exceptions(map: ExceptionMapping) -> typing.Iterator[None]:
        try:
            yield
        except Exception as exc:  # noqa: PIE786
            for from_exc, to_exc in map.items():
                if isinstance(exc, from_exc):
>                   raise to_exc(exc) from exc
E                   httpcore.ReadTimeout: timed out

../../../.local/lib/python3.12.../site-packages/httpcore/_exceptions.py:14: ReadTimeout

The above exception was the direct cause of the following exception:

    @pytest.fixture()
    def client_login():
        client = httpx.Client()
        response = client.get(SENTRY_TEST_HOST, follow_redirects=True)
        parser = BeautifulSoup(response.text, "html.parser")
        login_csrf_token = parser.find("input", {"name": "csrfmiddlewaretoken"})["value"]
>       login_response = client.post(
            f"{SENTRY_TEST_HOST}.../auth/login/sentry/",
            follow_redirects=True,
            data={
                "op": "login",
                "username": TEST_USER,
                "password": TEST_PASS,
                "csrfmiddlewaretoken": login_csrf_token,
            },
            headers={"Referer": f"{SENTRY_TEST_HOST}.../auth/login/sentry/"},
        )

_integration-test/test_01_basics.py:63:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../.local/lib/python3.12.../site-packages/httpx/_client.py:1144: in post
    return self.request(
../../../.local/lib/python3.12.../site-packages/httpx/_client.py:825: in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../.local/lib/python3.12.../site-packages/sentry_sdk/utils.py:1814: in runner
    return original_function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../.local/lib/python3.12.../site-packages/httpx/_client.py:914: in send
    response = self._send_handling_auth(
../../../.local/lib/python3.12.../site-packages/httpx/_client.py:942: in _send_handling_auth
    response = self._send_handling_redirects(
../../../.local/lib/python3.12.../site-packages/httpx/_client.py:979: in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../.local/lib/python3.12.../site-packages/httpx/_client.py:1014: in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../.local/lib/python3.12.../httpx/_transports/default.py:249: in handle_request
    with map_httpcore_exceptions():
.../usr/lib/python3.12/contextlib.py:158: in __exit__
    self.gen.throw(value)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    @contextlib.contextmanager
    def map_httpcore_exceptions() -> typing.Iterator[None]:
        global HTTPCORE_EXC_MAP
        if len(HTTPCORE_EXC_MAP) == 0:
            HTTPCORE_EXC_MAP = _load_httpcore_exceptions()
        try:
            yield
        except Exception as exc:
            mapped_exc = None

            for from_exc, to_exc in HTTPCORE_EXC_MAP.items():
                if not isinstance(exc, from_exc):
                    continue
                # We want to map to the most specific exception we can find.
                # Eg if `exc` is an `httpcore.ReadTimeout`, we want to map to
                # `httpx.ReadTimeout`, not just `httpx.TimeoutException`.
                if mapped_exc is None or issubclass(to_exc, mapped_exc):
                    mapped_exc = to_exc

            if mapped_exc is None:  # pragma: no cover
                raise

            message = str(exc)
>           raise mapped_exc(message) from exc
E           httpx.ReadTimeout: timed out

../../../.local/lib/python3.12.../httpx/_transports/default.py:118: ReadTimeout


aldy505 (Collaborator) commented Dec 31, 2024

Any reason why you didn't use SeaweedFS per what you said yesterday?


if [[ $($garage bucket list | tail -1 | awk '{print $1}') != 'nodestore' ]]; then
  node_id=$($garage status | tail -1 | awk '{print $1}')
  $garage layout assign -z dc1 -c 100G "$node_id"
Collaborator:

Should this 100G be a variable somewhere?

Member Author:

Yes, I think we should add a new GARAGE_STORAGE_SIZE env var to .env. That said, I'm not sure that makes much sense, as we would not honor any changes to it after the initial installation. Unless this actually reserves 100G, I think leaving it hard-coded at a "good enough" value and then documenting how to change it if needed would be a better option.

Thoughts?

Member Author:

@aldy505 added the env var regardless. Do you think 100G is a good size for the average self-hosted operator?

Collaborator:

I don't know whether the 100G will be immediately allocated by Garage. If it's not (meaning the current storage space won't be modified), then I think it's fine. If it is, I think it's better to allocate just 25G.

Comment on lines 91 to 104
SENTRY_NODESTORE = "sentry_nodestore_s3.S3PassthroughDjangoNodeStorage"
SENTRY_NODESTORE_OPTIONS = {
"delete_through": True,
"write_through": False,
"read_through": True,
"compression": False, # we have compression enabled in Garage itself
"endpoint_url": "http://garage:3900",
"bucket_path": "nodestore",
"bucket_name": "nodestore",
"retry_attempts": 3,
"region_name": "garage",
"aws_access_key_id": "<GARAGE_KEY_ID>",
"aws_secret_access_key": "<GARAGE_SECRET_KEY>",
}
Collaborator:

(Docs) should we provide ways for the user to offload these to actual S3 or something?

Member Author:

Maybe something under the "experimental" part?

Collaborator:

Probably.

Collaborator:

It would probably be better if we put it on the Experimental -> External Storage page. I'll backlog that.
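
For illustration, pointing the same node store options at an external S3-compatible service (e.g. actual AWS S3) would presumably only require swapping the endpoint, region, bucket, and credentials. The values below are assumptions for the sake of a sketch, not a documented configuration:

```python
# sentry.conf.py (sketch; bucket, region and env-var handling are illustrative)
import os

SENTRY_NODESTORE = "sentry_nodestore_s3.S3PassthroughDjangoNodeStorage"
SENTRY_NODESTORE_OPTIONS = {
    "delete_through": True,
    "write_through": False,
    "read_through": True,
    # No transparent server-side compression on AWS S3, so compress client-side.
    "compression": True,
    "bucket_name": "my-sentry-nodestore",  # hypothetical bucket
    "region_name": "us-east-1",            # hypothetical region
    "retry_attempts": 3,
    "aws_access_key_id": os.environ["AWS_ACCESS_KEY_ID"],
    "aws_secret_access_key": os.environ["AWS_SECRET_ACCESS_KEY"],
    # "endpoint_url" is omitted when talking to AWS itself.
}
```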

BYK and others added 2 commits December 31, 2024 15:23
BYK (Member Author) commented Dec 31, 2024

@aldy505

Any reason why you didn't use SeaweedFS per what you said yesterday?

Well, I started with that and realized three things:

  1. It really is not geared towards single-node setups and has nodes with different roles. This makes it more challenging to scale up or set up in our environment.
  2. It has a paid admin interface. Not a deal breaker, but it is clear that it is geared towards more "professional" setups.
  3. Its S3 API support is not really great.

Garage fits the bill much better: it is explicitly created for smaller setups like this, is easy to expand without specialized roles, doesn't have any paid components, and has much more decent and familiar S3 API support.

doc-sheet (Contributor):

It really is not geared towards single-node setups and has nodes with different roles. This makes it more challenging to scale up or set up in our environment.

When I tried seaweedfs last time (and I still use it for sourcemap/profile storage, tbh), it had single-node ability via the weed server command.
Like:

weed server -filer=true -s3=true -master=true -volume=true

Some of them are enabled by default.

doc-sheet (Contributor) commented May 25, 2025

I think garage/minio are simpler for small setups; seaweedfs looks necessary for mid-to-high setups because all the other services I know keep files as-is.

And thousands upon thousands of small files like profiles are not ideal to store on most popular filesystems, I guess.

aldy505 (Collaborator) commented Jun 4, 2025

I think garage/minio are simpler for small setups; seaweedfs looks necessary for mid-to-high setups because all the other services I know keep files as-is.

@doc-sheet Hey, I'm going to work on this PR. I think seaweed is better for self-hosted Sentry. One thing I don't like about Garage is that we need to specify the storage allocation beforehand; if we set it to 100GB, there might be some people who have more data than that, and I don't want that to cause any issues.

That said, since you said you've used seaweed before: How was your experience? How does it compare to MinIO or Ceph?

And thousands upon thousands of small files like profiles are not ideal to store on most popular filesystems, I guess.

Yeah, if we set up an object storage, we might as well move filestore & profiles there too. But let's focus on nodestore first.
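
As a rough, hedged illustration of that idea (not part of this PR): Sentry's filestore can be pointed at an S3-compatible backend via sentry.conf.py. The backend name and option keys below are assumptions to be checked against the Sentry documentation before use:

```python
# sentry.conf.py (illustrative sketch only; verify option names against the
# Sentry documentation for the S3 filestore backend before relying on this)
SENTRY_OPTIONS["filestore.backend"] = "s3"
SENTRY_OPTIONS["filestore.options"] = {
    "access_key": "<KEY>",                   # hypothetical credentials
    "secret_key": "<SECRET>",
    "bucket_name": "sentry-filestore",       # hypothetical bucket
    "endpoint_url": "http://seaweedfs:8333", # assumed in-cluster S3 endpoint
}
```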

doc-sheet (Contributor):

How was your experience? How does it compare to MinIO or Ceph?

It is a bit strange sometimes. But it is fine.

It has multiple options for the filer store.
I didn't try leveldb storage, aiming for fault tolerance.

At first I tried redis; it worked for several months and then... I just lost all data.
It was there physically but wasn't available from the API (S3 or web) - each list call returned different results.

I don't know if the issue was in redis or weed itself. I suspect a bug with TTL could be the reason too.

But after that incident I wiped the cluster and started a new one with scylla as the filer backend, and it has worked fine for almost a year already despite that TTL bug.

SeaweedFS has multiple image variants, like

  • 3.89
  • 3.89_full
  • 3.89_large_disk
  • 3.89_large_disk_full

I suggest always using large_disk. The documentation is not clear, but it is easy to reach that limit:
https://github.com/seaweedfs/seaweedfs/wiki/FAQ#how-to-configure-volumes-larger-than-30gb

I don't know the difference between full and normal builds and just use the _large_disk_full builds :)

Also, I don't use S3 auth - I was too lazy to set it up.

Other than all that, I have had no problems and have barely touched it after the initial setup. It just works.
I have added some volumes but not removed any yet.

As for minio and ceph:
I never used ceph.

But minio was the reason to look for alternatives.

Tons of profiles from the JS SDK, stored as separate files, started to affect my monitoring script, and soon they might start to affect minio performance too.

And it is not that easy to scale minio, and it is probably impossible to optimize it for small-file storage. At least in my low-cost setup.

doc-sheet (Contributor) commented Jun 4, 2025

let's focus on nodestore first.

If seaweedfs is to control the TTL, there is another catch.
I'm not sure whether it is possible to control the TTL via the S3 API yet.

weed has its own settings for collections, and it creates a collection for each S3 bucket.
https://github.com/seaweedfs/seaweedfs/wiki/S3-API-FAQ#setting-ttl

But if sentry itself cleans up old data, I guess there is no difference.

aldy505 (Collaborator) commented Jun 5, 2025

How was your experience? How does it compare to MinIO or Ceph?

It is a bit strange sometimes. But it is fine.

It has multiple options for the filer store. I didn't try leveldb storage, aiming for fault tolerance.

At first I tried redis; it worked for several months and then... I just lost all data. It was there physically but wasn't available from the API (S3 or web) - each list call returned different results.

I don't know if the issue was in redis or weed itself. I suspect a bug with TTL could be the reason too.

But after that incident I wiped the cluster and started a new one with scylla as the filer backend, and it has worked fine for almost a year already despite that TTL bug.

SeaweedFS has multiple image variants, like

  • 3.89
  • 3.89_full
  • 3.89_large_disk
  • 3.89_large_disk_full

I suggest always using large_disk. The documentation is not clear, but it is easy to reach that limit: https://github.com/seaweedfs/seaweedfs/wiki/FAQ#how-to-configure-volumes-larger-than-30gb

I don't know the difference between full and normal builds and just use the _large_disk_full builds :)

Also, I don't use S3 auth - I was too lazy to set it up.

Other than all that, I have had no problems and have barely touched it after the initial setup. It just works. I have added some volumes but not removed any yet.

Good to know about Seaweed

As for minio and ceph: I never used ceph.

But minio was the reason to look for alternatives.

Tons of profiles from the JS SDK, stored as separate files, started to affect my monitoring script, and soon they might start to affect minio performance too.

And it is not that easy to scale minio, and it is probably impossible to optimize it for small-file storage. At least in my low-cost setup.

Ah so everyone has the same experience with minio.

let's focus on nodestore first.

If seaweedfs is to control the TTL, there is another catch. I'm not sure whether it is possible to control the TTL via the S3 API yet.

weed has its own settings for collections, and it creates a collection for each S3 bucket. https://github.com/seaweedfs/seaweedfs/wiki/S3-API-FAQ#setting-ttl

But if sentry itself cleans up old data, I guess there is no difference.

The sentry cleanup job only cleans up the one on the filesystem. If we're using S3, it won't clean up anything; we need to configure S3 data cleanup on our own.
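
A minimal sketch of what such self-managed cleanup could look like, assuming a plain boto3 client against the bundled S3 endpoint (the endpoint, credentials, and the 90-day figure are illustrative; a lifecycle rule, where supported, would be the lighter-weight option):

```python
import datetime

import boto3

RETENTION_DAYS = 90  # mirrors Sentry's default event retention
cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=RETENTION_DAYS)

# Illustrative endpoint and credentials for the in-cluster SeaweedFS S3 gateway.
s3 = boto3.client(
    "s3",
    endpoint_url="http://seaweedfs:8333",
    aws_access_key_id="<KEY>",
    aws_secret_access_key="<SECRET>",
)

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="nodestore"):
    # Collect keys older than the retention window on this page.
    expired = [
        {"Key": obj["Key"]}
        for obj in page.get("Contents", [])
        if obj["LastModified"] < cutoff
    ]
    if expired:
        # delete_objects accepts up to 1000 keys, which matches the page size.
        s3.delete_objects(Bucket="nodestore", Delete={"Objects": expired})
```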

doc-sheet (Contributor) commented Jun 6, 2025

Looks like I missed that seaweedfs now has the ability to control TTL via the S3 API. And I even linked to the correct section of the FAQ. :)

I'd like to look into a new integration with seaweedfs.

And by the way, I like the idea of expanding the sentry images.

I myself install some packages and modules.

Maybe add an extra step in the install to build user-provided Dockerfiles.

aldy505 (Collaborator) commented Jun 7, 2025

And by the way, I like the idea of expanding the sentry images.

I myself install some packages and modules.

Maybe add an extra step in the install to build user-provided Dockerfiles.

Yes, but I don't think people would go for a non-default setup if they don't need anything.

aldy505 (Collaborator) commented Jul 7, 2025

This is interesting: very close to MinIO, yet far more lightweight. https://github.com/rustfs/rustfs

aldy505 mentioned this pull request Jul 20, 2025
aldy505 (Collaborator) commented Jul 20, 2025

I just tried to set things up with rustfs. It didn't work; I can't configure the administrative side of S3. See my PR here: #3821

aldy505 (Collaborator) commented Jul 21, 2025

@BYK Do you remember why we didn't use MinIO? Was it a licensing issue?

BYK (Member Author) commented Jul 21, 2025

@BYK Do you remember why we didn't use MinIO? Was it a licensing issue?

Both licensing and the performance and scalability issues reported by others.

aldy505 added 2 commits July 31, 2025 19:15
* feat: seaweedfs as s3 nodestore backend

* fix: 'server' was missing for seaweed

* feat: remove minimum volume free space

* feat: specify hostname on ip

* fix: grpc port on seaweed should be `-{service}.port.grpc` instead of `-{service}.grpcPort`

* fix: wrong access key & secret key; use localhost for internal comms

* fix: create index directory

* test: add sentry-seaweedfs volume into expected volumes

* debug: aaaaaaaaaaaaaaaaaaaaaaarrrrggggggghhhhhhhhhhhhhhh

* test: correct ordering for expected volumes

* chore: seaweedfs healthcheck to multiple urls

See https://stackoverflow.com/a/14578575/3153224
aldy505 changed the title from "feat: Use S3 node store with garage" to "feat: Use S3 node store with seaweedfs" on Aug 6, 2025
aldy505 marked this pull request as ready for review on August 6, 2025 09:59
aldy505 (Collaborator) commented Aug 6, 2025

Now, what's missing is the retention days. We can't change the retention days mid-installation; the bucket needs to be recreated.

aldy505 (Collaborator) commented Aug 6, 2025

Integration tests aren't passing. I think we should hold this off for a bit.

hubertdeng123 (Member):

@aldy505 I also noticed that recently we've added objectstore. Perhaps related here?
getsentry/sentry#97271

aldy505 (Collaborator) commented Aug 7, 2025

@aldy505 I also noticed that recently we've added objectstore. Perhaps related here? getsentry/sentry#97271

@hubertdeng123 I asked Jan last week; it's not being used on SaaS yet. Quoting him:

Right, we're planning to make this an intermediary layer to some backend - we do not have a strong story for self-hosted yet. We wouldn't be using postgres for sure; instead we'd offer two alternatives: any S3-compatible backend or raw disk.
Our first use case is event attachments, followed by release files and debug files. We do consider replacing nodestore, but it's not on our roadmap yet. It will likely take months to get to the point where we can plan that.

Successfully merging this pull request may close these issues: "Postgres nodestore_node is huge", "Cleaning nodestore_node table".

4 participants