Conversation
internal/dns/srvRecordDialer_test.go
Outdated
| "github.com/stretchr/testify/assert" | ||
| ) | ||
|
|
||
| func TestNewSRVRecordDialer(t *testing.T) { |
There was a problem hiding this comment.
Consider using github.com/stretchr/testify/suite for tests. That package lets you organize your tests better, moving setup/teardown code into separate methods, etc. It provides a lot of the same structure as frameworks like TestNG for Java.
internal/dns/srvRecordDialer.go
Outdated
| LookupSRV(ctx context.Context, service, proto, name string) (string, []*net.SRV, error) | ||
| } | ||
|
|
||
| func NewSRVRecordDialer(fqdns []string, sortBy string, resolver Resolver) (http.RoundTripper, error) { |
There was a problem hiding this comment.
You probably want to use the functional options pattern here. This function is a primary entry point for this package and it's signature will likely need to change as this package evolves over time. Using functional options allows clients to be backwards compatible and resilient in the face of new features.
internal/dns/srvRecordDialer.go
Outdated
|
|
||
| //TODO: add retry logic if we receive conn error? or just move to next one? | ||
| switch d.sortBy { | ||
| case "weight": |
There was a problem hiding this comment.
Instead of using constants whose value you switch on, a more extensible way would be to create a Sorter strategy that did the sorting. Then, instead of passing sortBy, you'd have someone pass this code a Sorter. This package can provide (2) implementations of Sorter: one for weight and one for priority.
There was a problem hiding this comment.
I think this sorting logic is going to be replace soon with actual sampling/load-balancing code, right @maurafortino ?
There was a problem hiding this comment.
@denopink there is sampling/load-balancing code in the getAddrByWeight function nested inside the switch statement: https://github.com/xmidt-org/caduceus/blob/4a72c1e544ab1c9c609872b1f6624bd92269f4e5/internal/dns/srvRecordDialer.go#L111C36-L112C21
There was a problem hiding this comment.
Spoke with John and we're going to be changing from sorter strategy to chooser strategy
There was a problem hiding this comment.
I may have missed the conversation behind the algo being used in getAddrByWeight.
I recommend adding a short description of the expected behavior of getAddrByWeight.
There was a problem hiding this comment.
I may have missed the conversation behind the algo being used in
getAddrByWeight.I recommend adding a short description of the expected behavior of
getAddrByWeight.
@denopink I don't think we had a discussion on which algo to use. i'm just using weighted random for now. i can set up a time though to discuss if needed.
There was a problem hiding this comment.
got it, just letting you know I don't think getAddrByWeight is actually implementing “weighted random” load balancing. weighted random load balancing usually means routing requests evenly across healthy targets in a random order.
No need for a meeting (imo) if you aren't focusing on the load balancing specifics just yet, I just wanted to point it out in case I was missing something. Sounds like it's a place holder while you're working on this. thx for clarifying 🙂
internal/dns/srvRecordDialer.go
Outdated
| d.srvs = append(d.srvs, addrs...) | ||
| } | ||
|
|
||
| // TODO: ask wes/john whether 1 or more net.LookupSRV error should trigger an error from NewSRVRecordDailer |
There was a problem hiding this comment.
When a customer registers a set of svr records (via a single registration call to webpa) and one of the src record fails, I would think we wouldn't want the srv record to silently fail... otherwise our customer may not realize there's an issue with one of their srv records.
What are your thoughts @schmidtw @johnabass?
internal/dns/srvRecordDialer.go
Outdated
| func getAddrByWeight(srvs []*net.SRV) (*net.SRV, int, error) { | ||
| if len(srvs) == 0 { | ||
| return nil, -1, errors.New("no SRV records available") | ||
| } | ||
|
|
||
| totalWeight := 0 | ||
| for _, srv := range srvs { | ||
| totalWeight += int(srv.Weight) | ||
| } | ||
|
|
||
| if totalWeight == 0 { | ||
| totalWeight = len(srvs) | ||
| } | ||
|
|
||
| randWeight := rand.Intn(totalWeight) | ||
| currentWeight := 0 | ||
|
|
||
| for i, srv := range srvs { | ||
| currentWeight += int(srv.Weight) | ||
| if randWeight < currentWeight { | ||
| return srv, i, nil | ||
| } | ||
| } | ||
|
|
||
| return nil, -1, errors.New("failed to choose an SRV record by weight") | ||
| } |
There was a problem hiding this comment.
best to convert these errors into package errors for testing and general error handling
internal/dns/srvRecordDialer.go
Outdated
| return conn, errs | ||
| } | ||
| default: | ||
| return nil, fmt.Errorf("unknown loadBalancingScheme type: %s", d.sortBy) |
There was a problem hiding this comment.
best to convert this to a package error for testing and general error handling
internal/dns/srvRecordDialer.go
Outdated
| _, addrs, err := resolver.LookupSRV(context.Background(), "", "", fqdn) | ||
| if err != nil { | ||
| errs = errors.Join(errs, | ||
| fmt.Errorf("srv lookup failure: `%s`", fqdn), |
There was a problem hiding this comment.
best to convert this to a package error for testing and general error handling
There was a problem hiding this comment.
can you clarify what you mean by package error?
There was a problem hiding this comment.
package level variable with an assigned error
There was a problem hiding this comment.
is that common practice for us? i don't think i've ever seen us make error a package level variable. do you have an example? maybe i'm not understanding fully.
There was a problem hiding this comment.
There was a problem hiding this comment.
oooh yeah was def not understanding fully. thx!
internal/dns/srvRecordDialer.go
Outdated
|
|
||
| // TODO: ask wes/john whether 1 or more net.LookupSRV error should trigger an error from NewSRVRecordDailer | ||
| if len(d.srvs) == 0 { | ||
| return nil, errors.Join(fmt.Errorf("expected atleast 1 srv record from fqdn list `%v`", d.fqdns), errs) |
There was a problem hiding this comment.
best to convert this to a package error for testing and general error handling
denopink
left a comment
There was a problem hiding this comment.
Left a few suggestions, great progress so far! 🍻
There was a problem hiding this comment.
moved to internal/dns/srvRecordDialer.go
There was a problem hiding this comment.
@denopink @schmidtw I would like discuss how we are going to be choosing between weight and priority. @johnabass mentioned we normally would choose one or the other (he wasn't sure exactly which one is chosen but for this example we'll go with weight) and then if there are two with the same weight we look to see which has the higher priority. And then vice versa if we use priority. Is this how we want to do that or are we thinking of distributing events differently?
internal/README.md
Outdated
There was a problem hiding this comment.
ignore this file - will be removing.
|
pausing work on this for now after discussing with John and Wes. Will work on in the background - but prioritizing batching and releasing new caduceus before finalizing the design for "load balancing" the SRVs |
|
closing this PR for now - could be reopened when we revisit the SRV logic |
What's Included: