-
Notifications
You must be signed in to change notification settings - Fork 7
traffic components #271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
traffic components #271
Conversation
2023858 to
52e7818
Compare
6e9fa46 to
bbda3f2
Compare
asloobq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
submitting partial review
| // process delta files and count records in [deltaWindowStart, newestDeltaTs] | ||
| // files are sorted newest to oldest, records within files are sorted newest to oldest | ||
| // stop when the newest record in a file is older than the window | ||
| int sum = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: rename sum to totalRecords or totalRequests
| /** | ||
| * Extract timestamp from SQS message (from SentTimestamp attribute) | ||
| */ | ||
| private Long extractTimestampFromMessage(Message msg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you reuse this method from SQSMessageParser.extractTimestamp()
| } | ||
|
|
||
| /** | ||
| * Count SQS messages from oldestQueueTs to oldestQueueTs + 5 minutes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you remind me why the unprocessed messages in queue are capped upto 5 mins ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not wish to read the entire queue (to apply the allowlist based on timestamp), just 5 minutes at a time.
| for (Message msg : sqsMessages) { | ||
| Long ts = extractTimestampFromMessage(msg); | ||
|
|
||
| if (ts < oldestQueueTs || ts > windowEnd) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ts < oldestQueueTs I think this is not possible because oldestQueueTs was determined from the same sqsMessages
| * Determine traffic status based on current vs baseline traffic. | ||
| * Logs warnings at 50%, 75%, and 90% of the circuit breaker threshold. | ||
| */ | ||
| TrafficStatus determineStatus(int sumCurrent, int baselineTraffic) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: rename sumCurrent to last24HrTraffic or similar
| LOGGER.warn("high_message_volume: 90% of threshold reached, sumCurrent={}, threshold={} ({}x{}), thresholdPercent={}%", | ||
| sumCurrent, threshold, thresholdMultiplier, baselineTraffic, String.format("%.1f", thresholdPercent)); | ||
| } else if (thresholdPercent >= 75.0) { | ||
| LOGGER.warn("high_message_volume: 75% of threshold reached, sumCurrent={}, threshold={} ({}x{}), thresholdPercent={}%", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the log messages are same except for 90/75/50. I think you can do it with one log message
| } | ||
|
|
||
| if (sumCurrent >= threshold) { | ||
| LOGGER.error("circuit_breaker_triggered: traffic threshold breached, sumCurrent={}, threshold={} ({}x{})", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the log type ERROR on purpose ? I think the system is working fine so this shouldn't be an error. May be WARN ?
| * @param sqsMessages List of SQS messages this consumer has read (non-denylisted) | ||
| * @param queueAttributes Queue attributes including invisible message count (can be null) | ||
| * @param denylistedCount Number of denylisted messages read by this consumer | ||
| * @param filteredAsTooRecentCount Number of messages filtered as "too recent" by window reader |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't these be part of sqsMessages ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, these are not returned by SqsBatchProcessor
// Filter for eligible messages (>= deltaWindowSeconds old)
List<SqsParsedMessage> eligibleMessages = filterEligibleMessages(parsedBatch, currentTime);
return BatchProcessingResult.withMessages(eligibleMessages);
But they will appear in the invisible message count returned by getQueueAttributes. We remove / "deduplicate" this from the invisible count, since we are using that to represent messages from other consumers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this is a bit confusing TBH.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I will add to the list, to revisit this when the usage of trafficcalculator is added.
| List<TrafficFilterRule> rules = new ArrayList<>(); | ||
| try { | ||
| JsonArray denylistRequests = config.getJsonArray("denylist_requests"); | ||
| if (denylistRequests == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so config should at least have an empty list ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, just to avoid key typos when configuring
|
|
||
| TrafficFilterRule rule = new TrafficFilterRule(range, ipAddresses); | ||
|
|
||
| LOGGER.info("loaded traffic filter rule: range=[{}, {}], IPs={}", rule.getRangeStart(), rule.getRangeEnd(), rule.getIpAddresses()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this log all the IP addresses or just the size of the list ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would log the IP addresses we configured. Do you think we should avoid this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On one hand, this could flood the logs if the list is too big, on the other hand it would be useful for debugging. Let's keep it for now.
| */ | ||
| private static class TrafficFilterRule { | ||
| private final List<Long> range; | ||
| private final List<String> ipAddresses; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably a Set is better to do contains() efficiently
…timestamp parsing effort
…oldestqueuetimestamp back
…, use two longs for time ranges instead of List<Long>
No description provided.