Commit 5b6da41

feat: Implement intelligent GH API call monitoring
* Wrapped all GitHub API calls with comprehensive profiling utilities that
  provide detailed performance and quota monitoring capabilities.
* Added intelligent rate limit monitoring with threshold-based alerting:
  - ERROR level: Critical alerts when <50 calls remaining
  - WARN level: Warnings when <100 calls remaining
  - INFO level: Notifications when <500 calls remaining
  - DEBUG level: All API calls with duration, URL, and rate limit data
* Enhanced logging includes human-readable reset timestamps, total quotas,
  and repository context for better operational visibility.
* Created comprehensive test coverage including edge cases for all rate
  limit threshold scenarios and API response variations.
* Migrated logging configuration to dedicated documentation with detailed
  examples, troubleshooting guides, and operational best practices.
* Added production-ready monitoring capabilities to prevent service
  disruptions and optimize API usage patterns.

This enhancement provides administrators with proactive rate limit
management, detailed API interaction debugging, and early warning systems
for quota exhaustion.

Jira: https://issues.redhat.com/browse/SRVKP-7037

Signed-off-by: Chmouel Boudjnah <[email protected]>
1 parent 4ae3540 commit 5b6da41

9 files changed: +628 -147 lines changed
docs/content/docs/install/logging.md

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
+---
+title: Logging
+weight: 4
+---
+
+## Logging Configuration
+
+Pipelines-as-Code uses a ConfigMap named `pac-config-logging` in its namespace (by default, `pipelines-as-code`) to configure the logging behavior of its controllers.
+
+To view the ConfigMap, use the following command:
+
+```bash
+kubectl get configmap pac-config-logging -n pipelines-as-code
+```
+
+To see the full content of the ConfigMap, run:
+
+```bash
+kubectl get configmap pac-config-logging -n pipelines-as-code -o yaml
+```
+
+The `data` section of the ConfigMap contains the following keys:
+
+* `loglevel.pipelinesascode`: The log level for the `pipelines-as-code-controller` component.
+* `loglevel.pipelines-as-code-webhook`: The log level for the `pipelines-as-code-webhook` component.
+* `loglevel.pac-watcher`: The log level for the `pipelines-as-code-watcher` component.
+
+You can change the log level from `info` to `debug` or any other supported value. For example, to set the log level for the `pipelines-as-code-watcher` to `debug`, run:
+
+```bash
+kubectl patch configmap pac-config-logging -n pipelines-as-code --type json -p '[{"op": "replace", "path": "/data/loglevel.pac-watcher", "value":"debug"}]'
+```
+
+The controller will automatically pick up the new log level.
+
+If you want to use the same log level for all Pipelines-as-Code components, you can remove the individual `loglevel.*` keys. In this case, all components will use the log level defined in the `level` field of the `zap-logger-config`.
+
+```bash
+kubectl patch configmap pac-config-logging -n pipelines-as-code --type json -p '[{"op": "remove", "path": "/data/loglevel.pac-watcher"}, {"op": "remove", "path": "/data/loglevel.pipelines-as-code-webhook"}, {"op": "remove", "path": "/data/loglevel.pipelinesascode"}]'
+```
+
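As a reading aid for the fallback described above, here is a minimal Go sketch of how an effective level could be resolved from the ConfigMap `data`. The function name `effectiveLevel` and the exact resolution order are assumptions made for illustration; the real lookup code is not part of this diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// effectiveLevel is a hypothetical helper: it returns the per-component level
// when its key is present, otherwise the "level" field found in the
// zap-logger-config JSON document.
func effectiveLevel(data map[string]string, componentKey string) (string, error) {
	if lvl, ok := data[componentKey]; ok && lvl != "" {
		return lvl, nil
	}
	var zapCfg struct {
		Level string `json:"level"`
	}
	if err := json.Unmarshal([]byte(data["zap-logger-config"]), &zapCfg); err != nil {
		return "", fmt.Errorf("parsing zap-logger-config: %w", err)
	}
	return zapCfg.Level, nil
}

func main() {
	// Sample data: only the watcher has its own key; the controller key was removed.
	data := map[string]string{
		"loglevel.pac-watcher": "debug",
		"zap-logger-config":    `{"level": "info", "encoding": "json"}`,
	}
	for _, key := range []string{"loglevel.pac-watcher", "loglevel.pipelinesascode"} {
		lvl, err := effectiveLevel(data, key)
		fmt.Println(key, lvl, err)
	}
}
```

With the sample data, the watcher keeps its own `debug` value while the controller key, being absent, falls back to the `level` set in `zap-logger-config`.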
+The `zap-logger-config` supports the following log levels:
+
+* `debug`: Fine-grained debugging information.
+* `info`: Normal operational logging.
+* `warn`: Unexpected but non-critical errors.
+* `error`: Critical errors that are unexpected during normal operation.
+* `dpanic`: Triggers a panic (crash) in development mode.
+* `panic`: Triggers a panic (crash).
+* `fatal`: Immediately exits with a status of 1.
+
+For more details, see the [Knative logging documentation](https://knative.dev/docs/serving/observability/logging/config-logging).
+
+## Debugging API Interactions
+
+If you need to troubleshoot interactions with the Git provider API (e.g., GitHub), you can enable detailed API request logging. This is useful for debugging permission issues or unexpected API responses.
+
+To enable this feature, set the log level for the `pipelines-as-code-controller` to `debug`. This will cause the controller to log the duration, URL, and remaining rate-limit for each API call it makes.
+
+You can enable this for the main controller with the following `kubectl` command:
+
+```bash
+kubectl patch configmap pac-config-logging -n pipelines-as-code --type json -p '[{"op": "replace", "path": "/data/loglevel.pipelinesascode", "value":"debug"}]'
+```
+
+### Rate Limit Monitoring
+
+When debug logging is enabled, Pipelines-as-Code automatically monitors GitHub API rate limits and provides intelligent warnings when limits are running low. This helps prevent API quota exhaustion and provides early warning for potential service disruptions.
+
+The rate limit monitoring includes:
+
+* **Debug level**: All API calls log their duration, URL, and remaining rate limit count
+* **Info level**: When remaining calls drop below 500, logs include additional context like total limit and reset time
+* **Warning level**: When remaining calls drop below 100, warnings are logged to alert administrators
+* **Error level**: When remaining calls drop below 50, critical alerts are logged indicating immediate attention is needed
+
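The thresholds in the list above can be sketched as a small classification function. This is an illustrative sketch only: the constant names and `levelForRemaining` are hypothetical and are not taken from the Pipelines-as-Code source. Only the numeric thresholds (50, 100, 500) come from the documentation in this diff.

```go
package main

import "fmt"

// Threshold values come from the documentation; the names are illustrative.
const (
	criticalThreshold = 50
	warningThreshold  = 100
	infoThreshold     = 500
)

// levelForRemaining maps the number of remaining API calls to the log level
// described in the list: the most severe matching threshold wins.
func levelForRemaining(remaining int) string {
	switch {
	case remaining < criticalThreshold:
		return "error"
	case remaining < warningThreshold:
		return "warn"
	case remaining < infoThreshold:
		return "info"
	default:
		return "debug"
	}
}

func main() {
	for _, remaining := range []int{4850, 350, 75, 25} {
		fmt.Printf("%d remaining -> %s\n", remaining, levelForRemaining(remaining))
	}
}
```

Checking the strictest threshold first keeps the cases mutually exclusive, so a value such as 25 is reported once at error level rather than at every level it also satisfies.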
+#### Example Log Messages
+
+```
+# Debug level - normal API call logging
+DEBUG GitHub API call for repo myorg/myrepo to https://api.github.com/repos/myorg/myrepo/pulls took 245ms, ratelimit-remaining: 4850
+
+# Info level - moderate rate limit usage
+INFO GitHub API rate limit moderate (repo: myorg/myrepo): 350/5000 remaining, resets at 1672531200 (00:00:00 UTC)
+
+# Warning level - low rate limit
+WARN GitHub API rate limit running low (repo: myorg/myrepo): 75/5000 remaining, resets at 1672531200 (00:00:00 UTC)
+
+# Error level - critically low rate limit
+ERROR GitHub API rate limit critically low (repo: myorg/myrepo): 25/5000 remaining, resets at 1672531200 (00:00:00 UTC)
+```
+
+The rate limit information includes:
+
+* **Remaining calls**: Number of API calls left in the current rate limit window
+* **Total limit**: The maximum number of calls allowed (typically 5000 for authenticated requests)
+* **Reset time**: When the rate limit window resets (shown as Unix timestamp and human-readable time)
+* **Repository context**: Which repository triggered the API call (when available)
+
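To illustrate the "Reset time" item, the following sketch shows one way a Unix reset timestamp could be rendered next to a human-readable UTC time. The helper name `formatReset` and the exact output layout are assumptions for illustration; the commit's actual formatting code is not shown in this diff.

```go
package main

import (
	"fmt"
	"time"
)

// formatReset is a hypothetical helper that renders a Unix reset timestamp
// as "<epoch> (HH:MM:SS UTC)".
func formatReset(epoch int64) string {
	t := time.Unix(epoch, 0).UTC()
	return fmt.Sprintf("%d (%s)", epoch, t.Format("15:04:05 MST"))
}

func main() {
	// 1672531200 is the epoch value for 2023-01-01 at midnight UTC.
	fmt.Println(formatReset(1672531200))
}
```

Converting to UTC before formatting keeps the log output independent of the time zone configured on the controller's host.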
+This monitoring helps administrators:
+
+* **Prevent service disruptions**: Early warnings allow proactive measures before hitting rate limits
+* **Optimize API usage**: Identify repositories or operations that consume excessive API calls
+* **Plan maintenance windows**: Schedule intensive operations around rate limit reset times
+* **Debug authentication issues**: Rate limit headers can indicate token validity and permissions

docs/content/docs/install/settings.md

Lines changed: 0 additions & 88 deletions
@@ -403,91 +403,3 @@ A few settings are available to configure this feature:
 The provider set to `GitHub App` by tkn pac bootstrap, used to detect if a
 GitHub App is already configured when a user runs the bootstrap command a
 second time or the `webhook add` command.
-
-## Logging Configuration
-
-Pipelines-as-Code uses the ConfigMap named `pac-config-logging` in the same namespace (`pipelines-as-code` by default) as the controllers. To get the ConfigMap use the following command:
-
-```bash
-$ kubectl get configmap pac-config-logging -n pipelines-as-code
-
-NAME                 DATA   AGE
-pac-config-logging   4      9m44s
-```
-
-To retrieve the content of the ConfigMap:
-
-```bash
-$ kubectl get configmap pac-config-logging -n pipelines-as-code -o yaml
-
-apiVersion: v1
-kind: ConfigMap
-metadata:
-  labels:
-    app.kubernetes.io/instance: default
-    app.kubernetes.io/part-of: pipelines-as-code
-  name: pac-config-logging
-  namespace: pipelines-as-code
-data:
-  loglevel.pac-watcher: info
-  loglevel.pipelines-as-code-webhook: info
-  loglevel.pipelinesascode: info
-  zap-logger-config: |
-    {
-      "level": "info",
-      "development": false,
-      "sampling": {
-        "initial": 100,
-        "thereafter": 100
-      },
-      "outputPaths": ["stdout"],
-      "errorOutputPaths": ["stderr"],
-      "encoding": "json",
-      "encoderConfig": {
-        "timeKey": "ts",
-        "levelKey": "level",
-        "nameKey": "logger",
-        "callerKey": "caller",
-        "messageKey": "msg",
-        "stacktraceKey": "stacktrace",
-        "lineEnding": "",
-        "levelEncoder": "",
-        "timeEncoder": "iso8601",
-        "durationEncoder": "",
-        "callerEncoder": ""
-      }
-    }
-```
-
-The `loglevel.*` fields define the log level for the controllers:
-
-* loglevel.pipelinesascode - the log level for the pipelines-as-code-controller component
-* loglevel.pipelines-as-code-webhook - the log level for the pipelines-as-code-webhook component
-* loglevel.pac-watcher - the log level for the pipelines-as-code-watcher component
-
-You can change the log level from `info` to `debug` or any other supported values. For example, select the `debug` log level for the pipelines-as-code-watcher component:
-
-```bash
-kubectl patch configmap pac-config-logging -n pipelines-as-code --type json -p '[{"op": "replace", "path": "/data/loglevel.pac-watcher", "value":"debug"}]'
-```
-
-After this command, the controller gets a new log level value.
-If you want to use the same log level for all Pipelines-as-Code components, delete `level.*` values from configmap:
-
-```bash
-kubectl patch configmap pac-config-logging -n pipelines-as-code --type json -p '[ {"op": "remove", "path": "/data/loglevel.pac-watcher"}, {"op": "remove", "path": "/data/loglevel.pipelines-as-code-webhook"}, {"op": "remove", "path": "/data/loglevel.pipelinesascode"}]'
-```
-
-In this case, all Pipelines-as-Code components get a common log level from `zap-logger-config` - `level` field from the json.
-
-`zap-logger-config` supports the following log levels:
-
-* debug - fine-grained debugging
-* info - normal logging
-* warn - unexpected but non-critical errors
-* error - critical errors; unexpected during normal operation
-* dpanic - in debug mode, trigger a panic (crash)
-* panic - trigger a panic (crash)
-* fatal - immediately exit with exit status 1 (failure)
-
-See more: <https://knative.dev/docs/serving/observability/logging/config-logging>

hack/dev/kind/install.sh

Lines changed: 1 addition & 0 deletions
@@ -197,6 +197,7 @@ function configure_pac() {
   kubectl patch configmap -n pipelines-as-code -p \
     '{"data":{"catalog-1-type": "artifacthub", "catalog-1-id": "pachub", "catalog-1-name": "pipelines-as-code", "catalog-1-url": "https://artifacthub.io"}}' \
     --type merge pipelines-as-code
+  kubectl patch configmap pac-config-logging -n pipelines-as-code --type json -p '[{"op": "replace", "path": "/data/loglevel.pipelinesascode", "value":"debug"}]'
   set +x
   if [[ -n ${PAC_PASS_SECRET_FOLDER} ]]; then
     echo "Installing PAC secrets"

pkg/provider/github/acl.go

Lines changed: 22 additions & 10 deletions
@@ -20,7 +20,9 @@ func (v *Provider) CheckPolicyAllowing(ctx context.Context, event *info.Event, a
 		// TODO: caching
 		opt := github.ListOptions{PerPage: v.PaginedNumber}
 		for {
-			members, resp, err := v.Client().Teams.ListTeamMembersBySlug(ctx, event.Organization, team, &github.TeamListTeamMembersOptions{ListOptions: opt})
+			members, resp, err := Wrap(v, func() ([]*github.User, *github.Response, error) {
+				return v.Client().Teams.ListTeamMembersBySlug(ctx, event.Organization, team, &github.TeamListTeamMembersOptions{ListOptions: opt})
+			})
 			if resp.StatusCode == http.StatusNotFound {
 				// we explicitly disallow the policy when the team is not found
 				// maybe we should ignore it instead? i'd rather keep this explicit
@@ -169,7 +171,9 @@ func (v *Provider) aclAllowedOkToTestFromAnOwner(ctx context.Context, event *inf
 // aclAllowedOkToTestCurrentEvent only check if this is issue comment event
 // have /ok-to-test regex and sender is allowed.
 func (v *Provider) aclAllowedOkToTestCurrentComment(ctx context.Context, revent *info.Event, id int64) (bool, error) {
-	comment, _, err := v.Client().Issues.GetComment(ctx, revent.Organization, revent.Repository, id)
+	comment, _, err := Wrap(v, func() (*github.IssueComment, *github.Response, error) {
+		return v.Client().Issues.GetComment(ctx, revent.Organization, revent.Repository, id)
+	})
 	if err != nil {
 		return false, err
 	}
@@ -235,7 +239,9 @@ func (v *Provider) aclCheckAll(ctx context.Context, rev *info.Event) (bool, erro
 //
 // ex: dependabot, *[bot] etc...
 func (v *Provider) checkPullRequestForSameURL(ctx context.Context, runevent *info.Event) (bool, error) {
-	pr, resp, err := v.Client().PullRequests.Get(ctx, runevent.Organization, runevent.Repository, runevent.PullRequestNumber)
+	pr, resp, err := Wrap(v, func() (*github.PullRequest, *github.Response, error) {
+		return v.Client().PullRequests.Get(ctx, runevent.Organization, runevent.Repository, runevent.PullRequestNumber)
+	})
 	if err != nil {
 		return false, err
 	}
@@ -258,7 +264,9 @@ func (v *Provider) checkSenderOrgMembership(ctx context.Context, runevent *info.
 	}
 
 	for {
-		users, resp, err := v.Client().Organizations.ListMembers(ctx, runevent.Organization, opt)
+		users, resp, err := Wrap(v, func() ([]*github.User, *github.Response, error) {
+			return v.Client().Organizations.ListMembers(ctx, runevent.Organization, opt)
+		})
 		// If we are 404 it means we are checking a repo owner and not a org so let's bail out with grace
 		if resp != nil && resp.StatusCode == http.StatusNotFound {
 			return false, nil
@@ -282,10 +290,12 @@ func (v *Provider) checkSenderOrgMembership(ctx context.Context, runevent *info.
 
 // checkSenderRepoMembership check if user is allowed to run CI.
 func (v *Provider) checkSenderRepoMembership(ctx context.Context, runevent *info.Event) (bool, error) {
-	isCollab, _, err := v.Client().Repositories.IsCollaborator(ctx,
-		runevent.Organization,
-		runevent.Repository,
-		runevent.Sender)
+	isCollab, _, err := Wrap(v, func() (bool, *github.Response, error) {
+		return v.Client().Repositories.IsCollaborator(ctx,
+			runevent.Organization,
+			runevent.Repository,
+			runevent.Sender)
+	})
 
 	return isCollab, err
 }
@@ -313,8 +323,10 @@ func (v *Provider) GetStringPullRequestComment(ctx context.Context, runevent *in
 		ListOptions: github.ListOptions{PerPage: v.PaginedNumber},
 	}
 	for {
-		comments, resp, err := v.Client().Issues.ListComments(ctx, runevent.Organization, runevent.Repository,
-			prNumber, opt)
+		comments, resp, err := Wrap(v, func() ([]*github.IssueComment, *github.Response, error) {
+			return v.Client().Issues.ListComments(ctx, runevent.Organization, runevent.Repository,
+				prNumber, opt)
+		})
 		if err != nil {
 			return nil, err
 		}
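
The hunks above route every call through `Wrap`, whose definition lives in another file of this commit and is not shown here. As a rough illustration of the pattern only, the sketch below shows a generic wrapper that times a call, reports the remaining quota, and returns the call's values unchanged. Everything in it is hypothetical: `wrapCall`, `callRecord`, and the stand-in `response` type are assumptions made so the example is self-contained, and none of it mirrors the real `Wrap` signature or the go-github types.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// response is a stand-in for *github.Response; only one field is modeled.
type response struct {
	RateRemaining int
}

// callRecord is a hypothetical record handed to the reporting hook.
type callRecord struct {
	Duration  time.Duration
	Remaining int // -1 when no response was returned
	Err       error
}

// wrapCall runs fn, measures its duration, passes a callRecord to report,
// and returns fn's results untouched so call sites keep their shape.
func wrapCall[T any](report func(callRecord), fn func() (T, *response, error)) (T, *response, error) {
	start := time.Now()
	out, resp, err := fn()
	rec := callRecord{Duration: time.Since(start), Remaining: -1, Err: err}
	if resp != nil {
		rec.Remaining = resp.RateRemaining
	}
	report(rec)
	return out, resp, err
}

func main() {
	report := func(r callRecord) {
		fmt.Printf("call took %s, ratelimit-remaining: %d, err: %v\n", r.Duration, r.Remaining, r.Err)
	}

	// Successful call: the wrapper passes the values through.
	users, _, err := wrapCall(report, func() ([]string, *response, error) {
		return []string{"alice", "bob"}, &response{RateRemaining: 4850}, nil
	})
	fmt.Println(users, err)

	// Failed call without a response: the wrapper must tolerate a nil response.
	_, _, err = wrapCall(report, func() (bool, *response, error) {
		return false, nil, errors.New("network error")
	})
	fmt.Println(err)
}
```

Using a type parameter for the first return value is what lets a single wrapper serve calls that return `[]*github.User`, `*github.PullRequest`, or `bool`, as the call sites in this diff do.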
