Commit dbc6bf3

Allow scanners to run asynchronously and send their results later (#24447)
1 parent 4504512 commit dbc6bf3

File tree

16 files changed

+402
-52
lines changed

docs/topics/api/overview.rst

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -481,6 +481,7 @@ These are `v5` specific changes - `v4` changes apply also.
481481
* 2025-09-01: added ``created`` and ``updated`` query parameters to addon search api. https://github.com/mozilla/addons/issues/15814
482482
* 2025-09-18: added /rollback endpoint to version detail api. https://github.com/mozilla/addons/issues/15696
483483
* 2026-02-19: added ``listingcontentreview`` endpoint for addons. https://github.com/mozilla/addons/issues/16050
484+
* 2026-02-19: added the ability to patch a scanner result. https://github.com/mozilla/addons/issues/16004
484485

485486
.. _`#11380`: https://github.com/mozilla/addons-server/issues/11380/
486487
.. _`#11379`: https://github.com/mozilla/addons-server/issues/11379/

docs/topics/api/scanners.rst

Lines changed: 24 additions & 3 deletions
@@ -7,9 +7,9 @@ Scanners
77
These APIs are subject to change at any time and are for internal use only.
88

99

10-
---------------------
11-
Scanner Results
12-
---------------------
10+
--------------------
11+
List scanner results
12+
--------------------
1313

1414
.. _scanner-results:
1515

@@ -29,3 +29,24 @@ This endpoint returns a list of labelled scanner results.
2929
:>json object results: The scanner (raw) results.
3030
:>json string created: The date the result was created, formatted with `this format <http://ecma-international.org/ecma-262/5.1/#sec-15.9.1.15>`_.
3131
:>json string|null model_version: The model version when applicable, ``null`` otherwise.
32+
33+
34+
----------------------
35+
Patch - Update results
36+
----------------------
37+
38+
.. _scanner-result-patch:
39+
40+
This endpoint allows updating the results of an existing scanner result.
41+
42+
.. note::
43+
Requires JWT authentication using the service account credentials
44+
associated with the scanner webhook.
45+
46+
.. http:patch:: /api/v5/scanner/results/(int:pk)/
47+
48+
:param int pk: The scanner result ID.
49+
:<json object results: The scanner results.
50+
:statuscode 204: Results successfully updated.
51+
:statuscode 400: Invalid payload.
52+
:statuscode 409: Scanner results already recorded.
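
A minimal sketch of how a scanner might build the PATCH request documented above, using only the Python standard library. The JWT claim set (`iss`, `jti`, `iat`, `exp`), the HS256 algorithm and the `JWT ` prefix in the `Authorization` header are assumptions based on AMO's general API authentication conventions; this commit does not spell them out. The names `make_jwt` and `build_patch_request` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
import urllib.request
import uuid


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_jwt(api_key: str, api_secret: str, lifetime: int = 60) -> str:
    """Build an HS256-signed JWT (assumed claim set: iss, jti, iat, exp)."""
    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,
        "jti": uuid.uuid4().hex,
        "iat": now,
        "exp": now + lifetime,
    }
    signing_input = (
        _b64url(json.dumps(header).encode("utf-8"))
        + "."
        + _b64url(json.dumps(claims).encode("utf-8"))
    )
    signature = hmac.new(
        api_secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return signing_input + "." + _b64url(signature)


def build_patch_request(scanner_result_url, results, api_key, api_secret):
    """Return an unsent PATCH request carrying {"results": ...} as JSON."""
    body = json.dumps({"results": results}).encode("utf-8")
    return urllib.request.Request(
        scanner_result_url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": "JWT " + make_jwt(api_key, api_secret),
            "Content-Type": "application/json",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. According to the status codes listed above, a 204 means the results were stored, a 400 means the payload was rejected, and a 409 means results were already recorded for that scanner result.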

docs/topics/development/scanner_pipeline.md

Lines changed: 67 additions & 7 deletions
@@ -9,8 +9,10 @@ occur on AMO. These webhooks are registered in the AMO (django) admin.
99

1010
When a [scanner webhook event](#scanner-webhook-events) occurs, AMO will send an
1111
HTTP request to each webhook subscribed to this event. The payload sent to the
12-
webhook depends on the event. The response from each webhook will lead to the
13-
creation of a [scanner result](#scanner-results).
12+
webhook depends on the event. AMO creates a [scanner result](#scanner-results)
13+
before calling the webhook, and includes a `scanner_result_url` in the payload
14+
that allows the scanner to [asynchronously send its results](#asynchronous-scanning)
15+
back to AMO.
1416

1517
Each service registered as a scanner webhook must be protected with a shared
1618
secret (api) key. Read [the scanners authentication
@@ -48,11 +50,13 @@ validation chain.
4850

4951
The payload sent looks like this. Assuming correct permissions, the URL in
5052
`download_url` allows the services notified for this event to download the (raw)
51-
uploaded file.
53+
uploaded file. The `scanner_result_url` allows the scanner to send results
54+
asynchronously.
5255

5356
```json
5457
{
55-
"download_url": "http://olympia.test/uploads/file/42"
58+
"download_url": "http://olympia.test/uploads/file/42",
59+
"scanner_result_url": "http://olympia.test/api/v5/scanner/results/123/"
5660
}
5761
```
5862

@@ -68,7 +72,8 @@ The payload sent looks like this:
6872
"version_id": 42,
6973
"download_source_url": "http://olympia.test/downloads/source/42",
7074
"license_slug": "MPL-2.0",
71-
"activity_log_id": 2170
75+
"activity_log_id": 2170,
76+
"scanner_result_url": "http://olympia.test/api/v5/scanner/results/124/"
7277
}
7378
```
7479

@@ -112,6 +117,8 @@ These actions are defined in `src/olympia/scanners/actions.py`.
112117
(scanners-authentication)=
113118
### Authentication
114119

120+
#### Authenticating incoming webhook calls
121+
115122
Scanners must verify the incoming requests using the `Authorization` header and
116123
not allow unauthenticated requests. For every webhook call, AMO will send this
117124
header using the _API key_ defined in the Django admin as follows:
@@ -124,11 +131,57 @@ Authorization: HMAC-SHA256 <hexdigest>
124131
the _API key_ used as the secret key. Make sure to hash the _raw_ request's
125132
body.
126133

134+
#### Authenticating asynchronous result submissions
135+
136+
When sending results asynchronously via PATCH to the `scanner_result_url`,
137+
scanners must authenticate using JWT credentials. Each scanner webhook has an
138+
automatically created service account, and the JWT keys for this account are
139+
displayed in the Django admin after creating the webhook.
140+
141+
Use the JWT key and secret to generate a JWT token and include it in the
142+
`Authorization` header when making PATCH requests to submit results
143+
asynchronously.
144+
127145
### API response
128146

129-
Scanners must return a JSON response that contains the following fields:
147+
Scanners can choose to return results synchronously or asynchronously:
148+
149+
#### Synchronous response
150+
151+
Scanners can immediately return a JSON response that contains the following fields:
130152

131153
- `version`: the scanner version
154+
- `matchedRules`: an array of matched rule identifiers (string)
155+
156+
(asynchronous-scanning)=
157+
#### Asynchronous response
158+
159+
Scanners can also return a quick acknowledgment response (or any response) and
160+
send their results later using the `scanner_result_url` provided in the webhook
161+
payload. This is useful for long-running scans.
162+
163+
To send results asynchronously:
164+
165+
1. The scanner receives a webhook call with a `scanner_result_url` in the payload
166+
2. The scanner returns a quick response (e.g., HTTP 202 Accepted)
167+
3. The scanner performs its analysis
168+
4. The scanner sends a PATCH request to the `scanner_result_url` with the results
169+
170+
The PATCH request must be authenticated using the [service account JWT
171+
credentials](#scanners-authentication) and include a JSON body with a `results`
172+
field:
173+
174+
```json
175+
{
176+
"results": {
177+
"version": "1.0.0",
178+
"matchedRules": []
179+
}
180+
}
181+
```
182+
183+
The `results` field should contain the same data structure as a synchronous
184+
response would return.
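
The four asynchronous steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's implementation: `handle_webhook`, `scan` and `send_results` are hypothetical names, and the actual PATCH call (with JWT authentication) is left to the injected `send_results` callable.

```python
import threading
from typing import Callable


def handle_webhook(
    payload: dict,
    scan: Callable[[dict], dict],
    send_results: Callable[[str, dict], None],
):
    """Acknowledge the webhook call, then scan and report in the background.

    Returns the HTTP status and body to send back immediately, plus the
    worker thread so callers (or tests) can wait for it.
    """
    # Step 1: the payload carries the URL to report results to.
    result_url = payload["scanner_result_url"]

    def worker():
        # Step 3: perform the (possibly long-running) analysis.
        results = scan(payload)
        # Step 4: send {"results": ...} to the scanner_result_url.
        send_results(result_url, {"results": results})

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    # Step 2: respond quickly, before the scan has finished.
    return 202, {"message": "Scan started"}, thread
```

Injecting `send_results` keeps the flow testable without a network. In a real scanner it would issue the authenticated PATCH request.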
132185

133186
### Creating a new scanner
134187

@@ -149,7 +202,13 @@ import { createExpressApp } from "addons-scanner-utils";
149202
const handler = (req, res) => {
150203
console.log({ data: req.body });
151204

205+
// Option 1: Synchronous response
152206
res.json({ version: "1.0.0" });
207+
208+
// Option 2: Asynchronous response (for long-running scans)
209+
// res.status(202).json({ message: "Scan started" });
210+
// // Perform scanning asynchronously and later send results to:
211+
// // req.body.scanner_result_url
153212
};
154213

155214
const app = createExpressApp({
@@ -183,7 +242,8 @@ When uploading a new file, you should see the following in the console:
183242
```js
184243
{
185244
data: {
186-
download_url: "http://olympia.test/uploads/file/fa7868396b7e44ef8a0711f608f534f7/?access_token=w0Tl7qmJqBMQ4gtitKbcdKozulWVQWhkU0wEA10N"
245+
download_url: "http://olympia.test/uploads/file/fa7868396b7e44ef8a0711f608f534f7/?access_token=w0Tl7qmJqBMQ4gtitKbcdKozulWVQWhkU0wEA10N",
246+
scanner_result_url: "http://olympia.test/api/v5/scanner/results/123/"
187247
}
188248
}
189249
```
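
For the "Authenticating incoming webhook calls" section earlier in this file, a scanner written in Python could verify the `Authorization: HMAC-SHA256 <hexdigest>` header roughly as below. This is a sketch that assumes the digest is the hex-encoded HMAC-SHA256 of the raw request body keyed with the API key, as the documentation describes. The function name `is_authorized` is hypothetical.

```python
import hashlib
import hmac


def is_authorized(authorization_header: str, raw_body: bytes, api_key: str) -> bool:
    """Check an 'HMAC-SHA256 <hexdigest>' header against the raw request body."""
    scheme, _, digest = authorization_header.partition(" ")
    if scheme != "HMAC-SHA256" or not digest:
        return False
    expected = hmac.new(
        api_key.encode("utf-8"), raw_body, hashlib.sha256
    ).hexdigest()
    # Constant-time comparison; bytes are used so non-ASCII input cannot raise.
    return hmac.compare_digest(
        expected.encode("ascii"), digest.strip().encode("utf-8")
    )
```

The body must be hashed exactly as received. Re-serializing parsed JSON can change whitespace or key order and would produce a different digest.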

src/olympia/api/tests/utils.py

Lines changed: 5 additions & 0 deletions
@@ -52,6 +52,11 @@ def put(self, url, data=None, **client_kwargs):
5252
url, data, HTTP_AUTHORIZATION=self.authorization(), **client_kwargs
5353
)
5454

55+
def patch(self, url, data=None, **client_kwargs):
56+
return self.client.patch(
57+
url, data, HTTP_AUTHORIZATION=self.authorization(), **client_kwargs
58+
)
59+
5560
def auth_required(self, cls):
5661
"""
5762
Tests that the JWT Auth class is on the class, without having

src/olympia/constants/permissions.py

Lines changed: 3 additions & 0 deletions
@@ -102,6 +102,9 @@
102102
# Can submit language packs. #11788 and #11793
103103
LANGPACK_SUBMIT = AclPermission('LanguagePack', 'Submit')
104104

105+
# Scanners can use a special endpoint to update their results.
106+
SCANNERS_PATCH_RESULTS = AclPermission('Scanners', 'PatchResults')
107+
105108
# Can submit add-ons signed with Mozilla internal certificate, or add-ons with
106109
# a guid ending with reserved suffixes like @mozilla.com
107110
SYSTEM_ADDON_SUBMIT = AclPermission('SystemAddon', 'Submit')

src/olympia/scanners/admin.py

Lines changed: 2 additions & 6 deletions
@@ -1191,9 +1191,7 @@ def formatted_events_list(self, obj):
11911191

11921192
def service_account(self, obj):
11931193
try:
1194-
user = UserProfile.objects.get_service_account(
1195-
name=obj.service_account_name
1196-
)
1194+
user = obj.service_account
11971195
except UserProfile.DoesNotExist:
11981196
return '(will be automatically created)'
11991197

@@ -1213,9 +1211,7 @@ def save_model(self, request, obj, form, change):
12131211
if not change:
12141212
# Display the JWT keys only once on creation.
12151213
try:
1216-
user = UserProfile.objects.get_service_account(
1217-
name=obj.service_account_name
1218-
)
1214+
user = obj.service_account
12191215
api_key = APIKey.get_jwt_key(user=user)
12201216
messages.add_message(
12211217
request,

src/olympia/scanners/api_urls.py

Lines changed: 13 additions & 6 deletions
@@ -1,11 +1,18 @@
11
from django.conf import settings
22
from django.urls import re_path
33

4-
from .views import ScannerResultView
4+
from .views import ScannerResultView, patch_scanner_result
55

66

7-
urlpatterns = (
8-
[re_path(r'^results/$', ScannerResultView.as_view(), name='scanner-results')]
9-
if settings.INTERNAL_ROUTES_ALLOWED
10-
else []
11-
)
7+
urlpatterns = [
8+
re_path(
9+
r'^results/(?P<pk>\d+)/$',
10+
patch_scanner_result,
11+
name='scanner-result-patch',
12+
),
13+
]
14+
15+
if settings.INTERNAL_ROUTES_ALLOWED:
16+
urlpatterns.append(
17+
re_path(r'^results/$', ScannerResultView.as_view(), name='scanner-results')
18+
)
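
The effect of the new URL layout can be illustrated without Django. The sketch below reuses the two regular expressions from the diff. The `resolve` helper is hypothetical and only mimics how the patterns are matched: the PATCH route is always registered, while the list route depends on `INTERNAL_ROUTES_ALLOWED`.

```python
import re

# Same patterns as in api_urls.py above.
DETAIL_PATTERN = re.compile(r'^results/(?P<pk>\d+)/$')
LIST_PATTERN = re.compile(r'^results/$')


def resolve(path: str, internal_routes_allowed: bool):
    """Return (route name, kwargs) for a path, or None if nothing matches."""
    match = DETAIL_PATTERN.match(path)
    if match:
        # Named groups are captured as strings, like Django's re_path kwargs.
        return "scanner-result-patch", {"pk": match.group("pk")}
    if internal_routes_allowed and LIST_PATTERN.match(path):
        return "scanner-results", {}
    return None
```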
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
1+
from django.db import migrations
2+
3+
from olympia.constants.scanners import SCANNER_SERVICE_ACCOUNTS_GROUP
4+
5+
6+
def add_permission_to_group(apps, schema_editor):
7+
Group = apps.get_model('access', 'Group')
8+
9+
group = Group.objects.get(name=SCANNER_SERVICE_ACCOUNTS_GROUP)
10+
rules = group.rules.split(',') if group.rules else []
11+
rules.append('Scanners:PatchResults')
12+
group.rules = ','.join(rules)
13+
group.save()
14+
15+
16+
class Migration(migrations.Migration):
17+
18+
dependencies = [
19+
('scanners', '0069_add_service_accounts_to_group'),
20+
]
21+
22+
operations = [migrations.RunPython(add_permission_to_group)]
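
The migration above treats `group.rules` as a comma-separated string and appends the new rule unconditionally. As a hedged variation, the helper below shows the same string manipulation with a duplicate check, so that running it twice leaves the rules unchanged. The `add_rule` function is hypothetical and not part of the commit.

```python
def add_rule(rules: str, rule: str) -> str:
    """Append a rule to a comma-separated rules string unless already present."""
    current = [r for r in rules.split(",") if r] if rules else []
    if rule not in current:
        current.append(rule)
    return ",".join(current)
```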

src/olympia/scanners/models.py

Lines changed: 10 additions & 2 deletions
@@ -294,7 +294,7 @@ class Meta:
294294

295295
def save(self, *args, **kwargs):
296296
service_account, created = UserProfile.objects.get_or_create_service_account(
297-
name=self.service_account_name,
297+
name=self._service_account_name,
298298
notes=(
299299
'Service account automatically created for '
300300
f'the "{self.name}" scanner webhook.'
@@ -308,9 +308,13 @@ def save(self, *args, **kwargs):
308308
return super().save(*args, **kwargs)
309309

310310
@property
311-
def service_account_name(self):
311+
def _service_account_name(self):
312312
return f'webhook-{self.name}'
313313

314+
@property
315+
def service_account(self):
316+
return UserProfile.objects.get_service_account(self._service_account_name)
317+
314318
def __str__(self):
315319
return self.name
316320

@@ -366,6 +370,10 @@ class Meta(AbstractScannerResult.Meta):
366370
def rule_model(self):
367371
return self.matched_rules.rel.model
368372

373+
@property
374+
def webhook(self):
375+
return self.webhook_event.webhook if self.webhook_event else None
376+
369377
def get_rules_queryset(self):
370378
# See: https://github.com/mozilla/addons-server/issues/13143
371379
return super().get_rules_queryset().filter(is_active=True)

src/olympia/scanners/serializers.py

Lines changed: 17 additions & 0 deletions
@@ -23,3 +23,20 @@ class Meta:
2323

2424
def get_scanner(self, obj):
2525
return obj.get_scanner_name()
26+
27+
28+
class PatchScannerResultSerializer(serializers.Serializer):
29+
"""Serializer for updating scanner result data via PATCH endpoint."""
30+
31+
results = serializers.JSONField()
32+
33+
def validate(self, data):
34+
# Validate that no extra fields are present in the initial data.
35+
if hasattr(self, 'initial_data'):
36+
extra_fields = set(self.initial_data.keys()) - set(self.fields.keys())
37+
if extra_fields:
38+
raise serializers.ValidationError(
39+
{field: 'Unexpected field.' for field in extra_fields}
40+
)
41+
42+
return data
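
The extra-field check in `validate` boils down to a set difference between the submitted keys and the declared fields. A standalone sketch that needs no Django REST framework is shown below. The function name `find_unexpected_fields` and the `allowed` default are hypothetical.

```python
def find_unexpected_fields(initial_data: dict, allowed=("results",)) -> dict:
    """Map each submitted key that is not an allowed field to an error message."""
    extra_fields = set(initial_data) - set(allowed)
    return {field: "Unexpected field." for field in sorted(extra_fields)}
```

An empty dict means the payload contains only known fields. In the serializer, a non-empty dict is what gets raised as a `ValidationError`, which presumably surfaces as the documented 400 response.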
