SeBS Cloudflare Compatibility & 311/130 Benchmark Translation to Nodejs #274
base: master
Conversation
… sebs/cloudflare folder for comprehensive cloudflare doc and next steps
… if all configs are set up manually
…ge is needed and the wrappers have to work
…, at this point the other benchmarks may be translated to nodejs
…and 000 to nodejs. fixed support for nosql in wrapper and sebs. fixed durable objects implementation for sebs
When running the handler as a thread and then awaiting said thread (asyncio.to_thread), we can run asyncio.run() in the subsequent call stack.
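The note above can be illustrated with a small, self-contained Python sketch (the handler and do_work names are hypothetical, not the wrapper's actual code): running the synchronous handler via asyncio.to_thread keeps it off the outer event loop, so a nested asyncio.run() inside it gets its own loop.

```python
import asyncio

def handler(event):
    # Synchronous entry point: not running inside an event loop here,
    # so it may start its own loop with asyncio.run().
    return asyncio.run(do_work(event))

async def do_work(event):
    await asyncio.sleep(0)  # stand-in for real async work
    return {"ok": True, "event": event}

async def wrapper(event):
    # Run the handler in a worker thread and await that thread;
    # the nested asyncio.run() then lives in the thread's call stack.
    return await asyncio.to_thread(handler, event)

if __name__ == "__main__":
    print(asyncio.run(wrapper({"x": 1})))
```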
📝 Walkthrough
Adds comprehensive Cloudflare Workers integration: new Cloudflare system (config, resources, packaging, deployment), R2 and Durable Objects adapters, Node/Python runtime wrappers and proxies, benchmark functions and Dockerfiles, CLI/config updates, build tooling, and documentation.
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Client
participant Container as Container HTTP Handler
participant Worker as Cloudflare Worker (proxy)
participant DO as Durable Object
participant R2 as R2 Storage
Client->>Container: POST /invoke (event)
activate Container
Container->>Container: parse event, add request-id & timestamps
Container->>Worker: forward invocation (sets X-Worker-URL)
activate Worker
alt NoSQL operation
Worker->>DO: POST /nosql/{operation}
activate DO
DO-->>Worker: JSON response
deactivate DO
end
alt R2 operation
Worker->>R2: POST /r2/upload or GET /r2/download
activate R2
R2-->>Worker: data / key
deactivate R2
end
Worker-->>Container: result + measurement block
deactivate Worker
Container->>Container: compute timings & memory metrics
Container-->>Client: JSON response (begin, end, compute_time, memory_used, result)
deactivate Container
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 11
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
sebs.py (1)
102-105: Update help text to reflect Cloudflare container support.
The help text states "(AWS only)" but this PR adds Cloudflare container deployment support. Update the documentation to reflect the expanded platform support.
🔎 Proposed fix
  @click.option(
      "--container-deployment/--no-container-deployment",
      default=False,
-     help="Deploy functions as containers (AWS only). When enabled, functions are packaged as container images and pushed to Amazon ECR."
+     help="Deploy functions as containers. When enabled, functions are packaged as container images (AWS: Amazon ECR, Cloudflare: Container Workers)."
  )
🤖 Fix all issues with AI Agents
In @benchmarks/100.webapps/120.uploader/python/function_cloudflare.py:
- Around line 8-9: Move the module-level call storage.storage.get_instance()
into the request handler so the storage client is lazily initialized; inside the
handler, try to get the singleton via storage.storage.get_instance(), and if
that raises/returns None call storage.init_instance() (or the appropriate
initializer) and then call get_instance() again, assigning the result to a local
variable (e.g., client) before using it; update any references to the
module-level client to use this local/client-in-handler to avoid import-time
failures.
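A minimal, generic sketch of this lazy-initialization pattern (the Storage class and handler below are illustrative stand-ins, not the benchmark's actual storage module):

```python
from typing import Optional

class Storage:
    instance: Optional["Storage"] = None

    @staticmethod
    def init_instance() -> None:
        Storage.instance = Storage()

    @staticmethod
    def get_instance() -> "Storage":
        if Storage.instance is None:
            raise RuntimeError("must init storage singleton first")
        return Storage.instance

def handler(event: dict) -> dict:
    # Initialize on first use instead of at module import time.
    try:
        client = Storage.get_instance()
    except RuntimeError:
        Storage.init_instance()
        client = Storage.get_instance()
    return {"client_ready": client is not None, "bucket": event.get("bucket")}

print(handler({"bucket": "demo"}))
```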
In @benchmarks/100.webapps/130.crud-api/nodejs/function.js:
- Around line 6-13: The addProduct function calls the asynchronous
nosqlClient.insert but does not await it; change addProduct to be async and
await the nosqlClient.insert call (or return its promise) so the handler waits
for completion; update the function signature (addProduct) and the call to
nosqlClient.insert to use await and propagate or handle errors as appropriate to
avoid fire-and-forget behavior.
In @benchmarks/300.utilities/311.compression/nodejs/function.js:
- Around line 108-113: The archive is being written into the source directory
(archive_path = path.join(download_path, archive_name)) and may be picked up by
collectFiles while creating the tar, corrupting the output; change
createTarGzArchive invocation to write the archive outside the directory being
archived (e.g., create the tar in a temp dir or sibling path) or exclude the
archive file from collection by ensuring archive_path is outside download_path
or by passing an explicit exclude for archive_name to collectFiles; update
references around archive_name, archive_path, download_path, and the call to
createTarGzArchive (and collectFiles if applicable) so the archive isn’t placed
inside the directory being archived.
In @benchmarks/300.utilities/311.compression/nodejs/package.json:
- Around line 6-8: The package.json currently depends on "uuid": "^10.0.0" which
requires Node.js 16+, but systems.json still lists Node.js 14 as a supported
runtime; fix by either removing Node.js 14 from the supported runtimes in
systems.json (so only Node 16+ remains) or by downgrading the package.json uuid
dependency to a Node‑14 compatible version (e.g., change "uuid" to a 9.x range
like ^9.0.0); update the corresponding file (package.json for the dependency
change or systems.json for the runtime change) and ensure any CI/test configs
are consistent with the chosen option.
In @benchmarks/wrappers/cloudflare/nodejs/build.js:
- Around line 64-130: The regex-based transformations in asyncNosqlPlugin
(inside build.onLoad for function.js) are unsafe: the function declaration
replacement (`/^(\s*)function\s+(\w+)\s*\(/gm`) forces all functions async and
the line-by-line call-rewrite regex turns constructor calls like `new
UserClass(...)` into invalid `new await UserClass(...)`. Replace the in-plugin
regex approach with a proper AST-based transform run before esbuild (e.g., use
@babel/parser + @babel/traverse or SWC/Recast) to selectively: (1) add await
only to actual async call expressions to nosqlClient methods and user-defined
async functions, (2) avoid modifying constructors (skip NewExpression nodes) and
generators, and (3) preserve non-async function declarations instead of globally
prefixing with async; alternatively restrict the plugin to only operate on
explicitly annotated sections rather than arbitrary regex edits.
In @benchmarks/wrappers/cloudflare/nodejs/container/nosql.js:
- Around line 39-48: The current error handling reads the response body multiple
times causing failures; change it to read the body once into a variable (e.g.,
const body = await response.text()), then attempt to parse JSON from that body
to extract errorData.error and fall back to the raw body if parsing fails,
assigning the result to errorMsg and then throwing the Error using that single
errorMsg (operate on the same response/body and use the existing response and
errorMsg variables).
- Around line 86-99: The query method contains leftover debug console.error
statements that pollute production logs; remove all console.error lines inside
async query (which calls this._make_request) and keep the rest of the
implementation intact (construct params, await this._make_request('query',
params), compute const items = result.items || [] and return items) so behavior
is unchanged.
In @benchmarks/wrappers/cloudflare/nodejs/container/package.json:
- Around line 7-9: Update the dependency version for "@cloudflare/containers" in
package.json because "^1.0.0" has no published releases; change the version
string to a valid published range such as "^0.0.20" or the specific latest
"0.0.31", then reinstall (npm install) and update package-lock.json so container
builds succeed.
In @benchmarks/wrappers/cloudflare/nodejs/container/storage.js:
- Around line 15-18: The class contains a duplicate static field declaration
"worker_url" defined twice; remove the redundant declaration so only a single
"static worker_url = null; // Set by handler from X-Worker-URL header" remains
(keep the comment), and verify no other duplicate static declarations exist in
the class and that all references to WorkerStorage.worker_url (or similar) still
resolve.
In @benchmarks/wrappers/cloudflare/nodejs/container/worker.js:
- Around line 31-38: The query method currently ignores its keyPrefix argument
and returns all stored items; update async query(keyPrefix) to filter the
listing by the prefix (e.g., only include entries where the key startsWith
keyPrefix) before pushing values into items so queries return the intended
subset; modify the loop over this.ctx.storage.list() (the [k, v] destructuring)
to check keyPrefix (handle undefined/empty) and only push v when the key matches
the prefix.
- Around line 309-324: Remove the temporary debug console.log statements inside
the '/r2/upload' branch: delete the logs that print bucket/key, env.R2
existence, received byte length, and the putResult (the lines using console.log
before and after the put). Keep the try/catch and the error reporting
(console.error for env.R2.put errors) and retain the successful behavior (no
debug prints); locate these in the code that checks url.pathname ===
'/r2/upload' and uses request.arrayBuffer() and env.R2.put(key, data).
In @benchmarks/wrappers/cloudflare/nodejs/nosql.js:
- Around line 62-66: The update method uses a non-atomic read-modify-write (the
update function calls get then insert), which allows concurrent updates to
clobber each other; either document this limitation or make update atomic: add a
version/etag field to records and implement optimistic concurrency (in update:
loop read existing via get, compute merged, set incremented version, attempt
insert only if version matches and retry on conflict), or switch to a Cloudflare
Durable Object/transactional API to perform the merge server-side; reference the
update, get, and insert functions when applying the change.
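As a rough illustration of the optimistic-concurrency option mentioned above, here is a Python sketch against an in-memory dict standing in for the KV backend; in a real Durable Object the compare-and-set step would have to run inside the object itself so it stays atomic. All names here are illustrative.

```python
from typing import Dict, Optional

store: Dict[str, dict] = {}

class ConflictError(Exception):
    pass

def get(key: str) -> Optional[dict]:
    return store.get(key)

def put_if_version(key: str, record: dict, expected_version: int) -> None:
    current = store.get(key)
    current_version = current["version"] if current else 0
    if current_version != expected_version:
        raise ConflictError(f"version mismatch for {key}")
    store[key] = record

def update(key: str, changes: dict, max_retries: int = 5) -> dict:
    for _ in range(max_retries):
        existing = get(key) or {"version": 0}
        expected = existing.get("version", 0)
        merged = {**existing, **changes, "version": expected + 1}
        try:
            put_if_version(key, merged, expected)
            return merged
        except ConflictError:
            continue  # another writer got in between; re-read and retry
    raise RuntimeError(f"update of {key} kept conflicting")

print(update("item", {"price": 3}))
print(update("item", {"qty": 2}))
```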
In @benchmarks/wrappers/cloudflare/nodejs/storage.js:
- Around line 151-176: download_stream has inconsistent handling of missing
objects: it returns null when R2 returns no object (obj == null) but throws if
no backend or if an R2 object lacks arrayBuffer()/text(); make the behavior
consistent by choosing one policy (prefer returning null for missing content)
and apply it across all branches: in download_stream, replace the final throw
and the fallback for objects without arrayBuffer/text with returning null, and
ensure the local filesystem branch also returns null rather than throwing when
the file is absent; update logic around instance.handle, obj,
obj.arrayBuffer()/obj.text(), and the fs/localPath check so every "not found"
path returns null and only unexpected failures (e.g., actual IO errors) throw.
- Around line 87-89: The download method uses this.download_stream instead of
the resolved singleton instance; update the call in async download(__bucket,
key, filepath) to use the resolved instance (storage.instance || this) by
replacing this.download_stream with instance.download_stream so the singleton
pattern is respected (refer to download and download_stream and
storage.instance).
In @benchmarks/wrappers/cloudflare/python/container/handler.py:
- Around line 135-137: The code computes an unused timing variable: it sets end
= datetime.datetime.now().timestamp() and compute_time = end - begin but never
uses compute_time; either remove those two lines (end and compute_time) or
include compute_time in the response/log (for example attach it to the response
JSON or a response header). Locate the timing code that uses begin, end and
compute_time in handler.py and either delete the end/compute_time assignments or
propagate compute_time into the outgoing response or a logging call (keeping the
begin variable if still needed elsewhere).
In @benchmarks/wrappers/cloudflare/python/container/storage.py:
- Around line 71-77: Add a timeout to the urllib.request.urlopen call and chain
the raised RuntimeError to preserve the original traceback: pass a sensible
timeout (e.g., 10 seconds) into the urllib.request.urlopen(req) invocation so
the call cannot hang indefinitely, and change the error re-raise to use "raise
RuntimeError(...)" with "from e" to chain the original exception thrown in the
except block referencing the existing exception variable e and the current
RuntimeError raise site.
- Around line 138-144: In _upload_with_key wrap the urlopen call with an
explicit timeout (pass the same timeout constant used elsewhere, e.g.,
timeout=DEFAULT_TIMEOUT or self._timeout) — change urllib.request.urlopen(req)
to urllib.request.urlopen(req, timeout=...) — and when re-raising the failure
convert to RuntimeError using exception chaining (raise RuntimeError(f"Failed to
upload to R2: {e}") from e) so the original traceback is preserved; keep the
existing log/print of the error and include the request/ key context if
available.
- Around line 91-101: The download_stream function currently calls
urllib.request.urlopen(url) without a timeout and re-raises RuntimeError without
exception chaining; update the call to pass the same timeout value used in
upload (e.g., urlopen(url, timeout=TIMEOUT)) and in both except blocks
(urllib.error.HTTPError and generic Exception) raise the RuntimeError using
exception chaining (raise RuntimeError(...) from e) so the original traceback is
preserved; reference the download_stream function, urllib.request.urlopen,
urllib.error.HTTPError, and the exception variable e when making these edits.
- Around line 168-201: The list request in download_directory lacks a timeout
and the raised RuntimeError loses the original exception context; add a timeout
to urllib.request.urlopen(req, timeout=...) when listing objects and ensure each
download call uses a timeout (update self.download signature to accept/forward a
timeout parameter or pass a configured timeout value), and re-raise the error
with exception chaining using "raise RuntimeError(f'Failed to download
directory: {e}') from e" so the original traceback is preserved.
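The timeout and exception-chaining fixes requested above for this file follow one pattern; a compact Python sketch (fetch_json and DEFAULT_TIMEOUT are illustrative names, and 30 seconds is only an example value):

```python
import json
import urllib.error
import urllib.request

DEFAULT_TIMEOUT = 30  # seconds; illustrative value

def fetch_json(url: str) -> dict:
    req = urllib.request.Request(url, headers={"User-Agent": "SeBS/1.2"})
    try:
        # The timeout bounds how long the call may block on connect/read.
        with urllib.request.urlopen(req, timeout=DEFAULT_TIMEOUT) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as e:
        # `from e` preserves the original traceback for debugging.
        raise RuntimeError(f"HTTP error {e.code} while fetching {url}") from e
    except Exception as e:
        raise RuntimeError(f"Failed to fetch {url}: {e}") from e
```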
In @benchmarks/wrappers/cloudflare/python/nosql.py:
- Around line 115-122: The class nosql_kv has incorrect references to nosql
instead of itself; in the static method init_instance you should instantiate and
assign nosql_kv.instance = nosql_kv() and set that instance's env (e.g.,
nosql_kv.instance.env = entry.env) rather than referencing nosql; update the
init_instance method (and any other places in this class referencing nosql) to
use the correct class name nosql_kv and the instance class attribute instance.
- Around line 8-15: The class uses the wrong identifier `nosql` instead of the
actual class `nosql_do`; change the type annotation and all
assignments/instantiation to refer to `nosql_do` (e.g., replace
`Optional["nosql"]` with `Optional["nosql_do"]`, replace `nosql.instance =
nosql()` with `nosql_do.instance = nosql_do()`) and ensure the binding
assignment uses `nosql_do.DO_BINDING_NAME` and sets `nosql_do.instance.binding =
getattr(entry.env, nosql_do.DO_BINDING_NAME)`.
- Around line 203-207: The get_instance method in class nosql_kv is creating the
wrong class: it calls nosql() (the nosql_do class) instead of instantiating
nosql_kv(); update the body of get_instance to set nosql_kv.instance = nosql_kv()
when the instance is None and return nosql_kv.instance, ensuring the singleton
references the nosql_kv class rather than nosql.
- Around line 190-195: The code currently does a fragile get_res.replace("'",
"\"") on the result of run_sync(self.get_table(table_name).get(key)) before
json.loads; remove this blind replacement and instead ensure the stored value is
valid JSON (fix insert/update to write proper JSON), or if the backend returns a
Python literal use ast.literal_eval on get_res and then convert to JSON/dict, or
if get() can return an already-parsed object skip json.loads entirely; update
the logic around run_sync(self.get_table(table_name).get(key)), get_res, and the
json.loads call to handle proper parsing without string-replacement.
- Around line 161-168: The get method in nosql_kv returns the raw result instead
of a parsed dict; update nosql_kv.get to mirror query by checking for a
None/empty response and calling json.loads on the returned string before
returning it. Locate the get method that calls
run_sync(self.get_table(table_name).get(self.key_maker(primary_key,
secondary_key))) and replace the direct return with logic that handles a None
result and returns json.loads(get_res) (or the already-parsed dict) so callers
always receive a dict or None.
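A small Python sketch of the parsing strategy suggested in the two items above (parse_stored_value is a hypothetical helper, not a function in the wrapper): prefer JSON, fall back to ast.literal_eval for Python-literal strings, and pass through values that are already parsed.

```python
import ast
import json
from typing import Any, Optional

def parse_stored_value(raw: Any) -> Optional[dict]:
    if raw is None:
        return None
    if isinstance(raw, dict):
        return raw  # backend already returned a parsed object
    try:
        return json.loads(raw)
    except (TypeError, json.JSONDecodeError):
        # Value was written with repr()/str(), e.g. "{'a': 1}"
        return ast.literal_eval(raw)

print(parse_stored_value('{"a": 1}'))
print(parse_stored_value("{'a': 1}"))
print(parse_stored_value(None))
```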
In @benchmarks/wrappers/cloudflare/python/storage.py:
- Line 60: The code attempts to call append on written_files which is
initialized as a set (self.written_files) causing an AttributeError; change the
call in the method that currently does self.written_files.append(filepath) to
use the set API (self.written_files.add(filepath)) or, if order/duplicates
matter, change the initializer from set() to [] and keep append—prefer using
self.written_files.add(filepath) to preserve the existing set semantics.
- Around line 111-113: get_instance currently uses invalid syntax raise "must
init storage singleton first" and is missing the @staticmethod decorator; fix by
adding @staticmethod above def get_instance() and replace the string raise with
raising a proper Exception (e.g., raise RuntimeError("must init storage
singleton first")) so the method correctly signals the error when
storage.instance is None; keep the method body referencing storage.instance as
written.
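Putting the two storage.py fixes above together, a minimal sketch of the corrected singleton (simplified, not the wrapper's full class):

```python
from typing import Optional, Set

class storage:
    instance: Optional["storage"] = None

    def __init__(self) -> None:
        self.written_files: Set[str] = set()

    def record_write(self, filepath: str) -> None:
        self.written_files.add(filepath)  # set API, not list.append

    @staticmethod
    def init_instance() -> None:
        storage.instance = storage()

    @staticmethod
    def get_instance() -> "storage":
        if storage.instance is None:
            raise RuntimeError("must init storage singleton first")
        return storage.instance
```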
In @config/cloudflare-test.json:
- Around line 2-13: The runtime block currently declares `"runtime": {
"language": "nodejs", "version": "18" }` which is incorrect for Cloudflare
Workers; update the experiments.runtime configuration by removing the Node.js
language/version pair and instead set the Cloudflare Workers compatibility flags
(e.g., add `compatibility_date` with a date >= "2024-09-23" and `nodejs_compat:
true` when Node.js APIs are required), keeping the `"experiments"` object and
other fields intact and ensuring `nodejs_compat` is documented in the config
comment or README for future maintainers.
In @config/systems.json:
- Around line 389-391: The UUID version in config/systems.json ("packages": {
"uuid": "3.4.0" }) conflicts with the compression benchmark's package.json which
uses "uuid": "^10.0.0"; pick one canonical version and update the other to match
(either change config/systems.json uuid to "^10.0.0" or change the benchmark's
package.json uuid to "3.4.0") so all deployments use the same uuid package
version across providers; ensure you update the package.json or config entry and
run dependency resolution/tests to confirm compatibility.
In @experiments.json:
- Around line 52-54: The committed experiments.json exposes a real credential
value under the "credentials" -> "account_id" field; remove the sensitive value
from the PR and either delete the file from the branch or replace the value with
a placeholder like "<YOUR_ACCOUNT_ID>" in experiments.json, then add
experiments.json to .gitignore (or move this test output to a non-committed
sample file) to prevent future commits; ensure you also purge the secret from
the PR history or force-push a branch without the file if necessary.
In @package.json:
- Line 1: Delete the empty root package.json file (remove package.json from the
repo) because this project uses isolated benchmark packages; if a root
package.json is actually required, replace it with a meaningful file containing
the intended configuration or add a clear explanation in the repo README
indicating its purpose instead of leaving an empty {}.
In @sebs/cloudflare/cloudflare.py:
- Around line 969-981: The final line returning (directory, total_size, "") is
unreachable (there is an earlier return image_tag) and uses undefined
total_size; remove this leftover return statement or, if the intent was to
return size info, move calculation/definition of total_size into the same scope
and replace the earlier return image_tag accordingly. In short: delete the
unreachable return (directory, total_size, "") from the end of the function in
cloudflare.py (or consolidate/define total_size and adjust the earlier return to
a single consistent return value) so no undefined variable remains.
- Line 148: Credential verification and other HTTP calls in cloudflare.py are
missing timeouts which can cause indefinite hangs; define a default timeout
constant (e.g., DEFAULT_REQUEST_TIMEOUT = 10) near the module top and pass it to
all requests calls (e.g., change
requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers) to
requests.get(..., headers=headers, timeout=DEFAULT_REQUEST_TIMEOUT)) including
the worker-exists check and the workers.dev subdomain fetch, and update any
requests.post/put/delete calls similarly; optionally wrap these calls to catch
requests.exceptions.Timeout/RequestException where appropriate.
- Line 73: The initialize method uses a mutable default argument (config:
Dict[str, str] = {}); change the signature to use None as the default (e.g.,
config: Optional[Dict[str, str]] = None) and inside the initialize method set
config = {} if config is None to avoid sharing the same dict across calls;
update any type annotations/uses accordingly and ensure references to initialize
and its parameter name remain consistent.
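The last two items above (request timeouts and the mutable default argument) boil down to a small amount of code; a hedged sketch, with DEFAULT_REQUEST_TIMEOUT and verify_credentials as illustrative names rather than the module's actual API:

```python
from typing import Dict, Optional
import requests

DEFAULT_REQUEST_TIMEOUT = 10  # seconds

def initialize(config: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    # A fresh dict per call; a `{}` default would be shared across calls.
    if config is None:
        config = {}
    return config

def verify_credentials(api_base_url: str, headers: Dict[str, str]) -> bool:
    try:
        response = requests.get(
            f"{api_base_url}/user/tokens/verify",
            headers=headers,
            timeout=DEFAULT_REQUEST_TIMEOUT,
        )
    except requests.exceptions.RequestException as e:
        raise RuntimeError(f"Credential verification failed: {e}") from e
    return response.status_code == 200
```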
In @sebs/cloudflare/durable_objects.py:
- Around line 100-113: The update_cache method uses a non-existent private
attribute self._cache_client which will raise AttributeError; change references
in update_cache to the public property self.cache_client (i.e., call
self.cache_client.update_nosql(...)) so it uses the NoSQLStorage-exposed client;
verify any other uses in the DurableObjects class for similar _cache_client
references and replace them with cache_client to remain consistent with the
parent API.
- Around line 185-203: The method remove_table mutates the inner dict while
iterating over tables.items(), which can raise RuntimeError; change the logic in
remove_table to first locate the matching benchmark and table_key without
deleting during iteration (e.g., iterate over a snapshot like
list(tables.items()) or record table_key to delete), then perform the deletion
after the loop; update references to self._tables, the local variable tables,
and the inner loop that checks table_name == name to reflect this two-step
find-then-delete approach and keep the logging/return behavior unchanged.
- Around line 30-39: The constructor __init__ accepts a credentials:
CloudflareCredentials but never stores it, causing _get_auth_headers to
reference missing self._credentials; fix by assigning the passed credentials to
an instance attribute (self._credentials = credentials) in __init__ so
_get_auth_headers can use it, or if credentials truly aren't needed remove the
credentials parameter and any references to _get_auth_headers; update only the
DurableObjects class __init__ and related references (_credentials and
_get_auth_headers) to keep consistency.
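For the remove_table item above, a standalone Python sketch of the two-step find-then-delete approach (using a plain nested dict in place of self._tables):

```python
from typing import Dict

def remove_table(tables: Dict[str, Dict[str, str]], benchmark: str, name: str) -> bool:
    # Pass 1: locate the key without mutating while iterating.
    key_to_delete = None
    for table_key, table_name in list(tables.get(benchmark, {}).items()):
        if table_name == name:
            key_to_delete = table_key
            break
    # Pass 2: delete after iteration has finished.
    if key_to_delete is not None:
        del tables[benchmark][key_to_delete]
        return True
    return False

tables = {"130.crud-api": {"products": "products-table"}}
print(remove_table(tables, "130.crud-api", "products-table"))
```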
In @sebs/cloudflare/function.py:
- Around line 56-62: The deserializer uses hardcoded keys "Library" and "HTTP"
that don't match the serialized trigger type strings; replace the static mapping
with the actual serialized type names by using LibraryTrigger.typename() and
HTTPTrigger.typename() (e.g. mapping = {LibraryTrigger.typename():
LibraryTrigger, HTTPTrigger.typename(): HTTPTrigger}) and then lookup
trigger_type = mapping.get(trigger["type"]); keep the assert and call
ret.add_trigger(trigger_type.deserialize(trigger)) as before so deserialization
succeeds.
In @sebs/cloudflare/r2.py:
- Around line 81-84: The requests calls in _create_bucket, list_buckets, and
remove_bucket are missing timeouts and can hang indefinitely; update each
requests.<method> call (e.g., the POST in _create_bucket, the GET in
list_buckets, and the DELETE in remove_bucket) to pass a timeout argument (use a
shared constant or self._timeout if available, e.g., timeout=DEFAULT_TIMEOUT)
and ensure the same timeout is applied consistently across all requests calls in
this class; optionally, wrap the calls to catch requests.exceptions.Timeout and
handle/log it in the respective methods (e.g., _create_bucket, list_buckets,
remove_bucket).
- Around line 13-14: The typename() function returns the misspelled string
"Cloudlfare.R2"; update the return value in typename() to the correct service
identifier "Cloudflare.R2" so logs and identifiers use the correct spelling.
- Around line 58-60: The _create_bucket method uses a mutable default for the
buckets parameter (buckets: list[str] = []), which can cause shared-state bugs;
change the signature to accept None (e.g., buckets: Optional[list[str]] = None)
and inside _create_bucket initialize buckets = [] if buckets is None, or update
callers to always pass an explicit list, ensuring the function no longer relies
on a shared default list instance.
In @sebs/cloudflare/README.md:
- Around line 105-118: The "Container Deployment" limitation in the README is
outdated; update the "Container Deployment" and "Container Support Architecture"
sections to reflect that container support has been implemented: remove the "Not
currently implemented" limitation, document that Node.js and Python container
benchmark wrappers were added, and describe the implemented components
(container orchestrator, worker proxy for R2/Durable Objects, and provided
Dockerfiles such as the python Dockerfile) rather than listing what "would be
required"; also remove or update any references to a missing CloudflareContainer
class or missing package_code()/create_function() changes to accurately describe
the current implementation and where container integration is located.
In @sebs/cloudflare/triggers.py:
- Around line 136-142: The async_invoke method currently creates a new
ThreadPoolExecutor on every call (pool = ThreadPoolExecutor()) and never shuts
it down, leaking threads; change this to use a shared executor (e.g., a
module-level or Cloudflare trigger class attribute like _executor) and have
async_invoke submit to that shared executor instead of creating a new one, and
ensure the executor is cleanly shutdown when the application or trigger instance
is closed (implement a close()/shutdown() method or __del__ to call
_executor.shutdown(wait=True)); keep the call to submit(self.sync_invoke,
payload) and the returned Future unchanged.
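A self-contained sketch of the shared-executor pattern described above (HTTPTriggerSketch is illustrative; sync_invoke is stubbed out):

```python
import concurrent.futures
from typing import Optional

class HTTPTriggerSketch:
    _executor: Optional[concurrent.futures.ThreadPoolExecutor] = None

    @classmethod
    def executor(cls) -> concurrent.futures.ThreadPoolExecutor:
        if cls._executor is None:
            cls._executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)
        return cls._executor

    def sync_invoke(self, payload: dict) -> dict:
        return {"echo": payload}  # stand-in for the real HTTP invocation

    def async_invoke(self, payload: dict) -> concurrent.futures.Future:
        # Reuse the shared pool instead of creating (and leaking) a new one.
        return self.executor().submit(self.sync_invoke, payload)

    @classmethod
    def shutdown(cls) -> None:
        if cls._executor is not None:
            cls._executor.shutdown(wait=True)
            cls._executor = None

t = HTTPTriggerSketch()
print(t.async_invoke({"n": 1}).result())
HTTPTriggerSketch.shutdown()
```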
🟠 Major comments (18)
benchmarks/wrappers/cloudflare/nodejs/container/package.json-7-9 (1)
7-9: Update version constraint to match published releases; ^1.0.0 has no available versions.
The @cloudflare/containers package exists and is an official Cloudflare package, but the version constraint "^1.0.0" is incorrect. The latest published version is 0.0.31, and no version 1.0.0 or higher exists. Change the version to a valid range such as "^0.0.20" or the specific latest "0.0.31" to ensure npm install succeeds during container builds.
experiments.json-52-54 (1)
52-54: Credentials exposed in committed file.
The account_id value appears to be a real Cloudflare account identifier. This file looks like test output data that should not be committed to the repository. Consider:
- Adding experiments.json to .gitignore
- Removing this file from the PR
- If this is intentional sample data, replace the account_id with a placeholder like "<YOUR_ACCOUNT_ID>"
benchmarks/300.utilities/311.compression/nodejs/function.js-108-113 (1)
108-113: Archive created inside source directory may include itself.
The archive is created at path.join(download_path, archive_name), which is inside the download_path being archived. If collectFiles encounters the partially-written archive file during traversal, it will attempt to include it, leading to corrupted output or unpredictable behavior.
🔎 Proposed fix: Create archive in a separate location
  // Compress directory
  const compress_begin = Date.now();
  const archive_name = `${key}.tar.gz`;
- const archive_path = path.join(download_path, archive_name);
+ const archive_path = path.join('/tmp', `${archive_name}-${uuidv4()}`);
  await createTarGzArchive(download_path, archive_path);
  const compress_end = Date.now();
Committable suggestion skipped: line range outside the PR's diff.
benchmarks/100.webapps/120.uploader/python/function_cloudflare.py-8-9 (1)
8-9: Module-level storage initialization may fail.
storage.storage.get_instance() is called at module import time. If the storage singleton hasn't been initialized via init_instance() yet, this will raise an exception (once the bug in storage.py is fixed). Consider lazy initialization inside the handler.
🔎 Proposed fix: Lazy initialization
  from . import storage
- client = storage.storage.get_instance()
+ client = None
  SEBS_USER_AGENT = "SeBS/1.2 (https://github.com/spcl/serverless-benchmarks) SeBS Benchmark Suite/1.2"

  async def do_request(url, download_path):
      headers = {'User-Agent': SEBS_USER_AGENT}
      res = await pyfetch(url, headers=headers)
      bs = await res.bytes()
      with open(download_path, 'wb') as f:
          f.write(bs)

  def handler(event):
+     global client
+     if client is None:
+         client = storage.storage.get_instance()
      bucket = event.get('bucket').get('bucket')
Committable suggestion skipped: line range outside the PR's diff.
benchmarks/300.utilities/311.compression/nodejs/package.json-6-8 (1)
6-8: Remove Node.js 14 support or downgrade uuid dependency.
The uuid v10.0.0 dependency requires Node.js 16 or newer, but systems.json configures Node.js 14 as a supported version, creating an incompatibility. Either remove Node.js 14 from the supported runtimes or downgrade uuid to a version compatible with Node.js 14 (e.g., v9.x).
benchmarks/wrappers/cloudflare/nodejs/container/worker.js-31-38 (1)
31-38: query method ignores keyPrefix parameter.
The query method accepts a keyPrefix parameter but doesn't use it. It returns all items from storage instead of filtering by the prefix. This could cause incorrect behavior when querying for specific subsets of data.
🔎 Proposed fix
  async query(keyPrefix) {
-   const list = await this.ctx.storage.list();
+   const list = await this.ctx.storage.list({ prefix: keyPrefix });
    const items = [];
    for (const [k, v] of list) {
      items.push(v);
    }
    return { items };
  }
benchmarks/wrappers/cloudflare/nodejs/container/nosql.js-39-48 (1)
39-48: Error handling consumes the response body twice.
response.json() consumes the response body. If it succeeds, the fallback await response.text() on the same line will fail because the body stream has already been consumed. If response.json() throws (lines 44-46), response.text() is called in the catch block, which will also fail for the same reason.
🔎 Proposed fix
  if (!response.ok) {
-   let errorMsg;
-   try {
-     const errorData = await response.json();
-     errorMsg = errorData.error || await response.text();
-   } catch {
-     errorMsg = await response.text();
-   }
+   const errorText = await response.text();
+   let errorMsg = errorText;
+   try {
+     const errorData = JSON.parse(errorText);
+     errorMsg = errorData.error || errorText;
+   } catch {
+     // Keep errorText as-is
+   }
    throw new Error(`NoSQL operation failed: ${errorMsg}`);
  }
benchmarks/wrappers/cloudflare/python/container/handler.py-148-150 (1)
148-150: Fix platform-dependent ru_maxrss calculation for macOS compatibility.
resource.getrusage().ru_maxrss returns different units across platforms:
- Linux: kilobytes
- macOS/BSD: bytes
The current code divides by 1024.0, which correctly converts Linux kilobytes to megabytes but produces kilobytes instead of megabytes on macOS. Use platform detection to apply the correct conversion factor:
- Linux: divide by 1024
- macOS/BSD: divide by 1,048,576
This same issue exists in benchmarks/wrappers/cloudflare/python/handler.py line 111.
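A minimal sketch of the platform-aware conversion described in this comment (the helper name is illustrative):

```python
import resource
import sys

def max_rss_megabytes() -> float:
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        return rss / (1024.0 * 1024.0)  # macOS/BSD report bytes -> MB
    return rss / 1024.0                 # Linux reports kilobytes -> MB

print(f"peak RSS: {max_rss_megabytes():.2f} MB")
```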
64-130: Regex-based async transformation will break constructor calls and violates esbuild best practices.The
asyncNosqlPluginuses regex patterns to transform JavaScript code, which causes two critical issues:
Constructor calls fail: The pattern at lines 97–116 matches
new UserClass(...)and transforms it tonew await UserClass(...), which is invalid JavaScript syntax. Thenewkeyword is neither in thecontrolFlowlist nor handled as a special case.All functions become async: Line 83 converts every function declaration to
async, which breaks:
- Functions returning non-Promise values
- Callbacks and synchronous contexts
- Generators and function-as-value patterns
Anti-pattern for esbuild: This violates esbuild plugin best practices. According to esbuild documentation, AST-heavy transforms should use a proper parser (Babel, SWC, or Recast) as a pre/post-build step rather than regex mutation inside the plugin.
Replace the regex-based transformation with either:
- An external AST transform step using @babel/parser + @babel/traverse before esbuild
- A dedicated transformer service (SWC, Recast) that preserves code semantics
- Or limit the plugin to simple text replacements on explicitly marked sections
sebs/cloudflare/durable_objects.py-30-39 (1)
30-39: credentials parameter is unused.
The credentials parameter is passed to __init__ but never stored. The _get_auth_headers method references self._credentials which doesn't exist. Either store the credentials or remove the parameter and the unused method.
🔎 Proposed fix
  def __init__(
      self,
      region: str,
      cache_client: Cache,
      resources: Resources,
      credentials: CloudflareCredentials,
  ):
      super().__init__(region, cache_client, resources)
+     self._credentials = credentials
      # Tables are just logical names - Durable Objects are accessed via Worker bindings
      self._tables: Dict[str, Dict[str, str]] = defaultdict(dict)
benchmarks/wrappers/cloudflare/python/container/storage.py-71-77 (1)
71-77: Add timeout to urlopen calls to prevent indefinite hangs.
Network calls without timeouts can hang indefinitely if the worker proxy is unresponsive, causing benchmark runs to stall. Also, chain the exception with from e for a proper traceback.
🔎 Proposed fix
  try:
-     with urllib.request.urlopen(req) as response:
+     with urllib.request.urlopen(req, timeout=30) as response:
          result = json.loads(response.read().decode('utf-8'))
          return result['key']
  except Exception as e:
      print(f"R2 upload error: {e}")
-     raise RuntimeError(f"Failed to upload to R2: {e}")
+     raise RuntimeError(f"Failed to upload to R2: {e}") from e
benchmarks/wrappers/cloudflare/python/container/storage.py-138-144 (1)
138-144: Add timeout and exception chaining to _upload_with_key. Consistent with other methods.
🔎 Proposed fix
  try:
-     with urllib.request.urlopen(req) as response:
+     with urllib.request.urlopen(req, timeout=30) as response:
          result = json.loads(response.read().decode('utf-8'))
          print(f"[storage._upload_with_key] Upload successful, key={result['key']}")
  except Exception as e:
      print(f"R2 upload error: {e}")
-     raise RuntimeError(f"Failed to upload to R2: {e}")
+     raise RuntimeError(f"Failed to upload to R2: {e}") from e
benchmarks/wrappers/cloudflare/python/container/storage.py-91-101 (1)
91-101: Add timeout and exception chaining to download_stream. Same issue as upload - missing timeout and exception chaining.
🔎 Proposed fix
  try:
-     with urllib.request.urlopen(url) as response:
+     with urllib.request.urlopen(url, timeout=30) as response:
          return response.read()
  except urllib.error.HTTPError as e:
      if e.code == 404:
-         raise RuntimeError(f"Object not found: {key}")
+         raise RuntimeError(f"Object not found: {key}") from e
      else:
-         raise RuntimeError(f"Failed to download from R2: {e}")
+         raise RuntimeError(f"Failed to download from R2: {e}") from e
  except Exception as e:
      print(f"R2 download error: {e}")
-     raise RuntimeError(f"Failed to download from R2: {e}")
+     raise RuntimeError(f"Failed to download from R2: {e}") from e
benchmarks/wrappers/cloudflare/python/container/storage.py-168-201 (1)
168-201: Add timeout to download_directory list request and chain exception. The list operation and subsequent downloads should have timeouts.
🔎 Proposed fix
  try:
      req = urllib.request.Request(list_url)
      req.add_header('User-Agent', 'SeBS/1.2 (https://github.com/spcl/serverless-benchmarks) SeBS Benchmark Suite/1.2')
-     with urllib.request.urlopen(req) as response:
+     with urllib.request.urlopen(req, timeout=30) as response:
          # ... rest of code ...
  except Exception as e:
      print(f"Error listing/downloading directory: {e}")
-     raise RuntimeError(f"Failed to download directory: {e}")
+     raise RuntimeError(f"Failed to download directory: {e}") from e
sebs/cloudflare/triggers.py-136-142 (1)
136-142: Thread pool executor leak: new pool created per invocation.
Each call to async_invoke creates a new ThreadPoolExecutor that is never shut down, leading to resource leakage. Use a shared executor or ensure proper cleanup.
🔎 Proposed fix
+ _executor = None
+
+ def _get_executor():
+     global _executor
+     if _executor is None:
+         _executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)
+     return _executor
+
  class HTTPTrigger(Trigger):
      # ... existing code ...
      def async_invoke(self, payload: dict) -> concurrent.futures.Future:
          """
          Asynchronously invoke a Cloudflare Worker via HTTP.
          """
-         pool = concurrent.futures.ThreadPoolExecutor()
-         fut = pool.submit(self.sync_invoke, payload)
+         fut = _get_executor().submit(self.sync_invoke, payload)
          return fut
sebs/cloudflare/cloudflare.py-148-148 (1)
148-148: Add timeouts to requests calls to prevent indefinite hangs.
Multiple HTTP requests lack timeouts:
- Line 148: credential verification
- Line 1069: checking if worker exists
- Line 1259: fetching workers.dev subdomain
🔎 Proposed fixes
  # Line 148
- response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers)
+ response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers, timeout=30)
  # Line 1069
- response = requests.get(url, headers=headers)
+ response = requests.get(url, headers=headers, timeout=30)
  # Line 1259
- resp = requests.get(url, headers=headers)
+ resp = requests.get(url, headers=headers, timeout=30)
Also applies to: 1069-1069, 1259-1259
sebs/cloudflare/r2.py-81-84 (1)
81-84: Add timeouts to all requests calls to prevent indefinite hangs.
HTTP requests without timeouts can block indefinitely if the server is unresponsive, causing the application to hang. This is flagged in multiple locations: _create_bucket (line 82), list_buckets (line 294), and remove_bucket (line 363).
🔎 Proposed fix for line 82
  create_bucket_response = requests.post(
-     create_bucket_uri, json=params, headers=self._get_auth_headers()
+     create_bucket_uri, json=params, headers=self._get_auth_headers(), timeout=30
  )
🔎 Proposed fix for line 294
- response = requests.get(list_buckets_uri, headers=self._get_auth_headers())
+ response = requests.get(list_buckets_uri, headers=self._get_auth_headers(), timeout=30)
🔎 Proposed fix for line 363
- response = requests.delete(delete_bucket_uri, headers=self._get_auth_headers())
+ response = requests.delete(delete_bucket_uri, headers=self._get_auth_headers(), timeout=30)
Also applies to: 293-294, 362-364
benchmarks/wrappers/cloudflare/python/nosql.py-190-195 (1)
190-195: Fragile string manipulation for JSON parsing.
Replacing single quotes with double quotes (get_res.replace("\'", "\"")) is a fragile approach that will break if the data contains escaped quotes or actual single-quote characters in values.
🔎 Proposed fix
If the data is stored as JSON, it should already use double quotes. If it's stored with single quotes, consider:
- Ensuring data is stored as valid JSON in insert/update
- Using ast.literal_eval for Python literal syntax
- Storing raw JSON without the quote replacement
  get_res = run_sync(self.get_table(table_name).get(key))
- get_res = get_res.replace("\'", "\"")
- ##print("gr", get_res)
-
- res.append(json.loads(get_res))
+ if get_res is not None:
+     res.append(json.loads(get_res))
🟡 Minor comments (16)
config/cloudflare-test.json-2-13 (1)
2-13: Clarify the runtime configuration for Cloudflare Workers.
Cloudflare Workers do not run a full Node.js runtime; they run a V8-based Workers runtime. The "language": "nodejs" specification with "version": "18" does not reflect how Cloudflare Workers function. If Node.js APIs are required, enable the nodejs_compat compatibility flag (requires compatibility_date >= 2024-09-23) instead of specifying a language/version pair.
package.json-1-1 (1)
1-1: Remove the empty root package.json.
This repository uses isolated, self-contained benchmark packages in subdirectories and has no monorepo or workspace configuration that would justify a root package.json file. Adding an empty {} file serves no purpose and will only create confusion for developers and potentially cause unexpected behavior with Node.js tooling. If this file is genuinely needed for a specific purpose, clarify its use; otherwise, remove it.
config/systems.json-389-391 (1)
389-391: UUID version inconsistency with benchmark package.json.
The Cloudflare Node.js deployment specifies uuid: "3.4.0", but the compression benchmark's package.json specifies uuid: "^10.0.0". This version mismatch may cause conflicts or unexpected behavior during deployment.
🔎 Suggested fix: Align versions
Either update this to match the benchmark:
  "packages": {
-     "uuid": "3.4.0"
+     "uuid": "^10.0.0"
  }
Or update the benchmark's package.json to use 3.4.0 for consistency with other providers (AWS also uses 3.4.0).
sebs/cloudflare/README.md-105-118 (1)
105-118: Documentation inconsistency: Container support status is unclear.
This section states that "Container Deployment: Not currently implemented" and lists it as a limitation. However, the PR description explicitly states:
- "Adds Node.js and Python container benchmark wrappers."
- "Adds Node.js and Python container benchmark wrappers."
- "Added container support for Python and Node.js, container orchestrator and worker proxy for R2/Durable Objects."
Additionally, this PR includes dockerfiles/cloudflare/python/Dockerfile, suggesting container support is at least partially implemented.
Please update this documentation section to accurately reflect the current implementation status. If containers are implemented, remove this from limitations and update lines 299-332 (Container Support Architecture) to reflect what was actually built rather than what "would be required."
benchmarks/wrappers/cloudflare/python/container/handler.py-135-137 (1)
135-137: Remove unused compute_time variable.
The compute_time variable is calculated but never used. Either use it in the response or remove it.
🔎 Proposed fix
  # Calculate timing
  end = datetime.datetime.now().timestamp()
- compute_time = end - begin
309-324: Remove debug logging statements.
The R2 upload handler contains multiple console.log debug statements (lines 311-312, 314, 319-320) that should be removed before merging to production.
🔎 Proposed fix
  } else if (url.pathname === '/r2/upload') {
      // Upload to R2
-     console.log(`[worker.js /r2/upload] bucket=${bucket}, key=${key}`);
-     console.log(`[worker.js /r2/upload] env.R2 exists:`, !!env.R2);
      const data = await request.arrayBuffer();
-     console.log(`[worker.js /r2/upload] Received ${data.byteLength} bytes`);
      // Use the key as-is (container already generates unique keys if needed)
      try {
          const putResult = await env.R2.put(key, data);
-         console.log(`[worker.js /r2/upload] R2.put() returned:`, putResult);
-         console.log(`[worker.js /r2/upload] Successfully uploaded to R2 with key=${key}`);
      } catch (error) {
          console.error(`[worker.js /r2/upload] R2.put() error:`, error);
          throw error;
      }
benchmarks/wrappers/cloudflare/nodejs/container/nosql.js-86-99 (1)
86-99: Remove debug logging statements.
The query method contains multiple console.error debug statements that should be removed before merging. These add noise to production logs.
🔎 Proposed fix
  async query(tableName, primaryKey, secondaryKeyName) {
      const params = {
          table_name: tableName,
          primary_key: primaryKey,
          secondary_key_name: secondaryKeyName,
      };
      const result = await this._make_request('query', params);
-     console.error(`[nosql.query] result:`, JSON.stringify(result));
-     console.error(`[nosql.query] result.items:`, result.items);
-     console.error(`[nosql.query] Array.isArray(result.items):`, Array.isArray(result.items));
      const items = result.items || [];
-     console.error(`[nosql.query] returning items:`, items);
      return items;
  }
benchmarks/wrappers/cloudflare/nodejs/nosql.js-62-66 (1)
62-66: Race condition in update: concurrent updates may lose data.
The read-modify-write pattern in update is not atomic. If two concurrent updates occur, one may overwrite the other's changes. Consider documenting this limitation or using Durable Object transactions if available.
benchmarks/wrappers/cloudflare/nodejs/container/storage.js-15-18 (1)
15-18: Duplicate static field declaration: worker_url defined twice.
This is flagged by static analysis and will cause the second declaration to overwrite the first (which is redundant but may indicate a copy-paste error).
🔎 Proposed fix
  static worker_url = null; // Set by handler from X-Worker-URL header
-
-
- static worker_url = null; // Set by handler from X-Worker-URL header
benchmarks/wrappers/cloudflare/python/container/storage.py-146-151 (1)
146-151: Potential issue: download doesn't handle /tmp prefix like the worker version.
Comparing with benchmarks/wrappers/cloudflare/python/storage.py (lines 51-59), the worker version redirects writes to /tmp and tracks written files. This container version creates directories but doesn't ensure the path is under /tmp, which could fail in restricted environments.
🔎 Proposed fix
  def download(self, bucket, key, filepath):
      """Download file to disk"""
      data = self.download_stream(bucket, key)
+     # Ensure writes go to /tmp in container environment
+     if not filepath.startswith('/tmp'):
+         filepath = '/tmp' + os.path.abspath(filepath)
      os.makedirs(os.path.dirname(filepath), exist_ok=True)
      with open(filepath, 'wb') as f:
          f.write(data)
benchmarks/wrappers/cloudflare/python/nosql.py-161-168 (1)
161-168: get method in nosql_kv returns raw result instead of parsed JSON.
Unlike query, which calls json.loads, the get method returns the raw result without parsing. This inconsistency could cause type errors for callers expecting a dict.
🔎 Proposed fix
  def get(
      self, table_name: str, primary_key: Tuple[str, str], secondary_key: Tuple[str, str]
  ) -> Optional[dict]:
      get_res = run_sync(
          self.get_table(table_name).get(
              self.key_maker(primary_key, secondary_key)
          ))
-     return get_res
+     if get_res is None:
+         return None
+     return json.loads(get_res)
sebs/cloudflare/r2.py-13-14 (1)
13-14: Typo in typename(): "Cloudlfare" should be "Cloudflare".
This typo will cause inconsistencies when the typename is used for logging or identification purposes.
🔎 Proposed fix
  @staticmethod
  def typename() -> str:
-     return "Cloudlfare.R2"
+     return "Cloudflare.R2"
benchmarks/wrappers/cloudflare/nodejs/storage.js-151-176 (1)
151-176: Inconsistent error handling: download_stream returns null for missing objects but then throws at line 175.
When R2 returns no object, the method returns null (line 156). However, if no backend is available, it throws an error (line 175). Additionally, if the object is found in R2 but has neither arrayBuffer nor text methods, it falls back to returning null even though the object exists. Consider aligning the behavior — either always return null for missing objects or always throw.
🔎 Proposed fix for clearer control flow
  async download_stream(__bucket, key) {
      const instance = storage.instance || this;
      if (instance.handle) {
          const obj = await instance.handle.get(key);
-         if (!obj) return null;
+         if (!obj) {
+             throw new Error(`download_stream(): object not found in R2: ${key}`);
+         }
          // R2 object provides arrayBuffer()/text() helpers in Workers
          if (typeof obj.arrayBuffer === 'function') {
              const ab = await obj.arrayBuffer();
              return Buffer.from(ab);
          }
          if (typeof obj.text === 'function') {
              return await obj.text();
          }
-         // Fallback: return null
-         return null;
+         throw new Error(`download_stream(): R2 object has no readable content: ${key}`);
      }
      // Fallback to local filesystem
      const localPath = path.join('/tmp', key);
      if (fs && fs.existsSync(localPath)) {
          return fs.readFileSync(localPath);
      }
      throw new Error('download_stream(): object not found');
  }
sebs/cloudflare/cloudflare.py-73-73 (1)
73-73: Avoid mutable default argument.
Using {} as a default argument is a Python pitfall — the same dict instance is shared across calls.
🔎 Proposed fix
- def initialize(self, config: Dict[str, str] = {}, resource_prefix: Optional[str] = None):
+ def initialize(self, config: Optional[Dict[str, str]] = None, resource_prefix: Optional[str] = None):
+     if config is None:
+         config = {}
87-89: Use instance instead of this for consistency with the singleton pattern.
Other methods use storage.instance || this, but download calls this.download_stream directly. For consistency and correctness when called on the singleton, use instance.
🔎 Proposed fix
  async download(__bucket, key, filepath) {
      const instance = storage.instance || this;
-     const data = await this.download_stream(__bucket, key);
+     const data = await instance.download_stream(__bucket, key);
sebs/cloudflare/r2.py-58-60 (1)
58-60: Avoid mutable default argument.
Using [] as a default argument is a common Python pitfall — the same list instance is shared across all calls if not overwritten.
🔎 Proposed fix
  def _create_bucket(
-     self, name: str, buckets: list[str] = [], randomize_name: bool = False
+     self, name: str, buckets: Optional[list[str]] = None, randomize_name: bool = False
  ) -> str:
+     if buckets is None:
+         buckets = []
      for bucket_name in buckets:
Actionable comments posted: 2
🤖 Fix all issues with AI Agents
In @sebs/cloudflare/README.md:
- Around line 280-285: Add a clear limitation note to the "Platform Limitations"
section stating that Cloudflare does not support cold start detection and all
invocations will report is_cold: false; reference the handler example's
hardcoded is_cold: false and warn users this makes cold-start metrics
incomparable to platforms like AWS Lambda/Azure Functions and may skew benchmark
comparisons.
- Around line 191-250: The README response schema and metrics extraction
description are wrong; update the docs to reflect the actual handler and
extractor: note that the Python handler
(benchmarks/wrappers/cloudflare/python/handler.py) returns extra unused fields
(results_time, is_cold_worker, container_id, environ_container_id) and that
measurements are nested at result.output['result']['measurement'] rather than
using the top-level compute_time; document that HTTPTrigger.sync_invoke in
sebs/cloudflare/triggers.py ignores top-level compute_time and instead reads
cpu_time_us or cpu_time_ms to populate provider_times.execution, wall_time_us or
wall_time_ms to populate times.benchmark, is_cold from the nested measurement,
and memory_used_mb for memory metrics; remove or stop claiming top-level
compute_time is used, and explicitly list which fields are consumed vs
returned-but-unused.
🧹 Nitpick comments (3)
sebs/cloudflare/README.md (3)
138-142: Clarify tier-specific limitations and document paid plan differences.
The memory and CPU limits are stated as "128MB memory, 50ms CPU time on free tier," but the README doesn't mention that paid plans have different limits. This could mislead users deploying to paid accounts.
Add a subsection under "Platform Limitations" clarifying:
- Free tier: 128MB memory, 50ms CPU time
- Paid plans: Different limits (specify actual limits)
- Link to Cloudflare's pricing/limits documentation
300-306: Specify Cloudflare's container image size limit instead of leaving it vague.
Line 305 states "Ensure container image size is under Cloudflare's limits" but doesn't specify the actual limit. This forces users to refer to external documentation.
If Cloudflare publishes a container image size limit, include it explicitly (e.g., "container images must be under X MB").
70-76: Minor wording: Line 76 is slightly awkward.
"This provides the behavior of SeBS to invoke serverless functions via either library or http triggers" is grammatically unclear. Consider rewording to: "This enables SeBS to invoke serverless functions via either library or HTTP triggers."
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
sebs/cloudflare/README.md
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: mcopik
Repo: spcl/serverless-benchmarks PR: 99
File: sebs/azure/azure.py:137-0
Timestamp: 2025-06-19T16:19:09.758Z
Learning: C++ support in the serverless-benchmarks project is currently limited to AWS Lambda only. Azure Functions implementation (sebs/azure/azure.py) does not support C++ and only supports Python and Node.js languages.
🔇 Additional comments (1)
sebs/cloudflare/README.md (1)
313-318: All external documentation links are valid and current.
The four Cloudflare documentation URLs in the References section are accessible and reflect the current state of Cloudflare's APIs and services, including recent features like Workflows in the Bindings documentation and R2 storage classes. No updates needed.
…val suggestion, function deserialize suggestion
Actionable comments posted: 2
🤖 Fix all issues with AI Agents
In @sebs/cloudflare/durable_objects.py:
- Around line 100-113: The update_cache method is using a non-existent private
attribute self._cache_client which will raise AttributeError; change the call in
update_cache to use the public property self.cache_client.update_nosql(...)
(matching retrieve_cache) and keep the same arguments and logging; ensure
update_cache references self.cache_client (not self._cache_client) and that
deployment_name, benchmark and self._tables[benchmark] are passed unchanged.
- Around line 30-40: The __init__ method for the Durable Objects class accepts a
credentials: CloudflareCredentials parameter but never stores it, causing
_get_auth_headers to raise AttributeError when accessing self._credentials; fix
by assigning the parameter to an instance variable (e.g., self._credentials =
credentials) inside __init__ so _get_auth_headers and other methods can
reference it reliably.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
sebs/cloudflare/durable_objects.py
sebs/cloudflare/function.py
sebs/sebs.py
🚧 Files skipped from review as they are similar to previous changes (1)
- sebs/cloudflare/function.py
🧰 Additional context used
🧬 Code graph analysis (2)
sebs/sebs.py (2)
sebs/utils.py (1)
has_platform (260-261)
sebs/cloudflare/cloudflare.py (1)
Cloudflare(25-1611)
sebs/cloudflare/durable_objects.py (3)
sebs/cloudflare/config.py (12)
CloudflareCredentials (9-127), typename (32-33), typename (140-141), typename (210-211), resources (218-219), credentials (214-215), api_token (36-37), api_key (44-45), update_cache (118-121), update_cache (167-173), update_cache (248-251)
sebs/faas/config.py (1)
Resources (57-147)
sebs/cache.py (3)
Cache (37-463), get_nosql_config (177-178), update_nosql (190-193)
🪛 Ruff (0.14.10)
sebs/cloudflare/durable_objects.py
35-35: Unused method argument: credentials
(ARG002)
55-55: Avoid specifying long messages outside the exception class
(TRY003)
116-116: Unused method argument: primary_key
(ARG002)
116-116: Unused method argument: secondary_key
(ARG002)
146-146: Unused method argument: data
(ARG002)
147-147: Unused method argument: primary_key
(ARG002)
148-148: Unused method argument: secondary_key
(ARG002)
167-167: Avoid specifying long messages outside the exception class
(TRY003)
182-182: f-string without any placeholders
Remove extraneous f prefix
(F541)
🔇 Additional comments (2)
sebs/sebs.py (1)
109-112: LGTM! Platform registration now follows a consistent pattern. The Cloudflare registration now correctly uses has_platform("cloudflare"), matching the pattern used for AWS, Azure, GCP, and OpenWhisk. This addresses the previous review concern about inconsistent registration.
sebs/cloudflare/durable_objects.py (1)
185-212: LGTM! Dictionary mutation issue resolved. The two-step approach with list() wrapping correctly prevents RuntimeError from modifying the dictionary during iteration. The logic first locates the matching benchmark and table_key, then performs deletion after the loop completes.
…ost processing in the build script
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In @benchmarks/100.webapps/130.crud-api/nodejs/function.js:
- Around line 23-47: The total and average calculations in queryProducts are
incorrect: priceSum currently adds product.price without accounting for
product.quantity and avgPrice divides sum(prices) by sum(quantity), which is
meaningless; change the aggregation so priceSum accumulates product.price *
product.quantity (i.e., priceSum += product.price * product.quantity) and
compute avgPrice as quantitySum > 0 ? priceSum / quantitySum : 0.0 (average unit
price per item), keeping product names appended to products and returning
total_cost: priceSum and avg_price: avgPrice.
In @sebs/cloudflare/cloudflare.py:
- Around line 1253-1280: The requests.get call in the method that fetches the
workers.dev subdomain should include a timeout to avoid hanging; update the call
to requests.get(url, headers=headers, timeout=...) and choose a sensible timeout
(e.g., 5-10s), and adjust the surrounding exception handling to specifically
catch requests.Timeout and requests.RequestException where appropriate so the
method still logs via self.logging.warning and returns None on timeout or other
request errors; refer to the existing method using _get_auth_headers,
self._workers_dev_subdomain, and the current requests.get usage to locate and
modify the call.
- Around line 148-154: The requests.get call used to verify Cloudflare
credentials lacks a timeout and can hang; update the call to include a sensible
timeout (e.g., timeout=10) or use a configurable attribute like self._timeout,
i.e., change requests.get(f"{self._api_base_url}/user/tokens/verify",
headers=headers) to include timeout, and add handling for
requests.exceptions.Timeout/requests.exceptions.RequestException to raise a
clear RuntimeError if the request times out or fails; refer to the requests.get
invocation and the response/status_code check to locate where to add the timeout
and error handling.
- Around line 1209-1213: Replace the bare except that follows response.json()
with a specific exception handler for JSON parse failures: catch
json.JSONDecodeError and ValueError (and bind it to a variable, e.g., "e")
around the call to response.json() where error_data and error_msg are set, then
fall back to the existing log path (self.logging.info(f"Container
provisioning... ({elapsed}s elapsed)"))—optionally include the exception text in
the log; ensure json is imported if not already and update the except block that
currently surrounds response.json()/error_msg assignment.
- Around line 799-814: The nested function replacer closes over loop-scoped
variables control_flow and builtins causing stale bindings; fix by capturing
their current values into locals when defining replacer (e.g., assign
local_control_flow = control_flow and local_builtins = builtins just before
defining replacer and refer to those inside replacer, or bind them as default
args like def replacer(match, control_flow=control_flow, builtins=builtins):),
then keep using replacer in the re.sub call so the correct sets are used for
each iteration.
- Around line 73-83: The initialize method currently declares a mutable default
config and never uses it; change the signature to use a safe default by
replacing config: Dict[str, str] = {} with config: Optional[Dict[str, str]] =
None (or remove the parameter entirely if it’s not needed), and if you keep it,
ensure you normalize it at the top of initialize (e.g., config = config or {})
before use and pass relevant values into initialize_resources or other code
paths; keep references to the existing initialize, _verify_credentials, and
initialize_resources symbols when making the change.
- Around line 1066-1077: The bare except around response.json() in the function
that handles the worker check should be replaced with a specific exception
handler for JSON parse errors: catch json.JSONDecodeError (and optionally
ValueError for compatibility) instead of using a bare except, and handle/log the
exception (e.g., with self.logging.debug/warning) before returning None; ensure
json is imported if not already and update the try/except that wraps
response.json().get("result") to use the specific exception types and an
exception variable (e) for logging.
🧹 Nitpick comments (8)
benchmarks/100.webapps/130.crud-api/nodejs/function.js (1)
49-78: Consider adding input validation. The handler assumes all required fields are present in request.body and request.path. If upstream wrappers don't guarantee this, the function may throw unclear errors when accessing undefined properties. Consider adding validation for:
- Required body fields (cart, product_id, name, price, quantity for PUT)
- Required path parameters (id for GET /cart/{id})
- Data types (e.g., ensuring price and quantity are numbers)
Note: Line 67 uses body.cart for a GET request, which is unconventional (GET requests typically use path/query parameters), but may be intentional for this benchmark's design.
benchmarks/wrappers/cloudflare/nodejs/build.js (1)
108-155: Consider adding documentation for the post-processing logic. The regex-based post-processing that converts dynamic __require calls into static imports is clever but complex. Adding comments explaining:
- Why this transformation is necessary (Cloudflare Workers ESM requirements)
- The pattern being matched (__require<digits>("node:..."))
- The cache mechanism
would improve maintainability.
📝 Example documentation
- // POST-PROCESS: Replace dynamic requires with static imports
+ // POST-PROCESS: Replace dynamic requires with static imports
+ // Cloudflare Workers require ESM with static imports for node: modules.
+ // esbuild generates __require\d*("node:...") calls for dynamic imports,
+ // so we transform them into static imports with a cache lookup.
  console.log('Post-processing to fix node: module imports...');
sebs/cloudflare/durable_objects.py (1)
183-183: Remove unnecessary f-string prefix. The f-string on line 183 doesn't contain any placeholders, so the f prefix is unnecessary.
✨ Proposed fix
- self.logging.info(f"Durable Objects data is managed within the Worker")
+ self.logging.info("Durable Objects data is managed within the Worker")
sebs/cloudflare/cloudflare.py (5)
385-391: Remove unused variable r2_bucket_configured. The variable r2_bucket_configured is assigned on line 385 but never used afterward. This can be safely removed to clean up the code.
🧹 Proposed fix
  # Add R2 bucket binding for benchmarking files (for both native and container deployments)
- r2_bucket_configured = False
  try:
      storage = self.system_resources.get_storage()
      bucket_name = storage.get_bucket(Resources.StorageBucketType.BENCHMARKS)
      if bucket_name:
          toml_content += f"""# R2 bucket binding for benchmarking files
  # This bucket is used by fs and path polyfills to read benchmark data
  [[r2_buckets]]
  binding = "R2"
  bucket_name = "{bucket_name}"
  """
-         r2_bucket_configured = True
          self.logging.info(f"R2 bucket '{bucket_name}' will be bound to worker as 'R2'")
654-657: Rename unused loop variable. The loop variable dirnames on line 654 is not used within the loop body. Following Python conventions, it should be renamed to _dirnames to indicate it's intentionally unused.
✨ Proposed fix
  # Calculate total size of the package directory
  total_size = 0
- for dirpath, dirnames, filenames in os.walk(directory):
+ for dirpath, _dirnames, filenames in os.walk(directory):
      for filename in filenames:
          filepath = os.path.join(dirpath, filename)
          total_size += os.path.getsize(filepath)
855-866: Remove unused variable result. The variable result assigned on line 855 from the subprocess.run() call is never used. Since the success is verified by the lack of exception, this assignment can be removed.
🧹 Proposed fix
  # Install Node.js dependencies (needed for all containers for worker.js)
  self.logging.info(f"Installing @cloudflare/containers for worker.js orchestration in {directory}")
  try:
-     result = subprocess.run(
+     subprocess.run(
          ["npm", "install", "--production"],
          cwd=directory,
          capture_output=True,
          text=True,
          check=True,
          timeout=120
      )
      self.logging.info("npm install completed successfully")
924-927: Rename unused loop variable. The loop variable dirnames on line 924 is not used within the loop body. Following Python conventions, it should be renamed to _dirnames.
✨ Proposed fix
  # Calculate package size (approximate, as it's a source directory)
  total_size = 0
- for dirpath, dirnames, filenames in os.walk(directory):
+ for dirpath, _dirnames, filenames in os.walk(directory):
      for filename in filenames:
          filepath = os.path.join(dirpath, filename)
          total_size += os.path.getsize(filepath)
1495-1512: Rename unused loop variable. The loop variable request_id on line 1495 is not used within the loop body. Following Python conventions, it should be renamed to _request_id to indicate it's intentionally unused.
✨ Proposed fix
- for request_id, result in requests.items():
+ for _request_id, result in requests.items():
      # Count cold/warm starts
      if result.stats.cold_start:
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
benchmarks/100.webapps/130.crud-api/nodejs/function.js
benchmarks/wrappers/cloudflare/nodejs/build.js
sebs/cloudflare/README.md
sebs/cloudflare/cloudflare.py
sebs/cloudflare/durable_objects.py
✅ Files skipped from review due to trivial changes (1)
- sebs/cloudflare/README.md
🧰 Additional context used
🧬 Code graph analysis (2)
sebs/cloudflare/durable_objects.py (3)
sebs/cloudflare/resources.py (1)
config (37-38)
sebs/cloudflare/config.py (12)
CloudflareCredentials (9-127), typename (32-33), typename (140-141), typename (210-211), resources (218-219), credentials (214-215), api_token (36-37), api_key (44-45), update_cache (118-121), update_cache (167-173), update_cache (248-251)
sebs/cache.py (3)
Cache (37-463), get_nosql_config (177-178), update_nosql (190-193)
sebs/cloudflare/cloudflare.py (2)
sebs/cloudflare/resources.py (4)
config (37-38), CloudflareSystemResources (16-95), get_storage (56-78), _get_auth_headers (40-54)
sebs/cloudflare/triggers.py (8)
typename (19-20), typename (72-73), LibraryTrigger (7-57), HTTPTrigger (60-154), url (80-82), url (85-86), trigger_type (23-24), trigger_type (76-77)
🪛 Ruff (0.14.10)
sebs/cloudflare/durable_objects.py
56-56: Avoid specifying long messages outside the exception class
(TRY003)
117-117: Unused method argument: primary_key
(ARG002)
117-117: Unused method argument: secondary_key
(ARG002)
147-147: Unused method argument: data
(ARG002)
148-148: Unused method argument: primary_key
(ARG002)
149-149: Unused method argument: secondary_key
(ARG002)
168-168: Avoid specifying long messages outside the exception class
(TRY003)
183-183: f-string without any placeholders
Remove extraneous f prefix
(F541)
sebs/cloudflare/cloudflare.py
73-73: Unused method argument: config
(ARG002)
73-73: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
117-117: Do not catch blind exception: Exception
(BLE001)
128-131: Avoid specifying long messages outside the exception class
(TRY003)
134-137: Avoid specifying long messages outside the exception class
(TRY003)
148-148: Probable use of requests call without timeout
(S113)
151-154: Avoid specifying long messages outside the exception class
(TRY003)
162-162: Starting a process with a partial executable path
(S607)
174-174: Starting a process with a partial executable path
(S607)
184-184: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
184-184: Avoid specifying long messages outside the exception class
(TRY003)
186-188: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
186-188: Avoid specifying long messages outside the exception class
(TRY003)
190-190: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
190-190: Avoid specifying long messages outside the exception class
(TRY003)
196-196: Starting a process with a partial executable path
(S607)
208-208: Starting a process with a partial executable path
(S607)
218-218: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
218-218: Avoid specifying long messages outside the exception class
(TRY003)
220-222: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
220-222: Avoid specifying long messages outside the exception class
(TRY003)
224-224: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
224-224: Avoid specifying long messages outside the exception class
(TRY003)
227-227: Unused method argument: container_uri
(ARG002)
385-385: Local variable r2_bucket_configured is assigned to but never used
Remove assignment to unused variable r2_bucket_configured
(F841)
387-387: Do not catch blind exception: Exception
(BLE001)
416-416: Avoid specifying long messages outside the exception class
(TRY003)
423-423: Unused method argument: architecture
(ARG002)
464-464: Unused method argument: is_cached
(ARG002)
482-482: Starting a process with a partial executable path
(S607)
496-496: Starting a process with a partial executable path
(S607)
507-507: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
508-508: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
508-508: Avoid specifying long messages outside the exception class
(TRY003)
510-510: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
511-511: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
511-511: Avoid specifying long messages outside the exception class
(TRY003)
513-515: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
513-515: Avoid specifying long messages outside the exception class
(TRY003)
525-525: Starting a process with a partial executable path
(S607)
533-533: Do not catch blind exception: Exception
(BLE001)
643-646: Avoid specifying long messages outside the exception class
(TRY003)
647-650: Avoid specifying long messages outside the exception class
(TRY003)
654-654: Loop control variable dirnames not used within loop body
Rename unused dirnames to _dirnames
(B007)
685-687: Avoid specifying long messages outside the exception class
(TRY003)
705-705: Avoid specifying long messages outside the exception class
(TRY003)
718-718: f-string without any placeholders
Remove extraneous f prefix
(F541)
804-804: Function definition does not bind loop variable control_flow
(B023)
804-804: Function definition does not bind loop variable builtins
(B023)
855-855: Local variable result is assigned to but never used
Remove assignment to unused variable result
(F841)
856-856: Starting a process with a partial executable path
(S607)
864-864: Do not catch blind exception: Exception
(BLE001)
865-865: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
866-866: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
866-866: Avoid specifying long messages outside the exception class
(TRY003)
924-924: Loop control variable dirnames not used within loop body
Rename unused dirnames to _dirnames
(B007)
956-956: subprocess call: check for execution of untrusted input
(S603)
957-957: Starting a process with a partial executable path
(S607)
969-969: Consider moving this statement to an else block
(TRY300)
975-975: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
976-976: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
978-978: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
978-978: Avoid specifying long messages outside the exception class
(TRY003)
1011-1011: Avoid specifying long messages outside the exception class
(TRY003)
1066-1066: Probable use of requests call without timeout
(S113)
1071-1071: Do not use bare except
(E722)
1132-1132: subprocess call: check for execution of untrusted input
(S603)
1167-1167: Consider moving this statement to an else block
(TRY300)
1170-1170: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
1170-1170: Avoid specifying long messages outside the exception class
(TRY003)
1175-1175: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
1176-1176: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
1178-1178: Unused method argument: package_dir
(ARG002)
1212-1212: Do not use bare except
(E722)
1256-1256: Probable use of requests call without timeout
(S113)
1278-1278: Do not catch blind exception: Exception
(BLE001)
1348-1348: Avoid specifying long messages outside the exception class
(TRY003)
1365-1365: Unused method argument: benchmark
(ARG002)
1391-1391: Unused method argument: resources
(ARG002)
1438-1438: Unused method argument: functions
(ARG002)
1438-1438: Unused method argument: code_package
(ARG002)
1458-1458: Unused method argument: start_time
(ARG002)
1459-1459: Unused method argument: end_time
(ARG002)
1495-1495: Loop control variable request_id not used within loop body
Rename unused request_id to _request_id
(B007)
🔇 Additional comments (11)
benchmarks/100.webapps/130.crud-api/nodejs/function.js (3)
6-13: Previous issue resolved. The missing await on the async insert operation flagged in the previous review has been addressed. The function now properly awaits the insert operation, preventing fire-and-forget behavior.
15-21: LGTM! The function correctly awaits the async get operation and returns the result. Implementation is clean and straightforward.
1-4: No issue found. Line 3's nosql.nosql.get_instance() pattern is correct based on the module structure. The nosql.js file exports a class named nosql, and the double property access (variable nosql containing the exports, accessing property nosql for the class) is the natural result of ES6 named exports. While the identical naming between the variable and class is unconventional, it is not a functional error.
benchmarks/wrappers/cloudflare/nodejs/build.js (5)
5-28: LGTM! The recursive file enumeration logic correctly excludes build artifacts, test files, and configuration files while collecting source files for bundling.
30-36: LGTM! The file copy utility correctly ensures the destination directory exists before copying.
173-173: LGTM! The script execution is straightforward and correct.
54-60: No issues found. The request-polyfill.js file exists in the wrapper directory where the build script expects it.
88-106: No action needed. Cloudflare Workers fully supports ES2020 and all later JavaScript features through its V8 engine, which is kept in sync with Chrome stable. The es2020 target in this build configuration is appropriate and compatible.
sebs/cloudflare/durable_objects.py (3)
30-40: LGTM! Previous credential storage issue resolved. The __init__ method now correctly stores the credentials parameter as self._credentials (line 38), fixing the previously reported AttributeError. The initialization logic is sound.
101-114: LGTM! Cache client property usage is now correct. The update_cache method now correctly uses self.cache_client (line 107) instead of the non-existent self._cache_client, matching the pattern in retrieve_cache. This resolves the previously reported AttributeError.
186-213: LGTM! Dictionary mutation issue resolved. The remove_table method now uses a safe two-step approach: first finding the benchmark and table_key (lines 198-206 with list() wrappers to avoid mutation), then performing the deletion after iteration (lines 209-210). This prevents the RuntimeError: dictionary changed size during iteration that was previously flagged.
async function queryProducts(cartId) {
  const res = await nosqlClient.query(
    nosqlTableName,
    ["cart_id", cartId],
    "product_id"
  );

  const products = [];
  let priceSum = 0;
  let quantitySum = 0;

  for (const product of res) {
    products.push(product.name);
    priceSum += product.price;
    quantitySum += product.quantity;
  }

  const avgPrice = quantitySum > 0 ? priceSum / quantitySum : 0.0;

  return {
    products: products,
    total_cost: priceSum,
    avg_price: avgPrice
  };
}
🧩 Analysis chain
🏁 Script executed:
cat -n benchmarks/100.webapps/130.crud-api/nodejs/function.js | head -80
Repository: spcl/serverless-benchmarks
Length of output: 2363
Fix the price aggregation logic.
Lines 36 and 40 contain incorrect calculations for a shopping cart total:
- Line 36 adds product.price directly to priceSum, ignoring the quantity for that item
- Line 40 calculates average price as sum(price) / sum(quantity), which is not a meaningful metric
Since product.price is a unit price (as shown in the addProduct function at line 11), total_cost should be sum(price × quantity), not sum(price). For example, 2 units at $10 + 3 units at $5 should total $35, not $15.
Proposed fix:
for (const product of res) {
products.push(product.name);
- priceSum += product.price;
+ priceSum += product.price * product.quantity;
quantitySum += product.quantity;
}
const avgPrice = quantitySum > 0 ? priceSum / quantitySum : 0.0;
This calculates the correct total cost and average price per item.
📝 Committable suggestion
  async function queryProducts(cartId) {
    const res = await nosqlClient.query(
      nosqlTableName,
      ["cart_id", cartId],
      "product_id"
    );
    const products = [];
    let priceSum = 0;
    let quantitySum = 0;
    for (const product of res) {
      products.push(product.name);
-     priceSum += product.price;
+     priceSum += product.price * product.quantity;
      quantitySum += product.quantity;
    }
    const avgPrice = quantitySum > 0 ? priceSum / quantitySum : 0.0;
    return {
      products: products,
      total_cost: priceSum,
      avg_price: avgPrice
    };
  }
🤖 Prompt for AI Agents
In @benchmarks/100.webapps/130.crud-api/nodejs/function.js around lines 23 - 47,
The total and average calculations in queryProducts are incorrect: priceSum
currently adds product.price without accounting for product.quantity and
avgPrice divides sum(prices) by sum(quantity), which is meaningless; change the
aggregation so priceSum accumulates product.price * product.quantity (i.e.,
priceSum += product.price * product.quantity) and compute avgPrice as
quantitySum > 0 ? priceSum / quantitySum : 0.0 (average unit price per item),
keeping product names appended to products and returning total_cost: priceSum
and avg_price: avgPrice.
def initialize(self, config: Dict[str, str] = {}, resource_prefix: Optional[str] = None):
    """
    Initialize the Cloudflare Workers platform.
    Args:
        config: Additional configuration parameters
        resource_prefix: Prefix for resource naming
    """
    # Verify credentials are valid
    self._verify_credentials()
    self.initialize_resources(select_prefix=resource_prefix)
Fix mutable default argument and unused parameter.
Line 73 has two issues:
- The config parameter is defined but never used in the method body.
- The mutable default config: Dict[str, str] = {} can cause unexpected behavior if the same dict instance is reused across calls.
🔧 Proposed fix
- def initialize(self, config: Dict[str, str] = {}, resource_prefix: Optional[str] = None):
+ def initialize(self, config: Optional[Dict[str, str]] = None, resource_prefix: Optional[str] = None):
"""
Initialize the Cloudflare Workers platform.
Args:
config: Additional configuration parameters
resource_prefix: Prefix for resource naming
"""
+ if config is None:
+ config = {}
# Verify credentials are valid
self._verify_credentials()
🧰 Tools
🪛 Ruff (0.14.10)
73-73: Unused method argument: config
(ARG002)
73-73: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
🤖 Prompt for AI Agents
In @sebs/cloudflare/cloudflare.py around lines 73 - 83, The initialize method
currently declares a mutable default config and never uses it; change the
signature to use a safe default by replacing config: Dict[str, str] = {} with
config: Optional[Dict[str, str]] = None (or remove the parameter entirely if
it’s not needed), and if you keep it, ensure you normalize it at the top of
initialize (e.g., config = config or {}) before use and pass relevant values
into initialize_resources or other code paths; keep references to the existing
initialize, _verify_credentials, and initialize_resources symbols when making
the change.
| response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers) | ||
|
|
||
| if response.status_code != 200: | ||
| raise RuntimeError( | ||
| f"Failed to verify Cloudflare credentials: {response.status_code} - {response.text}\n" | ||
| f"Please check that your CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID are correct." | ||
| ) |
Add timeout to requests call for security.
The requests.get() call on line 148 lacks a timeout parameter, which can cause the application to hang indefinitely if the Cloudflare API is unresponsive. This is a security and reliability concern.
🔒 Proposed fix
- response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers)
+ response = requests.get(
+ f"{self._api_base_url}/user/tokens/verify",
+ headers=headers,
+ timeout=30 # 30 second timeout for credential verification
+ )
📝 Committable suggestion
| response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers) | |
| if response.status_code != 200: | |
| raise RuntimeError( | |
| f"Failed to verify Cloudflare credentials: {response.status_code} - {response.text}\n" | |
| f"Please check that your CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID are correct." | |
| ) | |
| response = requests.get( | |
| f"{self._api_base_url}/user/tokens/verify", | |
| headers=headers, | |
| timeout=30 # 30 second timeout for credential verification | |
| ) | |
| if response.status_code != 200: | |
| raise RuntimeError( | |
| f"Failed to verify Cloudflare credentials: {response.status_code} - {response.text}\n" | |
| f"Please check that your CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID are correct." | |
| ) |
🧰 Tools
🪛 Ruff (0.14.10)
148-148: Probable use of requests call without timeout
(S113)
151-154: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In @sebs/cloudflare/cloudflare.py around lines 148 - 154, The requests.get call
used to verify Cloudflare credentials lacks a timeout and can hang; update the
call to include a sensible timeout (e.g., timeout=10) or use a configurable
attribute like self._timeout, i.e., change
requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers) to
include timeout, and add handling for
requests.exceptions.Timeout/requests.exceptions.RequestException to raise a
clear RuntimeError if the request times out or fails; refer to the requests.get
invocation and the response/status_code check to locate where to add the timeout
and error handling.
def replacer(match):
    prefix = match.group(1)
    assignment = match.group(2) or ''
    func_name = match.group(3)

    if func_name in control_flow or func_name in builtins:
        return match.group(0)

    return f"{prefix}{assignment}await {func_name}("

line = re.sub(
    r'(^|\s+|;|,|\()((?:const|let|var)\s+\w+\s*=\s*)?(\w+)\s*\(',
    replacer,
    line
)
transformed_lines.append(line)
Critical: Fix closure issue in loop - function doesn't bind loop variables.
The replacer function defined on line 799 inside the loop references control_flow and builtins variables, but these references won't bind correctly because they're defined outside the function in the loop's scope. This can lead to incorrect behavior where the function uses stale values.
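For context, here is a minimal, self-contained Python demo of the late-binding pitfall that lint rule B023 warns about; it is illustrative only and not taken from the SeBS code:

```python
# Closures look up free variables when they are *called*, not when they are defined.
fns = []
for allowed in ({"if"}, {"for"}, {"while"}):
    def check(name):
        return name in allowed  # 'allowed' is resolved at call time
    fns.append(check)

print([f("if") for f in fns])  # [False, False, False] -- every closure sees the last 'allowed'

# Binding the current value as a default argument freezes it per iteration.
fns = []
for allowed in ({"if"}, {"for"}, {"while"}):
    def check(name, allowed=allowed):
        return name in allowed
    fns.append(check)

print([f("if") for f in fns])  # [True, False, False]
```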
🐛 Proposed fix
lines = content.split('\n')
transformed_lines = []
control_flow = {'if', 'for', 'while', 'switch', 'catch', 'return'}
builtins = {'console', 'require', 'push', 'join', 'split', 'map', 'filter',
'reduce', 'forEach', 'find', 'findIndex', 'some', 'every',
'includes', 'parseInt', 'parseFloat', 'isNaN', 'Array',
'Object', 'String', 'Number', 'Boolean', 'Math', 'JSON',
'Date', 'RegExp', 'Error', 'Promise'}
for line in lines:
# Skip function declarations
if re.search(r'\bfunction\s+\w+\s*\(', line) or re.search(r'=\s*(async\s+)?function\s*\(', line):
transformed_lines.append(line)
continue
# Add await before likely user-defined function calls
- def replacer(match):
+ def replacer(match, control_flow=control_flow, builtins=builtins):
prefix = match.group(1)
assignment = match.group(2) or ''
func_name = match.group(3)
if func_name in control_flow or func_name in builtins:
return match.group(0)
return f"{prefix}{assignment}await {func_name}("📝 Committable suggestion
- def replacer(match):
+ def replacer(match, control_flow=control_flow, builtins=builtins):
      prefix = match.group(1)
      assignment = match.group(2) or ''
      func_name = match.group(3)
      if func_name in control_flow or func_name in builtins:
          return match.group(0)
      return f"{prefix}{assignment}await {func_name}("
  line = re.sub(
      r'(^|\s+|;|,|\()((?:const|let|var)\s+\w+\s*=\s*)?(\w+)\s*\(',
      replacer,
      line
  )
  transformed_lines.append(line)
🧰 Tools
🪛 Ruff (0.14.10)
804-804: Function definition does not bind loop variable control_flow
(B023)
804-804: Function definition does not bind loop variable builtins
(B023)
🤖 Prompt for AI Agents
In @sebs/cloudflare/cloudflare.py around lines 799 - 814, The nested function
replacer closes over loop-scoped variables control_flow and builtins causing
stale bindings; fix by capturing their current values into locals when defining
replacer (e.g., assign local_control_flow = control_flow and local_builtins =
builtins just before defining replacer and refer to those inside replacer, or
bind them as default args like def replacer(match, control_flow=control_flow,
builtins=builtins):), then keep using replacer in the re.sub call so the correct
sets are used for each iteration.
response = requests.get(url, headers=headers)

if response.status_code == 200:
    try:
        return response.json().get("result")
    except:
        return None
elif response.status_code == 404:
    return None
else:
    self.logging.warning(f"Unexpected response checking worker: {response.status_code}")
    return None
Replace bare except with specific exception.
The bare except: clause on line 1071 catches all exceptions, including system-exiting exceptions like KeyboardInterrupt. This should catch a specific exception like json.JSONDecodeError or ValueError to handle JSON parsing errors gracefully.
🔧 Proposed fix
if response.status_code == 200:
- try:
- return response.json().get("result")
- except:
- return None
+ try:
+ return response.json().get("result")
+ except (json.JSONDecodeError, ValueError, KeyError):
+         return None
📝 Committable suggestion
  response = requests.get(url, headers=headers)

  if response.status_code == 200:
      try:
          return response.json().get("result")
-     except:
+     except (json.JSONDecodeError, ValueError, KeyError):
          return None
  elif response.status_code == 404:
      return None
  else:
      self.logging.warning(f"Unexpected response checking worker: {response.status_code}")
      return None
🧰 Tools
🪛 Ruff (0.14.10)
1066-1066: Probable use of requests call without timeout
(S113)
1071-1071: Do not use bare except
(E722)
🤖 Prompt for AI Agents
In @sebs/cloudflare/cloudflare.py around lines 1066 - 1077, The bare except
around response.json() in the function that handles the worker check should be
replaced with a specific exception handler for JSON parse errors: catch
json.JSONDecodeError (and optionally ValueError for compatibility) instead of
using a bare except, and handle/log the exception (e.g., with
self.logging.debug/warning) before returning None; ensure json is imported if
not already and update the try/except that wraps response.json().get("result")
to use the specific exception types and an exception variable (e) for logging.
    error_data = response.json()
    error_msg = error_data.get('error', 'Container provisioning')
    self.logging.info(f"{error_msg}... ({elapsed}s elapsed)")
except:
    self.logging.info(f"Container provisioning... ({elapsed}s elapsed)")
Replace bare except with specific exception.
The bare except: clause on line 1212 catches all exceptions. This should catch specific exceptions like json.JSONDecodeError or requests.exceptions.JSONDecodeError to handle JSON parsing errors appropriately.
🔧 Proposed fix
try:
error_data = response.json()
error_msg = error_data.get('error', 'Container provisioning')
self.logging.info(f"{error_msg}... ({elapsed}s elapsed)")
- except:
+ except (json.JSONDecodeError, ValueError):
self.logging.info(f"Container provisioning... ({elapsed}s elapsed)")📝 Committable suggestion
      error_data = response.json()
      error_msg = error_data.get('error', 'Container provisioning')
      self.logging.info(f"{error_msg}... ({elapsed}s elapsed)")
- except:
+ except (json.JSONDecodeError, ValueError):
      self.logging.info(f"Container provisioning... ({elapsed}s elapsed)")
🧰 Tools
🪛 Ruff (0.14.10)
1212-1212: Do not use bare except
(E722)
🤖 Prompt for AI Agents
In @sebs/cloudflare/cloudflare.py around lines 1209 - 1213, Replace the bare
except that follows response.json() with a specific exception handler for JSON
parse failures: catch json.JSONDecodeError and ValueError (and bind it to a
variable, e.g., "e") around the call to response.json() where error_data and
error_msg are set, then fall back to the existing log path
(self.logging.info(f"Container provisioning... ({elapsed}s
elapsed)"))—optionally include the exception text in the log; ensure json is
imported if not already and update the except block that currently surrounds
response.json()/error_msg assignment.
try:
    headers = self._get_auth_headers()
    url = f"{self._api_base_url}/accounts/{account_id}/workers/subdomain"
    resp = requests.get(url, headers=headers)
    if resp.status_code == 200:
        body = resp.json()
        sub = None
        # result may contain 'subdomain' or nested structure
        if isinstance(body, dict):
            sub = body.get("result", {}).get("subdomain")

        if sub:
            self._workers_dev_subdomain = sub
            return sub
        else:
            self.logging.warning(
                "Could not find workers.dev subdomain in API response; "
                "please enable the workers.dev subdomain in your Cloudflare dashboard."
            )
            return None
    else:
        self.logging.warning(
            f"Failed to fetch workers.dev subdomain: {resp.status_code} - {resp.text}"
        )
        return None
except Exception as e:
    self.logging.warning(f"Error fetching workers.dev subdomain: {e}")
    return None
Add timeout to requests call for reliability.
The requests.get() call on line 1256 lacks a timeout parameter, which can cause the application to hang indefinitely if the Cloudflare API is unresponsive. This affects reliability when fetching the workers.dev subdomain.
🔒 Proposed fix
try:
headers = self._get_auth_headers()
url = f"{self._api_base_url}/accounts/{account_id}/workers/subdomain"
- resp = requests.get(url, headers=headers)
+ resp = requests.get(url, headers=headers, timeout=30)
      if resp.status_code == 200:
📝 Committable suggestion
  try:
      headers = self._get_auth_headers()
      url = f"{self._api_base_url}/accounts/{account_id}/workers/subdomain"
-     resp = requests.get(url, headers=headers)
+     resp = requests.get(url, headers=headers, timeout=30)
      if resp.status_code == 200:
          body = resp.json()
          sub = None
          # result may contain 'subdomain' or nested structure
          if isinstance(body, dict):
              sub = body.get("result", {}).get("subdomain")

          if sub:
              self._workers_dev_subdomain = sub
              return sub
          else:
              self.logging.warning(
                  "Could not find workers.dev subdomain in API response; "
                  "please enable the workers.dev subdomain in your Cloudflare dashboard."
              )
              return None
      else:
          self.logging.warning(
              f"Failed to fetch workers.dev subdomain: {resp.status_code} - {resp.text}"
          )
          return None
  except Exception as e:
      self.logging.warning(f"Error fetching workers.dev subdomain: {e}")
      return None
🧰 Tools
🪛 Ruff (0.14.10)
1256-1256: Probable use of requests call without timeout
(S113)
1278-1278: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
In @sebs/cloudflare/cloudflare.py around lines 1253 - 1280, The requests.get
call in the method that fetches the workers.dev subdomain should include a
timeout to avoid hanging; update the call to requests.get(url, headers=headers,
timeout=...) and choose a sensible timeout (e.g., 5-10s), and adjust the
surrounding exception handling to specifically catch requests.Timeout and
requests.RequestException where appropriate so the method still logs via
self.logging.warning and returns None on timeout or other request errors; refer
to the existing method using _get_auth_headers, self._workers_dev_subdomain, and
the current requests.get usage to locate and modify the call.
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In @benchmarks/wrappers/cloudflare/python/nosql.py:
- Line 192: The code is brittle because get_res is being massaged with
get_res.replace("'", "\"") before JSON parsing; stop doing that and parse the
data in its true format: ensure the producer stores JSON strings (fix the writer
to use json.dumps) or, if the KV contains Python literal reprs, replace the
fragile replace() with a safe parser (use json.loads(get_res) when data is valid
JSON, or ast.literal_eval(get_res) when Python literals are expected) and add
validation/error handling; remove the replace call and update the code that
reads/writes the KV so the stored format is unambiguous and parsed with the
proper function (reference the get_res variable and the surrounding KV
read/write logic).
- Line 91: Replace the hard assertion "assert len(keys) <= 100" with explicit
validation that raises a descriptive exception when the limit is exceeded: check
if len(keys) > 100 and raise a ValueError (or a custom exception) that includes
the actual len(keys) and the maximum allowed (100) so callers get a clear error
message instead of an abrupt assertion failure; update any docstring/comments
for the function in nosql.py that handles "keys" to reflect this explicit error
behavior.
- Line 184: Replace the hard assertion "assert len(keys) <= 100" with proper
error handling: detect when len(keys) > 100 and raise a descriptive exception
(e.g., ValueError) that includes the actual count and context (which
operation/function and parameter causing it). Apply the same pattern used to fix
nosql_do.query(): validate the keys collection early in the function that
constructs/uses the "keys" variable and raise a clear, actionable error instead
of asserting so the caller can handle it.
- Line 117: The class-level annotation for the singleton "instance" is
referencing the wrong name "nosql"; update the type hint to reference the
correct class "nosql_kv" (i.e., change the annotation on the variable instance:
Optional["nosql"] to Optional["nosql_kv"]) so the forward reference matches the
actual class name used in this module (look for the class definition named
nosql_kv and the variable instance).
- Around line 120-122: The init_instance function incorrectly constructs and
assigns nosql.instance using the nosql class; change it to instantiate and
assign nosql_kv instead (i.e., use nosql_kv() for creation and assign to
nosql.instance.env from the entry.env), ensuring init_instance references the
nosql_kv class for both the new instance and the instance property.
- Around line 204-207: get_instance currently constructs the wrong class by
calling nosql() (the nosql_do alias) instead of the KV implementation; change
the instantiation so that when nosql.instance is None you assign nosql.instance
= nosql_kv() (or the correct nosql_kv class/constructor) and return
nosql.instance, ensuring you're referencing the nosql_kv symbol used for the
key-value implementation rather than nosql().
- Line 88: Remove the leftover debug print by deleting the "print(key)" call in
benchmarks/wrappers/cloudflare/python/nosql.py; if key inspection is still
needed use the module's logger (e.g., logger.debug(...)) instead of printing to
stdout to avoid leaking debug output in production.
🧹 Nitpick comments (5)
benchmarks/wrappers/cloudflare/python/nosql.py (5)
51-51: Remove unused variable assignment. The put_res variable is assigned but never used. Consider removing it or using it for error handling.
♻️ Proposed fix
  send_data = self.data_pre(data)
  k=self.key_maker(primary_key, secondary_key)
- put_res = run_sync(self.get_table(table_name).put(k, send_data))
+ run_sync(self.get_table(table_name).put(k, send_data))
  return
94-94: Consider implementing bulk operations. The TODO comment indicates that bulk operations would be preferred over individual gets. This could improve performance for queries returning multiple keys.
Would you like me to help investigate or implement bulk get operations for this query method?
140-144: Remove unused variable assignment. Similar to line 51, the put_res variable is assigned but never used.
♻️ Proposed fix
- put_res = (
-     run_sync(self.get_table(table_name).put(
+ run_sync(self.get_table(table_name).put(
      self.key_maker(primary_key, secondary_key),
-     json.dumps(data))
- ))
+     json.dumps(data)))
  return
154-158: Remove unused variable assignment. The put_res variable is assigned but never used.
♻️ Proposed fix
- put_res = run_sync(
-     self.get_table(table_name).put(
+ run_sync(
+     self.get_table(table_name).put(
      self.key_maker(primary_key, secondary_key),
-     json.dumps(data)
- ))
+     json.dumps(data)))
  return
187-187: Consider implementing bulk operations. Similar to the TODO in nosql_do.query(), bulk operations would improve performance.
Would you like me to help investigate or implement bulk get operations for this query method?
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
benchmarks/wrappers/cloudflare/python/nosql.py
🧰 Additional context used
🧬 Code graph analysis (1)
benchmarks/wrappers/cloudflare/python/nosql.py (1)
benchmarks/wrappers/cloudflare/python/container/nosql.py (8)
init_instance (18-22), insert (53-66), update (68-81), get (83-92), query (94-103), delete (105-111), get_instance (114-117), nosql (11-117)
🪛 Ruff (0.14.10)
benchmarks/wrappers/cloudflare/python/nosql.py
23-23: Use explicit conversion flag
Replace with conversion flag
(RUF010)
26-26: Use explicit conversion flag
Replace with conversion flag
(RUF010)
38-38: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
40-40: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
51-51: Local variable put_res is assigned to but never used
Remove assignment to unused variable put_res
(F841)
125-125: Use explicit conversion flag
Replace with conversion flag
(RUF010)
128-128: Use explicit conversion flag
Replace with conversion flag
(RUF010)
140-140: Local variable put_res is assigned to but never used
Remove assignment to unused variable put_res
(F841)
154-154: Local variable put_res is assigned to but never used
Remove assignment to unused variable put_res
(F841)
🔇 Additional comments (1)
benchmarks/wrappers/cloudflare/python/nosql.py (1)
9-15: Previously reported critical issues have been fixed. The class name mismatches in nosql_do that were flagged in the previous review have been correctly addressed. The type annotation now uses "nosql_do" and all instance references use nosql_do.instance.
keys = []
for key in list_res:
    if key.startswith(prefix_key):
        print(key)
Remove debug print statement.
This print statement appears to be leftover debug code and should be removed for production use.
🔧 Proposed fix
for key in list_res:
if key.startswith(prefix_key):
- print(key)
keys.append(key)🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/nosql.py at line 88, Remove the
leftover debug print by deleting the "print(key)" call in
benchmarks/wrappers/cloudflare/python/nosql.py; if key inspection is still
needed use the module's logger (e.g., logger.debug(...)) instead of printing to
stdout to avoid leaking debug output in production.
print(key)
keys.append(key)
##print("keys", keys)
assert len(keys) <= 100
Replace hard assertion with proper error handling.
The hard assertion will crash the program if more than 100 keys are found. This should raise a descriptive exception instead to provide better error handling.
🛡️ Proposed fix
##print("keys", keys)
- assert len(keys) <= 100
+ if len(keys) > 100:
+ raise ValueError(f"Query returned {len(keys)} keys, exceeding the 100 key limit")🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/nosql.py at line 91, Replace the hard
assertion "assert len(keys) <= 100" with explicit validation that raises a
descriptive exception when the limit is exceeded: check if len(keys) > 100 and
raise a ValueError (or a custom exception) that includes the actual len(keys)
and the maximum allowed (100) so callers get a clear error message instead of an
abrupt assertion failure; update any docstring/comments for the function in
nosql.py that handles "keys" to reflect this explicit error behavior.
for key in list_res.keys:
    keys.append(key.name)
##print("keys", keys)
assert len(keys) <= 100
Replace hard assertion with proper error handling.
Same issue as in nosql_do.query(): the hard assertion will crash if more than 100 keys are found. Replace with a descriptive exception.
🛡️ Proposed fix
##print("keys", keys)
- assert len(keys) <= 100
+ if len(keys) > 100:
+ raise ValueError(f"Query returned {len(keys)} keys, exceeding the 100 key limit")📝 Committable suggestion
- assert len(keys) <= 100
+ if len(keys) > 100:
+     raise ValueError(f"Query returned {len(keys)} keys, exceeding the 100 key limit")
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/nosql.py at line 184, Replace the hard
assertion "assert len(keys) <= 100" with proper error handling: detect when
len(keys) > 100 and raise a descriptive exception (e.g., ValueError) that
includes the actual count and context (which operation/function and parameter
causing it). Apply the same pattern used to fix nosql_do.query(): validate the
keys collection early in the function that constructs/uses the "keys" variable
and raise a clear, actionable error instead of asserting so the caller can
handle it.
for key in keys:

    get_res = run_sync(self.get_table(table_name).get(key))
    get_res = get_res.replace("\'", "\"")
Fragile string manipulation for JSON parsing.
Replacing single quotes with double quotes to fix JSON parsing suggests the data is being stored or retrieved in an incorrect format. This approach is brittle and could fail if the data contains escaped quotes or quotes within string values.
Investigation needed
Consider the following scenarios where this will break:
- Data containing: {"name": "O'Brien"} → becomes {"name": "O"Brien"} (invalid JSON)
- Data containing: {"quote": "She said 'hello'"} → mismatched quotes
Root cause options:
- If the KV store is incorrectly storing Python repr() instead of JSON, fix the storage format
- If this is intentional, document why and add validation
- Consider using ast.literal_eval() if Python literals are expected
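As a rough sketch of that last option (the actual stored format in this wrapper is not verified here, and parse_kv_value is a hypothetical helper name), parsing could try JSON first and fall back to a Python-literal parse:

```python
import ast
import json
from typing import Any, Optional


def parse_kv_value(raw: Optional[str]) -> Optional[Any]:
    # A missing key is returned by KV as None; pass it through unchanged.
    if raw is None:
        return None
    # Preferred path: the writer stored proper JSON via json.dumps().
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Fallback: the value is a Python literal repr such as "{'a': 1}".
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"Stored value is neither JSON nor a Python literal: {raw!r}") from exc
```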
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/nosql.py at line 192, The code is
brittle because get_res is being massaged with get_res.replace("'", "\"") before
JSON parsing; stop doing that and parse the data in its true format: ensure the
producer stores JSON strings (fix the writer to use json.dumps) or, if the KV
contains Python literal reprs, replace the fragile replace() with a safe parser
(use json.loads(get_res) when data is valid JSON, or ast.literal_eval(get_res)
when Python literals are expected) and add validation/error handling; remove the
replace call and update the code that reads/writes the KV so the stored format
is unambiguous and parsed with the proper function (reference the get_res
variable and the surrounding KV read/write logic).
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @benchmarks/wrappers/cloudflare/python/nosql.py:
- Around line 161-168: The get method currently returns the raw stored string,
violating its Optional[dict] contract; update the get function (involving get(),
key_maker(), get_table(), run_sync()) to parse the retrieved JSON string with
json.loads before returning, handle the case where the run_sync result is None
by returning None, and propagate or wrap JSON parsing errors appropriately so
callers receive a dict or None consistent with insert/update behavior.
🧹 Nitpick comments (1)
benchmarks/wrappers/cloudflare/python/nosql.py (1)
54-63: Clarify update vs. insert semantics. The comment on line 54 questions whether `update` should have different behavior from `insert`. In typical NoSQL patterns, update might want to:
- Merge with existing data rather than overwrite
- Return an error if the key doesn't exist (vs. insert creating it)
- Perform atomic updates or conditional writes
If this distinction matters for your use case, the methods should be differentiated; otherwise, consider removing the comment or explicitly documenting that they're intentionally identical.
Would you like me to suggest an implementation that differentiates insert and update operations?
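Purely as a sketch of what that differentiation could look like, here is a toy in-memory table; the class name `KVTableSketch`, the key format, and the merge policy are assumptions for illustration, not the wrapper's actual Durable Object / KV API:

```python
import json
from typing import Dict, Optional, Tuple


class KVTableSketch:
    """Toy in-memory stand-in for a KV table, used only to illustrate the semantics."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def _key(self, primary_key: Tuple[str, str], secondary_key: Tuple[str, str]) -> str:
        return f"{primary_key[1]}|{secondary_key[1]}"

    def insert(self, primary_key: Tuple[str, str], secondary_key: Tuple[str, str], data: dict) -> None:
        # insert: create the record, overwriting anything already stored under the key
        self._data[self._key(primary_key, secondary_key)] = json.dumps(data)

    def update(self, primary_key: Tuple[str, str], secondary_key: Tuple[str, str], data: dict) -> None:
        # update: require an existing record and merge the new fields into it
        key = self._key(primary_key, secondary_key)
        if key not in self._data:
            raise KeyError(f"update of missing key {key!r}; use insert to create it")
        merged = {**json.loads(self._data[key]), **data}
        self._data[key] = json.dumps(merged)

    def get(self, primary_key: Tuple[str, str], secondary_key: Tuple[str, str]) -> Optional[dict]:
        raw = self._data.get(self._key(primary_key, secondary_key))
        return None if raw is None else json.loads(raw)
```

In the real nosql.py, the `_data` dict would be replaced by the existing `get_table`/`run_sync` calls; only the create-or-overwrite vs. merge-and-require-existing distinction is the point here.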
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
benchmarks/wrappers/cloudflare/python/nosql.py
🧰 Additional context used
🪛 Ruff (0.14.10)
benchmarks/wrappers/cloudflare/python/nosql.py
23-23: Use explicit conversion flag
Replace with conversion flag
(RUF010)
26-26: Use explicit conversion flag
Replace with conversion flag
(RUF010)
38-38: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
40-40: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
51-51: Local variable put_res is assigned to but never used
Remove assignment to unused variable put_res
(F841)
125-125: Use explicit conversion flag
Replace with conversion flag
(RUF010)
128-128: Use explicit conversion flag
Replace with conversion flag
(RUF010)
140-140: Local variable put_res is assigned to but never used
Remove assignment to unused variable put_res
(F841)
154-154: Local variable put_res is assigned to but never used
Remove assignment to unused variable put_res
(F841)
🔇 Additional comments (1)
benchmarks/wrappers/cloudflare/python/nosql.py (1)
29-40: Verify pickle security implications. Pickle deserialization can be unsafe with untrusted data. Ensure that the Durable Object storage is fully trusted and cannot be manipulated by untrusted sources. If there's any possibility of untrusted data entering this path, switch to JSON serialization instead.
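If that trust assumption cannot be guaranteed, a minimal sketch of the JSON-based alternative is below; the `serialize`/`deserialize` helper names are illustrative, not the wrapper's existing API:

```python
import json


def serialize(value: dict) -> str:
    # JSON round-trips plain dicts/lists/strings/numbers and cannot execute code on load
    return json.dumps(value)


def deserialize(raw: str) -> dict:
    # json.loads never runs arbitrary code, unlike pickle.loads on attacker-controlled bytes
    value = json.loads(raw)
    if not isinstance(value, dict):
        raise TypeError(f"expected a JSON object, got {type(value).__name__}")
    return value
```

The trade-off is that JSON only covers basic types; values such as bytes or custom classes would need explicit encoding.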
```python
    def get(
        self, table_name: str, primary_key: Tuple[str, str], secondary_key: Tuple[str, str]
    ) -> Optional[dict]:
        get_res = run_sync(
            self.get_table(table_name).get(
                self.key_maker(primary_key, secondary_key)
            ))
        return get_res
```
Parse JSON in get method to match the return type contract.
The insert and update methods store data using json.dumps() (lines 143, 157), but this get method returns the raw string without parsing. This violates the return type Optional[dict] and creates inconsistent behavior with nosql_do.get().
🔧 Proposed fix
```diff
     def get(
         self, table_name: str, primary_key: Tuple[str, str], secondary_key: Tuple[str, str]
     ) -> Optional[dict]:
         get_res = run_sync(
             self.get_table(table_name).get(
                 self.key_maker(primary_key, secondary_key)
             ))
-        return get_res
+        if get_res is None:
+            return None
+        return json.loads(get_res)
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    def get(
        self, table_name: str, primary_key: Tuple[str, str], secondary_key: Tuple[str, str]
    ) -> Optional[dict]:
        get_res = run_sync(
            self.get_table(table_name).get(
                self.key_maker(primary_key, secondary_key)
            ))
        if get_res is None:
            return None
        return json.loads(get_res)
```
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/nosql.py around lines 161 - 168, The
get method currently returns the raw stored string, violating its Optional[dict]
contract; update the get function (involving get(), key_maker(), get_table(),
run_sync()) to parse the retrieved JSON string with json.loads before returning,
handle the case where the run_sync result is None by returning None, and
propagate or wrap JSON parsing errors appropriately so callers receive a dict or
None consistent with insert/update behavior.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @benchmarks/wrappers/cloudflare/python/storage.py:
- Around line 46-61: The bug is double-prefixing "/tmp" for already-absolute tmp
paths causing invalid paths; fix by (1) in download(bucket, key, filepath) add
the actual filesystem path (real_fp) to written_files instead of the requested
filepath (self.written_files.add(real_fp)) and (2) in upload(bucket, key,
filepath) when you detect filepath in written_files, only prepend "/tmp" if the
filepath does not already start with "/tmp" (e.g., if filepath in
self.written_files: filepath = filepath if filepath.startswith("/tmp") else
"/tmp" + os.path.abspath(filepath)). Ensure these changes use the existing
upload, download, and written_files symbols.
- Around line 101-109: In adownload_stream, get_res from
get_bucket(...).get(key) is an R2ObjectBody and does not have .bytes(); replace
the await get_res.bytes() call with await get_res.arrayBuffer() and convert the
returned ArrayBuffer to Python bytes by wrapping it in a JS Uint8Array (e.g.,
js.Uint8Array.new(array_buffer)) and then calling bytes(...) on that result so
the function returns proper binary data instead of breaking at runtime.
🧹 Nitpick comments (3)
benchmarks/wrappers/cloudflare/python/storage.py (3)
35-38: Unused `bucket` parameter in `get_bucket`
`bucket` is never used here, which triggers ARG002 and can confuse readers. If you don't plan to support multiple bindings, either drop the parameter or rename it to `_bucket` to make the intent explicit.
75-96: Unused `put_res` from `bobj.put`
`put_res` is assigned but never used, triggering F841 and slightly obscuring intent. Since you only care about the side effect of the PUT (and exceptions will propagate), you can drop the variable:
♻️ Simplify `aupload_stream`
```diff
-        bobj = self.get_bucket(bucket)
-        put_res = await bobj.put(unique_key, data_js)
-        return unique_key
+        bobj = self.get_bucket(bucket)
+        await bobj.put(unique_key, data_js)
+        return unique_key
```
111-116: Duplicate `return storage.instance` is dead code
The second `return storage.instance` is unreachable and can be removed:
✂️ Remove dead code
```diff
 def get_instance():
     if storage.instance is None:
         raise RuntimeError("must init storage singleton first")
-    return storage.instance
-    return storage.instance
+    return storage.instance
```
The current exception message is short and clear; TRY003 from Ruff is mostly stylistic here, so I'd keep it as-is unless you prefer a custom exception type.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
benchmarks/wrappers/cloudflare/python/storage.py
🧰 Additional context used
🪛 Ruff (0.14.10)
benchmarks/wrappers/cloudflare/python/storage.py
35-35: Unused method argument: bucket
(ARG002)
48-48: Probable insecure usage of temporary file or directory: "/tmp"
(S108)
57-57: Probable insecure usage of temporary file or directory: "/tmp"
(S108)
58-58: Probable insecure usage of temporary file or directory: "/tmp"
(S108)
95-95: Local variable put_res is assigned to but never used
Remove assignment to unused variable put_res
(F841)
114-114: Avoid specifying long messages outside the exception class
(TRY003)
```python
    def upload(self, bucket, key, filepath):
        if filepath in self.written_files:
            filepath = "/tmp" + os.path.abspath(filepath)
        with open(filepath, "rb") as f:
            unique_key = self.upload_stream(bucket, key, f.read())
        return unique_key

    def download(self, bucket, key, filepath):
        data = self.download_stream(bucket, key)
        # should only allow writes to tmp dir. so do have to edit the filepath here?
        real_fp = filepath
        if not filepath.startswith("/tmp"):
            real_fp = "/tmp" + os.path.abspath(filepath)

        self.written_files.add(filepath)
        with open(real_fp, "wb") as f:
```
/tmp prefixing can generate invalid paths for already-absolute tmp files
The interaction between upload, download, and written_files has a corner case:
- `download` records the requested `filepath` in `written_files`.
- `upload` rewrites any path found in `written_files` by prepending `"/tmp" + os.path.abspath(filepath)`.
If a caller ever uses a path that already starts with /tmp (e.g., via download_directory(..., out_path="/tmp")), you’ll end up trying to open /tmp/tmp/..., which likely doesn’t exist.
You can retain the current behavior for non‑/tmp paths and avoid the double prefix with a small guard:
🔧 Suggested fix for `/tmp` double-prefix
```diff
     def upload(self, bucket, key, filepath):
-        if filepath in self.written_files:
-            filepath = "/tmp" + os.path.abspath(filepath)
+        if filepath in self.written_files and not filepath.startswith("/tmp"):
+            filepath = "/tmp" + os.path.abspath(filepath)
```
Note: Ruff's S108 warning about /tmp is expected in Workers, since /tmp is the canonical writable directory; your os.path.abspath usage already mitigates simple traversal issues.
🧰 Tools
🪛 Ruff (0.14.10)
48-48: Probable insecure usage of temporary file or directory: "/tmp"
(S108)
57-57: Probable insecure usage of temporary file or directory: "/tmp"
(S108)
58-58: Probable insecure usage of temporary file or directory: "/tmp"
(S108)
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/storage.py around lines 46 - 61, The
bug is double-prefixing "/tmp" for already-absolute tmp paths causing invalid
paths; fix by (1) in download(bucket, key, filepath) add the actual filesystem
path (real_fp) to written_files instead of the requested filepath
(self.written_files.add(real_fp)) and (2) in upload(bucket, key, filepath) when
you detect filepath in written_files, only prepend "/tmp" if the filepath does
not already start with "/tmp" (e.g., if filepath in self.written_files: filepath
= filepath if filepath.startswith("/tmp") else "/tmp" +
os.path.abspath(filepath)). Ensure these changes use the existing upload,
download, and written_files symbols.
```python
    async def adownload_stream(self, bucket, key):
        bobj = self.get_bucket(bucket)
        get_res = await bobj.get(key)
        if get_res == jsnull:
            print("key not stored in bucket")
            return b''
        # Always read as raw binary data (Blob/ArrayBuffer)
        data = await get_res.bytes()
        return bytes(data)
```
🧩 Analysis chain
🌐 Web query:
Cloudflare Workers R2 JavaScript API Bucket.get() return object methods bytes() arrayBuffer()
💡 Result:
Short answer: env.BUCKET.get(key) returns Promise<R2ObjectBody | R2Object | null>. When you get an R2ObjectBody it exposes a ReadableStream in .body plus convenience read helpers — arrayBuffer(), text(), json(), blob() and bodyUsed. There is no bytes() method on the R2ObjectBody; use arrayBuffer() (or blob()/text()) to read the bytes. [1][2]
Example:
- Get object and read as ArrayBuffer:
```js
const obj = await env.MY_BUCKET.get("file");
if (obj?.body) {
  const buf = await obj.arrayBuffer(); // ArrayBuffer of the object's bytes
  // buf.byteLength ...
}
```
Sources: Cloudflare Workers R2 API reference (R2Object / R2ObjectBody) and Download objects docs. [1][2]
🏁 Script executed:
```sh
# First, understand the file structure and locate relevant files
git ls-files | grep -E "(cloudflare|R2|storage)" | head -20
```
Repository: spcl/serverless-benchmarks
Length of output: 1122
🏁 Script executed:
```sh
# Look at the storage.py file to understand the full context
cat -n benchmarks/wrappers/cloudflare/python/storage.py
```
Repository: spcl/serverless-benchmarks
Length of output: 4927
🏁 Script executed:
```sh
# Search for usage of 'bytes()' method or 'get_res' patterns in the codebase
rg "\.bytes\(\)" --type py -B 2 -A 2
```
Repository: spcl/serverless-benchmarks
Length of output: 915
🏁 Script executed:
```sh
# Search for R2 imports or references
rg -i "r2\|bucket\.get" --type py -B 2 -A 2
```
Repository: spcl/serverless-benchmarks
Length of output: 52
Fix: R2 get() returns R2ObjectBody which has no .bytes() method
The Cloudflare R2 API for Bucket.get() returns an R2ObjectBody that exposes arrayBuffer(), text(), json(), and blob(), but not bytes(). Line 108 will fail at runtime with "bytes is not a function".
Use arrayBuffer() instead:
Suggested fix
```python
data = await get_res.arrayBuffer()
return bytes(js.Uint8Array.new(data))
```
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/storage.py around lines 101 - 109, In
adownload_stream, get_res from get_bucket(...).get(key) is an R2ObjectBody and
does not have .bytes(); replace the await get_res.bytes() call with await
get_res.arrayBuffer() and convert the returned ArrayBuffer to Python bytes by
wrapping it in a JS Uint8Array (e.g., js.Uint8Array.new(array_buffer)) and then
calling bytes(...) on that result so the function returns proper binary data
instead of breaking at runtime.
Summary by CodeRabbit
New Features
Configuration
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.