vdusek
Contributor

@vdusek vdusek commented Sep 13, 2024

Description

  • The request queue (RQ) client's batch_add_requests can now handle more than 25 requests.
  • It is implemented the same way as in the TS client.
  • This applies to both the sync and async versions.
    • The sync version does not use multi-threading, since that is out of scope for this issue. Users who need performance should use the client's async version. If we decide it is worth implementing, let's open another issue.
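
As a rough illustration of the batching idea described above (not the code from this PR), the sketch below splits a request list into chunks of at most 25 and sends them with bounded concurrency. The helper names chunk, fake_send_batch and batch_add_sketch, the default max_parallel value, and the shape of the returned dictionary are assumptions made for the example; fake_send_batch stands in for the real API call.

```python
import asyncio
from typing import Any

# Per-call limit taken from the linked issue title.
MAX_BATCH_SIZE = 25


def chunk(items: list[Any], size: int) -> list[list[Any]]:
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i : i + size] for i in range(0, len(items), size)]


async def fake_send_batch(batch: list[dict]) -> dict:
    """Hypothetical stand-in for one API call; not part of apify_client."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    # Assumed response shape, used only for this illustration.
    return {'processedRequests': batch, 'unprocessedRequests': []}


async def batch_add_sketch(requests: list[dict], max_parallel: int = 5) -> dict:
    """Send all requests in chunks, with at most `max_parallel` calls in flight."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def send(batch: list[dict]) -> dict:
        async with semaphore:
            return await fake_send_batch(batch)

    batches = chunk(requests, MAX_BATCH_SIZE)
    # gather() returns results in the same order as the batches were passed in.
    results = await asyncio.gather(*(send(batch) for batch in batches))

    # Merge the per-batch responses into a single result.
    return {
        'processedRequests': [r for res in results for r in res['processedRequests']],
        'unprocessedRequests': [r for res in results for r in res['unprocessedRequests']],
    }
```

With 110 requests, as in the test sample, this produces five chunks: four of 25 and one of 10. A sync variant without multi-threading would simply loop over the chunks one at a time.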

Issues

Testing

It was tested on this code sample (plus the sync alternative):

import asyncio

from apify_client import ApifyClientAsync
from crawlee._utils.crypto import get_random_id

TOKEN = '...'


async def main() -> None:
    apify_client = ApifyClientAsync(token=TOKEN)
    rqs_client = apify_client.request_queues()
    rq = await rqs_client.get_or_create(name='my-rq')
    rq_client = apify_client.request_queue(rq['id'])

    print('Add a single request...')
    result_1 = await rq_client.add_request({'url': 'http://example.com', 'uniqueKey': get_random_id()})
    print(f'result: {result_1}')

    print('Add multiple requests...')
    requests = [
        {
            'url': f'https://example.com/{i}/',
            'uniqueKey': get_random_id(),
        }
        for i in range(110)
    ]
    result_2 = await rq_client.batch_add_requests(requests)
    print(f'result: {result_2}')


if __name__ == '__main__':
    asyncio.run(main())

Checklist

  • CI passed

@vdusek vdusek requested a review from janbuchar September 13, 2024 11:21
@vdusek vdusek self-assigned this Sep 13, 2024
@github-actions github-actions bot added this to the 98th sprint - Tooling team milestone Sep 13, 2024
@github-actions github-actions bot added the t-tooling Issues with this label are in the ownership of the tooling team. label Sep 13, 2024
@vdusek
Contributor Author

vdusek commented Sep 13, 2024

I'm not sure what to do about Redbaron and the check_async_docstrings check (which compares the sync and async docstrings), since the docstrings now differ because of the max_parallel argument.

@janbuchar
Contributor

@vdusek regarding Redbaron, that library hasn't had a release in ~5 years, so the issue has been there forever, I suspect. I believe it's safe to ignore that 🤷

If we must ensure parity of sync and async docstrings, I guess we can add that argument to the sync version and raise a NotImplementedError if someone tries to use it.
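
A minimal sketch of that idea, assuming a sync method that accepts max_parallel only for signature parity. The class SyncQueueClientSketch, the default value of 1, and the returned dictionary are hypothetical and do not reflect the real apify_client code.

```python
from typing import Any


class SyncQueueClientSketch:
    """Hypothetical stand-in for the sync client; not the real apify_client class."""

    def batch_add_requests(
        self,
        requests: list[dict[str, Any]],
        *,
        max_parallel: int = 1,
    ) -> dict[str, list]:
        """Add requests sequentially; `max_parallel` exists only for docstring parity."""
        if max_parallel != 1:
            # Parallel processing is not available in this sync sketch.
            raise NotImplementedError('max_parallel is only supported by the async client')

        # Placeholder for the sequential batching logic (assumed result shape).
        return {'processedRequests': list(requests), 'unprocessedRequests': []}
```

This keeps the two signatures and docstrings aligned while making the limitation explicit to anyone who passes a non-default value.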

Contributor

@janbuchar janbuchar left a comment

Looks sane, please consider my comments

@vdusek vdusek merged commit 9110ee0 into master Sep 17, 2024
16 checks passed
@vdusek vdusek deleted the batch-add-requests-handle-more-than-25-requests branch September 17, 2024 14:48

Development

Successfully merging this pull request may close these issues.

ApifyRequestQueueClient.batch_add_requests cannot handle more than 25 requests

2 participants