feat(source-Monday): Update CDK to v6#55225
feat(source-Monday): Update CDK to v6#55225Anatolii Yatsuk (tolik0) merged 32 commits intomasterfrom
Conversation
|
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
|
Anatolii Yatsuk (@tolik0) is this ready to go? If so, David Gold (@dbgold17) can you review this and see if we should merge this first and apply your changes on top, so you get to work with the new CDK? |
David Gold (dbgold17)
left a comment
There was a problem hiding this comment.
Hi Anatolii Yatsuk (@tolik0) , I left a bunch of comments, a lot of small things and then some which have to do with me not understanding the changes you're making. Would you be able to write a couple sentences on the PR description about the structural changes you needed to make in order to update the CDK to version 6? It would be super helpful for me and for others looking at this code in the future :)
Also, how have you tested these changes? Do you feel confident that they won't break anything for uses?
| python = "^3.9,<3.12" | ||
| airbyte-cdk = "0.78.6" # Breaks with newer versions of the CDK | ||
| python = ">=3.10,<3.12" | ||
| airbyte_cdk = { path = "/home/anatolii/airbytehq/airbyte/airbyte-cdk/python/airbyte-python-cdk/", develop = true } |
There was a problem hiding this comment.
this should be replaced with a public package version right?
There was a problem hiding this comment.
Added dev version
| connectorType: source | ||
| definitionId: 80a54ea2-9959-4040-aac1-eee42423ec9b | ||
| dockerImageTag: 2.1.13 | ||
| dockerImageTag: 2.2.0 |
There was a problem hiding this comment.
is there consensus about whether to bump minor or patch number in this version?
There was a problem hiding this comment.
This should be the minor update
| return extracted if isinstance(extracted, list) else [extracted] | ||
| return [] | ||
| decoded_response = self.decoder.decode(response) | ||
| for response_body in decoded_response: |
There was a problem hiding this comment.
why are you now looping over this response where we weren't before?
There was a problem hiding this comment.
The underlying CDK class changed
airbyte-integrations/connectors/source-monday/source_monday/graphql_requester.py
Outdated
Show resolved
Hide resolved
| last_page_size: int, | ||
| last_record: Optional[Record], | ||
| last_page_token_value: Optional[Any]) -> Optional[Tuple[Optional[int], Optional[int]]]: | ||
| """ |
There was a problem hiding this comment.
update docstring to conform with new interface.
| self._sub_page: Optional[int] = self.start_from_page | ||
|
|
||
| def next_page_token(self, response, last_records: List[Mapping[str, Any]]) -> Optional[Tuple[Optional[int], Optional[int]]]: | ||
| def next_page_token(self, |
There was a problem hiding this comment.
why is this change necessary? It doesn't seem like you're using any of these arguments.
If it's only in order to conform to some interface spec, can you note that in a comment?
There was a problem hiding this comment.
Yes, the underlying class in the CDK changed, so we need to align with it
| response_decoded = self.decoder.decode(response) | ||
| for response_body in response_decoded: | ||
| if not response_body["data"]["boards"]: | ||
| yield from [] |
There was a problem hiding this comment.
do you actually want to return a generator object with nothing in it here or just continue?
There was a problem hiding this comment.
Fixed
| @@ -37,32 +37,30 @@ class MondayActivityExtractor(RecordExtractor): | |||
| decoder: Decoder = JsonDecoder(parameters={}) | |||
|
|
|||
| def extract_records(self, response: requests.Response) -> List[Record]: | |||
There was a problem hiding this comment.
should the return type annotation be updated?
There was a problem hiding this comment.
Fixed
|
David Gold (@dbgold17) Sorry, this PR is not in review yet. I still need to add StateDelegatingRetriever, which hasn’t been merged yet. |
Sounds good, sorry for the premature review! |
|
/format-fix
|
|
/format-fix
|
… into tolik0/source-monday/add-grouping-partition-router
Brian Lai (brianjlai)
left a comment
There was a problem hiding this comment.
one suggestions on refactoring try_extract_records() to process records iteratively since the underlying call became a generator, but otherwise no other comments and looks good to me!
| return [x for x in extracted if x] | ||
| return extracted if isinstance(extracted, list) else [extracted] | ||
| return [] | ||
| decoded_response = self.decoder.decode(response) |
There was a problem hiding this comment.
because self.decoder.decode(response) has been changed to become a generator, can we refactor this method to also iterate over the generator and yield from extracted below instead of storing them in the result list?
There was a problem hiding this comment.
Fixed
| values["text"] = display_value | ||
|
|
||
| return result | ||
| yield from result |
There was a problem hiding this comment.
In addition to the above comment, if we refactor self.try_extract_records() to be an iterable, we should also yield record at the end of each iteration of the for loop so we don't have to keep everything in memory with result
There was a problem hiding this comment.
Fixed
| concurrency_level: | ||
| type: ConcurrencyLevel | ||
| default_concurrency: "{{ config.get('num_workers', 4) }}" | ||
| max_concurrency: 40 |
There was a problem hiding this comment.
Monday actually has a pretty detailed section on rate limits: https://developer.monday.com/api-reference/docs/rate-limits
40 actually seems like an acceptable threshold since they call out only allowing for at most 40 concurrent requests at any given moment. But let's directly reference https://developer.monday.com/api-reference/docs/rate-limits#concurrency-limit this in the comment
|
/format-fix
|
Brian Lai (brianjlai)
left a comment
There was a problem hiding this comment.
i had a couple small questions, but i'm fine with just answering or addressing and no need to block. ![]()
| else: | ||
| path = [p.eval(self.config) for p in field_path] | ||
| if "*" in path: | ||
| extracted = dpath.values(body, path) |
There was a problem hiding this comment.
small nit/question, why the switch to use dpath.values() instead of dpath.util.values()? same for the below get() as well
There was a problem hiding this comment.
Updated according to the latest implementation of DpathExtractor. Here is the PR for the util change: #38847
| result = self.try_extract_records(response, self.field_path_incremental) | ||
|
|
||
| for record in result: | ||
| if "updated_at" in record: |
There was a problem hiding this comment.
why don't we have to edit some of the fields on each record object anymore?
There was a problem hiding this comment.
I moved this to transformations
There was a problem hiding this comment.
ah got it, must've missed that. thanks for clarifying
|
/format-fix
|





What
Update CDK to v6.
How
boardsanditemsstream toStateDelegatingStream.User Impact
The sync should become quicker as the source is now concurrent.
Can this PR be safely reverted and rolled back?