feat: Migrate Notion source to Connector V2 structure #162
Merged
59 commits
d618716
Mantain the current v1 file
5171d9b
finished
7b93cb6
Black format and changelog add
2946f52
Fix Makefile issue with lint
3fc17e8
more fixes
5d83d7c
restore main files
40bc3ff
More restored files from main
1682181
removed useless variable
cff69aa
Optimized imports
98e6398
few ruff fixes
7269c66
last ruff fix
8b844e0
version updated
698d715
added notion-client to base.in
16885f3
remove unused error
70ee2a5
changed reference
4bcde56
ruff fix
4be07b0
Fixed and saving files now
314a068
Merge branch 'main' into DS-87-notion-source-migration
9a4147e
addressed
7c5a5d9
Roman Access Config request addressed
4ffe9c6
Library type_check done
1fc45a4
black fix
f2bc82d
merge main solve conlicts
04f518f
version file matching
d43985e
More libraries that needed to be capsulated
d767980
merge main
622c05b
Remove leftover comment
3cbaa50
Multiple PR changes assigned
2e4f45b
merge main
7820f09
fixes
c0c7efe
tries
f58db9e
More Client
d5d3339
most done
0e000b1
missed this
3cc4086
trying
f82c0b0
black
9f02b74
version change
e4d8118
async client
2579ace
connector.py updates
2d4a1d7
autopep8 updates
b3802b2
merge main
502aa1a
Roman comments addressed
abd9f1a
version bump
d2263ea
params issue
450aff6
stop ignoring Notion
3672f3a
merge conflict
4b1e612
my bad, versions dont match
047dabf
Async Indexes, making it work
8705d0b
merge main to feature branch
bryan-unstructured 7550c53
migrate notion source connector to V2
bryan-unstructured 7df67c3
add integration tests for downloading notion database
bryan-unstructured 6bfe4eb
fix expected output files in notion e2e test
bryan-unstructured dba56b3
make sure the recursive child block getter to point at the next page …
bryan-unstructured 71b0c44
fix syntax
bryan-unstructured c0f7325
fix block retrieval logic
bryan-unstructured f99982b
remove unnecessary e2e test for notion connector
bryan-unstructured f90080a
Add more complex integration test
bryan-unstructured 806d140
update CHANGELOG
bryan-unstructured 5dc64e7
Merge branch 'main' into DS-87-notion-source-migration
bryan-unstructured
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,3 +1,9 @@ | ||
| ## 0.3.12-dev4 | ||
| | ||
| ### Enhancements | ||
| | ||
| * **Migrate Notion Source Connector to V2** | ||
| | ||
| ## 0.3.12-dev3 | ||
| | ||
| ### Enhancements | ||
| | ||
5 changes: 5 additions & 0 deletions
test/integration/connectors/expected_results/notion_database/directory_structure.json
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| { | ||
| "directory_structure": [ | ||
| "1572c3765a0a80d3a34ac5c0eecd1e88.html" | ||
| ] | ||
| } |
24 changes: 24 additions & 0 deletions
...nnectors/expected_results/notion_database/downloads/1572c3765a0a80d3a34ac5c0eecd1e88.html
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| <table> | ||
| <tr> | ||
| <th> | ||
| Author | ||
| </th> | ||
| <th> | ||
| Item | ||
| </th> | ||
| </tr> | ||
| <tr> | ||
| <td> | ||
| <div> | ||
| <span> | ||
| test-author | ||
| </span> | ||
| </div> | ||
| </td> | ||
| <td> | ||
| <div> | ||
| test-page-in-database | ||
| </div> | ||
| </td> | ||
| </tr> | ||
| </table> |
39 changes: 39 additions & 0 deletions
...nnectors/expected_results/notion_database/file_data/1572c3765a0a80d3a34ac5c0eecd1e88.json
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| { | ||
| "identifier": "1572c3765a0a80d3a34ac5c0eecd1e88", | ||
| "connector_type": "notion", | ||
| "source_identifiers": { | ||
| "filename": "1572c3765a0a80d3a34ac5c0eecd1e88.html", | ||
| "fullpath": "1572c3765a0a80d3a34ac5c0eecd1e88.html", | ||
| "rel_path": "1572c3765a0a80d3a34ac5c0eecd1e88.html" | ||
| }, | ||
| "metadata": { | ||
| "url": null, | ||
| "version": null, | ||
| "record_locator": { | ||
| "database_id": "1572c3765a0a80d3a34ac5c0eecd1e88" | ||
| }, | ||
| "date_created": "2024-12-09T11:54:00.000Z", | ||
| "date_modified": "2025-01-05T18:31:00.000Z", | ||
| "date_processed": "1736123419.51279", | ||
| "permissions_data": null, | ||
| "filesize_bytes": null | ||
| }, | ||
| "additional_metadata": { | ||
| "created_by": { | ||
| "id": "118d872b-594c-8171-b46f-00020d10d8b2", | ||
| "object": "user" | ||
| }, | ||
| "last_edited_by": { | ||
| "id": "118d872b-594c-8171-b46f-00020d10d8b2", | ||
| "object": "user" | ||
| }, | ||
| "parent": { | ||
| "page_id": "1572c376-5a0a-80d8-9619-cb35a622b8cc", | ||
| "type": "page_id" | ||
| }, | ||
| "url": "https://www.notion.so/1572c3765a0a80d3a34ac5c0eecd1e88" | ||
| }, | ||
| "reprocess": false, | ||
| "local_download_path": "/private/var/folders/h7/n848df9s5yn7ml8rxb61vhyc0000gp/T/tmp_lvvqhyy/1572c3765a0a80d3a34ac5c0eecd1e88.html", | ||
| "display_name": null | ||
| } |
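
The `date_processed` value and the temporary directory in `local_download_path` belong to the run that produced this fixture, so an exact comparison against a fresh run would presumably fail on those two keys. A minimal sketch of a comparison that drops them first; `VOLATILE_KEYS` and `strip_volatile` are hypothetical names chosen for illustration, not the repository's test utilities, and the second record is made up:

```python
import json
from typing import Any

# Assumption: these are the only run-specific keys in a file_data record.
VOLATILE_KEYS = {"date_processed", "local_download_path"}


def strip_volatile(value: Any) -> Any:
    """Recursively drop run-specific keys from a parsed JSON value."""
    if isinstance(value, dict):
        return {
            key: strip_volatile(item)
            for key, item in value.items()
            if key not in VOLATILE_KEYS
        }
    if isinstance(value, list):
        return [strip_volatile(item) for item in value]
    return value


# Trimmed copy of the fixture above (the download path is a placeholder).
expected = {
    "identifier": "1572c3765a0a80d3a34ac5c0eecd1e88",
    "connector_type": "notion",
    "metadata": {
        "record_locator": {"database_id": "1572c3765a0a80d3a34ac5c0eecd1e88"},
        "date_processed": "1736123419.51279",
    },
    "local_download_path": "/tmp/run-a/1572c3765a0a80d3a34ac5c0eecd1e88.html",
}

# Made-up second run: same record, different timestamp and temp directory.
actual = json.loads(json.dumps(expected))  # cheap deep copy
actual["metadata"]["date_processed"] = "1736200000.0"
actual["local_download_path"] = "/tmp/run-b/1572c3765a0a80d3a34ac5c0eecd1e88.html"

records_match = strip_volatile(expected) == strip_volatile(actual)
print(records_match)
```

Filtering by key name at every nesting level keeps the helper short, at the cost of also dropping a same-named key that might be stable elsewhere in the record.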
5 changes: 5 additions & 0 deletions
test/integration/connectors/expected_results/notion_page/directory_structure.json
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| { | ||
| "directory_structure": [ | ||
| "1572c3765a0a806299f0dd6999f9e4c7.html" | ||
| ] | ||
| } |
149 changes: 149 additions & 0 deletions
...n/connectors/expected_results/notion_page/downloads/1572c3765a0a806299f0dd6999f9e4c7.html
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,149 @@ | ||
| <html> | ||
| <head> | ||
| <title> | ||
| test-doc1 | ||
| </title> | ||
| </head> | ||
| <body> | ||
| <p> | ||
| test-doc1 | ||
| </p> | ||
| <div> | ||
| <br/> | ||
| <div> | ||
| testtext2 testtext2 testtext2 testtext2 testtext2 testtext2 testtext2 testtext2 testtext2 testtext2 | ||
| </div> | ||
| <div> | ||
| testtext2 testtext2 testtext2 testtext2 testtext2 testtext2 testtext2 testtext2 testtext2 testtext2 | ||
| </div> | ||
| <img src='https://prod-files-secure.s3.us-west-2.amazonaws.com/9e97f74a-ce4a-43ae-b704-4b8501948642/902effc2-1280-4e9c-92cc-77d940b24ac0/image.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIAT73L2G45FSPPWI6X%2F20250106%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20250106T003029Z&X-Amz-Expires=3600&X-Amz-Signature=db71dd59388022b1b5d64115da9daebb52b5c547a709c80552e1b0004a6e1968&X-Amz-SignedHeaders=host&x-id=GetObject'/> | ||
| <div> | ||
| <ol style='margin-left: 0px'> | ||
| <li> | ||
| Testdoc2 List Item 1 | ||
| </li> | ||
| </ol> | ||
| <ol style='margin-left: 10px'> | ||
| <li> | ||
| Testdoc2 List Item 1 Nested Item A | ||
| </li> | ||
| <li> | ||
| Testdoc2 List Item 1 Nested Item B | ||
| </li> | ||
| </ol> | ||
| </div> | ||
| <ol style='margin-left: 0px'> | ||
| <li> | ||
| Testdoc2 List Item 2 | ||
| </li> | ||
| <li> | ||
| Testdoc2 List Item 3 | ||
| </li> | ||
| </ol> | ||
| <br/> | ||
| <div> | ||
| <input type='checkbox'/> | ||
| | ||
| </div> | ||
| <div> | ||
| Testdoc2 Checklist Item 1 | ||
| </div> | ||
| <div> | ||
| <input type='checkbox'/> | ||
| | ||
| </div> | ||
| <div> | ||
| Testdoc2 Checklist Item 2 (checked) | ||
| </div> | ||
| <div> | ||
| | ||
| </div> | ||
| <img src='https://pf-emoji-service--cdn.us-east-1.prod.public.atl-paas.net/standard/caa27a19-fc09-4452-b2b4-a301552fd69c/32x32/1f603.png'/> | ||
| <img src='https://pf-emoji-service--cdn.us-east-1.prod.public.atl-paas.net/standard/caa27a19-fc09-4452-b2b4-a301552fd69c/32x32/1f603.png'/> | ||
| <div> | ||
| <b> | ||
| Testdoc2 bold text | ||
| </b> | ||
| </div> | ||
| <div> | ||
| <i> | ||
| Testdoc2 italic text | ||
| </i> | ||
| </div> | ||
| <div> | ||
| <b> | ||
| Testdoc2 Heading 1 Sized Text | ||
| </b> | ||
| </div> | ||
| <div> | ||
| <b> | ||
| Testdoc2 Heading 2 Sized Text | ||
| </b> | ||
| </div> | ||
| <div> | ||
| Testdoc2 Heading 3 Sized Text | ||
| </div> | ||
| <div> | ||
| Testdoc2 Heading 4 Sized Text | ||
| </div> | ||
| <div> | ||
| Testdoc2 Heading 5 Sized Text | ||
| </div> | ||
| <table> | ||
| <tr> | ||
| <td> | ||
| <b> | ||
| Testdoc2 Table: Column 1 Row 0 | ||
| </b> | ||
| </td> | ||
| <td> | ||
| <b> | ||
| Testdoc2 Table: Column 2 Row 0 | ||
| </b> | ||
| </td> | ||
| <td> | ||
| <b> | ||
| Testdoc2 Table: Column 3 Row 0 | ||
| </b> | ||
| </td> | ||
| </tr> | ||
| <tr> | ||
| <td> | ||
| <b> | ||
| Testdoc2 Table: Column 1 Row 1 | ||
| </b> | ||
| </td> | ||
| <td> | ||
| <b> | ||
| Testdoc2 Table: Column 2 Row 1 | ||
| </b> | ||
| </td> | ||
| <td> | ||
| <b> | ||
| Testdoc2 Table: Column 3 Row 1 | ||
| </b> | ||
| </td> | ||
| </tr> | ||
| <tr> | ||
| <td> | ||
| <b> | ||
| Testdoc2 Table: Column 1 Row 2 | ||
| </b> | ||
| </td> | ||
| <td> | ||
| <b> | ||
| Testdoc2 Table: Column 2 Row 2 | ||
| </b> | ||
| </td> | ||
| <td> | ||
| <b> | ||
| Testdoc2 Table: Column 3 Row 2 | ||
| </b> | ||
| </td> | ||
| </tr> | ||
| </table> | ||
| <img src='https://prod-files-secure.s3.us-west-2.amazonaws.com/9e97f74a-ce4a-43ae-b704-4b8501948642/bef64626-dfad-4bf2-9486-0fe380e90e4f/image.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIAT73L2G45FSPPWI6X%2F20250106%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20250106T003030Z&X-Amz-Expires=3600&X-Amz-Signature=8c2fb5310dc771d2bad1f4ff16a68781e8ee13e243b25865d1a084cc8dcaffa1&X-Amz-SignedHeaders=host&x-id=GetObject'/> | ||
| <br/> | ||
| </div> | ||
| </body> | ||
| </html> |
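
The two S3 image URLs in this expected output carry `X-Amz-Date`, `X-Amz-Signature`, and `X-Amz-Expires=3600` query parameters, so they are presumably regenerated on each download and would differ between runs. A hedged sketch of one way to compare such HTML: drop the query string from every `src` attribute before comparing. `drop_query` and `normalize_img_urls` are hypothetical helpers, not part of the connector or its tests, and the sample paths and signatures are invented:

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches src='...' or src="..." and captures the quote character and the URL.
SRC_ATTR = re.compile(r"""src=(['"])(.*?)\1""")


def drop_query(url: str) -> str:
    """Return the URL without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def normalize_img_urls(html: str) -> str:
    """Strip signed query parameters from every src attribute in the HTML."""
    return SRC_ATTR.sub(
        lambda m: f"src={m.group(1)}{drop_query(m.group(2))}{m.group(1)}",
        html,
    )


# Two invented downloads of the same image, signed at different times.
run_a = (
    "<img src='https://prod-files-secure.s3.us-west-2.amazonaws.com/a/b/image.png"
    "?X-Amz-Date=20250106T003029Z&X-Amz-Expires=3600&X-Amz-Signature=aaaa'/>"
)
run_b = (
    "<img src='https://prod-files-secure.s3.us-west-2.amazonaws.com/a/b/image.png"
    "?X-Amz-Date=20250107T120000Z&X-Amz-Expires=3600&X-Amz-Signature=bbbb'/>"
)

same_after_normalizing = normalize_img_urls(run_a) == normalize_img_urls(run_b)
print(normalize_img_urls(run_a))
```

A regular expression is enough here because the fixture's `src` values contain no quote characters; a real HTML parser would be the safer choice for arbitrary markup.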
39 changes: 39 additions & 0 deletions
...n/connectors/expected_results/notion_page/file_data/1572c3765a0a806299f0dd6999f9e4c7.json
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| { | ||
| "identifier": "1572c3765a0a806299f0dd6999f9e4c7", | ||
| "connector_type": "notion", | ||
| "source_identifiers": { | ||
| "filename": "1572c3765a0a806299f0dd6999f9e4c7.html", | ||
| "fullpath": "1572c3765a0a806299f0dd6999f9e4c7.html", | ||
| "rel_path": "1572c3765a0a806299f0dd6999f9e4c7.html" | ||
| }, | ||
| "metadata": { | ||
| "url": null, | ||
| "version": null, | ||
| "record_locator": { | ||
| "page_id": "1572c3765a0a806299f0dd6999f9e4c7" | ||
| }, | ||
| "date_created": "2024-12-09T18:13:00.000Z", | ||
| "date_modified": "2024-12-30T15:16:00.000Z", | ||
| "date_processed": "1736123422.122014", | ||
| "permissions_data": null, | ||
| "filesize_bytes": null | ||
| }, | ||
| "additional_metadata": { | ||
| "created_by": { | ||
| "id": "118d872b-594c-8171-b46f-00020d10d8b2", | ||
| "object": "user" | ||
| }, | ||
| "last_edited_by": { | ||
| "id": "118d872b-594c-8171-b46f-00020d10d8b2", | ||
| "object": "user" | ||
| }, | ||
| "parent": { | ||
| "page_id": "1182c376-5a0a-8042-9a2a-fb003e00d57b", | ||
| "type": "page_id" | ||
| }, | ||
| "url": "https://www.notion.so/test-doc1-1572c3765a0a806299f0dd6999f9e4c7" | ||
| }, | ||
| "reprocess": false, | ||
| "local_download_path": "/private/var/folders/h7/n848df9s5yn7ml8rxb61vhyc0000gp/T/tmp59aqv6nt/1572c3765a0a806299f0dd6999f9e4c7.html", | ||
| "display_name": null | ||
| } |
This should be replacing the `## 0.3.12-dev3` line.
Sorry, can you please elaborate on why we want to replace the `## 0.3.12-dev3` line?
`0.3.12-dev3` is about the Vectara Connector migration.
Because those all get squashed into a single `0.3.12` changelog entry when we cut a release.
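
To illustrate the squash described in this reply, here is a rough sketch that merges every `## <version>-devN` section into one `## <version>` entry. It is not the project's release tooling: `squash_dev_entries` is a hypothetical helper, it assumes the dev sections sit at the top of the file, and the Vectara and `0.3.11` bullets in the sample are placeholder wording.

```python
import re

# Matches headings such as "## 0.3.12-dev4" and captures the base version.
DEV_HEADING = re.compile(r"^## (\d+\.\d+\.\d+)-dev\d+\s*$")


def squash_dev_entries(changelog: str, version: str) -> str:
    """Merge all '## <version>-devN' sections into one '## <version>' section."""
    sections = {}  # '### ...' subsection heading -> bullets, in first-seen order
    rest = []      # every line outside the dev sections, kept verbatim
    in_dev = False
    current = None
    for line in changelog.splitlines():
        match = DEV_HEADING.match(line)
        if match and match.group(1) == version:
            in_dev, current = True, None
            continue
        if line.startswith("## "):
            in_dev = False  # a released version's section starts here
        if not in_dev:
            rest.append(line)
        elif line.startswith("### "):
            current = line
            sections.setdefault(current, [])
        elif line.strip() and current is not None:
            sections[current].append(line)
    merged = [f"## {version}", ""]
    for heading, bullets in sections.items():
        merged += [heading, "", *bullets, ""]
    return "\n".join(merged + rest)


sample = """\
## 0.3.12-dev4

### Enhancements

* **Migrate Notion Source Connector to V2**

## 0.3.12-dev3

### Enhancements

* **Vectara connector migration (placeholder wording)**

## 0.3.11

### Fixes

* **Placeholder entry for an earlier release**
"""

squashed = squash_dev_entries(sample, "0.3.12")
print(squashed)
```

Because every dev heading disappears at release time, it makes little difference to the released changelog whether a pull request adds a new `-devN` heading or replaces the previous one, which seems to be the point the reviewer is making.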