Commit 7f584d1

Confluence source connector: now accepts username/password, username/API token, or PAT for authentication (#455)
1 parent 0373070 commit 7f584d1

8 files changed: 156 additions and 43 deletions.
Lines changed: 18 additions & 3 deletions
@@ -1,7 +1,22 @@
11
- `<name>` (_required_) - A unique name for this connector.
22
- `<url>` (_required_) - The URL to the target Confluence Cloud instance.
3-
- `<user-email>` (_required_) - The email address of the user who has access to the instance.
4-
- `<api-token>` (_required_) - The Confluence API token that provides access to the instance.
53
- `<max-num-of-spaces>` - The maximum number of Confluence spaces to access within the Confluence Cloud instance. The default is `500` unless otherwise specified.
64
- `<max-num-of-docs-from-each-space>` - The maximum number of documents to access within each space. The default is `150` unless otherwise specified.
7-
- `spaces` is an array of strings, with each `<space-name>` specifying the name of a space to access, for example: `["luke","paul"]`. By default, if no space names are specified, and the `<max-num-of-spaces>` is exceeded for the instance, be aware that you might get unexpected results.
5+
- `spaces` is an array of strings, with each `<space-name>` specifying the name of a space to access, for example: `["luke","paul"]`. By default, if no space names are specified, and the `<max-num-of-spaces>` is exceeded for the instance, be aware that you might get unexpected results.
6+
7+
For API token authentication:
8+
9+
- `<username>` - The name or email address of the target user.
10+
- `<api-token>` - The user's API token value.
11+
- For `cloud`, `true` if you are using Confluence Cloud. The default is `false` if not otherwise specified.
12+
13+
For personal access token (PAT) authentication:
14+
15+
- `<personal-access-token>` - The target user's PAT value.
16+
- `cloud` should always be `false`.
17+
18+
For password authentication:
19+
20+
- `<username>` - The name or email address of the target user.
21+
- `<password>` - The user's password.
22+
- For `cloud`, `true` if you are using Confluence Cloud. The default is `false` if not otherwise specified.
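
The three field combinations above can be checked before a connector is created. The following is a minimal sketch, not part of the connector itself: `confluence_auth_method` is a hypothetical helper, and rejecting a password supplied together with a token is an assumption rather than documented behavior.

```python
from typing import Optional


def confluence_auth_method(
    username: Optional[str] = None,
    password: Optional[str] = None,
    token: Optional[str] = None,
    cloud: bool = False,
) -> str:
    """Classify connector fields as 'api_token', 'pat', or 'password'.

    Hypothetical helper for illustration; it is not part of any library.
    """
    if password and token:
        # Assumption: the two secrets are mutually exclusive.
        raise ValueError("Provide either a password or a token, not both.")
    if token and username:
        # API token authentication: username plus token; cloud may be true.
        return "api_token"
    if token:
        # PAT authentication: token only, and cloud should always be false.
        if cloud:
            raise ValueError("PAT authentication requires cloud to be false.")
        return "pat"
    if username and password:
        # Password authentication: username plus password; cloud may be true.
        return "password"
    raise ValueError("Provide a token, or a username with a password.")
```

Under these assumptions, a token on its own is treated as a PAT, and adding a username turns the same call into API token authentication.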

snippets/general-shared-text/confluence-cli-api.mdx

Lines changed: 5 additions & 5 deletions
@@ -10,16 +10,16 @@ import AdditionalIngestDependencies from '/snippets/general-shared-text/ingest-d
1010

1111
The following environment variables:
1212

13-
- `CONFLUENCE_URL` - The URL of the Confluence instance, represented by `--url` (CLI) or `url` (Python).
13+
- `CONFLUENCE_URL` - The target Confluence site's URL, represented by `--url` (CLI) or `url` (Python).
1414
- One of the following:
1515

16-
- `CONFLUENCE_API_TOKEN` - The value of the Confluence API token for authenticating with the Confluence instance, represented by `--api-token` (CLI) or `api_token` (Python).
17-
- `CONFLUENCE_ACCESS_TOKEN` - The value of the Confluence personal access token for authenticating with the Confluence instance, represented by `--access-token` (CLI) or `access_token` (Python).
18-
19-
- `CONFLUENCE_USER_EMAIL` - The user's email address for authenticating with the Confluence instance, represented by `--user-email` (CLI) or `user_email` (Python).
16+
- For API token authentication: `CONFLUENCE_USERNAME` and `CONFLUENCE_TOKEN` - The name or email address, and API token of the target Confluence user, represented by `--username` (CLI) or `username` (Python) and `--token` (CLI) or `token` (Python), respectively.
17+
- For personal access token (PAT) authentication: `CONFLUENCE_TOKEN` - The PAT for the target Confluence user, represented by `--token` (CLI) or `token` (Python).
18+
- For password authentication: `CONFLUENCE_USERNAME` and `CONFLUENCE_PASSWORD` - The name or email address, and password of the target Confluence user, represented by `--username` (CLI) or `username` (Python) and `--password` (CLI) or `password` (Python), respectively.
2019

2120
Additional settings include:
2221

2322
- `--spaces` (CLI) or `spaces` (Python): Optionally, the list of the names of the specific spaces to access, expressed as a comma-separated list of strings (CLI) or an array of strings (Python), with each string representing a space's name. The default is no specific spaces, if not otherwise specified.
2423
- `--max-num-of-spaces` (CLI) or `max_num_of_spaces` (Python): Optionally, the maximum number of spaces to access, expressed as an integer. The default value is `500` if not otherwise specified.
2524
- `--max-num-of-docs-from-each-space` (CLI) or `max_num_of_docs_from_each_space` (Python): Optionally, the maximum number of documents to access from each space, expressed as an integer. The default value is `100` if not otherwise specified.
25+
- `--cloud` or `--no-cloud` (CLI) or `cloud` (Python): Optionally, whether to use Confluence Cloud (`--cloud` for CLI or `cloud=True` for Python). The default is `--no-cloud` (CLI) or `cloud=False` (Python) if not otherwise specified.
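
As a rough illustration of how the environment variables listed above map onto the Python parameter names, the sketch below collects whichever variables are set and checks that one complete set of credentials is present. `confluence_settings_from_env` is a hypothetical helper, not part of the ingest library, and the error messages are illustrative only.

```python
import os
from typing import Dict, Mapping


def confluence_settings_from_env(
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Map the documented environment variables to Python parameter names.

    Hypothetical helper for illustration; it is not part of any library.
    """
    url = env.get("CONFLUENCE_URL")
    if not url:
        raise ValueError("CONFLUENCE_URL is required.")
    settings = {"url": url}

    # Environment variable name -> Python parameter name, as documented above.
    names = {
        "CONFLUENCE_USERNAME": "username",
        "CONFLUENCE_TOKEN": "token",
        "CONFLUENCE_PASSWORD": "password",
    }
    for variable, parameter in names.items():
        value = env.get(variable)
        if value:
            settings[parameter] = value

    # A token (API token or PAT) or a username/password pair must be present.
    has_token = "token" in settings
    has_password = "username" in settings and "password" in settings
    if not (has_token or has_password):
        raise ValueError("Set CONFLUENCE_TOKEN, or CONFLUENCE_USERNAME and CONFLUENCE_PASSWORD.")
    return settings
```

Passing a plain dictionary instead of `os.environ` makes the mapping easy to try without touching the real environment.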
Lines changed: 6 additions & 6 deletions
@@ -1,13 +1,13 @@
11
Fill in the following fields:
22

33
- **Name** (_required_): A unique name for this connector.
4-
- **URL** (_required_): The URL to the target Confluence Cloud instance.
5-
- **User Email** (_required_): The email address of the user who has access to the instance.
6-
- **API Token** (_required_): The Confluence API token that provides access to the instance.
7-
- **Max Number of Spaces**: The maximum number of Confluence spaces to access within the Confluence Cloud instance.
4+
- **URL** (_required_): The target Confluence site's URL.
5+
- For personal access token (PAT) authentication: for **Authentication Method**, select **Personal Access Token**. Then enter the PAT into the **Personal Access Token** field.
6+
- For API token or password authentication: for **Authentication Method**, select **Password or API token**. Then enter the user's name or email address into the **Username** field and the API token or password into the **Password** field. Also, if you are using Confluence Cloud, check the **Cloud** box.
7+
- **Max number of spaces**: The maximum number of Confluence spaces to access within the Confluence Cloud instance.
88
The default is 500 unless otherwise specified.
9-
- **Max Number of Docs Per Space**: The maximum number of documents to access within each space.
9+
- **Max number of docs per space**: The maximum number of documents to access within each space.
1010
The default is 150 unless otherwise specified.
11-
- **List of Spaces**: A comma-separated string that lists the names of all of the spaces to access, for example: `luke,paul`.
11+
- **List of spaces**: A comma-separated string that lists the names of all of the spaces to access, for example: `luke,paul`.
1212
By default, if no space names are specified, and the **Max Number of Spaces** is reached for the instance, be aware that you might get
1313
unexpected results.
Lines changed: 25 additions & 13 deletions
@@ -1,3 +1,27 @@
1+
- A [Confluence Cloud account](https://www.atlassian.com/software/confluence/pricing) or
2+
[Confluence Data Center installation](https://confluence.atlassian.com/doc/installing-confluence-data-center-203603.html).
3+
- The site URL for your [Confluence Cloud account](https://community.atlassian.com/t5/Confluence-questions/confluence-cloud-url/qaq-p/1157148) or
4+
[Confluence Data Center installation](https://confluence.atlassian.com/confkb/how-to-find-your-site-url-to-set-up-the-confluence-data-center-and-server-mobile-app-938025792.html).
5+
- A user in your [Confluence Cloud account](https://confluence.atlassian.com/cloud/invite-edit-and-remove-users-744721624.html) or
6+
[Confluence Data Center installation](https://confluence.atlassian.com/doc/add-and-invite-users-138313.html).
7+
- The user must have the correct permissions in your
8+
[Confluence Cloud account](https://support.atlassian.com/confluence-cloud/docs/what-are-confluence-cloud-permissions-and-restrictions/) or
9+
[Confluence Data Center installation](https://confluence.atlassian.com/doc/permissions-and-restrictions-139557.html) to
10+
access the target spaces and pages.
11+
- One of the following:
12+
13+
- For Confluence Cloud or Confluence Data Center, the target user's name or email address, and password.
14+
[Change a Confluence Cloud user's password](https://support.atlassian.com/confluence-cloud/docs/change-your-confluence-password/).
15+
[Change a Confluence Data Center user's password](https://confluence.atlassian.com/doc/change-your-password-139416.html).
16+
- For Confluence Cloud only, the target user's name or email address, and API token.
17+
[Create an API token](https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/).
18+
- For Confluence Data Center only, the target user's personal access token (PAT).
19+
[Create a PAT](https://confluence.atlassian.com/enterprise/using-personal-access-tokens-1026032365.html).
20+
21+
- Optionally, the names of the specific [spaces](https://support.atlassian.com/confluence-cloud/docs/navigate-spaces/) in the Confluence instance to access.
22+
23+
The following video provides related setup information for Confluence Cloud:
24+
125
<iframe
226
width="560"
327
height="315"
@@ -6,16 +30,4 @@ title="YouTube video player"
630
frameborder="0"
731
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
832
allowfullscreen
9-
></iframe>
10-
11-
- A [Confluence account](https://www.atlassian.com/software/confluence).
12-
- The URL for the target Confluence instance.
13-
- A [Confluence user](https://confluence.atlassian.com/doc/add-and-invite-users-138313.html) with
14-
the correct [permissions](https://support.atlassian.com/confluence-cloud/docs/what-are-confluence-cloud-permissions-and-restrictions/) to
15-
access the target spaces and pages in the Confluence instance.
16-
- One of the following:
17-
18-
- A [Confluence Cloud API token](https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/) for the target Confluence user.
19-
- For Unstructured Ingest only, a [Confluence personal access token](https://confluence.atlassian.com/enterprise/using-personal-access-tokens-1026032365.html) for the target Confluence user.
20-
21-
- Optionally, the names of the specific [spaces](https://support.atlassian.com/confluence-cloud/docs/navigate-spaces/) in the Confluence instance to access.
33+
></iframe>
Lines changed: 39 additions & 5 deletions
@@ -1,18 +1,52 @@
11
```bash CLI
22
#!/usr/bin/env bash
33

4+
# For API token authentication:
45
unstructured-ingest \
56
confluence \
7+
--token $CONFLUENCE_TOKEN \
68
--url $CONFLUENCE_URL \
9+
--username $CONFLUENCE_USERNAME \
10+
--cloud \
711
--spaces luke,paul \
8-
--user-email $CONFLUENCE_USER_EMAIL \
9-
--api-token $CONFLUENCE_API_TOKEN \
10-
# Or:
11-
# --access-token $CONFLUENCE_ACCESS_TOKEN \
12+
--max-num-of-spaces 500 \
13+
--max-num-of-docs-from-each-space 150 \
1214
--output-dir $LOCAL_FILE_OUTPUT_DIR \
1315
--partition-by-api \
1416
--api-key $UNSTRUCTURED_API_KEY \
1517
--partition-endpoint $UNSTRUCTURED_API_URL \
1618
--chunking-strategy by_title \
1719
--embedding-provider huggingface
18-
```
20+
21+
# For personal access token (PAT) authentication:
22+
unstructured-ingest \
23+
confluence \
24+
--token $CONFLUENCE_TOKEN \
25+
--url $CONFLUENCE_URL \
26+
--spaces luke,paul \
27+
--max-num-of-spaces 500 \
28+
--max-num-of-docs-from-each-space 150 \
29+
--output-dir $LOCAL_FILE_OUTPUT_DIR \
30+
--partition-by-api \
31+
--api-key $UNSTRUCTURED_API_KEY \
32+
--partition-endpoint $UNSTRUCTURED_API_URL \
33+
--chunking-strategy by_title \
34+
--embedding-provider huggingface
35+
36+
# For password authentication:
37+
unstructured-ingest \
38+
confluence \
39+
--password $CONFLUENCE_PASSWORD \
40+
--url $CONFLUENCE_URL \
41+
--username $CONFLUENCE_USERNAME \
42+
--no-cloud \
43+
--spaces luke,paul \
44+
--max-num-of-spaces 500 \
45+
--max-num-of-docs-from-each-space 150 \
46+
--output-dir $LOCAL_FILE_OUTPUT_DIR \
47+
--partition-by-api \
48+
--api-key $UNSTRUCTURED_API_KEY \
49+
--partition-endpoint $UNSTRUCTURED_API_URL \
50+
--chunking-strategy by_title \
51+
--embedding-provider huggingface
52+
```

snippets/source_connectors/confluence.v2.py.mdx

Lines changed: 27 additions & 5 deletions
@@ -22,18 +22,40 @@ if __name__ == "__main__":
2222
Pipeline.from_configs(
2323
context=ProcessorConfig(),
2424
indexer_config=ConfluenceIndexerConfig(
25-
spaces=["luke", "paul"]
25+
spaces=["luke", "paul"],
26+
max_num_of_spaces=500,
27+
max_num_of_docs_from_each_space=150
2628
),
2729
downloader_config=ConfluenceDownloaderConfig(download_dir=os.getenv("LOCAL_FILE_DOWNLOAD_DIR")),
30+
31+
# For API token authentication:
2832
source_connection_config=ConfluenceConnectionConfig(
2933
access_config=ConfluenceAccessConfig(
30-
api_token=os.getenv("CONFLUENCE_API_TOKEN")
31-
# Or:
32-
# access_token=os.getenv("CONFLUENCE_ACCESS_TOKEN")
34+
token=os.getenv("CONFLUENCE_TOKEN")
3335
),
3436
url=os.getenv("CONFLUENCE_URL"),
35-
user_email=os.getenv("CONFLUENCE_USER_EMAIL")
37+
username=os.getenv("CONFLUENCE_USERNAME"),
38+
cloud=True
3639
),
40+
41+
# For personal access token (PAT) authentication:
42+
# source_connection_config=ConfluenceConnectionConfig(
43+
# access_config=ConfluenceAccessConfig(
44+
# token=os.getenv("CONFLUENCE_TOKEN")
45+
# ),
46+
# url=os.getenv("CONFLUENCE_URL")
47+
# ),
48+
49+
# For password authentication:
50+
# source_connection_config=ConfluenceConnectionConfig(
51+
# access_config=ConfluenceAccessConfig(
52+
# password=os.getenv("CONFLUENCE_PASSWORD")
53+
# ),
54+
# url=os.getenv("CONFLUENCE_URL"),
55+
# username=os.getenv("CONFLUENCE_USERNAME"),
56+
# cloud=False
57+
# ),
58+
3759
partitioner_config=PartitionerConfig(
3860
partition_by_api=True,
3961
api_key=os.getenv("UNSTRUCTURED_API_KEY"),

snippets/source_connectors/confluence_rest_change.mdx

Lines changed: 18 additions & 3 deletions
@@ -8,11 +8,26 @@ curl --request 'PUT' --location \
88
'{
99
"config": {
1010
"url": "<url>",
11-
"user_email": "<user-email>",
12-
"api_token": "<api-token>",
1311
"max_num_of_spaces": <max-num-of-spaces>,
1412
"max_num_of_docs_from_each_space": <max-num-of-docs-from-each-space>,
15-
"spaces": ["<space-name>", "<space-name>"]
13+
"spaces": ["<space-name>", "<space-name>"],
14+
15+
# For API token authentication:
16+
17+
"username": "<username>",
18+
"token": "<api-token>",
19+
"cloud": "<true|false>"
20+
21+
# For personal access token (PAT) authentication:
22+
23+
"token": "<personal-access-token>",
24+
"cloud": "false"
25+
26+
# For password authentication:
27+
28+
"username": "<username>",
29+
"password": "<password>",
30+
"cloud": "<true|false>"
1631
}
1732
}'
1833
```
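
The `#` lines in the template above label three alternative groups of fields. JSON itself does not allow comments, so a real request body would carry only one group. The sketch below builds such a body for the PAT case. `confluence_update_body` and the example URL are hypothetical, and emitting `cloud` as a JSON boolean is an assumption: the template shows a quoted placeholder, so check which form the API expects.

```python
import json
from typing import Any, Dict, List


def confluence_update_body(
    url: str,
    auth: Dict[str, Any],
    spaces: List[str],
    max_num_of_spaces: int = 500,
    max_num_of_docs_from_each_space: int = 150,
) -> str:
    """Build a request body containing exactly one group of auth fields.

    Hypothetical helper for illustration; it is not part of any library.
    """
    config: Dict[str, Any] = {
        "url": url,
        "max_num_of_spaces": max_num_of_spaces,
        "max_num_of_docs_from_each_space": max_num_of_docs_from_each_space,
        "spaces": list(spaces),
    }
    # Merge in the fields for the chosen authentication method only.
    config.update(auth)
    return json.dumps({"config": config}, indent=2)


# PAT authentication: a token only, with cloud set to false.
pat_body = confluence_update_body(
    url="https://confluence.example.com",  # placeholder URL
    auth={"token": "<personal-access-token>", "cloud": False},
    spaces=["luke", "paul"],
)
```

For API token or password authentication, the `auth` dictionary would instead carry `username` together with `token` or `password`, plus the appropriate `cloud` value.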

snippets/source_connectors/confluence_rest_create.mdx

Lines changed: 18 additions & 3 deletions
@@ -10,11 +10,26 @@ curl --request 'POST' --location \
1010
"type": "confluence",
1111
"config": {
1212
"url": "<url>",
13-
"user_email": "<user-email>",
14-
"api_token": "<api-token>",
1513
"max_num_of_spaces": <max-num-of-spaces>,
1614
"max_num_of_docs_from_each_space": <max-num-of-docs-from-each-space>,
17-
"spaces": ["<space-name>", "<space-name>"]
15+
"spaces": ["<space-name>", "<space-name>"],
16+
17+
# For API token authentication:
18+
19+
"username": "<username>",
20+
"token": "<api-token>",
21+
"cloud": "<true|false>"
22+
23+
# For personal access token (PAT) authentication:
24+
25+
"token": "<personal-access-token>",
26+
"cloud": "false"
27+
28+
# For password authentication:
29+
30+
"username": "<username>",
31+
"password": "<password>",
32+
"cloud": "<true|false>"
1833
}
1934
}'
2035
```
