Commit 420605b

v3.0.0 User Quickstart Instructions
1 parent 9222684 commit 420605b

1 file changed

+278
-40
lines changed

netapp-neo/USER_QUICKSTART.md

Lines changed: 278 additions & 40 deletions
@@ -1,13 +1,13 @@
1-
# NetApp Connector User Quick Start Guide
1+
# NetApp Connector User Quick Start Guide (v3.0+)
22

3-
This guide assumes that you have deployed the NetApp connector and are ready to start using it. If you have not yet deployed the connector, please refer to the [NetApp Connector README](README.md).
3+
This guide assumes that you have deployed the NetApp connector and are ready to start using it. If you have not yet deployed the connector, please refer to the [NetApp Connector README](README.md).
44

5-
## 1. Getting Started
6-
7-
> [!IMPORTANT]
5+
> [!IMPORTANT]
86
> The NetApp Connector for M365 Copilot is currently in **Private Preview**. This means that the connector is not yet fully supported and may have some limitations. The connector requires a license to activate. You can request access to the connector by joining the Early Access Program (EAP). Please book a meeting with the following link to join the EAP: [Book a meeting with NetApp](https://outlook.office.com/bookwithme/user/d636d7a02ad8477c9af9a0cbb029af4d@netapp.com/meetingtype/nm-mXkp-TUO1CdzOmFfIBw2?anonymous&ismsaljsauthenabled&ep=mlink).
97
10-
The easiest way to get started is by using the pre-built Docker image or [helm chart](../charts/netapp-copilot-connector/README.md). You can run the connector in a Docker container or deploy it to a Kubernetes cluster using Helm.
8+
## 1\. Getting Started
9+
10+
The easiest way to get started is by using the pre-built Docker image. You can run the connector in a Docker container or deploy it to a Kubernetes cluster using Helm.
1111

1212
```bash
1313
docker pull ghcr.io/netapp/netapp-copilot-connector:latest
@@ -16,65 +16,303 @@ docker pull ghcr.io/netapp/netapp-copilot-connector:latest
1616
or for a specific version:
1717

1818
```bash
19-
docker pull ghcr.io/netapp/netapp-copilot-connector:2.7.0
19+
docker pull ghcr.io/netapp/netapp-copilot-connector:3.0.0
2020
```
2121

2222
### Running the Container
2323

24-
1. Download the [Sample .env file](./dist/.env.example) and rename it to `.env`.
25-
2. Configure the `.env` file with the required environment variables. The following environment variables are required:
24+
1. **Download the sample configuration files:**
25+
26+
- Download the [Sample .env file](.env.example) and rename it to `.env`
27+
- Download the [docker-compose.yml](docker-compose.yml) file
28+
29+
2. **Configure the `.env` file with the required environment variables:**
30+
31+
```bash
32+
# NetApp Settings (Required)
33+
NETAPP_CONNECTOR_LICENSE=your-licence-key-here
34+
35+
# Microsoft Graph configuration (Required)
36+
MS_GRAPH_CLIENT_ID=your-client-id-here
37+
MS_GRAPH_CLIENT_SECRET=your-client-secret-here
38+
MS_GRAPH_TENANT_ID=your-tenant-id-here
39+
40+
# Database Configuration (Required - PostgreSQL is recommended)
41+
## For PostgreSQL:
42+
DATABASE_URL=postgresql://user:password@localhost:5432/netapp_connector
43+
## or for MySQL:
44+
# DATABASE_URL=mysql://user:password@localhost:3306/netapp_connector
2645

27-
````bash
28-
# NetApp Settings
29-
NETAPP_CONNECTOR_LICENSE=your-licence-key-here # Mandatory
46+
# Authentication (Optional - defaults provided)
47+
JWT_SECRET_KEY=your-secret-key-here
48+
ACCESS_TOKEN_EXPIRE_MINUTES=1440
3049

31-
# Microsoft Graph configuration
32-
MS_GRAPH_CLIENT_ID=your-client-id-here # Mandatory
33-
MS_GRAPH_CLIENT_SECRET=your-client-secret-here # Mandatory
34-
MS_GRAPH_TENANT_ID=your-tenant-id-here # Mandatory```
35-
````
50+
# Multi-container deployments (Optional)
51+
ENCRYPTION_KEY=your-shared-encryption-key
52+
```
3653

37-
3. Download the latest docker-compose file from the [dist](./dist) directory.
38-
4. Run the following docker-compose command to deploy the connector:
54+
3. **Run the connector using Docker Compose:**
3955

4056
```bash
4157
docker-compose up -d
4258
```
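
Before starting the stack, it can help to confirm that the `.env` file no longer contains sample values. The following is a minimal sketch, not part of the connector: the helper names `parse_env` and `missing_required` are hypothetical, the list of required variables is taken from the sample above, and the "placeholder" check assumes the sample's `your-...-here` naming style.

```python
from pathlib import Path

# Variables the guide marks as required in the sample .env file.
REQUIRED_VARS = (
    "NETAPP_CONNECTOR_LICENSE",
    "MS_GRAPH_CLIENT_ID",
    "MS_GRAPH_CLIENT_SECRET",
    "MS_GRAPH_TENANT_ID",
    "DATABASE_URL",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so URLs containing "=" stay intact.
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def missing_required(env: dict[str, str]) -> list[str]:
    """Return required names that are absent, empty, or still a placeholder."""
    # Assumption: placeholders follow the sample file's "your-...-here" style.
    return [
        name
        for name in REQUIRED_VARS
        if not env.get(name) or env[name].startswith("your-")
    ]


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        problems = missing_required(parse_env(env_file.read_text()))
        print("Missing or placeholder values:", problems or "none")
    else:
        print("No .env file found in the current directory")
```

Run it from the directory that holds your `.env` file. It only inspects the file and never contacts the connector.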
4359

44-
> [!TIP]
45-
> You can enable GPU support by uncommenting the `deploy` section in the `docker-compose.yml` file. This will allow the connector to leverage GPU acceleration for faster data extraction and conversion. Make sure you have the NVIDIA Container Toolkit installed on your host machine.
60+
> [!TIP]
61+
> **GPU Acceleration Support**: The connector now supports GPU acceleration for faster document processing. Three variants are available:
62+
>
63+
> - `netapp-copilot-connector:latest` - CPU-only (smallest, ~2.5GB)
64+
> - `netapp-copilot-connector:latest-cuda` - NVIDIA GPU support (~8GB)
65+
> - `netapp-copilot-connector:latest-rocm` - AMD GPU support (~7GB)
66+
>
67+
> For GPU support, uncomment the `deploy` section in the `docker-compose.yml` file and ensure you have the appropriate GPU runtime installed.
4668
4769
### Using Helm
4870

49-
If you are using Kubernetes, you can deploy the connector using Helm. Please refer to the [Helm Deployment](helm/README.md) document for more information.
71+
If you are using Kubernetes, you can deploy the connector using Helm. Please refer to the [Helm Deployment](charts/netapp-copilot-connector/README.md) document for more information.
72+
73+
### Database Options
74+
75+
**Version 3.0+ supports multiple database backends:**
76+
77+
- **PostgreSQL**: For production deployments with high availability (Recommended)
78+
- **MySQL**: Alternative production database option
79+
80+
To use PostgreSQL or MySQL, set the `DATABASE_URL` environment variable:
81+
82+
```bash
83+
# PostgreSQL example
84+
DATABASE_URL=postgresql://user:password@localhost:5432/netapp_connector
85+
86+
# MySQL example
87+
DATABASE_URL=mysql://user:password@localhost:3306/netapp_connector
88+
```
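
A malformed `DATABASE_URL` is an easy mistake to make. As a rough sketch (the function name `describe_database_url` is hypothetical, and the connector may parse the URL differently), you can use Python's standard `urllib.parse` to confirm that the scheme, host, port, and database name are what you expect:

```python
from urllib.parse import urlsplit

# The two backends named in this guide.
SUPPORTED_SCHEMES = {"postgresql", "mysql"}


def describe_database_url(url: str) -> dict:
    """Split a DATABASE_URL into its parts without echoing the password."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname or not parts.path.strip("/"):
        raise ValueError("URL must include a host and a database name")
    return {
        "backend": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    example = "postgresql://user:password@localhost:5432/netapp_connector"
    print(describe_database_url(example))
```

If your password contains characters such as `@`, `:` or `/`, percent-encode them in the URL so that the host and port are still parsed correctly.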
89+
90+
## 2\. Initial Setup and First Admin User
91+
92+
> [!IMPORTANT]
93+
> A dedicated stand-alone desktop UI is available for Windows, macOS, and Linux: [Download the Desktop App](./client).
94+
95+
The easiest way to set up the connector and create your first admin user is through the desktop application. The desktop app provides a user-friendly interface for:
96+
97+
- User management
98+
- Adding and configuring SMB shares
99+
- Monitoring crawl progress
100+
- Managing connector settings
101+
102+
Alternatively, you can use the API directly by accessing the interactive documentation at `http://localhost:8080/docs`.
103+
104+
## 3\. Adding Your First Share
105+
106+
When configuring your first SMB share (either through the desktop app or API), you'll need to provide the following information:
107+
108+
### Required Configuration
109+
110+
- **Share Path**: The UNC path to your SMB share (e.g., `\\server\share`)
111+
- **Authentication**: Domain username and password for accessing the share
112+
- **Kerberos Settings**: Your Active Directory realm (e.g., `YOUR.REALM.DOMAIN`)
113+
- **Crawl Schedule**: When to automatically scan for new/changed files (e.g., daily at 2 AM)
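
Share paths are often mistyped with forward slashes or a missing share name. The sketch below is illustrative only (the helper `split_unc` is hypothetical and is not how the connector validates input). It shows one way to check that a string has the `\\server\share` shape before you submit it:

```python
import re

# \\server\share with an optional sub-path; server and share must be non-empty.
_UNC = re.compile(r"^\\\\(?P<server>[^\\/]+)\\(?P<share>[^\\/]+)(?P<rest>\\.*)?$")


def split_unc(path: str) -> tuple[str, str, str]:
    """Return (server, share, sub_path) for a UNC path, or raise ValueError."""
    match = _UNC.match(path)
    if match is None:
        raise ValueError(f"not a UNC path: {path!r}")
    # Drop the leading/trailing backslashes from the optional sub-path.
    rest = (match.group("rest") or "").strip("\\")
    return match.group("server"), match.group("share"), rest


if __name__ == "__main__":
    print(split_unc(r"\\server\share"))
```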
114+
115+
### File Processing Rules
116+
117+
Configure how the connector should process files in your share:
118+
119+
**File Filtering Options:**
120+
121+
- **Include Patterns**: Only process specific file types or paths (e.g., `*.pdf`, `*.docx`, `**/reports/**`)
122+
- **Exclude Patterns**: Skip certain files or directories (e.g., `*.tmp`, `.git/*`, `**/temp/**`)
123+
- **File Size Limits**: Set minimum and maximum file sizes to process
124+
125+
**Content Management:**
126+
127+
- **Copilot Upload**: Choose whether to upload files to Microsoft 365 Copilot for search
128+
- **Content Persistence**: Decide whether to keep extracted content in the local database after upload
129+
130+
### Common Configuration Scenarios
131+
132+
**Scenario 1: Office Documents Only**
133+
134+
- Include patterns: `*.pdf`, `*.docx`, `*.xlsx`, `*.pptx`
135+
- Enable Copilot upload for searchability
136+
137+
**Scenario 2: Database-Only Archive**
138+
139+
- Include specific file types or paths
140+
- Disable Copilot upload to keep content local only
141+
142+
**Scenario 3: Full Share with Exclusions**
143+
144+
- Exclude temporary files, system folders, and backups
145+
- Process all other content types
146+
147+
### New Rule Configuration Options (v3.0+)
148+
149+
- **`include_patterns`**: Only process files matching these patterns (mutually exclusive with `exclude_patterns`)
150+
- **`exclude_patterns`**: Skip files matching these patterns
151+
- **`persist_file_content`**: Keep extracted content in database after Graph upload (default: `true`)
152+
- **`enable_copilot_upload`**: Upload files to Microsoft Graph/Copilot (default: `true`)
153+
154+
> [!NOTE]
155+
> **Pattern Filtering**: Use glob patterns like `*.pdf`, `**/*.docx`, or `**/reports/**`. You cannot use both `include_patterns` and `exclude_patterns` in the same share.
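
To get a feel for how include and exclude rules behave, here is a small sketch that uses Python's standard `fnmatch` module. It is an approximation, not the connector's matcher: the function `should_process` is hypothetical, and `fnmatch` lets `*` match across `/` and gives `**` no special meaning, so the connector's glob semantics may differ in edge cases.

```python
from fnmatch import fnmatchcase


def should_process(path, include_patterns=None, exclude_patterns=None):
    """Decide whether a file path passes the share's pattern rules."""
    # The guide states the two options cannot be combined on one share.
    if include_patterns and exclude_patterns:
        raise ValueError("use include_patterns or exclude_patterns, not both")
    if include_patterns:
        return any(fnmatchcase(path, p) for p in include_patterns)
    if exclude_patterns:
        return not any(fnmatchcase(path, p) for p in exclude_patterns)
    # No rules configured: every file is eligible.
    return True


if __name__ == "__main__":
    print(should_process("finance/reports/q1.pdf", include_patterns=["*.pdf", "*.docx"]))
    print(should_process("scratch/notes.tmp", exclude_patterns=["*.tmp"]))
```

Trying a handful of real paths from your share against your patterns this way can catch rules that are broader or narrower than intended before you trigger a crawl.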
156+
157+
## 4\. Triggering Your First Crawl
158+
159+
After adding a share, you can trigger an immediate crawl to test the configuration and start indexing files:
160+
161+
### Starting a Crawl
162+
163+
- **Desktop App**: Use the "Start Crawl" button next to your configured share, or wait for the scheduled crawl to run
164+
- **API**: Use the crawl endpoint for the specific share
165+
166+
### Monitoring Progress
50167

51-
## 2. Adding your first share
168+
You can monitor the crawl progress through:
52169

53-
> [!IMPORTANT]
54-
> A dedicated stand-alone desktop UI is available for Windows, MacOS and Linux: [Download the Desktop App](./client).
170+
- **Real-time Status**: View current crawl status, files processed, and any errors
171+
- **Crawl Statistics**: See total files found, successfully processed, and completion time
172+
- **Error Reporting**: Identify any files or directories that couldn't be accessed
55173

56-
## 3. Viewing the results in Microsoft 365 Copilot
174+
### What Happens During a Crawl
57175

58-
> [!WARNING]
59-
> You must perform this step after you have added your first share and the connector is running in order to see the results in Microsoft 365 Copilot.
176+
1. **Discovery**: The connector scans the share for files matching your configured rules
177+
2. **Content Extraction**: Text content is extracted from supported file types
178+
3. **ACL Processing**: File permissions are analyzed and mapped to Microsoft Entra users/groups
179+
4. **Upload**: Files are uploaded to Microsoft Graph (if enabled) for Copilot integration
180+
5. **Database Storage**: File metadata and content are stored in the local database
60181

61-
Once the connector is running and you have added your first share, you can start using it with Microsoft 365 Copilot. The connector will automatically index the files in the configured shares and make them available for Copilot to use. Please visit the [Search and Intelligence](https://admin.microsoft.com/Adminportal/Home?source=applauncher#/MicrosoftSearch/connectors) section of the Microsoft 365 Admin Center and ensure that you have selected **_Include Connector Results_** for the NetApp Connector. This will allow Copilot to access the indexed files.
182+
## 5\. Viewing Results in Microsoft 365 Copilot
62183

63-
![Select Include Connector Results in the Search and Intelligence Admin Centre](./media/2025-07-15_09-47-23.png)
184+
> [!WARNING]
185+
> You must perform this step after you have added your first share and completed at least one successful crawl to see results in Microsoft 365 Copilot.
64186
65-
## 4. (Advanced) Using the API and creating an admin user
187+
1. **Visit the Microsoft 365 Admin Center**: Go to [Search and Intelligence](https://admin.microsoft.com/Adminportal/Home?source=applauncher#/MicrosoftSearch/connectors)
188+
2. **Enable Connector Results**: Ensure you have selected **_Include Connector Results_** for the NetApp Connector
189+
3. **Test in Microsoft 365 Copilot**: Try searching for content from your indexed files using natural language queries
66190

67-
The API documentation is available at `http://YourConnectorIP:8000/docs` after starting the connector. Please refer to our [API User Guide](./USER_API_GUIDE.md) for more details on how to use the API.
191+
![Select Include Connector Results in the Search and Intelligence Admin Centre](./media/2025-07-15_09-47-23.png)
68192

69-
If you have any feedback or questions regarding the NetApp Connector or its Documentation, please reach out to us open a GitHub issue at [NetApp Innovation Labs](https://github.com/NetApp/Innovation-Labs/issues).
193+
## 6\. New Features in Version 3.0+
70194

71-
## 5. (Advanced) Firewall permissions
195+
### Enhanced Content Extraction
196+
197+
- **Docling Fallback**: Automatic fallback to Docling when MarkItDown fails to extract content from PDFs
198+
- **GPU Acceleration**: Support for NVIDIA CUDA and AMD ROCm for faster document processing
199+
- **Extractor Tracking**: Database tracks which extractor was used for each file
200+
201+
### Advanced Database Support
202+
203+
- **Multi-Database Support**: PostgreSQL (recommended) and MySQL
204+
- **Database Size Monitoring**: New `/database/size` endpoint for storage monitoring
205+
- **Enhanced ACL Storage**: Stores both raw and resolved ACL information
206+
207+
### Improved Enumeration System
208+
209+
- **Rule Change Detection**: Automatically cleans up files that no longer match updated rules
210+
- **Stale Record Cleanup**: Automatic cleanup of orphaned database records
211+
212+
### Enhanced Security & Compliance
213+
214+
- **ACL Strict Mode**: Control ACL fallback behavior with `ACL_STRICT_MODE` environment variable
215+
- **Content Persistence Override**: Deployment-level control with `PERSIST_FILE_CONTENT_OVERRIDE`
216+
- **Proxy Support**: Full proxy server support for corporate environments
217+
218+
### Monitoring & Operations
219+
220+
- **Health Monitoring**: Enhanced `/monitoring` endpoints for system status
221+
- **Operation Logging**: Comprehensive operation tracking and logging
222+
- **Share Deletion Cleanup**: Automatic Microsoft Graph cleanup when shares are deleted
223+
224+
## 7\. Troubleshooting Common Issues
225+
226+
### Authentication Issues
227+
228+
- Ensure `realm` matches your Active Directory domain exactly
229+
- Verify `use_kerberos` is set to `"required"`
230+
- Check that the user account has access to the SMB share
231+
232+
### Content Extraction Issues
233+
234+
- Check logs for extractor information
235+
- For GPU acceleration, ensure the appropriate GPU runtime is installed
236+
- Verify file types are supported by the extractors
237+
238+
### Database Issues
239+
240+
- For multi-container deployments, ensure `ENCRYPTION_KEY` is set consistently across all nodes
241+
- Monitor database size using the `/database/size` endpoint
242+
- Check database connectivity if using PostgreSQL/MySQL
243+
244+
### Microsoft Graph Issues
245+
246+
- Verify all Graph API credentials are correct
247+
- Check the proxy configuration if you are behind a corporate firewall
248+
- Ensure the connector is enabled in the Microsoft 365 Admin Center
249+
250+
## 8\. Advanced Configuration
251+
252+
### Proxy Configuration
253+
254+
For corporate environments with proxy servers:
255+
256+
```bash
257+
# HTTP/HTTPS proxy
258+
HTTPS_PROXY=http://proxy.company.com:8080
259+
HTTP_PROXY=http://proxy.company.com:8080
260+
261+
# Proxy authentication (optional)
262+
PROXY_USERNAME=proxy_user
263+
PROXY_PASSWORD=proxy_password
264+
265+
# SSL configuration
266+
GRAPH_VERIFY_SSL=true
267+
GRAPH_TIMEOUT=30
268+
```
269+
270+
### SSL Inspection Firewalls
271+
272+
For environments with SSL inspection:
273+
274+
```bash
275+
# Option 1: Disable SSL verification (less secure)
276+
GRAPH_VERIFY_SSL=false
277+
278+
# Option 2: Custom CA bundle (recommended)
279+
SSL_CERT_FILE=/app/data/custom_ca_bundle.pem
280+
```
281+
282+
### Multi-Container Deployments
283+
284+
For high availability setups:
285+
286+
```bash
287+
# Generate shared encryption key
288+
python3 -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
289+
290+
# Set in all containers
291+
ENCRYPTION_KEY=your_generated_key_here
292+
```
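
The command above produces a Fernet key, which is URL-safe base64 text that decodes to exactly 32 bytes. If you suspect a key was truncated or altered while being copied between containers, a quick shape check can help. This is a sketch using only the standard library: the function `looks_like_fernet_key` is hypothetical, and it assumes the connector expects `ENCRYPTION_KEY` in the same format that `Fernet.generate_key()` emits.

```python
import base64
import binascii


def looks_like_fernet_key(value: str) -> bool:
    """Return True if value is URL-safe base64 that decodes to exactly 32 bytes."""
    try:
        raw = base64.b64decode(value.encode("ascii"), altchars=b"-_", validate=True)
    except (binascii.Error, UnicodeEncodeError):
        # Bad padding, invalid characters, or non-ASCII input.
        return False
    return len(raw) == 32


if __name__ == "__main__":
    # The placeholder from the example above is not a usable key.
    print(looks_like_fernet_key("your_generated_key_here"))
```

The check confirms only the format. It cannot tell you whether every container holds the same key, so compare the values themselves through your secret management tooling.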
293+
294+
## 9\. API Access
295+
296+
The NetApp Connector provides a comprehensive REST API for programmatic access. The interactive API documentation is available at `http://localhost:8080/docs` after starting the connector.
297+
298+
For detailed API usage examples and advanced operations, please refer to our [API User Guide](/netapp-neo/USER_API_GUIDE.md).
299+
300+
## 10\. Firewall Permissions
72301

73302
If your organization's proxy or firewalls block communication to unknown domains, add the following rules to the 'allow' list:
74303

75-
| M365 Enterprise | M365 Government (GCC) | M365 GCCH |
76-
| -------------------------------------------- | ------------------------------------------- | ------------------------------------------------------------------- |
77-
| \*.office.com | \*.office.com | _.office.com, _.office365.us |
78-
| https://login.microsoftonline.com | https://login.microsoftonline.com | https://login.microsoftonline.com, https://login.microsoftonline.us |
79-
| https://graph.microsoft.com/ | https://graph.microsoft.com/ | https://graph.microsoft.com/, https://graph.microsoft.us/ |
80-
| https://huggingface.co/ds4sd/docling-models/ | https://huggingface.co/ds4sd/docling-models | https://huggingface.co/ds4sd/docling-models |
304+
| M365 Enterprise | M365 Government (GCC) | M365 GCCH |
305+
| -------------------------------------------- | -------------------------------------------- | ------------------------------------------------------------------- |
306+
| \*.office.com | \*.office.com | \*.office.com, \*.office365.us |
307+
| https://login.microsoftonline.com | https://login.microsoftonline.com | https://login.microsoftonline.com, https://login.microsoftonline.us |
308+
| https://graph.microsoft.com/ | https://graph.microsoft.com/ | https://graph.microsoft.com/, https://graph.microsoft.us/ |
309+
| https://huggingface.co/ds4sd/docling-models/ | https://huggingface.co/ds4sd/docling-models/ | https://huggingface.co/ds4sd/docling-models/ |
310+
311+
## Support
312+
313+
If you have any feedback or questions regarding the NetApp Connector or its Documentation, please reach out to us by opening a GitHub issue at [NetApp Innovation Labs](https://github.com/NetApp/Innovation-Labs/issues).
314+
315+
---
316+
317+
**Version**: 3.0+
318+
**Last Updated**: September 2025
