Commit 1c91d74

docs(looker-studio): enhance Data API readme with schema inference limitations and troubleshooting tips
- Added a section detailing schema inference limitations, including issues with mixed data types, incomplete sampling, and nested object depth.
- Expanded troubleshooting section with specific guidance on authentication errors, schema inference problems, and query issues.
- Improved best practices for data structure and query execution to optimize schema inference and performance.

Files:
- tutorial/markdown/connectors/looker-studio/dataapi/readme.md
1 parent b81d674 commit 1c91d74

1 file changed: +52 -6 lines changed
tutorial/markdown/connectors/looker-studio/dataapi/readme.md

Lines changed: 52 additions & 6 deletions
@@ -85,6 +85,13 @@ What runs:
 - Nested fields use dot notation (for example, `address.city`). Arrays and objects not expanded become stringified values.
 - If the collection has no documents or your query returns no rows, schema inference will fail.

+> **⚠️ Schema Inference Limitations**: Field types are inferred from sampled data and may not capture all variations in your dataset. Common issues include:
+> - **Mixed data types**: Fields containing both numbers and text will be typed as STRING
+> - **Incomplete sampling**: Fields present only in unsampled documents may not be detected
+> - **Array complexity**: Arrays of objects become stringified JSON rather than individual fields
+> - **Nested object depth**: Very deep object hierarchies may not be fully expanded
+> - **Empty or null values**: Fields with only null values may not be detected or may be typed incorrectly
+
 ## Data Retrieval

 - Only the fields requested by Looker Studio are returned. Nested values are extracted using dot paths where possible.
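
The limitations listed in the callout above can be made concrete with a minimal Python sketch of sample-based type inference. This is not the connector's implementation: `flatten`, `type_of`, and `infer_schema` are hypothetical names, and the rules (dot paths for nested objects, stringified arrays, STRING on mixed types, nulls skipped) are assumptions drawn from the bullets in this hunk.

```python
import json
from typing import Any, Dict, Iterable


def flatten(doc: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested objects into dot-path keys; stringify arrays."""
    out: Dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            out.update(flatten(value, path))
        elif isinstance(value, list):
            # Assumption: arrays are not expanded, so they become JSON text.
            out[path] = json.dumps(value)
        else:
            out[path] = value
    return out


def type_of(value: Any) -> str:
    """Map a Python value to an illustrative field type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, (int, float)):
        return "NUMBER"
    return "STRING"


def infer_schema(sample: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Infer one type per dot path from the sampled documents only."""
    schema: Dict[str, str] = {}
    for doc in sample:
        for path, value in flatten(doc).items():
            if value is None:
                continue  # a null carries no type information
            inferred = type_of(value)
            if path in schema and schema[path] != inferred:
                schema[path] = "STRING"  # mixed types fall back to STRING
            else:
                schema.setdefault(path, inferred)
    return schema


SAMPLE = [
    {"name": "A", "price": 10, "address": {"city": "Oslo"}, "tags": ["x"]},
    {"name": "B", "price": "12.50", "notes": None},
]
# "price" mixes a number and a string, so it is typed STRING;
# "notes" is only ever null in the sample, so it is not detected.
SCHEMA = infer_schema(SAMPLE)
```

A field that appears only in documents outside `SAMPLE` would be missing from `SCHEMA` entirely, which is the incomplete-sampling case.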
@@ -94,15 +101,54 @@ What runs:

 ## Tips and Best Practices

-- Prefer `Query by Collection` for quick starts and simpler schemas.
-- Always add a `LIMIT` when exploring with custom queries.
-- Ensure your user has at least query and read access on the target collections.
+- **Prefer `Query by Collection` for quick starts and simpler schemas**: Collection mode provides more predictable schema inference than custom queries.
+- **Always add a `LIMIT` when exploring with custom queries**: Use a `LIMIT` between 100 and 1000 for initial testing to keep schema inference and data retrieval fast.
+- **Ensure your user has at least query and read access** on the target collections and system catalogs for metadata discovery.
+- **For consistent schema inference**: Structure your data with consistent field types across documents. Avoid mixing numbers and strings in the same field.
+- **Handle complex nested data**: Consider flattening deeply nested objects in your SQL++ queries for better Looker Studio compatibility.
+- **Test schema inference separately**: Use a small `LIMIT` first to verify schema detection before processing large datasets.
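
The flattening, casting, and `LIMIT` tips above can be combined in one custom query. The following Python sketch builds such a SQL++ statement as a string. The helper `build_flat_query` and the keyspace and field names are hypothetical; `TONUMBER` is a standard SQL++ conversion function, but whether casting is appropriate depends on your data.

```python
from typing import Dict, FrozenSet


def build_flat_query(
    keyspace: str,
    fields: Dict[str, str],
    numeric: FrozenSet[str] = frozenset(),
    limit: int = 100,
) -> str:
    """Build a SQL++ SELECT that exposes nested paths as flat aliases.

    fields maps a dot path in the document to the flat alias to expose.
    Aliases listed in numeric are wrapped in TONUMBER() so the column has
    a single consistent type. Identifiers are assumed to be simple names;
    names with special characters would need backtick escaping.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    parts = []
    for path, alias in fields.items():
        expr = f"t.{path}"
        if alias in numeric:
            expr = f"TONUMBER({expr})"
        parts.append(f"{expr} AS {alias}")
    return f"SELECT {', '.join(parts)} FROM {keyspace} AS t LIMIT {limit}"


# Hypothetical keyspace and fields, for illustration only.
QUERY = build_flat_query(
    "`shop`.`sales`.`orders`",
    {"customer.address.city": "customer_city", "total": "total"},
    numeric=frozenset({"total"}),
    limit=100,
)
```

Because every output column is a flat alias with one type, schema inference on this query has no nested objects or mixed-type fields left to resolve.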

 ## Troubleshooting

-- Authentication error: Check host/port, credentials, and that the Data API is reachable from Looker Studio.
-- Empty schema or no fields: Ensure the collection has data; for custom queries, verify the statement and add `LIMIT` to improve sampling.
-- Query errors from the service: Review the error text surfaced in Looker Studio; fix syntax, permissions, or keyspace names.
+### Authentication and Connection Issues
+- **Authentication error**: Check host/port, credentials, and that the Data API is reachable from Looker Studio.
+- **Timeout or network errors**: Verify network connectivity and firewall settings between Looker Studio and your Couchbase cluster.
+
+### Schema Inference Problems
+- **Empty schema or no fields detected**:
+  - Ensure the collection contains documents
+  - For custom queries, verify the statement returns results and add appropriate `LIMIT` clauses
+  - Check that your user has permissions to read the collection and execute queries
+
+- **INFER statement failures**:
+  - The connector first attempts `INFER collection` or `INFER (customQuery)` with sampling options
+  - If INFER fails, it falls back to executing your query with `LIMIT 1` and inferring from a single document
+  - INFER may fail on very large collections or complex queries; the fallback usually resolves this
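
The INFER-then-fallback order described above can be sketched as follows. This is an illustration under assumptions, not the connector's code: `run` stands in for whatever function sends a statement to the Data API and returns rows, the `sample_size` option is the standard INFER sampling option, and the exact fallback query shape (`SELECT t.* ... LIMIT 1`) is a guess at what collection mode would run.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical runner type: takes a SQL++ statement, returns result rows.
Runner = Callable[[str], List[Dict[str, Any]]]


def infer_statement(target: str, sample_size: int = 1000) -> str:
    """Build an INFER statement with a sampling option."""
    options = json.dumps({"sample_size": sample_size})
    return f"INFER {target} WITH {options}"


def sample_for_schema(
    run: Runner, keyspace: str, sample_size: int = 1000
) -> Tuple[str, List[Dict[str, Any]]]:
    """Try INFER first; on failure or empty output, fall back to LIMIT 1."""
    try:
        rows = run(infer_statement(keyspace, sample_size))
        if rows:
            return "infer", rows
    except Exception:
        # Assumed failure modes: timeouts on large collections, permissions.
        pass
    # Fallback: one document is enough to list fields, but its types and
    # field set may not represent the rest of the collection.
    rows = run(f"SELECT t.* FROM {keyspace} AS t LIMIT 1")
    return "limit-1", rows
```

The returned label makes it visible which path produced the schema, which helps when a field is missing because only a single document was inspected.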
+
+- **Fields appear as STRING when they should be NUMBER**:
+  - Your data has mixed types (some documents have numbers, others have strings) in the same field
+  - The connector defaults to STRING for safety when types are inconsistent
+  - Consider cleaning up the data or using SQL++ functions to cast types consistently
+
+- **Missing fields that exist in your data**:
+  - Schema inference is sample-based, so fields present only in unsampled documents may not be detected
+  - Try adjusting your query so that the sampled documents are representative of the fields you need
+  - For custom queries, ensure your query includes all the fields you want to expose
+
+- **Nested fields not working correctly**:
+  - Very deep object hierarchies may not be fully expanded by the INFER process
+  - Arrays of objects become stringified JSON instead of individual fields
+  - Consider flattening complex structures in your SQL++ query for better field detection
+
+- **"No properties in any INFER flavors" error**:
+  - The INFER statement succeeded but found no recognizable field structures
+  - This typically happens with collections containing only primitive values or very inconsistent document structures
+  - Try a custom query that shapes the data into a more consistent structure
+
+### Query and Data Issues
+- **Query errors from the service**: Review the error text surfaced in Looker Studio; fix syntax, permissions, or keyspace names.
+- **Permission errors during schema inference**: Ensure your user can execute INFER statements and read from system catalogs.
+- **Performance issues**: Add appropriate `LIMIT` clauses and avoid very complex JOINs for better connector performance.

 ## Next Steps
