docs(looker-studio): enhance Data API readme with schema inference limitations and troubleshooting tips
- Added a section detailing schema inference limitations, including issues with mixed data types, incomplete sampling, and nested object depth.
- Expanded troubleshooting section with specific guidance on authentication errors, schema inference problems, and query issues.
- Improved best practices for data structure and query execution to optimize schema inference and performance.
Files:
- tutorial/markdown/connectors/looker-studio/dataapi/readme.md
tutorial/markdown/connectors/looker-studio/dataapi/readme.md: 52 additions & 6 deletions
@@ -85,6 +85,13 @@ What runs:
 - Nested fields use dot notation (for example, `address.city`). Arrays and objects not expanded become stringified values.
 - If the collection has no documents or your query returns no rows, schema inference will fail.
 
+> **⚠️ Schema Inference Limitations**: Field types are inferred from sampled data and may not capture all variations in your dataset. Common issues include:
+> - **Mixed data types**: Fields containing both numbers and text will be typed as STRING
+> - **Incomplete sampling**: Fields present only in unsampled documents may not be detected
+> - **Array complexity**: Arrays of objects become stringified JSON rather than individual fields
+> - **Nested object depth**: Very deep object hierarchies may not be fully expanded
+> - **Empty or null values**: Fields with only null values may not be detected or may be typed incorrectly
+
 ## Data Retrieval
 
 - Only the fields requested by Looker Studio are returned. Nested values are extracted using dot paths where possible.
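The limitations added in this hunk can be illustrated with a minimal Python sketch of sample-based inference. It is not the connector's code, which does not appear in this diff: the names `flatten`, `type_of`, and `infer_schema` and the type labels are hypothetical, and treating a null-only field as STRING is an assumption (the README only says such fields may be missed or mistyped).

```python
from __future__ import annotations

import json
from typing import Any


def flatten(doc: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested objects into dot paths; arrays become JSON strings."""
    flat: dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        elif isinstance(value, list):
            flat[path] = json.dumps(value)  # arrays are stringified, not expanded
        else:
            flat[path] = value
    return flat


def type_of(value: Any) -> str | None:
    """Map a sampled value to a hypothetical field type label."""
    if value is None:
        return None  # nulls carry no type information
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, (int, float)):
        return "NUMBER"
    return "STRING"


def infer_schema(sample: list[dict[str, Any]]) -> dict[str, str]:
    """Infer one type per dot path from the sampled documents only."""
    seen: dict[str, set[str]] = {}
    for doc in sample:
        for path, value in flatten(doc).items():
            types = seen.setdefault(path, set())
            label = type_of(value)
            if label is not None:
                types.add(label)
    # Exactly one observed type wins; mixed or null-only fields fall back
    # to STRING (the null-only choice is an assumption of this sketch).
    return {
        path: next(iter(types)) if len(types) == 1 else "STRING"
        for path, types in seen.items()
    }
```

A field that is a number in one sampled document and text in another comes out as STRING, and a field that appears only in documents outside the sample never reaches the schema at all.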
@@ -94,15 +101,54 @@ What runs:
 
 ## Tips and Best Practices
 
-- Prefer `Query by Collection` for quick starts and simpler schemas.
-- Always add a `LIMIT` when exploring with custom queries.
-- Ensure your user has at least query and read access on the target collections.
+- **Prefer `Query by Collection` for quick starts and simpler schemas**: Collection mode provides more predictable schema inference than custom queries.
+- **Always add a `LIMIT` when exploring with custom queries**: Use `LIMIT 100-1000` for initial testing to ensure fast schema inference and data retrieval.
+- **Ensure your user has at least query and read access** on the target collections and system catalogs for metadata discovery.
+- **For consistent schema inference**: Structure your data with consistent field types across documents. Avoid mixing numbers and strings in the same field.
+- **Handle complex nested data**: Consider flattening deeply nested objects in your SQL++ queries for better Looker Studio compatibility.
+- **Test schema inference separately**: Use small LIMIT clauses first to verify schema detection before processing large datasets.
 
 ## Troubleshooting
 
-- Authentication error: Check host/port, credentials, and that the Data API is reachable from Looker Studio.
-- Empty schema or no fields: Ensure the collection has data; for custom queries, verify the statement and add `LIMIT` to improve sampling.
-- Query errors from the service: Review the error text surfaced in Looker Studio; fix syntax, permissions, or keyspace names.
+### Authentication and Connection Issues
+- **Authentication error**: Check host/port, credentials, and that the Data API is reachable from Looker Studio.
+- **Timeout or network errors**: Verify network connectivity and firewall settings between Looker Studio and your Couchbase cluster.
+
+### Schema Inference Problems
+- **Empty schema or no fields detected**:
+  - Ensure the collection contains documents and is not empty
+  - For custom queries, verify the statement returns results and add appropriate `LIMIT` clauses
+  - Check that your user has permissions to read the collection and execute queries
+
+- **INFER statement failures**:
+  - The connector first attempts `INFER collection` or `INFER (customQuery)` with sampling options
+  - If INFER fails, it falls back to executing your query with `LIMIT 1` and inferring from a single document
+  - INFER may fail on very large collections or complex queries - the fallback usually resolves this
+
+- **Fields appear as STRING when they should be NUMBER**:
+  - Your data has mixed types (some documents have numbers, others have strings) in the same field
+  - The connector defaults to STRING for safety when types are inconsistent
+  - Consider data cleanup or use SQL++ functions to cast types consistently
+
+- **Missing fields that exist in your data**:
+  - Schema inference is sample-based - fields present only in unsampled documents may not be detected
+  - Try increasing the collection size or adjusting your query to ensure representative sampling
+  - For custom queries, ensure your query includes all the fields you want to expose
+
+- **Nested fields not working correctly**:
+  - Very deep object hierarchies may not be fully expanded by the INFER process
+  - Arrays of objects become stringified JSON instead of individual fields
+  - Consider flattening complex structures in your SQL++ query for better field detection
+
+- **"No properties in any INFER flavors" error**:
+  - The INFER statement succeeded but found no recognizable field structures
+  - This typically happens with collections containing only primitive values or very inconsistent document structures
+  - Try a custom query that shapes the data into a more consistent structure
+
+### Query and Data Issues
+- **Query errors from the service**: Review the error text surfaced in Looker Studio; fix syntax, permissions, or keyspace names.
+- **Permission errors during schema inference**: Ensure your user can execute INFER statements and read from system catalogs.
+- **Performance issues**: Add appropriate `LIMIT` clauses and avoid very complex JOINs for better connector performance.
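The INFER-then-fallback behavior described in the troubleshooting text above can be sketched roughly as follows. This is an illustration under stated assumptions, not the connector's implementation: `run_query` is a hypothetical callable that sends one SQL++ statement and returns its rows, INFER results are assumed to arrive as a list of flavor dictionaries with an optional `properties` key, failures are assumed to surface as `RuntimeError`, and the `sample_size` value and the wrapping `SELECT q.* ... LIMIT 1` form are illustrative choices.

```python
from typing import Any, Callable

Row = dict[str, Any]
RunQuery = Callable[[str], list[Row]]  # hypothetical transport to the Data API


def build_infer(target: str, sample_size: int = 100) -> str:
    """Build an INFER statement; target is a keyspace path or '(subquery)'."""
    return f'INFER {target} WITH {{"sample_size": {sample_size}}}'


def discover_fields(run_query: RunQuery, infer_target: str, sample_query: str) -> list[str]:
    """Try INFER first; if it fails, infer field names from a single row."""
    try:
        flavors = run_query(build_infer(infer_target))
    except RuntimeError:
        # Fallback: run the query with LIMIT 1 and read the keys of one document.
        rows = run_query(f"SELECT q.* FROM ({sample_query}) AS q LIMIT 1")
        if not rows:
            raise ValueError("schema inference failed: query returned no rows")
        return sorted(rows[0])

    # INFER succeeded: collect property names across all flavors.
    fields = sorted({name for flavor in flavors for name in flavor.get("properties", {})})
    if not fields:
        raise ValueError("No properties in any INFER flavors")
    return fields
```

Because the fallback sees only one document, any field missing from that document is missing from the schema, which is why the notes above recommend consistent document shapes and representative queries.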