feat(ingest/bigquery): improve profiler to support multiple partition columns and support external table profiling#12825
acrylJonny wants to merge 173 commits into master
…profiling # Conflicts: # metadata-ingestion/src/datahub/ingestion/source/bigquery_v2/bigquery_config.py
try:
    # Query for actual values of this column using the date filters
    discover_query = f"""
    SELECT DISTINCT `{col_name}` as col_value, COUNT(*) as row_count
Potential SQL injection via string-based query concatenation - critical severity
SQL injection might be possible in these locations, especially if the strings being concatenated are controlled via user input.
Remediation: If possible, rebuild the query to use prepared statements or an ORM. If that is not possible, make sure the user input is verified or sanitized. As an added layer of protection, we also recommend installing a WAF that blocks SQL injection attacks.
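One way to act on this remediation for the column name: BigQuery query parameters bind values, not identifiers, so a column name interpolated into the query text has to be validated before use. Below is a minimal sketch of an allow-list check. The helper `quote_column` and the pattern `_SAFE_COLUMN` are hypothetical names, not part of this PR, and the pattern is deliberately stricter than what BigQuery itself accepts.

```python
import re

# Assumption: only letters, digits, and underscores are allowed, with a
# leading letter or underscore. This is stricter than BigQuery's own rules.
_SAFE_COLUMN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,299}")


def quote_column(col_name: str) -> str:
    """Return the column name wrapped in backticks, or raise ValueError.

    Hypothetical helper for illustration; rejects anything containing
    backticks, whitespace, or punctuation that could end the identifier.
    """
    if not _SAFE_COLUMN.fullmatch(col_name):
        raise ValueError(f"unsafe column name: {col_name!r}")
    return f"`{col_name}`"


# The validated identifier can then be interpolated into the select list.
select_list = f"SELECT DISTINCT {quote_column('event_date')} AS col_value"
```

Filter values such as dates would be better passed as bound query parameters than formatted into the string. The allow-list only covers the identifier, which cannot be bound.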
# Non-partitioned table - apply row limit or safety limit
if self.config.profiling.profiling_row_limit > 0:
    row_limit = max(1, int(self.config.profiling.profiling_row_limit))
    custom_sql = f"SELECT * FROM {safe_table_ref} LIMIT {row_limit}"
Potential SQL injection via string-based query concatenation - critical severity
SQL injection might be possible in these locations, especially if the strings being concatenated are controlled via user input.
Remediation: If possible, rebuild the query to use prepared statements or an ORM. If that is not possible, make sure the user input is verified or sanitized. As an added layer of protection, we also recommend installing a WAF that blocks SQL injection attacks.
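For the `LIMIT` query, the two interpolated pieces can each be constrained before the string is built: the table reference by validating its parts, and the limit by coercing it to an integer, as the diff already does. The sketch below shows that idea. The function `build_limited_select` and both patterns are assumptions for illustration and do not show how `safe_table_ref` is actually produced in this PR. The patterns are conservative and would reject some valid BigQuery names, for example domain-scoped project IDs.

```python
import re

# Assumption: conservative allow-lists, narrower than what BigQuery permits.
_PROJECT = re.compile(r"[a-z][a-z0-9\-]{4,28}[a-z0-9]")
_NAME = re.compile(r"[A-Za-z0-9_]{1,1024}")


def build_limited_select(project: str, dataset: str, table: str, row_limit) -> str:
    """Build a SELECT * ... LIMIT n query from validated parts.

    Hypothetical helper: raises ValueError if any identifier fails its
    allow-list check or if row_limit cannot be converted to an integer.
    """
    if not _PROJECT.fullmatch(project):
        raise ValueError(f"unsafe project id: {project!r}")
    for part in (dataset, table):
        if not _NAME.fullmatch(part):
            raise ValueError(f"unsafe identifier: {part!r}")
    # int() rejects strings such as "10; DROP TABLE x"; max() keeps the limit >= 1.
    limit = max(1, int(row_limit))
    return f"SELECT * FROM `{project}.{dataset}.{table}` LIMIT {limit}"


query = build_limited_select("my-project", "analytics", "events", 100)
```

Because the limit is always an `int` at the point of formatting and every identifier part has passed an allow-list, nothing attacker-controlled reaches the query text unchecked.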
Checklist