
Commit 96f9611

Merge pull request #205511 from MariDjo/master
Added: Resolving Delta logs failed
2 parents 8ac17a2 + 421e599 commit 96f9611

File tree

1 file changed: +15 -1 lines changed


articles/synapse-analytics/sql/resources-self-help-sql-on-demand.md

Lines changed: 15 additions & 1 deletion
@@ -6,7 +6,7 @@ ms.service: synapse-analytics
ms.topic: overview
ms.subservice: sql
ms.custom: event-tier1-build-2022
-ms.date: 05/16/2022
+ms.date: 07/21/2022
ms.author: stefanazaric
ms.reviewer: sngun, wiassaf
---
@@ -883,6 +883,20 @@ If the dataset is valid, [create a support ticket](../../azure-portal/supportabi

Now you can continue using the Delta Lake folder with Spark pool. You'll provide copied data to Microsoft support if you're allowed to share this information. The Azure team will investigate the content of the `delta_log` file and provide more information about possible errors and workarounds.

### Resolving Delta logs failed

The following error indicates that serverless SQL pool cannot resolve Delta logs:
```
Resolving Delta logs on path '%ls' failed with error: Cannot parse json object from log folder.
```

The most common cause is that the `last_checkpoint_file` in the `_delta_log` folder is larger than 200 bytes because of the `checkpointSchema` field added in Spark 3.3.
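
If you want to confirm that this is the cause, you can inspect the size of the last checkpoint file before changing anything. The following is only a minimal sketch, assuming an Azure Databricks (PySpark) notebook where the `dbutils.fs` helper is available; the storage path is a hypothetical placeholder, and the file is assumed to use the standard Delta Lake name `_last_checkpoint`.

```python
# Minimal sketch: check whether the last checkpoint file exceeds 200 bytes.
# Assumptions: Azure Databricks notebook (dbutils is available); the path below
# is a hypothetical placeholder for the _delta_log directory of your Delta Lake folder.
delta_log_path = "abfss://files@contosostorage.dfs.core.windows.net/delta/covid/_delta_log"

for entry in dbutils.fs.ls(delta_log_path):
    if entry.name == "_last_checkpoint":
        print(f"{entry.name}: {entry.size} bytes")
        if entry.size > 200:
            print("Larger than 200 bytes - serverless SQL pool may fail to parse it.")
```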

There are two options available to circumvent this error:

* Modify the appropriate configuration in your Spark notebook and generate a new checkpoint so that `last_checkpoint_file` is re-created. If you're using Azure Databricks, the configuration change is `spark.conf.set("spark.databricks.delta.checkpointSchema.writeThresholdLength", 0);` (a sketch follows this list).
* Downgrade to Spark 3.2.1.
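
As a sketch of the first option, the snippet below sets the configuration for the current notebook session; it assumes an Azure Databricks (PySpark) notebook and uses the configuration key quoted above. How quickly `last_checkpoint_file` is re-created depends on when the next checkpoint is written, which by default happens every 10 commits.

```python
# Minimal sketch of the first option (assumes an Azure Databricks PySpark notebook).

# Stop embedding the checkpointSchema field (introduced in Spark 3.3) in the
# last checkpoint file; that field is what pushes it past 200 bytes.
spark.conf.set("spark.databricks.delta.checkpointSchema.writeThresholdLength", 0)

# Optional: confirm the setting is active in the current session.
print(spark.conf.get("spark.databricks.delta.checkpointSchema.writeThresholdLength"))

# With the setting in place, continue writing to the Delta table through your
# normal pipeline. Delta Lake writes a new checkpoint after a number of commits
# (10 by default, controlled by the delta.checkpointInterval table property),
# and the re-created last checkpoint file no longer contains checkpointSchema.
```

After the new checkpoint has been written, retry the query from serverless SQL pool.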

Our engineering team is currently working on full support for Spark 3.3.
## Performance

Serverless SQL pool assigns the resources to the queries based on the size of the dataset and query complexity. You can't change or limit the resources that are provided to the queries. There are some cases where you might experience unexpected query performance degradations and you might have to identify the root causes.
