Description
We have an Azure Function App with Event Grid-triggered functions that fail to process blobs when the blobs are uploaded via the .NET SDK (OpenWriteAsync), but that work perfectly when the exact same files are uploaded manually via Azure Storage Explorer.
We suspect this is a race condition, where the BlobCreated Event Grid event is firing before the upload stream from our SDK application is fully committed.
Our Setup
- Upload Method (Failing): We are using a C# console application to generate and upload test files. The logic uses BlobContainerClient.GetBlobClient(blobName) and then calls await blobClient.OpenWriteAsync(true) to get a stream. We then serialize our JSON or ZipArchive data directly into this upload stream (see the first sketch after this list).
- Upload Method (Working): We manually drag-and-drop the same generated files into the blob container using Azure Storage Explorer. When uploaded this way, the functions process them perfectly.
- Azure Function App Setup: We have two functions monitoring the same container, both using the [BlobTrigger] binding with Source = BlobTriggerSource.EventGrid (see the second sketch after this list):
  - Function 1 (Unzipped JSON): A function (e.g., JsonBlobProcessor) designed to process raw .json files.
  - Function 2 (Zipped JSON): A function named ZippedJsonBlobProcessor designed to process .zip files.
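For reference, a minimal sketch of the failing upload path, assuming Azure.Storage.Blobs v12 and a block blob obtained via the GetBlockBlobClient extension (which is where OpenWriteAsync lives); the connection string and blob names below are placeholders:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text.Json;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized;

string connectionString = Environment.GetEnvironmentVariable("AZURE_STORAGE_CONNECTION_STRING")!;
var container = new BlobContainerClient(connectionString, "incoming");
var blob = container.GetBlockBlobClient("bigobject_227.zip");

// The returned stream stages blocks as we write and commits the blob
// when the stream is disposed.
await using (Stream blobStream = await blob.OpenWriteAsync(overwrite: true))
using (var archive = new ZipArchive(blobStream, ZipArchiveMode.Create, leaveOpen: true))
{
    var entry = archive.CreateEntry("bigobject_227.json");
    await using Stream entryStream = entry.Open();
    await JsonSerializer.SerializeAsync(entryStream, new { Id = 227, Data = "..." });
}
```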
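And a sketch of the trigger side, assuming the .NET isolated worker model (the container and function names match the ones above; the binding shape differs slightly for the in-process model):

```csharp
using System.IO;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

public class ZippedJsonBlobProcessor
{
    private readonly ILogger<ZippedJsonBlobProcessor> _logger;

    public ZippedJsonBlobProcessor(ILogger<ZippedJsonBlobProcessor> logger)
        => _logger = logger;

    // Source = BlobTriggerSource.EventGrid makes the trigger fire on the
    // BlobCreated event instead of polling the container for new blobs.
    [Function(nameof(ZippedJsonBlobProcessor))]
    public void Run(
        [BlobTrigger("incoming/{name}.zip", Source = BlobTriggerSource.EventGrid)] Stream blob,
        string name)
    {
        _logger.LogInformation("Processing blob {Name}", name);
        // ... read the stream as a ZipArchive here ...
    }
}
```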
Symptoms and Errors
When we upload files using our C# SDK streaming application, both functions are triggered but fail:
ZippedJsonBlobProcessor Error: This function fails immediately when trying to read the incoming Stream as a ZipArchive (see the read-path sketch below), which strongly suggests it is reading an incomplete or empty file:

```
Unexpected error processing zip file bigobject_227.zip: Error message: New offset cannot be less than 0. Value was -18 (Parameter 'offset')
```
JsonBlobProcessor Error: This function (for raw JSON) also fails, typically with a JsonException (not shown) indicating an incomplete or malformed JSON stream, which is consistent with the same race condition.
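For context, the zip function's read path is roughly the following (a simplified sketch; the real function has more error handling). ZipArchive in Read mode seeks backwards from the end of the stream to locate the central directory, so a truncated or 0-byte blob can push that seek past the start of the stream, which would yield exactly this kind of negative-offset ArgumentOutOfRangeException:

```csharp
using System.IO;
using System.IO.Compression;

public static class ZipReader
{
    public static void ProcessZip(Stream blobStream)
    {
        // Read mode needs to seek to the end-of-central-directory record;
        // on an incomplete blob this is where the negative-offset error surfaces.
        using var archive = new ZipArchive(blobStream, ZipArchiveMode.Read);
        foreach (var entry in archive.Entries)
        {
            using var entryStream = entry.Open();
            // ... deserialize each JSON entry here ...
        }
    }
}
```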
Our hypothesis is that:
- Uploading via Storage Explorer is an atomic operation: the file is fully uploaded, and only then is the BlobCreated event fired. The function receives a complete, valid blob.
- Uploading via blobClient.OpenWriteAsync() is a streamed operation: the BlobCreated event is fired immediately upon stream creation (while the blob is still 0 bytes or locked), not when the stream is fully uploaded and committed.
This race condition means our functions are triggered and attempt to read a blob that is still being actively written to, causing them to fail.
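One way to test this hypothesis (a hypothetical diagnostic sketch, not something we run in production) would be to log the committed blob length at trigger time via a separate GetPropertiesAsync call; if the event really fires before the commit, the service should report 0 bytes or a 404 while the upload is still in flight. The DefaultAzureCredential and helper name here are illustrative:

```csharp
using System;
using System.Threading.Tasks;
using Azure;
using Azure.Identity;
using Azure.Storage.Blobs;
using Microsoft.Extensions.Logging;

public static class TriggerDiagnostics
{
    // Hypothetical helper, called at the top of a function with the
    // triggering blob's URI, to see what the service has actually committed.
    public static async Task LogCommittedLengthAsync(Uri blobUri, ILogger logger)
    {
        var client = new BlobClient(blobUri, new DefaultAzureCredential());
        try
        {
            var props = await client.GetPropertiesAsync();
            logger.LogInformation("Committed length at trigger time: {Length}",
                props.Value.ContentLength);
        }
        catch (RequestFailedException ex) when (ex.Status == 404)
        {
            // Nothing committed yet from the service's point of view.
            logger.LogWarning("Blob not found at trigger time.");
        }
    }
}
```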
The Azure Support team offered us the following response; however, it is based on legacy code that is no longer supported:
You are seeing multiple BlobCreated events when using OpenWriteAsync because the method creates the blob in three steps:
- It sends a PUT request to create the blob.
- It then sends a PUT request to set its properties (such as metadata).
- Finally, it sends a PUT request to commit the blob.
Each of these requests triggers a separate BlobCreated event.
If you want to use OpenWriteAsync, you can suppress the BlobCreated event for the first two stages by setting the BlobRequestOptions properties DisableContentMD5Validation and DisableContentCrc64Validation to true; only the final stage will then generate a BlobCreated event.
Our use of OpenWriteAsync is intentional: it allows us to stream large files directly to Azure Storage without excessive memory consumption.
While we are aware that UploadAsync can be combined with constructs like System.IO.Pipelines to achieve a similar result (sketched below), that approach introduces additional complexity and overhead in our upload logic.
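For completeness, that alternative might look roughly like the following sketch (assuming Azure.Storage.Blobs v12; the helper name is ours). It illustrates the extra plumbing involved: a Pipe decouples serialization from the upload so neither side has to buffer the whole payload, and UploadAsync commits the blob as one logical operation once the producing stream completes:

```csharp
using System.IO;
using System.IO.Pipelines;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Storage.Blobs;

public static class PipeUploader
{
    public static async Task UploadViaPipeAsync(BlobClient blob, object payload)
    {
        var pipe = new Pipe();

        // Producer: serialize JSON into the pipe. Disposing the writer
        // stream completes the pipe and ends the consumer's stream.
        async Task ProduceAsync()
        {
            await using Stream writerStream = pipe.Writer.AsStream();
            await JsonSerializer.SerializeAsync(writerStream, payload);
        }

        // Consumer: UploadAsync reads from the pipe and commits the blob
        // once the stream ends.
        Task produce = ProduceAsync();
        await blob.UploadAsync(pipe.Reader.AsStream(), overwrite: true);
        await produce;
    }
}
```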
We want to ensure our implementation aligns with supported and reliable Azure Functions behaviour, both for this project and for future solutions that rely on streamed uploads.