[Bug]: certain jpeg attachments not recognized by EmbededImagesRepairToolBase.GetImageFormat method #3048

@johannedin

Description

Version

  • I confirm that I am using the latest version

Source Version

Azure DevOps Server 2022

Target Version

Azure DevOps Service

Relevant configuration

"Processors": [
    {
        "ProcessorType": "TfsWorkItemMigrationProcessor",
        "AttachRevisionHistory": false,
        "Enabled": true,
        "FixHtmlAttachmentLinks": true,
        "FilterWorkItemsThatAlreadyExistInTarget": true,
        "GenerateMigrationComment": true,
        "MaxGracefulFailures": 100,
        "PauseAfterEachWorkItem": false,
        "SkipRevisionWithInvalidAreaPath": false,
        "SkipRevisionWithInvalidIterationPath": false,
        "SourceName": "Source",
        "TargetName": "Target",
        "WIQLQuery": "SELECT [System.Id] FROM WorkItems WHERE [System.TeamProject] = @TeamProject AND [System.WorkItemType] NOT IN ('Test Case', 'Test Suite', 'Test Plan','Shared Steps','Shared Parameter','Feedback Request') ORDER BY [System.ChangedDate] desc",
        "WorkItemCreateRetryLimit": 50,
        "WorkItemIDs": []
    }
],
"CommonTools": {
    "TfsEmbededImagesTool": {
        "Enabled": true
    },

Relevant log output

{"Timestamp":"2025-11-03T09:35:40.7044912+00:00","Level":"Error","MessageTemplate":"EmbededImagesRepairEnricher: Unable to fix HTML field attachments for work item {wiId} from {oldTfsurl} to {newTfsurl}","TraceId":"a3146b9de09d257138cbb87d49e2ba51","SpanId":"a63627dd33a00656","Exception":"System.Exception: Downloaded image [C:\\Users\\localAdminUser\\AppData\\Local\\Temp\\2\\snucke.jpg] from Work Item [0] Field: [Description] could not be identified as an image. Authentication issue?\r\n   at MigrationTools.Tools.TfsEmbededImagesTool.UploadedAndRetrieveAttachmentLinkUrl(String matchedSourceUri, String sourceFieldName, WorkItemData targetWorkItem, String sourcePersonalAccessToken) in C:\\git\\azure-devops-migration-tools\\src\\MigrationTools.Clients.TfsObjectModel\\Tools\\TfsEmbededImagesTool.cs:line 177\r\n   at MigrationTools.Tools.TfsEmbededImagesTool.FixEmbededImages(WorkItemData wi, String oldTfsurl, String newTfsurl, String sourcePersonalAccessToken) in C:\\git\\azure-devops-migration-tools\\src\\MigrationTools.Clients.TfsObjectModel\\Tools\\TfsEmbededImagesTool.cs:line 102","Properties":{"wiId":"0","oldTfsurl":"http://vm-ado-server2/DefaultCollection/","newTfsurl":"https://dev.azure.com/FM-STARE-MIGRATION-TESTS/","SourceContext":"MigrationTools.Tools.TfsEmbededImagesTool","versionString":"16.3.3-Local.1-3-g9a8779e7","targetWorkItemId":null,"sourceRevisionInt":1,"sourceWorkItemId":75500,"totalWorkItems":1,"currentWorkItem":1,"sourceWorkItemTypeName":"Task","ProcessId":3228}}

What happened?

The migration tool does not detect the downloaded image as a JPEG and reports it as a potential authentication issue. The cause is that EmbededImagesRepairToolBase.GetImageFormat is too restrictive when identifying JPEGs.

A suggested reimplementation of GetImageFormat could be:

        protected static ImageFormat GetImageFormat(byte[] bytes)
        {
            if (bytes != null && bytes.Length > 1)
            {
                // BMP: 42 4D
                var bmp = new byte[] { 0x42, 0x4D };
                if (bmp.SequenceEqual(bytes.Take(bmp.Length)))
                    return ImageFormat.bmp;
            }
   
            if (bytes == null || bytes.Length < 4)
                throw new ArgumentException("Byte array too short to determine image format.");

            // GIF: GIF87a or GIF89a
            var gif87a = System.Text.Encoding.ASCII.GetBytes("GIF87a");
            var gif89a = System.Text.Encoding.ASCII.GetBytes("GIF89a");

            // PNG: 89 50 4E 47 0D 0A 1A 0A
            var png = new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

            // TIFF: II* or MM*
            var tiffLE = new byte[] { 0x49, 0x49, 0x2A, 0x00 };
            var tiffBE = new byte[] { 0x4D, 0x4D, 0x00, 0x2A };

            // JPEG: FF D8 (simplified for all variants)
            var jpegSOI = new byte[] { 0xFF, 0xD8 };

            // SVG: starts with "<svg" (text-based)
            var svgTag = System.Text.Encoding.ASCII.GetBytes("<svg");

            // Check GIF
            if (gif87a.SequenceEqual(bytes.Take(gif87a.Length)) ||
                gif89a.SequenceEqual(bytes.Take(gif89a.Length)))
                return ImageFormat.gif;

            // Check PNG
            if (png.SequenceEqual(bytes.Take(png.Length)))
                return ImageFormat.png;

            // Check TIFF
            if (tiffLE.SequenceEqual(bytes.Take(tiffLE.Length)) ||
                tiffBE.SequenceEqual(bytes.Take(tiffBE.Length)))
                return ImageFormat.tiff;

            // Check JPEG
            if (jpegSOI.SequenceEqual(bytes.Take(jpegSOI.Length)))
                return ImageFormat.jpeg;

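            // Check SVG (text-based). Only matches files that start with "<svg";
            // files with a leading XML declaration or whitespace fall through to unknown.
            if (svgTag.SequenceEqual(bytes.Take(svgTag.Length)))
                return ImageFormat.svg;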

            return ImageFormat.unknown;
        }

The improved version checks only for the Start of Image (SOI) marker FF D8, which is mandatory for all JPEG files.
This means every valid JPEG will be detected, regardless of its metadata (JFIF, Exif, ICC, Photoshop IRB, Adobe APP14).

The old version hardcodes several 4-byte patterns (FF D8 FF E0, FF D8 FF E1, etc.), which:

  • Covers only some variants.
  • Misses others (e.g., APP14 Adobe marker FF EE or rare APP segments).
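
To make the difference concrete, here is a minimal sketch comparing the two approaches. The class and method names (JpegSniffDemo, IsJpegByFourBytePatterns, IsJpegBySoi) are made up for illustration, and the pattern list is a stand-in that contains only the two patterns named above; it is not the tool's actual code.

```csharp
using System;
using System.Linq;

public static class JpegSniffDemo
{
    // Hypothetical stand-in for a 4-byte allow-list. Only the two patterns
    // named in this issue are listed; the real list in the tool may differ.
    private static readonly byte[][] FourBytePatterns =
    {
        new byte[] { 0xFF, 0xD8, 0xFF, 0xE0 }, // SOI followed by APP0 (JFIF)
        new byte[] { 0xFF, 0xD8, 0xFF, 0xE1 }, // SOI followed by APP1 (Exif)
    };

    // Allow-list style: the header must match one of the known 4-byte patterns.
    public static bool IsJpegByFourBytePatterns(byte[] bytes) =>
        bytes != null &&
        FourBytePatterns.Any(p => p.SequenceEqual(bytes.Take(p.Length)));

    // SOI style: only the first two bytes (FF D8) are checked.
    public static bool IsJpegBySoi(byte[] bytes) =>
        bytes != null && bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xD8;

    public static void Main()
    {
        // Header of a JPEG whose first segment is Adobe APP14 (FF EE).
        var adobeApp14 = new byte[] { 0xFF, 0xD8, 0xFF, 0xEE, 0x00, 0x0E };

        Console.WriteLine(IsJpegByFourBytePatterns(adobeApp14)); // False
        Console.WriteLine(IsJpegBySoi(adobeApp14));              // True
    }
}
```

The SOI check trades strictness for coverage: any byte stream that starts with FF D8 is accepted, which should be enough to tell a real image apart from, for example, an HTML login page returned on an authentication failure.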

The improved version also aims to support all image types supported by Azure DevOps.

For a sample image that does not work with the current implementation, use this image.

Debug in Visual Studio

  • Visual Studio Debug
