Possible file system corruption when truncating or overwriting file data with fault tolerant mode enabled

Background Information
----------------------

When file data is overwritten or a file is truncated in fault tolerant mode, FileX applies special handling to avoid "losing" the cluster chain if the application is interrupted by a power failure or an application fault. With fault tolerant operation enabled, the cluster chain is deleted backward, starting from the end of the chain. This avoids leaving a partially deleted chain in the FAT that is attached to nothing and therefore permanently wastes space. The logic is implemented in the function _fx_fault_tolerant_cleanup_FAT_chain(). To reduce the number of reads, the cluster chain deletion (called cleanup in the code) is performed in chunks called sessions.
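As a rough illustration of why the deletion order matters, here is a minimal toy model, not FileX code. The FAT is a plain array, and every name (`toy_collect`, `toy_free_backward`, `TOY_FREE`, `TOY_EOC`) is hypothetical. It frees a chain from its last cluster backward and can be stopped early to mimic an interruption.

```c
#include <assert.h>

#define TOY_FREE      0u       /* entry value of an unallocated cluster (toy value) */
#define TOY_EOC       0xFFFFu  /* end-of-chain marker (toy value) */
#define TOY_MAX_CHAIN 16u      /* longest chain this sketch can handle */

/* Walk the chain from head and record every cluster that is still
   allocated. Returns the number of clusters recorded. */
unsigned toy_collect(const unsigned *fat, unsigned head, unsigned *chain)
{
    unsigned n = 0;
    unsigned c = head;

    while (c >= 2u && c < TOY_EOC && fat[c] != TOY_FREE && n < TOY_MAX_CHAIN)
    {
        chain[n++] = c;
        c = fat[c];
    }
    return n;
}

/* Free at most max_steps clusters, last cluster of the chain first.
   A small max_steps stands in for a power failure part-way through.
   Returns the number of clusters actually freed. */
unsigned toy_free_backward(unsigned *fat, unsigned head, unsigned max_steps)
{
    unsigned chain[TOY_MAX_CHAIN];
    unsigned n;
    unsigned freed = 0;

    assert(head >= 2u);
    n = toy_collect(fat, head, chain);

    while (n > 0u && freed < max_steps)
    {
        fat[chain[--n]] = TOY_FREE;
        freed++;
    }
    return freed;
}
```

In this model, interrupting the cleanup of the five-cluster chain 2 -> 3 -> 4 -> 5 -> 6 after two steps leaves clusters 2, 3 and 4 allocated and still reachable from the head, so a later pass can find and free them. Freeing from the head first would instead leave the remaining clusters allocated with nothing pointing at them.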

Summary
-------

When cleaning up chains of a certain length, or chains located at specific positions in the FAT, the loop logic within _fx_fault_tolerant_cleanup_FAT_chain() can miss the end of the chain and start a new session at the cluster following the end of the chain, which leads to file system corruption. Two scenarios have been identified in which this appears to happen.

Scenario 1
----------

When file data is overwritten at some offset past the start of a file, and the write spans exactly the number of clusters that fit in the internal cluster cache, the following logic starts a new session instead of terminating the cleanup operation.

- fx_fault_tolerant_cleanup_FAT_chain.c line 270.

            /* Move to next cluster. */
            current_cluster = next_cluster;
        } while ((next_cluster >= FX_FAT_ENTRY_START) &&
                 (next_cluster < media_ptr -> fx_media_fat_reserved) &&
                 (next_cluster != tail_cluster) &&
                 (cache_count < cache_max));

        /* Get next session. */
        if (cache_count == cache_max)
        {
            next_session = next_cluster;
        }


The code segment above shows the end condition of the do-while loop that sets up a session. When the end of the section of the cluster chain to delete is reached at the same moment the cache fills up, that is, when (next_cluster == tail_cluster) and (cache_count == cache_max) are both true on loop exit, the check that follows the loop only looks at cache_count. The (next_cluster != tail_cluster) condition is effectively ignored and a new session is started regardless.
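The behaviour can be reproduced with a small stand-alone model of the loop. This is a simplified sketch, not the FileX source. The FAT is a plain array, the `TOY_` constants are placeholder values, and `toy_next_session()` is a hypothetical helper that keeps only the loop condition and the check that follows it. The model assumes tail_cluster is the first cluster that must be kept.

```c
#include <assert.h>

#define TOY_FAT_ENTRY_START 2u       /* first valid cluster number */
#define TOY_FAT_RESERVED    0xFFF0u  /* values at or above this are not links (toy value) */
#define TOY_FREE_CLUSTER    0u       /* means "no further session" in this model */

/* Model of one session setup pass. Returns the cluster at which the
   next session would start, or TOY_FREE_CLUSTER if none is scheduled. */
unsigned toy_next_session(const unsigned *fat, unsigned head_cluster,
                          unsigned tail_cluster, unsigned cache_max)
{
    unsigned current_cluster = head_cluster;
    unsigned next_cluster;
    unsigned cache_count = 0;
    unsigned next_session = TOY_FREE_CLUSTER;

    assert(cache_max > 0u);

    do
    {
        /* Read the link and count the current cluster as cached. */
        next_cluster = fat[current_cluster];
        cache_count++;

        /* Move to next cluster. */
        current_cluster = next_cluster;
    } while ((next_cluster >= TOY_FAT_ENTRY_START) &&
             (next_cluster < TOY_FAT_RESERVED) &&
             (next_cluster != tail_cluster) &&
             (cache_count < cache_max));

    /* Same check as in the excerpt: only the cache fill level is tested. */
    if (cache_count == cache_max)
    {
        next_session = next_cluster;
    }
    return next_session;
}
```

With the toy chain 2 -> 3 -> 4 -> 5 -> 6 and tail_cluster set to 5 (clusters 2 to 4 are to be removed), a cache_max of 3 makes the function return 5, so the next session would start on a cluster that must be kept. With a cache_max of 4 the loop exits on the tail check alone and no session is scheduled.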

Scenario 2
----------

On FAT12 media, when the cluster chain cleanup detects that a FAT entry spans two different sectors (which is only possible in FAT12), a new session is created at the boundary. This is probably done to prevent a corrupted FAT entry from being unrecoverable after a power failure. In this scenario, if the end of the cluster chain to clean up falls exactly on one of those boundaries, the cleanup logic erroneously starts a new session, with results similar to Scenario 1 above.

- fx_fault_tolerant_cleanup_FAT_chain.c line 260.

            /* Check whether FAT entry spans multiple sectors.  */
            if (_fx_utility_FAT_entry_multiple_sectors_check(media_ptr, current_cluster))
            {
                if (head_cluster == next_session || next_session == FX_FREE_CLUSTER)
                {
                    next_session = next_cluster;
                }
                break;
            }

            /* Move to next cluster. */
            current_cluster = next_cluster;
        } while ((next_cluster >= FX_FAT_ENTRY_START) &&
                 (next_cluster < media_ptr -> fx_media_fat_reserved) &&
                 (next_cluster != tail_cluster) &&
                 (cache_count < cache_max));


When _fx_utility_FAT_entry_multiple_sectors_check() returns true, the break bypasses the loop end condition and a new session is started even if (next_cluster == tail_cluster). This causes the cluster cleanup operation to continue past the end of the chain to delete.
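To make "falls exactly on one of those boundaries" concrete, the sketch below computes which FAT12 entries straddle a sector boundary. It relies only on the FAT12 layout (12-bit entries, so entry N starts at byte floor(N * 1.5) of the FAT). The function name `toy_fat12_entry_spans_sectors()` is hypothetical and is not claimed to match how the FileX helper is written.

```c
#include <assert.h>

/* Returns 1 if the 12-bit FAT entry of the given cluster starts on the
   last byte of a sector and therefore continues in the next sector. */
int toy_fat12_entry_spans_sectors(unsigned long cluster,
                                  unsigned long bytes_per_sector)
{
    unsigned long byte_offset;

    assert(bytes_per_sector > 0ul);

    /* Each entry is 1.5 bytes wide: offset = floor(cluster * 3 / 2). */
    byte_offset = cluster + (cluster / 2ul);

    return (byte_offset % bytes_per_sector) == (bytes_per_sector - 1ul);
}
```

With 512-byte sectors, the entry for cluster 341 starts at byte 511 of the FAT, so it occupies the last byte of one sector and the first byte of the next; cluster 682 is the next such case. A chain to delete whose last cluster is one of these would match the situation described in this scenario.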

Potential Fix
-------------

A potential solution would be, for the two conditions above, to check whether the end of the cluster chain to delete has been reached before setting up a new session.
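One way such a check could look is sketched below on a simplified model of the loop. This is not a patch against FileX. The FAT is a plain array, the `TOY_` constants are placeholder values, `toy_next_session_guarded()` is a hypothetical name, and the `split_cluster` parameter stands in for the multiple-sectors check by naming the one cluster at which the model forces a session break.

```c
#include <assert.h>

#define TOY_FAT_ENTRY_START 2u       /* first valid cluster number */
#define TOY_FAT_RESERVED    0xFFF0u  /* values at or above this are not links (toy value) */
#define TOY_FREE_CLUSTER    0u       /* means "no further session" in this model */

/* Model of one session setup pass with a tail check added to both places
   where a new session can be scheduled. Returns the start of the next
   session, or TOY_FREE_CLUSTER if none is scheduled. */
unsigned toy_next_session_guarded(const unsigned *fat, unsigned head_cluster,
                                  unsigned tail_cluster, unsigned cache_max,
                                  unsigned split_cluster)
{
    unsigned current_cluster = head_cluster;
    unsigned next_cluster;
    unsigned cache_count = 0;
    unsigned next_session = TOY_FREE_CLUSTER;

    assert(cache_max > 0u);

    do
    {
        next_cluster = fat[current_cluster];
        cache_count++;

        /* Stand-in for the "entry spans multiple sectors" check. */
        if (current_cluster == split_cluster)
        {
            /* Added guard: do not schedule a session at the tail. */
            if ((next_cluster != tail_cluster) &&
                (head_cluster == next_session || next_session == TOY_FREE_CLUSTER))
            {
                next_session = next_cluster;
            }
            break;
        }

        /* Move to next cluster. */
        current_cluster = next_cluster;
    } while ((next_cluster >= TOY_FAT_ENTRY_START) &&
             (next_cluster < TOY_FAT_RESERVED) &&
             (next_cluster != tail_cluster) &&
             (cache_count < cache_max));

    /* Added guard: a full cache only starts a session before the tail. */
    if ((cache_count == cache_max) && (next_cluster != tail_cluster))
    {
        next_session = next_cluster;
    }
    return next_session;
}
```

In this model, with the chain 2 -> 3 -> 4 -> 5 -> 6 and tail_cluster set to 5, neither a cache_max of 3 nor a forced break at cluster 4 schedules another session, while chains that really continue past the cache limit or the break still get one. Whether the real function needs further guards, for example for end-of-chain markers, is not covered by this sketch.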
