can't read contents of file longer than 4K #18

@krisk0

I have no problem parsing an .img that contains only small files. However, when I read a 4898-byte file in 4096-byte chunks, the following error occurs:

  File "/home/mE/i/ext4.bugr/./show_md5.py", line 25, in <module>
    show_md5(volume, volume.root)
  File "/home/mE/i/ext4.bugr/./show_md5.py", line 21, in show_md5
    really_show(file_name, node)
  File "/home/mE/i/ext4.bugr/./show_md5.py", line 10, in really_show
    piece = i.read(4096)
  File "/usr/lib/python3.9/site-packages/ext4.py", line 1020, in read
    blocks[-1] = blocks[-1][:byte_len]
IndexError: list index out of range

The 25-line script that calculates the MD5 hash of every file in the root directory:

#!/usr/bin/python3

import sys, hashlib, ext4

def really_show(name, node):
    sys.stderr.write('working on ' + name + '\n')
    hash_ = hashlib.md5()
    i = node.open_read()
    while True:
        piece = i.read(4096)
        if not piece:
            break
        hash_.update(piece)
    print(name + ' ' + hash_.hexdigest())

def show_md5(volume, node):
    for file_name,inode_no,file_type in node.open_dir():
        if file_type != ext4.InodeType.FILE:
            continue
        node = volume.get_inode(inode_no)
        really_show(file_name, node)

with open('/tmp/ext4-1M.img', 'rb') as i:
    volume = ext4.Volume(i)
    show_md5(volume, volume.root)

Script to create /tmp/ext4-1M.img is here.

I am using the ext4 package to test software that parses an .img and calculates hashes. My colleagues and I are unhappy that it only works for images with small files. How do I compute the hash of a big file?
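
For reference, the behaviour I would expect from read(4096) is the same as for any Python binary file object: a 4898-byte file yields a 4096-byte piece, then an 802-byte piece, then an empty bytes object. Below is a minimal sketch of that expectation, using io.BytesIO as a stand-in for the ext4 reader. The helper name hash_in_chunks is hypothetical and not part of the ext4 package, and the sketch does not touch the package itself.

```python
import hashlib
import io


def hash_in_chunks(stream, chunk_size=4096):
    """Return (MD5 hex digest, list of piece sizes) for a binary stream.

    `stream` is any object with a file-like read(n) method that
    returns b'' at end of file.
    """
    digest = hashlib.md5()
    sizes = []
    while True:
        piece = stream.read(chunk_size)
        if not piece:  # empty bytes object signals end of file
            break
        digest.update(piece)
        sizes.append(len(piece))
    return digest.hexdigest(), sizes


# Stand-in for a 4898-byte file; the content itself is arbitrary.
data = (bytes(range(256)) * 20)[:4898]
hexdigest, sizes = hash_in_chunks(io.BytesIO(data))
print(sizes)      # [4096, 802]
print(hexdigest)  # same digest as hashing the whole buffer at once
```

With the reader returned by node.open_read() in place of io.BytesIO, the same loop is what raises the IndexError above.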
