Description
I want to use pygit2 to trace when a vulnerability was introduced, starting from the fix patches of open-source components.
On Ubuntu, checking out a commit in the Linux repository and running git blame from the command line takes about 13 seconds:
time (git checkout 2734d6c1b1a089fb593ef6a23d4b70903526fe0c && git blame -L 3883,3886 kernel/trace/ring_buffer.c)
Updating files: 100% (77303/77303), done.
Note: switching to '2734d6c1b1a089fb593ef6a23d4b70903526fe0c'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
git switch -c <new-branch-name>
Or undo this operation with:
git switch -
Turn off this advice by setting config variable advice.detachedHead to false
HEAD is now at 2734d6c1b1a0 Linux 5.14-rc2
bf41a158cacba (Steven Rostedt 2008-10-04 02:00:59 -0400 3883) return reader->read == rb_page_commit(reader) &&
bf41a158cacba (Steven Rostedt 2008-10-04 02:00:59 -0400 3884) (commit == reader ||
bf41a158cacba (Steven Rostedt 2008-10-04 02:00:59 -0400 3885) (commit == head &&
bf41a158cacba (Steven Rostedt 2008-10-04 02:00:59 -0400 3886) head->read == rb_page_commit(commit)));
real 0m13.343s
user 0m6.960s
sys 0m5.266s
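For comparing this output with results from other tools, the blame lines above can be parsed into a commit prefix, author, UTC timestamp, and line number. This is only a sketch: `BlameLine` and `parse_blame_line` are hypothetical names, not part of git or pygit2, and the regular expression assumes git's default human-readable blame format as shown above (not `--porcelain`).

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Assumption: default `git blame` line layout, i.e.
# "<abbrev-commit> [<file>] (<author> <date> <time> <tz> <line>) <content>"
_BLAME_RE = re.compile(
    r"^\^?(?P<commit>[0-9a-f]+) .*?\("
    r"(?P<author>.+?)\s+"
    r"(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})\s+"
    r"(?P<line>\d+)\)"
)


@dataclass
class BlameLine:
    """Hypothetical container for one parsed blame line."""
    commit: str        # abbreviated commit id as printed by git
    author: str
    utc_time: str      # author time converted to UTC, ISO 8601
    line_number: int


def parse_blame_line(text: str) -> Optional[BlameLine]:
    """Parse one line of default `git blame` output; return None if it does not match."""
    m = _BLAME_RE.match(text)
    if m is None:
        return None
    when = datetime.strptime(m.group("date"), "%Y-%m-%d %H:%M:%S %z")
    return BlameLine(
        commit=m.group("commit"),
        author=m.group("author"),
        utc_time=when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        line_number=int(m.group("line")),
    )


# Sample line copied from the git blame output above.
sample = (
    "bf41a158cacba (Steven Rostedt 2008-10-04 02:00:59 -0400 3883) "
    "return reader->read == rb_page_commit(reader) &&"
)
parsed = parse_blame_line(sample)
print(parsed)
```

Because git prints an abbreviated id, a full commit id from another tool can be compared with `full_id.startswith(parsed.commit)`.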
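However, pygit2 is significantly slower than the command line (about 50 seconds versus 13), and the results are inconsistent: git blame attributes these lines to commit bf41a158cacba, while pygit2 reports 92b29b86fe2e183d44eb467e5e74a5f718ef2e43.
Here is my Python code: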
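import pygit2
from datetime import datetime, timezone

def show_commit_line_blame(repo_path, commit_hash, file_path, min_line_number, max_line_number):
    repo = pygit2.Repository(repo_path)
    # Blame only the requested line range, starting from commit_hash.
    # Note: the git blame command above does not pass --first-parent.
    blame = repo.blame(
        file_path,
        newest_commit=commit_hash,
        flags=pygit2.GIT_BLAME_FIRST_PARENT,
        min_line=min_line_number,
        max_line=max_line_number
    )
    for line_number in range(min_line_number, max_line_number + 1):
        # Look up the hunk covering this line and the commit it is attributed to.
        hunk = blame.for_line(line_number)
        blamed_commit = repo.get(hunk.final_commit_id)
        # Convert the commit timestamp to an ISO 8601 UTC string.
        utc_time = datetime.fromtimestamp(blamed_commit.commit_time, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        print(f"Line {line_number} in {file_path} last modified by commit {blamed_commit.id} on {utc_time}")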
Output:
Line 3883 in kernel/trace/ring_buffer.c last modified by commit 92b29b86fe2e183d44eb467e5e74a5f718ef2e43 on 2008-10-20T20:35:07Z
Line 3884 in kernel/trace/ring_buffer.c last modified by commit 92b29b86fe2e183d44eb467e5e74a5f718ef2e43 on 2008-10-20T20:35:07Z
Line 3885 in kernel/trace/ring_buffer.c last modified by commit 92b29b86fe2e183d44eb467e5e74a5f718ef2e43 on 2008-10-20T20:35:07Z
Line 3886 in kernel/trace/ring_buffer.c last modified by commit 92b29b86fe2e183d44eb467e5e74a5f718ef2e43 on 2008-10-20T20:35:07Z
used : 50.29366707801819s
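The snippet above does not show how the elapsed time was measured. One possible way, given here as a sketch only, is to wrap the call with `time.perf_counter`. The helper name `measure` is hypothetical and is not part of pygit2.

```python
import time


def measure(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload so the example runs without a repository.
result, elapsed = measure(sum, [1, 2, 3])
print(f"used : {elapsed}s")
```

Timing `pygit2.Repository(repo_path)` and `repo.blame(...)` separately with the same helper would show which of the two steps accounts for most of the time.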
Environment:
ubuntu 22.04 (6.8.0-58-generic #60~22.04.1-Ubuntu SMP)
Python 3.10.12
pygit2 1.18.1
Could you give me some suggestions to improve execution efficiency?