This repository was archived by the owner on Nov 8, 2022. It is now read-only.

Commit 93e1712

initial commit
0 parents  commit 93e1712

6 files changed

+137
-0
lines changed

.editorconfig

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
# EditorConfig is awesome: https://EditorConfig.org

# top-most EditorConfig file
root = true

[*]
indent_style = space
indent_size = 4
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
/build
/dist
/__pycache__
/pdfmerge.spec

README.md

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
# PdfMerge

A simple CLI program that merges consecutive `.pdf` files in a directory.

This software is intended for a very specific use case:

Our current scanner cannot easily output the front and back page of a scanned document into a single PDF. It can only do that if you scan documents one at a time, which is very inconvenient for anyone who has to scan hundreds of documents at once.

The workaround is that the scanner saves every front and back page into separate files. Each filename contains the current timestamp (down to the millisecond) and an auto-incremented counter. This program simply merges these consecutive files in a given directory.

## Example:

If we have a directory with the following files:

```
1.pdf
3.pdf
5.pdf
4.pdf
```

the program will write the following files into a `merged` subdirectory:

```
{uuid.uuid4()}.pdf # file containing 1.pdf and 3.pdf
{uuid.uuid4()}.pdf # file containing 4.pdf and 5.pdf
```

---

The algorithm for that is very sophisticated:

It sorts the files in the selected directory by name and merges them two at a time. That's it.
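
The sort-and-pair step can be sketched in a few lines. This is only an illustration under stated assumptions, not the code shipped in `pdfmerge.py`: `pair_scans` is a hypothetical helper name, and it operates on plain filename strings instead of reading a directory or writing PDFs.

```python
def chunks(lst, n):
    """Yield successive slices of lst with at most n items each."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]


def pair_scans(filenames):
    """Hypothetical helper: sort the ".pdf" names and pair up neighbours."""
    pdfs = sorted(name for name in filenames if name.endswith(".pdf"))
    return list(chunks(pdfs, 2))


# The file names from the example above
print(pair_scans(["1.pdf", "3.pdf", "5.pdf", "4.pdf"]))
# prints [['1.pdf', '3.pdf'], ['4.pdf', '5.pdf']]
```

The sort is a plain string sort, so `10.pdf` would come before `2.pdf`. Fixed-width names such as the scanner's timestamps sort in scan order. With an odd number of files, the last file ends up in a group of its own.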


# Compile

This software is written in Python 3.9 and uses [PyInstaller](https://www.pyinstaller.org/) to create native binaries.

```bash
pyinstaller pdfmerge.py --onefile --icon=app.ico
```

app.ico

66.1 KB
Binary file not shown.

pdfmerge.py

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
from PyPDF2 import PdfFileMerger
from tkinter import filedialog
import tkinter as tk
import uuid
import os

root = tk.Tk()
root.withdraw()


def pdf_concat():
    pdfs = []

    file_path = filedialog.askdirectory()
    # askdirectory() returns an empty value if the dialog is cancelled
    if not file_path:
        return
    destination = os.path.join(file_path, 'merged')

    # load all ".pdf" files from the selected directory
    # and put them into a list
    for file in os.listdir(file_path):
        if file.endswith(".pdf"):
            pdfs.append(os.path.join(file_path, file))

    pdfs.sort()
    pdfLen = len(pdfs)

    print(f'Found {pdfLen} ".pdf" files in {file_path}')

    # return if the directory contained no pdfs
    if pdfLen < 1:
        return

    # create the destination folder
    if not os.path.isdir(destination):
        os.mkdir(destination)
    else:
        # else clear the folder
        filelist = [f for f in os.listdir(destination) if f.endswith(".pdf")]
        for f in filelist:
            os.remove(os.path.join(destination, f))

    # how many pdfs will be generated (an odd leftover file gets its own pdf)
    length = (pdfLen + 1) // 2
    x = 1

    # chunk the list into chunks of 2 and merge them
    for i in chunks(pdfs, 2):
        newFilename = os.path.join(destination, f'{uuid.uuid4()}.pdf')
        merger = PdfFileMerger()
        for pdf in i:
            merger.append(pdf)
        merger.write(newFilename)
        merger.close()

        # give the user progress feedback
        progress = int((x / length) * 100)
        print(f'{newFilename} ({x} of {length}) {progress}%')
        x += 1

    # prevent automatic closing of the process
    input()


# equivalent of php's array_chunk
def chunks(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]


if __name__ == '__main__':
    pdf_concat()

requirements.txt

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
altgraph==0.17
macholib==1.14
pyinstaller==4.2
pyinstaller-hooks-contrib==2021.1
PyPDF2==1.26.0
tk==0.1.0
