Commit 00cdeaf

Add file name obfuscation
1 parent 8c42db4 commit 00cdeaf

2 files changed: +85 -5 lines changed

README.md

Lines changed: 49 additions & 3 deletions
@@ -28,11 +28,11 @@ The content is encrypted with AES-256 in Python using PyCryptodome, and decrypte
 
 * ~~Sanitize code. F.ex. self.config variables should not be overwritten on other places than on_config.~~
 * ~~Rework search_index handling to be more bulletproof.~~
-* Add optional obfuscation of filenames. F.ex. to make it impossible to guess image names.
+* ~~Add optional obfuscation of file names. F.ex. to make it impossible to guess image names.~~
 * Rework password handling or inventory of some sort
 * ~~As we are vulnerable to brute force: Review strength of used passwords.~~
 * download self-hosted cryptojs just once (check hash of js files)
-* ~~Add button press to decrypt without password (just to hide content from search engines)
+* ~~Add button press to decrypt without password (just to hide content from search engines)~~
 * ...to be defined
 
 # Table of Contents
@@ -43,7 +43,7 @@ The content is encrypted with AES-256 in Python using PyCryptodome, and decrypte
 * [Secret from environment](#secret-from-environment)
 * [Customization](#default-vars-customization)
 * [Translations](#translations)
-* [Obfuscate pages](#obfuscate-pages)
+* [Obfuscate pages](#obfuscate-pages) **NEW**
 * [Features](#features)
 * [HighlightJS support](#highlightjs-support) *(default)*
 * [Arithmatex support](#arithmatex-support) *(default)*
@@ -58,6 +58,7 @@ The content is encrypted with AES-256 in Python using PyCryptodome, and decrypte
 * [Add button](#add-button)
 * [Reload scripts](#reload-scripts)
 * [Self-host crypto-js](#self-host-crypto-js)
+* [File name obfuscation](#file-name-obfuscation)
 * [Contributing](#contributing)
 
 
@@ -595,6 +596,51 @@ plugins:
         selfhost_download: false
 ```
 
+### File name obfuscation
+
+Imagine your pages contain many images and you labeled them "1.jpg", "2.jpg" and so on for some reason.
+If you'd like to encrypt one of these pages, an attacker could try guessing the image file names
+and would be able to download them despite not having the password to the page.
+
+This feature should make it impossible or at least way harder for an external attacker to guess the file names.
+Please also check and disable directory listing for that matter.
+Keep in mind that your hosting provider is still able to see all your images and files.
+
+To counter file name guessing you could activate the feature like this:
+
+```yaml
+plugins:
+    - encryptcontent:
+        selfhost: true
+        selfhost_download: false
+        hash_filenames:
+            extensions:
+                - 'png'
+                - 'jpg'
+                - 'jpeg'
+                - 'svg'
+            except:
+                - 'lilien.svg'
+```
+
+Under `extensions` we define which file name extensions to obfuscate
+(the extension is taken from the part after the last ".",
+so the extension of "image.jpg" is "jpg" and of "archive.tar.gz" is "gz").
+
+You can define multiple exceptions in the `except` list.
+File names that end with these strings will be skipped.
+You should use this if some images are used by themes or other plugins.
+Otherwise, you'd need to change these file names to the obfuscated ones.
+
+The file names are obfuscated by hashing the corresponding file with MD5
+and appending the hash to the file name
+(if the file content does not change, the file name does not change either), like this:
+
+some_image_1_bb80db433751833b8f8b4ad23767c0fc.jpg
+("bb80db433751833b8f8b4ad23767c0fc" being the MD5 hash of said image.)
+
+> The file name obfuscation is currently applied to the whole site - not just the encrypted pages...
+
 # Contributing
 
 From reporting a bug to submitting a pull request: every contribution is appreciated and welcome.
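
The renaming rules described in the new README section (extension taken after the last ".", `except` entries matched as suffixes, hash inserted before the extension) can be sketched in plain Python. This is only an illustration under stated assumptions, not the plugin's code: `file_extension`, `is_excepted`, and `obfuscated_name` are hypothetical helper names, and the guard for names without a "." is an addition that the README text does not mention.

```python
import posixpath


def file_extension(path):
    """Text after the last "." of the file name, lower-cased ("" if there is none)."""
    name = posixpath.basename(path)
    if "." not in name:
        return ""  # assumption: a name such as "CNAME" simply has no extension
    return name.rsplit(".", 1)[1].lower()


def is_excepted(path, exceptions):
    """True if the path ends with any entry of the `except` list."""
    return any(path.endswith(suffix) for suffix in exceptions)


def obfuscated_name(path, file_hash, extensions, exceptions=()):
    """Return the renamed path, or the unchanged path if the rules do not apply."""
    if is_excepted(path, exceptions) or file_extension(path) not in extensions:
        return path
    stem, ext = path.rsplit(".", 1)
    return f"{stem}_{file_hash}.{ext}"


extensions = ["png", "jpg", "jpeg", "svg"]
exceptions = ["lilien.svg"]
digest = "bb80db433751833b8f8b4ad23767c0fc"  # example hash taken from the README text

print(obfuscated_name("img/some_image_1.jpg", digest, extensions, exceptions))
print(obfuscated_name("img/lilien.svg", digest, extensions, exceptions))
print(obfuscated_name("archive.tar.gz", digest, extensions, exceptions))
```

Because exceptions are matched with `endswith`, an entry such as `lilien.svg` would also exempt a file named `waterlilien.svg`; listing a longer path suffix narrows the match.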

encryptcontent/plugin.py

Lines changed: 36 additions & 2 deletions
@@ -84,6 +84,7 @@ class encryptContentPlugin(BasePlugin):
         ('selfhost', config_options.Type(bool, default=False)),
         ('selfhost_download', config_options.Type(bool, default=True)),
         ('translations', config_options.Type(dict, default={}, required=False)),
+        ('hash_filenames', config_options.Type(dict, default={}, required=False)),
         # legacy features, doesn't exist anymore
     )
 
@@ -95,6 +96,13 @@ def __hash_md5__(self, text):
         key.update(text.encode('utf-8'))
         return key.digest()
 
+    def __hash_md5_file__(self, fname):
+        hash_md5 = hashlib.md5()
+        with open(fname, "rb") as f:
+            for chunk in iter(lambda: f.read(4096), b""):
+                hash_md5.update(chunk)
+        return hash_md5.hexdigest()
+
     def __encrypt_text_aes__(self, text, password):
         """ Encrypts text with AES-256. """
         BLOCK_SIZE = 32
@@ -328,6 +336,34 @@ def on_pre_build(self, config, **kwargs):
         except Exception as exp:
             logger.exception(exp)
 
+    def on_files(self, files, config, **kwargs):
+        """
+        The files event is called after the files collection is populated from the docs_dir.
+        Use this event to add, remove, or alter files in the collection.
+        Note that Page objects have not yet been associated with the file objects in the collection.
+        Use Page Events to manipulate page specific data.
+        """
+        if 'extensions' in self.config['hash_filenames']:
+            for file in files:
+
+                if 'except' in self.config['hash_filenames']:
+                    skip = False
+                    for check in self.config['hash_filenames']['except']:
+                        if file.src_path.endswith(check):
+                            skip = True
+                    if skip:
+                        continue
+
+                ext = file.src_path.rsplit('.', 1)[-1].lower()  # [-1] avoids an IndexError for names without a "."
+                if ext in self.config['hash_filenames']['extensions']:
+                    hash = self.__hash_md5_file__(file.abs_src_path)
+                    filename, ext = file.abs_dest_path.rsplit('.',1)
+                    filename = filename + "_" + hash
+                    file.abs_dest_path = filename + "." + ext
+                    filename, ext = file.url.rsplit('.',1)
+                    filename = filename + "_" + hash
+                    file.url = filename + "." + ext
+
     def on_page_markdown(self, markdown, page, config, **kwargs):
         """
         The page_markdown event is called after the page's markdown is loaded from file and
@@ -543,8 +579,6 @@ def on_post_page(self, output_content, page, config, **kwargs):
             if self.setup['search_plugin_found']:
                 location = page.url.lstrip('/')
                 self.setup['locations'][location] = page.encryptcontent['password']
-            print(page.title)
-            print(page.encryptcontent)
             delattr(page, 'encryptcontent')
 
         return output_content
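
The chunked read in `__hash_md5_file__` can be checked in isolation. The sketch below is a standalone approximation, not the plugin's method: `md5_of_file` is a hypothetical free function, and the temporary file and its payload are made up for the demonstration. It suggests that hashing in 4096-byte chunks yields the same digest as hashing the whole content at once, and that unchanged content gives the same digest on every call, which is what keeps the obfuscated names stable across builds.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=4096):
    """Hex MD5 digest of a file, read in fixed-size chunks to keep memory use small."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter(callable, sentinel) keeps calling f.read until it returns b"" (end of file)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Made-up payload, large enough to span several chunks
payload = b"sample bytes standing in for an image" * 1000

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "1.jpg")
    with open(sample, "wb") as f:
        f.write(payload)
    chunked = md5_of_file(sample)
    repeated = md5_of_file(sample)

whole = hashlib.md5(payload).hexdigest()
print(chunked == whole, chunked == repeated)
```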
