Commit 95d5fb0

Add basic entropy calculation
1 parent 7e7aa26 commit 95d5fb0

3 files changed

+57
-31
lines changed

README.md

Lines changed: 44 additions & 30 deletions
@@ -51,92 +51,104 @@ The above example can be explained as:
 - Pass the output of `hex` to `emit` with the argument `'decrypted'`, creating a `decrypted` field
 
 ## Functions
-`btoa()`
-Encodes input to a Base64 string.
+### `btoa()`
+- Encodes input to a Base64 string.
 
 `b64(), atob()`
-Decodes a Base64 encoded string.
+- Decodes a Base64 encoded string.
 
 `b32()`
-Decodes a Base32 encoded string.
+- Decodes a Base32 encoded string.
 
 `b58()`
-Decodes a Base58 encoded string.
+- Decodes a Base58 encoded string.
 
 `rotx(count)`
-Implements Caesarian shift. The count argument specifies the amount to shift and must be an integer.
+- Implements Caesarian shift. The count argument specifies the amount to shift and must be an integer.
 
 `rol(count)`
-Implements rotate-on-left to each character within the string using an 8 bit boundary. The count argument specifies the amount to rotate and must be an integer.
+- Implements rotate-on-left to each character within the string using an 8 bit boundary. The count argument specifies the amount to rotate and must be an integer.
 
 `ror(count)`
-Implements rotate-on-right to each character within the string using an 8 bit boundary. The count argument specifies the amount to rotate and must be an integer.
+- Implements rotate-on-right to each character within the string using an 8 bit boundary.
+- The count argument specifies the amount to rotate and must be an integer.
 
 `xor(key)`
-Implements basic XOR cipher against the field with the supplied key. The key can be provided as a string or integer.
+- Implements basic XOR cipher against the field with the supplied key.
+- The key can be provided as a string or integer.
 
 `rc4('key')`
-Implements the RC4 cipher against the field with the supplied key. The key provided must be a string.
+- Implements the RC4 cipher against the field with the supplied key.
+- The key provided must be a string.
 
 `hex()`
-Transforms input into its hexadecimal representation.
+- Transforms input into its hexadecimal representation.
 
 `unhex()`
-Transforms hexadecimal input into its byte form.
+- Transforms hexadecimal input into its byte form.
 
 `save('name')`
-Saves the current state to memory as name.
+- Saves the current state to memory as name.
 
 `load('name')`
-Recalls the previously saved state name from memory.
+- Recalls the previously saved state name from memory.
 
 `ascii()`
-Transforms input into ASCII output. Non-printable characters will be replaced with a period.
+- Transforms input into ASCII output. Non-printable characters will be replaced with a period.
 
 `emit('name')`
-Outputs the current state as UTF-8 to the field name.
+- Outputs the current state as UTF-8 to the field name.
 
 `substr(offset, count)`
-Returns a substring of the input, starting at the index offset with the number of characters count. Set the count to `'null'` to return from the start offset to the end of the input.
+- Returns a substring of the input, starting at the index offset with the number of characters count.
+- Set the count to `'null'` to return from the start offset to the end of the input.
 
 `slice(start, end)`
-Returns a slice of the input, starting at start offset to the end offset. Set the end to `'null'` to go to the end of the input.
+- Returns a slice of the input, starting at start offset to the end offset.
+- Set the end to `'null'` to go to the end of the input.
 
 `decode('codec')`
-Returns a decoded version of the input based on the codec, python codec list is available on https://docs.python.org/3/library/codecs.html#standard-encodings
+- Returns a decoded version of the input based on the codec.
+- Python codec list is available on https://docs.python.org/3/library/codecs.html#standard-encodings
 
 `escape`
-Returns a string where control characters, \, and non-ASCII characters are backslash escaped (e.g. `\x0a`, `\\`, `\x80`).
+- Returns a string where control characters, \, and non-ASCII characters are backslash escaped (e.g. `\x0a`, `\\`, `\x80`).
 
 `unescape`
-Returns a string run through python unicode_escape (i.e. return the unicode point(s)). Reverses `escape`. Also unescapes Unicode codepoints (`\uxxxx` or `\Uxxxxxxxx`), which `escape` does not produce.
+- Returns a string run through python unicode_escape (i.e. return the unicode point(s)). Reverses `escape`.
+- Also unescapes Unicode codepoints (`\uxxxx` or `\Uxxxxxxxx`), which `escape` does not produce.
 
 `htmlescape`
-Returns a string with `&`, `<`, and `>` XML escaped like `&amp;`.
+- Returns a string with `&`, `<`, and `>` XML escaped like `&amp;`.
 
 `htmlunescape`
-Returns a string with HTML references like `&gt;` and `&#62;` unescaped to `>`.
+- Returns a string with HTML references like `&gt;` and `&#62;` unescaped to `>`.
 
 `tr('from', 'to')`
-Takes an argument to translate "from" and an argument of characters to translate "to" and then returns a result with the result (similar to `tr` in Unix).
+- Takes an argument to translate "from" and an argument of characters to translate "to" and then returns a result with the result (similar to `tr` in Unix).
 
 `rev()`
-Returns the input in reverse order.
+- Returns the input in reverse order.
 
 `find('subseq', start)`
-Returns the index of a subsequence "subseq" starting at index "start", or `-1` if the subsequence is not found.
+- Returns the index of a subsequence "subseq" starting at index "start", or `-1` if the subsequence is not found.
 
 `b32re()`
-Returns a reverse-endian base32 decoded string, as used in the SunBurst DGA.
+- Returns a reverse-endian base32 decoded string, as used in the SunBurst DGA.
 
 `b64re()`
-Returns a reverse-endian base64 decoded string.
+- Returns a reverse-endian base64 decoded string.
 
 `zlib_inflate()`
-Returns zlib.decompress() inflated bytes. Default window size of -15 (raw inflate) is used if a wbits value is not provided.
+- Returns zlib.decompress() inflated bytes.
+- Default window size of -15 (raw inflate) is used if a wbits value is not provided.
 
 `zlib_deflate()`
-Returns zlib.compress() deflated bytes. Default level of -1 (currently 6) and window size of -15 (raw deflate) if values are not provided.
+- Returns zlib.compress() deflated bytes.
+- Default level of -1 (currently 6) and window size of -15 (raw deflate) if values are not provided.
+
+`entropy()`
+- Returns base2 entropy of input. The maximum entropy for Unicode strings can be greater than 8.
 
 _Note: you must use **single quotes** around the strings._
 
@@ -152,6 +164,7 @@ Unicode values can be expressed as `\u0000` or `\U00000000`
 Quotation marks (double quotes) **cannot** be used.
 
 `"This is not a valid string"`
+
 ## Integers
 Integers can be specified numerically or as hexadecimal representations by prefixing values with a 0x.
 
@@ -234,6 +247,7 @@ Steven (malvidin on github)
 ## 2.4.1
 - Added support for null argument padding, so `find('decrypt2')` is equivalent to `find('decrypt2', 0)`
 - Added zlib_deflate for internal validation of zlib_inflate, which can also be used for information analysis
+- Add basic entropy calculation
 
 ## 2.4.0
 Merged pull request from Steven (malvidin on github)

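The README entry for `entropy()` notes that Unicode strings can exceed 8 bits of entropy. The sketch below illustrates why, under the assumption that the input is either a `bytes` value (at most 256 distinct symbols) or a `str` counted per code point. The helper name `shannon_entropy` and the sample inputs are hypothetical and are not part of the commit.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Base-2 Shannon entropy per symbol of a str or bytes value (illustrative helper)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Each probability is in (0, 1], so every term is <= 0; negate the sum.
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Every possible byte value exactly once: 256 equally likely symbols.
all_bytes = bytes(range(256))

# 1024 distinct code points, each used once (hypothetical sample text).
wide_text = "".join(chr(cp) for cp in range(0x100, 0x500))

byte_entropy = shannon_entropy(all_bytes)  # log2(256)
text_entropy = shannon_entropy(wide_text)  # log2(1024)
```

A byte string can never have more than 256 distinct symbols, so its entropy is capped at `log2(256) = 8`. A string counted per code point has no such cap, which is why the README allows for values above 8.
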
bin/lib/decryptlib.py

Lines changed: 12 additions & 0 deletions
@@ -4,9 +4,11 @@
 import base64
 import binascii
 import itertools
+import math
 import string
 import tokenize
 import zlib
+from collections import Counter
 from html import escape as html_escape, unescape as html_unescape
 from io import StringIO
 
@@ -494,6 +496,13 @@ def fn_zlib_deflate(data, args):
         raise Exception(f"{fn_name}(): {exc}\n" + zlib.compressobj.__doc__)
 
 
+@numargs(0)
+def fn_entropy(data, args):
+    data_len = len(data)
+    counter = Counter(data)
+    return abs(sum(c/data_len * math.log2(c/data_len) for c in counter.values()))
+
+
 def get_args(g):
     global g_record
     global g_register
@@ -655,5 +664,8 @@ def parse_statement(s):
         elif cmd == "zlib_deflate":
             yield fn_zlib_deflate, get_args(g)
 
+        elif cmd == "entropy":
+            yield fn_entropy, get_args(g)
+
         else:
             raise Exception(f"'{cmd}' is not a recognized command")

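Read as a formula, the return expression in `fn_entropy` appears to be the standard Shannon entropy. The notation below is introduced here for illustration only: $N$ stands for `data_len`, $c_i$ for each value in `counter`, and $k$ for the number of distinct symbols.

```latex
\begin{aligned}
H &= -\sum_{i=1}^{k} \frac{c_i}{N}\,\log_2\frac{c_i}{N},
  \qquad \sum_{i=1}^{k} c_i = N, \\
0 &< \frac{c_i}{N} \le 1
  \;\Longrightarrow\; \frac{c_i}{N}\,\log_2\frac{c_i}{N} \le 0
  \;\Longrightarrow\; \left|\sum_{i=1}^{k} \frac{c_i}{N}\,\log_2\frac{c_i}{N}\right| = H, \\
0 &\le H \le \log_2 k .
\end{aligned}
```

Because every term of the sum is non-positive, `abs(...)` yields the same value as negating the sum, and it also avoids returning `-0.0` for single-symbol input. For empty input the `Counter` has no values, so the sum is 0 and no division takes place.
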
default/searchbnf.conf

Lines changed: 1 addition & 1 deletion
@@ -43,5 +43,5 @@ usage = public
 #tags = searchcommands_app
 #
 [decrypt-options]
-syntax = field=<string> atob | b64 | btoa | b32 | b58 | hex | unhex | rol(<int>) | ror(<int>) | rotx('<string>') | xor('<string>') | rc4('<string>') | emit('<string>') | load('<string>') | save('<string>') | substr(<int>, <int>) | slice(<int>, <int>) | ascii | decode('<string>') | escape | unescape | tr('<string>', '<string>') | rev | find('<string>', <int>) | b32re | b64re | zlib_inflate(<int>)| zlib_deflate(<int>, <int>)
+syntax = field=<string> atob | b64 | btoa | b32 | b58 | hex | unhex | rol(<int>) | ror(<int>) | rotx('<string>') | xor('<string>') | rc4('<string>') | emit('<string>') | load('<string>') | save('<string>') | substr(<int>, <int>) | slice(<int>, <int>) | ascii | decode('<string>') | escape | unescape | tr('<string>', '<string>') | rev | find('<string>', <int>) | b32re | b64re | zlib_inflate(<int>)| zlib_deflate(<int>, <int>) | entropy()
 description = Pass the field name to work with, then the command or command(s) to be used, an emit() option can be passed to choose the field to return, defaults to the field name "decrypted"
