Commit 8a84fc6

Author: Maximilian Golla
Commit message: The initial release
0 parents, commit 8a84fc6

44 files changed

+11898
-0
lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
1+
# C extensions
2+
*.so

README.md

Lines changed: 276 additions & 0 deletions
@@ -0,0 +1,276 @@
1+
OMEN: Ordered Markov ENumerator
2+
================================
3+
4+
OMEN is a Markov model-based password guesser written in C. It generates password candidates according to their occurrence probabilities, i.e., it outputs the most likely passwords first. OMEN significantly improves guessing speed over existing proposals.
5+
If you are interested in the details on how OMEN improves on existing Markov model-based password guessing approaches, please refer to [OMEN: Faster Password Guessing Using an Ordered Markov Enumerator](https://hal.archives-ouvertes.fr/hal-01112124/file/omen.pdf).
6+
7+
User Guide
8+
-----------
9+
10+
OMEN consists of two separate program modules: `createNG` and `enumNG`. `createNG`
11+
calculates the probabilities based on a given list of passwords and stores them
12+
on the hard disk. Based on these probabilities `enumNG` enumerates new
13+
passwords in the correct order (descending).
14+
15+
### Installation
16+
17+
Use a recent Linux version and make sure you have installed `git` (Git version control system), `gcc` (GNU Compiler Collection), and `make` (GNU Make). On Ubuntu Linux, you can install them via:
18+
19+
`sudo apt-get install build-essential git`
20+
21+
Check out the source code via:
22+
23+
`git clone https://github.com/RUB-SysSec/OMEN.git OMEN`
24+
25+
Change into the newly created directory `OMEN` and run:
26+
27+
`make`
28+
29+
If compilation is successful, you can find `createNG` and `enumNG` within the current directory.
30+
31+
```
32+
.
33+
├── alphabetCreator
34+
├── createNG
35+
├── docs
36+
│   ├── CHANGELOG.md
37+
│   ├── LICENSE
38+
│   └── screenshots
39+
├── enumNG
40+
├── evalPW
41+
├── makefile
42+
├── README.md
43+
└── src
44+
    ├── alphabetCreator.c
45+
...
46+
```
47+
48+
If you like, you can now remove the `src` folder and the `makefile` file; they are no longer needed.
49+
50+
### Basic Usage
51+
52+
Before one can generate any passwords, the n-gram probabilities have to be estimated using
53+
`createNG`. To calculate the probabilities using the default settings, `createNG` must be
54+
called with the path to a password list to train on:
55+
56+
`./createNG --iPwdList /tmp/password-training-list.txt`
57+
58+
Each password in the given list must be on its own line. The module then
59+
reads and evaluates the list, generating several files. Besides a config file (`createConfig`) storing the settings used (in this case the defaults), several files are created containing information about the n-grams and the password length. These files have the extension '`.level`':
60+
61+
* **IP.level** (Initial Probability): Stores the probabilities of the first
62+
(n-1)-gram of each password.
63+
* **CP.level** (Conditional Probability): Stores the probabilities of the actual
64+
n-grams.
65+
* **EP.level** (End Probability): Stores the probabilities of the last (n-1)-gram
66+
of each password.
67+
* **LN.level** (Length): Stores the probabilities for the password length.
68+
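To make the roles of the four files more concrete, here is a minimal Python sketch of how a single password could be split into the pieces that each file describes. It assumes an n-gram size of 4, so the initial and end grams have length 3. The function `split_password` is hypothetical and not part of OMEN, and OMEN's own handling of short passwords may differ.

```python
def split_password(password, n=4):
    """Split a password into the parts modeled by IP, CP, EP, and LN.

    Illustration only: `n` and this helper are assumptions, not OMEN code.
    """
    if len(password) < n:
        raise ValueError("password is shorter than the n-gram size")
    return {
        # first (n-1)-gram of the password
        "IP": password[: n - 1],
        # every n-gram, sliding one character at a time
        "CP": [password[i : i + n] for i in range(len(password) - n + 1)],
        # last (n-1)-gram of the password
        "EP": password[-(n - 1):],
        # password length
        "LN": len(password),
    }


parts = split_password("demo123")
print(parts)
```

Each entry corresponds to one lookup in the matching `.level` file; the real files store levels for all observed grams, not a single password.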
69+
The probabilities of each n-gram and the lengths are mapped to levels between 0
70+
(most likely) and 10 (least likely). Once those files are created, `enumNG` can
71+
be used to generate a list of passwords ordered by probability. Currently, `enumNG` supports three modes of operation: *file*, *stdout*, and *simulated plaintext attack*. In the default mode of `enumNG`, a list of password guesses based on these levels is created. The command
72+
73+
`./enumNG`
74+
75+
generates 1 billion passwords and **stores them in a text file**, which can be found
76+
in the '*results*' folder. The passwords in this file are ordered by level (i.e., by
77+
probability). Since common text editors are not able to handle such huge files,
78+
it is recommended for testing to reduce the number of passwords created. This
79+
can be done using the argument `-m`.
80+
81+
`./enumNG -m 10000`
82+
83+
This creates an ordered list of only 10,000 passwords. To print the passwords to the **standard output (stdout) stream**, use the argument `-p`.
84+
85+
`./enumNG -p -m 10000`
86+
87+
If you are interested in evaluating the guessing performance against a *plaintext* password test set, use the argument `-s`. Please note: In this mode OMEN benefits from the adaptive length scheduling algorithm incorporating live feedback, which is not available (due to the missing feedback channel) in *file* and *stdout* mode.
88+
89+
`./enumNG -s=password-testing-list.txt -m 10000`
90+
91+
The result of this evaluation can be found in the '*results*' folder.
92+
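The idea behind the simulated plaintext attack can be sketched as follows: replay an ordered stream of guesses against a plaintext test set and record how many accounts are cracked after each successful guess. This is a simplified illustration, not OMEN's implementation; in particular it omits the adaptive length scheduling mentioned above, and the name `simulate_attack` is hypothetical.

```python
def simulate_attack(guesses, test_passwords, max_guesses):
    """Count how many test passwords an ordered guess stream would crack.

    Hypothetical helper: a plain replay without adaptive length scheduling.
    """
    remaining = {}
    for pw in test_passwords:
        # keep duplicates, since several accounts may share one password
        remaining[pw] = remaining.get(pw, 0) + 1
    total = len(test_passwords)
    cracked = 0
    progress = []  # (guess number, accounts cracked so far)
    for number, guess in enumerate(guesses, start=1):
        if number > max_guesses:
            break
        hits = remaining.pop(guess, 0)
        if hits:
            cracked += hits
            progress.append((number, cracked))
    ratio = cracked / total if total else 0.0
    return ratio, progress


guesses = ["123456", "password", "iloveyou", "qwerty"]
test_set = ["password", "qwerty", "password", "S3cure!pw"]
ratio, progress = simulate_attack(guesses, test_set, max_guesses=3)
print(ratio, progress)
```

Here `max_guesses` plays the role of the `-m` limit: the fourth guess would have cracked another account, but it is never issued.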
93+
Both modules provide a help dialog which can be shown using the `-h` or `--help` argument.
94+
95+
### Password Cracking
96+
97+
Besides the [academic use case](https://password-guessing.org) of [improving probabilistic password modeling](https://hal.archives-ouvertes.fr/hal-01112124/file/omen.pdf), [estimating guess numbers](https://github.com/RUB-SysSec/Password-Guessing-Framework) or [password strength](https://www.internetsociety.org/sites/default/files/06_3.pdf), one might be interested in cracking hashed (unknown) passwords. Popular password cracking utilities like [Hashcat](https://github.com/hashcat/hashcat) and [John the Ripper](https://github.com/magnumripper/JohnTheRipper) support hundreds of
98+
hash and cipher formats and can be easily integrated thanks to their ability to
99+
read password candidates via their standard input (stdin) stream.
100+
101+
`./enumNG -p -m 10000 | ./hashcat64.bin ...`
102+
103+
or
104+
105+
`./enumNG -p -m 10000 | ./john --stdin ...`
106+
107+
For optimal guessing performance, consider training `createNG` on a password distribution similar to the one you want to crack.
108+
109+
Please note: Using probabilistic password modeling to crack passwords should, in general, only be considered against slow hashes (e.g., [bcrypt](https://en.wikipedia.org/wiki/Bcrypt), [PBKDF2](https://en.wikipedia.org/wiki/PBKDF2), [scrypt](https://en.wikipedia.org/wiki/Scrypt), or [Argon2](https://en.wikipedia.org/wiki/Argon2)), where the number of feasible guesses is limited, or in very targeted attacks. In contrast, for very fast hashes ([MD5](https://en.wikipedia.org/wiki/MD5), [SHA-1](https://en.wikipedia.org/wiki/SHA-1), or [NTLM](https://en.wikipedia.org/wiki/NT_LAN_Manager)), [good dictionaries](https://weakpass.com) combined with mangling rules (e.g., best64.rule) are the way to go.
110+
111+
If you are interested in this topic, consider reading the following papers and their related work (this list is incomplete; you can help by expanding it):
112+
113+
**Probabilistic Context-Free Grammars**
114+
* Password Cracking Using Probabilistic Context-Free Grammars (SP '09)
115+
* Guess Again (and Again and Again): Measuring Password Strength by Simulating Password-Cracking Algorithms (SP '12)
116+
* On the Semantic Patterns of Passwords and their Security Impact (NDSS '14)
117+
* Next Gen PCFG Password Cracking (TIFS '15)
118+
* ...
119+
* [Software A](https://github.com/lakiw/pcfg_cracker), [Software B](https://sites.google.com/site/reusablesec/Home/password-cracking-tools/probablistic_cracker)
120+
121+
**Markov Models**
122+
* Fast Dictionary Attacks on Passwords Using Time-Space Tradeoff (CCS '05)
123+
* OMEN+: When Privacy meets Security: Leveraging personal information for password cracking (CoRR '13)
124+
* A Study of Probabilistic Password Models (SP '14)
125+
* OMEN: Faster Password Guessing Using an Ordered Markov Enumerator (ESSoS '15)
126+
* ...
127+
* [Software A](http://openwall.info/wiki/john/markov), [Software B](https://github.com/RUB-SysSec/OMEN)
128+
129+
**Neural Networks**
130+
* Fast, Lean, and Accurate: Modeling Password Guessability Using Neural Networks (USENIX '16)
131+
* Using Neural Networks for Password Cracking (Blog post by Sebastian Neef '16)
132+
* Design and Evaluation of a Data-Driven Password Meter (CHI '17)
133+
* ...
134+
* [Software A](https://github.com/gehaxelt/RNN-Passwords), [Software B](https://github.com/cupslab/neural_network_cracking)
135+
136+
**Hybrids**
137+
* John the Ripper '*Incremental*' Mode
138+
* Introducing the PRINCE Attack-Mode (PASSWORDS '14)
139+
* ...
140+
* [Software A](http://www.openwall.com/john/doc/MODES.shtml), [Software B](https://github.com/hashcat/princeprocessor)
141+
142+
**Approach Comparison**
143+
* Measuring Real-World Accuracies and Biases in Modeling Password Guessability (USENIX '15)
144+
* A Framework for Comparing Password Guessing Strategies (PASSWORDS '15)
145+
* PARS: A Uniform and Open-source Password Analysis and Research System (ACSAC '15)
146+
* ...
147+
* [Software A](https://password-guessing.org), [Software B](https://pgs.ece.cmu.edu)
148+
149+
### Advanced Usage
150+
151+
Both modules provide several command line arguments to select the various
152+
modes available and change the default settings. For instance, the probability
153+
distribution created during the `createNG` process may be manipulated by
154+
choosing one of the supported smoothing functions, the n-gram size, or the used
155+
alphabet. All available parameters for `createNG`, with a short description and their default values, can be listed by calling the program with `-h` or `--help`. The same works for `enumNG`, where, for instance, the enumeration mode, the length scheduling algorithm (only used in `-s` mode, see the '*Basic Usage*' section), and the maximum number of attempts can be selected. If no enumeration mode is given, the
156+
default mode is executed, storing all created passwords in a text file in the
157+
'*results*' folder.
158+
159+
OMEN+
160+
-----
161+
162+
OMEN+ is based on [When Privacy Meets Security: Leveraging Personal Information for Password Cracking](https://arxiv.org/pdf/1304.6584.pdf)
163+
and is an additional feature of OMEN (implemented in the same binary). Using additional personal information about a user (e.g., a password hint or personal background information scraped from a social network) may help in speeding up the password guessing process (comparable to John the Ripper '*Single crack*' mode).
164+
165+
166+
To use it, one or more related hints (tab-separated) must be provided in a separate file. Furthermore, an alpha file is required containing the respective
167+
alpha values (tab-separated on one line). Alpha values are used to weight the impact of the provided hints. Note that for each hint in a
168+
line an alpha has to be specified in the alpha file. These alphas have to be in
169+
the same order as the hints per line.
170+
171+
As an example, suppose we want to guess the password "*Mary'sPW2305*". The
172+
corresponding line in the hint file containing *first name*, *username*, *date of
173+
birth*, and *email address* looks like the following:
174+
175+
```
176+
mary mary1 19880523 mary1@yahoo.com
177+
```
178+
179+
An alpha file should order the related alpha values for *first name*, *username*,
180+
*date of birth*, and *email address* in the same order as in the hint file. For
181+
example:
182+
183+
```
184+
1 2 1 2
185+
```
186+
187+
To use OMEN+, `enumNG` must be called with the paths to a hint file and an
188+
alpha file:
189+
190+
`./enumNG -H hint-file.txt -a alpha-file.txt`
191+
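The relationship between the two input files can be sketched in Python as follows. The sketch pairs each tab-separated hint with the alpha at the same position and rejects lines whose hint count does not match the number of alphas. It is a hypothetical pre-flight check (`pair_hints_with_alphas` is not part of OMEN) and says nothing about how `enumNG` applies the alphas internally.

```python
def pair_hints_with_alphas(hint_text, alpha_text):
    """Pair each tab-separated hint with the alpha at the same position.

    Hypothetical validation helper; alphas are parsed as integers.
    """
    alpha_lines = [line for line in alpha_text.splitlines() if line.strip()]
    if len(alpha_lines) != 1:
        raise ValueError("alpha file must contain exactly one line")
    alphas = [int(a) for a in alpha_lines[0].split("\t")]

    paired = []
    for line_no, line in enumerate(hint_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip empty lines
        hints = line.split("\t")
        if len(hints) != len(alphas):
            raise ValueError(
                f"line {line_no}: {len(hints)} hints but {len(alphas)} alphas"
            )
        paired.append(list(zip(hints, alphas)))
    return paired


hint_text = "mary\tmary1\t19880523\tmary1@yahoo.com\n"
alpha_text = "1\t2\t1\t2\n"
print(pair_hints_with_alphas(hint_text, alpha_text))
```

With the example files above, the username and email address receive an alpha of 2, while the first name and date of birth receive an alpha of 1.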
192+
Performance
193+
-----------
194+
![OMEN](/docs/screenshots/performance.png?raw=true "OMEN")
195+
196+
197+
198+
Smoothing Configuration
199+
-----------------------
200+
201+
The smoothing function is selected and configured using a configuration file (`createConfig`).
202+
The file must contain the name of the smoothing function and may contain the
203+
values for any variable parameters. The file should be formatted like this:
204+
205+
```
206+
<name>
207+
-<parameter>_<target> <value>
208+
...
209+
```
210+
211+
At this time, the only supported smoothing functions are **none** and **additive**.
212+
213+
The allowed parameters (`<parameter>`) are:
214+
* **levelAdjust** (level adjustment factor; heavily influences performance; good values are 100-250)
215+
* **delta** (additive smoothing adds a value δ (delta) to each n-gram, e.g., 0, 1, 2, ...)
216+
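As a reminder of what the **delta** parameter means, the following Python sketch shows textbook additive smoothing for the characters that follow a single n-gram context. It is an illustration of the general technique under the assumption that δ is added to each count; how OMEN combines δ with **levelAdjust** to derive levels is not shown here, and `additive_smoothing` is a hypothetical name.

```python
def additive_smoothing(counts, alphabet, delta=1):
    """Textbook additive smoothing over one n-gram context.

    counts: observed counts per next character; unseen characters count 0.
    """
    total = sum(counts.get(ch, 0) for ch in alphabet)
    denominator = total + delta * len(alphabet)
    if denominator == 0:
        raise ValueError("no observations and delta is 0")
    return {ch: (counts.get(ch, 0) + delta) / denominator for ch in alphabet}


counts = {"a": 3, "b": 1}
probs = additive_smoothing(counts, alphabet="abc", delta=1)
print(probs)
```

With δ = 1 the unseen character `c` receives a small non-zero probability; with δ = 0 it would receive none.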
217+
The allowed targets (`<target>`) are:
218+
* **IP** (Initial Probability)
219+
* **CP** (Conditional Probability)
220+
* **EP** (End Probability)
221+
* **all** (Parameter is used for all possible targets)
222+
223+
Note that a value set for a single target overrides the one set for **all**.
224+
225+
An example configuration for the **add1(250)** (Additive Smoothing (δ=1), Level Adjustment Factor of 250) smoothing setting:
226+
```
227+
additive
228+
-delta_all 1
229+
-delta_LN 0
230+
-levelAdjust_all 250
231+
-levelAdjust_CP 2
232+
-levelAdjust_LN 1
233+
```
234+
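The override rule can be made explicit with a small Python sketch that parses a configuration in the format shown above and resolves the effective value per target. This is a hypothetical reader, not OMEN's parser. It assumes integer values and accepts `LN` as a target because it appears in the example, even though it is not in the list of allowed targets above.

```python
def parse_smoothing_config(text, targets=("IP", "CP", "EP", "LN")):
    """Parse '<name>' plus '-<parameter>_<target> <value>' lines.

    Hypothetical sketch: values are read as integers, and the target set
    (including LN) is an assumption taken from the example above.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty configuration")
    name = lines[0]
    general, specific = {}, {}
    for ln in lines[1:]:
        key, value = ln.split()
        parameter, target = key.lstrip("-").split("_", 1)
        bucket = general if target == "all" else specific
        bucket[(parameter, target)] = int(value)

    resolved = {}
    for (parameter, _), value in general.items():
        for target in targets:
            resolved[(parameter, target)] = value
    resolved.update(specific)  # a single-target value overrides "all"
    return name, resolved


config = """additive
-delta_all 1
-delta_LN 0
-levelAdjust_all 250
-levelAdjust_CP 2
-levelAdjust_LN 1
"""
name, settings = parse_smoothing_config(config)
print(name, settings)
```

For the example, `delta` resolves to 1 for IP, CP, and EP but 0 for LN, and `levelAdjust` resolves to 250 for IP and EP, 2 for CP, and 1 for LN.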
235+
Additional Program Modules
236+
--------------------------
237+
238+
Besides the two main modules `createNG` and `enumNG`, OMEN provides two other
239+
program modules: `evalPW` and `alphabetCreator`. `evalPW` evaluates a given
240+
password and `alphabetCreator` creates an alphabet with the most frequent
241+
characters in a given password list. Both modules should be considered experimental.
242+
243+
#### evalPW
244+
245+
`evalPW` reads a given password and evaluates its strength by returning a password-level. The result is based on the levels generated by `createNG`. The password-level is the sum of each
246+
occurring n-gram level, based on the level lists IP, CP, and EP. The current
247+
implementation of `evalPW` is only a prototype and does not support the whole
248+
possible functionality and contains **lots of bugs**. For example, the actual password length does not influence the password-level. Therefore, only passwords with the same length can
249+
be compared to each other.
250+
251+
`./evalPW --pw=demo123`
252+
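The summation described above can be sketched in Python with toy level tables. The tables, the n-gram size of 4, and the choice to score unseen grams with level 10 (the least likely level) are assumptions for illustration; `password_level` is not the actual `evalPW` code.

```python
def password_level(password, ip_levels, cp_levels, ep_levels, n=4, unseen=10):
    """Sum the IP, CP, and EP levels of all n-grams in a password.

    Toy sketch: unseen grams get the (assumed) worst level `unseen`.
    """
    if len(password) < n:
        raise ValueError("password is shorter than the n-gram size")
    level = ip_levels.get(password[: n - 1], unseen)
    for i in range(len(password) - n + 1):
        level += cp_levels.get(password[i : i + n], unseen)
    level += ep_levels.get(password[-(n - 1):], unseen)
    return level


# Made-up level tables, not output of createNG
ip = {"dem": 1}
cp = {"demo": 0, "emo1": 2, "mo12": 3}
ep = {"123": 0}
print(password_level("demo123", ip, cp, ep))
```

Because every additional character adds another CP term, longer passwords accumulate a higher sum, which is why only passwords of the same length can be compared.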
253+
#### alphabetCreator
254+
255+
If you want to limit OMEN to passwords complying with a given alphabet, you can specify this in the configuration file (`createConfig`). To determine the most promising alphabet, `alphabetCreator` might be able to help you. The program module creates a new alphabet based on a given password list. The characters of the new alphabet are ordered by their frequency in the password list, beginning with the highest frequency. The length of the alphabet is variable. The created alphabet is based on the 8-bit ASCII table
256+
according to ISO 8859-1 (not allowing `\n`, `\r`, `\t`, and ` ` (space)).
257+
Characters that are not part of this table are ignored. Also, an existing
258+
alphabet may be extended with the most frequent characters.
259+
260+
`./alphabetCreator --pwList password-training-list.txt -s 95 -a some_alpha -o new_alpha`
261+
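The frequency-based construction can be sketched in Python as follows. The sketch keeps only characters with a code point below 256 (a simplification of the ISO 8859-1 restriction), excludes the four disallowed whitespace characters, and can extend an existing alphabet. `create_alphabet` and its tie-breaking rule are assumptions for illustration, not the module's actual behavior.

```python
from collections import Counter

EXCLUDED = {"\n", "\r", "\t", " "}


def create_alphabet(passwords, size, existing=""):
    """Build an alphabet of `size` characters, most frequent first.

    Hypothetical sketch: characters already in `existing` are kept in place
    and the remaining slots are filled by frequency.
    """
    counts = Counter(
        ch
        for pw in passwords
        for ch in pw
        if ord(ch) < 256 and ch not in EXCLUDED and ch not in existing
    )
    # most frequent first; ties broken by code point to stay deterministic
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ord(ch)))
    free_slots = max(0, size - len(existing))
    return existing + "".join(ranked[:free_slots])


print(create_alphabet(["password1", "pass word", "日本"], size=5))
```

The characters outside the 8-bit range and the space are ignored, so only the Latin letters and the digit compete for the five slots.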
262+
License
263+
-------
264+
265+
The **Ordered Markov ENumerator (OMEN)** is licensed under the MIT license. Refer to [docs/LICENSE](docs/LICENSE) for more information.
266+
267+
### Third-Party Libraries
268+
* **getopt** is part of the GNU C Library (glibc) and used to parse command
269+
line arguments. Information on the developers, the license, and the source code can be found
270+
[here](http://www.gnu.org/software/libc/).
271+
* **uthash** is a hash table for C structures developed by Troy D. Hanson. The
272+
source code and the license can be downloaded [here](http://troydhanson.github.com/uthash/).
273+
274+
Contact
275+
-------
276+
Visit our [website](https://www.mobsec.rub.de) and follow us on [Twitter](https://twitter.com/hgi_bochum). If you are interested in passwords, consider contributing to and attending the [International Conference on Passwords (PASSWORDS)](https://passwordscon.org).

docs/CHANGELOG.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
1+
# Change Log
2+
All notable changes to this project will be documented in this file.
3+
This project adheres to [Semantic Versioning](http://semver.org/).
4+
5+
## [Unreleased]
6+
### Added
7+
- Parallelization for OMEN+
8+
- Incorporation of feedback based learning into OMEN
9+
- Refactoring of the sorting algorithm for the n-grams (cpp std library sort?)
10+
11+
## [0.3.0] - 2016-07-21
12+
### Added
13+
- OMEN+ mode: boost hints and enumerate based on modified levels
14+
- Input format for usage of OMEN+:
15+
- `hint-file`: new line separated lines containing tab separated additional information attributes, each line has to have the same attribute order
16+
- `alpha-file`: tab separated alpha values for each additional information attribute, alphas have to be integers
17+
18+
## [0.2.0] - 2016-01-31
19+
### Added
20+
- Use more standard headers (`stdbool.h` and `inttypes.h`)
21+
- Replace argumentInterpreter by auto-generated parser from the *GNU getopt* tool
22+
- Change project directory structure
23+
- Add version numbers
24+
- Add README
25+
- Solve bug with results folder, and remove its subfolders
26+
- defines.h to remove dependency circles
27+
- Increase maximum attempts of guesses to 10^15
28+
- Adjust default settings (change default n-gram size to 4, and adjust smoothing parameters)
29+
30+
## [0.1.0] - 2013-05-22
31+
### Added
32+
- Initial version for OMEN
33+
- Main modules: training (`createNG`) and enumeration (`enumNG`)
34+
- Utility modules: `alphabetCreator` and `evalPW`
35+
36+
[Unreleased]:
37+
[0.3.0]:
38+
[0.2.0]:
39+
[0.1.0]:

docs/LICENSE

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
1+
The MIT License (MIT)
2+
3+
Copyright (c) 2017 Horst Goertz Institute for IT-Security (Ruhr-University Bochum)
4+
5+
Permission is hereby granted, free of charge, to any person obtaining a copy
6+
of this software and associated documentation files (the "Software"), to deal
7+
in the Software without restriction, including without limitation the rights
8+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9+
copies of the Software, and to permit persons to whom the Software is
10+
furnished to do so, subject to the following conditions:
11+
12+
The above copyright notice and this permission notice shall be included in all
13+
copies or substantial portions of the Software.
14+
15+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21+
SOFTWARE.
22+
23+
Third-Party Libraries
24+
25+
getopt is part of the GNU C Library (glibc) and used to parse command line arguments.
26+
The developer, the license, and the source code can be downloaded here:
27+
http://www.gnu.org/software/libc/
28+
29+
uthash is a hash table for C structures developed by Troy D. Hanson.
30+
The source code and the license can be downloaded here:
31+
http://troydhanson.github.com/uthash/

docs/screenshots/performance.png

515 KB