README.md: 8 additions & 6 deletions
@@ -14,18 +14,18 @@ The following filter types are currently implemented:
 * Xor filter: 8 and 16 bit variants; needs less space than cuckoo filters, with faster lookup
 * Xor+ filter: 8 and 16 bit variants; compressed xor filter
 
-# Password Lookup Tool
+##Password Lookup Tool
 
 Included is a tool to build a filter from a list of known password (hashes), and a tool to do lookups. That way, the password list can be queried locally, without requiring a large file. The filter is only 650 MB, instead of the original file which is 11 GB. At the cost of some false positives (unknown passwords reported as known, with about 1% probability).
 
-## Generate the Password Filter File
+###Generate the Password Filter File
 
 Download the latest SHA-1 password file that is ordered by hash,
-for example the file pwned-passwords-sha1-ordered-by-hash-v4.7z (10 GB)
+for example the file pwned-passwords-sha1-ordered-by-hash-v4.7z (~10 GB)
 from https://haveibeenpwned.com/passwords
 with about 550 million passwords.
 
-If you have enough disk space, you can extract the hash file (25 GB),
+If you have enough disk space, you can extract the hash file (~25 GB),
 and convert it as follows:
 
 mvn clean install
@@ -37,9 +37,9 @@ To save disk space, you can extract the file on the fly (Mac OS X using Keka):
 /Applications/Keka.app/Contents/Resources/keka7z e -so
@@ -48,3 +48,5 @@ If yes, it will (for sure) either show "Found", or "Found; common",
 which means it was seen 10 times or more often.
 Passwords not in the list will show "Not found" with more than 99% probability,
 and with less than 1% probability "Found" or "Found; common".
+
+Internally, the tool uses a xor+ filter (see above) with 8 bits per fingerprint. One bit of the key is either 0 (regular) or 1 (common), and so two lookups are made per password. Because two lookups are made, the false positive rate is twice of what it would be with just one lookup (0.0078 instead of 0.0039). A regular Bloom filter with the same guarantees would be ~760 MB.
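
The numbers in the added paragraph can be checked with a little arithmetic. The Java sketch below is not part of the repository: the class `FilterMath` and its method names are hypothetical, the key count of 550 million is taken from the README's "about 550 million passwords", and the Bloom filter size uses the textbook optimum of -ln(p) / (ln 2)^2 bits per key, which a real implementation may not reach exactly. An 8-bit fingerprint gives 1/256 ≈ 0.0039 per lookup, and two independent lookups give just under 2/256 ≈ 0.0078. The README does not say how its ~760 MB Bloom filter figure was computed, so the size printed here is only a rough cross-check.

```java
class FilterMath {

    // False-positive probability of one lookup in a filter that stores
    // fingerprints of the given width: a random fingerprint matches with
    // probability 1 / 2^bits.
    static double singleLookupFpp(int fingerprintBits) {
        return 1.0 / (1L << fingerprintBits);
    }

    // Probability that at least one of `lookups` independent lookups
    // yields a false positive.
    static double combinedFpp(double fpp, int lookups) {
        return 1.0 - Math.pow(1.0 - fpp, lookups);
    }

    // Bits per key of an optimally configured Bloom filter with the
    // given per-lookup false-positive probability (textbook formula).
    static double bloomBitsPerKey(double fpp) {
        double ln2 = Math.log(2);
        return -Math.log(fpp) / (ln2 * ln2);
    }

    // Total size in bytes of such a Bloom filter for `keys` entries.
    static double bloomBytes(long keys, double fpp) {
        return keys * bloomBitsPerKey(fpp) / 8.0;
    }

    public static void main(String[] args) {
        long keys = 550_000_000L; // assumption: "about 550 million passwords"
        double single = singleLookupFpp(8);
        double two = combinedFpp(single, 2);
        double bytes = bloomBytes(keys, single);

        System.out.printf("one lookup:  %.4f%n", single);
        System.out.printf("two lookups: %.4f%n", two);
        System.out.printf("Bloom filter: %.2f bits/key, %.0f MB (%.0f MiB)%n",
                bloomBitsPerKey(single), bytes / 1e6, bytes / (1 << 20));
    }
}
```

Under these assumptions the Bloom filter needs about 11.5 bits per key, which for 550 million keys is a little under 800 million bytes. That is in the same range as the ~760 MB quoted above and clearly more than the 650 MB xor+ filter.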