Commit 2deb1cf

Introduce category methods
Follow-up to #89. Now that we enforce a specific label format, we can dynamically generate methods based on the types of data that were filtered.

```ruby
result = TopSecret::Text.filter("Ralph can be reached at [email protected]")

result.emails?
# => true
result.emails
# => ["[email protected]"]
result.email_mapping
# => {:EMAIL_1=>"[email protected]"}

result.people?
# => true
result.people
# => ["Ralph"]
result.person_mapping
# => {:PERSON_1=>"Ralph"}

result.credit_cards?
# => false
result.credit_cards
# => []
result.credit_card_mapping
# => {}
```
1 parent a65115a commit 2deb1cf

7 files changed (+485, -2 lines changed)

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 ## [Unreleased]
 
+### Added
+
+- Added category methods to `Result` for querying specific types of sensitive information (e.g., `emails`, `emails?`, `email_mapping`)
+- Category methods are automatically generated for all default filter types and custom labels
+- Category methods always return empty arrays/hashes when no data of that type is found, ensuring they're safe to call without checking
+
 ### Changed
 
 - **BREAKING:** Added strict label validation for custom filters. Labels must now start and end with letters and contain only alphabetic characters and single underscores (no consecutive underscores, digits, or special characters). Previously malformed labels will now raise `Error::MalformedLabel`.
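
The label rule in the changelog entry above can be expressed as a single regular expression. The following is a minimal sketch, not the gem's actual validation code: `LABEL_PATTERN` and `well_formed_label?` are hypothetical names introduced here for illustration only.

```ruby
# Hypothetical pattern (an assumption, not taken from the gem):
# one or more letters, optionally followed by groups of a single
# underscore plus one or more letters. This rules out leading or
# trailing underscores, consecutive underscores, digits, and
# special characters.
LABEL_PATTERN = /\A[A-Za-z]+(?:_[A-Za-z]+)*\z/

# Hypothetical helper: true when the label satisfies the rule above.
def well_formed_label?(label)
  LABEL_PATTERN.match?(label.to_s)
end

well_formed_label?("EMAIL_ADDRESS")  # accepted
well_formed_label?("EMAIL__ADDRESS") # rejected: consecutive underscores
well_formed_label?("EMAIL_1")        # rejected: contains a digit
```

Anchoring with `\A` and `\z` rather than `^` and `$` matters here, since the latter match at line boundaries and would let a multi-line string slip through.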

README.md

Lines changed: 64 additions & 0 deletions
@@ -173,6 +173,70 @@ result.safe?
 # => false
 ```
 
+### Category Methods
+
+Query the result for specific types of sensitive information using category methods:
+
+```ruby
+result = TopSecret::Text.filter("Ralph can be reached at [email protected] or 555-1234")
+
+# Check if emails were found
+result.emails?
+# => true
+
+# Get all emails
+result.emails
+# => ["[email protected]"]
+
+# Get email mapping
+result.email_mapping
+# => {:EMAIL_1=>"[email protected]"}
+
+# Similarly for other types
+result.people? # => true
+result.people # => ["Ralph"]
+result.person_mapping # => {:PERSON_1=>"Ralph"}
+
+result.phone_numbers? # => true
+result.phone_numbers # => ["555-1234"]
+result.phone_number_mapping # => {:PHONE_NUMBER_1=>"555-1234"}
+```
+
+Available category methods for all default filters:
+
+- `emails`, `emails?`, `email_mapping`
+- `credit_cards`, `credit_cards?`, `credit_card_mapping`
+- `phone_numbers`, `phone_numbers?`, `phone_number_mapping`
+- `ssns`, `ssns?`, `ssn_mapping`
+- `people`, `people?`, `person_mapping`
+- `locations`, `locations?`, `location_mapping`
+
+These methods are always available and return empty arrays/hashes when no sensitive information of that type is found:
+
+```ruby
+result = TopSecret::Text.filter("No sensitive data here")
+
+result.emails? # => false
+result.emails # => []
+result.email_mapping # => {}
+```
+
+When using custom labels, methods are generated based on the label name:
+
+```ruby
+result = TopSecret::Text.filter(
+  "user[at]example.com",
+  email_filter: TopSecret::Filters::Regex.new(
+    label: "EMAIL_ADDRESS",
+    regex: /\w+\[at\]\w+\.\w+/
+  )
+)
+
+result.email_addresses # => ["user[at]example.com"]
+result.email_addresses? # => true
+result.email_address_mapping # => {:EMAIL_ADDRESS_1=>"user[at]example.com"}
+```
+
 ### Scanning for Sensitive Information
 
 Use `TopSecret::Text.scan` to detect sensitive information without redacting the text. This is useful when you only need to check if sensitive data exists or get a mapping of what was found:

lib/top_secret/constants.rb

Lines changed: 3 additions & 0 deletions
@@ -25,4 +25,7 @@ module TopSecret
 
   # @return [Float] The minimum confidence score for NER filtering
   MIN_CONFIDENCE_SCORE = 0.5
+
+  # @return [String] The delimiter used in label names
+  LABEL_DELIMITER = "_"
 end

lib/top_secret/mapping.rb

Lines changed: 191 additions & 0 deletions
@@ -1,7 +1,40 @@
 # frozen_string_literal: true
 
+require "active_support/core_ext/string/inflections"
+
 module TopSecret
+  # Provides dynamic category methods for querying sensitive information by type.
+  #
+  # This module automatically generates methods for accessing sensitive information
+  # organized by category (emails, credit cards, people, etc.). Methods are available
+  # for all default filter types and any custom labels used in the mapping.
+  #
+  # @example Querying emails
+  #   result = TopSecret::Text.filter("Contact [email protected]")
+  #   result.emails? # => true
+  #   result.emails # => ["[email protected]"]
+  #   result.email_mapping # => {:EMAIL_1=>"[email protected]"}
+  #
+  # @example With no matches
+  #   result = TopSecret::Text.filter("No sensitive data")
+  #   result.emails? # => false
+  #   result.emails # => []
+  #   result.email_mapping # => {}
+  #
+  # @example Custom labels
+  #   result = TopSecret::Text.filter(
+  #     "user[at]example.com",
+  #     email_filter: TopSecret::Filters::Regex.new(
+  #       label: "EMAIL_ADDRESS",
+  #       regex: /\w+\[at\]\w+\.\w+/
+  #     )
+  #   )
+  #   result.email_addresses # => ["user[at]example.com"]
+  #   result.email_address_mapping # => {:EMAIL_ADDRESS_1=>"user[at]example.com"}
   module Mapping
+    MAPPING_SUFFIX = "_mapping"
+    PREDICATE_SUFFIX = "?"
+
     # @return [Boolean] Whether sensitive information was found
     def sensitive?
       mapping.any?
@@ -11,5 +44,163 @@ def sensitive?
     def safe?
       !sensitive?
     end
+
+    def method_missing(method_name)
+      if mapping_methods.include? method_name
+        self.class.define_method(method_name) do
+          build_mapping_method_from method_name
+        end
+
+        send(method_name)
+      elsif pluralized_methods.include? method_name
+        self.class.define_method(method_name) do
+          build_plural_method_from method_name
+        end
+
+        send(method_name)
+      elsif predicate_methods.include? method_name
+        self.class.define_method(method_name) do
+          build_predicate_method_from method_name
+        end
+
+        send(method_name)
+      elsif mapping_predicate_methods.include? method_name
+        self.class.define_method(method_name) do
+          build_mapping_predicate_method_from method_name
+        end
+
+        send(method_name)
+      else
+        super
+      end
+    end
+
+    def respond_to_missing?(method_name, include_private = false)
+      mapping_methods.include?(method_name) ||
+        pluralized_methods.include?(method_name) ||
+        predicate_methods.include?(method_name) ||
+        mapping_predicate_methods.include?(method_name) ||
+        super
+    end
+
+    # Returns all available types for category methods.
+    #
+    # Types are derived from both the mapping keys and default filters.
+    # For example, with mapping `{EMAIL_1: "[email protected]"}`, the type is `:email`.
+    # Default filter types (credit_card, email, phone_number, ssn, person, location)
+    # are always available even when not present in the mapping.
+    #
+    # @return [Array<Symbol>] List of available types
+    # @example
+    #   result = TopSecret::Text.filter("[email protected]")
+    #   result.types
+    #   # => [:email, :credit_card, :phone_number, :ssn, :person, :location]
+    def types
+      @types ||= all_types.uniq.map(&:to_sym)
+    end
+
+    private
+
+    # Extracts types from mapping keys.
+    #
+    # @return [Array<String>] Types extracted from mapping
+    # @private
+    def types_from_mapping
+      mapping.keys.map do |key|
+        parts = key.to_s.split(TopSecret::LABEL_DELIMITER).reject(&:empty?)
+        parts[0...-1].join(TopSecret::LABEL_DELIMITER).downcase
+      end
+    end
+
+    # Returns types from default filters.
+    #
+    # @return [Array<String>] Types from default filters
+    # @private
+    def types_from_filters
+      default_filter_objects.map { |filter| filter.label.downcase }
+    end
+
+    # Combines types from mapping and filters.
+    #
+    # @return [Array<String>] All types
+    # @private
+    def all_types
+      types_from_mapping + types_from_filters
+    end
+
+    # Returns default filter objects from TopSecret configuration.
+    #
+    # @return [Array] Default filter objects
+    # @private
+    def default_filter_objects
+      [
+        TopSecret.credit_card_filter,
+        TopSecret.email_filter,
+        TopSecret.phone_number_filter,
+        TopSecret.ssn_filter,
+        TopSecret.people_filter,
+        TopSecret.location_filter
+      ].compact
+    end
+
+    def stringified_types
+      types.map(&:to_s)
+    end
+
+    def pluralized_methods
+      @pluralized_methods ||= stringified_types.map(&:pluralize).map(&:to_sym)
+    end
+
+    def predicate_methods
+      @predicate_methods ||= pluralized_methods.map { :"#{_1}#{PREDICATE_SUFFIX}" }
+    end
+
+    def mapping_predicate_methods
+      @mapping_predicate_methods ||= mapping_methods.map { :"#{_1}#{PREDICATE_SUFFIX}" }
+    end
+
+    def mapping_methods
+      @mapping_methods ||= stringified_types.map do |type|
+        if type.end_with?(MAPPING_SUFFIX)
+          :"#{type.pluralize}#{MAPPING_SUFFIX}"
+        else
+          :"#{type}#{MAPPING_SUFFIX}"
+        end
+      end
+    end
+
+    def build_mapping_method_from(method_name)
+      type_name = method_name.to_s.delete_suffix(MAPPING_SUFFIX)
+
+      type_name = type_name.singularize if type_name.pluralize == type_name && type_name.singularize.end_with?(MAPPING_SUFFIX)
+
+      type = type_name.upcase
+
+      mapping.select { |key, _| key.start_with? type }
+    end
+
+    def build_plural_method_from(method_name)
+      singular = method_name.to_s.singularize
+
+      mapping_method = if singular.end_with?(MAPPING_SUFFIX)
+        :"#{method_name}#{MAPPING_SUFFIX}"
+      else
+        :"#{singular}#{MAPPING_SUFFIX}"
+      end
+
+      send(mapping_method).values
+    end
+
+    def build_predicate_method_from(method_name)
+      plural_method = method_name.to_s.chomp(PREDICATE_SUFFIX).to_sym
+
+      send(plural_method).any?
+    end
+
+    def build_mapping_predicate_method_from(method_name)
+      mapping_method = method_name.to_s.chomp(PREDICATE_SUFFIX).to_sym
+
+      send(mapping_method).any?
+    end
   end
 end
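
The lazy-definition pattern used in `method_missing` above (define a real method on first call, then dispatch to it) can be seen in isolation in the sketch below. It is a simplified illustration, not the gem's code: the `Report` class, its `SUFFIX` constant, and `dynamic_method?` are hypothetical, and it only handles `*_mapping` methods. It also assumes two deliberate differences from the diff: extra arguments are forwarded to `super`, and keys are matched on the full label rather than on a prefix.

```ruby
# Hypothetical stand-in for a result object; not part of TopSecret.
class Report
  SUFFIX = "_mapping"

  def initialize(mapping)
    @mapping = mapping
  end

  # Accept and forward any arguments so unknown calls still end in
  # NoMethodError rather than an arity error.
  def method_missing(name, *args, &block)
    return super unless dynamic_method?(name)

    type = name.to_s.delete_suffix(SUFFIX).upcase

    # Define a real method on first use so later calls bypass method_missing.
    self.class.define_method(name) do
      # Compare the whole label ("EMAIL"), not a prefix, so that
      # EMAIL does not also pick up EMAIL_ADDRESS_1.
      @mapping.select { |key, _| key.to_s.rpartition("_").first == type }
    end

    public_send(name)
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    dynamic_method?(name) || super
  end

  private

  def dynamic_method?(name)
    name.to_s.end_with?(SUFFIX)
  end
end

report = Report.new(EMAIL_1: "a@example.com", EMAIL_ADDRESS_1: "b[at]example.com")
report.email_mapping         # only the EMAIL_1 entry
report.email_address_mapping # only the EMAIL_ADDRESS_1 entry
```

Defining the method on the class means every instance shares it afterwards, which is why the generated body reads `@mapping` at call time instead of capturing one instance's data.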

lib/top_secret/text/global_mapping.rb

Lines changed: 2 additions & 2 deletions
@@ -50,11 +50,11 @@ def process_result(result)
       # @param individual_key [Symbol] The individual key from a filter result
       # @return [Symbol] The global key with consistent numbering
       def generate_global_key(individual_key)
-        label_type = individual_key.to_s.rpartition("_").first
+        label_type = individual_key.to_s.rpartition(TopSecret::LABEL_DELIMITER).first
 
         label_counters[label_type] ||= 0
         label_counters[label_type] += 1
-        :"#{label_type}_#{label_counters[label_type]}"
+        :"#{label_type}#{TopSecret::LABEL_DELIMITER}#{label_counters[label_type]}"
       end
     end
   end
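
To illustrate what `generate_global_key` does with the delimiter, here is a minimal standalone sketch. The `renumber` helper and the top-level `LABEL_DELIMITER` constant are hypothetical stand-ins for this example, not the gem's API. The sketch splits each key at its last delimiter, keeps a per-label counter, and rebuilds the key with the next number.

```ruby
# Stand-in for TopSecret::LABEL_DELIMITER (assumed to be "_", as in the diff).
LABEL_DELIMITER = "_"

# Hypothetical helper: renumbers keys so each label type counts up from 1.
def renumber(keys)
  counters = Hash.new(0)

  keys.map do |key|
    # "CREDIT_CARD_1" splits into "CREDIT_CARD", "_", "1" at the last delimiter.
    label_type = key.to_s.rpartition(LABEL_DELIMITER).first
    counters[label_type] += 1
    :"#{label_type}#{LABEL_DELIMITER}#{counters[label_type]}"
  end
end

renumber(%i[EMAIL_1 PERSON_1 EMAIL_1 CREDIT_CARD_1])
# => [:EMAIL_1, :PERSON_1, :EMAIL_2, :CREDIT_CARD_1]
```

Using `rpartition` instead of `split` keeps multi-word labels such as `CREDIT_CARD` intact, because only the final delimiter separates the label from its number.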

spec/top_secret/result_spec.rb

Lines changed: 77 additions & 0 deletions
@@ -38,4 +38,81 @@
       end
     end
   end
+
+  describe "categorization" do
+    let(:mapping) do
+      {
+        EMAIL_1: "[email protected]",
+        EMAIL_2: "[email protected]",
+        PERSON_1: "Ralph",
+        IP_ADDRESS_1: "192.168.1.1",
+        CREDIT_CARD_NUMBER_1: "4242424242424242",
+        NETWORK_MAPPING_1: "10.0.1.0/24 -> 192.168.1.0/24"
+      }
+    end
+
+    it "categorizes by labels" do
+      expect(subject.emails?).to be true
+      expect(subject.people?).to be true
+      expect(subject.credit_card_numbers?).to be true
+      expect(subject.network_mappings?).to be true
+
+      expect(subject.emails).to eq([
+        "[email protected]",
+        "[email protected]"
+      ])
+      expect(subject.people).to eq([
+        "Ralph"
+      ])
+      expect(subject.credit_card_numbers).to eq([
+        "4242424242424242"
+      ])
+      expect(subject.network_mappings).to eq([
+        "10.0.1.0/24 -> 192.168.1.0/24"
+      ])
+
+      expect(subject.email_mapping).to eq({
+        EMAIL_1: "[email protected]",
+        EMAIL_2: "[email protected]"
+      })
+      expect(subject.person_mapping).to eq({
+        PERSON_1: "Ralph"
+      })
+      expect(subject.credit_card_number_mapping).to eq({
+        CREDIT_CARD_NUMBER_1: "4242424242424242"
+      })
+      expect(subject.network_mappings_mapping).to eq({
+        NETWORK_MAPPING_1: "10.0.1.0/24 -> 192.168.1.0/24"
+      })
+    end
+
+    it "extracts types" do
+      expect(subject.types).to include(
+        :email,
+        :person,
+        :ip_address,
+        :credit_card_number,
+        :network_mapping,
+        :credit_card,
+        :phone_number,
+        :ssn,
+        :location
+      )
+    end
+
+    it "responds to dynamic methods" do
+      expect(subject).to respond_to(:emails)
+      expect(subject).to respond_to(:emails?)
+      expect(subject).to respond_to(:email_mapping)
+      expect(subject).to respond_to(:people)
+      expect(subject).to respond_to(:people?)
+      expect(subject).to respond_to(:person_mapping)
+      expect(subject).to respond_to(:credit_card_numbers)
+      expect(subject).to respond_to(:credit_card_numbers?)
+      expect(subject).to respond_to(:credit_card_number_mapping)
+      expect(subject).to respond_to(:network_mappings)
+      expect(subject).to respond_to(:network_mappings_mapping?)
+      expect(subject).to respond_to(:network_mappings_mapping)
+    end
+  end
 end