Commit 4ccbdc4

realmarcin and claude committed
Add documentation of removed EU regulations and standards
Document all EU regulatory content removed from D4D schema in commit 4fc1f85 (Dec 2, 2025) based on Harry Caufield's recommendation to "stay US-centric". This file provides a complete reference of what was removed, with context showing 2 lines before and after each removal for easy understanding.

Content removed:
- GDPR (General Data Protection Regulation) - 5 references
- EU AI Act (Regulation (EU) 2024/1689) - 3 references
- Complete AIActRiskEnum with 4 risk categories (42 lines)
- gdpr_compliant and eu_ai_act_risk_category fields
- CSVW and Frictionless Data prefixes and mappings

Total: 12 distinct removals across 4 schema files (~57 lines)

Impact: Schema now focuses exclusively on US regulations (HIPAA, 45 CFR 46) and reduced aligned standards from 40+ to 25+.

See: notes/eu_regulations_removed_content.txt for full details

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
1 parent 5feb0c3 commit 4ccbdc4

1 file changed: +285 -0 lines changed

@@ -0,0 +1,285 @@
1+
================================================================================
2+
EU REGULATIONS AND STANDARDS CONTENT REMOVED FROM D4D SCHEMA
3+
================================================================================
4+
5+
Commit: 4fc1f852746590ac76803d829d440c46f8782789
6+
Author: marcin p. joachimiak <[email protected]>
7+
Date: Tue Dec 2 19:31:12 2025 -0800
8+
Message: Complete Harry's feedback: Remove GDPR/EU AI Act, finish Frictionless/CSVW cleanup
9+
10+
Rationale: Harry Caufield's recommendation to "stay US-centric" - removed all
11+
EU regulatory framework references to focus on US regulations (HIPAA, 45 CFR 46).
12+
13+
================================================================================
14+
FILE: src/data_sheets_schema/schema/D4D_Base_import.yaml
15+
================================================================================
16+
17+
REMOVAL 1: CSVW Prefix
18+
--------------------------------------------------------------------------------
19+
BEFORE (2 lines):
20+
prefixes:
21+
biolink: https://w3id.org/biolink/vocab/
22+
23+
REMOVED:
24+
csvw: http://www.w3.org/ns/csvw#
25+
26+
AFTER (2 lines):
27+
data_sheets_schema: https://w3id.org/bridge2ai/data-sheets-schema/
28+
datasets: https://w3id.org/linkml/report
29+
--------------------------------------------------------------------------------
30+
31+
REMOVAL 2: Frictionless Prefix
32+
--------------------------------------------------------------------------------
33+
BEFORE (2 lines):
34+
dcterms: http://purl.org/dc/terms/
35+
example: https://example.org/
36+
37+
REMOVED:
38+
frictionless: https://specs.frictionlessdata.io/
39+
40+
AFTER (2 lines):
41+
linkml: https://w3id.org/linkml/
42+
mediatypes: https://www.iana.org/assignments/media-types/
43+
--------------------------------------------------------------------------------
44+
45+
REMOVAL 3: GDPR Reference in Composition Subset Description
46+
--------------------------------------------------------------------------------
47+
BEFORE (2 lines):
48+
with the information they need to make informed decisions about using the
49+
dataset for their chosen tasks. Some of the questions are designed to
50+
51+
REMOVED:
52+
elicit information about compliance with the EU's General Data Protection
53+
Regulation (GDPR) or comparable regulations in other jurisdictions.
54+
55+
REPLACED WITH:
56+
elicit information about compliance with applicable data protection
57+
regulations and privacy requirements.
58+
59+
AFTER (2 lines):
60+
Collection:
61+
description: >-
62+
--------------------------------------------------------------------------------
63+
64+
REMOVAL 4: CSVW Dialect Mapping
65+
--------------------------------------------------------------------------------
66+
BEFORE (2 lines):
67+
68+
dialect:
69+
70+
REMOVED:
71+
slot_uri: csvw:dialect
72+
73+
AFTER (2 lines):
74+
75+
bytes:
76+
--------------------------------------------------------------------------------
77+
78+
================================================================================
79+
FILE: src/data_sheets_schema/schema/D4D_Data_Governance.yaml
80+
================================================================================
81+
82+
REMOVAL 5: GDPR Reference in ExportControlRegulatoryRestrictions Description
83+
--------------------------------------------------------------------------------
84+
BEFORE (2 lines):
85+
Do any export controls or other regulatory restrictions apply to the dataset
86+
or to individual instances? Includes compliance tracking for regulations like
87+
88+
REMOVED:
89+
GDPR, HIPAA, and EU AI Act. If so, please describe these restrictions and
90+
91+
REPLACED WITH:
92+
HIPAA and other US regulations. If so, please describe these restrictions and
93+
94+
AFTER (2 lines):
95+
provide a link or copy of any supporting documentation. Maps to DUO terms
96+
related to ethics approval, geographic restrictions, and institutional requirements.
97+
--------------------------------------------------------------------------------
98+
99+
REMOVAL 6: gdpr_compliant Field
100+
--------------------------------------------------------------------------------
101+
BEFORE (2 lines):
102+
- DUO:0000022 # GS - geographic restriction
103+
- DUO:0000028 # IS - institution specific
104+
105+
REMOVED:
106+
gdpr_compliant:
107+
description: >-
108+
Indicates compliance with the EU General Data Protection Regulation (GDPR).
109+
GDPR applies to processing of personal data of individuals in the EU.
110+
range: ComplianceStatusEnum
111+
112+
AFTER (2 lines):
113+
hipaa_compliant:
114+
description: >-
115+
--------------------------------------------------------------------------------
116+
117+
REMOVAL 7: eu_ai_act_risk_category Field
118+
--------------------------------------------------------------------------------
119+
BEFORE (2 lines):
120+
HIPAA applies to protected health information in the United States.
121+
range: ComplianceStatusEnum
122+
123+
REMOVED:
124+
eu_ai_act_risk_category:
125+
description: >-
126+
Risk category under the EU AI Act. The EU AI Act classifies AI systems
127+
into risk categories: minimal, limited, high, and unacceptable.
128+
High-risk AI systems face strict requirements.
129+
range: AIActRiskEnum
130+
131+
AFTER (2 lines):
132+
other_compliance:
133+
description: >-
134+
--------------------------------------------------------------------------------
135+
136+
REMOVAL 8: GDPR/EU AI Act in ComplianceStatusEnum Description
137+
--------------------------------------------------------------------------------
138+
BEFORE (2 lines):
139+
description: >-
140+
Compliance status for regulatory frameworks. Indicates the extent to which
141+
142+
REMOVED:
143+
a dataset complies with specific regulations (e.g., GDPR, HIPAA, EU AI Act).
144+
145+
REPLACED WITH:
146+
a dataset complies with specific regulations (e.g., HIPAA, 45 CFR 46).
147+
148+
AFTER (2 lines):
149+
These are workflow status values that may evolve as regulations are assessed
150+
or as the dataset is modified.
151+
--------------------------------------------------------------------------------
152+
153+
REMOVAL 9: Complete AIActRiskEnum (42 lines)
154+
--------------------------------------------------------------------------------
155+
BEFORE (2 lines):
156+
determination has not been made.
157+
158+
159+
REMOVED:
160+
AIActRiskEnum:
161+
description: >-
162+
Risk categories under the EU Artificial Intelligence Act (Regulation (EU) 2024/1689).
163+
The AI Act establishes a risk-based regulatory framework with four categories.
164+
See https://artificialintelligenceact.eu/ and https://eur-lex.europa.eu/eli/reg/2024/1689/oj
165+
permissible_values:
166+
minimal_risk:
167+
description: >-
168+
AI systems with minimal risk (e.g., AI-enabled video games, spam filters).
169+
No specific obligations beyond general transparency for certain AI systems
170+
(Article 50). Represents the majority of AI systems on the EU market.
171+
related_mappings:
172+
- EUAIAct:Article50
173+
limited_risk:
174+
description: >-
175+
AI systems with limited risk subject to transparency obligations
176+
(e.g., chatbots, emotion recognition systems, biometric categorization,
177+
deepfakes). Must comply with specific transparency requirements to enable
178+
users to make informed decisions (Article 50).
179+
related_mappings:
180+
- EUAIAct:Article50
181+
- EUAIAct:TitleIV
182+
high_risk:
183+
description: >-
184+
AI systems with high risk to health, safety, or fundamental rights as
185+
defined in Annex III (e.g., AI in critical infrastructure, education,
186+
employment, law enforcement, migration, justice). Subject to strict
187+
requirements including conformity assessment, risk management, data
188+
governance, transparency, human oversight, and accuracy (Articles 6-51).
189+
related_mappings:
190+
- EUAIAct:Article6
191+
- EUAIAct:AnnexIII
192+
- EUAIAct:TitleIII
193+
unacceptable_risk:
194+
description: >-
195+
AI systems with unacceptable risk that are prohibited under Article 5
196+
(e.g., social scoring by public authorities, exploitation of vulnerabilities,
197+
real-time remote biometric identification in public spaces for law enforcement
198+
with limited exceptions). These AI practices are banned in the EU.
199+
related_mappings:
200+
- EUAIAct:Article5
201+
202+
AFTER (2 lines):
203+
ConfidentialityLevelEnum:
204+
description: >-
205+
--------------------------------------------------------------------------------
206+
207+
================================================================================
208+
FILE: src/data_sheets_schema/schema/D4D_Human.yaml
209+
================================================================================
210+
211+
REMOVAL 10: GDPR in Regulatory Compliance Examples
212+
--------------------------------------------------------------------------------
213+
BEFORE (2 lines):
214+
description: >
215+
What regulatory frameworks govern this human subjects research
216+
217+
REMOVED:
218+
(e.g., 45 CFR 46, GDPR, HIPAA)?
219+
220+
REPLACED WITH:
221+
(e.g., 45 CFR 46, HIPAA)?
222+
223+
AFTER (2 lines):
224+
range: string
225+
multivalued: true
226+
--------------------------------------------------------------------------------
227+
228+
================================================================================
229+
FILE: src/data_sheets_schema/schema/data_sheets_schema.yaml
230+
================================================================================
231+
232+
REMOVAL 11: CSVW Prefix (Main Schema)
233+
--------------------------------------------------------------------------------
234+
BEFORE (2 lines):
235+
AIO: https://w3id.org/aio/
236+
biolink: https://w3id.org/biolink/vocab/
237+
238+
REMOVED:
239+
csvw: http://www.w3.org/ns/csvw#
240+
241+
AFTER (2 lines):
242+
data_sheets_schema: https://w3id.org/bridge2ai/data-sheets-schema/
243+
datasets: https://w3id.org/linkml/report
244+
--------------------------------------------------------------------------------
245+
246+
REMOVAL 12: Frictionless Prefix (Main Schema)
247+
--------------------------------------------------------------------------------
248+
BEFORE (2 lines):
249+
dcterms: http://purl.org/dc/terms/
250+
example: https://example.org/
251+
252+
REMOVED:
253+
frictionless: https://specs.frictionlessdata.io/
254+
255+
AFTER (2 lines):
256+
linkml: https://w3id.org/linkml/
257+
mediatypes: https://www.iana.org/assignments/media-types/
258+
--------------------------------------------------------------------------------
259+
260+
================================================================================
261+
SUMMARY OF REMOVALS
262+
================================================================================
263+
264+
EU Regulations Removed:
265+
- GDPR (General Data Protection Regulation) - 5 references
266+
- EU AI Act (Regulation (EU) 2024/1689) - 3 references
267+
- Complete AIActRiskEnum with 4 risk categories (42 lines)
268+
- gdpr_compliant field from ExportControlRegulatoryRestrictions
269+
- eu_ai_act_risk_category field from ExportControlRegulatoryRestrictions
270+
271+
Standards/Prefixes Removed:
272+
- csvw: http://www.w3.org/ns/csvw# (2 occurrences)
273+
- frictionless: https://specs.frictionlessdata.io/ (2 occurrences)
274+
- csvw:dialect slot mapping (1 occurrence)
275+
276+
Impact:
277+
- Schema now focuses exclusively on US regulations (HIPAA, 45 CFR 46)
278+
- Removed overly granular CSVW mappings
279+
- Removed uncertain Frictionless mappings
280+
- Reduced aligned standards from 40+ to 25+
281+
282+
Total Removals: 12 distinct changes across 4 schema files
283+
Total Lines Removed: ~57 lines of EU regulatory and standards content
284+
285+
================================================================================
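
The removals listed above could be spot-checked with a small script that scans schema text for identifiers that should no longer appear. The following is a minimal sketch, not part of the commit: the function name find_leftovers and the pattern list are assumptions made for illustration, derived from the identifiers named in the summary (csvw, frictionless, gdpr_compliant, eu_ai_act_risk_category, AIActRiskEnum, EUAIAct).

```python
import re

# Identifiers removed according to the summary above. The choice of
# patterns is an assumption for this sketch, not taken from the repository.
REMOVED_PATTERNS = {
    "csvw prefix or CURIE": re.compile(r"\bcsvw:"),
    "frictionless prefix or CURIE": re.compile(r"\bfrictionless:"),
    "gdpr_compliant field": re.compile(r"\bgdpr_compliant\b"),
    "eu_ai_act_risk_category field": re.compile(r"\beu_ai_act_risk_category\b"),
    "AIActRiskEnum": re.compile(r"\bAIActRiskEnum\b"),
    "EUAIAct mapping": re.compile(r"\bEUAIAct:"),
}


def find_leftovers(schema_text):
    """Return (line_number, label, stripped_line) for every line that
    still mentions one of the removed identifiers."""
    hits = []
    for number, line in enumerate(schema_text.splitlines(), start=1):
        for label, pattern in REMOVED_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical fragment that still carries two removed identifiers.
    sample = "dialect:\n  slot_uri: csvw:dialect\nrange: AIActRiskEnum\n"
    for number, label, line in find_leftovers(sample):
        print(f"line {number}: {label}: {line}")
```

To check the real schema, the text of each of the four files under src/data_sheets_schema/schema/ would be read and passed to find_leftovers; an empty result for every file would suggest the cleanup is complete. The word-boundary patterns are deliberately narrow, so prose mentions such as "GDPR" in descriptions are not flagged and would need their own pattern.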
