Commit 27f051a

table HTML changes
1 parent 3d6a8ad commit 27f051a

1 file changed: +131 -11 lines changed

articles/machine-learning/concept-sourcing-human-data.md

Lines changed: 131 additions & 11 deletions
@@ -27,18 +27,138 @@ These are emerging practices, and we are continually learning. The best practice
 
 We suggest the following best practices for manually collecting human data directly from people.
 
+:::column span="4":::
+
+:::row:::
+:::column:::
+**Best Practice**
+:::column-end:::
+:::column:::
+**Why?**
+:::column-end:::
+:::row-end:::
+-----
+
+:::row:::
+:::column:::
+**Obtain voluntary informed consent.**
+:::column-end:::
+
+:::column:::
+- Participants should understand and consent to data collection and how their data will be used.
+- Data should only be stored, processed, and used for purposes that are part of the original documented informed consent.
+- Consent documentation should be properly stored and associated with the collected data.
+:::column-end:::
+:::row-end:::
+
+-----
+
+:::row:::
+:::column:::
+**Compensate data contributors appropriately.**
+:::column-end:::
+
+:::column:::
+- Data contributors should not be pressured or coerced into data collections and should be fairly compensated for their time and data.
+- Inappropriate compensation can be exploitative or coercive.
+:::column-end:::
+
+:::row-end:::
+
+-----
 
-| **Best Practice** | **Why** |
-|:--------------------|:----------|
-| **Obtain voluntary informed consent.** | <ul><li>Participants should understand and consent to data collection and how their data will be used.<li>Data should only be stored, processed, and used for purposes that are part of the original documented informed consent. <li>Consent documentation should be properly stored and associated with the collected data. <ul> |
-| **Compensate data contributors appropriately.** | <ul><li>Data contributors should not be pressured or coerced into data collections and should be fairly compensated for their time and data. <li>Inappropriate compensation can be exploitative or coercive.<ul> |
-| **Let contributors self-identify demographic information.** | <ul><li>Demographic information that is not self-reported by data contributors but assigned by data collectors may 1) result in inaccurate metadata and 2) be disrespectful to data contributors.<ul> |
-| **Anticipate harms when recruiting vulnerable groups.** | <ul><li>Collecting data from vulnerable population groups introduces risk to data contributors and your organization.<ul> |
-| **Treat data contributors with respect.** | <ul><li>Improper interactions with data contributors at any phase of the data collection can negatively impact data quality, as well as the overall data collection experience for data contributors and data collectors.<ul> |
-| **Qualify external suppliers carefully.** | <ul><li>Data collections with unqualified suppliers may result in low quality data, poor data management, unprofessional practices, and potentially harmful outcomes for data contributors and data collectors (including violations of human rights). <li> Annotation or labeling work (e.g., audio transcription, image tagging) with unqualified suppliers may result in low quality or biased datasets, insecure data management, unprofessional practices, and potentially harmful outcomes for data contributors (including violations of human rights).<ul> |
-| **Communicate expectations clearly in the Statement of Work (SOW) with suppliers.** | <ul><li>An SOW which lacks requirements for responsible data collection work may result in low-quality or poorly collected data.<ul> |
-| **Qualify geographies carefully.** | <ul><li> When applicable, collecting data in restricted and/or unfamiliar geographies may result in unusable or low-quality data and may impact the safety of involved parties.<ul> |
-| **Be a good steward of your datasets.** | <ul><li>Improper data management and poor documentation can result in data misuse.<ul> |
+:::row:::
+:::column:::
+**Let contributors self-identify demographic information.**
+:::column-end:::
+
+:::column:::
+- Demographic information that is not self-reported by data contributors but assigned by data collectors may 1) result in inaccurate metadata and 2) be disrespectful to data contributors.
+:::column-end:::
+
+:::row-end:::
+
+-----
+
+:::row:::
+:::column:::
+**Anticipate harms when recruiting vulnerable groups.**
+:::column-end:::
+
+:::column:::
+- Collecting data from vulnerable population groups introduces risk to data contributors and your organization.
+:::column-end:::
+
+:::row-end:::
+
+-----
+
+:::row:::
+:::column:::
+**Treat data contributors with respect.**
+:::column-end:::
+
+:::column:::
+- Improper interactions with data contributors at any phase of the data collection can negatively impact data quality, as well as the overall data collection experience for data contributors and data collectors.
+:::column-end:::
+
+:::row-end:::
+
+-----
+
+:::row:::
+:::column:::
+**Qualify external suppliers carefully.**
+:::column-end:::
+
+:::column:::
+- Data collections with unqualified suppliers may result in low quality data, poor data management, unprofessional practices, and potentially harmful outcomes for data contributors and data collectors (including violations of human rights).
+- Annotation or labeling work (e.g., audio transcription, image tagging) with unqualified suppliers may result in low quality or biased datasets, insecure data management, unprofessional practices, and potentially harmful outcomes for data contributors (including violations of human rights).
+:::column-end:::
+
+:::row-end:::
+
+-----
+
+:::row:::
+:::column:::
+**Communicate expectations clearly in the Statement of Work (SOW) with suppliers.**
+:::column-end:::
+
+:::column:::
+- An SOW which lacks requirements for responsible data collection work may result in low-quality or poorly collected data.
+:::column-end:::
+
+:::row-end:::
+
+-----
+
+:::row:::
+:::column:::
+**Qualify geographies carefully.**
+:::column-end:::
+
+:::column:::
+- When applicable, collecting data in restricted and/or unfamiliar geographies may result in unusable or low-quality data and may impact the safety of involved parties.
+:::column-end:::
+
+:::row-end:::
+
+-----
+
+:::row:::
+:::column:::
+**Be a good steward of your datasets.**
+:::column-end:::
+
+:::column:::
+- Improper data management and poor documentation can result in data misuse.
+:::column-end:::
+
+:::row-end:::
+
+-----
+
 
 >[!NOTE]
 >This article focuses on recommendations for human data, including personal data and sensitive data such as biometric data, health data, racial or ethnic data, data collected manually from the general public or company employees, as well as metadata relating to human characteristics, such as age, ancestry, and gender identity, that may be created via annotation or labeling.
