Commit daae206

author
Janina Sajka
committed
Scott Addition: Revised and expanded Sec 4.2: Alternative Text for Images
1 parent 145f83d commit daae206

1 file changed

+88
-12
lines changed

index.html

Lines changed: 88 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -103,18 +103,92 @@ <h2 id="relevance-of-current-standards-and-guidance">Relevance of current standa
103103
<p>Furthermore, the <a href="https://www.w3.org/TR/ATAG20/">Authoring Tool Accessibility Guidelines (ATAG) 2.0</a> could also offer support, particularly in Part B, in which the creation of accessible content is of particular importance. For example, authoring tools that have automatically generated alternative text could support the creation of accessible content.</p>
104104
</section>
105105
<section>
106-
<h2 id="alternative-text-for-images">Alternative text for images</h2>
107-
<p>There currently exists a number of machine learning-based tools that have been integrated into popular social media platforms, alongside authoring tools, that are equipped to create an automated alternative text description based on machine learning algorithms that scan and determine the contents of visual materials, such as an image. Until recently, this automated process was considered to hold high inaccuracy to the point where its utility was questioned [[RN3]]. Recent developments have improved automated alternative text accuracy, but criticism persists due to limitations in providing detail and recognising the importance of relevant data.</p>
108-
<figure><img src="bar-graph.png" alt="A graph with different colored bars Description automatically generated">
109-
<figcaption>A coloured bar graph representing the favourite colour of children (inspired by an example from Twinkl, n.d.)</figcaption>
110-
</figure>
111-
<p>A good example can be seen in a popular image used to illustrate data (Twinkl, n.d.). The image features a classic bar graph of responses from children clarifying their favourite colour, in which yellow has been found to achieve the highest result with 9 responses. While an appropriate alternative text for the image should endeavour to capture the significant points of the graph with detail, such as its drawn intention - the information organising its X and Y axes, as well as the resulting data, the automated alternative text simply describes this image as “a graph with different coloured bars”. While technically accurate, this information lacks depth to convey important technical details from the graph.</p>
112-
<figure><img src="image2.jpeg" alt="A nebula in space with stars Description automatically generated">
113-
<figcaption>An image provided by the James Webb Space Telescope</figcaption>
114-
</figure>
115-
<p>A second example is shown through an image from the James Webb Space Telescope. As all images publicly released include automated alternative text, the alternative text for Figure 2 was compared to that of a manually created alternative text. The former reads the description, “The image is divided horizontally by an undulating line between a cloudscape forming a nebula along the bottom portion and a comparatively clear upper portion”, while the latter states: “Speckled across both portions is a starfield, showing innumerable stars of many sizes. The smallest of these are small, distant, and faint points of light. The largest of these appear larger, closer, brighter, and more fully resolved with 8-point diffraction spikes. The upper portion of the image is bluish and has wispy translucent cloudlike streaks rising from the nebula below.”</p>
116-
<p>Upon observation, the automated alternative text presents a simplified iteration of the image, using the brief narration, “a nebula in space with stars”. As such, once again, this comparison supports that, while automated alternative text provided by machine learning is representative of the image being studied and could assist in delivering a basic and minimised understanding of an image, it does not have the ability to incorporate the orientation of detail required to capture the essence of the image.</p>
117-
<p>Although machine learning techniques embedded in authoring tools and other platforms may provide some information, generative AI platforms that are able to create images, videos and other visual media content based on text input tend not to provide automated alternative text. Hence, this would make it difficult for people who are blind or have low vision to attain a meaningful interpretation of these AI-generated outputs.</p>
106+
<h2 id="alternative-text-for-images">4.2 Alternative text for images</h2>
107+
<p>There currently exist a number of machine learning-based tools that
108+
have been integrated into popular social media platforms, alongside
109+
authoring tools, that are equipped to create an automated alternative
110+
text description based on machine learning algorithms that scan and
111+
determine the contents of visual materials, such as an image. Until
112+
recently, this automated process was considered to hold high inaccuracy
113+
to the point where its utility was questioned [<a
114+
href="https://w3c.github.io/ai-accessibility/#bib-rn3"><em>RN3</em></a>].
115+
Recent developments have improved automated alternative text accuracy,
116+
but criticism persists due to limitations in providing detail and
117+
recognising the importance of relevant data.</p>
118+
<p><img src="media/image1.png" style="width:5.00694in;height:3.02083in"
119+
alt="A graph with different colored bars " /></p>
120+
<p><a
121+
href="https://w3c.github.io/ai-accessibility/#fig-a-coloured-bar-graph-representing-the-favourite-colour-of-children-inspired-by-an-example-from-twinkl-n-d">Figure 1</a> A
122+
coloured bar graph representing the favourite colour of children
123+
(inspired by an example from Twinkl, n.d.)</p>
124+
<p>A good example can be seen in a popular image used to illustrate data
125+
(Twinkl, n.d.). The image features a classic bar graph of responses from
126+
children clarifying their favourite colour, in which yellow has been
127+
found to achieve the highest result with 9 responses. While an
128+
appropriate alternative text for the image should endeavour to capture
129+
the significant points of the graph with detail, such as its drawn
130+
intention - the information organising its X and Y axes, as well as the
131+
resulting data, the automated alternative text on everyday applications
132+
such as Microsoft Word simply describes this image as “a graph with
133+
different coloured bars”. While technically accurate, this information
134+
lacks depth to convey important technical details from the graph.</p>
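A minimal sketch of what fuller alternative text for Figure 1 might look like, using only details stated in the paragraph above (a bar graph of children's favourite colours, with yellow highest at 9 responses). The `figure`/`figcaption` wrapper, the wording of the `alt` attribute and the axis descriptions are illustrative assumptions, not markup taken from the document. Counts for the remaining bars are deliberately omitted because the text does not state them.

```html
<!-- Illustrative only: the alt wording and axis descriptions are assumptions. -->
<figure>
  <img src="media/image1.png"
       alt="Bar graph of children's favourite colours. The X axis lists the colours and the Y axis shows the number of responses. Yellow is the most popular colour, with 9 responses.">
  <figcaption>A coloured bar graph representing the favourite colour of children (inspired by an example from Twinkl, n.d.)</figcaption>
</figure>
```

In practice, the description would also need to give the count for each remaining bar, either in the `alt` text itself or in an accompanying data table, so that the resulting data is conveyed in full.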
135+
<p>That said, recent evolutions in generative AI that incorporate AI
136+
into accessibility features, such as screen readers, are able to provide
137+
a more accurate description. For example, on Google Gemini, the
138+
following descriptions were provided for the coloured bar graph:</p>
139+
<p>“The image displays a bar graph titled “Favourite Colour of Primary
140+
School Children”. The X-axis represents different colours: yellow, red,
141+
blue, green, and pink. The y-axis is labelled “Number of Votes” and goes
142+
up to 8. The graph shows the number of votes for each colour: Yellow has
143+
the most votes, followed by pink, then red, green and blue.”</p>
144+
<p>“The image displays a bar graph inside a white rounded rectangle
145+
against a black background. The graph’s title is “Favourite Colour of
146+
Primary School Children”. The Y-axis represents the “Number of Votes”
147+
from 0 to 8, and the X-axis shows the following colours: yellow, red,
148+
blue, green, and pink. There is a yellow bar with a value of 7, a red
149+
bar with a value of 5, a blue bar with a value of 2, a green bar with a
150+
value of 3, and a pink bar with a value of 6.”</p>
151+
<p>Although the descriptions for images can be seen to have
152+
significantly improved, the information provided changes each time the
153+
image is checked. This then introduces an issue of inconsistency,
154+
despite some relative accuracy.</p>
155+
<p><img src="media/image2.jpeg" style="width:4.375in;height:2.83333in"
156+
alt="A nebula in space with stars " /></p>
157+
<p><a
158+
href="https://w3c.github.io/ai-accessibility/#fig-an-image-provided-by-the-james-webb-space-telescope">Figure 2</a> An
159+
image provided by the James Webb Space Telescope</p>
160+
<p>A second example is shown through an image from the James Webb Space
161+
Telescope. As all images publicly released include automated alternative
162+
text, the alternative text for Figure 2 was compared to that of a
163+
manually created alternative text. The former reads the description,
164+
“The image is divided horizontally by an undulating line between a
165+
cloudscape forming a nebula along the bottom portion and a comparatively
166+
clear upper portion”, while the latter states: “Speckled across both
167+
portions is a starfield, showing innumerable stars of many sizes. The
168+
smallest of these are small, distant, and faint points of light. The
169+
largest of these appear larger, closer, brighter, and more fully
170+
resolved with 8-point diffraction spikes. The upper portion of the image
171+
is bluish and has wispy translucent cloudlike streaks rising from the
172+
nebula below.”</p>
173+
<p>Upon observation, the automated alternative text presents a
174+
simplified iteration of the image, using the brief narration, “a nebula
175+
in space with stars”. As such, once again, this comparison supports
176+
that, while automated alternative text provided by machine learning is
177+
representative of the image being studied and could assist in delivering
178+
a basic and minimised understanding of an image, it does not have the
179+
ability to incorporate the orientation of detail required to capture the
180+
essence of the image. The same can currently be said for generative AI
181+
tools that incorporate AI into accessibility features. Using Google Gemini, it
182+
was found that the complexity of the image makes it difficult for
183+
current generative AI techniques to fully comprehend the detail of the
184+
image.</p>
185+
<p>Although machine learning techniques embedded in authoring tools and
186+
other platforms may provide some information, generative AI platforms
187+
that are able to create images, videos and other visual media content
188+
based on text input tend not to provide automated alternative text.
189+
Hence, this would make it difficult for people who are blind or have low
190+
vision to attain a meaningful interpretation of these AI-generated
191+
outputs.</p>
118192
</section>
119193
<section>
120194
<h2 id="automatic-speech-recognition-for-captioning">Automatic Speech Recognition for captioning</h2>
@@ -166,6 +240,7 @@ <h1 id="evaluation-tools">AI for evaluation tools & accessibility testing</h1>
166240
<section>
167241
<h1 id="accessibility-user-interface">AI and user interface generation</h1>
168242
<p>@@This section will discuss how AI can be used to create and/or modify the user interface. Some core things to consider: What need is being met when we ask AI to modify or change a UI? What does an MVP AI-generated UI look like? How will the quality of generated user interfaces be determined? Are there potential harms and anti-patterns that need to be considered?</p>
243+
<section>
169244
<h2 id="accessibility-overlays">Accessibility Overlays</h2>
170245
<p>The rapid increase of accessibility overlays on websites has been viewed as rather controversial by people with disability. While these tools could be useful for individuals unfamiliar with assistive technologies that are built into computing and mobile devices, critics of overlays point to the tools being marketed as an accessibility solution, thus causing the code to interrupt the use of more developed assistive technologies such as screen readers [[RN9]]. Furthermore, these overlay features carry the tendency to be limited in functionality as compared to tools installed in an operating system.</p>
171246
<p>However, the promise of generative AI may be able to address the criticism that such tools lack functionality. An accessibility overlay capable of utilising generative AI functionality may be able to provide increased real-time support in overcoming accessibility issues or improving its interpretation of content, such as for images, language and page structure. Although these tools are currently promoted as a collection of accessibility features somewhat independent from the content, the applicability of an overlay that contributes accessibility improvements is similar to the use of AI chatbots and other prompting mechanisms, thereby suggesting this may prove to be another area where generative AI could introduce improvements.</p>
@@ -185,6 +260,7 @@ <h2 id="accessible-web-portals"> Accessible web portals</h2>
185260
approach also opens the door for other user interfaces such as verbal
186261
interaction to achieve tasks, not currently provided by a vendor’s
187262
website.</p>
263+
</section>
188264
</section>
189265
<section>
190266
<h1 id="potential-harms-and-anti-patterns">Potential harms and anti-patterns in AI / ML</h1>
