BCD-esque fiction data for api.Highlight.type #8
Elchi3 wants to merge 4 commits into lolaslab:master
Conversation
When we spoke at Web DX last month, the general consensus was not to connect this to Baseline, because of the lift (and the potential negative consequences) of changing the definition. @captainbrosset do I have that right? In general I think a BCD-like structure would be good, since it's close to what consumers already expect.

A quick question about this statement:

    "partial_implementation": true,
    "notes": "Truncates the line and only starts reading from the highlight."
That's right. Changing the Baseline definition to account for other factors, such as accessibility, is an important decision that will need more time/discussions/availability of data. |
@Elchi3 Another question for you: how would "a without href" be represented? This is something I struggled with in my example dataset too. I know this example uses a CSS API, but for HTML we'd likely want to document all the accessibility API mappings, including "a without href".
example-data/api/Highlight.json (outdated)

    },
    "voiceover": {
      "version_added": "26",
      "devices": ["macOS"],
Another point in favor of making "browser x AT" its own "browser": you could mark device limitations at the browser level. For example, all voiceover statements would be intrinsically macOS only and all jaws statements would be intrinsically Windows only; validations against the upstream BCD could ignore Linux-only support statements relating to Chrome or Firefox.
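As a rough sketch of that idea (all names and the pairing table below are hypothetical, not from any existing schema): if each "browser x AT" pair is its own entry with an intrinsic OS, a validator could filter out upstream BCD statements that cannot apply to that pair.

```javascript
// Hypothetical pairing table: each browser/AT pair carries its intrinsic OS,
// so per-statement "devices" arrays become unnecessary.
const atBrowsers = {
  "firefox/nvda": { os: "Windows" },
  "firefox/voiceover_macos": { os: "macOS" },
  "chrome/jaws": { os: "Windows" },
};

// A validator could then ignore upstream statements that cannot apply,
// e.g. Linux-only Firefox notes are irrelevant to "firefox/nvda".
function isRelevant(pairId, statementOs) {
  return atBrowsers[pairId]?.os === statementOs;
}
```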
I was thinking that in ACD we'd always need "devices" as a required property, to indicate where the data point has been tested. If that turns out not to be needed, then yes, this could be recorded at the browser level.
> I was thinking that in ACD we'd always need "devices" as a required property, to indicate where the data point has been tested.

As far as I know, there isn't a screen reader that works across multiple operating systems, at least not among the five major ones I know of:
- NVDA: Windows
- JAWS: Windows
- VoiceOver for Mac: macOS
- TalkBack: Android
- VoiceOver for iOS: iOS

However, that doesn't mean that there isn't one.
Ahh, that is very good to know! I think I'll then withdraw my idea of always requiring the devices array, as it adds quite a lot of noise to the data. We should rather use voiceover_macos and voiceover_ios to differentiate VoiceOver.
Before you do so, I just need to confirm this, as I'm getting mixed results. NVDA may also work on Linux.
I think it is also fine if we decide to focus on the most important combinations for now. Some market-share data could help inform this.
I can think of these combinations as a start, but maybe we can reduce the list further:
chrome/nvda
chrome/jaws
chrome/voiceover_macos
chrome/voiceover_ios
chrome_android/talkback
firefox/nvda
firefox/jaws
firefox/voiceover_macos
firefox/voiceover_ios
safari/nvda
safari/jaws
safari/voiceover_macos
safari_ios/voiceover_ios
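The combinations above could be represented as data so tooling can iterate over them. A minimal illustrative sketch (the table just mirrors the list; nothing here is from a real schema):

```javascript
// Starting set of browser/AT combinations, mirroring the list above.
const combos = {
  chrome: ["nvda", "jaws", "voiceover_macos", "voiceover_ios"],
  chrome_android: ["talkback"],
  firefox: ["nvda", "jaws", "voiceover_macos", "voiceover_ios"],
  safari: ["nvda", "jaws", "voiceover_macos"],
  safari_ios: ["voiceover_ios"],
};

// Flatten into "<browser>/<at>" pair IDs, the same shape used elsewhere
// in this thread.
const pairIds = Object.entries(combos).flatMap(([browser, ats]) =>
  ats.map((at) => `${browser}/${at}`)
);
// pairIds.length === 13, matching the 13 combinations listed above.
```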
example-data/api/Highlight.json (outdated)

    "firefox": {
      "version_added": "140",
      "nvda": {
        "version_added": "1.4",
        "devices": ["Linux", "Windows", "macOS"]
      },
      "voiceover": {
        "version_added": "26",
        "devices": ["macOS"],
        "partial_implementation": true,
        "notes": "Truncates the line and only starts reading from the highlight."
      }
    },
Fundamentally, I don't think nesting the data can be made to work. For instance, suppose NVDA 2026 comes out and says they're not going to support Firefox releases before 147. How would the data reflect that? I figure you kinda have to join up the data, something like this:
"firefox": [
  {
    "version_added": "147",
    "nvda": {
      "version_added": "2026"
    }
  },
  {
    "version_added": "140",
    "nvda": {
      "version_added": "2025",
      "version_removed": "2026"
    }
  }
]
But that gets super weird if you add more ATs into the mix, where there's "new" data for ATs that haven't changed:
"firefox": [
  {
    "version_added": "147",
    "nvda": {
      "version_added": "2026"
    },
    "jaws": {
      "version_added": "2025"
    }
  },
  {
    "version_added": "140",
    "nvda": {
      "version_added": "2025",
      "version_removed": "2026"
    },
    "jaws": {
      "version_added": "2025"
    }
  }
]
I'm left thinking that the only way to do this is to flatten the data, as in:
"firefox/nvda": [
  {
    "version_added": "147/2026"
  },
  {
    "version_added": "140/2025",
    "version_removed": "147/2026"
  }
]
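To make the flattened shape concrete, here is a minimal consumer-side sketch (assumed semantics; `splitCompound` and `meetsAdded` are hypothetical helpers, not part of any existing schema). It splits the compound "<browser_version>/<at_version>" string and checks each axis independently. How `version_removed` ranges should combine is exactly the open question in this thread, so it is deliberately left out.

```javascript
// Split a compound "<browser_version>/<at_version>" string into its parts.
function splitCompound(version) {
  const [browser, at] = version.split("/");
  return { browser: Number(browser), at: Number(at) };
}

// True when both the browser and the AT meet the statement's version_added.
function meetsAdded(statement, browserVersion, atVersion) {
  const added = splitCompound(statement.version_added);
  return browserVersion >= added.browser && atVersion >= added.at;
}
```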
I think it would make consuming this data a bit easier (e.g., I could point compute-baseline at the data and treat each browser-AT pair as a browser, if ACD otherwise had the same schema as BCD).
But it comes at a serious cost: testing becomes, at the very least, an expensive matrix (e.g., testing old ATs with new browsers and old browsers with new ATs), and perhaps impossible (e.g., how do you do this without built-in support from BrowserStack?).
The alternative would be to limit ACD to some subset of releases of both browsers and ATs (e.g., have some moving window where you drop older browsers/ATs from the dataset, though I suspect you'd have to do research into AT user behavior to find out where to draw the line). But I wonder whether any pairing other than "latest stable/latest stable" would be practical to test on an ongoing basis.
(Outside of latest/latest, the browser data would be kinda gnarly. Do you treat a new release of an AT working with an older browser as if it were a backport and ignore it? Or do you have a list of releases that expands backwards and forwards in time every time there's a new release?)
(As an aside, I think it would be very smart for BCD to publish its schema as a package with its own versioning scheme, such that @mdn/browser-compat-data has it as a peer dependency. Then ACD and RCD could do the same—and extend the schema explicitly, as needed.)
Thanks Daniel, I'm convinced that nesting will lead to trouble as you've outlined.
I've now flattened the data and introduced version strings in the "<browser_version>/<at_version>" format that you just invented :)
I think we always want to try to find earliest/earliest. I don't know, but maybe a browser doesn't need to do anything special: it may have had an implementation of a web platform feature since version 1 that is now accessible thanks to the AT software. I think we should then say "1/2026".
If we are unable to determine earliest versions, then we could have "≤120/2026". In this example, the feature builds upon the custom highlights feature, and we know custom highlights were only introduced in Firefox 105, so "105/2026" might make sense. I don't know if we could be clever in automation about testing certain milestone releases, for example all ESR releases plus the release where the "parent" feature was introduced per BCD data.
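A small sketch of reading the "≤" convention borrowed from BCD ranged versions (`parseVersion` is a hypothetical helper): "≤105" would mean "at some version no later than 105", so consumers need to carry that uncertainty flag alongside the version number.

```javascript
// Parse a possibly-ranged version component: "≤105" -> at most 105,
// "140" -> exactly 140. The returned shape is an assumption for illustration.
function parseVersion(raw) {
  const atMost = raw.startsWith("≤");
  return { atMost, version: atMost ? raw.slice(1) : raw };
}
```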
Agree to the aside. Something to think about as we move this along.
Also, I wonder if we want to allow "140/false" or "false/2026". What would that mean?
How would we know that it is either the browser or the AT software that is ready in theory?
> maybe a browser doesn't need to do anything special and has an implementation of a web platform feature from version 1 that is now accessible thanks to the AT software. I think we should then say "1/2026".
I agree with this.
> In this example here, the feature builds upon the custom highlights feature, where we know custom highlights were only introduced in 105 in Firefox
We need to be careful not to conflate a web feature being released in the browser with the feature being exposed to the necessary accessibility APIs. For this dataset, we're not concerned with when the feature becomes available in the browser; we're interested in when it's available through accessibility mechanisms in the browser, i.e. is it in the accessibility tree in the way we expect it to be? This means it's possible for a web feature to be available in browsers but not have (or respond to) proper accessibility semantics, and/or not be in the a-tree.
For example, the custom highlights API was introduced in Chrome v105, but it hasn't been exposed to the accessibility API (see thread).
This complicates the discussion of what "the earliest" is in regards to the browser, because it means that the BCD table and the ACD table could say different things about browser availability of a feature.
> Also, I wonder if we want to allow "140/false" or "false/2026". What does that mean? How would we know that it is either the browser or the AT software that is ready in theory?
I think not, or at least not in a regular version_{added,removed} field. I suppose you could use some sort of partial_implementation-style model to represent an "underlying engine OK, overlaying AT not OK" situation, but that seems needlessly complex. Surely developers care about the sum of the browser and AT, not about the theoretical underpinnings of exposure to the AT?
example-data/api/Highlight.json (outdated)

    "partial_implementation": true,
    "notes": "Truncates the line and only starts reading from the highlight."
Partials seem like an area where ACD would have a huge advantage over BCD. By having a more specific scope, I'd expect ACD to extend the BCD schema to handle this in a less general way. I don't know a lot about accessibility, but I'd imagine that an accessibility expert could categorize AT failures. I'm guessing here, but I'd imagine something like this, instead of reusing BCD's legacy partial implementations and notes:
"at_failures": [
  {
    "category": "bad-context",
    "description": "Truncates the line and only starts reading from the highlight",
    "bug": "https://…"
  }
]

"at_failures": [
  {
    "category": "hidden-content",
    "description": "Doesn't read the foo attribute",
    "bug": "https://…"
  }
]

"at_failures": [
  {
    "category": "traversal-errors",
    "description": "Reads the list of values in reverse order.",
    "bug": "https://…"
  }
]
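A quick sketch of how tooling could validate entries in such a categorized array (the category list is invented for illustration, taken only from the examples above):

```javascript
// Hypothetical closed set of failure categories, mirroring the examples above.
const KNOWN_CATEGORIES = new Set([
  "bad-context",
  "hidden-content",
  "traversal-errors",
]);

// Check that a failure entry has a known category, a description,
// and (optionally) an https bug link.
function validateFailure(failure) {
  return (
    KNOWN_CATEGORIES.has(failure.category) &&
    typeof failure.description === "string" &&
    (failure.bug === undefined || failure.bug.startsWith("https://"))
  );
}
```

A closed category set would let consumers aggregate failures across ATs (e.g. "how many features have traversal errors?"), which free-form notes don't allow.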
I added a failures array. It's intentionally generic, because maybe BCD could also use failures arrays. :) Good point about not jumping on the old partial-implementation concept that BCD currently uses.
    {
      "api": {
        "Highlight": {
          "type": {
api.Highlight.type is a key that BCD is already using. Do we want this sort of key matching to indicate that this complements the BCD information?
Consumers would query:
bcd.api.Highlight.type
acd.api.Highlight.type
Or should we instead always create unique keys?
bcd.api.Highlight.type
acd.api.Highlight.type_exposure
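The kind of dotted-key lookup consumers would do against both trees can be sketched like this (`get` is a hypothetical helper, not an existing BCD API; the `acd` object is toy data):

```javascript
// Walk a nested compat tree by dotted path, BCD-style.
function get(tree, path) {
  return path.split(".").reduce((node, key) => node?.[key], tree);
}

// Toy ACD tree shaped like BCD's key structure.
const acd = { api: { Highlight: { type: { __compat: {} } } } };

// Matching BCD's keys lets consumers reuse the exact same query string
// against either dataset.
const entry = get(acd, "api.Highlight.type");
```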
I think I'd need to understand more about why the keys are the way they are and if ACD loses/gains anything by adopting the same structure.
I think by sticking roughly to BCD's structure you gain some sort of belonging for the data. This is maybe not so much of a concern now that we can also tag the data with a web-feature id, but it may still be useful for data consumers. BCD does this so you can do queries like "http.headers.*" and then get the BCD for all HTTP headers.
Added example data for this. Again, if we follow the BCD structures, this would be a key in the form of acd.html.elements.a.a_without_href. But maybe AAM is something recurring, so we could group it under "aam" trees, for example.
This data is fiction.
I was wondering if we could come up with a BCD-like data format for ACD, based on a discussion shared with me on Mastodon: https://mas.to/@patrickbrosset/114782908880409941
Basically, the idea here is that such a structure could complement the information we have in BCD (api.Highlight.type) and therefore offer a direct hook to any BCD consumers (such as web-features or "baseline").