BCD-esque fiction data for api.Highlight.type #8

Draft
Elchi3 wants to merge 4 commits into lolaslab:master from Elchi3:bcd-esque-acd

@Elchi3
Elchi3 commented Jul 28, 2025

This data is fiction.

I was wondering if we could arrive at a BCD-like data format for ACD, based on a discussion shared with me from Mastodon: https://mas.to/@patrickbrosset/114782908880409941

Basically, the idea here is that such a structure could complement the information we have in BCD (api.Highlight.type) and therefore offer a direct hook to any BCD consumers (such as web-features or "baseline").

@lolaodelola
Collaborator

When we spoke at Web DX last month, the general consensus was to not have this as something that's connected to Baseline because of the lift (& potential negative consequences) of changing the definition. @captainbrosset do I have that right?

In general I think a BCD-like structure would be good since it's close to what consumers already expect. A quick question:

"partial_implementation": true,
"notes": "Truncates the line and only starts reading from the highlight."

Is "notes" here used as a way to explain the partial implementation?

@captainbrosset

> When we spoke at Web DX last month, the general consensus was to not have this as something that's connected to Baseline because of the lift (& potential negative consequences) of changing the definition. @captainbrosset do I have that right?

That's right. Changing the Baseline definition to account for other factors, such as accessibility, is an important decision that will need more time/discussions/availability of data.
Florian's proposal here is interesting as it would make the eventual consumption of the accessibility data from projects that already rely on BCD much easier.

@lolaodelola
Collaborator

lolaodelola commented Jul 28, 2025

@Elchi3 Another question for you: how would "a without href" be represented? This is something I struggled to deal with in my example dataset too. I know this example is using a CSS API, but for HTML we'd likely want to document all the accessibility API mappings, including "a without href".

},
"voiceover": {
"version_added": "26",
"devices": ["macOS"],

ddbeck commented Jul 29, 2025

Another point in favor of making "browser x AT" its own "browser": you could mark device limitations at the browser level. For example, all voiceover statements would be intrinsically macOS only and all jaws statements would be intrinsically Windows only; validations against the upstream BCD could ignore Linux-only support statements relating to Chrome or Firefox.

Author

I was thinking that in ACD we always need "devices" as a required property to indicate where the data point has been tested. If that turns out not to be needed then yes, this could be recorded at the browser level.

Collaborator

> I was thinking that in ACD we always need "devices" as a required property to indicate where the data point has been tested.

As far as I know there isn't a screen reader that works across multiple operating systems, at least not among the five major ones I know of:

  • NVDA: Windows
  • JAWS: Windows
  • VoiceOver for Mac: macOS
  • TalkBack: Android
  • VoiceOver for iOS: iOS

However, that doesn't mean that there isn't one.

Author

Ahh that is very good to know! I think I will then withdraw my idea to always require the devices array as it is quite a lot of noise in the data. We should then rather use voiceover_macos and voiceover_ios to differentiate VoiceOver.

Collaborator

Before you do so, I just need to confirm this as I'm getting mixed results. NVDA may also work on Linux.

Author

I think it is also fine if we decide to focus on the most important combinations for now. Some market share data could help to inform this.

I can think of these combinations as a start, but maybe we can reduce it further?

chrome/nvda
chrome/jaws
chrome/voiceover_macos
chrome/voiceover_ios
chrome_android/talkback
firefox/nvda
firefox/jaws
firefox/voiceover_macos
firefox/voiceover_ios
safari/nvda
safari/jaws
safari/voiceover_macos
safari_ios/voiceover_ios
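
To make the idea of dropping the per-statement "devices" array concrete, here is a minimal Python sketch. It assumes each AT maps to a single platform, as in the screen reader list earlier in this thread (NVDA on Linux is still unconfirmed). The names AT_PLATFORMS, split_pair, and implied_devices are hypothetical and not part of any existing BCD or ACD tooling.

```python
from typing import Dict, List, Tuple

# Platform per AT, copied from the screen reader list in this thread.
# Assumption: one platform per AT; NVDA on Linux is unconfirmed.
AT_PLATFORMS: Dict[str, List[str]] = {
    "nvda": ["Windows"],
    "jaws": ["Windows"],
    "voiceover_macos": ["macOS"],
    "voiceover_ios": ["iOS"],
    "talkback": ["Android"],
}


def split_pair(key: str) -> Tuple[str, str]:
    """Split a 'browser/at' key such as 'chrome/nvda' into its two parts."""
    browser, sep, at = key.partition("/")
    if not sep or not browser or not at:
        raise ValueError(f"expected '<browser>/<at>', got {key!r}")
    return browser, at


def implied_devices(key: str) -> List[str]:
    """Return the platforms implied by the AT half of a 'browser/at' key."""
    _, at = split_pair(key)
    if at not in AT_PLATFORMS:
        raise KeyError(f"unknown AT: {at!r}")
    return list(AT_PLATFORMS[at])


if __name__ == "__main__":
    for pair in ("chrome/nvda", "safari_ios/voiceover_ios", "chrome_android/talkback"):
        print(pair, "->", implied_devices(pair))
```

With a table like this, the platform would be recorded once per AT instead of on every support statement.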

Comment on lines +25 to +37
"firefox": {
  "version_added": "140",
  "nvda": {
    "version_added": "1.4",
    "devices": ["Linux", "Windows", "macOS"]
  },
  "voiceover": {
    "version_added": "26",
    "devices": ["macOS"],
    "partial_implementation": true,
    "notes": "Truncates the line and only starts reading from the highlight."
  }
},

Fundamentally, I don't think nesting the data can be made to work. For instance, suppose NVDA 2026 comes out and says they're not going to support Firefox releases before 147. How would the data reflect that? I figure you kinda have to join up the data, something like this:

"firefox": [
  {
    "version_added": "147",
    "nvda": {
      "version_added": "2026"
    }
  },
  {
    "version_added": "140",
    "nvda": {
      "version_added": "2025",
      "version_removed": "2026"
    }
  }
]

But that gets super weird if you add more ATs into the mix, where there's "new" data for ATs that haven't changed:

"firefox": [
  {
    "version_added": "147",
    "nvda": {
      "version_added": "2026"
    },
    "jaws": {
      "version_added": "2025"
    }
  },
  {
    "version_added": "140",
    "nvda": {
      "version_added": "2025",
      "version_removed": "2026"
    },
    "jaws": {
      "version_added": "2025"
    }
  }
]

I'm left thinking that the only way to do this is to flatten the data, as in:

"firefox/nvda": [
  {
    "version_added": "147/2026"
  },
  {
    "version_added": "140/2025",
    "version_removed": "147/2026"
  }
]

I think it would make consuming this data a bit easier (e.g., I could point compute-baseline at the data and treat each browser-AT pair as a browser, if ACD otherwise had the same schema as BCD).
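
As a rough illustration of treating each browser-AT pair as a browser, the following Python sketch reads the flattened, fictional data above and reports the statement that is still current for each pair. It assumes a BCD-like rule that a statement without "version_removed" is the current one. The names ACD_SUPPORT, current_statement, and current_support are hypothetical.

```python
from typing import Dict, List, Optional

Statement = Dict[str, str]

# Flattened, fictional data in the shape proposed above.
ACD_SUPPORT: Dict[str, List[Statement]] = {
    "firefox/nvda": [
        {"version_added": "147/2026"},
        {"version_added": "140/2025", "version_removed": "147/2026"},
    ],
}


def current_statement(statements: List[Statement]) -> Optional[Statement]:
    """Return the first statement that has no version_removed, if any."""
    for statement in statements:
        if "version_removed" not in statement:
            return statement
    return None


def current_support(support: Dict[str, List[Statement]]) -> Dict[str, Optional[str]]:
    """Map each 'browser/at' pseudo-browser to its current version_added."""
    result: Dict[str, Optional[str]] = {}
    for pair, statements in support.items():
        statement = current_statement(statements)
        result[pair] = statement["version_added"] if statement else None
    return result


if __name__ == "__main__":
    print(current_support(ACD_SUPPORT))
```

A consumer written this way never needs to know that "firefox/nvda" is a pair; it only sees one more browser key.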

But it comes at a serious cost: testing becomes at the very least an expensive matrix (e.g., testing old ATs with new browsers; testing old browsers with new ATs), and perhaps an impossible one (e.g., how do you do this without built-in support from BrowserStack?).

The alternative would be to limit ACD to some subset of releases of both browsers and ATs (e.g., have some moving window where you drop older browsers/ATs from the dataset, though I suspect you'd have to do research into AT user behavior to find out where to draw the line). But I wonder whether any pairing other than "latest stable/latest stable" would be practical to test on an ongoing basis.

(Except latest/latest, the browser data would be kinda gnarly. Do you treat a new release of an AT working with an older browser as if it were a backport and ignore it? Or do you have a list of releases that expands backwards and forwards in time, every time there's a new release?)

(As an aside, I think it would be very smart for BCD to publish its schema as a package with its own versioning scheme, such that @mdn/browser-compat-data has it as a peer dependency. Then ACD and RCD could do the same—and extend the schema explicitly, as needed.)

Author

Thanks Daniel, I'm convinced that nesting will lead to trouble as you've outlined.
I've now flattened the data and introduced version strings in the "<browser_version>/<at_version>" format that you just invented :)

I think we always want to try to find earliest/earliest. I don't know, but maybe a browser doesn't need to do anything special and has an implementation of a web platform feature from version 1 that is now accessible thanks to the AT software. I think we should then say "1/2026".

If we are unable to determine earliest versions then we could have "≤120/2026". In this example here, the feature builds upon the custom highlights feature, where we know custom highlights were only introduced in 105 in Firefox, so "105/2026" might make sense. I don't know if we could be clever in automation about testing certain milestone releases, like all ESR releases plus when the "parent" feature was introduced per BCD data, for example.
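
A small Python sketch of how a consumer might parse these version strings, including the "≤" prefix for ranged versions. It assumes the "<browser_version>/<at_version>" format described above; allowing "≤" on the AT half is an extra assumption not discussed in this thread, and PairVersion and parse_pair_version are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PairVersion:
    browser: str
    at: str
    browser_ranged: bool = False  # True when written as "≤<version>"
    at_ranged: bool = False       # Assumption: "≤" may also prefix the AT half


def _strip_range(part: str) -> Tuple[str, bool]:
    """Remove a leading '≤' and report whether it was present."""
    if part.startswith("≤"):
        return part[1:], True
    return part, False


def parse_pair_version(value: str) -> PairVersion:
    """Parse '<browser_version>/<at_version>', e.g. '105/2026' or '≤120/2026'."""
    browser_raw, sep, at_raw = value.partition("/")
    if not sep:
        raise ValueError(f"expected '<browser_version>/<at_version>', got {value!r}")
    browser, browser_ranged = _strip_range(browser_raw)
    at, at_ranged = _strip_range(at_raw)
    if not browser or not at:
        raise ValueError(f"empty version in {value!r}")
    return PairVersion(browser, at, browser_ranged, at_ranged)


if __name__ == "__main__":
    print(parse_pair_version("105/2026"))
    print(parse_pair_version("≤120/2026"))
```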

Agree to the aside. Something to think about as we move this along.

Author

Also, I wonder if we want to allow "140/false" or "false/2026". What would that mean?
How would we know whether it is the browser or the AT software that is ready in theory?

Collaborator

lolaodelola commented Jul 29, 2025

> maybe a browser doesn't need to do anything special and has an implementation of a web platform feature from version 1 that is now accessible thanks to the AT software. I think we should then say "1/2026".

I agree with this.

> In this example here, the feature builds upon the custom highlights feature, where we know custom highlights were only introduced in 105 in Firefox

We need to be careful not to conflate a web feature being released in the browser with the feature being exposed to the necessary accessibility APIs. For this dataset, we're not concerned with when the feature becomes available in the browser; we're interested in when it's available through accessibility mechanisms in the browser, i.e. is it in the accessibility tree in the way we expect it to be? That means it's possible for a web feature to be available in browsers but not have or respond to proper accessibility semantics, and/or not be in the accessibility tree.

For example, the custom highlights API was introduced in Chrome 105 but it hasn't been exposed to the accessibility API (see thread).

This complicates things when we're discussing what "the earliest" is with regard to the browser, because it means that the BCD table and the ACD table could say different things about browser availability of a feature.

> Also, I wonder if we want to allow "140/false" or "false/2026". What would that mean?
> How would we know whether it is the browser or the AT software that is ready in theory?

I think not, or at least not in a regular version_{added,removed} field. I suppose you could use some sort of partial_implementation-style model to represent an "underlying engine OK, overlaying AT not OK" situation, but that seems needlessly complex. Surely developers care about the sum of the browser and AT, not about the theoretical underpinnings of exposure to AT?

Comment on lines +51 to +52
"partial_implementation": true,
"notes": "Truncates the line and only starts reading from the highlight."

Partials seem like an area where ACD would have a huge advantage over BCD. By having a more specific scope, I'd expect ACD to extend the BCD schema to handle this in a less general way. I don't know a lot about accessibility, but I'd imagine that an accessibility expert could categorize AT failures. I'm guessing here, but I'd imagine something like the following, instead of reusing BCD's legacy partial implementations and notes.

at_failures: [
  {
    "category": "bad-context",
    "description": "Truncates the line and only starts reading from the highlight",
    "bug": "https://…"
  }
]
at_failures: [
  {
    "category": "hidden-content",
    "description": "Doesn't read the foo attribute",
    "bug": "https://…"
  }
]
at_failures: [
  {
    "category": "traversal-errors",
    "description": "Reads the list of values in reverse order.",
    "bug": "https://…"
  }
]
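
To show how categorized failures could be consumed, here is a minimal Python sketch that validates entries shaped like the examples above and groups their descriptions by category. The category names are the guesses from this comment, not an agreed taxonomy, and KNOWN_CATEGORIES, validate_failure, and group_by_category are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List

Failure = Dict[str, str]

# Hypothetical categories, taken from the guessed examples above.
KNOWN_CATEGORIES = {"bad-context", "hidden-content", "traversal-errors"}
REQUIRED_KEYS = {"category", "description"}  # "bug" is treated as optional


def validate_failure(entry: Failure) -> None:
    """Raise ValueError if an entry lacks required keys or uses an unknown category."""
    missing = REQUIRED_KEYS - set(entry)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if entry["category"] not in KNOWN_CATEGORIES:
        raise ValueError(f"unknown category: {entry['category']!r}")


def group_by_category(failures: List[Failure]) -> Dict[str, List[str]]:
    """Validate every entry, then collect descriptions per category."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entry in failures:
        validate_failure(entry)
        grouped[entry["category"]].append(entry["description"])
    return dict(grouped)


if __name__ == "__main__":
    failures = [
        {"category": "bad-context",
         "description": "Truncates the line and only starts reading from the highlight"},
        {"category": "traversal-errors",
         "description": "Reads the list of values in reverse order."},
    ]
    print(group_by_category(failures))
```

A fixed category list is what would let consumers filter or count failures, which free-form notes do not allow.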

Author

I added a failures array. Intentionally generic because maybe BCD could also use failures arrays :) Good point about not jumping on the old partial implementation concept that BCD currently uses.

{
"api": {
"Highlight": {
"type": {
Author

api.Highlight.type is a key that BCD already uses. Do we want this sort of key matching to indicate that this complements the BCD information?

Consumers would query:
bcd.api.Highlight.type
acd.api.Highlight.type

Or should we instead always create unique keys?
bcd.api.Highlight.type
acd.api.Highlight.type_exposure

Collaborator

I think I'd need to understand more about why the keys are the way they are and if ACD loses/gains anything by adopting the same structure.

Author

I think by sticking roughly to BCD's structure you gain some sort of belonging for the data. This is maybe not so much of a concern now that we can also tag the data with a web-feature id, but it may still be useful for data consumers. BCD does this so you can run queries like "http.headers.*" and then get the BCD for all HTTP headers.
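
A minimal Python sketch of what matching keys would give consumers: the same dotted path resolves in both trees, and a trailing ".*" returns everything below a prefix. The tiny BCD and ACD dictionaries hold fictional values, the use of "__compat" in ACD is an assumption carried over from BCD, and lookup and query are hypothetical helpers rather than an existing API.

```python
from typing import Any, Dict, Iterator, Tuple

Tree = Dict[str, Any]

# Tiny stand-ins with fictional values; ACD reusing "__compat" is an assumption.
BCD: Tree = {"api": {"Highlight": {"type": {
    "__compat": {"support": {"firefox": {"version_added": "140"}}}}}}}
ACD: Tree = {"api": {"Highlight": {"type": {
    "__compat": {"support": {"firefox/nvda": [{"version_added": "147/2026"}]}}}}}}


def lookup(tree: Tree, path: str) -> Tree:
    """Resolve a dotted path such as 'api.Highlight.type' (KeyError if absent)."""
    node = tree
    for part in path.split("."):
        node = node[part]
    return node


def _walk(node: Tree, prefix: str) -> Iterator[Tuple[str, Tree]]:
    """Yield (path, compat) for every descendant that carries '__compat'."""
    for key, child in node.items():
        if key == "__compat" or not isinstance(child, dict):
            continue
        path = f"{prefix}.{key}"
        if "__compat" in child:
            yield path, child["__compat"]
        yield from _walk(child, path)


def query(tree: Tree, pattern: str) -> Iterator[Tuple[str, Tree]]:
    """Support exact paths and prefix patterns ending in '.*'."""
    if pattern.endswith(".*"):
        prefix = pattern[:-2]
        yield from _walk(lookup(tree, prefix), prefix)
    else:
        yield pattern, lookup(tree, pattern)["__compat"]


if __name__ == "__main__":
    key = "api.Highlight.type"
    print(lookup(BCD, key)["__compat"]["support"])
    print(lookup(ACD, key)["__compat"]["support"])
    print([path for path, _ in query(ACD, "api.*")])
```

With unique keys such as type_exposure instead, a consumer would need an explicit mapping between the two datasets before it could join them.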

@Elchi3
Author

Elchi3 commented Jul 29, 2025

> @Elchi3 Another question for you: how would "a without href" be represented? This is something I struggled to deal with in my example dataset too. I know this example is using a CSS API, but for HTML we'd likely want to document all the accessibility API mappings, including "a without href".

Added example data for this. Again, if we follow the BCD structures, this would be a key in the form of acd.html.elements.a.a_without_href. But maybe AAM is something recurring, so we could group such mappings under "aam" subtrees, for example:

  • acd.html.elements.a.aam.a_without_href
  • acd.html.elements.a.aam.some_other_mapping

4 participants