Add new accessibility tool via integrating AXorcist #26

Open. Wants to merge 66 commits into base: main.

Commits (66):
44091b3
Added vibecoded accessibility API
mitsuhiko May 16, 2025
99f9594
Switch to release mode
mitsuhiko May 16, 2025
5464ec1
Fix accessibility_query.md KB validation by changing language to java…
steipete May 19, 2025
a783fd1
Fixes requesting accessibility permissions on first run
steipete May 19, 2025
8bfffaa
Change build script to make a universal binary
steipete May 19, 2025
bdf9bec
Add missing app name helper
steipete May 19, 2025
1d0d647
Make sure binary is copied into release
steipete May 19, 2025
6f94d4d
Add limit and execution time feature, reformat description
steipete May 19, 2025
d6f0729
switch to camel_case
steipete May 19, 2025
c115366
Add changelog for new ax query
steipete May 19, 2025
da5392e
Massage the Swift file and add a help parameter and a version.
steipete May 20, 2025
c67b678
Fix the build script to not use thin_lto
steipete May 20, 2025
27ca93b
Further reduce binary size
steipete May 20, 2025
4c4a4ad
Check in ax binary
steipete May 20, 2025
b11d420
Improve build script for clean builds
steipete May 20, 2025
2e8b3f7
Explain new accessibility tool
steipete May 20, 2025
10ca087
Make executor more lenient and add more features
steipete May 20, 2025
364f32f
Add accessbility debugging rule
steipete May 20, 2025
21695e6
Improve script runner reliability
steipete May 20, 2025
bd0d328
Add options and make parsing more lenient
steipete May 20, 2025
d1e57cc
Refactor and greatly improve AXHelper
steipete May 20, 2025
5cef0b4
Handle attributed strings
steipete May 20, 2025
1da2520
more refinements
steipete May 20, 2025
f3bbf25
fixes compile issues
steipete May 20, 2025
dfba531
Add various debug rules
steipete May 20, 2025
0d3656d
script claude desktop
steipete May 20, 2025
1f977b6
Major refactorings and logic improvements
steipete May 20, 2025
4e5388b
Improve ax handling
steipete May 20, 2025
853cc53
further object orient this
steipete May 20, 2025
70f31bd
Major refactorings to AXHelper
steipete May 20, 2025
423ed51
Update changelog
steipete May 20, 2025
d99ae3b
Allow search on computed properties
steipete May 20, 2025
8cc14ea
Major refactorings and structure changes
steipete May 20, 2025
2e22769
Drop the AX where applicable
steipete May 20, 2025
2b5ab95
Various optimizations and LOC reduction
steipete May 20, 2025
6c9a59d
Add support for focussed app
steipete May 20, 2025
03557b8
Further refactors
steipete May 20, 2025
522ee0b
Improve capability of AnyCodable
steipete May 20, 2025
fca38ff
Refresh learning
steipete May 20, 2025
4db5900
delete dupe
steipete May 20, 2025
aadfd80
Revise learning
steipete May 20, 2025
1a740d6
Fixes TextEdit ax
steipete May 20, 2025
cc1a6d1
Support calling from the command line next to stdin
steipete May 20, 2025
30afffb
Revise readme
steipete May 20, 2025
9680b1b
Move classes into new AXorcist lib
steipete May 20, 2025
8e82197
Move files into new structure
steipete May 21, 2025
9fd467e
Rule updates
steipete May 21, 2025
a02278f
Add Package.resolved to gitignore
steipete May 21, 2025
4970c51
Improve command line tools
steipete May 21, 2025
a89f822
Greatly improve test suite
steipete May 21, 2025
89953c1
Update agent rules
steipete May 21, 2025
8449780
Update rules
steipete May 21, 2025
0b0f216
Add lots of tests
steipete May 21, 2025
31e268f
Migrate to Terminator
steipete May 21, 2025
d6e8aad
Delete AXspector
steipete May 22, 2025
a1c736b
Refactoring
steipete May 22, 2025
27d74f3
Integrate AXorcist as git submodule and rename ax to axorc
steipete May 22, 2025
88bd91f
Replace AXorcist git submodule with symlink for development
steipete May 22, 2025
ef32b62
Update rule
steipete May 22, 2025
ff05e5a
Remove AXorcist symlink and ignore axorc binary
steipete May 22, 2025
2ea32a5
Refine instructions
steipete May 22, 2025
734e083
Restore AXorcist symlink and add binary management docs
steipete May 22, 2025
2edfc3c
Update agent rules with AXorcist binary management docs
steipete May 22, 2025
5e5daf7
Update agent rules
steipete May 22, 2025
a12443c
Update agent and runner
steipete May 22, 2025
920baea
Transform README with robot theme and punchy headline
steipete May 23, 2025
43 changes: 43 additions & 0 deletions .cursor/rules/agent.mdc
@@ -1,3 +1,8 @@
---
description:
globs:
alwaysApply: false
---
# Agent Instructions

This file provides guidance to AI assistants when working with code in this repository.
@@ -22,6 +27,44 @@ The knowledge base (`knowledge_base/` directory) contains numerous Markdown file
- The actual script code is contained in the Markdown body in a fenced code block
- Scripts can use placeholders like `--MCP_INPUT:keyName` and `--MCP_ARG_N` for parameter substitution

## Generally useful for debug loops:
- The Claude Code tool is helpful for analyzing large logs or making complex file edits.
- Pipe scripts that output lots of logs into a file for easier reading.
- Use AppleScript to run apps such as Claude Code to test the MCP.
  (This requires that the MCP is set up correctly.)
- Whenever you want to ask the user something, ask Claude Code first instead.
- Use AppleScript to find the bundle identifier for an app name.

- To run any terminal command, use `osascript .cursor/scripts/terminator.scpt`.
  Call it without arguments to see its syntax.
  Call it with just your tag and it will return the log.

- read_file, write_file, move_file all need absolute paths!

- To run tests for AXorcist reliably, use `run_tests.sh`.

- To test the stdin feature of `axorc`, you MUST use `axorc/axorc_runner.sh`.

## AXorcist Binary Management

The `axorc/` directory contains:
- `axorc`: The main AXorcist binary (tracked in git)
- `AXorcist`: Symlink to `/Users/steipete/Projects/CodeLooper/AXorcist` (DO NOT REMOVE - needed for builds)
- `axorc_runner.sh`: Wrapper script that tries multiple binary locations

To rebuild the AXorcist binary:
1. Ensure the `AXorcist` symlink exists (points to CodeLooper project)
2. Copy the latest binary from CodeLooper: `cp /Users/steipete/Projects/CodeLooper/AXorcist/.build/debug/axorc axorc/axorc`
3. Also copy to expected location: `cp axorc/axorc axorc/AXorcist/.build/debug/axorc`

- Long-term, this project simply ships with a compiled binary of `axorc`.
  During development, the `AXorcist` directory is symlinked from another folder to simplify development across projects.

The symlink is ESSENTIAL for:
- Development workflow integration
- Build process access to source code
- Testing and debugging
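
The first two rebuild steps above (check the symlink, then copy the binary) can be sketched as a small guard script. This is only an illustration: `check_axorcist_link` and `refresh_axorc_binary` are hypothetical helper names that do not exist in the repository, and the directory layout is taken from the list above.

```bash
# Hypothetical helpers, not part of the repository.

# Returns 0 when DIR/AXorcist is a symlink whose target exists.
check_axorcist_link() {
  local dir="${1:-axorc}"
  [ -L "$dir/AXorcist" ] && [ -e "$dir/AXorcist" ]
}

# Copies a freshly built binary to DIR/axorc, refusing to run
# when the AXorcist symlink is missing or dangling.
refresh_axorc_binary() {
  local src="$1" dir="${2:-axorc}"
  if ! check_axorcist_link "$dir"; then
    echo "missing or broken symlink: $dir/AXorcist" >&2
    return 1
  fi
  cp "$src" "$dir/axorc"
}

# Example (path taken from step 2 above):
# refresh_axorc_binary /Users/steipete/Projects/CodeLooper/AXorcist/.build/debug/axorc axorc
```

Failing early on a missing symlink keeps a stale `axorc/axorc` from being silently overwritten in a checkout that cannot rebuild it.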

## Common Development Commands

```bash
# (remaining lines of this file are collapsed in the diff view)
```
138 changes: 138 additions & 0 deletions .cursor/rules/axorc.mdc
@@ -0,0 +1,138 @@
---
description:
globs:
alwaysApply: false
---
# macOS Accessibility (`axorc`) Command-Line Tool

This document outlines the functionality, build process, testing procedures, and technical details of the `axorc` Swift command-line utility, designed for interacting with the macOS Accessibility framework.

## 1. `axorc` Overview

* **Purpose**: Provides a JSON-based interface to query UI elements and perform actions using the macOS Accessibility API. It's intended to be called by other processes. The core Swift library `AXorcist` handles the accessibility interactions.
* **Communication**: `axorc` reads JSON commands (via direct argument, stdin, or file) and writes JSON responses (or errors) to `stdout`. Debug information often goes to `stderr`.
* **Core `AXorcist` Library Commands (exposed by `axorc` via `CommandType` enum in `ax/AXorcist/Sources/AXorcist/Core/Models.swift`)**:
    * `ping`: Checks if `axorc` is responsive.
    * `getFocusedElement`: Retrieves information about the currently focused UI element in a target application.
    * `query`: Retrieves information about specific UI element(s) matching locator criteria.
    * `getAttributes`: Retrieves specific attributes for element(s) matching locator criteria.
    * `describeElement`: Retrieves a comprehensive list of attributes for element(s) matching locator criteria.
    * `collectAll`: Retrieves information about all UI elements matching criteria within a scope.
    * `performAction`: Executes an action on a specified UI element.
    * `extractText`: Extracts textual content from specified UI element(s).
    * `batch`: Executes a sequence of sub-commands.
* **Key Input Fields (JSON - see `CommandEnvelope` in `ax/AXorcist/Sources/AXorcist/Core/Models.swift`)**:
    * `command_id` (string): A unique identifier for the command, echoed in the response.
    * `command` (string enum: `CommandType`): e.g., "ping", "getFocusedElement", "query", "getAttributes", "describeElement", "collectAll", "performAction", "extractText", "batch".
    * `application` (string, optional): Bundle ID (e.g., "com.apple.TextEdit") or localized name of the target application. If omitted, behavior might depend on the command (e.g., `getFocusedElement` might try the system-wide focused app).
    * `locator` (object, optional - see `Locator` in `ax/AXorcist/Sources/AXorcist/Core/Models.swift`): Specifies the target element(s) for commands like `query`, `getAttributes`, `describeElement`, `performAction`, `extractText`, `collectAll`.
        * `criteria` (object `[String: String]`): Key-value pairs of attributes to match (e.g., `{"AXRole": "AXWindow", "AXTitle":"My Window"}`).
        * `match_all` (boolean, optional): If true, all criteria must match. If false or omitted, any criterion matching is sufficient (behavior might vary by implementation).
        * `root_element_path_hint` (array of strings, optional): A path hint that locates a container element; the locator criteria are then applied starting from that element.
        * `requireAction` (string, optional): Filters results to elements supporting a specific action (e.g., "AXPress").
        * `computed_name_contains` (string, optional): Filters elements whose computed name (derived from title, value, etc.) contains the given string.
    * `attributes` (array of strings, optional): For commands like `getFocusedElement`, `query`, `getAttributes`, `collectAll`, specifies which attributes to retrieve. Defaults to a common set if omitted.
    * `path_hint` (array of strings, optional): A path to navigate the UI tree (e.g., `["window[0]", "button[AXTitle=OK]"]`) to find a target element or a base for the `locator`. (Exact path syntax may evolve).
    * `action_name` (string, optional): For the `performAction` command, the action to execute (e.g., "AXPress", "AXSetValue").
    * `action_value` (any, optional, via `AnyCodable`): For `performAction` with actions like "AXSetValue", this is the value to set (e.g., a string, number, boolean).
    * `sub_commands` (array of `CommandEnvelope` objects, optional): For the `batch` command, contains the sequence of commands to execute.
    * `max_elements` (int, optional): For `collectAll`, can limit the number of elements returned. Also used as max depth in some search operations.
    * `output_format` (string enum `OutputFormat`, optional): For attribute retrieval, can be "smart", "verbose", "text_content", "json_string". From `ax/AXorcist/Sources/AXorcist/Core/Models.swift`.
    * `debug_logging` (boolean, optional): If `true`, `axorc` and `AXorcist` include detailed internal debug logs in the response and/or stderr.
    * `payload` (object `[String: String]`, optional): Legacy field, primarily for `ping` compatibility to echo back simple data.
* **Key Output Fields (JSON - see response structs in `ax/AXorcist/Sources/axorc/axorc.swift` which wrap `AXorcist.HandlerResponse`)**:
    * All responses generally include `command_id` (string), `success` (boolean), and `debug_logs` (array of strings, optional).
    * `SimpleSuccessResponse` (for `ping`): Contains `status`, `message`, `details`.
    * `QueryResponse` (for `getFocusedElement`, `query`, `getAttributes`, `describeElement`, `collectAll`, `performAction`, `extractText`):
        * `command` (string): The original command type.
        * `data` (object `AXElementForEncoding`, optional): Contains the primary accessibility element data.
            * `attributes` (object `[String: AnyCodable]`): Dictionary of element attributes.
            * `path` (array of strings, optional): Path to the element.
        * `error` (object `ErrorDetail`, optional): Contains an error `message` if `success` is false.
    * `BatchOperationResponse` (for `batch`):
        * `results` (array of `QueryResponse` objects): One for each sub-command.
    * `ErrorResponse` (for input errors, decoding errors, or unhandled command types):
        * `error` (object `ErrorDetail`): Contains an error `message`.
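
To make the input fields above concrete, here is a minimal sketch that assembles a `query` envelope and a `batch` envelope as JSON text. The helper names `make_query` and `make_batch` are hypothetical and exist only for this illustration. The sketch assumes its arguments contain no characters that would need JSON escaping, and it uses only field names listed above.

```bash
# Hypothetical helpers for illustration; values are inserted verbatim,
# so they must not contain quotes, backslashes, or control characters.

# Print a `query` envelope that matches a single attribute criterion.
make_query() {
  local id="$1" app="$2" attr="$3" value="$4"
  printf '{"command_id":"%s","command":"query","application":"%s","locator":{"criteria":{"%s":"%s"}}}\n' \
    "$id" "$app" "$attr" "$value"
}

# Print a `batch` envelope wrapping already-built envelopes as sub_commands.
make_batch() {
  local id="$1"
  shift
  local joined="" cmd
  for cmd in "$@"; do
    joined="${joined:+$joined,}$cmd"   # comma-separate the sub-commands
  done
  printf '{"command_id":"%s","command":"batch","sub_commands":[%s]}\n' "$id" "$joined"
}

# Example:
# make_batch b1 "$(make_query q1 com.apple.TextEdit AXRole AXWindow)"
```

For values that may contain arbitrary text, a proper JSON encoder is the safer choice than `printf` templating.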

## 2. Functionality - How it Works

The `axorc` binary (`ax/AXorcist/Sources/axorc/main.swift`) is the command-line entry point. It parses input, decodes the JSON `CommandEnvelope`, and then calls methods on an instance of the `AXorcist` class (from `ax/AXorcist/Sources/AXorcist/AXorcist.swift`). The `AXorcist` library handles the core accessibility interactions.

* **`AXorcist` Library**:
    * Located in `ax/AXorcist/Sources/AXorcist/`.
    * `AXorcist.swift`: Contains the main class and handler methods for each command type (e.g., `handleGetFocusedElement`, `handleQuery`, `handlePerformAction`).
    * `Core/Models.swift`: Defines `CommandEnvelope`, `Locator`, `HandlerResponse`, `AXElement` (for data representation), `AnyCodable`, `OutputFormat`, etc.
    * `Core/Element.swift`: Defines `AXorcist.AXElement`, which is a wrapper around `AXUIElement` and is used internally by `AXorcist` and in `HandlerResponse.data`.
    * `Search/ElementSearch.swift`: Contains logic for finding UI elements based on locators, path hints, and criteria (e.g., depth-first search, attribute matching).
    * `Core/AccessibilityPermissions.swift`: Handles checking for necessary permissions.
    * `Core/ProcessUtils.swift`: Utilities for finding application PIDs.
    * Many functions interacting with `AXUIElement` are marked `@MainActor`.

* **Application Targeting**:
    * `AXorcist` uses `ProcessUtils.swift` to find the `pid_t` for a given application bundle ID or name.
    * `AXUIElementCreateApplication(pid)` gets the root application `AXUIElement`.

* **Element Location**:
    * Typically handled by methods in `AXorcist.swift` or `Search/ElementSearch.swift`.
    * Uses locators (`criteria`, `requireAction`, etc.) and `path_hint`.
    * Involves traversing the accessibility tree (e.g., an element's `kAXChildrenAttribute`).

* **Attribute Retrieval**:
    * `AXorcist`'s `getElementAttributes` (internal helper) fetches attributes for an `AXUIElement`.
    * Converts `CFTypeRef` values to Swift types, often using `AnyCodable` for the `attributes` dictionary in `AXorcist.AXElement`.
    * Handles `AXValue` types (like position/size).
    * May generate synthetic attributes like "ComputedName" or "AXActionNames".

* **Action Performing**:
    * `AXorcist` checks if an action is supported (e.g., via `kAXActionNamesAttribute`).
    * Uses `AXUIElementPerformAction` or `AXUIElementSetAttributeValue` (for "AXSetValue").

* **Error Handling**:
    * `AXorcist` handler methods return a `HandlerResponse` which includes an optional error string.
    * `axorc` wraps this into its JSON error structures.

* **Threading**:
    * Core Accessibility API calls are dispatched to the main actor (`@MainActor`) by `AXorcist`.

* **Debugging**:
    * Setting `debug_logging: true` in the input JSON enables verbose logging.
    * Logs are collected by `AXorcist` and passed back in `HandlerResponse.debug_logs`.
    * `axorc` includes these in its final JSON output's `debug_logs` field and may also print to `stderr` using `fputs`.
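
Because the JSON response is written to `stdout` while debug output may also be printed to `stderr`, a caller that parses the response will usually want to keep the two streams apart. The following is a minimal sketch of that idea; `run_split` is a hypothetical helper name, not something shipped with `axorc`.

```bash
# Hypothetical helper: run a command, writing its stdout (the JSON response)
# and its stderr (debug output) to two separate files.
# Usage: run_split OUT_FILE ERR_FILE COMMAND [ARGS...]
run_split() {
  local out_file="$1" err_file="$2"
  shift 2
  "$@" >"$out_file" 2>"$err_file"
}

# Example (binary path and JSON as used elsewhere in this document):
# run_split response.json debug.log ./.build/debug/axorc \
#   '{ "command_id": "ping1", "command": "ping", "debug_logging": true }'
```

The exit status of `run_split` is that of the wrapped command, so the caller can still detect a failed invocation before reading `response.json`.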

## 3. Build Process

* **Swift Package Manager**: `axorc` is built using SPM from the package in `ax/AXorcist/`.
    * `ax/AXorcist/Package.swift` defines the "axorc" executable product and the "AXorcist" library product.
* **Output**: The executable is typically found in `ax/AXorcist/.build/debug/axorc` or `ax/AXorcist/.build/release/axorc`.
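
As a sketch of the build step, the helpers below build the package in a chosen configuration and print the matching output path from the list above. `axorc_bin_path` and `build_axorc` are hypothetical names, and the sketch assumes it is run from the repository root with the Swift toolchain installed.

```bash
# Hypothetical helpers; paths follow the "Output" bullet above.

# Print the expected location of the axorc executable for a configuration
# ("debug" by default, or "release"), relative to the repository root.
axorc_bin_path() {
  local config="${1:-debug}"
  printf 'ax/AXorcist/.build/%s/axorc\n' "$config"
}

# Build the package in the given configuration; on success, print the binary path.
build_axorc() {
  local config="${1:-debug}"
  swift build --package-path ax/AXorcist -c "$config" && axorc_bin_path "$config"
}

# Example:
# bin="$(build_axorc release)" && "$bin" '{ "command_id":"ping1", "command":"ping" }'
```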

## 4. Running & Testing

* **Direct Execution**:
    ```bash
    cd /path/to/your/project/ax/AXorcist/
    swift build # if not already built
    ./.build/debug/axorc '{ "command_id":"ping1", "command":"ping" }'
    ```
* **Via `terminator.scpt` (Example for consistency)**:
    It is recommended to use a consistent tag (e.g., "axorc_ops") when using `terminator.scpt` to reuse the same terminal window/tab.
    ```bash
    # First command with a new tag (establishes session, cds, runs command)
    osascript /path/to/.cursor/scripts/terminator.scpt "axorc_ops" "cd /Users/steipete/Projects/macos-automator-mcp/ax/AXorcist/ && ./.build/debug/axorc --debug '{ \"command_id\": \"claude-ping\", \"command\": \"ping\" }'"

    # Subsequent commands with the same tag
    osascript /path/to/.cursor/scripts/terminator.scpt "axorc_ops" ".build/debug/axorc '{ \"command_id\": \"claude-getfocused\", \"command\": \"getFocusedElement\", \"application\": \"com.anthropic.claudefordesktop\" }'"
    ```
* **Input Methods for `axorc`**:
    * Direct argument (last argument on the command line, must be valid JSON).
    * `--stdin`: Reads JSON from standard input.
    * `--file /path/to/file.json`: Reads JSON from a specified file.
* **Permissions**: The process executing `axorc` (e.g., Terminal, or your calling application) **must** have "Accessibility" permissions in "System Settings > Privacy & Security > Accessibility". `AXorcist` calls `AccessibilityPermissions.checkAccessibilityPermissions()` on startup.
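
The three input methods listed above can be compared side by side with a small sketch. The function names `send_direct`, `send_stdin`, and `send_file` are hypothetical, and the `AXORC` variable is an assumption made here so the binary path can be overridden. Only the `--stdin` and `--file` flags come from the list above.

```bash
# Path to the binary; override AXORC to point at a different build or wrapper.
AXORC="${AXORC:-./.build/debug/axorc}"

# 1. JSON passed as the last command-line argument.
send_direct() {
  "$AXORC" "$1"
}

# 2. JSON piped through standard input.
send_stdin() {
  printf '%s' "$1" | "$AXORC" --stdin
}

# 3. JSON written to a temporary file and passed via --file.
send_file() {
  local tmp status
  tmp="$(mktemp)" || return 1
  printf '%s' "$1" >"$tmp"
  "$AXORC" --file "$tmp"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example:
# send_stdin '{ "command_id": "ping1", "command": "ping" }'
```

Note that the agent rules in this pull request require `axorc/axorc_runner.sh` when testing the stdin feature, so for that case `AXORC` would point at the runner script instead of the raw binary.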

## 5. macOS Accessibility (AX) Intricacies

* **Frameworks**: `ApplicationServices` (for C APIs like `AXUIElement...`), `AppKit` (for `NSRunningApplication`).
* **`AXUIElement`**: The core C type representing an accessible UI element.
* **Attributes & `CFTypeRef`**: Values are `CFTypeRef`. Handled by `AXorcist.AnyCodable` for JSON serialization.
* **Tooling**: **Accessibility Inspector** (Xcode > Open Developer Tool) is vital for inspecting UI elements and their properties.

This document reflects the structure and functionality of the `axorc` tool and its underlying `AXorcist` library.