diff --git a/docs/superpowers/specs/2026-04-09-wayland-compositor-event-interface-design.md b/docs/superpowers/specs/2026-04-09-wayland-compositor-event-interface-design.md
new file mode 100644
index 0000000..ab02403
--- /dev/null
+++ b/docs/superpowers/specs/2026-04-09-wayland-compositor-event-interface-design.md
@@ -0,0 +1,355 @@
+# Wayland Compositor Event Interface For Churn Reduction
+
+## Summary
+
+Revise Cthulhu's event interface so Wayland session state can reduce AT-SPI churn before it reaches the current object-event pipeline. Keep `AT-SPI` authoritative for object semantics, text, selection, and actions. Add a small compositor-state normalization layer that is authoritative for desktop context such as active workspace, active top-level window, and focus routing during transitions.
+
+This is not a Newton-style redesign. It is a conservative performance-oriented split intended to reduce queue growth, redundant script activation, and processing of stale events on Wayland systems while preserving existing accessibility semantics.
+
+## Current State
+
+- Cthulhu's main event flow is still centered on `Atspi.EventListener` and a single `event_manager` queue in `src/cthulhu/event_manager.py`.
+- The current design already spends significant effort filtering spam, handling floods and deluges, recovering focus context, and pruning duplicate events.
+- `src/cthulhu/input_event_manager.py` already uses `Atspi.Device` for keyboard handling and pointer monitoring wrappers, so Cthulhu already has one narrow example of a side-channel interface on Wayland.
+- `src/cthulhu/focus_manager.py` and `src/cthulhu/cthulhu_state.py` still derive active-window and focus truth primarily from AT-SPI objects and events.
+- `src/cthulhu/wnck_support.py` correctly gates `Wnck` to X11-only use, which is a useful precedent for runtime capability-based backend selection.
+- Recent work in this repository has improved individual Wayland features, but there is not yet a compositor-agnostic interface for desktop state that can suppress irrelevant object churn before it enters the hot path.
+
+## Goals
+
+- Reduce AT-SPI event churn on Wayland systems before it reaches the main queue.
+- Keep the design generic and compositor-agnostic at the Cthulhu interface level.
+- Preserve current `AT-SPI` object semantics and script behavior as much as possible.
+- Improve prioritization of focus and active-window related work during workspace and window transitions.
+- Provide a single internal contract that can support Mutter, KWin, and wlroots-based compositors without teaching the rest of Cthulhu about compositor brands.
+- Fail safe to current behavior when compositor state is missing, incomplete, or inconsistent.
+
+## Non-Goals
+
+- No Newton-style compositor-authoritative accessibility tree transport.
+- No replacement of AT-SPI for object semantics, text, selection, caret movement, or actions.
+- No dependency on `niri` IPC, GNOME Shell extensions, or KWin scripts in the core architecture for phase one.
+- No requirement that every supported compositor expose the same amount of state.
+- No broad rewrite of script listeners or script modules in this pass.
+
+## Approaches Considered
+
+### 1. Direct Wayland Integration Inside `event_manager`
+
+Teach `event_manager` to consume shared Wayland state directly and add compositor-specific checks inside its current logic.
+
+Pros:
+
+- smallest code footprint up front
+- no new internal abstraction layer
+
+Cons:
+
+- quickly leaks compositor-specific capability checks into the hottest code path
+- makes Mutter, KWin, and wlroots differences part of `event_manager` logic
+- harder to test in isolation
+
+### 2. Thin Compositor-State Adapter Layer
+
+Introduce a small internal normalization layer that consumes shared compositor-facing signals and emits a compact generic event vocabulary to `event_manager`.
+
+Pros:
+
+- keeps compositor-specific capability detection out of the hot path
+- gives `event_manager` one stable interface to consume
+- matches the long-term direction of consuming normalized state below the main screen reader logic
+- easy to test with mocked backends
+
+Cons:
+
+- adds one more internal subsystem
+- requires careful definition of authority boundaries to avoid duplication with AT-SPI
+
+### 3. Full Compositor-Authoritative Routing
+
+Make compositor state the primary truth for most event routing and use AT-SPI only for fine-grained object semantics.
+
+Pros:
+
+- highest potential performance ceiling
+- most direct path toward a future compositor-led architecture
+
+Cons:
+
+- much higher correctness risk
+- too large a shift for a churn-focused first pass
+- would require significantly more backend-specific coverage and recovery logic
+
+## Recommendation
+
+Implement approach 2.
+
+For phase one, Cthulhu should add a thin compositor-state adapter layer. It should be authoritative for desktop context and transition hints, but not for accessible object semantics. This gives Cthulhu a generic way to suppress or reprioritize AT-SPI noise without requiring a full architecture rewrite.
+
+## Design
+
+### Authority Split
+
+The revised interface should split responsibility cleanly:
+
+- compositor-state adapter is authoritative for:
+  - active desktop context
+  - active workspace set
+  - active top-level window identity when available
+  - desktop transition start and end
+  - focus-routing hints during compositor-driven changes
+- AT-SPI remains authoritative for:
+  - focused accessible object
+  - accessible roles, names, and states
+  - text, caret, and selection semantics
+  - actionable objects and accessibility events consumed by scripts
+
+Inference from upstream: current Orca remains AT-SPI-authoritative in its main pipeline, while newer GNOME accessibility work moves normalization below Orca rather than throughout it. This design follows that same direction without depending on Newton itself.
+
+### New Internal Boundary
+
+Add a new internal interface named `CompositorStateAdapter`.
+
+Responsibilities:
+
+- detect and activate the best available compositor-state backend at runtime
+- normalize raw backend signals into a small generic event vocabulary
+- maintain a current desktop-context snapshot
+- emit state deltas and control hints to `event_manager`
+- degrade to no-op behavior when capabilities are absent or unclear
+
+This adapter should not expose compositor-specific event names or objects outside its own implementation.
+
+### Normalized Event Vocabulary
+
+The adapter should emit two families of signals.
+
+State deltas:
+
+- `workspace_state_changed`
+- `desktop_focus_context_changed`
+- `desktop_transition_started`
+- `desktop_transition_finished`
+
+Control hints:
+
+- `pause_atspi_churn`
+- `resume_atspi_churn`
+- `prioritize_focus`
+- `deprioritize_context`
+- `flush_stale_atspi_events`
+
+This vocabulary is intentionally small. It exists to shape queueing and prioritization, not to mirror every compositor event.
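One way the two signal families and the adapter boundary might be expressed is sketched below. The signal names come from the lists above; the `StateDelta` and `ControlHint` enum names, the `connect` and `emit` methods, and the listener callback shape are assumptions made for illustration, not part of the spec.

```python
from enum import Enum
from typing import Callable, List, Union


class StateDelta(Enum):
    """State deltas named in the spec."""
    WORKSPACE_STATE_CHANGED = "workspace_state_changed"
    DESKTOP_FOCUS_CONTEXT_CHANGED = "desktop_focus_context_changed"
    DESKTOP_TRANSITION_STARTED = "desktop_transition_started"
    DESKTOP_TRANSITION_FINISHED = "desktop_transition_finished"


class ControlHint(Enum):
    """Control hints named in the spec."""
    PAUSE_ATSPI_CHURN = "pause_atspi_churn"
    RESUME_ATSPI_CHURN = "resume_atspi_churn"
    PRIORITIZE_FOCUS = "prioritize_focus"
    DEPRIORITIZE_CONTEXT = "deprioritize_context"
    FLUSH_STALE_ATSPI_EVENTS = "flush_stale_atspi_events"


AdapterSignal = Union[StateDelta, ControlHint]
Listener = Callable[[AdapterSignal], None]


class CompositorStateAdapter:
    """Hypothetical shell: fans normalized signals out to listeners.

    Only the generic vocabulary crosses this boundary; no backend
    objects or compositor-specific names are passed to listeners.
    """

    def __init__(self) -> None:
        self._listeners: List[Listener] = []

    def connect(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def emit(self, signal: AdapterSignal) -> None:
        for listener in self._listeners:
            listener(signal)


# Usage: a stand-in for event_manager simply records what it receives.
received: List[AdapterSignal] = []
adapter = CompositorStateAdapter()
adapter.connect(received.append)
adapter.emit(StateDelta.DESKTOP_TRANSITION_STARTED)
adapter.emit(ControlHint.PAUSE_ATSPI_CHURN)
```

Keeping the two families as separate enums lets a consumer tell "what changed" apart from "what to do about it" by type alone.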
+
+### Desktop Context Snapshot
+
+The adapter should maintain a compact snapshot object with fields along these lines:
+
+- `session_type`
+- `backend_name`
+- `active_workspace_ids`
+- `active_window_token`
+- `focus_route_token`
+- `transition_active`
+- `timestamp`
+
+`active_window_token` and `focus_route_token` are intentionally generic. They should be comparable identifiers, not compositor-native objects leaking into the rest of Cthulhu.
+
+### Backend Selection
+
+Phase one should use capability-driven backend selection with generic shared interfaces first.
+
+Preferred order:
+
+1. `WaylandSharedProtocolsBackend`
+2. `AtspiContextBackend`
+3. `NullBackend`
+
+#### `WaylandSharedProtocolsBackend`
+
+This backend should consume shared Wayland-facing protocols where available and normalize them for Cthulhu. The first protocol target should be `ext_workspace_v1`.
+
+Reasoning:
+
+- `ext_workspace_v1` provides workspace groups, workspaces, active state, and atomic `done` notifications, which is exactly the sort of low-volume desktop-state signal Cthulhu needs to reason about transitions.
+- As of April 9, 2026, Wayland Explorer lists `ext_workspace_v1` support for Mutter `49.2`, KWin `6.6`, and niri `25.11`, making it a good cross-family starting point.
+
+This backend should only expose normalized state to the rest of Cthulhu. It should not expose protocol objects or protocol-specific state transitions outside the backend.
+
+#### `AtspiContextBackend`
+
+This backend should use current AT-SPI-based context recovery when shared Wayland state is missing or insufficient. It does not improve churn by itself, but it preserves current behavior and keeps the adapter contract usable everywhere.
+
+#### `NullBackend`
+
+This backend should emit no compositor hints and leave the rest of Cthulhu in its current behavior. It is the fail-safe path.
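The snapshot fields and the preferred backend order above could be sketched roughly as follows. The field types, the `is_available` method, the `select_backend` helper, and the use of a plain capability set in place of real Wayland global discovery are all assumptions for illustration; the spec fixes only the field names, the backend names, and their order.

```python
import time
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class DesktopContextSnapshot:
    """Immutable snapshot; tokens are plain comparable strings (assumed)."""
    session_type: str
    backend_name: str
    active_workspace_ids: FrozenSet[str] = frozenset()
    active_window_token: Optional[str] = None
    focus_route_token: Optional[str] = None
    transition_active: bool = False
    timestamp: float = field(default_factory=time.monotonic)


class Backend:
    name = "base"

    def is_available(self) -> bool:
        raise NotImplementedError


class WaylandSharedProtocolsBackend(Backend):
    name = "wayland-shared-protocols"

    def __init__(self, capabilities: Iterable[str]) -> None:
        # Stand-in for real protocol discovery, which is out of scope here.
        self._capabilities = set(capabilities)

    def is_available(self) -> bool:
        return "ext_workspace_v1" in self._capabilities


class AtspiContextBackend(Backend):
    name = "atspi-context"

    def __init__(self, atspi_usable: bool) -> None:
        self._atspi_usable = atspi_usable

    def is_available(self) -> bool:
        return self._atspi_usable


class NullBackend(Backend):
    name = "null"

    def is_available(self) -> bool:
        return True


def select_backend(candidates: Iterable[Backend]) -> Backend:
    """Return the first available backend, failing safe to NullBackend."""
    for backend in candidates:
        try:
            if backend.is_available():
                return backend
        except Exception:
            # An uncertain backend is treated as unavailable.
            continue
    return NullBackend()


# Usage: no shared protocol is advertised, so the AT-SPI fallback is chosen.
chosen = select_backend(
    [WaylandSharedProtocolsBackend(set()), AtspiContextBackend(True), NullBackend()]
)
snapshot = DesktopContextSnapshot(session_type="wayland", backend_name=chosen.name)
```

Treating a probe that raises as "unavailable" keeps selection consistent with the fail-safe goal: uncertainty never selects a more capable backend.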
+
+### Explicit Exclusions For Phase One
+
+The following are not part of the core architecture in phase one:
+
+- `niri` IPC as a first-class public interface
+- compositor-specific `D-Bus` integrations
+- GNOME Shell extension event streams
+- KWin scripting APIs
+
+These may become optional backend implementations in the future if a shared protocol proves insufficient, but they must remain behind the same generic adapter contract.
+
+### Event Flow
+
+The adapter should sit ahead of the current `event_manager`, not replace it.
+
+Proposed flow:
+
+1. backend receives shared compositor-facing state changes
+2. adapter updates its desktop-context snapshot
+3. adapter emits normalized state deltas and control hints
+4. `event_manager` updates queueing, prioritization, and obsolescence decisions
+5. AT-SPI object events continue to provide object-level truth within the selected desktop context
+
+This allows Cthulhu to reduce irrelevant work before scripts interpret it, while keeping existing AT-SPI semantics intact.
+
+### `event_manager` Integration
+
+`event_manager` should gain a separate notion of desktop-context state in addition to its existing AT-SPI queue.
+
+New responsibilities:
+
+- track whether churn suppression is active
+- track the currently prioritized desktop context
+- reject queued AT-SPI work that became stale because the compositor already moved to a new context
+- prefer script activation and focus recovery work that matches the current desktop context
+
+The existing queue does not need to be replaced, but it does need one new concept: context obsolescence.
+
+### Context Obsolescence
+
+Add a new pruning rule: an event can be obsolete not only because a newer event of the same type exists, but because the desktop context that made the event relevant no longer exists.
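The pruning rule above can be sketched as a single pass over the queue. This is a minimal illustration, assuming each queued event can be tagged with the context token it was queued under; `QueuedEvent`, `context_token`, and `prune_obsolete` are hypothetical names, and how a real AT-SPI event would be mapped to a token is left open.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional, Tuple


@dataclass(frozen=True)
class QueuedEvent:
    event_type: str
    # Token of the desktop context the event was queued under.
    # None means the context could not be determined.
    context_token: Optional[str]


def prune_obsolete(
    queue: Deque[QueuedEvent], current_token: Optional[str]
) -> Tuple[Deque[QueuedEvent], int]:
    """Drop events whose context no longer matches the current one.

    Fails safe: if either token is unknown, the event is kept.
    Returns the surviving queue and the number of dropped events.
    """
    kept: Deque[QueuedEvent] = deque()
    dropped = 0
    for event in queue:
        stale = (
            current_token is not None
            and event.context_token is not None
            and event.context_token != current_token
        )
        if stale:
            dropped += 1
        else:
            kept.append(event)
    return kept, dropped


# Usage: the compositor has already moved to "win-b".
pending = deque(
    [
        QueuedEvent("object:children-changed:add", "win-a"),
        QueuedEvent("object:state-changed:focused", "win-b"),
        QueuedEvent("object:state-changed:showing", None),
    ]
)
kept, dropped = prune_obsolete(pending, "win-b")
```

Returning the drop count alongside the surviving queue gives the caller something concrete to log for each pruning decision.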
+
+Examples:
+
+- a large `children-changed` burst from a window on a workspace that is no longer active
+- queued `showing` or `name-change` events from a window that just lost compositor priority
+- stale background application updates that were queued before a workspace switch finished
+
+This is the primary performance win of the new interface. It lets Cthulhu discard irrelevant work based on context freshness, not only on event-type heuristics.
+
+### Churn Suppression Mode
+
+When the adapter emits `pause_atspi_churn`, `event_manager` should enter a guarded suppression mode.
+
+This is not a full stop. It is a mode where:
+
+- high-value focus and activation events are always preserved
+- low-value background churn is collapsed, deprioritized, or dropped
+- stale work from the prior desktop context is pruned aggressively
+
+Events that should still be preserved during suppression:
+
+- `window:*`
+- `object:state-changed:focused` with `detail1=true`
+- `object:state-changed:active` on frames and windows
+- `object:text-selection-changed`
+- `object:selection-changed` for the prioritized context
+- user-trigger-correlated events associated with the last input event
+
+Events that should usually be suppressed, collapsed, or deprioritized:
+
+- `object:children-changed:*`
+- `object:state-changed:showing`
+- `object:state-changed:sensitive`
+- background `object:property-change:accessible-name`
+- bulk `object:text-changed:*`
+- repeated `object:text-caret-moved` from non-priority contexts
+
+### Transition Handling
+
+When the adapter indicates a desktop transition:
+
+- `desktop_transition_started`
+  - adapter marks `transition_active=True`
+  - adapter emits `pause_atspi_churn`
+- `desktop_focus_context_changed`
+  - `event_manager` updates the preferred app/window context before AT-SPI focus settles
+  - focus-related events for the new context get priority
+- `desktop_transition_finished`
+  - adapter emits `resume_atspi_churn`
+  - adapter emits `flush_stale_atspi_events`
+  - `event_manager` drops queued events that belong to the old context and resumes normal flow
+
+This allows Cthulhu to be more decisive during workspace switches, app switches, and transient window transitions without needing compositor-authoritative object trees.
+
+### Script Activation
+
+`event_manager` should use the adapter's prioritized desktop context as an additional input when deciding whether to activate a script.
+
+It should not activate a script purely because the compositor hinted at it, but it should:
+
+- prioritize AT-SPI events from the compositor-indicated context
+- deprioritize or skip activation from obviously stale background contexts
+- improve focus recovery when AT-SPI active-window truth is noisy or delayed
+
+This should reduce cases where script activation thrashes between background and foreground apps during Wayland transitions.
+
+### Debugging And Observability
+
+Add explicit logs around the new boundary:
+
+- backend chosen and why
+- normalized adapter signal emitted
+- transition suppression entered and left
+- events dropped due to context obsolescence
+- events preserved during suppression and why
+
+These logs should make it possible to explain every major pruning decision in the same way current flood and ignore logic can be debugged.
+
+## Testing
+
+### Automated
+
+Add targeted tests for:
+
+- backend capability detection and fallback order
+- normalization of backend signals into adapter events
+- transition start and finish behavior
+- churn suppression activation and release
+- context-obsolescence pruning of queued AT-SPI events
+- preservation of focus and selection events during suppression
+- script activation preferring compositor-indicated context when AT-SPI is noisy
+- fail-safe fallback to current behavior when the adapter is uncertain
+
+These tests should be written with mocks and fake backend signals. No real compositor should be required for unit coverage.
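A compositor-free unit test along these lines might look like the sketch below. `FakeEventManager`, `FakeAdapter`, and `run_sketch_tests` are hypothetical stand-ins, not existing Cthulhu classes. The preserve list is a simplification of the suppression rules in this spec: it matches on event-type prefix only, ignores `detail1` and the prioritized-context qualifiers, and drops low-value events rather than collapsing or deprioritizing them.

```python
import unittest

# Simplified preserve list (prefix match only; an assumption of this sketch).
PRESERVED_PREFIXES = (
    "window:",
    "object:state-changed:focused",
    "object:state-changed:active",
    "object:text-selection-changed",
    "object:selection-changed",
)


class FakeEventManager:
    """Stand-in for event_manager: tracks suppression and a plain queue."""

    def __init__(self):
        self.suppressing = False
        self.flush_requests = 0
        self.queue = []

    def on_hint(self, hint):
        if hint == "pause_atspi_churn":
            self.suppressing = True
        elif hint == "resume_atspi_churn":
            self.suppressing = False
        elif hint == "flush_stale_atspi_events":
            self.flush_requests += 1

    def enqueue(self, event_type):
        """Return True if the event was queued, False if it was dropped."""
        if self.suppressing and not event_type.startswith(PRESERVED_PREFIXES):
            return False
        self.queue.append(event_type)
        return True


class FakeAdapter:
    """Replays the transition sequence without any real compositor."""

    def __init__(self, manager):
        self.manager = manager
        self.transition_active = False

    def transition_started(self):
        self.transition_active = True
        self.manager.on_hint("pause_atspi_churn")

    def transition_finished(self):
        self.transition_active = False
        self.manager.on_hint("resume_atspi_churn")
        self.manager.on_hint("flush_stale_atspi_events")


class TransitionSuppressionTests(unittest.TestCase):
    def setUp(self):
        self.manager = FakeEventManager()
        self.adapter = FakeAdapter(self.manager)

    def test_focus_event_survives_suppression(self):
        self.adapter.transition_started()
        self.assertTrue(self.manager.enqueue("object:state-changed:focused"))

    def test_background_churn_is_dropped_during_suppression(self):
        self.adapter.transition_started()
        self.assertFalse(self.manager.enqueue("object:children-changed:add"))
        self.assertEqual(self.manager.queue, [])

    def test_finish_resumes_and_requests_flush(self):
        self.adapter.transition_started()
        self.adapter.transition_finished()
        self.assertFalse(self.manager.suppressing)
        self.assertEqual(self.manager.flush_requests, 1)
        self.assertTrue(self.manager.enqueue("object:children-changed:add"))


def run_sketch_tests():
    """Run the sketch suite and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        TransitionSuppressionTests
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the fakes exchange only hint strings, the same tests would apply to any backend that honors the adapter contract.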
+
+### Manual
+
+Manual validation should focus on:
+
+- workspace switches on Mutter, KWin, and niri
+- fast application switching
+- opening and dismissing transient dialogs and menus
+- noisy web or Steam-like scenarios where background AT-SPI churn is high
+- confirmation that speech follows real focused objects once AT-SPI settles
+- confirmation that no important notifications or menu interactions are lost during suppression
+
+## Risks
+
+- Shared Wayland protocols may provide enough workspace truth but not enough top-level detail on every compositor.
+- Compositor state may lead AT-SPI focus briefly, which could cause over-eager prioritization if the suppression policy is too aggressive.
+- Some legitimate background AT-SPI events may look like churn if the rules are too broad.
+- Adding a new boundary introduces state-synchronization bugs if adapter and event queue state diverge.
+
+## Risk Management
+
+- keep suppression guarded rather than absolute
+- keep AT-SPI authoritative for object semantics
+- make context obsolescence explicit and debuggable
+- fall back to current behavior whenever backend certainty is too low
+- avoid compositor-specific backends in phase one unless a shared protocol is clearly insufficient
+
+## Recommendation
+
+Implement a small generic `CompositorStateAdapter` and teach `event_manager` to consume its normalized desktop-state signals and churn-control hints.
+
+Do not make `niri` IPC part of the core contract.
+Do not attempt a compositor-authoritative accessibility architecture in this pass.
+Do not broaden the first implementation beyond shared protocol detection, AT-SPI fallback, context-based pruning, and guarded churn suppression.