Add Wayland compositor event interface design spec

2026-04-09 06:14:03 -04:00
parent cc5adf4cce
commit 67abceda9a

@@ -0,0 +1,355 @@
# Wayland Compositor Event Interface For Churn Reduction
## Summary
Revise Cthulhu's event interface so Wayland session state can reduce AT-SPI churn before it reaches the current object-event pipeline. Keep `AT-SPI` authoritative for object semantics, text, selection, and actions. Add a small compositor-state normalization layer that is authoritative for desktop context such as active workspace, active top-level window, and focus routing during transitions.
This is not a Newton-style redesign. It is a conservative, performance-oriented split intended to reduce queue growth, redundant script activation, and processing of stale events on Wayland systems while preserving existing accessibility semantics.
## Current State
- Cthulhu's main event flow is still centered on `Atspi.EventListener` and a single `event_manager` queue in `src/cthulhu/event_manager.py`.
- The current design already spends significant effort filtering spam, handling floods and deluges, recovering focus context, and pruning duplicate events.
- `src/cthulhu/input_event_manager.py` already uses `Atspi.Device` for keyboard handling and pointer monitoring wrappers, so Cthulhu already has one narrow example of a side-channel interface on Wayland.
- `src/cthulhu/focus_manager.py` and `src/cthulhu/cthulhu_state.py` still derive active-window and focus truth primarily from AT-SPI objects and events.
- `src/cthulhu/wnck_support.py` correctly gates `Wnck` to X11-only use, which is a useful precedent for runtime capability-based backend selection.
- Recent work in this repository has improved individual Wayland features, but there is not yet a compositor-agnostic interface for desktop state that can suppress irrelevant object churn before it enters the hot path.
## Goals
- Reduce AT-SPI event churn on Wayland systems before it reaches the main queue.
- Keep the design generic and compositor-agnostic at the Cthulhu interface level.
- Preserve current `AT-SPI` object semantics and script behavior as much as possible.
- Improve prioritization of focus and active-window related work during workspace and window transitions.
- Provide a single internal contract that can support Mutter, KWin, and wlroots-based compositors without teaching the rest of Cthulhu about compositor brands.
- Fail safe to current behavior when compositor state is missing, incomplete, or inconsistent.
## Non-Goals
- No Newton-style compositor-authoritative accessibility tree transport.
- No replacement of AT-SPI for object semantics, text, selection, caret movement, or actions.
- No dependency on `niri` IPC, GNOME Shell extensions, or KWin scripts in the core architecture for phase one.
- No requirement that every supported compositor expose the same amount of state.
- No broad rewrite of script listeners or script modules in this pass.
## Approaches Considered
### 1. Direct Wayland Integration Inside `event_manager`
Teach `event_manager` to consume shared Wayland state directly and add compositor-specific checks inside its current logic.
Pros:
- smallest code footprint up front
- no new internal abstraction layer
Cons:
- quickly leaks compositor-specific capability checks into the hottest code path
- makes Mutter, KWin, and wlroots differences part of `event_manager` logic
- harder to test in isolation
### 2. Thin Compositor-State Adapter Layer
Introduce a small internal normalization layer that consumes shared compositor-facing signals and emits a compact generic event vocabulary to `event_manager`.
Pros:
- keeps compositor-specific capability detection out of the hot path
- gives `event_manager` one stable interface to consume
- matches the long-term direction of consuming normalized state below the main screen reader logic
- easy to test with mocked backends
Cons:
- adds one more internal subsystem
- requires careful definition of authority boundaries to avoid duplication with AT-SPI
### 3. Full Compositor-Authoritative Routing
Make compositor state the primary truth for most event routing and use AT-SPI only for fine-grained object semantics.
Pros:
- highest potential performance ceiling
- most direct path toward a future compositor-led architecture
Cons:
- much higher correctness risk
- too large a shift for a churn-focused first pass
- would require significantly more backend-specific coverage and recovery logic
## Recommendation
Implement approach 2.
For phase one, Cthulhu should add a thin compositor-state adapter layer. It should be authoritative for desktop context and transition hints, but not for accessible object semantics. This gives Cthulhu a generic way to suppress or reprioritize AT-SPI noise without requiring a full architecture rewrite.
## Design
### Authority Split
The revised interface should split responsibility cleanly:
- compositor-state adapter is authoritative for:
  - active desktop context
  - active workspace set
  - active top-level window identity when available
  - desktop transition start and end
  - focus-routing hints during compositor-driven changes
- AT-SPI remains authoritative for:
  - focused accessible object
  - accessible roles, names, and states
  - text, caret, and selection semantics
  - actionable objects and accessibility events consumed by scripts
Inference from upstream: current Orca remains AT-SPI-authoritative in its main pipeline, while newer GNOME accessibility work moves normalization below Orca rather than throughout it. This design follows that same direction without depending on Newton itself.
### New Internal Boundary
Add a new internal interface named `CompositorStateAdapter`.
Responsibilities:
- detect and activate the best available compositor-state backend at runtime
- normalize raw backend signals into a small generic event vocabulary
- maintain a current desktop-context snapshot
- emit state deltas and control hints to `event_manager`
- degrade to no-op behavior when capabilities are absent or unclear
This adapter should not expose compositor-specific event names or objects outside its own implementation.
### Normalized Event Vocabulary
The adapter should emit two families of signals.
State deltas:
- `workspace_state_changed`
- `desktop_focus_context_changed`
- `desktop_transition_started`
- `desktop_transition_finished`
Control hints:
- `pause_atspi_churn`
- `resume_atspi_churn`
- `prioritize_focus`
- `deprioritize_context`
- `flush_stale_atspi_events`
This vocabulary is intentionally small. It exists to shape queueing and prioritization, not to mirror every compositor event.
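
The two signal families above could be captured as a pair of enumerations. This is only a minimal sketch: the class names `StateDelta` and `ControlHint` and the helper `parse_signal` are hypothetical and not part of the spec; only the signal names themselves come from the lists above.

```python
from enum import Enum


class StateDelta(Enum):
    """State changes the adapter reports about the desktop (names from the spec)."""
    WORKSPACE_STATE_CHANGED = "workspace_state_changed"
    DESKTOP_FOCUS_CONTEXT_CHANGED = "desktop_focus_context_changed"
    DESKTOP_TRANSITION_STARTED = "desktop_transition_started"
    DESKTOP_TRANSITION_FINISHED = "desktop_transition_finished"


class ControlHint(Enum):
    """Queue-shaping hints the adapter sends to event_manager (names from the spec)."""
    PAUSE_ATSPI_CHURN = "pause_atspi_churn"
    RESUME_ATSPI_CHURN = "resume_atspi_churn"
    PRIORITIZE_FOCUS = "prioritize_focus"
    DEPRIORITIZE_CONTEXT = "deprioritize_context"
    FLUSH_STALE_ATSPI_EVENTS = "flush_stale_atspi_events"


def parse_signal(name: str):
    """Hypothetical helper: map a signal name to its enum member.

    Raises ValueError for names outside the vocabulary, so a backend
    cannot leak compositor-specific event names past the adapter.
    """
    for family in (StateDelta, ControlHint):
        try:
            return family(name)
        except ValueError:
            continue
    raise ValueError(f"unknown adapter signal: {name}")
```

Keeping the vocabulary closed in this way would make any attempt to add a compositor-specific signal an explicit change to the adapter contract rather than an accident.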
### Desktop Context Snapshot
The adapter should maintain a compact snapshot object with fields along these lines:
- `session_type`
- `backend_name`
- `active_workspace_ids`
- `active_window_token`
- `focus_route_token`
- `transition_active`
- `timestamp`
`active_window_token` and `focus_route_token` are intentionally generic. They should be comparable identifiers, not compositor-native objects leaking into the rest of Cthulhu.
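
One possible shape for the snapshot is an immutable dataclass. The field names come from the list above; the field types, the defaults, and the `changed_fields` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, fields, replace
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class DesktopContextSnapshot:
    """Compact, compositor-neutral view of the desktop (types are assumed)."""
    session_type: str
    backend_name: str
    active_workspace_ids: FrozenSet[str] = frozenset()
    active_window_token: Optional[str] = None   # opaque, comparable identifier
    focus_route_token: Optional[str] = None     # opaque, comparable identifier
    transition_active: bool = False
    timestamp: float = 0.0


def changed_fields(old: DesktopContextSnapshot, new: DesktopContextSnapshot) -> List[str]:
    """Hypothetical helper: names of fields that differ, ignoring the timestamp."""
    return sorted(
        f.name
        for f in fields(old)
        if f.name != "timestamp" and getattr(old, f.name) != getattr(new, f.name)
    )


# Usage: derive the next snapshot instead of mutating the current one.
before = DesktopContextSnapshot(session_type="wayland", backend_name="null")
after = replace(before, active_window_token="win-2", timestamp=1.0)
delta = changed_fields(before, after)
```

Using plain strings for the tokens keeps them comparable and hashable without exposing any compositor-native object.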
### Backend Selection
Phase one should use capability-driven backend selection with generic shared interfaces first.
Preferred order:
1. `WaylandSharedProtocolsBackend`
2. `AtspiContextBackend`
3. `NullBackend`
#### `WaylandSharedProtocolsBackend`
This backend should consume shared Wayland-facing protocols where available and normalize them for Cthulhu. The first protocol target should be `ext_workspace_v1`.
Reasoning:
- `ext_workspace_v1` provides workspace groups, workspaces, active state, and atomic `done` notifications, which is exactly the sort of low-volume desktop-state signal Cthulhu needs to reason about transitions.
- As of April 9, 2026, Wayland Explorer lists `ext_workspace_v1` support for Mutter `49.2`, KWin `6.6`, and niri `25.11`, making it a good cross-family starting point.
This backend should only expose normalized state to the rest of Cthulhu. It should not expose protocol objects or protocol-specific state transitions outside the backend.
#### `AtspiContextBackend`
This backend should use current AT-SPI-based context recovery when shared Wayland state is missing or insufficient. It does not reduce churn by itself, but it preserves current behavior and keeps the adapter contract usable everywhere.
#### `NullBackend`
This backend should emit no compositor hints and leave the rest of Cthulhu in its current behavior. It is the fail-safe path.
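
The preferred order could be implemented as a simple probe loop. The sketch below is illustrative only: the `is_available` probes, the `caps` dictionary, and its keys (`session_type`, `protocols`, `atspi_available`) are hypothetical stand-ins for whatever runtime capability detection Cthulhu ends up using.

```python
from typing import Any, Mapping, Tuple


class WaylandSharedProtocolsBackend:
    name = "wayland-shared-protocols"

    @staticmethod
    def is_available(caps: Mapping[str, Any]) -> bool:
        # Assumed capability keys; real detection would query the compositor.
        return (
            caps.get("session_type") == "wayland"
            and "ext_workspace_v1" in caps.get("protocols", ())
        )


class AtspiContextBackend:
    name = "atspi-context"

    @staticmethod
    def is_available(caps: Mapping[str, Any]) -> bool:
        return bool(caps.get("atspi_available", False))


class NullBackend:
    name = "null"

    @staticmethod
    def is_available(caps: Mapping[str, Any]) -> bool:
        return True  # the fail-safe path is always usable


PREFERRED_ORDER = (WaylandSharedProtocolsBackend, AtspiContextBackend, NullBackend)


def select_backend(caps: Mapping[str, Any]) -> Tuple[object, str]:
    """Return the first usable backend plus a reason string for the debug log."""
    for backend_cls in PREFERRED_ORDER:
        try:
            usable = backend_cls.is_available(caps)
        except Exception:
            usable = False  # a failing probe counts as "capability unclear"
        if usable:
            return backend_cls(), f"{backend_cls.name}: capability probe passed"
    return NullBackend(), "null: no probe passed"
```

Treating a probe that raises as unavailable matches the requirement to degrade to no-op behavior when capabilities are absent or unclear.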
### Explicit Exclusions For Phase One
The following are not part of the core architecture in phase one:
- `niri` IPC as a first-class public interface
- compositor-specific `D-Bus` integrations
- GNOME Shell extension event streams
- KWin scripting APIs
These may become optional backend implementations in the future if a shared protocol proves insufficient, but they must remain behind the same generic adapter contract.
### Event Flow
The adapter should sit ahead of the current `event_manager`, not replace it.
Proposed flow:
1. backend receives shared compositor-facing state changes
2. adapter updates its desktop-context snapshot
3. adapter emits normalized state deltas and control hints
4. `event_manager` updates queueing, prioritization, and obsolescence decisions
5. AT-SPI object events continue to provide object-level truth within the selected desktop context
This allows Cthulhu to reduce irrelevant work before scripts interpret it, while keeping existing AT-SPI semantics intact.
### `event_manager` Integration
`event_manager` should gain a separate notion of desktop-context state in addition to its existing AT-SPI queue.
New responsibilities:
- track whether churn suppression is active
- track the currently prioritized desktop context
- reject queued AT-SPI work that became stale because the compositor already moved to a new context
- prefer script activation and focus recovery work that matches the current desktop context
The existing queue does not need to be replaced, but it does need one new concept: context obsolescence.
### Context Obsolescence
Add a new pruning rule: an event can be obsolete not only because a newer event of the same type exists, but because the desktop context that made the event relevant no longer exists.
Examples:
- a large `children-changed` burst from a window on a workspace that is no longer active
- queued `showing` or `name-change` events from a window that just lost compositor priority
- stale background application updates that were queued before a workspace switch finished
This is the primary performance win of the new interface. It lets Cthulhu discard irrelevant work based on context freshness, not only on event-type heuristics.
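
A context-obsolescence rule could look roughly like the following. This is a deliberately simplified sketch: `QueuedEvent`, its `window_token` field, and `prune_obsolete` are hypothetical names, and a real rule would also need to respect the preserved event classes rather than judging by window token alone.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class QueuedEvent:
    """Hypothetical queue entry; window_token is None when the source is unknown."""
    event_type: str
    window_token: Optional[str]


def prune_obsolete(
    queue: Sequence[QueuedEvent], active_window_token: Optional[str]
) -> Tuple[List[QueuedEvent], List[QueuedEvent]]:
    """Split the queue into (kept, dropped) based on context freshness.

    With no compositor-provided token, nothing is pruned: the fail-safe
    path is the current behavior. Events whose window is unknown are kept
    for the same reason.
    """
    if active_window_token is None:
        return list(queue), []
    kept: List[QueuedEvent] = []
    dropped: List[QueuedEvent] = []
    for event in queue:
        if event.window_token is None or event.window_token == active_window_token:
            kept.append(event)
        else:
            dropped.append(event)
    return kept, dropped


# Usage: the compositor already moved to "win-2", so "win-1" work is stale.
queue = [
    QueuedEvent("object:children-changed:add", "win-1"),
    QueuedEvent("object:state-changed:focused", "win-2"),
    QueuedEvent("object:state-changed:showing", None),
]
kept, dropped = prune_obsolete(queue, "win-2")
```

Returning the dropped events instead of discarding them silently leaves room for the pruning decision to be logged.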
### Churn Suppression Mode
When the adapter emits `pause_atspi_churn`, `event_manager` should enter a guarded suppression mode.
This is not a full stop. It is a mode where:
- high-value focus and activation events are always preserved
- low-value background churn is collapsed, deprioritized, or dropped
- stale work from the prior desktop context is pruned aggressively
Events that should still be preserved during suppression:
- `window:*`
- `object:state-changed:focused` with `detail1=true`
- `object:state-changed:active` on frames and windows
- `object:text-selection-changed`
- `object:selection-changed` for the prioritized context
- user-trigger-correlated events associated with the last input event
Events that should usually be suppressed, collapsed, or deprioritized:
- `object:children-changed:*`
- `object:state-changed:showing`
- `object:state-changed:sensitive`
- background `object:property-change:accessible-name`
- bulk `object:text-changed:*`
- repeated `object:text-caret-moved` from non-priority contexts
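
The two lists above can be read as a classification function. The sketch below is one possible reading, not the spec's policy: the function name, its keyword parameters, and the three return labels are hypothetical, and detecting "bulk" or "repeated" events is left out, so text-changed and caret-moved events are treated as noise only outside the prioritized context.

```python
from fnmatch import fnmatchcase

# Patterns taken from the "usually suppressed" list above.
NOISY_EVERYWHERE = (
    "object:children-changed:*",
    "object:state-changed:showing",
    "object:state-changed:sensitive",
)
# Assumption: these are only treated as churn outside the prioritized context.
NOISY_IN_BACKGROUND = (
    "object:property-change:accessible-name",
    "object:text-changed:*",
    "object:text-caret-moved",
)


def classify(
    event_type: str,
    *,
    detail1: int = 0,
    priority_context: bool = False,
    user_triggered: bool = False,
    on_frame_or_window: bool = False,
) -> str:
    """Return "preserve", "suppress", or "default" for an event during suppression."""
    if user_triggered or fnmatchcase(event_type, "window:*"):
        return "preserve"
    if event_type == "object:state-changed:focused" and detail1:
        return "preserve"
    if event_type == "object:state-changed:active" and on_frame_or_window:
        return "preserve"
    if event_type == "object:text-selection-changed":
        return "preserve"
    if event_type == "object:selection-changed" and priority_context:
        return "preserve"
    if any(fnmatchcase(event_type, pattern) for pattern in NOISY_EVERYWHERE):
        return "suppress"
    if not priority_context and any(
        fnmatchcase(event_type, pattern) for pattern in NOISY_IN_BACKGROUND
    ):
        return "suppress"
    return "default"
```

Checking the preserve rules first means a user-triggered event is never dropped, even when its type also matches a noisy pattern.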
### Transition Handling
When the adapter indicates a desktop transition:
- `desktop_transition_started`
  - adapter marks `transition_active=True`
  - adapter emits `pause_atspi_churn`
- `desktop_focus_context_changed`
  - `event_manager` updates the preferred app/window context before AT-SPI focus settles
  - focus-related events for the new context get priority
- `desktop_transition_finished`
  - adapter emits `resume_atspi_churn`
  - adapter emits `flush_stale_atspi_events`
  - `event_manager` drops queued events that belong to the old context and resumes normal flow
This allows Cthulhu to be more decisive during workspace switches, app switches, and transient window transitions without needing compositor-authoritative object trees.
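
The transition sequence can be sketched as a small state holder that records the hints it would emit. `TransitionCoordinator` and its method names are hypothetical. Two behaviors here are assumptions rather than spec: clearing `transition_active` on finish, and ignoring a finish that arrives without a matching start.

```python
from typing import List, Optional


class TransitionCoordinator:
    """Hypothetical adapter-side bookkeeping for desktop transitions."""

    def __init__(self) -> None:
        self.transition_active = False
        self.preferred_window_token: Optional[str] = None
        self.emitted: List[str] = []  # control hints, in emission order

    def transition_started(self) -> None:
        self.transition_active = True
        self.emitted.append("pause_atspi_churn")

    def focus_context_changed(self, window_token: str) -> None:
        # Recorded before AT-SPI focus settles, so focus-related events
        # for this token can be given priority by event_manager.
        self.preferred_window_token = window_token

    def transition_finished(self) -> None:
        if not self.transition_active:
            return  # assumption: an unmatched finish is ignored
        self.transition_active = False
        self.emitted.append("resume_atspi_churn")
        self.emitted.append("flush_stale_atspi_events")


# Usage: one workspace switch that lands on window "win-2".
coordinator = TransitionCoordinator()
coordinator.transition_started()
coordinator.focus_context_changed("win-2")
coordinator.transition_finished()
```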
### Script Activation
`event_manager` should use the adapter's prioritized desktop context as an additional input when deciding whether to activate a script.
It should not activate a script purely because the compositor hinted at it, but it should:
- prioritize AT-SPI events from the compositor-indicated context
- deprioritize or skip activation from obviously stale background contexts
- improve focus recovery when AT-SPI active-window truth is noisy or delayed
This should reduce cases where script activation thrashes between background and foreground apps during Wayland transitions.
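
The activation rule above might reduce to a small decision function. This is a sketch under stated assumptions: the function name, the token arguments, and the three return labels are hypothetical, and "defer" stands in for whichever of "deprioritize" or "skip" the implementation chooses.

```python
from typing import Optional


def activation_decision(
    atspi_event_window: Optional[str], hinted_window: Optional[str]
) -> str:
    """Decide how to treat a script-activation candidate.

    Returns "none" when there is no AT-SPI evidence (a compositor hint alone
    never activates a script), "activate" when the event matches the hinted
    context or no hint exists (current behavior), and "defer" when the event
    comes from a context the compositor has already left.
    """
    if atspi_event_window is None:
        return "none"
    if hinted_window is None or atspi_event_window == hinted_window:
        return "activate"
    return "defer"
```

With no hint available the function falls through to "activate", which keeps behavior identical to today whenever the adapter runs on the null backend.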
### Debugging And Observability
Add explicit logs around the new boundary:
- backend chosen and why
- normalized adapter signal emitted
- transition suppression entered and left
- events dropped due to context obsolescence
- events preserved during suppression and why
These logs should make it possible to explain every major pruning decision in the same way current flood and ignore logic can be debugged.
## Testing
### Automated
Add targeted tests for:
- backend capability detection and fallback order
- normalization of backend signals into adapter events
- transition start and finish behavior
- churn suppression activation and release
- context-obsolescence pruning of queued AT-SPI events
- preservation of focus and selection events during suppression
- script activation preferring compositor-indicated context when AT-SPI is noisy
- fail-safe fallback to current behavior when the adapter is uncertain
These tests should be written with mocks and fake backend signals. No real compositor should be required for unit coverage.
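
A unit test along these lines needs no compositor at all. The sketch below uses only the standard `unittest` module; `FakeBackend`, `AdapterUnderTest`, and the raw event name `workspace-activated` are hypothetical stand-ins, not real Cthulhu classes or protocol events.

```python
import unittest


class FakeBackend:
    """Test double that lets a test inject raw backend signals."""

    def __init__(self):
        self._listener = None

    def connect(self, listener):
        self._listener = listener

    def fire(self, raw_name):
        self._listener(raw_name)


class AdapterUnderTest:
    """Stand-in adapter: maps raw backend names to normalized signal names."""

    RAW_TO_NORMALIZED = {"workspace-activated": "workspace_state_changed"}

    def __init__(self, backend):
        self.emitted = []
        backend.connect(self._on_raw)

    def _on_raw(self, raw_name):
        normalized = self.RAW_TO_NORMALIZED.get(raw_name)
        if normalized is not None:  # unknown raw signals are ignored
            self.emitted.append(normalized)


class NormalizationTest(unittest.TestCase):
    def setUp(self):
        self.backend = FakeBackend()
        self.adapter = AdapterUnderTest(self.backend)

    def test_known_signal_is_normalized(self):
        self.backend.fire("workspace-activated")
        self.assertEqual(self.adapter.emitted, ["workspace_state_changed"])

    def test_unknown_signal_is_ignored(self):
        self.backend.fire("compositor-private-event")
        self.assertEqual(self.adapter.emitted, [])
```

Saved as a module, this can be run with `python -m unittest`.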
### Manual
Manual validation should focus on:
- workspace switches on Mutter, KWin, and niri
- fast application switching
- opening and dismissing transient dialogs and menus
- noisy web or Steam-like scenarios where background AT-SPI churn is high
- confirmation that speech follows real focused objects once AT-SPI settles
- confirmation that no important notifications or menu interactions are lost during suppression
## Risks
- Shared Wayland protocols may provide enough workspace truth but not enough top-level detail on every compositor.
- Compositor state may lead AT-SPI focus briefly, which could cause over-eager prioritization if the suppression policy is too aggressive.
- Some legitimate background AT-SPI events may look like churn if the rules are too broad.
- Adding a new boundary can introduce state-synchronization bugs if adapter and event queue state diverge.
## Risk Management
- keep suppression guarded rather than absolute
- keep AT-SPI authoritative for object semantics
- make context obsolescence explicit and debuggable
- fall back to current behavior whenever backend certainty is too low
- avoid compositor-specific backends in phase one unless a shared protocol is clearly insufficient
## Recommendation
Implement a small generic `CompositorStateAdapter` and teach `event_manager` to consume its normalized desktop-state signals and churn-control hints.
Do not make `niri` IPC part of the core contract.
Do not attempt a compositor-authoritative accessibility architecture in this pass.
Do not broaden the first implementation beyond shared protocol detection, AT-SPI fallback, context-based pruning, and guarded churn suppression.