Wayland Compositor Event Interface For Churn Reduction

Summary

Revise Cthulhu's event interface so Wayland session state can reduce AT-SPI churn before it reaches the current object-event pipeline. Keep AT-SPI authoritative for object semantics, text, selection, and actions. Add a small compositor-state normalization layer that is authoritative for desktop context such as active workspace, active top-level window, and focus routing during transitions.

This is not a Newton-style redesign. It is a conservative performance-oriented split intended to reduce queue growth, redundant script activation, and processing of stale events on Wayland systems while preserving existing accessibility semantics.

Current State

  • Cthulhu's main event flow is still centered on Atspi.EventListener and a single event_manager queue in src/cthulhu/event_manager.py.
  • The current design already spends significant effort filtering spam, handling floods and deluges, recovering focus context, and pruning duplicate events.
  • src/cthulhu/input_event_manager.py already uses Atspi.Device for keyboard handling and pointer monitoring wrappers, so Cthulhu already has one narrow example of a side-channel interface on Wayland.
  • src/cthulhu/focus_manager.py and src/cthulhu/cthulhu_state.py still derive active-window and focus truth primarily from AT-SPI objects and events.
  • src/cthulhu/wnck_support.py correctly gates Wnck to X11-only use, which is a useful precedent for runtime capability-based backend selection.
  • Recent work in this repository has improved individual Wayland features, but there is not yet a compositor-agnostic interface for desktop state that can suppress irrelevant object churn before it enters the hot path.

Goals

  • Reduce AT-SPI event churn on Wayland systems before it reaches the main queue.
  • Keep the design generic and compositor-agnostic at the Cthulhu interface level.
  • Preserve current AT-SPI object semantics and script behavior as much as possible.
  • Improve prioritization of focus and active-window related work during workspace and window transitions.
  • Provide a single internal contract that can support Mutter, KWin, and wlroots-based compositors without teaching the rest of Cthulhu about compositor brands.
  • Fail safe to current behavior when compositor state is missing, incomplete, or inconsistent.

Non-Goals

  • No Newton-style compositor-authoritative accessibility tree transport.
  • No replacement of AT-SPI for object semantics, text, selection, caret movement, or actions.
  • No dependency on niri IPC, GNOME Shell extensions, or KWin scripts in the core architecture for phase one.
  • No requirement that every supported compositor expose the same amount of state.
  • No broad rewrite of script listeners or script modules in this pass.

Approaches Considered

1. Direct Wayland Integration Inside event_manager

Teach event_manager to consume shared Wayland state directly and add compositor-specific checks inside its current logic.

Pros:

  • smallest code footprint up front
  • no new internal abstraction layer

Cons:

  • quickly leaks compositor-specific capability checks into the hottest code path
  • makes Mutter, KWin, and wlroots differences part of event_manager logic
  • harder to test in isolation

2. Thin Compositor-State Adapter Layer

Introduce a small internal normalization layer that consumes shared compositor-facing signals and emits a compact generic event vocabulary to event_manager.

Pros:

  • keeps compositor-specific capability detection out of the hot path
  • gives event_manager one stable interface to consume
  • matches the long-term direction of consuming normalized state below the main screen reader logic
  • easy to test with mocked backends

Cons:

  • adds one more internal subsystem
  • requires careful definition of authority boundaries to avoid duplication with AT-SPI

3. Full Compositor-Authoritative Routing

Make compositor state the primary truth for most event routing and use AT-SPI only for fine-grained object semantics.

Pros:

  • highest potential performance ceiling
  • most direct path toward a future compositor-led architecture

Cons:

  • much higher correctness risk
  • too large a shift for a churn-focused first pass
  • would require significantly more backend-specific coverage and recovery logic

Recommendation

Implement approach 2.

For phase one, Cthulhu should add a thin compositor-state adapter layer. It should be authoritative for desktop context and transition hints, but not for accessible object semantics. This gives Cthulhu a generic way to suppress or reprioritize AT-SPI noise without requiring a full architecture rewrite.

Design

Authority Split

The revised interface should split responsibility cleanly:

  • compositor-state adapter is authoritative for:
    • active desktop context
    • active workspace set
    • active top-level window identity when available
    • desktop transition start and end
    • focus-routing hints during compositor-driven changes
  • AT-SPI remains authoritative for:
    • focused accessible object
    • accessible roles, names, and states
    • text, caret, and selection semantics
    • actionable objects and accessibility events consumed by scripts

Inference from upstream: current Orca remains AT-SPI-authoritative in its main pipeline, while newer GNOME accessibility work moves normalization below Orca rather than throughout it. This design follows that same direction without depending on Newton itself.

New Internal Boundary

Add a new internal interface named CompositorStateAdapter.

Responsibilities:

  • detect and activate the best available compositor-state backend at runtime
  • normalize raw backend signals into a small generic event vocabulary
  • maintain a current desktop-context snapshot
  • emit state deltas and control hints to event_manager
  • degrade to no-op behavior when capabilities are absent or unclear

This adapter should not expose compositor-specific event names or objects outside its own implementation.

Normalized Event Vocabulary

The adapter should emit two families of signals.

State deltas:

  • workspace_state_changed
  • desktop_focus_context_changed
  • desktop_transition_started
  • desktop_transition_finished

Control hints:

  • pause_atspi_churn
  • resume_atspi_churn
  • prioritize_focus
  • deprioritize_context
  • flush_stale_atspi_events

This vocabulary is intentionally small. It exists to shape queueing and prioritization, not to mirror every compositor event.
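
One possible way to pin this vocabulary down is a pair of enumerations. The sketch below is illustrative only: the signal names come from the lists above, while the class names StateDelta, ControlHint, and AdapterSignal are assumptions and do not exist in Cthulhu today.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class StateDelta(Enum):
    """State deltas: something in the desktop context changed."""
    WORKSPACE_STATE_CHANGED = "workspace_state_changed"
    DESKTOP_FOCUS_CONTEXT_CHANGED = "desktop_focus_context_changed"
    DESKTOP_TRANSITION_STARTED = "desktop_transition_started"
    DESKTOP_TRANSITION_FINISHED = "desktop_transition_finished"


class ControlHint(Enum):
    """Control hints: advice to event_manager about queue handling."""
    PAUSE_ATSPI_CHURN = "pause_atspi_churn"
    RESUME_ATSPI_CHURN = "resume_atspi_churn"
    PRIORITIZE_FOCUS = "prioritize_focus"
    DEPRIORITIZE_CONTEXT = "deprioritize_context"
    FLUSH_STALE_ATSPI_EVENTS = "flush_stale_atspi_events"


@dataclass(frozen=True)
class AdapterSignal:
    """Hypothetical envelope for one normalized signal."""
    kind: Union[StateDelta, ControlHint]

    @property
    def is_control_hint(self) -> bool:
        return isinstance(self.kind, ControlHint)
```

Keeping the two families as separate types would let event_manager dispatch on the family first, so a new hint cannot be mistaken for a state change.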

Desktop Context Snapshot

The adapter should maintain a compact snapshot object with fields along these lines:

  • session_type
  • backend_name
  • active_workspace_ids
  • active_window_token
  • focus_route_token
  • transition_active
  • timestamp

active_window_token and focus_route_token are intentionally generic. They should be comparable identifiers, not compositor-native objects leaking into the rest of Cthulhu.
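
As a minimal sketch, the snapshot could be an immutable dataclass. The field names are the ones listed above; the types, the defaults, and the same_context helper are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class DesktopContextSnapshot:
    session_type: str = "unknown"
    backend_name: str = "null"
    active_workspace_ids: FrozenSet[str] = frozenset()
    # Tokens are opaque, comparable strings, never compositor-native objects.
    active_window_token: Optional[str] = None
    focus_route_token: Optional[str] = None
    transition_active: bool = False
    timestamp: float = 0.0

    def same_context(self, other: "DesktopContextSnapshot") -> bool:
        """Hypothetical helper: compare context identity, ignoring timestamps."""
        return (
            self.active_workspace_ids == other.active_workspace_ids
            and self.active_window_token == other.active_window_token
            and self.focus_route_token == other.focus_route_token
        )


before = DesktopContextSnapshot(
    session_type="wayland", active_workspace_ids=frozenset({"ws-1"})
)
# A frozen snapshot is updated by copying, so consumers never observe a
# half-updated state.
after = replace(before, active_workspace_ids=frozenset({"ws-2"}), timestamp=1.0)
```

Because the tokens are plain comparable values, event_manager can detect a context change with an equality check and never needs to inspect backend objects.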

Backend Selection

Phase one should use capability-driven backend selection with generic shared interfaces first.

Preferred order:

  1. WaylandSharedProtocolsBackend
  2. AtspiContextBackend
  3. NullBackend

WaylandSharedProtocolsBackend

This backend should consume shared Wayland-facing protocols where available and normalize them for Cthulhu. The first protocol target should be ext_workspace_v1.

Reasoning:

  • ext_workspace_v1 provides workspace groups, workspaces, active state, and atomic done notifications, which is exactly the sort of low-volume desktop-state signal Cthulhu needs to reason about transitions.
  • As of April 9, 2026, Wayland Explorer lists ext_workspace_v1 support for Mutter 49.2, KWin 6.6, and niri 25.11, making it a good cross-family starting point.

This backend should only expose normalized state to the rest of Cthulhu. It should not expose protocol objects or protocol-specific state transitions outside the backend.

AtspiContextBackend

This backend should use the current AT-SPI-based context recovery when shared Wayland state is missing or insufficient. It does not reduce churn by itself, but it preserves current behavior and keeps the adapter contract usable everywhere.

NullBackend

This backend should emit no compositor hints and leave the rest of Cthulhu in its current behavior. It is the fail-safe path.
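
The preferred order above could be implemented as a first-available scan. This is a sketch under stated assumptions: the three class names come from this document, but the is_available method, the constructor arguments, and select_backend are hypothetical, and how ext_workspace_v1 support is actually probed is deliberately left out.

```python
class Backend:
    name = "base"

    def is_available(self) -> bool:
        raise NotImplementedError


class WaylandSharedProtocolsBackend(Backend):
    name = "wayland-shared-protocols"

    def __init__(self, has_ext_workspace: bool):
        # Assumption: the probe result is injected; real detection is out of scope.
        self._has_ext_workspace = has_ext_workspace

    def is_available(self) -> bool:
        return self._has_ext_workspace


class AtspiContextBackend(Backend):
    name = "atspi-context"

    def __init__(self, atspi_ready: bool):
        self._atspi_ready = atspi_ready

    def is_available(self) -> bool:
        return self._atspi_ready


class NullBackend(Backend):
    name = "null"

    def is_available(self) -> bool:
        return True


def select_backend(candidates):
    """Return the first candidate that reports itself available."""
    for backend in candidates:
        try:
            if backend.is_available():
                return backend
        except Exception:
            # A failing probe counts as "capability unclear": try the next one.
            continue
    return NullBackend()  # fail-safe path


chosen = select_backend([
    WaylandSharedProtocolsBackend(has_ext_workspace=False),
    AtspiContextBackend(atspi_ready=True),
    NullBackend(),
])
```

Treating a probe exception the same as "unavailable" keeps the selection consistent with the goal of failing safe to current behavior.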

Explicit Exclusions For Phase One

The following are not part of the core architecture in phase one:

  • niri IPC as a first-class public interface
  • compositor-specific D-Bus integrations
  • GNOME Shell extension event streams
  • KWin scripting APIs

These may become optional backend implementations in the future if a shared protocol proves insufficient, but they must remain behind the same generic adapter contract.

Event Flow

The adapter should sit ahead of the current event_manager, not replace it.

Proposed flow:

  1. backend receives shared compositor-facing state changes
  2. adapter updates its desktop-context snapshot
  3. adapter emits normalized state deltas and control hints
  4. event_manager updates queueing, prioritization, and obsolescence decisions
  5. AT-SPI object events continue to provide object-level truth within the selected desktop context

This allows Cthulhu to reduce irrelevant work before scripts interpret it, while keeping existing AT-SPI semantics intact.

event_manager Integration

event_manager should gain a separate notion of desktop-context state in addition to its existing AT-SPI queue.

New responsibilities:

  • track whether churn suppression is active
  • track the currently prioritized desktop context
  • reject queued AT-SPI work that became stale because the compositor already moved to a new context
  • prefer script activation and focus recovery work that matches the current desktop context

The existing queue does not need to be replaced, but it does need one new concept: context obsolescence.

Context Obsolescence

Add a new pruning rule: an event can be obsolete not only because a newer event of the same type exists, but also because the desktop context that made the event relevant no longer exists.

Examples:

  • a large children-changed burst from a window on a workspace that is no longer active
  • queued showing or name-change events from a window that just lost compositor priority
  • stale background application updates that were queued before a workspace switch finished

This is the primary performance win of the new interface. It lets Cthulhu discard irrelevant work based on context freshness, not only on event-type heuristics.
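
A minimal sketch of such a pruning pass follows. QueuedEvent, its context_token field, and prune_obsolete are hypothetical names; the real queue entries in event_manager are not shown in this document, and how an event is mapped to a context token is an open design question.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QueuedEvent:
    event_type: str
    # Token of the desktop context the event came from; None means unknown.
    context_token: Optional[str]


def prune_obsolete(queue, current_token):
    """Return (kept, dropped_count), dropping events from stale contexts.

    Events with an unknown context are kept, and nothing is dropped when the
    current context itself is unknown, so uncertainty never discards work.
    """
    kept = deque()
    dropped = 0
    for event in queue:
        stale = (
            current_token is not None
            and event.context_token is not None
            and event.context_token != current_token
        )
        if stale:
            dropped += 1
        else:
            kept.append(event)
    return kept, dropped


queue = deque([
    QueuedEvent("object:children-changed:add", "win-A"),
    QueuedEvent("object:state-changed:focused", "win-B"),
    QueuedEvent("object:text-changed:insert", None),
])
kept, dropped = prune_obsolete(queue, "win-B")
```

Here the children-changed event from win-A is dropped once win-B is the current context, while the event with no known context survives.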

Churn Suppression Mode

When the adapter emits pause_atspi_churn, event_manager should enter a guarded suppression mode.

This is not a full stop. It is a mode where:

  • high-value focus and activation events are always preserved
  • low-value background churn is collapsed, deprioritized, or dropped
  • stale work from the prior desktop context is pruned aggressively

Events that should still be preserved during suppression:

  • window:*
  • object:state-changed:focused with detail1=1 (focus gained)
  • object:state-changed:active on frames and windows
  • object:text-selection-changed
  • object:selection-changed for the prioritized context
  • user-trigger-correlated events associated with the last input event

Events that should usually be suppressed, collapsed, or deprioritized:

  • object:children-changed:*
  • object:state-changed:showing
  • object:state-changed:sensitive
  • background object:property-change:accessible-name
  • bulk object:text-changed:*
  • repeated object:text-caret-moved from non-priority contexts
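
The two lists above can be read as a classification function. The sketch below is one hedged interpretation: classify, its parameters, and the three return labels are assumptions, and it ignores the "repeated" and "bulk" qualifiers, which would need queue history to evaluate.

```python
# Prefixes taken from the lists above. Prefix matching also covers forms
# without a subtype, such as a bare "object:children-changed".
ALWAYS_PRESERVE = ("window:", "object:text-selection-changed")
USUALLY_SUPPRESS = (
    "object:children-changed",
    "object:state-changed:showing",
    "object:state-changed:sensitive",
    "object:text-changed",
)
SUPPRESS_IN_BACKGROUND = (
    "object:property-change:accessible-name",
    "object:text-caret-moved",
)


def classify(event_type, detail1=0, in_priority_context=False,
             on_frame_or_window=False, user_triggered=False):
    """Return "preserve", "suppress", or "normal" while suppression is active."""
    if user_triggered or event_type.startswith(ALWAYS_PRESERVE):
        return "preserve"
    if event_type.startswith("object:state-changed:focused"):
        return "preserve" if detail1 else "normal"
    if event_type.startswith("object:state-changed:active"):
        return "preserve" if on_frame_or_window else "normal"
    if event_type.startswith("object:selection-changed"):
        return "preserve" if in_priority_context else "normal"
    if event_type.startswith(USUALLY_SUPPRESS):
        return "suppress"
    if event_type.startswith(SUPPRESS_IN_BACKGROUND) and not in_priority_context:
        return "suppress"
    # Anything not named in either list keeps its current handling.
    return "normal"
```

Checking the preserve rules before the suppress rules means a user-triggered children-changed event is kept even though its type is on the suppress list.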

Transition Handling

When the adapter indicates a desktop transition:

  • desktop_transition_started
    • adapter marks transition_active=True
    • adapter emits pause_atspi_churn
  • desktop_focus_context_changed
    • event_manager updates the preferred app/window context before AT-SPI focus settles
    • focus-related events for the new context get priority
  • desktop_transition_finished
    • adapter emits resume_atspi_churn
    • adapter emits flush_stale_atspi_events
    • event_manager drops queued events that belong to the old context and resumes normal flow

This allows Cthulhu to be more decisive during workspace switches, app switches, and transient window transitions without needing compositor-authoritative object trees.
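
The sequence above can be sketched as a small tracker on the adapter side. TransitionTracker and its method names are hypothetical; only the emitted signal names and their order come from this section.

```python
class TransitionTracker:
    """Hypothetical adapter-side helper that sequences transition signals."""

    def __init__(self, emit):
        self._emit = emit  # callable taking one signal name
        self.transition_active = False

    def on_transition_started(self):
        if self.transition_active:
            return  # already in a transition; do not pause twice
        self.transition_active = True
        self._emit("desktop_transition_started")
        self._emit("pause_atspi_churn")

    def on_focus_context_changed(self):
        # event_manager updates its preferred context on this signal,
        # before AT-SPI focus has settled.
        self._emit("desktop_focus_context_changed")

    def on_transition_finished(self):
        if not self.transition_active:
            return  # a finish without a start is ignored
        self.transition_active = False
        self._emit("desktop_transition_finished")
        self._emit("resume_atspi_churn")
        self._emit("flush_stale_atspi_events")


emitted = []
tracker = TransitionTracker(emitted.append)
tracker.on_transition_started()
tracker.on_focus_context_changed()
tracker.on_transition_finished()
```

Guarding against unmatched start and finish calls is an assumption of this sketch, meant to keep a pause from being left on if a backend delivers signals out of order.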

Script Activation

event_manager should use the adapter's prioritized desktop context as an additional input when deciding whether to activate a script.

It should not activate a script purely because the compositor hinted at it, but it should:

  • prioritize AT-SPI events from the compositor-indicated context
  • deprioritize or skip activation from obviously stale background contexts
  • improve focus recovery when AT-SPI active-window truth is noisy or delayed

This should reduce cases where script activation thrashes between background and foreground apps during Wayland transitions.
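
One hedged reading of these rules as a decision function follows. activation_decision, its parameters, and the three outcomes are assumptions for illustration; the existing activation logic in event_manager is more involved than this.

```python
def activation_decision(event_context, prioritized_context, atspi_says_active):
    """Return "activate", "defer", or "skip" for a candidate script activation.

    event_context: opaque token for the context the AT-SPI event came from,
        or None if unknown.
    prioritized_context: token the adapter currently prefers, or None when
        no compositor hint is available.
    atspi_says_active: whether AT-SPI itself indicates the app/window is active.
    """
    if prioritized_context is None or event_context is None:
        # No usable hint: behave as today and let AT-SPI decide alone.
        return "activate" if atspi_says_active else "skip"
    if event_context == prioritized_context:
        # A compositor hint alone never activates a script. Without AT-SPI
        # evidence, wait for focus to settle instead.
        return "activate" if atspi_says_active else "defer"
    # The event belongs to a context the compositor has already left.
    return "skip"
```

For example, an event from a background window that AT-SPI still reports as active is skipped once the compositor has moved on, which is the thrashing case described above.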

Debugging And Observability

Add explicit logs around the new boundary:

  • backend chosen and why
  • normalized adapter signal emitted
  • transition suppression entered and left
  • events dropped due to context obsolescence
  • events preserved during suppression and why

These logs should make it possible to explain every major pruning decision in the same way current flood and ignore logic can be debugged.
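
As a sketch, the log points could look like the helpers below. Python's standard logging module is used here as a stand-in; Cthulhu's own debug facility may be the better fit, and the logger name and function names are hypothetical.

```python
import logging

# Hypothetical logger name, chosen for this example only.
log = logging.getLogger("cthulhu.compositor_state")


def log_backend_choice(name, reason):
    log.info("backend chosen: %s (%s)", name, reason)


def log_signal(signal_name):
    log.debug("adapter signal emitted: %s", signal_name)


def log_suppression(entered):
    log.info("transition suppression %s", "entered" if entered else "left")


def log_event_decision(event_type, decision, reason):
    """decision is "dropped" or "preserved"; reason names the rule that fired."""
    log.debug("event %s %s: %s", event_type, decision, reason)
```

Recording the reason alongside every drop or preserve decision is what makes a pruning choice explainable after the fact.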

Testing

Automated

Add targeted tests for:

  • backend capability detection and fallback order
  • normalization of backend signals into adapter events
  • transition start and finish behavior
  • churn suppression activation and release
  • context-obsolescence pruning of queued AT-SPI events
  • preservation of focus and selection events during suppression
  • script activation preferring compositor-indicated context when AT-SPI is noisy
  • fail-safe fallback to current behavior when the adapter is uncertain

These tests should be written with mocks and fake backend signals. No real compositor should be required for unit coverage.
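
A minimal sketch of such a test, using a fake backend and no compositor, might look like this. FakeBackend, MiniAdapter, and the raw signal strings are invented for the example; MiniAdapter is a tiny stand-in, not the proposed CompositorStateAdapter.

```python
import unittest


class FakeBackend:
    """Test double that replays scripted backend signals."""

    def __init__(self):
        self._listener = None

    def connect(self, listener):
        self._listener = listener

    def fire(self, raw_signal):
        self._listener(raw_signal)


class MiniAdapter:
    """Stand-in adapter with just enough behavior to test against."""

    def __init__(self, backend):
        self.emitted = []
        self.transition_active = False
        backend.connect(self._on_raw)

    def _on_raw(self, raw_signal):
        if raw_signal == "workspace-switch-begin" and not self.transition_active:
            self.transition_active = True
            self.emitted += ["desktop_transition_started", "pause_atspi_churn"]
        elif raw_signal == "workspace-switch-done" and self.transition_active:
            self.transition_active = False
            self.emitted += [
                "desktop_transition_finished",
                "resume_atspi_churn",
                "flush_stale_atspi_events",
            ]
        # Unrecognized or out-of-order signals are ignored (fail-safe).


class TransitionTests(unittest.TestCase):
    def setUp(self):
        self.backend = FakeBackend()
        self.adapter = MiniAdapter(self.backend)

    def test_start_pauses_churn(self):
        self.backend.fire("workspace-switch-begin")
        self.assertTrue(self.adapter.transition_active)
        self.assertEqual(self.adapter.emitted[-1], "pause_atspi_churn")

    def test_finish_without_start_is_noop(self):
        self.backend.fire("workspace-switch-done")
        self.assertEqual(self.adapter.emitted, [])

    def test_unknown_signal_is_ignored(self):
        self.backend.fire("something-compositor-specific")
        self.assertEqual(self.adapter.emitted, [])
```

The same pattern extends to the other items in the list: script a sequence of fake backend signals, then assert on the normalized signals and on the state of the queue.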

Manual

Manual validation should focus on:

  • workspace switches on Mutter, KWin, and niri
  • fast application switching
  • opening and dismissing transient dialogs and menus
  • noisy web or Steam-like scenarios where background AT-SPI churn is high
  • confirmation that speech follows real focused objects once AT-SPI settles
  • confirmation that no important notifications or menu interactions are lost during suppression

Risks

  • Shared Wayland protocols may provide enough workspace truth but not enough top-level detail on every compositor.
  • Compositor state may lead AT-SPI focus briefly, which could cause over-eager prioritization if the suppression policy is too aggressive.
  • Some legitimate background AT-SPI events may look like churn if the rules are too broad.
  • Adding a new boundary can introduce state-synchronization bugs if adapter state and event queue state diverge.

Risk Management

  • keep suppression guarded rather than absolute
  • keep AT-SPI authoritative for object semantics
  • make context obsolescence explicit and debuggable
  • fall back to current behavior whenever backend certainty is too low
  • avoid compositor-specific backends in phase one unless a shared protocol is clearly insufficient

Recommendation

Implement a small generic CompositorStateAdapter and teach event_manager to consume its normalized desktop-state signals and churn-control hints.

Do not make niri IPC part of the core contract. Do not attempt a compositor-authoritative accessibility architecture in this pass. Do not broaden the first implementation beyond shared protocol detection, AT-SPI fallback, context-based pruning, and guarded churn suppression.