Accessibility Systemic Analyzer

The **Accessibility Systemic Analyzer** is an analysis and reporting layer for multi-tool accessibility scan outputs. It is designed for development and product teams managing large digital estates where issue backlogs have grown too large to triage page by page.

Instead of logging every automated scanner match as an independent defect, the engine normalizes incoming source attributes, deduplicates overlapping rows, maps entries to WCAG success criteria, and groups matching findings into remediation patterns. The pipeline produces an interactive dashboard and a spreadsheet workbook for remediation planning.

Current system state: The analyzer supports an expanded set of tool adapters, WCAG metadata enrichment, humanized rule titles, tighter workbook/dashboard synchronization, static site builds, and searchable, paginated drilldowns.

Why this exists

Automated testing engines record findings per page URL. That is useful for debugging a single page, but it makes site-wide remediation planning difficult: a single bug in a shared component can trigger hundreds of repeated failures across a site's templates.

| Scanned page | Scanner output |
| --- | --- |
| Home | Button contrast failure |
| Product page | Button contrast failure |
| Cart | Button contrast failure |
| Checkout | Button contrast failure |

Standard reporting structures force teams to triage these as four distinct, unrelated tasks.

💡 **The Reality:** The root cause is **one shared component** with a faulty CSS color assignment.

The analyzer exists to collapse high-volume raw rows into a lean list of high-impact reusable remediation targets.
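
The collapse described above can be illustrated with a minimal sketch. The row keys (`page`, `rule`, `selector`) and the `collapse` helper are hypothetical names chosen for this example, not the analyzer's actual schema or API.

```python
from collections import defaultdict

# Hypothetical raw rows: one per (page, finding), as a scanner would report them.
raw_rows = [
    {"page": "Home", "rule": "Button contrast failure", "selector": "button.btn-primary"},
    {"page": "Product page", "rule": "Button contrast failure", "selector": "button.btn-primary"},
    {"page": "Cart", "rule": "Button contrast failure", "selector": "button.btn-primary"},
    {"page": "Checkout", "rule": "Button contrast failure", "selector": "button.btn-primary"},
]


def collapse(rows):
    """Group rows that share a rule and selector into one remediation target."""
    pages_by_pattern = defaultdict(set)
    for row in rows:
        pages_by_pattern[(row["rule"], row["selector"])].add(row["page"])
    return [
        {"rule": rule, "selector": selector, "pages": sorted(pages)}
        for (rule, selector), pages in pages_by_pattern.items()
    ]


targets = collapse(raw_rows)
print(f"{len(raw_rows)} raw rows -> {len(targets)} remediation target(s)")
```

Grouping on rule plus selector is only one possible fingerprint; a real grouping key would likely need more context to avoid merging unrelated elements.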

Core Engine Operations

To clean and structure high-volume scan data, the analyzer runs the following pipeline stages:

- **Ingestion:** Automatically detects and processes supported scanner output formats.
- **Standardization:** Translates tool-specific attributes into a shared row schema.
- **Compliance Mapping:** Maps scanner rule codes to WCAG success criteria, conformance levels, and criterion names.
- **Sanitization:** Normalizes string values, DOM selectors, severities, and labels.
- **Inference:** Evaluates code context to infer element, component, or design-system ownership.
- **Deduplication:** Collapses overlapping tool matches that target the same element on a single page.
- **Clustering:** Groups repeated issue patterns into site-wide systemic clusters.
- **Analytics:** Computes indicators for tool consensus, confidence, and fix priority.
- **Exporting:** Compiles the results into multi-sheet workbooks and static web dashboard bundles.
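
As a rough illustration of the standardization and sanitization stages, the sketch below defines a shared row type and two normalizers. The `NormalizedRow` fields, the `SEVERITY_MAP` values, and the helper names are assumptions made for this example; the analyzer's real schema and severity scale may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping from tool-specific severity words to one shared scale.
SEVERITY_MAP = {
    "critical": "critical",
    "serious": "high",
    "error": "high",
    "violation": "high",
    "moderate": "medium",
    "warning": "medium",
    "potentialviolation": "medium",
    "minor": "low",
    "notice": "low",
}


@dataclass(frozen=True)
class NormalizedRow:
    """Hypothetical shared row schema produced by every adapter."""
    tool: str
    page_url: str
    rule_id: str
    selector: str
    severity: str
    wcag: Optional[str] = None


def normalize_severity(raw: str) -> str:
    """Map a tool-specific severity label onto the shared scale."""
    return SEVERITY_MAP.get(raw.strip().lower(), "unknown")


def normalize_selector(raw: str) -> str:
    """Collapse runs of whitespace so equivalent selectors compare equal."""
    return " ".join(raw.split())


row = NormalizedRow(
    tool="axe-core",
    page_url="https://example.com/cart",
    rule_id="color-contrast",
    selector=normalize_selector("  main   >  button.btn-primary "),
    severity=normalize_severity(" Serious "),
    wcag="1.4.3",
)
print(row)
```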

Supported Tools & Adapters

| Testing engine | Capabilities and parsing notes |
| --- | --- |
| axe-core | Extracts WCAG structural metadata; preserves verified violations and manual-review check rows. |
| axe-scan | Ingests axe-derived scanner files; handles partial/incomplete contrast results. |
| IBM Accessibility Checker | Maps native IBM rule identifiers directly to WCAG; models distinct `violation` and `potentialviolation` levels. |
| Lighthouse | Parses failed accessibility audits and extracts WCAG tags from the underlying debug arrays. |
| Oobee | Crawler-oriented source suited to large digital estates and page inventories. |
| UUV | Run/flow-oriented source with strict severity normalization and compliance metadata parsing. |
| Pa11y | Modeled as a single source while tracking individual runner properties (e.g., `axe` vs `htmlcs`). |
| HTMLCS / html-sniffer | HTMLCS-specific parsing that handles technique-style rule titles. |

Certain adapters intentionally preserve "needs-review" flags rather than deleting them. This behavior ensures engineering teams can investigate subtle potential flaws.
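
A minimal adapter sketch shows how "needs-review" rows might be kept alongside confirmed violations. It assumes input shaped like an axe-core results object (top-level `url`, `violations`, and `incomplete` arrays whose entries carry `id`, `impact`, and `nodes[].target`). The output keys and the `rows_from_axe` name are hypothetical.

```python
def rows_from_axe(results: dict) -> list[dict]:
    """Flatten an axe-core style results object into one row per affected node."""
    rows = []
    # "incomplete" entries are kept and labeled, not discarded.
    for bucket, status in (("violations", "violation"), ("incomplete", "needs-review")):
        for result in results.get(bucket, []):
            for node in result.get("nodes", []):
                rows.append({
                    "tool": "axe-core",
                    "page_url": results.get("url", ""),
                    "rule_id": result["id"],
                    # impact can be null on incomplete results
                    "severity": result.get("impact") or "unknown",
                    "selector": " ".join(str(t) for t in node.get("target", [])),
                    "status": status,
                })
    return rows


sample = {
    "url": "https://example.com/",
    "violations": [
        {"id": "color-contrast", "impact": "serious",
         "nodes": [{"target": ["button.btn-primary"]}]},
    ],
    "incomplete": [
        {"id": "color-contrast", "impact": None,
         "nodes": [{"target": ["a.hero-link"]}]},
    ],
}

rows = rows_from_axe(sample)
for r in rows:
    print(r["status"], r["selector"], r["severity"])
```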

WCAG Handling & Data Schema

The analyzer runs dedicated rule-to-WCAG mappings, uses fallback messages, and enriches criteria metadata. To keep output scannable, the data model keeps rule vocabulary in separate fields:

- **Rule:** The clean, humanized label identifying the issue (e.g., Contrast (Minimum)).
- **Rule Id:** The raw technical identifier generated by the testing engine.
- **WCAG:** The success criterion number (e.g., 1.4.3).
- **WCAG Name:** The official title of the criterion.
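
One way to populate these four fields is sketched below. It assumes axe-style tags such as `wcag143`, where the digits encode principle, guideline, and criterion. The `WCAG_NAMES` table is a tiny illustrative subset, and `wcag_from_tags` and `enrich` are hypothetical helpers.

```python
import re

# Illustrative subset only; a real table would cover every success criterion.
WCAG_NAMES = {
    "1.1.1": "Non-text Content",
    "1.4.3": "Contrast (Minimum)",
    "1.4.12": "Text Spacing",
}

# Matches criterion tags like "wcag143" but not level tags like "wcag2aa".
_CRITERION_TAG = re.compile(r"^wcag(\d)(\d)(\d{1,2})$")


def wcag_from_tags(tags):
    """Return the first success criterion number found in the tags, or None."""
    for tag in tags:
        match = _CRITERION_TAG.match(tag)
        if match:
            return ".".join(match.groups())
    return None


def enrich(rule_id, tags, friendly_title=None):
    """Build the four separated vocabulary fields for one finding."""
    wcag = wcag_from_tags(tags)
    return {
        "Rule": friendly_title or rule_id,  # fall back to the raw id
        "Rule Id": rule_id,
        "WCAG": wcag,
        "WCAG Name": WCAG_NAMES.get(wcag),
    }


record = enrich("color-contrast", ["cat.color", "wcag2aa", "wcag143"], "Contrast (Minimum)")
print(record)
```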

Important recent model changes

🔄 Pa11y mixed-run support

Pa11y records are no longer tracked as two synthetic tools when axe and HTMLCS outputs appear in the same file. The engine tracks a single `pa11y` source while keeping per-issue runner metadata.
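
A small sketch of the single-source model follows, assuming issues shaped like Pa11y's JSON issue objects (`code`, `type`, `selector`, `runner`). The output row keys and the `rows_from_pa11y` name are hypothetical.

```python
def rows_from_pa11y(issues, page_url):
    """Emit one 'pa11y' source while keeping the runner on each row."""
    return [
        {
            "tool": "pa11y",  # never "pa11y-axe" or "pa11y-htmlcs"
            "runner": issue.get("runner", "unknown"),
            "page_url": page_url,
            "rule_id": issue["code"],
            "severity": issue.get("type", "unknown"),
            "selector": issue.get("selector", ""),
        }
        for issue in issues
    ]


mixed_run = [
    {"code": "color-contrast", "type": "error",
     "selector": "button.btn-primary", "runner": "axe"},
    {"code": "WCAG2AA.Principle1.Guideline1_4.1_4_3.G18.Fail", "type": "error",
     "selector": "button.btn-primary", "runner": "htmlcs"},
]

pa11y_rows = rows_from_pa11y(mixed_run, "https://example.com/cart")
print({r["tool"] for r in pa11y_rows}, sorted(r["runner"] for r in pa11y_rows))
```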

🏷️ Friendly rule labels

Label rendering has been improved. The engine avoids raw tool codes when a clear, humanized title can be derived from metadata mappings or technique-aware context (e.g., 1.4.6 [G17]).
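
The label fallback chain might look like the sketch below. It assumes HTMLCS-style codes of the form `WCAG2AAA.Principle1.Guideline1_4.1_4_6.G17.Fail`; the `friendly_label` helper and its `titles` lookup are hypothetical.

```python
import re

# Captures the criterion digits and the first technique segment of an HTMLCS code.
_HTMLCS_CODE = re.compile(
    r"^WCAG2A{1,3}\.Principle\d\.Guideline\d_\d+\.(\d)_(\d+)_(\d+)\.([^.]+)"
)


def friendly_label(code, titles=None):
    """Prefer a mapped title, then a technique-aware label, then the raw code."""
    titles = titles or {}
    if code in titles:
        return titles[code]
    match = _HTMLCS_CODE.match(code)
    if match:
        principle, guideline, criterion, technique = match.groups()
        return f"{principle}.{guideline}.{criterion} [{technique}]"
    return code


print(friendly_label("WCAG2AAA.Principle1.Guideline1_4.1_4_6.G17.Fail"))
print(friendly_label("color-contrast", {"color-contrast": "Contrast (Minimum)"}))
```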

🧼 WCAG chart cleanup

Dashboard charts and workbook summary tabs accept only real WCAG codes. Raw engine error descriptions and unmapped technical strings are kept out of compliance charts.
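
A simple gate of this kind could be sketched as follows. The check here only validates the shape of a criterion number (for example `1.4.3`); a stricter version would test membership in a full list of published criteria. The `wcag` row key and the function name are assumptions.

```python
import re
from collections import Counter

# Shape check only: principle 1-4, then guideline and criterion numbers.
_CRITERION = re.compile(r"^[1-4]\.\d{1,2}\.\d{1,2}$")


def wcag_chart_counts(rows):
    """Count findings per criterion, skipping anything that is not a WCAG code."""
    return Counter(
        row["wcag"]
        for row in rows
        if isinstance(row.get("wcag"), str) and _CRITERION.match(row["wcag"])
    )


chart_rows = [
    {"wcag": "1.4.3"},
    {"wcag": "1.4.3"},
    {"wcag": "1.1.1"},
    {"wcag": "color-contrast"},  # raw rule id, excluded
    {"wcag": None},              # unmapped, excluded
    {},                          # missing field, excluded
]

counts = wcag_chart_counts(chart_rows)
print(counts)
```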

🔍 Drilldown UX upgrade

The primary analytics drilldown interface features real-time keyword text filtering, custom row display limits, and full pagination.
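
The filtering and pagination logic can be sketched in a few lines. This is an illustration of the behavior only; the dashboard may well implement it in the browser, and `drilldown_page` with its parameters is a hypothetical helper.

```python
import math


def drilldown_page(rows, query="", page=1, page_size=25):
    """Filter rows by keyword across all values, then return one page."""
    needle = query.strip().lower()
    matches = [
        row for row in rows
        if not needle or any(needle in str(value).lower() for value in row.values())
    ]
    total_pages = max(1, math.ceil(len(matches) / page_size))
    page = min(max(page, 1), total_pages)  # clamp out-of-range page numbers
    start = (page - 1) * page_size
    return {
        "rows": matches[start:start + page_size],
        "page": page,
        "total_pages": total_pages,
        "total_rows": len(matches),
    }


findings = [
    {"page": "Home", "rule": "Contrast (Minimum)"},
    {"page": "Cart", "rule": "Contrast (Minimum)"},
    {"page": "Checkout", "rule": "Contrast (Minimum)"},
    {"page": "Home", "rule": "Non-text Content"},
]

result = drilldown_page(findings, query="contrast", page=2, page_size=2)
print(result)
```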

How the pipeline works

1. **Adapters:** Tool-specific parsers read source outputs and emit standardized rows with keys such as selector, severity, and path.
2. **Enrichment:** Rows gain readable WCAG criteria fields, stable page markers, and refined rule labels.
3. **Processing:** The engine inspects code context to infer whether an issue belongs to a design-system component, a layout template, or a single page.
4. **Deduplication:** Overlapping failures on the same node are merged, and tool consensus values are computed.
5. **Clustering & Metrics:** Repeated issues are grouped into systemic families, with metrics for tool agreement, page spread, and remediation impact.
6. **Exports:** Results are compiled into structured workbooks, Power BI tables, and static site assets.
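
The deduplication step can be illustrated with a minimal sketch. The merge key (page, selector, WCAG criterion) and the definition of `consensus` as "more than one tool agrees" are assumptions for this example, not confirmed analyzer behavior.

```python
from collections import defaultdict


def deduplicate(rows):
    """Merge rows from different tools that hit the same node and criterion."""
    tools_by_key = defaultdict(set)
    for row in rows:
        key = (row["page_url"], row["selector"], row["wcag"])
        tools_by_key[key].add(row["tool"])
    return [
        {
            "page_url": page_url,
            "selector": selector,
            "wcag": wcag,
            "tools": sorted(tools),
            "tool_count": len(tools),
            "consensus": len(tools) > 1,  # assumed definition
        }
        for (page_url, selector, wcag), tools in tools_by_key.items()
    ]


scan_rows = [
    {"tool": "axe-core", "page_url": "/cart", "selector": "button.btn-primary", "wcag": "1.4.3"},
    {"tool": "pa11y", "page_url": "/cart", "selector": "button.btn-primary", "wcag": "1.4.3"},
    {"tool": "axe-core", "page_url": "/cart", "selector": "img.logo", "wcag": "1.1.1"},
]

merged = deduplicate(scan_rows)
for finding in merged:
    print(finding["selector"], finding["tools"], finding["consensus"])
```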

Defining “Systemic”

A **systemic issue** is any finding pattern that repeats often enough that a single development change will fix it on multiple pages at once. The analyzer uses three fields to separate these concerns:

- `design_system_issue`: Flags findings that most likely originate in a global shared component.
- `issue_scope`: Records whether a failure is local to a single page, a template type, or a global component.
- `systemic`: A computed rating of how well the item works as a centralized "fix-once" candidate.
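
To make the scope idea concrete, the sketch below classifies a pattern from the pages it appears on. The scope labels (`page`, `template`, `global`), the page-to-template lookup, and the `page_spread` ratio are illustrative assumptions; the analyzer's actual rules and rating formula are not documented here.

```python
def classify_scope(affected_pages, templates_by_page):
    """Return 'page', 'template', or 'global' from where a pattern appears."""
    pages = set(affected_pages)
    if len(pages) <= 1:
        return "page"
    # Pages with no known template are treated as their own template.
    templates = {templates_by_page.get(page, page) for page in pages}
    return "template" if len(templates) == 1 else "global"


def page_spread(affected_pages, total_pages):
    """Share of scanned pages affected, as a rough fix-once indicator."""
    if total_pages <= 0:
        return 0.0
    return round(len(set(affected_pages)) / total_pages, 2)


templates = {
    "/products/1": "product",
    "/products/2": "product",
    "/cart": "cart",
}

print(classify_scope(["/cart"], templates))
print(classify_scope(["/products/1", "/products/2"], templates))
print(classify_scope(["/products/1", "/cart"], templates))
print(page_spread(["/products/1", "/cart"], total_pages=4))
```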

Dashboard, Workbook, & Deployment Support

The engine includes a static build path suited to documentation-style hosting platforms. The build produces dashboard charts, support matrices, page coverage checks, and searchable drilldowns alongside a downloadable spreadsheet workbook. The data model includes fields such as `tool_count`, `consensus`, and `confidence` so teams can see how much tool agreement backs an issue.

⚠️ Interpretation guidance and limits

Automated coverage remains partial. A multi-engine testing stack improves issue discovery and confidence, but it **does not replace expert manual testing** or human judgment.

Criteria that require judgment must still be validated through human inspection: alt-text quality in context, meaningful button labels, logical keyboard focus order, clarity of error recovery text, semantic clarity of document structure, and cognitive accessibility.

Suggested operational workflow

1. **Generate:** Run your automated accessibility scanners across your properties and gather the report files.
2. **Compile:** Build the normalized analyzer dashboard and spreadsheet workbook.
3. **Validate:** Inspect the **Page Inventory Check** page first to verify data alignment across all tools.
4. **Triage:** Use the dashboard views to identify high-density components, WCAG hotspots, and "fix-once" tasks.
5. **Execute:** Use the workbook tabs for filtered engineer handoffs, bug tracking, and BI reporting.
6. **Audit:** Re-run the analysis pipeline after remediation to verify improvements in tool consensus, volume, and systemic impact.

Quick start development pipeline

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload
```

Related guides

- **Dashboard Guide:** Full reference covering dashboard indicators, action panels, and interactive drilldowns.
- **Workbook Guide:** Technical reference explaining workbook tabs, fact tables, and the column data dictionary.