
Auditor's Guide: Inspecting Lár Agents

Audience: Compliance Officers, Quality Assurance, Notified Bodies, and external Auditors.

Scope: How to verify a high-risk AI system built with Lár, in accordance with the EU AI Act (2026) and Nannini et al. (2026) — "AI Agents Under EU Law: A Compliance Architecture for AI Providers" (arXiv:2604.04604v1).


1. What You Are Looking At

Lár is a "Glass Box" agent execution engine. Every agent is a deterministic graph of explicit steps. You do not need to understand the underlying LLM to audit a Lár agent — you need to read the three artefacts it produces on every run.

No external tooling required. No LangSmith account. No vendor access. The artefacts are plain JSON files, signed with a key your organisation controls.

For the canonical reference implementation showing all 13 compliance primitives firing in sequence, see: EU AI Act Finance Showcase →


2. The Three Required Artefacts

Request these from the engineering team for any conformity assessment or incident investigation. Every Lár Enterprise run writes all three to enterprise_audit/.

Artefact 1 — Action Inventory (compliance_manifest.json)

Regulatory basis: Step 9 (Nannini et al.), Annex IV

Generated by static graph traversal before the agent runs. This is the regulatory map — what the agent could do, not what it did.

What to verify:

- External actions: Every tool, LLM call, and API endpoint the agent can reach. If an action appears at runtime that is not in the manifest, that is an Art. 3(23) Substantial Modification event.
- Affected parties: Actions flagged THIRD_PARTY must have Art. 50 disclosure evidence in the causal trace.
- Unvaulted tools: tools_without_credential_vault must be zero in production. Any tool not using CredentialVault holds standing credentials — an Art. 15(4) violation.
- Risk flags: HIGH severity flags (e.g., AdaptiveNode present) require explicit provider documentation before CE-marking submission.
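The manifest checks above can be scripted. The sketch below is illustrative, not the shipped tooling: the field names `tools_without_credential_vault` and `risk_flags` follow the checks described here, but the real manifest schema may differ.

```python
import json

def check_manifest(path: str) -> list[str]:
    """Return a list of audit findings for a compliance manifest (hypothetical schema)."""
    with open(path) as f:
        manifest = json.load(f)

    findings = []
    # Art. 15(4): every production tool must route credentials through CredentialVault.
    if manifest.get("tools_without_credential_vault", 0) != 0:
        findings.append("Unvaulted tools present — Art. 15(4) violation")
    # HIGH-severity risk flags require explicit provider documentation.
    for flag in manifest.get("risk_flags", []):
        if flag.get("severity") == "HIGH":
            findings.append(f"HIGH risk flag requires provider documentation: {flag.get('name')}")
    return findings
```

An empty return value means these two checks passed; anything else goes on the audit findings list.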

Artefact 2 — Causal Trace (run_<uuid>.json)

Regulatory basis: Art. 12 — Record-Keeping

The forensic log of what actually happened. Lár records a state diff at every step — not a text dump, but the exact variables added, removed, or modified, the rendered prompt, token usage, and outcome.

What to verify:

1. Sequentiality: Step numbers are consecutive with no gaps.
2. Causality: The variable written in step N is the variable read by the router in step N+1. The decision chain is reconstructible from the log alone.
3. PII redaction: Search the log for known PII values (names, SSNs, IBANs). If present, the PIIRedactionEngine was not applied before signing — a GDPR Art. 17 violation.
4. HMAC signature: The final entry contains a cryptographic signature. Verify it (see Section 3). If verification fails, the log was tampered with after execution.
5. Bias scan: Steps involving BiasFilterNode should show a scan result in state_diff. If a protected characteristic was detected and no HumanJuryNode interrupt followed, flag for review.
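Checks 1 and 3 are mechanical and worth automating. The sketch below assumes the trace is a JSON array of entries, each carrying a `step` number; the actual trace schema may differ.

```python
import json

def check_trace(path: str, known_pii: list[str]) -> list[str]:
    """Sequentiality and PII-redaction checks on a causal trace (hypothetical schema)."""
    with open(path) as f:
        entries = json.load(f)

    findings = []
    # 1. Sequentiality: step numbers must be consecutive with no gaps.
    steps = [e["step"] for e in entries]
    if steps and steps != list(range(steps[0], steps[0] + len(steps))):
        findings.append("Step numbers are not consecutive — possible missing entries")
    # 3. PII redaction: known personal data must not survive into the signed log.
    raw = json.dumps(entries)
    for value in known_pii:
        if value in raw:
            findings.append(f"PII value present in signed log: {value!r}")
    return findings
```

Feed it the personal data values known to be in scope for the run (test IBANs, names from the case file) and expect an empty findings list.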

Artefact 3 — Authority Ledger (authority_ledger.json)

Regulatory basis: Art. 14 — Human Oversight (Fourth Tier)

Records every human oversight exercise. This is the evidentiary chain the paper's footnote 18 requires: action proposal → risk assessment → human determination → execution outcome.

What to verify:

- Stakeholder identity and role: stakeholder_id and stakeholder_role must match the organisation's authorised reviewer registry.
- Rationale: Every record must contain a non-empty rationale. An approval with no rationale is rubber-stamping — it will fail an audit for negligent oversight.
- Timestamp realism: Cross-reference timestamp against the causal trace. If a HumanJuryNode decision appears in under 10 seconds on a complex case, investigate.
- Risk score: The risk_score_key field shows the upstream RiskScorerNode output that triggered the interrupt. Verify the score matches the action's PolicyRegistry classification.
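The identity and rationale checks lend themselves to a first automated pass before manual review. This is a sketch against an assumed schema (`stakeholder_id`, `rationale` fields as named above); adjust to the actual ledger format.

```python
import json

def check_ledger(path: str, authorised_reviewers: set[str]) -> list[str]:
    """Identity and rationale checks on an authority ledger (hypothetical schema)."""
    with open(path) as f:
        records = json.load(f)

    findings = []
    for i, rec in enumerate(records):
        # Identity: every reviewer must appear in the authorised registry.
        if rec.get("stakeholder_id") not in authorised_reviewers:
            findings.append(f"Record {i}: reviewer {rec.get('stakeholder_id')!r} not in registry")
        # Rationale: an approval with no substantive rationale is rubber-stamping.
        if not (rec.get("rationale") or "").strip():
            findings.append(f"Record {i}: empty rationale")
    return findings
```

Timestamp realism still requires cross-referencing the causal trace by hand, since the threshold for "too fast" depends on case complexity.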


3. Cryptographic Verification

A log is only useful if it cannot be silently altered. Lár signs the causal trace and authority ledger using HMAC-SHA256 with an enterprise-controlled secret key.

Verification procedure (no coding required):

python examples/compliance/11_verify_audit_log.py run_<uuid>.json your_enterprise_secret_key

Expected output: [+] VERIFICATION SUCCESSFUL

If output is [-] VERIFICATION FAILED, the log was modified after execution. Treat this as an integrity incident.

Key management: The HMAC secret should be stored in your organisation's secrets manager (AWS KMS, HashiCorp Vault). The engineering team should not have access to the production key — this ensures the audit trail cannot be altered even by the team that built the agent.
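For auditors who want to understand what the bundled script checks, the sketch below shows the general shape of HMAC-SHA256 log verification. It assumes the final log entry holds the signature under a key such as `hmac_signature` and that the signature covers the canonical JSON of all preceding entries; the real script's canonicalisation and field names may differ.

```python
import hashlib
import hmac
import json

def sign_entries(entries: list[dict], secret_key: str) -> str:
    """Compute an HMAC-SHA256 signature over canonical JSON (illustrative scheme)."""
    payload = json.dumps(entries, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(secret_key.encode(), payload, hashlib.sha256).hexdigest()

def verify_log(path: str, secret_key: str) -> bool:
    """True if the log's final signature entry matches its preceding entries."""
    with open(path) as f:
        log = json.load(f)
    *entries, sig_entry = log
    expected = sign_entries(entries, secret_key)
    # compare_digest avoids leaking signature bytes through timing differences.
    return hmac.compare_digest(expected, sig_entry["hmac_signature"])
```

Changing a single byte in any entry changes the digest, so post-hoc edits are detectable without trusting the team that produced the log — provided the key is held outside their reach, as described above.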


4. The Nannini et al. (2026) 12-Step Coverage Map

Use this table to verify which steps the agent's architecture addresses. Request the compliance manifest and causal trace to confirm runtime coverage.

| Step | Requirement | Lár Primitive | Where to Verify |
|------|-------------|---------------|-----------------|
| 0 | Scope: Art. 3(1) AI system definition | DOMAIN_PRESETS classification record | Provider documentation |
| 1 | GPAI layer: Art. 53 documentation | LiteLLM model config | run_metadata.model in causal trace |
| 2 | Classify: Annex III / high-risk | conformity_id in domain preset | Compliance manifest header |
| 3 | QMS: prEN 18286 artifacts | Manifest + Ledger + Causal Trace | All three artefacts |
| 4 | Risk management: Art. 9 | PolicyRegistry + RiskScorerNode | computed_oversight_level in causal trace |
| 5 | Data governance: prEN 18283/18284 | PIIRedactionEngine + BiasFilterNode | PII absent from log; bias scan in state diff |
| 6 | Trustworthiness: Art. 12–14 | AuditLogger + HumanJuryNode + AuthorityLedger | All three artefacts |
| 7 | Cybersecurity: Art. 15(4) | CredentialVault | jit_token_present in causal trace; unvaulted tools = 0 in manifest |
| 8 | CRA applicability | Secure-by-design architecture | Provider documentation |
| 9 | Adjacent legislation inventory | ComplianceManifestGenerator | Compliance manifest action inventory |
| 10 | Conformity assessment artefacts | Manifest + Ledger + Trace → Annex IV | All three artefacts |
| 11 | Post-market monitoring: Art. 3(23) | RuntimeStateVersioner | drift_report in causal trace |

Full mapping: https://docs.snath.ai/compliance/paper-compliance-mapping/


5. Failure Modes and Red Flags

| Failure | What to Look For | Primitive That Should Have Caught It |
|---------|------------------|--------------------------------------|
| Rubber-stamp approval | HumanJuryNode decisions with empty rationale or sub-10s timestamps | AuthorityLedger rationale field |
| Lethal Trifecta violation | Untrusted input + PII + autonomous action with no jury record | LethalTrifectaGuard |
| Behavioural drift | Tool in causal trace not present in manifest | RuntimeStateVersioner drift report |
| PII in audit log | Known personal data values present in signed log | PIIRedactionEngine |
| Unvaulted credentials | tools_without_credential_vault > 0 in manifest | CredentialVault |
| Unapproved subgraph | AdaptiveNode present but no TopologyValidator referenced | TopologyValidator rejection log |
| Missing Art. 50 disclosure | Third-party-affecting action with no transparency flag | TransparencyEngine |
| Tampered log | HMAC verification fails | HMAC-SHA256 signing |
| Bias in output | Protected characteristic in LLM output with no interrupt | BiasFilterNode + HumanJuryNode |
| Consolidated-only jury context | BatchNode present but no branch_findings_summary in HumanJuryNode context keys | BranchTriageNode |
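The behavioural-drift check is a pure set difference between what the manifest declared and what the causal trace shows. A minimal sketch, assuming you have already extracted the two action-name sets from the artefacts:

```python
def find_drift(declared_actions: set[str], observed_actions: set[str]) -> set[str]:
    """Actions executed at runtime that the static manifest never declared.

    Any non-empty result is a candidate Art. 3(23) Substantial Modification
    event and should trigger the post-market monitoring workflow.
    """
    return observed_actions - declared_actions
```

For example, `find_drift({"send_email", "query_db"}, {"query_db", "transfer_funds"})` returns `{"transfer_funds"}`: an action the agent performed that its regulatory map never mentioned.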

6. Questions to Ask the Engineering Team

  1. What is the conformity_id for this system and where is the conformity assessment record?
  2. Where is the HMAC secret stored and who has access to it?
  3. Has the compliance_manifest.json been reviewed and signed off before production deployment?
  4. Are HumanJuryNode approvers trained on what they are approving, or is rubber-stamping structurally possible?
  5. Is RuntimeStateVersioner configured with a conformity baseline, and what is the drift threshold before a new assessment is triggered?

See Also