Drop-in callback handlers
LangChain LCEL-safe. Batch- and concurrency-safe as of v0.1.4: share one handler across chain.batch([...]) and concurrent chain.invoke() calls. LlamaIndex via CallbackManager on SYNTHESIZE events.
RAG compliance middleware for LangChain and LlamaIndex. One callback, one signed audit row per chain.
ragcompliance is drop-in middleware for LangChain and LlamaIndex that logs the full chain (query, retrieved chunks with source URLs and similarity scores, the LLM answer, model name, latency), signs it with SHA-256, and writes one row per invocation to your own Supabase, behind row-level security per workspace. No chain rewrites. No black box.
Handler overhead measured in isolation (p50 ≈ 38µs on a clean hot path). Full-chain latency is dominated by your retriever and LLM; the handler adds a small constant on top.
RAG compliance is not a solved problem. Retrieval works, generation works, but the moment a compliance team asks for proof of what the model saw and proof that no one tampered with it since, most RAG stacks have nothing to hand over.
In regulated industries (finance, healthcare, legal, insurance) the question is never "is your retrieval good?" It's "can you prove, for any given answer on any given day, which document was cited, what the model saw, and that no one tampered with it since?" Most RAG stacks can't.
The kinds of questions compliance, internal audit, and GRC teams actually bring to a RAG system in regulated industries:
Left: what your chain normally leaves behind. Right: what ragcompliance writes to your warehouse. No wrappers, no forks, just a callback handler passed through LangChain's config={"callbacks":[handler]} or LlamaIndex's CallbackManager.
Your chain runs. A string comes back. Good luck.
One audited row, chained by signature. (example, mock data)
The handler attaches via the standard callback channel on both frameworks. Your retriever, vector store, prompt, and LLM don't change. One callback, one row.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableLambda

from ragcompliance import RAGComplianceHandler, RAGComplianceConfig

retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
prompt = ChatPromptTemplate.from_template("Context:\n{context}\n\nQ: {query}")
llm = ChatOpenAI(model="gpt-4o-mini")

chain = (
    {
        "context": retriever | RunnableLambda(lambda docs: "\n\n".join(c.page_content for c in docs)),
        "query": RunnablePassthrough(),
    }
    | prompt
    | llm
)

# the only two new lines
handler = RAGComplianceHandler(config=RAGComplianceConfig.from_env(), session_id="user-abc")
answer = chain.invoke("Does §4.2 cover indemnification?", config={"callbacks": [handler]})
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager

from ragcompliance import RAGComplianceConfig
from ragcompliance.llamaindex_handler import LlamaIndexRAGComplianceHandler

handler = LlamaIndexRAGComplianceHandler(
    config=RAGComplianceConfig.from_env(),
    session_id="user-abc",
)
Settings.callback_manager = CallbackManager([handler])

# any query engine now runs under the audit handler
response = query_engine.query("Does §4.2 cover indemnification?")
Every feature below exists because a compliance team, an SRE on-call, or a power user asked for it. No vapor. 152 tests green, including a dedicated batch and concurrency suite and a retriever-chunk regression suite for langchain-core >= 1.3.0.
Drop-in callback handlers. LangChain LCEL-safe, batch- and concurrency-safe as of v0.1.4. LlamaIndex via CallbackManager on SYNTHESIZE events.
Deterministic hash over query + chunks + answer. Any post-hoc tampering with any of those fields is detectable at verification.
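The scheme reduces to a canonical serialization plus SHA-256, which is easy to sketch in plain Python. The function names and field layout below are illustrative, not the library's actual API:

```python
import hashlib
import json


def sign_record(query: str, chunks: list[str], answer: str) -> str:
    """Deterministic SHA-256 over the three audited fields.

    sort_keys plus fixed separators makes the byte stream canonical,
    so identical inputs always produce an identical hash.
    """
    payload = json.dumps(
        {"query": query, "chunks": chunks, "answer": answer},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify_record(record: dict) -> bool:
    """Recompute the hash; any edit to any field flips it."""
    expected = sign_record(record["query"], record["chunks"], record["answer"])
    return expected == record["signature"]


record = {
    "query": "Does §4.2 cover indemnification?",
    "chunks": ["§4.2 Indemnification. The Vendor shall..."],
    "answer": "Yes, §4.2 covers indemnification.",
}
record["signature"] = sign_record(record["query"], record["chunks"], record["answer"])

assert verify_record(record)
record["answer"] = "No."          # post-hoc tamper
assert not verify_record(record)  # detected at verification
```

Because the hash is deterministic, an auditor only needs the row itself to re-verify it; no access to the original model run is required.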
Bring your own Supabase. Row-level security per workspace_id. One row per chain, one workspace per tenant.
One CLI produces a Markdown report mapped to CC6.1 / CC7.2 / CC8.1 / A1.1 / C1.1, with a signature-verified random sample an auditor can spot-check.
Four env vars and SSO turns on. Google Workspace, Okta, Auth0, Entra, Authentik, or any standards-compliant OIDC provider. Domain allowlist optional.
Self-hostable billing UI for operators who want to run RAGCompliance as an internal product and meter their downstream users. Period rollover via Stripe webhook, with a self-healing fallback so a dropped webhook can't lock a workspace out. This is tooling for your own tiers, not a paid tier of this project.
Fire-and-forget enqueue, bounded in-memory queue, daemon drainer. Per-chain overhead drops from ~1.2s to sub-millisecond.
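The fire-and-forget pattern can be sketched in a few lines of standard-library Python. The class and parameter names here are illustrative stand-ins, not ragcompliance internals: the hot path only enqueues, a daemon thread drains to storage, and a full queue drops rather than blocks the chain:

```python
import queue
import threading


class AsyncAuditWriter:
    def __init__(self, storage_save, maxsize: int = 1000):
        # Bounded queue: a slow backend can never grow memory unbounded.
        self._q = queue.Queue(maxsize=maxsize)
        self._save = storage_save
        # Daemon drainer: dies with the process, never blocks shutdown.
        threading.Thread(target=self._drain, daemon=True).start()

    def enqueue(self, record: dict) -> bool:
        """Fire-and-forget: microseconds on the hot path; drops on overflow."""
        try:
            self._q.put_nowait(record)
            return True
        except queue.Full:
            return False  # back-pressure never reaches the caller

    def _drain(self):
        while True:
            record = self._q.get()
            try:
                self._save(record)  # slow network write, off the hot path
            except Exception:
                pass  # a failing backend cannot take the chain down
            finally:
                self._q.task_done()


saved = []
writer = AsyncAuditWriter(saved.append, maxsize=10)
writer.enqueue({"query": "q", "answer": "a"})
writer._q.join()  # wait for the drainer in this demo only
assert saved == [{"query": "q", "answer": "a"}]
```

The trade-off is deliberate: under sustained overflow you lose audit rows rather than latency, which is why the queue bound is sized well above expected burst traffic.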
Four rules: zero chunks, low similarity, slow chain, chain errored. Env-tunable thresholds. Bounded alert queue so Slack outages can't back-pressure you.
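The four rules reduce to simple predicates over a finished chain record. A minimal sketch, with hypothetical env var names and thresholds standing in for the real tunables:

```python
import os

# Env-tunable thresholds (names here are hypothetical)
MIN_SIMILARITY = float(os.getenv("RAGC_MIN_SIMILARITY", "0.25"))
MAX_LATENCY_MS = float(os.getenv("RAGC_MAX_LATENCY_MS", "8000"))


def alerts_for(record: dict) -> list[str]:
    """Evaluate the four alert rules against one finished chain record."""
    fired = []
    if not record["chunks"]:
        fired.append("zero_chunks")        # retriever returned nothing
    elif max(c["score"] for c in record["chunks"]) < MIN_SIMILARITY:
        fired.append("low_similarity")     # even the best chunk is a weak match
    if record["latency_ms"] > MAX_LATENCY_MS:
        fired.append("slow_chain")
    if record.get("error"):
        fired.append("chain_errored")
    return fired


record = {"chunks": [{"score": 0.12}], "latency_ms": 9500, "error": None}
assert alerts_for(record) == ["low_similarity", "slow_chain"]
```

Fired alerts would then go onto the bounded alert queue, so a Slack outage delays or drops notifications instead of back-pressuring the chain.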
/health/billing flags the classic paste errors (prod_… where price_… belongs, missing webhook secret, localhost base URL in live mode) before a customer finds them on Saturday night.
ragcompliance is MIT-licensed middleware. Clone it, pip install it, point it at your Supabase, and every feature on this page is yours. No paid tier, no call-home, no telemetry, no per-seat fees.
If you'd rather not run it yourself, I offer a few kinds of paid help around the project: integration reviews, SOC 2 evidence prep, custom features on contract, and an operated dashboard under your own domain. These are engagements with the author, not a locked tier of the codebase.
RAGCompliance is a focused audit trail library for retrieval-augmented generation. Not a tracing platform, not an eval suite, not a vector database. One job: turn every RAG chain invocation into one signed, row-level-secured audit record your compliance team can actually audit.
Every checkmark below has a commit, a test suite, and a Render deploy behind it. Anything planned is marked as such, with no pretending.
The first things compliance, security, and platform teams ask when they look at ragcompliance for real. Plain answers, no hedging.
No. LangSmith (and Langfuse, Helicone) are observability tools. They help you debug why a chain behaved a certain way. ragcompliance is an audit tool. It produces one signed row per invocation that a compliance reviewer can spot-check a year later and verify the answer hasn't been modified. Different audience (compliance instead of the engineering team), different artefact (immutable signed record instead of traces), different storage model (your Supabase under RLS instead of a SaaS trace store). Most teams in regulated industries need both.
Today the handler writes the query verbatim, chunks verbatim, and the answer verbatim. If any of those can contain PHI or PII and your Supabase project isn't configured for that, do not enable ragcompliance against production traffic yet. PII redaction pre-audit is on the near-term roadmap (opt-in regex + NER pass before the signature is computed). For PHI specifically, pair with a HIPAA-eligible Supabase deployment and your own BAA. Full BYO object storage with customer-held KMS keys is the stricter model and is planned.
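Until the built-in redaction lands, a pre-audit pass of the planned shape is easy to run yourself: redact before the signature is computed, so the signed record never contains the raw value. The patterns below are illustrative only; real PII/PHI coverage needs an NER pass on top:

```python
import re

# Illustrative patterns only; not an exhaustive PII ruleset.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace matches with a stable token BEFORE hashing/signing."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


query = "Email jane.doe@example.com about SSN 123-45-6789"
assert redact(query) == "Email [EMAIL] about SSN [SSN]"
```

Note the ordering matters: redacting after the signature is computed would make every redacted row fail verification.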
It is not a digital signature in the eIDAS / DSA / advanced electronic signature sense. It is a cryptographic integrity tag over query + chunks + answer that lets an auditor detect post-hoc tampering on any of those three fields without needing access to the original model run. Courts and regulators treat it as evidence of integrity, not identity. If you need non-repudiation on top (who wrote the record), pair with signed inserts at the database layer or S3 Object Lock on the raw payload store.
Yes, because Supabase (the default storage) encrypts all rows at rest with AES-256 and all traffic in transit with TLS 1.2+. Row-level security isolates rows per workspace_id so a multi-tenant install cannot cross-read. If your threat model needs customer-held keys, the planned BYO object storage path lets you write the raw payload to S3 / GCS / Azure Blob under your own KMS key and keep only metadata + signature in Supabase.
Supabase is the default because it gives you Postgres, row-level security, and a managed API out of the box, so a self-host path is a single SQL migration plus env vars. The AuditStorage interface is swappable, though. Anything that can persist a dict of the record shape and query it back works; storage.save() is wrapped defensively so a custom backend that raises cannot take the chain down. A Postgres-direct and a BYO object storage backend are on the roadmap.
LangChain and LlamaIndex are the two shipped handlers because they cover most of what teams put in front of auditors today. For anything else, the capture shape is small: build the same AuditRecord at your own boundary (query, retrieved chunks, answer, model, latency), hand it to storage.save(...), and you get the same signed row and the same SOC 2 evidence export. Haystack and DSPy first-class handlers are on the roadmap if there is pull.
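For a framework without a shipped handler, the capture described above can be sketched at your own boundary. The `AuditRecord` fields and the `storage.save()` contract below are inferred from this page, not the library's verified API:

```python
import time
from dataclasses import dataclass, asdict


@dataclass
class AuditRecord:
    """Record shape inferred from this page; field names are illustrative."""
    query: str
    chunks: list      # e.g. [{"text": ..., "source_url": ..., "score": ...}]
    answer: str
    model: str
    latency_ms: float


class InMemoryStorage:
    """Stand-in for the Supabase backend, same save() contract."""
    def __init__(self):
        self.rows = []

    def save(self, record: dict) -> None:
        self.rows.append(record)


def run_audited(storage, rag_fn, query: str, model: str) -> str:
    """Wrap any RAG call at your own boundary: time it, capture, save one row."""
    start = time.perf_counter()
    chunks, answer = rag_fn(query)
    latency_ms = (time.perf_counter() - start) * 1000
    storage.save(asdict(AuditRecord(query, chunks, answer, model, latency_ms)))
    return answer


def fake_rag(query):
    # Placeholder for your retriever + LLM call
    return [{"text": "§4.2 ...", "source_url": "s3://contracts/msa.pdf", "score": 0.91}], "Yes."


storage = InMemoryStorage()
answer = run_audited(storage, fake_rag, "Does §4.2 cover indemnification?", "gpt-4o-mini")
assert answer == "Yes."
assert len(storage.rows) == 1
```

Because the record shape is the same, the signing, RLS storage, and SOC 2 export paths downstream of `storage.save()` work unchanged.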