Documentation · v0.1.4

Everything you need to run ragcompliance in production.

ragcompliance is drop-in middleware for LangChain and LlamaIndex that logs, signs, and stores every retrieval-augmented generation chain. This documentation covers installation, framework integration, the audit record schema, the dashboard, SSO, billing, SOC 2 evidence export, operational tuning, and a full API reference. Every code snippet on this page is copy-pasteable against ragcompliance >= 0.1.4.

Overview

ragcompliance sits between your chain and your observability stack. You keep your retriever, your vector store, your prompt, and your LLM exactly as they are. You pass one callback through the standard callback channel and, per invocation, one signed row lands in your Supabase.

Every row contains: the query verbatim, every retrieved chunk with source URL, chunk ID and similarity score, the LLM answer verbatim, the model name, end-to-end latency, the workspace ID, a session ID you control, a timestamp, and a SHA-256 chain signature computed over query + chunks + answer. If any of those three signed fields (query, chunks, answer) is mutated after the fact, the signature no longer validates at verification time.

That single row is what compliance teams need to sign off on a RAG system: "which document was cited, what did the model say, and prove it hasn't been modified since." Row-level security in Supabase means the same physical table can hold audit logs for many tenants without cross-contamination.

Note: This is middleware, not a vector database. Bring your own retriever (FAISS, Chroma, Pinecone, pgvector, Weaviate, any BaseRetriever). ragcompliance only cares about what your retriever returned and what the LLM answered.

Installation

The base install logs to stdout, which is good for local dev but not good for production. For anything real, install with the Supabase extra so the handler writes to a real audit table.

$ pip install "ragcompliance[supabase]"

Additional extras:

# + FastAPI dashboard
$ pip install "ragcompliance[supabase,dashboard]"

# + LlamaIndex handler
$ pip install "ragcompliance[supabase,llamaindex]"

# + OIDC single sign-on
$ pip install "ragcompliance[dashboard,sso]"

Requirements. Python 3.11 or newer. langchain-core >= 0.2 (covers LangChain 0.2+ and all LCEL chains). llama-index-core >= 0.10.

Supabase setup

Create a free Supabase project at supabase.com. You need two SQL scripts applied once in the SQL editor. One creates the audit log table with row-level security, the other adds the billing + usage tables used by the dashboard.

  1. Open the SQL editor in your Supabase project.
  2. Paste and run supabase_schema.sql from the repo (audit log table, indexes, RLS policies).
  3. Paste and run supabase_migration_billing.sql (subscriptions, query counters, period-end RPC).
  4. Copy the service_role key from your project's API settings. Do not use the anon key; RLS will block the handler from writing.
Security: The service role key bypasses RLS on your Supabase project. Store it in a secret manager, never commit it, never expose it to browser code. The handler only needs it server-side.

Environment variables

Copy .env.example from the repo and fill in your values. The handler reads these via RAGComplianceConfig.from_env().

RAGCOMPLIANCE_SUPABASE_URL=https://your-project.supabase.co
RAGCOMPLIANCE_SUPABASE_KEY=your-service-role-key
RAGCOMPLIANCE_WORKSPACE_ID=your-workspace-id   # one per tenant
RAGCOMPLIANCE_DEV_MODE=false                    # true = stdout, false = Supabase
RAGCOMPLIANCE_ENFORCE_QUOTA=false               # true = RuntimeError on overage
RAGCOMPLIANCE_ASYNC_WRITES=true                 # fire-and-forget (default)
RAGCOMPLIANCE_ASYNC_MAX_QUEUE=1000              # bounded buffer

workspace_id is how ragcompliance isolates audit logs across tenants. One workspace per customer in a multi-tenant SaaS. One per app for internal use. Row-level security keeps rows from leaking across workspaces even if an application bug asks for the wrong one.

Your first audited chain

Once Supabase is set up and environment variables are loaded, any chain you already have will audit itself the moment you attach the handler. Here's a minimal standalone example:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from ragcompliance import RAGComplianceHandler, RAGComplianceConfig

retriever = vectorstore.as_retriever(search_kwargs={"k": 4})  # vectorstore: any VectorStore you already have
prompt    = ChatPromptTemplate.from_template("Context:\n{context}\n\nQ: {query}")
llm       = ChatOpenAI(model="gpt-4o-mini")

chain = (
    {"context": retriever | RunnableLambda(lambda d: "\n\n".join(c.page_content for c in d)),
     "query": RunnablePassthrough()}
    | prompt | llm
)

handler = RAGComplianceHandler(config=RAGComplianceConfig.from_env(), session_id="user-abc")
answer  = chain.invoke("Does section 4.2 cover indemnification?", config={"callbacks": [handler]})

After one invocation you'll see one row in rag_audit_logs. Query it from the SQL editor, the dashboard UI, or the /api/logs endpoint.

LangChain handler

The LangChain handler is LCEL-safe and latches onto the outermost chain by default, which is what you want for 95% of setups. It captures:

  • the first user-facing query that entered the chain
  • every document the retriever yielded (source_url from metadata["source"], chunk_id from metadata["chunk_id"], and similarity_score if the retriever exposes it)
  • the LLM's final string output
  • the model name reported by the LLM integration
  • end-to-end latency, rounded to the millisecond

Session IDs

The session_id argument is free-form. Use it to correlate a single user's conversation across many chain invocations. A common pattern is one session per chat thread, keyed by an ID your application already tracks.

handler = RAGComplianceHandler(
    config=RAGComplianceConfig.from_env(),
    session_id=request.session["chat_id"],
)

Passing source metadata

For source URLs to land in the audit record, your Document objects need metadata["source"] set (and ideally metadata["chunk_id"]). Most loaders do this already. A manual retriever looks like:

from langchain_core.documents import Document

docs = [
    Document(
        page_content="Section 4.2 obligates Party A to indemnify…",
        metadata={"source": "s3://contracts/acme-msa-v3.pdf", "chunk_id": "chunk-042"},
    ),
    # … more Documents
]

LlamaIndex handler

The LlamaIndex handler latches onto the SYNTHESIZE event so it captures the final synthesized answer alongside the retrieved nodes. It's set globally via CallbackManager on Settings, so every query engine you subsequently construct inherits it.

from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from ragcompliance import RAGComplianceConfig
from ragcompliance.llamaindex_handler import LlamaIndexRAGComplianceHandler

handler = LlamaIndexRAGComplianceHandler(
    config=RAGComplianceConfig.from_env(),
    session_id="user-abc",
)
Settings.callback_manager = CallbackManager([handler])

response = query_engine.query("Does section 4.2 cover indemnification?")

Audit record schema

Every invocation writes one row into rag_audit_logs:

{
  "id": "c4e91…",                          // uuid
  "session_id": "user-abc",
  "workspace_id": "acme-prod",
  "query": "Does section 4.2 cover indemnification?",
  "retrieved_chunks": [
    {
      "content": "Section 4.2 obligates Party A…",
      "source_url": "s3://contracts/acme-msa-v3.pdf",
      "chunk_id": "chunk-042",
      "similarity_score": 0.94
    }
  ],
  "llm_answer": "Section 4.2 covers indemnification obligations…",
  "model_name": "gpt-4o-mini",
  "chain_signature": "a3f8c2d1…",
  "timestamp": "2026-04-10T06:00:00Z",
  "latency_ms": 1240
}

Every field is nullable except workspace_id, query, timestamp, and chain_signature. A chain that never produced an answer (retriever failed, LLM errored) still writes a row; the error itself surfaces through the chain_errored Slack alert.

Signature verification

The signature is computed deterministically over the normalized JSON of the three fields that matter for accountability:

import hashlib, json

payload = {
    "query": record["query"],
    "chunks": [
        {"content": c["content"], "source_url": c["source_url"], "chunk_id": c["chunk_id"]}
        for c in record["retrieved_chunks"] or []
    ],
    "answer": record["llm_answer"],
}
expected = hashlib.sha256(
    json.dumps(payload, sort_keys=True, default=str).encode()
).hexdigest()

assert expected == record["chain_signature"], "record was tampered with after write"

This algorithm is stable across language runtimes because sort_keys=True canonicalizes the JSON. An auditor with SQL access can re-run the check without a Python dependency using their own SHA-256 implementation over the same canonical payload.
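For exhaustive verification rather than spot-checks, the same algorithm extends to a short loop. The helper names below are illustrative, and the records are assumed to have been fetched from rag_audit_logs already:

```python
import hashlib
import json

def chain_signature(record: dict) -> str:
    """Recompute the canonical SHA-256 over query + chunks + answer."""
    payload = {
        "query": record["query"],
        "chunks": [
            {"content": c["content"], "source_url": c["source_url"], "chunk_id": c["chunk_id"]}
            for c in record["retrieved_chunks"] or []
        ],
        "answer": record["llm_answer"],
    }
    return hashlib.sha256(
        json.dumps(payload, sort_keys=True, default=str).encode()
    ).hexdigest()

def find_tampered(records: list[dict]) -> list[str]:
    """Return the IDs of records whose stored signature no longer matches."""
    return [r["id"] for r in records if chain_signature(r) != r["chain_signature"]]
```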

Running the dashboard

The dashboard is a single-file FastAPI app. Run it locally with:

$ pip install "ragcompliance[dashboard]"
$ uvicorn ragcompliance.app:app --reload

It ships with an HTML dashboard at / showing stats cards, recent logs, and export buttons, plus a JSON + CSV API under /api/….

HTTP endpoints

| Method | Path | Purpose |
| --- | --- | --- |
| GET | / | HTML dashboard: stat cards, recent logs, export buttons. |
| GET | /health | Liveness probe. Always 200 OK. |
| GET | /health/billing | Stripe live-mode readiness probe. 200 when configured, 503 + issues list otherwise. |
| GET | /api/logs | Paginated audit records. Supports workspace_id, session_id, start, end, limit, offset. |
| GET | /api/logs/detail/{id} | Single audit record. |
| GET | /api/logs/export.csv | CSV export with filter query params. |
| GET | /api/logs/export.json | JSON file export with filter query params. |
| GET | /api/summary | Aggregate stats: total queries, unique sessions, avg latency, models. |
| GET | /api/plans | Configured billing plans + their Stripe price IDs. |
| POST | /billing/checkout | Start a Stripe Checkout session. Body: {workspace_id, tier}. |
| POST | /stripe/webhook | Stripe event receiver. Verifies signature. |
| GET | /billing/subscription/{workspace_id} | Current subscription + usage for one workspace. |
| GET | /login | Redirects to the configured OIDC provider (when SSO is enabled). |
| GET | /auth/callback | OIDC callback. Validates email domain, seeds session. |
| GET | /logout | Clears the session cookie. |

SSO (OIDC)

The dashboard ships open by default so local dev stays frictionless. Set the environment variables below and SSO turns on via standard OIDC discovery; it works with Google Workspace, Okta, Auth0, Microsoft Entra, Authentik, Keycloak, and any IdP that exposes a .well-known/openid-configuration document.

$ pip install "ragcompliance[dashboard,sso]"
RAGCOMPLIANCE_OIDC_ISSUER=https://accounts.google.com
RAGCOMPLIANCE_OIDC_CLIENT_ID=your-client-id
RAGCOMPLIANCE_OIDC_CLIENT_SECRET=your-client-secret
RAGCOMPLIANCE_OIDC_REDIRECT_URI=https://dash.example.com/auth/callback
RAGCOMPLIANCE_OIDC_ALLOWED_DOMAINS=acme.com,acme.co.uk  # optional allowlist
RAGCOMPLIANCE_SESSION_SECRET=$(python -c "import secrets; print(secrets.token_urlsafe(48))")

With SSO enabled, every route except /health, /login, /auth/callback, /logout, and /stripe/webhook requires a signed-in session. Browser requests get a 302 redirect to /login. API clients (anything sending Accept: application/json) get a 401 so scripted access surfaces cleanly instead of following redirects into HTML.
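That browser-vs-API split is easy to encode in a monitoring smoke test. This tiny helper is illustrative, not part of the library; it just restates the documented rule so a probe can assert against it:

```python
def expected_auth_failure_status(accept_header: str) -> int:
    """Status an unauthenticated request should get with SSO enabled:
    API clients sending Accept: application/json get a 401; browser
    requests get a 302 redirect to /login (per the docs above)."""
    return 401 if "application/json" in accept_header else 302

# A scripted probe should therefore send Accept: application/json
# and treat 401 as "SSO is correctly enforced".
```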

Domain allowlist: The RAGCOMPLIANCE_OIDC_ALLOWED_DOMAINS allowlist is optional. Leave it unset to permit any email the IdP authenticates. Set it to one or more comma-separated domains to lock down to corporate accounts only.

Plans & checkout

Reference implementation: These plans and prices are a ready-to-run reference implementation for operators who want to run ragcompliance as an internal product and meter downstream users. They are not a paid tier of this project. ragcompliance itself is MIT licensed and free to self-host forever: rewrite PLANS in ragcompliance.billing to anything you want, or remove the billing router entirely if you don't need it.

Two plans ship out of the box as a reference. Both are configured via Stripe products + recurring prices and wired into the dashboard at boot.

| Tier | Example price | Queries / month | Extras |
| --- | --- | --- | --- |
| Team (reference) | $49 / mo | 10,000 | CSV/JSON export, email support |
| Enterprise (reference) | $199 / mo | Unlimited | SSO, custom retention, priority review |

Start a checkout from your app:

import requests

r = requests.post(
    "https://your-dashboard.example.com/billing/checkout",
    json={"workspace_id": "my-workspace", "tier": "team"},
)
checkout_url = r.json()["checkout_url"]
# redirect user to checkout_url

Quota enforcement

Quota enforcement is soft by default: the chain logs a warning if the workspace is over its limit but the invocation still runs. Set RAGCOMPLIANCE_ENFORCE_QUOTA=true to hard-block. In fail-closed mode the handler raises RuntimeError before the LLM runs, and no audit row is written for the blocked invocation.

Why soft by default: The first time you wire billing in, you don't want an expired invoice to suddenly break your product. Keep it soft while you verify that Stripe webhooks are delivering, then flip it to hard.
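When you do flip to hard enforcement, the request path should be ready to catch the hard-block. A minimal sketch, assuming only that the handler raises RuntimeError in fail-closed mode as documented (the wrapper name and fallback message are yours to choose):

```python
from typing import Callable

def invoke_with_quota_guard(invoke: Callable[[str], str], query: str) -> str:
    """Run a chain invocation, converting a quota hard-block into a
    graceful reply instead of a 500 surfaced to the end user.

    `invoke` is any callable wrapping chain.invoke; in fail-closed mode
    the handler raises RuntimeError before the LLM runs."""
    try:
        return invoke(query)
    except RuntimeError:
        # Workspace is over quota: degrade gracefully.
        return "This workspace is over its monthly query quota. Please contact your admin."
```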

Period rollover

Query counters reset automatically at each billing period rollover. The reset is driven by Stripe's customer.subscription.updated webhook. If the webhook is ever missed (network blip, endpoint downtime, misconfigured secret), check_query_quota has a self-healing fallback: it compares the stored period_end to now() and forces a reset if the period has lapsed, so a dropped webhook can never permanently lock a workspace out.

Going live (Stripe)

Flipping the dashboard from test mode to live mode is a four-step runbook. The readiness probe catches paste errors before a customer does.

  1. In the Stripe dashboard, switch to Live mode. Create the Team and Enterprise products + recurring prices. Live mode is a separate universe from test mode; the price IDs do not carry over. Copy the two price_live_… IDs.
  2. Update your deployment environment:
STRIPE_SECRET_KEY=sk_live_...
STRIPE_WEBHOOK_SECRET=whsec_...                   # from the live webhook endpoint
STRIPE_PRICE_ID_TEAM=price_live_...
STRIPE_PRICE_ID_ENTERPRISE=price_live_...
APP_BASE_URL=https://dash.example.com             # must NOT be localhost in live mode
  3. In Stripe → Developers → Webhooks, create a new live-mode endpoint at https://<your-dash>/stripe/webhook. Subscribe to checkout.session.completed, customer.subscription.updated, customer.subscription.deleted, and invoice.paid. Paste the signing secret into STRIPE_WEBHOOK_SECRET.
  4. Hit the readiness probe:
$ curl https://<your-dash>/health/billing

A fully-configured live deployment returns {"ok": true, "mode": "live", …} with a 200. Any misconfiguration comes back as 503 with an issues list:

{
  "ok": false,
  "mode": "live",
  "issues": [
    "STRIPE_WEBHOOK_SECRET is not set",
    "STRIPE_PRICE_ID_TEAM looks wrong, expected 'price_' prefix",
    "APP_BASE_URL must not be localhost in live mode"
  ],
  "summary": {
    "secret_key_prefix": "sk_live…",
    "webhook_secret_set": false,
    "supabase_configured": true
  }
}

The response sanitises every secret (only prefixes like sk_live… ever appear), so it's safe for an uptime monitor or status page to poll.

Programmatic callers get the same structure via BillingManager.readiness() returning a BillingReadiness dataclass.

from ragcompliance import BillingManager, BillingReadiness

rd: BillingReadiness = BillingManager().readiness()
if not rd.ok:
    raise SystemExit("refusing to start: " + ", ".join(rd.issues))

SOC 2 evidence

Most compliance teams can't sign off on a RAG pipeline without a written trail of what was retrieved, what was answered, and proof that the trail hasn't been tampered with. The built-in evidence generator produces a Markdown report mapped to the Trust Services Criteria controls ragcompliance actually has data for: CC6.1 (logical access), CC7.2 (system operation monitoring), CC8.1 (change management), A1.1 (availability), C1.1 (confidentiality).

$ python -m ragcompliance.soc2 \
    --workspace acme-prod \
    --start 2026-01-01 \
    --end 2026-03-31 \
    --sample 25 --seed 42 \
    --out acme-q1-2026-evidence.md

The report pulls records straight from rag_audit_logs, computes integrity stats (signed vs unsigned, unique sessions, avg latency, models observed), recomputes the SHA-256 signature on a random sample so an auditor can spot-check independently, and renders the control matrix and methodology section. It is not itself a SOC 2 attestation (only a licensed auditor can issue one), but it cuts audit-prep back-and-forth from weeks to minutes.

Sample size and confidence

The evidence report recomputes SHA-256 signatures on a random sample of records from the period. The default is 25 records, suitable for a quarterly compliance spot-check. For deeper due-diligence runs, raise it by passing --sample 100 or higher. Sampling is random but seeded via --seed for reproducibility, so an auditor re-running with the same inputs gets the same sample. A given run may not surface a specific tampered record if the tamper rate is low and the sample size is small; the relationship is the standard hypergeometric one. For exhaustive verification across the full period, pass a sample_size equal to the total record count or loop _verify_signature over every record programmatically.
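To size --sample for a target detection probability, the hypergeometric arithmetic is short enough to inline. A stdlib-only sketch (the scenario numbers at the bottom are illustrative):

```python
from math import comb

def detection_probability(total: int, tampered: int, sample: int) -> float:
    """P(a uniform random sample of `sample` records catches at least one
    of `tampered` bad records among `total`):
        1 - C(total - tampered, sample) / C(total, sample)
    math.comb(n, k) returns 0 when k > n, so an over-large sample
    correctly yields probability 1.0."""
    if tampered == 0:
        return 0.0
    return 1 - comb(total - tampered, sample) / comb(total, sample)

# e.g. 1,000 records in the period, 1% tampered, default sample of 25:
p = detection_probability(1000, 10, 25)
```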

Programmatic access is the same pipeline without argparse:

from ragcompliance.soc2 import generate_report

md = generate_report(
    workspace_id="acme-prod",
    start="2026-01-01",
    end="2026-03-31",
    sample_size=25,
    seed=42,
)

Async audit writes

Handler overhead is under 1ms at p50 (~38µs measured in isolation on a clean hot path). End-to-end chain latency depends on your retriever, LLM, and prompt; the handler's contribution is a small constant added on top of whatever your chain does.

Audit writes are fire-and-forget by default. save() enqueues the record onto a bounded in-memory queue and a single daemon worker drains it into Supabase, so the chain's hot path never blocks on audit I/O.

In benchmarks, per-chain audit-write overhead drops from roughly 1.2s (sync Supabase RTT) to well under 1ms (enqueue only), an improvement of roughly three orders of magnitude.

If Supabase is unreachable, records buffer in memory up to RAGCOMPLIANCE_ASYNC_MAX_QUEUE (default 1000) and then drop with a log warning rather than leak memory. On normal process exit an atexit hook drains pending records within RAGCOMPLIANCE_ASYNC_SHUTDOWN_TIMEOUT seconds (default 5). You can also call handler.storage.flush() explicitly in tests or your own shutdown path.

When to disable async: Set RAGCOMPLIANCE_ASYNC_WRITES=false for tests that inspect storage mid-chain, or for workloads where you'd rather pay the latency than risk any in-flight data loss on a crash.

Slack anomaly alerts

Set RAGCOMPLIANCE_SLACK_WEBHOOK_URL to a Slack incoming-webhook URL (or any compatible receiver: Discord, Teams via shim, your own HTTP endpoint) and the handler fires async alerts when a chain looks unhealthy. Four rules, all env-tunable:

| Rule | Fires when | Tuned by |
| --- | --- | --- |
| retrieval_returned_zero_chunks | Retriever returned no documents. | n/a |
| low_similarity | Best matching chunk scored below threshold. | RAGCOMPLIANCE_SLACK_MIN_SIMILARITY (default 0.3) |
| chain_slow | End-to-end latency exceeded threshold. | RAGCOMPLIANCE_SLACK_SLOW_CHAIN_MS (default 10000) |
| chain_errored | LangChain or LlamaIndex raised before the chain completed. | n/a |

Alerts post on a separate daemon worker with a bounded queue, so Slack outages can't back-pressure your chain. When the queue fills, alerts drop with a log warning. Set RAGCOMPLIANCE_SLACK_DASHBOARD_URL to include a View in dashboard link in each payload.
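A typical tuning for a latency-sensitive deployment might look like the following (all values illustrative):

```shell
RAGCOMPLIANCE_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/T000/B000/XXXX
RAGCOMPLIANCE_SLACK_MIN_SIMILARITY=0.5     # alert earlier on weak retrieval
RAGCOMPLIANCE_SLACK_SLOW_CHAIN_MS=5000     # tighter than the 10s default
RAGCOMPLIANCE_SLACK_DASHBOARD_URL=https://dash.example.com
```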

Deployment

The dashboard is a single FastAPI app. It's stateless (state lives in Supabase), so any container platform works identically.

Render (fastest)

  1. Create a new Web Service on render.com, pointing at your repo.
  2. Build command: pip install -e ".[supabase,dashboard,llamaindex,sso]"
  3. Start command: uvicorn ragcompliance.app:app --host 0.0.0.0 --port $PORT
  4. Copy every variable from .env.example into Render's environment settings.
  5. After the service is live, update the Stripe webhook endpoint to https://<your-render-url>/stripe/webhook.

Fly.io, Railway, Cloud Run

All three work identically. The app is a stateless container, no volumes, no sidecars. A minimal Dockerfile:

FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -e ".[supabase,dashboard,llamaindex,sso]"
CMD ["uvicorn", "ragcompliance.app:app", "--host", "0.0.0.0", "--port", "8000"]

API reference

RAGComplianceHandler

LangChain callback handler. Instantiate once per chat session or once per request, pass via config={"callbacks": [handler]}.

RAGComplianceHandler(
    config: RAGComplianceConfig,
    session_id: str | None = None,
)

LlamaIndexRAGComplianceHandler

LlamaIndex callback handler. Import from ragcompliance.llamaindex_handler. Attach via Settings.callback_manager = CallbackManager([handler]).

RAGComplianceConfig

Configuration dataclass. The common constructor is RAGComplianceConfig.from_env(), which reads every RAGCOMPLIANCE_* variable.

RAGComplianceConfig(
    supabase_url: str | None,
    supabase_key: str | None,
    workspace_id: str,
    dev_mode: bool = False,
    enforce_quota: bool = False,
    async_writes: bool = True,
    async_max_queue: int = 1000,
    async_shutdown_timeout: float = 5.0,
)

BillingManager / BillingReadiness

BillingManager(
    stripe_secret_key: str | None = None,
    stripe_webhook_secret: str | None = None,
    supabase_url: str | None = None,
    supabase_key: str | None = None,
    app_base_url: str | None = None,
)

BillingManager.readiness() -> BillingReadiness

@dataclass
class BillingReadiness:
    mode: str            # "live" | "test" | "unconfigured"
    ok: bool
    issues: list[str]
    summary: dict[str, Any]
    def to_dict(self) -> dict: ...

ragcompliance.soc2.generate_report

generate_report(
    workspace_id: str,
    start: str,                     # ISO date
    end: str,                       # ISO date
    sample_size: int = 25,
    seed: int | None = None,
    storage: AuditStorage | None = None,
) -> str

Environment variable reference

| Name | Default | Purpose |
| --- | --- | --- |
| RAGCOMPLIANCE_SUPABASE_URL | — | Supabase project URL. |
| RAGCOMPLIANCE_SUPABASE_KEY | — | Supabase service role key. |
| RAGCOMPLIANCE_WORKSPACE_ID | — | Tenant isolation key. |
| RAGCOMPLIANCE_DEV_MODE | false | true = log to stdout; false = persist to Supabase. |
| RAGCOMPLIANCE_ENFORCE_QUOTA | false | true raises RuntimeError when over limit. |
| RAGCOMPLIANCE_ASYNC_WRITES | true | Fire-and-forget audit writes. |
| RAGCOMPLIANCE_ASYNC_MAX_QUEUE | 1000 | Bounded in-memory buffer size. |
| RAGCOMPLIANCE_ASYNC_SHUTDOWN_TIMEOUT | 5.0 | Seconds to wait on atexit drain. |
| RAGCOMPLIANCE_SLACK_WEBHOOK_URL | — | Enables Slack alerts when set. |
| RAGCOMPLIANCE_SLACK_MIN_SIMILARITY | 0.3 | Threshold for low_similarity. |
| RAGCOMPLIANCE_SLACK_SLOW_CHAIN_MS | 10000 | Threshold for chain_slow. |
| RAGCOMPLIANCE_SLACK_DASHBOARD_URL | — | Appends a "View in dashboard" link to alert payloads. |
| RAGCOMPLIANCE_OIDC_ISSUER | — | OIDC provider URL (discovery endpoint). |
| RAGCOMPLIANCE_OIDC_CLIENT_ID | — | OIDC client ID. |
| RAGCOMPLIANCE_OIDC_CLIENT_SECRET | — | OIDC client secret. |
| RAGCOMPLIANCE_OIDC_REDIRECT_URI | — | Full callback URL (.../auth/callback). |
| RAGCOMPLIANCE_OIDC_ALLOWED_DOMAINS | — | Optional comma-separated email domain allowlist. |
| RAGCOMPLIANCE_SESSION_SECRET | — | Session cookie signing key. |
| STRIPE_SECRET_KEY | — | sk_live_… or sk_test_…. |
| STRIPE_WEBHOOK_SECRET | — | Endpoint signing secret from the Stripe webhook config. |
| STRIPE_PRICE_ID_TEAM | — | Stripe price ID (must start with price_). |
| STRIPE_PRICE_ID_ENTERPRISE | — | Stripe price ID (must start with price_). |
| APP_BASE_URL | — | Dashboard base URL. Must not be localhost in live mode. |

License & contact

MIT licensed. Source at github.com/dakshtrehan/ragcompliance. Issues, PRs, and questions welcome.

For enterprise support, private-deployment guidance, or SOC 2 prep help, email daksh.trehan@hotmail.com.

ragcompliance · v0.1.4 · 145 tests green · MIT