Security White Paper

How Privatiser anonymizes sensitive data, what it protects against, and where the limits of that protection are. Written for security teams, IT leads, and the curious.

Version 0.5.0 | March 2026 | admin@privatiser.net
Contents
  1. Overview
  2. Architecture
  3. Anonymization Pipeline
  4. Detection Patterns
  5. Threat Model
  6. Compliance Relevance
  7. Known Limitations
  8. Pro Licence Architecture
  9. FAQ

1. Overview

Privatiser is a privacy tool that strips sensitive information from text before it reaches an AI service. It replaces real values (IP addresses, API keys, passwords, names, credit card numbers, and dozens more) with consistent pseudonyms, and lets you swap the real values back in after you get a response.

The core design principle is simple: if data never leaves your machine, it cannot be leaked, logged, or trained on. Every part of Privatiser runs locally. There are no servers. There is no account. There is nothing phoning home.

What Privatiser does

  1. Your text: raw input with secrets.
  2. Privatiser: runs in your browser.
  3. AI service: never sees real values.
  4. Deanonymize: one click to restore.

The mapping between real values and pseudonyms is stored only in your browser's session memory, and is cleared when you close the tab. No mapping is ever persisted to a server or a database.

2. Architecture

Privatiser is distributed as a static website, browser extension, and VS Code extension. In all three cases, the anonymization logic is the same JavaScript engine running entirely on your machine.

Data boundary

Your machine: Browser, VS Code, or web tool. All processing happens here. Mapping stored in session memory only.

No data crosses this line.

External servers: Privatiser makes no network requests when it anonymizes or restores text. Your text and mappings never reach any server we operate.

Installation methods

The web tool is a static HTML/CSS/JS page. The anonymization engine (privatiser.min.js) is loaded from the same origin as the page. No external scripts, no CDN dependencies, no analytics. You can verify by opening DevTools and checking the Network tab: after the initial page load, no further requests are made when you anonymize text.
The browser extension is published on the Chrome Web Store and Firefox Add-ons. It uses the activeTab permission (it only runs when you invoke it), storage (to save your settings locally), and contextMenus/menus (for the right-click menu). It declares no host permissions beyond the AI sites it supports, and makes no external network requests.
The VS Code extension runs inside VS Code's extension sandbox. It uses the same JavaScript engine as the web tool, loaded from the extension bundle. It has no network activation events and no telemetry. The package.json declares no external dependencies beyond the VS Code API.

3. Anonymization Pipeline

The anonymization engine processes text in a single deterministic pass. Here is what happens from the moment you click Anonymize to the moment you see output.

Processing pipeline

  1. Allowlist filter. Values on the allowlist are flagged and skipped. They will never be replaced regardless of what patterns match.
  2. Custom words. User-defined terms are replaced first. Case-insensitive, whole-word match. These run before any built-in patterns.
  3. Pattern matching. 30+ built-in patterns run in priority order, high-confidence first. Each match is immediately marked so later patterns cannot re-match it.
  4. Placeholder swap. Matched values are replaced with internal null-byte markers. This prevents cross-pattern interference (e.g. a domain pattern re-matching inside an already-redacted email).
  5. Pseudonym mapping. Each unique real value gets a consistent pseudonym (IP_1, EMAIL_1, etc.). The same value appearing multiple times always maps to the same pseudonym.
  6. Output. Anonymized text is returned. The mapping is held in browser session memory only, cleared when the tab is closed.
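
As a rough illustration of steps 1 and 2, the sketch below replaces user-defined terms with a case-insensitive, whole-word match and skips anything on the allowlist. The names applyCustomWords and escapeRegExp and the CUSTOM_n pseudonym format are assumptions made for this example, not the engine's actual API.

```javascript
// Hypothetical helpers: a sketch of the allowlist and custom-word steps only.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function applyCustomWords(text, customWords, allowlist) {
  const allowed = new Set(allowlist.map((v) => v.toLowerCase()));
  const mapping = new Map(); // lower-cased term -> pseudonym
  let out = text;
  for (const word of customWords) {
    // Allowlisted values are never replaced.
    if (allowed.has(word.toLowerCase())) continue;
    // \b gives whole-word matching for terms that start and end with
    // a word character; the "i" flag makes the match case-insensitive.
    const re = new RegExp(`\\b${escapeRegExp(word)}\\b`, 'gi');
    out = out.replace(re, (match) => {
      const key = match.toLowerCase();
      if (!mapping.has(key)) mapping.set(key, `CUSTOM_${mapping.size + 1}`);
      return mapping.get(key);
    });
  }
  return { text: out, mapping };
}
```

With the custom word "falcon", this replaces "Falcon" and the "falcon" in "falcon-2", but leaves "Falconry" alone because the match has to end on a word boundary.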

Why placeholder markers matter

Each match is immediately replaced with a null-byte marker before the next pattern runs. This prevents a later, lower-priority pattern from matching inside a value that has already been handled. For example, an email address contains a domain name. Without markers, a domain pattern could match the domain part of an already-redacted email, producing a confusing double-redaction.
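
The sketch below shows the marker idea with two deliberately simplified patterns. The regexes, the function name anonymizeWithMarkers, and the marker layout (a slot index wrapped in null bytes) are assumptions for illustration; the real patterns and marker format are not documented here.

```javascript
// Illustrative patterns in priority order: email first, then domain.
const PATTERNS = [
  { label: 'EMAIL', re: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: 'DOMAIN', re: /\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b/gi },
];

function anonymizeWithMarkers(text) {
  const slots = []; // slots[i] holds the label and real value behind marker i
  let work = text;
  for (const { label, re } of PATTERNS) {
    work = work.replace(re, (value) => {
      slots.push({ label, value });
      // The marker contains no dots or letters, so the domain pattern
      // cannot match inside an email that was already handled.
      return `\u0000${slots.length - 1}\u0000`;
    });
  }
  // Final pass: turn markers into pseudonyms, one counter per label.
  // (A real engine would also need to cope with input that already
  // contains null bytes; this sketch does not.)
  const seen = new Map(); // "LABEL:value" -> pseudonym
  const counters = {};
  return work.replace(/\u0000(\d+)\u0000/g, (_, i) => {
    const { label, value } = slots[Number(i)];
    const key = `${label}:${value}`;
    if (!seen.has(key)) {
      counters[label] = (counters[label] || 0) + 1;
      seen.set(key, `${label}_${counters[label]}`);
    }
    return seen.get(key);
  });
}
```

Running this on "Mail ops@corp.example.com or see wiki.corp.example.com" yields one EMAIL pseudonym and one DOMAIN pseudonym; the domain inside the email is never touched a second time.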

Consistent pseudonyms

If the same IP address appears three times in your text, it will always be replaced with the same pseudonym (for example, IP_1). This is essential for the AI to reason about your text correctly. The mapping is stored in a JavaScript Map object in browser memory for the duration of your session.
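
A minimal sketch of such a mapping, assuming one counter per category and a second Map for the reverse lookup used when restoring values. The class name PseudonymMap and its methods are hypothetical.

```javascript
// Hypothetical in-memory mapping; nothing here is persisted anywhere.
class PseudonymMap {
  constructor() {
    this.forward = new Map();  // "category\0value" -> pseudonym
    this.reverse = new Map();  // pseudonym -> real value
    this.counters = new Map(); // category -> last number issued
  }

  // Return the existing pseudonym for a value, or issue the next one.
  pseudonymFor(category, value) {
    const key = `${category}\u0000${value}`;
    if (!this.forward.has(key)) {
      const n = (this.counters.get(category) || 0) + 1;
      this.counters.set(category, n);
      const pseudonym = `${category}_${n}`;
      this.forward.set(key, pseudonym);
      this.reverse.set(pseudonym, value);
    }
    return this.forward.get(key);
  }

  // Swap known pseudonyms back to real values; unknown tokens are left alone.
  deanonymize(text) {
    return text.replace(/\b[A-Z][A-Z0-9_]*_\d+\b/g, (token) =>
      this.reverse.has(token) ? this.reverse.get(token) : token
    );
  }
}
```

Matching whole tokens on word boundaries means IP_1 is never restored inside IP_10, which matters once a session contains ten or more values in one category.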

4. Detection Patterns

Privatiser ships with 30+ detection patterns organized into 6 categories. Each pattern has a confidence level based on the risk of a false positive.

Pattern                   Category     Confidence  Example
Connection strings        Secrets      High        postgres://user:pass@host/db
JWT tokens                Secrets      High        eyJhbGci...
PEM / SSH keys            Secrets      High        -----BEGIN RSA PRIVATE KEY-----
AWS Access Keys           AWS          High        AKIA...
AWS ARNs                  AWS          High        arn:aws:iam::123456789012:user/...
Credit cards (Luhn)       PII          High        4111 1111 1111 1111
SSNs                      PII          High        123-45-6789
IBANs                     PII          High        GB29NWBK60161331926819
Bearer tokens             Secrets      High        Bearer sk-abc123...
IPv4 addresses            Network      Medium      192.168.1.1
Email addresses           PII          Medium      user@company.com
UUIDs                     Identifiers  Medium      550e8400-e29b-41d4-...
Phone numbers             PII          Medium      +44 7911 123456
Azure / GCP resource IDs  Cloud        Medium      /subscriptions/abc.../...
S3 bucket URLs            AWS          Medium      s3://my-bucket/path
MAC addresses             Network      Medium      00:1A:2B:3C:4D:5E
Keyword-based secrets     Secrets      Low         api_key = "abc123"
Domain names              Network      Low         internal.corp.net
Private URLs              Network      Low         http://10.0.0.1/admin

Table shows a selection. Privatiser detects 30+ pattern types in total.
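
The "Credit cards (Luhn)" row refers to the standard Luhn checksum, which is what lets a card pattern be high-confidence rather than matching any 16-digit number. A minimal sketch follows; the function name passesLuhn and the 13 to 19 digit length bound are assumptions for this example.

```javascript
// Standard Luhn check: double every second digit from the right,
// subtract 9 from any result above 9, and require the sum to end in 0.
function passesLuhn(candidate) {
  const digits = candidate.replace(/[ -]/g, ''); // allow spaces and dashes
  if (!/^\d{13,19}$/.test(digits)) return false;
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}
```

The table's example value 4111 1111 1111 1111 passes this check, while changing its last digit to 2 makes it fail.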

Confidence levels

High means the pattern is highly specific and false positives are rare. These run first.

Medium means the pattern is reliable but may occasionally match something benign.

Low means the pattern is intentionally broad. It catches more, but may redact things you did not intend. Use the allowlist to exclude known safe values.

5. Threat Model

Privatiser is designed to address a specific, realistic threat: a staff member pasting internal data into a commercial AI assistant. Here is how it handles the most common attack surfaces in that scenario.

AI service logs your input: Blocked

The AI service only ever sees anonymized text. Real values never leave your machine.

AI vendor trains on your data: Blocked

Pseudonyms cannot be reversed by the AI vendor. They have no mapping table.

Clipboard interception: Partial

Privatiser does not intercept clipboard access. If you paste anonymized text, only pseudonyms are exposed, but the real values remain in your clipboard history.

Browser extension compromise: Partial

If the extension itself were compromised, it could read the text you give it. The extension is published on the Chrome Web Store and Firefox Add-ons where it can be independently reviewed.

Missed detection: Partial

Non-standard formats or custom naming conventions may not be caught. Use custom words and review output before sharing sensitive material.

Mapping table leakage: Blocked

Mappings are held in JavaScript session memory only. They are not written to disk, localStorage, or any server.

What Privatiser does not protect against

No local tool can fully substitute for a proper data governance policy. Privatiser reduces risk significantly for the most common use cases, but should be part of a broader security approach.

6. Compliance Relevance

Privatiser is not itself a compliance product and does not make any regulatory certification claims. That said, its architecture directly supports a number of common compliance requirements around data handling and AI usage.

GDPR: No personal data leaves your machine. No third-party processors involved. Processing happens in the user's browser only.
HIPAA: PHI is redacted before reaching any AI service. No BAA required because covered data never leaves the covered entity's environment.
FCA / SRA: Client data, matter references, and financial identifiers can be stripped before AI use, keeping client confidentiality intact.
ISO 27001: Supports data minimisation controls. Helps enforce a policy that AI tools do not receive production credentials or PII.
SOC 2: Reduces the blast radius of an AI vendor incident. Pseudonyms in vendor logs cannot be mapped back to real identifiers.
PCI DSS: Card numbers are detected using Luhn validation and replaced before any text reaches an AI service.

For enterprise deployments, a shared configuration file can be pre-loaded with the organisation's custom words, allowlist, and enabled categories via MDM or group policy. This enforces consistent anonymization across the whole team without relying on individuals to configure it correctly. Contact admin@privatiser.net to discuss a team deployment.

7. Known Limitations

Being honest about what Privatiser does not do is as important as describing what it does.

Patterns are written against common, well-documented formats. If your organisation uses custom identifier formats (for example, internal account numbers or reference codes with no standard structure), those will not be detected automatically. Use the custom words feature to add them.
Patterns like "keyword-based secrets" and "domain names" are intentionally broad. They may redact values that are not actually sensitive. You can turn off individual categories in Settings, or add false positives to the allowlist.
Privatiser processes text only. Screenshots, PDFs, Word documents, and other binary formats are not scanned. If you export a config file to PDF and upload it to an AI, Privatiser will not help.
Closing the browser tab clears the mapping. If you need to deanonymize an AI response after closing the tab, you will need to re-run the original text through Privatiser to rebuild the mapping. Pseudonyms are numbered in order of detection, so the same input with the same settings produces the same pseudonyms, but nothing is carried over from one session to the next.
Always check the anonymized output before sending it, especially for high-stakes content. The mapping table shows you exactly what was detected and replaced. If something sensitive is still visible, add it to custom words and re-run.

8. Pro Licence Architecture

Privatiser Pro adds paid features (custom regex, pattern packs, saved presets). The licence system uses public-key cryptography so verification happens entirely in your browser — no licence server, no account, no phone-home.

Key anatomy

PRV1-MjAyNi1h...base64url...Zg==
PRV1- — version prefix
4 bytes — Unix timestamp expiry (big-endian uint32)
64 bytes — Ed25519 signature over privatiser-licence-{expiry}

The total payload is 68 bytes, giving a fixed-length base64url string. Both the expiry timestamp and signature must pass independently. Altering the expiry bytes invalidates the signature.
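
The layout above can be unpacked with standard browser APIs. The sketch below only splits the payload and reads the expiry; it does not verify the signature. The function name parseLicenceKey and its return shape are assumptions for this example.

```javascript
// Hypothetical parser for the PRV1- layout: 4-byte expiry + 64-byte signature.
function parseLicenceKey(key, nowSeconds = Math.floor(Date.now() / 1000)) {
  if (!key.startsWith('PRV1-')) throw new Error('unknown key version');
  // Convert base64url to standard base64 so atob can decode it.
  const b64 = key.slice(5).replace(/-/g, '+').replace(/_/g, '/');
  const bytes = Uint8Array.from(atob(b64), (c) => c.charCodeAt(0));
  if (bytes.length !== 68) throw new Error('payload must be 68 bytes');
  // Bytes 0..3: Unix expiry timestamp, big-endian uint32.
  const expiry = new DataView(bytes.buffer).getUint32(0, false);
  return {
    expiry,
    signature: bytes.slice(4), // bytes 4..67: Ed25519 signature
    expired: expiry <= nowSeconds,
  };
}
```

Because the expiry is part of the signed message, a parser like this is only a first step: the timestamp it returns should not be trusted until the signature has been checked against it.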

Payment and delivery flow

  1. Payment. Customer pays via Stripe Checkout (Stripe-hosted page). Card details never touch Privatiser infrastructure.
  2. Webhook. Stripe fires a checkout.session.completed event to a Cloudflare Worker endpoint over HTTPS.
  3. Signature verification. The Worker validates the Stripe webhook signature using HMAC-SHA256 with timing-safe comparison and a 5-minute replay window. Replayed or tampered events are rejected.
  4. Key generation. The Worker signs privatiser-licence-{expiry} with the Ed25519 private key stored as an encrypted Cloudflare Worker Secret. The key never appears in source code.
  5. Delivery. The PRV1- key is emailed to the customer via Resend with a one-click activation link.
  6. Browser activation. Customer enters the key on /activate.html. The browser verifies the Ed25519 signature using the embedded public key via crypto.subtle.verify. No server call is made. On success the key is saved to localStorage.
  7. Feature gating. Each page load calls isProUser(), which re-verifies the stored key locally. Pro features unlock instantly. No network request is ever made during normal use.
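
The signature verification step in this flow follows Stripe's documented webhook scheme: the Stripe-Signature header carries a timestamp (t=) and one or more v1= signatures, each an HMAC-SHA256 of "timestamp.rawBody" keyed with the endpoint's signing secret. The sketch below uses Node's crypto module for brevity; a Cloudflare Worker would more likely use Web Crypto, and the function name verifyStripeSignature is an assumption, not the Worker's actual code.

```javascript
const crypto = require('node:crypto');

const TOLERANCE_SECONDS = 5 * 60; // the 5-minute replay window

function verifyStripeSignature(rawBody, header, secret,
                               nowSeconds = Math.floor(Date.now() / 1000)) {
  // Header format: "t=1700000000,v1=<hex>[,v1=<hex>...]"
  let timestamp = null;
  const candidates = [];
  for (const part of header.split(',')) {
    const idx = part.indexOf('=');
    const name = part.slice(0, idx).trim();
    const value = part.slice(idx + 1).trim();
    if (name === 't') timestamp = value;
    if (name === 'v1') candidates.push(value);
  }
  if (!/^\d+$/.test(timestamp || '') || candidates.length === 0) return false;

  // Reject events outside the replay window.
  if (Math.abs(nowSeconds - Number(timestamp)) > TOLERANCE_SECONDS) return false;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${rawBody}`, 'utf8')
    .digest();

  // timingSafeEqual requires equal lengths, so check length first.
  return candidates.some((hex) => {
    const given = Buffer.from(hex, 'hex');
    return given.length === expected.length &&
      crypto.timingSafeEqual(given, expected);
  });
}
```

The raw request body must be used exactly as received. Parsing and re-serialising the JSON before hashing would change the bytes and make valid events fail verification.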

Cryptographic properties

Unforgeable: Ed25519 signatures cannot be produced without the private key. Extracting the public key from browser JS does not help, because it can only verify, not sign.
Expiry-bound: The signed message includes a Unix timestamp. A key with a past expiry fails immediately. Altering the timestamp bytes in the stored token also invalidates the signature.
Reusable: Keys are not consumed on use. Clearing your browser does not invalidate the key. Enter it again to re-activate. Deliberate trade-off: no revocation infrastructure.
Offline-first: Once issued, verification is fully local. The anonymization tool and Pro features work without any network access.
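
These properties can be demonstrated end to end with a throwaway key pair. The sketch below uses Node's crypto module rather than the browser's crypto.subtle, and assumes the signed message is the expiry rendered as a decimal string; the exact encoding of {expiry} is not specified here, and the names issueKey and isValidKey are hypothetical.

```javascript
const { sign, verify } = require('node:crypto');

// Assumption: {expiry} is the Unix timestamp written as a decimal string.
const messageFor = (expiry) =>
  Buffer.from(`privatiser-licence-${expiry}`, 'utf8');

// Issuer side: pack 4-byte expiry + 64-byte Ed25519 signature.
function issueKey(privateKey, expiry) {
  const payload = Buffer.alloc(68);
  payload.writeUInt32BE(expiry, 0);
  sign(null, messageFor(expiry), privateKey).copy(payload, 4);
  return 'PRV1-' + payload.toString('base64url');
}

// Client side: needs only the public key, so it can verify but not sign.
function isValidKey(publicKey, key, nowSeconds) {
  if (!key.startsWith('PRV1-')) return false;
  const payload = Buffer.from(key.slice(5), 'base64url');
  if (payload.length !== 68) return false;
  const expiry = payload.readUInt32BE(0);
  if (expiry <= nowSeconds) return false; // past expiry fails immediately
  return verify(null, messageFor(expiry), publicKey, payload.subarray(4));
}
```

Using a pair from generateKeyPairSync('ed25519'), a key issued with a future expiry verifies, while the same key with its first four bytes overwritten by a later timestamp does not, because the signature no longer matches the message rebuilt from the altered expiry.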

What the Worker does and does not see

9. FAQ

Is my text sent to your servers?
No. The tool runs entirely in your browser. The code that processes your text is downloaded when you load the page, but then executes locally. We have no server receiving your input.

Can the browser extension read every page I visit?
No. The extension only activates on a declared list of AI sites (ChatGPT, Claude, Gemini, Copilot, Perplexity, and about 25 others). It requests the minimum permissions needed: activeTab, storage, and contextMenus. It does not have broad host permissions.

How are pseudonyms generated?
Each category has a counter. The first IP address detected becomes IP_1, the second becomes IP_2, and so on. The same real value always maps to the same pseudonym within a session. Pseudonyms contain no information about the original value.

Can the AI vendor reverse the pseudonyms?
No. Pseudonyms like IP_1 or API_KEY_3 carry no structural information about the original value. The mapping exists only in your browser's session memory. The AI vendor has no access to it.

What happens to the mapping when I close the tab?
It is cleared entirely. Mappings are held in a JavaScript Map in session memory. Nothing is written to localStorage, cookies, or any server. Closing the tab is a hard reset.

Questions not answered here? Email admin@privatiser.net. For enterprise security reviews or pen test requests, please include your organisation name and a brief description of your requirements.