Security White Paper

1. Overview

Privatiser is a privacy tool that strips sensitive information from text before it reaches an AI service. It replaces real values (IP addresses, API keys, passwords, names, credit card numbers, and dozens more) with consistent pseudonyms, and lets you swap the real values back in after you get a response.

The core design principle is simple: if data never leaves your machine, it cannot be leaked, logged, or trained on. Every part of Privatiser that touches your text runs locally. No server receives your input. There is no account. There is nothing phoning home.

What Privatiser does

1. Your text: raw input with secrets
2. Privatiser: runs in your browser
3. AI service: never sees real values
4. Deanonymize: one click to restore

The mapping between real values and pseudonyms is stored only in your browser's session memory, and is cleared when you close the tab. No mapping is ever persisted to a server or a database.

2. Architecture

Privatiser is distributed as a static website, browser extension, and VS Code extension. In all three cases, the anonymization logic is the same JavaScript engine running entirely on your machine.

Data boundary

Your machine: Browser, VS Code, or web tool. All processing happens here. Mapping stored in session memory only.

NO DATA CROSSES THIS LINE

External servers: Privatiser makes zero network requests. Your text and mappings never reach any server we operate.

Installation methods

Web tool (privatiser.net)

The web tool is a static HTML/CSS/JS page. The anonymization engine (privatiser.min.js) is loaded from the same origin as the page. No external scripts, no CDN dependencies, no analytics. You can verify by opening DevTools and checking the Network tab: after the initial page load, no further requests are made when you anonymize text.

Browser extension (Chrome, Firefox, Edge)

The extension is published on the Chrome Web Store and Firefox Add-ons. It uses activeTab permission (only runs when you invoke it), storage (saves your settings locally), and contextMenus/menus (right-click menu). It declares no host permissions beyond the AI sites it supports, and makes no external network requests.

VS Code extension

The VS Code extension runs inside VS Code's extension sandbox. It uses the same JavaScript engine as the web tool, loaded from the extension bundle. It has no network activation events and no telemetry. The package.json declares no external dependencies beyond the VS Code API.

3. Anonymization Pipeline

The anonymization engine processes text in a single deterministic pass. Here is what happens from the moment you click Anonymize to the moment you see output.

Processing pipeline

1. Allowlist filter: Values on the allowlist are flagged and skipped. They will never be replaced regardless of what patterns match.
2. Custom words: User-defined terms are replaced first. Case-insensitive, whole-word match. These run before any built-in patterns.
3. Pattern matching: 30+ built-in patterns run in priority order, high-confidence first. Each match is immediately marked so later patterns cannot re-match it.
4. Placeholder swap: Matched values are replaced with internal null-byte markers. This prevents cross-pattern interference (e.g. a domain pattern re-matching inside an already-redacted email).
5. Pseudonym mapping: Each unique real value gets a consistent pseudonym (IP_1, EMAIL_1, etc.). The same value appearing multiple times always maps to the same pseudonym.
6. Output: Anonymized text is returned. The mapping is held in browser session memory only, cleared when the tab is closed.
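The first two stages (allowlist filter and custom words) can be illustrated with a minimal sketch. This is not the engine's actual code: the function names, the CUSTOM_n pseudonym label, and the way allowlisted terms are skipped are assumptions made for the example. Note that the `\b` whole-word boundary only behaves as expected when a term starts and ends with a letter, digit, or underscore.

```javascript
// Hypothetical sketch of stages 1-2: allowlist filter, then custom words.

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace user-defined terms (case-insensitive, whole-word), skipping
// any term that also appears on the allowlist.
function replaceCustomWords(text, customWords, allowlist) {
  const allowed = new Set(allowlist.map((v) => v.toLowerCase()));
  const mapping = new Map(); // lower-cased term -> pseudonym
  let output = text;
  for (const word of customWords) {
    if (allowed.has(word.toLowerCase())) continue; // allowlist wins
    const pattern = new RegExp(`\\b${escapeRegExp(word)}\\b`, "gi");
    output = output.replace(pattern, () => {
      const key = word.toLowerCase();
      if (!mapping.has(key)) mapping.set(key, `CUSTOM_${mapping.size + 1}`);
      return mapping.get(key);
    });
  }
  return { output, mapping };
}

const result = replaceCustomWords(
  "Project Falcon ships Friday. falcon depends on Acme.",
  ["Falcon", "Acme"], // custom words
  ["Acme"] // allowlist
);
console.log(result.output);
// Project CUSTOM_1 ships Friday. CUSTOM_1 depends on Acme.
```

Both spellings of "Falcon" map to the same pseudonym, while "Acme" is left alone because it is allowlisted.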

Why placeholder markers matter

Each match is immediately replaced with a null-byte marker before the next pattern runs. This prevents a later, lower-priority pattern from matching inside a value that has already been handled. For example, an email address contains a domain name. Without markers, a domain pattern could match the domain part of an already-redacted email, producing a confusing double-redaction.
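To make the marker idea concrete, here is a small sketch with two illustrative patterns. The regular expressions, the marker layout (a null byte, an index, a null byte), and the `[LABEL]` output format are assumptions for this example only; the real engine's patterns and marker format are not documented here.

```javascript
// Hypothetical sketch: null-byte markers stop a later pattern from
// re-matching inside a value that an earlier pattern already handled.

const PATTERNS = [
  // Higher-priority pattern first: emails before bare domains.
  { label: "EMAIL", regex: /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g },
  { label: "DOMAIN", regex: /\b[a-z0-9-]+(?:\.[a-z0-9-]+)+\b/gi },
];

function anonymizeWithMarkers(text) {
  const held = []; // values already matched, indexed by marker number
  let working = text;

  for (const { label, regex } of PATTERNS) {
    working = working.replace(regex, (match) => {
      held.push({ label, value: match });
      // The marker is <NUL><index><NUL>; neither pattern above can match it.
      return `\0${held.length - 1}\0`;
    });
  }

  // Final swap: turn each marker into a visible label.
  const output = working.replace(/\0(\d+)\0/g, (_, index) => {
    return `[${held[Number(index)].label}]`;
  });
  return { output, held };
}

const demo = anonymizeWithMarkers(
  "Mail ops@internal.corp.net or visit wiki.corp.net."
);
console.log(demo.output);
// Mail [EMAIL] or visit [DOMAIN].
```

The email is captured whole by the first pattern, so the domain pattern only sees the marker and cannot match `internal.corp.net` a second time.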

Consistent pseudonyms

If the same IP address appears three times in your text, it will always be replaced with the same pseudonym (for example, IP_1). This is essential for the AI to reason about your text correctly. The mapping is stored in a JavaScript Map object in browser memory for the duration of your session.
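A per-session pseudonym table along these lines can be sketched with two Map objects and a counter per category. The class name, method names, and the simplified IPv4 pattern are hypothetical; only the use of a JavaScript Map and the IP_1 style of pseudonym come from the description above.

```javascript
// Hypothetical sketch of a per-session pseudonym table.
class PseudonymTable {
  constructor() {
    this.forward = new Map(); // real value -> pseudonym
    this.reverse = new Map(); // pseudonym -> real value
    this.counters = new Map(); // category label -> last number issued
  }

  // Return the existing pseudonym for a value, or issue the next one.
  pseudonymFor(label, value) {
    if (this.forward.has(value)) return this.forward.get(value);
    const next = (this.counters.get(label) || 0) + 1;
    this.counters.set(label, next);
    const pseudonym = `${label}_${next}`;
    this.forward.set(value, pseudonym);
    this.reverse.set(pseudonym, value);
    return pseudonym;
  }

  // Swap known pseudonyms in an AI response back to the real values.
  deanonymize(text) {
    return text.replace(/\b[A-Z_]+_\d+\b/g, (token) =>
      this.reverse.has(token) ? this.reverse.get(token) : token
    );
  }
}

const table = new PseudonymTable();
const ipPattern = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g; // simplified IPv4 pattern
const input = "10.0.0.5 cannot reach 10.0.0.9; retry from 10.0.0.5";
const anonymized = input.replace(ipPattern, (ip) => table.pseudonymFor("IP", ip));
console.log(anonymized);
// IP_1 cannot reach IP_2; retry from IP_1

const restored = table.deanonymize("Check the route between IP_1 and IP_2.");
console.log(restored);
// Check the route between 10.0.0.5 and 10.0.0.9.
```

Because the table lives in an ordinary object in memory, it disappears with the tab, which matches the session-only behaviour described in this document.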

4. Detection Patterns

Privatiser ships with 30+ detection patterns organized into 6 categories. Each pattern has a confidence level based on the risk of a false positive.

Pattern	Category	Confidence	Example
Connection strings	Secrets	High	postgres://user:pass@host/db
JWT tokens	Secrets	High	eyJhbGci...
PEM / SSH keys	Secrets	High	-----BEGIN RSA PRIVATE KEY-----
AWS Access Keys	AWS	High	AKIA...
AWS ARNs	AWS	High	arn:aws:iam::123456789012:user/...
Credit cards (Luhn)	PII	High	4111 1111 1111 1111
SSNs	PII	High	123-45-6789
IBANs	PII	High	GB29NWBK60161331926819
Bearer tokens	Secrets	High	Bearer sk-abc123...
IPv4 addresses	Network	Medium	192.168.1.1
Email addresses	PII	Medium	user@company.com
UUIDs	Identifiers	Medium	550e8400-e29b-41d4-...
Phone numbers	PII	Medium	+44 7911 123456
Azure / GCP resource IDs	Cloud	Medium	/subscriptions/abc.../...
S3 bucket URLs	AWS	Medium	s3://my-bucket/path
MAC addresses	Network	Medium	00:1A:2B:3C:4D:5E
Keyword-based secrets	Secrets	Low	api_key = "abc123"
Domain names	Network	Low	internal.corp.net
Private URLs	Network	Low	http://10.0.0.1/admin

Table shows a selection. Privatiser detects 30+ pattern types in total.
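The "Credit cards (Luhn)" row refers to the standard Luhn checksum, which filters out digit runs that merely look like card numbers. A minimal sketch follows; the function name and the 13 to 19 digit length range are assumptions, not details taken from the engine.

```javascript
// Luhn checksum: returns true when the digit string passes the check.
// Spaces and hyphens between digit groups are ignored.
function passesLuhn(candidate) {
  const digits = candidate.replace(/[ -]/g, "");
  if (!/^\d{13,19}$/.test(digits)) return false; // assumed length range

  let sum = 0;
  let doubleNext = false; // the rightmost digit is not doubled
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9; // same as adding the two digits of the product
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

console.log(passesLuhn("4111 1111 1111 1111")); // true
console.log(passesLuhn("4111 1111 1111 1112")); // false
```

Pairing a regular expression with a checksum like this is what lets a card pattern sit in the High confidence tier: a random 16-digit number has roughly a one in ten chance of passing.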

Confidence levels

High means the pattern is highly specific and false positives are rare. These run first.

Medium means the pattern is reliable but may occasionally match something benign.

Low means the pattern is intentionally broad. It catches more, but may redact things you did not intend. Use the allowlist to exclude known safe values.

5. Threat Model

Privatiser is designed to address a specific, realistic threat: a staff member pasting internal data into a commercial AI assistant. Here is how it handles the most common attack surfaces in that scenario.

AI service logs your input (Blocked): The AI service only ever sees anonymized text. Real values never leave your machine.

AI vendor trains on your data (Blocked): Pseudonyms cannot be reversed by the AI vendor. They have no mapping table.

Clipboard interception (Partial): Privatiser does not intercept clipboard access. If you paste anonymized text, only pseudonyms are exposed. But real values remain in your clipboard history.

Browser extension compromise (Partial): If the extension itself were compromised, it could read text. The extension is published on the Chrome Web Store and Firefox Add-ons where it can be independently reviewed.

Missed detection (Partial): Non-standard formats or custom naming conventions may not be caught. Use custom words and review output before sharing sensitive material.

Mapping table leakage (Blocked): Mappings are held in JavaScript session memory only. They are not written to disk, localStorage, or any server.

What Privatiser does not protect against

Malware or spyware running on your machine that reads memory or keystrokes
Screen sharing or recording that captures your original text before anonymization
Sensitive data embedded in images or binary files
Secrets with custom naming schemes not covered by any pattern

No local tool can fully substitute for a proper data governance policy. Privatiser reduces risk significantly for the most common use cases, but should be part of a broader security approach.

6. Compliance Relevance

Privatiser is not itself a compliance product and does not make any regulatory certification claims. That said, its architecture directly supports a number of common compliance requirements around data handling and AI usage.

GDPR: No personal data leaves your machine. No third-party processors involved. Processing happens in the user's browser only.

HIPAA: PHI is redacted before reaching any AI service. No BAA required because covered data never leaves the covered entity's environment.

FCA / SRA: Client data, matter references, and financial identifiers can be stripped before AI use, keeping client confidentiality intact.

ISO 27001: Supports data minimisation controls. Helps enforce a policy that AI tools do not receive production credentials or PII.

SOC 2: Reduces the blast radius of an AI vendor incident. Pseudonyms in vendor logs cannot be mapped back to real identifiers.

PCI DSS: Card numbers are detected using Luhn validation and replaced before any text reaches an AI service.

For enterprise deployments, a shared configuration file can be pre-loaded with the organisation's custom words, allowlist, and enabled categories via MDM or group policy. This enforces consistent anonymization across the whole team without relying on individuals to configure it correctly. Contact admin@privatiser.net to discuss a team deployment.

7. Known Limitations

Being honest about what Privatiser does not do is as important as describing what it does.

It will miss secrets in unusual formats

Patterns are written against common, well-documented formats. If your organisation uses custom identifier formats (for example, internal account numbers or reference codes with no standard structure), those will not be detected automatically. Use the custom words feature to add them.

Low-confidence patterns can cause false positives

Patterns like "keyword-based secrets" and "domain names" are intentionally broad. They may redact values that are not actually sensitive. You can turn off individual categories in Settings, or add false positives to the allowlist.

It does not handle images or binary files

Privatiser processes text only. Screenshots, PDFs, Word documents, and other binary formats are not scanned. If you export a config file to PDF and upload it to an AI, Privatiser will not help.

The mapping is not persistent across sessions

Closing the browser tab clears the mapping. If you need to deanonymize an AI response after closing the tab, you will need to re-run the original text through Privatiser to rebuild the same mapping. This works because pseudonyms are assigned in order of detection, so the same input with the same settings yields the same pseudonyms. The mapping itself is never carried over from one session to the next.

It is not a substitute for reviewing output

Always check the anonymized output before sending it, especially for high-stakes content. The mapping table shows you exactly what was detected and replaced. If something sensitive is still visible, add it to custom words and re-run.

8. Pro Licence Architecture

Privatiser Pro adds paid features (custom regex, pattern packs, saved presets). The licence system uses public-key cryptography so verification happens entirely in your browser — no licence server, no account, no phone-home.

Key anatomy

PRV1-MjAyNi1h...base64url...Zg==

PRV1- — version prefix

4 bytes — Unix timestamp expiry (big-endian uint32)

64 bytes — Ed25519 signature over privatiser-licence-{expiry}

The total payload is 68 bytes, giving a fixed-length base64url string. Both the expiry timestamp and signature must pass independently. Altering the expiry bytes invalidates the signature.
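The layout above can be read back with a short parser. This is a sketch under stated assumptions: the function names are hypothetical, the expiry is taken to occupy the first 4 bytes with the signature following it, and the parser accepts keys with or without trailing `=` padding because the source does not say which form real keys use. It checks structure only and does not verify the signature.

```javascript
// Hypothetical parser for the PRV1- key layout described above.

// Decode base64url text (with or without padding) into bytes.
function base64UrlToBytes(text) {
  const base64 = text.replace(/-/g, "+").replace(/_/g, "/").replace(/=+$/, "");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  const binary = atob(padded);
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

// Split a key into its expiry and signature, or return null if malformed.
function parseLicenceKey(key) {
  const PREFIX = "PRV1-";
  if (!key.startsWith(PREFIX)) return null;
  let payload;
  try {
    payload = base64UrlToBytes(key.slice(PREFIX.length));
  } catch {
    return null; // not valid base64url
  }
  if (payload.length !== 68) return null; // 4-byte expiry + 64-byte signature
  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  const expiry = view.getUint32(0, false); // big-endian Unix seconds
  const signature = payload.slice(4);
  return { expiry, signature };
}

// Build a structurally valid (but unsigned) key to exercise the parser.
const demoPayload = new Uint8Array(68);
new DataView(demoPayload.buffer).setUint32(0, 1893456000, false); // 2030-01-01T00:00:00Z
const demoKey =
  "PRV1-" +
  btoa(String.fromCharCode(...demoPayload))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");

const parsed = parseLicenceKey(demoKey);
console.log(parsed.expiry, parsed.signature.length);
// 1893456000 64
```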

Payment and delivery flow

1. Payment: Customer pays via Stripe Checkout (Stripe-hosted page). Card details never touch Privatiser infrastructure.
2. Webhook: Stripe fires a checkout.session.completed event to a Cloudflare Worker endpoint over HTTPS.
3. Signature verification: The Worker validates the Stripe webhook signature using HMAC-SHA256 with timing-safe comparison and a 5-minute replay window. Replayed or tampered events are rejected.
4. Key generation: The Worker signs privatiser-licence-{expiry} with the Ed25519 private key stored as an encrypted Cloudflare Worker Secret. The key never appears in source code.
5. Delivery: The PRV1- key is emailed to the customer via Resend with a one-click activation link.
6. Browser activation: Customer enters the key on /activate.html. The browser verifies the Ed25519 signature using the embedded public key via crypto.subtle.verify. No server call is made. On success the key is saved to localStorage.
7. Feature gating: Each page load calls isProUser(), which re-verifies the stored key locally. Pro features unlock instantly. No network request is ever made during normal use.
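The signature verification step in the flow above combines two checks: a replay window and a timing-safe comparison. The sketch below illustrates both. It is not the Worker's code: the function names are hypothetical, and the HMAC-SHA256 computation is deliberately left out, with `expectedHex` standing in for the digest the Worker would compute. It relies on Stripe's documented header format, `t=<unix seconds>,v1=<hex digest>`, where the digest covers the timestamp, a dot, and the raw request body.

```javascript
// Hypothetical helpers for the webhook signature check.

// Parse a Stripe-Signature header of the form "t=<unix>,v1=<hex>[,v1=<hex>]".
function parseStripeSignature(header) {
  const parsed = { timestamp: NaN, signatures: [] };
  for (const part of header.split(",")) {
    const [name, value] = part.trim().split("=", 2);
    if (name === "t") parsed.timestamp = Number(value);
    if (name === "v1") parsed.signatures.push(value);
  }
  return parsed;
}

// Reject events whose timestamp is outside the replay window (5 minutes).
function withinReplayWindow(timestamp, nowSeconds, toleranceSeconds = 300) {
  return (
    Number.isFinite(timestamp) &&
    Math.abs(nowSeconds - timestamp) <= toleranceSeconds
  );
}

// Compare two strings without returning early on the first mismatch.
function timingSafeEqual(a, b) {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// expectedHex: the digest computed by the caller (assumed, not shown here).
function acceptEvent(header, expectedHex, nowSeconds) {
  const { timestamp, signatures } = parseStripeSignature(header);
  if (!withinReplayWindow(timestamp, nowSeconds)) return false;
  let matched = false;
  for (const candidate of signatures) {
    // Check every candidate so the loop does not exit early on a match.
    if (timingSafeEqual(candidate, expectedHex)) matched = true;
  }
  return matched;
}

const header = "t=1700000000,v1=abcd";
console.log(acceptEvent(header, "abcd", 1700000100)); // true
console.log(acceptEvent(header, "abcd", 1700000400)); // false (too old)
console.log(acceptEvent(header, "abce", 1700000100)); // false (mismatch)
```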

Cryptographic properties

Unforgeable: Ed25519 signatures cannot be produced without the private key. Extracting the public key from browser JS does not help, because it can only verify, not sign.

Expiry-bound: The signed message includes a Unix timestamp. A key with a past expiry fails immediately. Altering the timestamp bytes in the stored token also invalidates the signature.

Reusable: Keys are not consumed on use. Clearing your browser does not invalidate the key; enter it again to re-activate. Deliberate trade-off: no revocation infrastructure.

Offline-first: Once issued, verification is fully local. The anonymization tool and Pro features work without any network access.
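The local verification these properties rely on can be sketched with the WebCrypto API. This is an illustration, not the shipped isProUser() code: the function names are hypothetical, the public key is assumed to be available as 32 raw bytes, and {expiry} is assumed to be rendered as decimal seconds inside the signed message. Ed25519 support in `crypto.subtle` depends on the runtime version.

```javascript
// Hypothetical sketch of local licence verification with WebCrypto.

// Build the signed message. Assumes {expiry} is written as decimal seconds.
function licenceMessage(expiry) {
  return new TextEncoder().encode(`privatiser-licence-${expiry}`);
}

// publicKeyBytes: the raw 32-byte Ed25519 public key embedded in the page.
// expiry: Unix seconds from the key; signature: the 64 signature bytes.
async function verifyLicence(publicKeyBytes, expiry, signature, nowMs = Date.now()) {
  // Expiry check first: a key with a past expiry fails immediately.
  if (expiry <= Math.floor(nowMs / 1000)) return false;

  const key = await crypto.subtle.importKey(
    "raw",
    publicKeyBytes,
    { name: "Ed25519" },
    false, // not extractable
    ["verify"]
  );
  // Resolves to true only if the signature matches this exact message,
  // so changing the expiry bytes in the token makes verification fail.
  return crypto.subtle.verify(
    { name: "Ed25519" },
    key,
    signature,
    licenceMessage(expiry)
  );
}

const message = new TextDecoder().decode(licenceMessage(1893456000));
console.log(message);
// privatiser-licence-1893456000
```

Since the expiry is part of the signed message rather than a separate field, there is no way to extend a key's lifetime without producing a new signature.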

What the Worker does and does not see

The Worker receives the customer email address from Stripe's session data, used only to deliver the licence. It is logged in partially masked form (te***@example.com).
The Worker never sees any text the customer anonymizes. Anonymization runs in the browser only.
The private key is imported at runtime with extractable: false, preventing it from being exported from the WebCrypto API after import.
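The partially masked log format mentioned in the first point above can be produced with a helper along these lines. The function name and the exact masking rule (keep the first two characters of the local part) are assumptions inferred from the te***@example.com example.

```javascript
// Hypothetical masking helper: keep the first two characters of the local
// part, replace the rest with ***, and leave the domain as-is.
function maskEmail(email) {
  const at = email.lastIndexOf("@");
  if (at < 1) return "***"; // not a recognisable address, log nothing useful
  const local = email.slice(0, at);
  const domain = email.slice(at + 1);
  return `${local.slice(0, 2)}***@${domain}`;
}

console.log(maskEmail("test@example.com"));
// te***@example.com
```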

9. FAQ

Can Privatiser see my text?

No. The tool runs entirely in your browser. The code that processes your text is downloaded when you load the page, but then executes locally. We have no server receiving your input.

Does the browser extension have access to all sites?

No. The extension only activates on a declared list of AI sites (ChatGPT, Claude, Gemini, Copilot, Perplexity, and about 25 others). It requests the minimum permissions needed: activeTab, storage, and contextMenus. It does not have broad host permissions.

How are pseudonyms generated?

Each category has a counter. The first IP address detected becomes IP_1, the second becomes IP_2, and so on. The same real value always maps to the same pseudonym within a session. Pseudonyms contain no information about the original value.

Can an AI reverse the pseudonyms?

No. Pseudonyms like IP_1 or API_KEY_3 carry no structural information about the original value. The mapping exists only in your browser's session memory. The AI vendor has no access to it.

What happens to the mapping when I close the tab?

It is cleared entirely. Mappings are held in a JavaScript Map in session memory. Nothing is written to localStorage, cookies, or any server. Closing the tab is a hard reset.