WhoAmI — Privacy-First Architecture and Browser Fingerprinting

Most tools that collect browser or network data do so silently. WhoAmI is built on the opposite premise: show users exactly what can be collected about them, gate every collection behind explicit consent, and never send raw data off the device unless there's a concrete reason. That constraint shaped every architectural decision.

The server-client boundary is a privacy boundary. When a request arrives at /lab/modules/whoami, a React Server Component reads two headers — x-forwarded-for for the client IP and User-Agent for the browser string. These two values are the only data points that exist without any user action. Everything else requires an explicit opt-in. The RSC layer runs getIpIntel and parseUserAgent in parallel, keeping both calls server-side so the raw IP never travels to the client at all.
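
The server-side step above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the HeaderReader interface, the IpIntel and UaInfo shapes, and the collectPassiveSignals wrapper are hypothetical, and getIpIntel and parseUserAgent are passed in as parameters because their real signatures are not given here.

```typescript
// Hypothetical result shapes; the real return types are not shown in this article.
type IpIntel = { countryCode: string | null };
type UaInfo = { browser: string; os: string };

// Minimal stand-in for whatever header accessor the framework provides.
interface HeaderReader {
  get(name: string): string | null;
}

// x-forwarded-for may carry a comma-separated proxy chain. The left-most entry
// is conventionally the original client, and is only trustworthy behind a
// proxy that sets or overwrites the header.
function clientIpFrom(headers: HeaderReader): string | null {
  const raw = headers.get("x-forwarded-for");
  if (raw === null) return null;
  const first = (raw.split(",")[0] ?? "").trim();
  return first.length > 0 ? first : null;
}

async function collectPassiveSignals(
  headers: HeaderReader,
  getIpIntel: (ip: string | null) => Promise<IpIntel>,
  parseUserAgent: (ua: string) => Promise<UaInfo>,
): Promise<{ intel: IpIntel; ua: UaInfo }> {
  const ip = clientIpFrom(headers);
  const uaString = headers.get("user-agent") ?? "";
  // Both lookups start immediately and are awaited together.
  const [intel, ua] = await Promise.all([getIpIntel(ip), parseUserAgent(uaString)]);
  // Only derived values are returned; the raw IP stays inside this function.
  return { intel, ua };
}
```

Returning only the derived objects is what keeps the raw address out of anything serialized to the client in this sketch.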

Resilient upstream design. IP geolocation depends on a third-party provider, which means it will occasionally fail. The IP intelligence pipeline hits ipwho.is first with a 10-second timeout, then falls back to ipapi.co if the response is a 4xx, the body contains success: false, or the request throws. Private IPv4 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), the loopback block 127.0.0.0/8, and IPv6 link-local addresses short-circuit before any network call. There is no point asking an upstream provider about a LAN address, and doing so wastes quota and adds latency for nothing.
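
A minimal sketch of that short-circuit and fallback logic is shown below. The names isLocalAddress, withTimeout, lookupIp, and the Provider type are hypothetical. The two providers are injected as functions and are assumed to reject on a 4xx status or a success: false body, so the actual HTTP calls to ipwho.is and ipapi.co are left out.

```typescript
// Hypothetical result shape and provider signature.
type IpIntel = { countryCode: string | null; source: string };
type Provider = (ip: string) => Promise<IpIntel>;

// True for addresses that should never be sent to an upstream provider.
function isLocalAddress(ip: string): boolean {
  // IPv6 link-local: fe80::/10 covers first hextets fe80 through febf.
  if (ip.includes(":")) return /^fe[89ab][0-9a-f]:/i.test(ip);
  const octets = ip.split(".").map(Number);
  if (octets.length !== 4 || octets.some((o) => !Number.isInteger(o) || o < 0 || o > 255)) {
    return false;
  }
  const [a, b] = octets as [number, number, number, number];
  return (
    a === 10 ||                          // 10.0.0.0/8
    a === 127 ||                         // 127.0.0.0/8 (loopback)
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168)             // 192.168.0.0/16
  );
}

// Rejects if the work does not settle in time. It does not cancel the
// underlying request; a real implementation might use an AbortController.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function lookupIp(
  ip: string,
  primary: Provider,
  fallback: Provider,
  timeoutMs = 10_000,
): Promise<IpIntel | null> {
  if (isLocalAddress(ip)) return null; // nothing useful to ask upstream
  try {
    return await withTimeout(primary(ip), timeoutMs);
  } catch {
    // Any failure of the primary (bad status, success: false, timeout) lands here.
    return await fallback(ip);
  }
}
```

Injecting the providers keeps the control flow testable without network access; the fallback's own failure is deliberately left to propagate to the caller in this sketch.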

What gets stored and how. The IP displayed to the user has its last octet stripped. If a log line ever needs to reference the IP — for rate limiting or abuse detection — the unmasked value is hashed with SHA-256 plus a daily-rotating salt before it touches storage. The salt rotation means a hash from yesterday cannot be cross-referenced with a hash from today even if someone obtains both log files. Country code is stored in the visit snapshot; city is not. These are not arbitrary choices — they reflect the minimum data needed to make the tool useful.
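
The masking and hashing steps might look something like the sketch below, written for a Node.js server. Several details are assumptions made for illustration: the function names, the "x" placeholder for the stripped octet, returning null for non-IPv4 input, and deriving the daily salt from a server secret and the UTC date with HMAC. The article only specifies SHA-256 with a daily-rotating salt, not how the salt is produced.

```typescript
import { createHash, createHmac } from "node:crypto";

// Display form: drop the last octet. The "x" placeholder is an assumption.
function maskIpv4(ip: string): string | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null; // IPv6 handling is out of scope here
  return `${parts[0]}.${parts[1]}.${parts[2]}.x`;
}

// Assumed salt derivation: HMAC of the UTC calendar date under a server secret.
// The salt changes at midnight UTC without any stored state.
function dailySalt(secret: string, now: Date): string {
  const day = now.toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
  return createHmac("sha256", secret).update(day).digest("hex");
}

// Log form: SHA-256 over the daily salt followed by the unmasked IP.
function hashIpForLogs(ip: string, secret: string, now: Date = new Date()): string {
  return createHash("sha256").update(dailySalt(secret, now)).update(ip).digest("hex");
}
```

Within one UTC day the same IP maps to the same hash, which is enough for rate limiting; across days the hashes differ, so log files from different days cannot be joined on this value.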

Two consent tiers, not one. The basic fingerprint tier collects navigator properties: platform, language list, hardware concurrency, device memory, connection effective type, screen dimensions, and the WebGL renderer string from a throwaway canvas. These signals are fingerprintable, but most are coarse values shared by many devices, and several change with browser updates. The deep fingerprint tier runs three probes: a canvas hash (drawing a fixed scene and hashing the pixel output via SubtleCrypto.digest), an audio hash (500 frames of a triangle wave through an OfflineAudioContext and a dynamics compressor), and a battery query where the browser supports it. Canvas and audio hashes are far more distinctive and stay stable across incognito sessions and cookie clears. That is a qualitatively different level of identifiability, and it warrants a second explicit confirmation rather than bundling it with the first consent.
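
One way to model the two-tier gate is sketched below. The type names, the probe list, and the rule that a deep request is ignored until basic consent exists are assumptions chosen to illustrate the idea of a second, separate confirmation; the probes themselves (canvas, audio, battery) are not implemented here.

```typescript
type ConsentTier = "none" | "basic" | "deep";
type Probe = "navigator" | "screen" | "webglRenderer" | "canvasHash" | "audioHash" | "battery";

const TIER_RANK: Record<ConsentTier, number> = { none: 0, basic: 1, deep: 2 };

// Which tier each probe needs before it may run.
const REQUIRED_TIER: Record<Probe, ConsentTier> = {
  navigator: "basic",
  screen: "basic",
  webglRenderer: "basic",
  canvasHash: "deep",
  audioHash: "deep",
  battery: "deep",
};

// Deep consent can only be layered on top of basic consent, so it always
// takes two distinct confirmations to reach the deep tier. Consent never
// silently downgrades here; revocation would be a separate operation.
function grant(current: ConsentTier, requested: "basic" | "deep"): ConsentTier {
  if (requested === "deep" && current === "none") return "none";
  return TIER_RANK[requested] > TIER_RANK[current] ? requested : current;
}

// Every probe call site checks this before touching any browser API.
function mayRun(probe: Probe, granted: ConsentTier): boolean {
  return TIER_RANK[granted] >= TIER_RANK[REQUIRED_TIER[probe]];
}
```

Keeping the probe-to-tier mapping in one table makes it easy to audit which signals sit behind which confirmation.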

WebRTC as an educational probe. The WebRTC leak detection creates an RTCPeerConnection against Google's public STUN server, opens a data channel, and watches onicecandidate events. The first candidate carrying a non-RFC 1918 address is typically the user's public IP as the STUN server sees it. When a VPN neither routes nor blocks WebRTC traffic, that address is the real one rather than the VPN's. The implementation never reports that IP back to the server; it lives only in the browser. The point is to demonstrate that the leak happens, not to exploit it.
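
The candidate-filtering part of that probe can be sketched as pure string handling, which is shown below. The function names are hypothetical, only IPv4 is handled, and the sketch also skips loopback, link-local, and mDNS ".local" host names, which goes slightly beyond the RFC 1918 check described above. The browser wiring (RTCPeerConnection, data channel, STUN configuration) is intentionally left out.

```typescript
type ParsedCandidate = { address: string; type: string };

// ICE candidate lines look like:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function parseCandidate(line: string): ParsedCandidate | null {
  const tokens = line.trim().split(/\s+/);
  const typIndex = tokens.indexOf("typ");
  const address = tokens[4];
  const type = tokens[typIndex + 1];
  if (typIndex === -1 || address === undefined || type === undefined) return null;
  return { address, type };
}

function isPublicIpv4(address: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(address);
  if (m === null) return false; // IPv6 and mDNS ".local" names are skipped in this sketch
  const a = Number(m[1]);
  const b = Number(m[2]);
  const isRfc1918 = a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
  // Extra exclusions beyond RFC 1918: unspecified, loopback, IPv4 link-local.
  const isSpecial = a === 0 || a === 127 || (a === 169 && b === 254);
  return !isRfc1918 && !isSpecial;
}

// Returns the first publicly routable address among the candidate lines, if any.
function firstPublicAddress(candidateLines: string[]): string | null {
  for (const line of candidateLines) {
    const parsed = parseCandidate(line);
    if (parsed !== null && isPublicIpv4(parsed.address)) return parsed.address;
  }
  return null;
}
```

In the browser, each event.candidate.candidate string delivered to onicecandidate would be collected and passed to firstPublicAddress, with the result rendered locally and never sent to the server.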

Deliberate by default. Privacy-first design is not a feature toggle; it is a set of constraints applied at the architecture level before a single line of business logic is written. Those constraints are server-side-only handling of passively available data, explicit consent gating for client probes, minimum-viable data storage, salted hashing, and coarse-grained geographic data. Each constraint is documented and traceable to a specific risk it mitigates.