IdentityMatrix HEM Extractor - Browser Storage Scanning for Email Identifiers
Summary
IdentityMatrix's tracking script contains an extractHems() function that enumerates every cookie, localStorage entry, and sessionStorage entry, looking for email addresses and hashed emails (HEMs) in MD5 or SHA-256 form. Matching values are transmitted to the IdentityMatrix API for cross-reference against identity graphs, with the aim of de-anonymizing visitors.
Technical Details
Code Evidence
extractHems() function - scans all browser storage
function extractHems() {
  var found = [];
  var seen = new Set();
  var md5Regex = /^[a-f0-9]{32}$/i;
  var sha256Regex = /^[a-f0-9]{64}$/i;
  var emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  // addHem() is not shown in this excerpt; presumably it tests each value
  // against the regexes above and appends matches to `found`.

  // Scans cookies
  document.cookie.split(";").forEach(function(cookie) {
    var parts = cookie.trim().split("=");
    var key = parts[0];
    var value = parts[1];
    if (key && value) { addHem("cookie", key, value); }
  });

  // Scans localStorage
  for (var key in localStorage) {
    if (localStorage.hasOwnProperty(key)) {
      var value = localStorage[key];
      if (value) addHem("localStorage", key, value);
    }
  }

  // Scans sessionStorage
  for (var key in sessionStorage) {
    if (sessionStorage.hasOwnProperty(key)) {
      var value = sessionStorage[key];
      if (value) addHem("sessionStorage", key, value);
    }
  }

  VISIT_DATA.hem = found;
}
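The excerpt calls addHem() without showing its body. The following is a minimal sketch of what such a helper might look like, assuming it classifies each value with the three regexes above and de-duplicates through `seen`. The names `classifyHem` and `makeHemCollector` and the shape of the stored records are hypothetical and are not taken from the IdentityMatrix script.

```javascript
// Hypothetical reconstruction for illustration; not the vendor's code.
var md5Regex = /^[a-f0-9]{32}$/i;
var sha256Regex = /^[a-f0-9]{64}$/i;
var emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Returns "md5", "sha256", "email", or null if the value matches none.
function classifyHem(value) {
  if (md5Regex.test(value)) return "md5";
  if (sha256Regex.test(value)) return "sha256";
  if (emailRegex.test(value)) return "email";
  return null;
}

// Builds an addHem-style collector with its own found/seen state.
function makeHemCollector() {
  var found = [];
  var seen = new Set();
  function addHem(source, key, value) {
    var type = classifyHem(value);
    // Skip values that are not identifiers or were already recorded.
    if (type === null || seen.has(value)) return;
    seen.add(value);
    found.push({ source: source, key: key, type: type, value: value });
  }
  return { addHem: addHem, found: found };
}
```

Under this assumption, any 32- or 64-character hexadecimal string is treated as a candidate email hash, so unrelated values with the same shape (cache keys, checksums) would be collected as well.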
VISIT_DATA structure - all collected visitor data
var VISIT_DATA = {
  id: getTrackingID(),
  pixelId: null,
  sessionId: getSessionID(),
  domain: window.location.hostname,
  path: window.location.pathname,
  search: window.location.search,
  dateVisit: new Date().toISOString(),
  userAgent: navigator.userAgent,
  referrer: getCookie("im_session_referrer"),
  pageTitle: document.title,
  formData: null,
  hem: [], // Hashed Email extraction results go here
  timezone: Intl.DateTimeFormat().resolvedOptions().timeZone,
  screenSize: window.screen.width + "x" + window.screen.height,
  viewportSize: window.innerWidth + "x" + window.innerHeight,
  language: navigator.language,
  idleTime: 0,
  clickEvents: [],
  mousePath: []
};
Network Evidence
https://api.identitymatrix.ai/track
{"hem": [...], "sessionId": "...", ...}
Visitor data, including extracted HEMs, transmitted to the API
https://f2f-server-prod-*.herokuapp.com/visitors/
Face2Face.io integration; note the 'stalking' terminology in the data model
Attack Parallel: Cookie Stealers
Traditional cookie stealers enumerate document.cookie to exfiltrate session tokens and authentication data. HEM extractors use the same technique but extend it to localStorage and sessionStorage, searching for identity markers that enable cross-site tracking. The only difference: cookie stealers are called malware, HEM extractors are called "identity resolution" and sold as B2B SaaS.
Framework Mappings
Legal Touchpoints
Purpose limitation - data must be collected for specified purposes. Scanning all storage to find any email is not a specified purpose; it is a fishing expedition.
Data minimization - scanning ALL storage keys violates minimization. A legitimate script would access specific, known keys only.
Accessing information stored on a user's terminal equipment requires consent. Enumerating all storage is accessing stored information.
Sensitive PI (which may be found in storage) requires opt-in consent. HEM extraction is opt-out at best, and often has no notice at all.
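To make the data-minimization contrast concrete, here is a minimal sketch of targeted storage access: the script reads only keys it is known to own and never iterates the store. The key names in `ALLOWED_KEYS` and the helper `readKnownKeys` are hypothetical, chosen for illustration.

```javascript
// Hypothetical key names; a real script would list only keys it wrote itself.
var ALLOWED_KEYS = ["myapp_session", "myapp_prefs"];

// Reads only the named keys via getItem(); never enumerates the store.
function readKnownKeys(storage, allowedKeys) {
  var result = {};
  allowedKeys.forEach(function (key) {
    var value = storage.getItem(key);
    // getItem() returns null for keys that are absent.
    if (value !== null) result[key] = value;
  });
  return result;
}
```

In a browser this would be called as `readKnownKeys(localStorage, ALLOWED_KEYS)`. Values stored by other scripts are never touched, which is the opposite of the `for (var key in localStorage)` loop in extractHems().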
Prevalence
- IdentityMatrix's own website
- Various B2B technology companies
Reproduction Steps
1. Visit https://www.identitymatrix.ai/
2. Open DevTools → Sources tab
3. Find trackingScript.js in the source tree
4. Search for "extractHems" or "localStorage"
5. Observe the storage enumeration code
Alternative: Network Analysis
1. Open DevTools → Network tab
2. Visit https://www.identitymatrix.ai/
3. Search for requests to api.identitymatrix.ai
4. Inspect request payloads for "hem" field
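The network-analysis steps can be partly automated by exporting the session from the Network tab as a HAR file and filtering it. The sketch below assumes the request body is JSON with a top-level "hem" array, as the payload shown under Network Evidence suggests. The function name `findHemRequests` is hypothetical.

```javascript
// Returns the HAR entries whose request targets `host` and whose JSON body
// contains a non-empty "hem" array.
function findHemRequests(har, host) {
  return har.log.entries.filter(function (entry) {
    var url;
    try {
      url = new URL(entry.request.url);
    } catch (e) {
      return false; // malformed URL in the capture
    }
    if (url.hostname !== host) return false;
    var text = entry.request.postData && entry.request.postData.text;
    if (!text) return false;
    try {
      var body = JSON.parse(text);
      return Array.isArray(body.hem) && body.hem.length > 0;
    } catch (e) {
      return false; // body was not JSON
    }
  });
}
```

With the HAR file parsed into an object, `findHemRequests(har, "api.identitymatrix.ai")` returns the matching entries. An empty result only shows that no identifiers were sent in that capture, for example because storage held no emails or hashes at the time.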
Remediation
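Remove IdentityMatrix script if deployed
Effort: trivial
Audit scripts for storage enumeration patterns
Effort: moderate
Restrict third-party script loading and outbound connections via CSP (script-src, connect-src); CSP has no directive that limits storage access itself
Effort: significant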
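For the audit step, a simple static check can flag script source that appears to enumerate storage. The sketch below is a heuristic under stated assumptions: the pattern list and the name `auditScriptSource` are hypothetical, minified or obfuscated code can evade the regexes, and splitting document.cookie is also common in harmless cookie-reading helpers, so matches need manual review.

```javascript
// Heuristic patterns; expect both false positives and misses.
var ENUMERATION_PATTERNS = [
  { name: "for-in over storage",
    regex: /for\s*\(\s*(?:var|let|const)?\s*\w+\s+in\s+(?:window\.)?(?:localStorage|sessionStorage)\s*\)/ },
  { name: "Object.keys/entries/values on storage",
    regex: /Object\.(?:keys|entries|values)\(\s*(?:window\.)?(?:localStorage|sessionStorage)\s*\)/ },
  { name: "index-based storage.key()",
    regex: /(?:localStorage|sessionStorage)\.key\s*\(/ },
  { name: "split of document.cookie",
    regex: /document\.cookie\s*\.split\s*\(/ }
];

// Returns the names of all patterns found in the given script source text.
function auditScriptSource(source) {
  return ENUMERATION_PATTERNS
    .filter(function (p) { return p.regex.test(source); })
    .map(function (p) { return p.name; });
}
```

Run against the extractHems() excerpt above, this kind of check would flag the cookie split and the for-in loops, while a script that only calls `localStorage.getItem("some_key")` would pass.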