Data Cleaning
Upload qualitative data files to de-identify personally identifiable information (PII). The tool scans for phone numbers, email addresses, Canadian postal codes, Social Insurance Numbers (SINs), URLs, social media handles, and long ID numbers, then replaces each match with a numbered marker such as [PHONE-1] or [EMAIL-1].
Accepted file types
- .docx — Word documents (text is extracted automatically)
- .txt — Plain text files
- .csv — CSV files (you'll choose which columns to clean)
Not supported: Excel (.xlsx) files. Export them as CSV first (File → Save As → CSV). Scanned PDFs and images cannot be processed.
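For CSV files, cleaning only the columns you choose might look roughly like the sketch below. This is an illustration, not the tool's actual code: the function name clean_csv_columns and the scrub callback are hypothetical, and it assumes the first row of the file is a header.

```python
import csv
import io


def clean_csv_columns(csv_text, columns, scrub):
    """Apply scrub() to the chosen columns only; other columns pass through.

    Hypothetical helper: assumes the first CSV row holds the column names.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    out = io.StringIO()
    # extrasaction="ignore" drops stray cells that have no header above them.
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, lineterminator="\n", extrasaction="ignore"
    )
    writer.writeheader()
    for row in reader:
        for name in columns:
            # Skip missing or empty cells so scrub() always receives text.
            if row.get(name):
                row[name] = scrub(row[name])
        writer.writerow(row)
    return out.getvalue()
```

Passing the scrubbing step in as a function keeps column selection separate from detection, so free-text columns can be cleaned while ID or category columns are left untouched.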
What gets detected
Pattern-matching catches: phone numbers, email addresses, Canadian postal codes, SINs, URLs, social media handles (@mentions), and long ID numbers (8+ digits).
Not caught: names, locations, facility names, job titles, or any other contextual identifier. Always review cleaned files manually before sharing.
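To show why pattern matching catches structured identifiers but misses names, here is a minimal sketch of regex-based replacement with numbered markers. The patterns, the labels other than PHONE and EMAIL, and the deidentify function are assumptions made for illustration; the tool's real rules may differ. The sketch also assumes a repeated value reuses its marker.

```python
import re

# Hypothetical patterns, for illustration only. Emails and URLs run first
# so later patterns cannot match fragments inside them.
PATTERNS = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("URL", re.compile(r"\b(?:https?://|www\.)\S+", re.IGNORECASE)),
    # Canadian postal code shape: letter-digit-letter digit-letter-digit.
    ("POSTAL", re.compile(r"\b[A-Za-z]\d[A-Za-z][ -]?\d[A-Za-z]\d\b")),
    # North American phone numbers, with an optional leading 1.
    ("PHONE", re.compile(
        r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)")),
    # SIN written as three groups of three digits with separators.
    ("SIN", re.compile(r"(?<!\d)\d{3}[ -]\d{3}[ -]\d{3}(?!\d)")),
    ("HANDLE", re.compile(r"(?<![\w@])@\w{2,}")),
    # Any remaining run of 8 or more digits.
    ("ID", re.compile(r"(?<!\d)\d{8,}(?!\d)")),
]


def deidentify(text):
    """Replace each match with a marker such as [PHONE-1]."""
    counters = {}  # label -> how many distinct values seen so far
    markers = {}   # (label, matched text) -> marker already assigned

    for label, pattern in PATTERNS:
        def replace(match, label=label):
            key = (label, match.group(0))
            if key not in markers:
                counters[label] = counters.get(label, 0) + 1
                markers[key] = f"[{label}-{counters[label]}]"
            return markers[key]

        text = pattern.sub(replace, text)
    return text


print(deidentify("Call 613-555-0199 or email jane.doe@example.com."))
```

A name such as "Jane" in running text has no fixed shape for a pattern to match, so it passes through unchanged. That is why the manual review step matters.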