
🧽 LLM Deletion Kit: A Low-Tech Guide to Erasing Your Digital Shadow

Author: Resilience Project (eirenicon LLC)
Version: 1.0 (Adapted for Print/Offline Use from Codeberg Repo: resilience/Ardens/llm-deletion-kit)
Date: December 12, 2025
Purpose: This document converts the modular, digital-first "LLM Deletion Kit" into a printable, low-tech format. Print it as a PDF or booklet for offline reference—ideal for workshops, zines, or air-gapped setups. No internet required after printing. For the interactive repo (scripts, templates), visit: https://codeberg.org/resilience/Ardens/src/branch/main/llm-deletion-kit.


Introduction: Why Delete Your Data from LLMs?

Large Language Models (LLMs) are trained on vast internet scrapes, including your posts, emails, and personal traces. Once ingested, your data fuels automated systems that contribute to surveillance, bias, and control. This kit empowers you to opt out—request deletion, poison datasets, and build personal data autonomy.

Key Principles (Resilience Over Compliance):
- Proactive Erasure: Act before training cycles lock in your information forever.
- Layered Approach: Combine legal requests, technical obfuscation, and analog backups.
- Community Focus: Share this protocol; digital resilience is a collective defense mechanism.

Scope: Covers major providers (OpenAI, xAI, Anthropic, Meta). Expansion and contributions are welcome via forks on Codeberg.

Risks: Providers may ignore requests or delay compliance; use Tor/VPN for submission if anonymity is critical. Track all correspondence in a physical notebook.


Step 1: Assess Your Exposure (The Manual Audit)

Map your digital footprint manually—no apps or internet access needed for the review process.

  1. List Your Assets:

    • Social: X/Twitter (@handle), Reddit (u/name), forums.
    • Public Writes: Blogs, GitHub, Pastebin.
    • Scraped Sources: Common Crawl, LAION datasets (images/text).
    | Asset Type | Examples | Exposure Level (Low/Med/High) | Notes (Date Ranges, Topics) |
    | --- | --- | --- | --- |
    | Social Media | X posts, FB shares | High | Often scraped en masse |
    | Blogs/Forums | Personal site, StackOverflow | Med | Check robots.txt status |
    | Images/Media | Flickr uploads, portfolios | High | LAION-5B includes billions |
    | Emails/Logs | Leaked via breaches | Low (if private) | Check against printed breach logs |
  2. Audit Timeline:

    • Recall key events: "Posted about [topic] in 2023?"
    • Cross-reference with printed archives (e.g., your journal exports).

Offline Tip: Use graph paper to sketch a "data web"—nodes for accounts, edges for connections. Circle the nodes that are most likely to be in training sets.
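For readers who later move the paper "data web" onto a local machine, the same idea can be sketched in a few lines of Python. This is only an illustration: the account names, the exposure levels, and the links between accounts are hypothetical placeholders, and the ranking rule (exposure level first, number of connections second) is an assumption, not something the kit prescribes.

```python
from collections import defaultdict

# Hypothetical inventory: account name -> exposure level from the audit table.
EXPOSURE = {
    "x_handle": "High",
    "personal_blog": "Med",
    "flickr_portfolio": "High",
    "private_email": "Low",
}

# Hypothetical edges: pairs of accounts that link to or mention each other.
LINKS = [
    ("x_handle", "personal_blog"),
    ("personal_blog", "flickr_portfolio"),
    ("x_handle", "flickr_portfolio"),
]

RANK = {"Low": 0, "Med": 1, "High": 2}


def build_web(links):
    """Return an undirected adjacency map: node -> set of connected nodes."""
    web = defaultdict(set)
    for a, b in links:
        web[a].add(b)
        web[b].add(a)
    return web


def nodes_to_circle(exposure, links):
    """Order nodes by exposure level, then by number of connections (highest first)."""
    web = build_web(links)
    return sorted(
        exposure,
        key=lambda n: (RANK[exposure[n]], len(web[n])),
        reverse=True,
    )


if __name__ == "__main__":
    for node in nodes_to_circle(EXPOSURE, LINKS):
        print(node, EXPOSURE[node])
```

The nodes printed first are the ones to circle on the graph paper; isolated, low-exposure accounts fall to the bottom of the list.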


Step 2: Legal Deletion Requests

Leverage GDPR (EU), CCPA (California), or emerging rights. Print the template below, fill it in by hand, and consider sending copies by certified mail for a legally verifiable paper trail.

Universal Template: Data Deletion Request Letter

[Your Printed Letterhead or Handwritten Header]
[Your Name/Anonymous Alias]
[Your Address or "c/o Resilience Drop, [PO Box]"]
[Date]

[Provider Legal Dept Address—Research & Fill: e.g., OpenAI, 3180 18th St, San Francisco, CA 94110]

Subject: Formal Request for Personal Data Deletion Under [GDPR/CCPA/Right to be Forgotten]

Dear [Provider Legal Team],

I am writing to exercise my rights under [specify law, e.g., Article 17 GDPR] to request the deletion of my personal data from your systems, including but not limited to training datasets for Large Language Models.

Details of Data Subject:
- Name/Alias: [Your Info]
- Associated Accounts: [e.g., email@example.com, @handle on X]
- Ingested Content: [Describe, e.g., "Posts from 2020-2025 scraped via Common Crawl, accessible via [URL/Source]"]

Scope of Request:
- Erase all instances of my data from models (e.g., GPT-4, Llama).
- Confirm destruction in writing within 30 days.
- Cease future scraping of my content.

Evidence of ingestion: [Attach printouts/screenshots if available, or note "Publicly verifiable via [source]"].

Failure to comply may result in escalation to [regulator, e.g., CNIL in France].

Sincerely,
[Signature]
[Contact: Encrypted email or none]
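If you prepare letters on a local machine before printing, the bracketed fields can be filled programmatically. The sketch below uses Python's standard `string.Template` on an abbreviated copy of the letter; the placeholder names (`$provider`, `$law`, `$alias`, `$accounts`) and the `fill_letter` helper are hypothetical and are not taken from the repository's `templates/deletion-letter.tex`.

```python
from string import Template

# Abbreviated version of the letter above; placeholder names are hypothetical.
LETTER = Template(
    "Subject: Formal Request for Personal Data Deletion Under $law\n\n"
    "Dear $provider Legal Team,\n\n"
    "I am writing to exercise my rights under $law to request the deletion "
    "of my personal data from your systems, including but not limited to "
    "training datasets for Large Language Models.\n\n"
    "Details of Data Subject:\n"
    "- Name/Alias: $alias\n"
    "- Associated Accounts: $accounts\n"
)


def fill_letter(provider, law, alias, accounts):
    """Return the letter text for one provider with every placeholder filled."""
    return LETTER.substitute(
        provider=provider,
        law=law,
        alias=alias,
        accounts=", ".join(accounts),
    )


if __name__ == "__main__":
    # One letter per provider in the kit's scope; print each and sign by hand.
    for name in ("OpenAI", "xAI", "Anthropic", "Meta"):
        print(fill_letter(name, "Article 17 GDPR", "J. Doe", ["@handle on X"]))
        print("-" * 40)
```

`Template.substitute` raises `KeyError` when a placeholder has no value, which is a useful guard against printing a letter with a blank field.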

Tracking Sheet (Print & Use):

| Provider | Date Sent | Method (Mail/Email) | Response Date | Status (Pending/Granted/Denied) | Notes |
| --- | --- | --- | --- | --- | --- |
| OpenAI | | | | | |
| xAI | | | | | |
| Anthropic | | | | | |

Pro Tip: Send from a burner PO Box or use a physical letter drop service. For global reach, partner with digital rights NGOs.
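The tracking sheet is easier to act on if each row has a follow-up date. A minimal sketch of that arithmetic, assuming the 30-day window requested in the letter and hypothetical row fields (`provider`, `sent`, `status`) that mirror the sheet's columns:

```python
from datetime import date, timedelta

RESPONSE_WINDOW_DAYS = 30  # from the letter's "within 30 days" request


def follow_up_date(sent, window_days=RESPONSE_WINDOW_DAYS):
    """Return the date by which a written reply is expected."""
    return sent + timedelta(days=window_days)


def overdue(rows, today):
    """Return providers still marked Pending after their reply date has passed."""
    return [
        r["provider"]
        for r in rows
        if r["status"] == "Pending" and today > follow_up_date(r["sent"])
    ]


if __name__ == "__main__":
    # Hypothetical rows copied from a filled-in tracking sheet.
    rows = [
        {"provider": "OpenAI", "sent": date(2025, 12, 12), "status": "Pending"},
        {"provider": "xAI", "sent": date(2025, 12, 12), "status": "Granted"},
    ]
    for r in rows:
        print(r["provider"], "reply expected by", follow_up_date(r["sent"]))
    print("Overdue:", overdue(rows, today=date(2026, 1, 20)))
```

The statutory deadline depends on the law you cite, so pass a different `window_days` value if 30 days does not apply to your request.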


Step 3: Technical Poisoning & Obfuscation (The Analog Hack)

For data already scraped: Make it unusable for future model iterations. These are low-tech analogs to repository scripts.

  1. Text Poisoning (Analog Method):

    • Append a Marker to old, high-value posts (e.g., edit a blog post: "Original text... POISON: DELETE-REQ-2025").
    • Why? Future scrapes will capture the marker, potentially flagging the content for filtering or corrupting the model's output for that data point.
  2. Image Hashes (Manual Description):

    • Print images with high-contrast watermarks: "NOT FOR AI TRAINING" or a visible, low-opacity pattern.
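    • For Digital Content: Use a perceptual hash tool (such as pHash). Although this is a high-tech tool, the principle is worth noting: a perceptual hash gives each image a compact fingerprint that stays similar across resizing and recompression, which helps you track re-uploads and spot copies of your images in published datasets.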
  3. Opt-Out Lists (Curated Print):

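    • robots.txt Snippet to Add to Your Site (use this if you control the site's root directory):

    ```
    User-agent: GPTBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /
    ```

    • Common Crawl Opt-Out: Common Crawl's crawler identifies itself as CCBot and is blocked the same way, by adding a "User-agent: CCBot" entry followed by "Disallow: /" to robots.txt. Visit their site once to review the process and print the page as confirmation. A local hosts file is not a substitute: it only affects lookups made from your own machine and cannot stop a remote crawler from fetching your site.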
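Before uploading a robots.txt file, you can check offline that it says what you intend. The sketch below feeds the snippet to Python's standard `urllib.robotparser`; the `is_blocked` helper and the `https://example.com/` URL are illustrative assumptions. Note that robots.txt is advisory: this confirms what the file asks for, not that any crawler will obey it.

```python
from urllib.robotparser import RobotFileParser

# The snippet from this section, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def is_blocked(robots_txt, user_agent, url="https://example.com/"):
    """Return True if the rules disallow user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "Google-Extended", "CCBot"):
        print(agent, "blocked" if is_blocked(ROBOTS_TXT, agent) else "allowed")
```

Any crawler without its own entry (CCBot in this run) is reported as allowed, which makes missing entries easy to spot before you print the result for your records.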

Offline Tool Equivalent: The Codeberg repo has a poison.py script—run it on a local machine and print the output logs as proof of action taken.
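The appendix summarizes the script's algorithm as "For each file, append marker". The following is a minimal sketch of that idea, not the repository's actual `poison.py`: the function names, the `*.md` file pattern, and the log format are assumptions, and the marker text is the example used earlier in this step.

```python
from pathlib import Path

MARKER = "POISON: DELETE-REQ-2025"  # marker text from the example in Step 3


def append_marker(path, marker=MARKER):
    """Append the marker to one text file unless it is already present.

    Returns True if the file was changed.
    """
    text = path.read_text(encoding="utf-8")
    if marker in text:
        return False
    # Start the marker on its own line without adding a blank line to empty files.
    sep = "" if text.endswith("\n") or not text else "\n"
    path.write_text(f"{text}{sep}{marker}\n", encoding="utf-8")
    return True


def mark_directory(root, pattern="*.md"):
    """Mark every matching file under root and return log lines to print."""
    log = []
    for path in sorted(Path(root).rglob(pattern)):
        status = "marked" if append_marker(path) else "skipped (already marked)"
        log.append(f"{path}: {status}")
    return log


if __name__ == "__main__":
    # Hypothetical folder holding local copies of your posts.
    for line in mark_directory("posts"):
        print(line)
```

Because files that already contain the marker are skipped, the script can be rerun safely, and the returned log lines are the "proof of action" to print.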


Step 4: Habit-Building & Resilience Playbook

Sustain privacy post-deletion through consistent low-tech habits.

Daily Checklist (Print as Poster):
- [ ] Use pseudonyms or context-free posts for new high-risk content.
- [ ] Export & burn digital archives quarterly (print and delete).
- [ ] Data Diet: Review your sharing habits before posting anything new.
- [ ] Monitor via printed news clippings (e.g., AI regulatory changes).

Advanced: Community Scaffolding
- Deletion Pods: Form 3-5 person groups to review and co-sign each other's deletion requests.
- Tie to OTFR: Use this kit as a core module for Open Tools for Resilience workshops.

Metrics for Success:
- 80% of requests acknowledged.
- Zero new exposure points in 6 months.
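The acknowledgment metric can be computed from the Status column of the tracking sheet. This sketch assumes that "acknowledged" means any status other than Pending (that is, Granted or Denied); the kit does not define the term, so adjust the rule if your group counts acknowledgments differently.

```python
TARGET_RATE = 0.8  # the 80% goal from the metrics list


def acknowledged_rate(statuses):
    """Return the fraction of requests that received any reply (Granted or Denied)."""
    if not statuses:
        return 0.0
    answered = sum(1 for s in statuses if s in ("Granted", "Denied"))
    return answered / len(statuses)


def meets_targets(statuses, new_exposure_points):
    """Return True if the reply rate reaches the goal and no new exposure was logged."""
    return acknowledged_rate(statuses) >= TARGET_RATE and new_exposure_points == 0


if __name__ == "__main__":
    # Hypothetical statuses copied from a filled-in tracking sheet.
    statuses = ["Granted", "Denied", "Pending", "Granted", "Granted"]
    print(f"Acknowledged: {acknowledged_rate(statuses):.0%}")
    print("Targets met:", meets_targets(statuses, new_exposure_points=0))
```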


Appendix: Repo Mapping for Low-Tech

This paper mirrors the digital structure of the Codeberg repository:

| Digital File/Folder | Paper Section | Offline Hack |
| --- | --- | --- |
| README.md | Introduction | Print first page as cover |
| templates/deletion-letter.tex | Step 2 Template | Hand-copy or LaTeX-print |
| scripts/poison.py | Step 3 | Describe algo: "For each file, append marker" |
| docs/exposure-audit.md | Step 1 | Table above |

Documentation License: This text is licensed under CC BY-ND 4.0 (Creative Commons Attribution-NoDerivatives). You are free to share it, but you may not modify it, as its integrity is crucial to our ongoing research protocols.

Code License: All associated scripts and code in the Codeberg repository are licensed under EUPL 1.2.

Call to Action: This kit is part of the LTOT Hub. Find the source code and templates on Codeberg. Fork it. Build resilience together.