First of all, I would like to make it clear that I have read and understood the guidelines. Now to the point: my post is not completely generated by Artificial Intelligence. AI is used to explain or reinforce some concepts that I find difficult to express myself; however, all the ideas and statements in my post are solely my opinions and are based on my understanding of the General Data Protection Regulation (GDPR). Furthermore, I am not responding to a post or answering someone with AI-generated content, or asking anyone to respond with AI-generated content; I am instead asking for opinions on my complaint to the Dutch Data Protection Authority, which was partially written by AI. If you still believe my post should be flagged, please tell me why. Thank you.
Complaint Regarding The Unlawful Processing, Retention, and Refusal of Erasure by The Internet Archive
This complaint concerns the Internet Archive’s ongoing and systemic breaches of the General Data Protection Regulation, the California Consumer Privacy Act, and the Children’s Online Privacy Protection Act, specifically through its operation of the “Wayback Machine”, which involves the large-scale, indefinite collection, storage, duplication, and dissemination of personal data without a lawful basis under Article 6(1) GDPR.
Definitions:
“Internet Archive”, “IA”: the Controller.
REDACTED, REDACTED: the Data Subject/Complainant.
“I”, “me”, “my”, “us”, “ours”: REDACTED/REDACTED.
“GDPR”: General Data Protection Regulation.
“CCPA”: California Consumer Privacy Act.
“COPPA”: Children’s Online Privacy Protection Act.
“erase”, “erasure”, “delete”, “deletion”: permanent and irreversible deletion or anonymisation so that the data can no longer be restored and attributed to an identified or identifiable person, in line with Article 4(2) and Recital 66 GDPR. “Exclude”, “exclusion”, or “hide” does not fulfill this requirement, as the data remains accessible internally.
“this case”: the specific data protection complaint initiated by me regarding the unlawful processing, storage, and retention of my personal data listed under Annex A, and the failure to comply with relevant provisions of the GDPR, including but not limited to Articles 5, 6, 12, 13, 14, 17, 21, and 89, as they pertain to said personal data.
Section 1 – GDPR Violations by the IA
- Unlawful Processing Without a Legal Basis (Article 6)
Per Article 6(1), processing of personal data is lawful only if at least one of the listed legal bases applies, such as the data subject’s consent (Article 6(1)(a), with consent defined in Article 4(11)) or contractual necessity (Article 6(1)(b)). The IA collects and archives personal data (Article 4(1)) from third-party websites without consent or notice, so it cannot rely on Article 6(1)(a), and in my view it cannot rely on Article 6(1)(f) either. The activities of web crawling, duplicating, storing, and making available personal data clearly fall under the definition of “processing” in Article 4(2). The IA’s default position of scraping all content regardless of public interest does not qualify as a legitimate interest, nor has the IA demonstrated the balancing test that Article 6(1)(f) requires.
- Misapplication of Public Interest (Articles 5, 89, Recitals 157, 158)
The IA routinely invokes Article 89(1) (processing for archival purposes in the public interest) without satisfying the substantive criteria. For this exemption to apply, the processing must: serve a genuine and substantial public interest (e.g., educational, historical, scientific, or journalistic purposes) (Recitals 157, 158), be subject to appropriate safeguards, and respect the data minimization and storage limitation principles (Article 5(1)(c) and (e)). The indiscriminate scraping and retention of non-public, personal, and semi-private content such as personal blogs, social profiles, or deleted/modified web pages clearly exceeds what is necessary for a legitimate public interest archive. As such, the IA’s reliance on Article 89 is legally unfounded.
- Failure to Inform Data Subjects (Articles 5, 13, 14)
Article 13 applies where personal data is collected directly from the data subject. Where it is not, as with web crawling, Article 14(1) and (2) oblige the controller (Article 4(7)) to provide information including: the identity and contact details of the controller, the purposes and legal basis for processing, the categories of data involved, retention periods, and the rights of the data subject. The IA does not notify affected individuals when it collects and archives their data, nor does it offer any public mechanism for identifying or objecting to such processing. This violates the transparency principle under Article 5(1)(a).
- Excessive Retention (Article 5)
The IA stores personal data indefinitely, including in backup systems, without a defined retention schedule or regular erasure mechanism. This is incompatible with the storage limitation principle under Article 5(1)(e), which requires that data be kept in identifiable form for no longer than is necessary for the purposes for which it is processed. Archiving “everything forever,” including obsolete, harmful, or contested data, clearly exceeds what is proportionate or necessary under GDPR standards.
- Failure to Respond to Erasure Requests (Articles 12, 17)
Under Article 17(1) (Right to Erasure) and Article 12(3), a controller must respond to valid data deletion requests without undue delay and in any event within one month of receipt. In multiple documented cases including mine, the IA has: not responded within the deadline, failed to confirm any deletion or give reasons for refusal (Article 12(4)), or offered vague responses about “exclusion” without any erasure. This constitutes a direct violation of Articles 12 and 17 and demonstrates systemic failure to uphold data subject rights.
- Incomplete Erasure (Recital 66)
Instead of fully deleting personal data upon request, the IA often merely “excludes” it from public view while retaining the data internally. Recital 66, read with Article 17(2), extends the right to erasure to links to, copies of, and replications of personal data that has been made public; in my view this covers backups, indexed copies, and any further dissemination. Concealment of this kind is not equivalent to erasure. As long as the data remains stored or accessible internally, it is still being “processed” (Article 4(2)) and therefore remains under the controller’s obligations.
- Processing Special and Potentially Sensitive Categories of Data (Article 9)
The IA may be processing special categories of personal data (e.g., political views, health information, personal identifiers tied to minors, usernames linked to behavior, etc.) without satisfying the exceptions in Article 9(2). In many cases, the archived data also includes: phone numbers, emails, real names, photos, personal identifiers (such as age, location, etc.), data from minors (violation of COPPA), and entire web pages. This heightens the severity of the data protection violations and suggests lack of adequate internal data classification and safeguards.
- No Designated Data Protection Officer (Articles 37, 38, 39)
Under Article 37(1)(b) and (c), a Data Protection Officer (DPO) is required when the controller’s core activities: involve regular and systematic monitoring of data subjects on a large scale, or involve large-scale processing of special categories of data (Article 9). Despite operating a platform that systematically collects and reproduces personal data from across the globe, the IA provides no accessible DPO contact and, judging by its published staff bios, appears not to have designated a qualified DPO. This is a governance failure under Articles 37–39.
- Inadequate Technical and Organizational Measures (Articles 5, 32)
Article 32 requires data controllers to implement measures ensuring a level of security appropriate to the risk. Given the scale of IA’s duplication and distribution systems, and the lack of deletion pathways, it is unlikely that they: apply proper access controls, enforce deletion from backups, or prevent unauthorized internal access. The lack of any external audit or compliance transparency also raises concerns about data integrity and confidentiality (Article 5(1)(f)).
- General Evasion of Data Subject Rights (Articles 12, 13, 14, 15, 17, 21)
The IA creates systemic obstacles to users exercising their rights under GDPR by: obscuring its internal policies, failing to provide a working request form, ignoring or delaying responses, and denying access to meaningful erasure options. This constitutes a violation of: Article 12 (Transparent communication), Articles 13–15 (Right to information and access), Article 17 (Erasure), Article 21 (Right to object), and possibly other provisions.
Section 2 – CCPA Violations by the IA
- Collection Without Notice (§1798.100)
The IA archives and processes personal information (names, usernames, emails, etc.) without informing users at or before the point of collection. There’s no notice or “right to know” disclosure, especially when content is scraped.
- Failure to Honor Deletion Requests (§1798.105)
If a user makes a verifiable request to delete personal data, the service must: respond within 45 days (extendable once by a further 45 days under §1798.130(a)(2)), erase the data from systems (including backups), and confirm the request is completed. As mentioned above, the IA has a history of “excluding” content (hiding it) instead of erasing it, which does not fulfill the deletion requirement.
- No Access or Disclosure Rights Fulfilled (§1798.100, 1798.110, 1798.115)
Consumers have the right to request: what personal data is collected, how it’s used and with whom it’s shared, and where it came from. The IA provides no clear mechanism or response process for such access requests.
- Retention Policy Not Transparent (§1798.100, 1798.130)
CCPA requires a description of how long data is retained or the criteria used to determine retention. The IA keeps data indefinitely without disclosing any justification or retention schedule.
- No Easily Accessible Privacy Policy (§1798.130)
The IA’s privacy policy (if it exists) is not clearly accessible from all pages, nor does it outline consumer rights as required under California law.
Section 3 – COPPA Violations by the IA
- Collecting Personal Info from Children Without Verifiable Parental Consent (§312.5)
The IA archives profiles, posts, comments, and entire web pages, which often contain names, usernames, photos, voice recordings, IP addresses, and other identifiers of children under 13. There is no consent mechanism, nor does the IA make any attempt to verify age.
- No Direct Notice to Parents (§312.4)
COPPA requires direct notice to parents before any collection of personal data from children. The IA does not notify parents when a child’s webpage or content is archived.
- Failure to Honor Erasure Requests for Children (§312.6)
If a parent or guardian requests deletion of a child’s data, it must be fully removed including from backups. The IA’s “exclude” system is not compliant under COPPA as the data still exists internally.
- No Privacy Policy for Children (§312.3, 312.4)
Any service likely to process children’s data must have a clear, child-friendly privacy policy as well as state what data is collected, how it’s used, and how to request deletion. The IA does not have a specific COPPA-compliant privacy policy or a special section addressing children’s data.
- Passive Collection from Third-Party Sites Visited by Children
Even if the IA does not directly target children, archiving children’s websites, gaming forums, or educational tools that have a strong likelihood of being used by minors still falls under COPPA if the audience includes under-13s, and no consent was obtained from guardians.
Section 4 – My Demands for the IA
Upon receipt of the identifiers listed in Annex A, the IA shall locate and permanently erase all personal data relating to me that matches Annex A, wherever and however it is stored, in any web page (including user-generated content), across: the live Wayback Machine index (all 946 billion URLs or, at minimum, all URLs from January 1st 2020 to the date of erasure); all internal mirrors, backups, and replication stores; and any derivative databases, datasets, or researcher portals.
Furthermore, I explicitly state that I will not bear any cost, financial or otherwise, associated with the location, identification, or deletion of this data. As the original data collection and processing were conducted without my consent or legal basis under Article 6(1), any related operational overhead is the sole responsibility of the IA.
The IA shall also implement a technical filter that automatically blocks the capture, storage, or any other form of processing of any web page (including user-generated content) that contains, anywhere in its content, either of the following: any identifier supplied in Annex A, or the Opt-Out Code (the literal string “OPTOUT”).
This obligation is indefinite and binding under Articles 5(1)(b) and (c), 6(1), 21, and 25.
Any future attempt to process, crawl, cache, store, display, or republish any form of material containing those identifiers constitutes unlawful processing.
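As a rough illustration of the kind of capture filter demanded above, here is a minimal Python sketch. It is not the IA's actual crawler logic, which I have no knowledge of. The function name `should_block`, the example identifier, and the matching rules (case-sensitive substring match for the Opt-Out Code, case-insensitive substring match for Annex A identifiers) are all assumptions made for illustration.

```python
# Hypothetical sketch of a pre-capture filter; names and matching rules are assumptions.

OPT_OUT_CODE = "OPTOUT"  # the literal string named in the complaint


def should_block(page_text: str, identifiers: list[str]) -> bool:
    """Return True if the page must not be captured, stored, or processed.

    A page is blocked when it contains the literal Opt-Out Code
    (case-sensitive) or any supplied identifier (case-insensitive).
    """
    # The Opt-Out Code is described as a literal string, so match it exactly.
    if OPT_OUT_CODE in page_text:
        return True

    # Identifiers are compared case-insensitively; blank entries are ignored.
    folded_text = page_text.casefold()
    for identifier in identifiers:
        needle = identifier.strip().casefold()
        if needle and needle in folded_text:
            return True
    return False
```

A crawler would call something like `should_block(html_text, annex_a_identifiers)` before writing a capture and skip the page when it returns True. Plain substring matching is the simplest choice and can over-block (for example, an identifier that happens to be part of a longer word); a real deployment would need to decide how strict the matching should be.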
Deadline (for the Internet Archive): 30 days from the first business day after the day this is sent (not the day it is received)
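To make the deadline rule concrete, the following Python sketch computes it from a sending date. It rests on assumptions that the complaint itself does not spell out: business days are Monday to Friday, public holidays are ignored, and the 30 days are calendar days. The function name `erasure_deadline` is hypothetical.

```python
from datetime import date, timedelta


def erasure_deadline(sent_on: date, days: int = 30) -> date:
    """Return the deadline: `days` calendar days after the first
    business day that follows the sending date.

    Assumption: business days are Monday-Friday; holidays are not considered.
    """
    # Start with the day after sending, then skip over any weekend days.
    first_business_day = sent_on + timedelta(days=1)
    while first_business_day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        first_business_day += timedelta(days=1)

    return first_business_day + timedelta(days=days)
```

For example, `erasure_deadline(date(2025, 8, 1))` starts counting from the Monday after that Friday, not from the Saturday.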
Method of confirmation: Written statement of completion, including hashing logs or comparable evidence of deletion and filter deployment. Any hashing logs relating to personal identifiers must themselves be erased after confirmation.
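One possible reading of "hashing logs" is a deletion record that stores only a salted hash of each identifier, so the log can be checked against a known identifier without containing the identifier itself. The Python sketch below shows that idea under this assumption. The function names, the log fields, and the choice of salted SHA-256 are illustrative, not anything the IA is known to use.

```python
import hashlib
import hmac


def deletion_log_entry(identifier: str, urls_erased: int, salt: bytes) -> dict:
    """Build a log entry that records a salted SHA-256 digest of the
    identifier instead of the identifier in plain text."""
    digest = hashlib.sha256(salt + identifier.encode("utf-8")).hexdigest()
    return {"identifier_sha256": digest, "urls_erased": urls_erased}


def entry_matches(entry: dict, identifier: str, salt: bytes) -> bool:
    """Check whether a log entry was produced for the given identifier."""
    expected = hashlib.sha256(salt + identifier.encode("utf-8")).hexdigest()
    # compare_digest performs a constant-time comparison of the two digests.
    return hmac.compare_digest(entry["identifier_sha256"], expected)
```

Anyone holding the identifier and the salt can confirm that an entry refers to it, while the log alone does not show the identifier in plain text. Because a hash of a short identifier can still be guessed by brute force, such logs remain personal data, which is why they are to be erased after confirmation.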
Annex A – Personal Identifiers
REDACTED
REDACTED
Any other tags, identifiers, or references that are associated in any way with the above identifiers
Annex B – Terms of Use for Annex A
YOU ARE NOT AUTHORIZED TO USE THIS DOCUMENT (Annex A) IN ANY WAY WHATSOEVER WITHOUT EXPLICIT CONSENT
YOU ARE NOT AUTHORIZED TO USE THIS DOCUMENT (Annex A) FOR ANYTHING OTHER THAN WHAT I GAVE CONSENT FOR
YOU ARE NOT AUTHORIZED TO SHARE THIS DOCUMENT (Annex A) TO ANYONE OR ANYTHING WITHOUT EXPLICIT CONSENT
I HAVE THE RIGHT TO REVOKE ANY AUTHORIZATION GIVEN REGARDING THIS DOCUMENT (Annex A) AT ANY TIME WITH NO PRIOR NOTICE