Stylometry circumvention

Stylometry is the study of an author’s writing style, often with the aim of revealing their identity.

Stylometry poses a significant privacy challenge in its ability to unmask anonymous authors or to link pseudonyms to an author’s other identities,[29] which, for example, creates difficulties for whistleblowers,[30] activists,[31] and hoaxers and fraudsters.[32] The privacy risk is expected to grow as machine learning techniques and text corpora develop.[33]

This is a complex topic with considerable depth, covering not only text analysis but also music, paintings, and other media. A good starting point might be a command-line tool that analyzes basic punctuation patterns, sentence endings, and their forms. This assumes, of course, that all spelling and grammatical mistakes have already been eliminated.
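
As a rough illustration of that starting point, here is a minimal Python sketch that profiles punctuation use, sentence endings, and average sentence length. The function names and the choice of punctuation marks are my own assumptions, not taken from any particular stylometry tool, and real tools measure far more than this.

```python
import re
from collections import Counter

# Marks to track; this selection is an arbitrary assumption for illustration.
PUNCTUATION = ",;:!?.-()\"'"


def punctuation_profile(text):
    """Return each punctuation mark's frequency per 1000 characters."""
    counts = Counter(ch for ch in text if ch in PUNCTUATION)
    total = max(len(text), 1)  # avoid division by zero on empty input
    return {mark: 1000 * n / total for mark, n in counts.items()}


def sentence_endings(text):
    """Count how sentences end, e.g. '.', '!', '?', '...' or '?!'."""
    # A run of terminators counts as an ending only before whitespace or end of text.
    return Counter(re.findall(r"[.!?]+(?=\s|$)", text))


def mean_sentence_length(text):
    """Return the average number of words per sentence (naive splitting)."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)
```

Running these functions on two of your own texts, for example a forum post and an e-mail, and comparing the numbers gives a first impression of which habits stay constant across your writing. The sentence splitting is deliberately naive, so abbreviations such as "e.g." will be counted as sentence endings.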


It is possible to utilize a large language model as a means of evading stylometry, employing strategies comparable to those commonly used by students to evade plagiarism.
- academic style

Yo, listen up, let me break it down
You can drop LLM to evade stylometry, don’t be a clown
Just like the kids today, they use tactics to dodge that plagiarism trap
So get with the times, and level up your game, son.
- rap style

The quotes above are stylistic reinterpretations produced by the Vicuna LLM.

Sending your text to an online service like ChatGPT that has your phone number and credit card details may actually work against what you’re trying to achieve. Vicuna and gpt4alpaca are just two of the many models you can download and run offline.

Be wary of websites that employ invasive CAPTCHAs (usually from a third party), which fingerprint you from input events long before you finish writing in a web form. Copy-pasting your text avoids this but may flag you as a bot.

For something more serious, I’d look into the GitHub project hosted at computationalstylistics/stylo. For some theory, a book like Machine Learning Methods for Stylometry can shed more light on the topic.


Related: The NSA's Large Language Models - Conscious Digital