Tool to Search Email for Accounts?

Is there a tool that takes an export of your email inbox, processes it locally, and outputs a neat list of every mailing list & online account you’ve gotten mail from?

____

I’ve found one of the most time-consuming parts of this journey is the ‘damage control’: working through every last account & scrubbing it clean of PII. It’s agonizingly tedious.

Before we even begin to actually clean all these accounts, they first have to be compiled & listed. I’ve used a PW manager for the last decade, so that was a nice head start. But as I spend weeks gradually combing through my inbox, tens of thousands of emails in total, I’m finding a tremendous number of accounts I wouldn’t have otherwise remembered. In this regard, email is my most reliable paper trail - every random retailer I’ve ever bought a cigarette from has apparently emailed a confirmation code & receipt.

As I lament the blood, sweat, and tears that could’ve been saved by taking privacy seriously a little sooner, I find myself wondering if this laborious process can be automated.

Surely it wouldn’t be too difficult to locally parse an mbox export and spit out a list of sender domains, each linked to the relevant emails. In fact, it seems so straightforward, I have a hard time believing it doesn’t already exist. Yet, I haven’t been able to find anything.

So, have any of the good folks at PG stumbled across such a tool? Or am I the only one who would find value in something like that?


Something like this, maybe? PS1607/mbox-to-json on GitHub: a small package that converts MBOX files to JSON (or CSV). It also includes functionality to extract attachments.

You can maybe ask an LLM to code you something basic with the data as input?
It can do wonders without too much effort. Ask it to keep it simple with HTML & CSS. :hugs:
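
For a sense of how basic it can be, here is a rough sketch of that kind of script. It uses only Python's standard library (`mailbox` and `email.utils`) and assumes your export is a plain mbox file. The function names `sender_domains` and `format_report` are made up for illustration, and it only looks at the `From` header, so treat it as a starting point and not a finished tool.

```python
import mailbox
from collections import defaultdict
from email.utils import parseaddr


def sender_domains(mbox_path):
    """Map each sender domain to the set of From addresses seen for it."""
    domains = defaultdict(set)
    # create=False avoids silently creating an empty file on a wrong path
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # parseaddr splits "Shop <orders@shop.example>" into (name, address)
            _, address = parseaddr(str(message.get("From", "")))
            if "@" not in address:
                continue  # skip missing or malformed From headers
            domain = address.rsplit("@", 1)[1].lower()
            domains[domain].add(address.lower())
    finally:
        box.close()
    return domains


def format_report(domains):
    """Return one line per domain, followed by the addresses seen for it."""
    lines = []
    for domain in sorted(domains):
        lines.append(f"{domain}\t{', '.join(sorted(domains[domain]))}")
    return "\n".join(lines)
```

Running `print(format_report(sender_domains("inbox.mbox")))` (with `inbox.mbox` standing in for wherever your export lives) should give one line per domain. Everything stays on your machine, and grouping by domain keeps the list short enough to review by hand.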

You could ask Cursor or any kind of LLM really.
I’m not sure anyone has built a generic tool, because it depends on the shape of your data.
But hey, LLMs are very good at working with approximate input and producing approximate output, which might be exactly what you’re looking for here. :wink: