I want to propose a different take: I get what you’re trying to do here, but I think you’re trying to solve the wrong problem.
You don’t need universal file tagging. A generic cross-platform file tagging system is going to suck at everything. You’re going to spend all this time manually tagging files, and then what? You still have to remember to tag them. You still have to decide on a tagging schema. And in 5 years when you want to access your data differently, your tags aren’t going to match the way you actually want to search.
This is the same problem as hierarchical directories, just with tags instead of folders. You’re still doing manual organization work that humans are terrible at.
What you need are data silos. Hear me out.
This philosophy came about because a co-worker was complaining that his kids don’t understand directories in computers because they use Chromebooks in school, which use Google Drive, which in turn has a UI focused on search rather than directory structure. In his view, this was effectively “brainwashing” people into relying on Google rather than organizing stuff themselves.
I thought about this for a while and came to the conclusion that the only issue with this is that it’s Google search specifically. Aside from that, the kids are right. F directories: no one is gonna remember why you organized your directories a specific way 10 years from now, and the way you’re going to want to access data 10 years from now is going to be completely different and completely unpredictable. So why bother?
I was always one of those people who meticulously organized everything into folders and then couldn’t find anything. F that, just search. If you absolutely need to layer some kind of data on top, use tagging. But put the focus on search.
The real solution is Data Silos. Instead of trying to tag “files”, think about what TYPE of data you’re actually organizing.
For photos and videos, use Immich or Stash. Just drop your photos in. ML search finds everything. Search for “beach” or “dog” or “birthday cake” and it works. No tagging required. It even recognizes faces automatically. No metadata needed either. Mine are stripped, and Immich doesn’t care.
For documents (PDFs, receipts, taxes, letters), use Paperless-ngx with OCR + full-text search. Upload and forget. Search for any text that was in the document.
For web articles/bookmarks, Linkwarden. Save full page archives with SingleFile integration. Full text search across everything you’ve saved. Tag if you want, but search is the primary interface.
For music, Navidrome (or Audiobookshelf if it’s audiobooks and podcasts). Automatic metadata extraction, search by artist/album/genre.
For code/technical docs, use grep. Hierarchy probably matters most for code, since the layout is part of how it functions, but in terms of finding stuff, search with grep.
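If grep isn’t handy, the same "just search it" idea is easy to sketch in Python. This is a rough illustration only: the function name `grep_tree` and the default list of suffixes are made up for the example, and real grep (or ripgrep) will be faster and more capable.

```python
import re
from pathlib import Path


def grep_tree(root, pattern, suffixes=(".py", ".md", ".txt")):
    """Yield (path, line_number, line) for each matching line under root.

    `suffixes` is an illustrative filter, not a recommendation.
    """
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the search
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield path, number, line.strip()
```

Something like `for path, number, line in grep_tree(".", r"TODO"): print(path, number, line)` then behaves roughly like `grep -rn TODO .` restricted to those file types.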
I’ve convinced myself that this is the superior way of “organizing”, for a few reasons:
- Minimal manual work. Immich just takes the photo. Done. With Paperless, you just upload the PDF. Done. Don’t waste time tagging because ML/OCR will do it for you.
- Purpose-built search. Photo apps can do facial recognition and object detection. Document apps can do OCR and semantic search. Music apps understand ID3 tags and acoustic fingerprinting. A generic file tagger can do NONE of this.
- Data isolation. Your taxes should never touch your vacation photos. Your music library should never mix with your work documents. Directories are for isolation, not organization.
- Future-Proof. When you want to search differently in 5 years, the ML/OCR adapts. You’re not locked into your old tagging schema, and data is still accessible even if you change how you think about it.
“But I want everything in one place!”
No you don’t. You think you do, but you don’t. You need directories to separate TYPES of data, and that’s it. What you want are searchable photos when you’re looking for photos, documents when you’re looking for documents, and everything backed up to the same server.
This approach is called “silos”: specialized systems where you just drop data in, never think about organization, access via search and (optionally) tags, and trust ML/indexing to make it findable.
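The only "organizing" decision left is which silo a file belongs to, and that can usually be read straight off the file type. Here is a minimal sketch of that one decision. The silo names, the extension lists, and the `silo_for` helper are all hypothetical, and how a file actually gets into Immich, Paperless-ngx, or Navidrome (upload, watched folder, and so on) depends on your setup.

```python
from pathlib import Path

# Hypothetical mapping from silo name to file extensions; adjust to taste.
SILOS = {
    "photos": {".jpg", ".jpeg", ".png", ".heic", ".mp4", ".mov"},
    "documents": {".pdf", ".docx", ".odt"},
    "music": {".mp3", ".flac", ".ogg"},
}


def silo_for(filename, default="unsorted"):
    """Pick a silo purely from the file extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    for silo, suffixes in SILOS.items():
        if suffix in suffixes:
            return silo
    return default
```

With this mapping, `silo_for("IMG_0001.JPG")` gives `"photos"` and `silo_for("taxes-2023.pdf")` gives `"documents"`. Anything unrecognized falls into `"unsorted"` instead of being guessed at.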
Humans are not good at organizing data. We forget our schemas. We can’t predict future access patterns. We waste hours building folder structures that become useless. Computers, however, are excellent at indexing and searching. Let them do what they’re good at.
So, don’t build a universal file tagging system. Use specialized silos.
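To show how little work indexing is for a computer, here is a toy inverted index, the basic structure behind full-text search. It is a sketch only: `build_index` and `search` are names invented for this example, and real silos use far more capable engines (stemming, ranking, OCR text, ML embeddings).

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


docs = {
    "receipt.pdf": "Hardware store receipt, paid in cash",
    "tax-2023.pdf": "Income tax return 2023, paid by transfer",
}
index = build_index(docs)
print(sorted(search(index, "tax 2023")))  # ['tax-2023.pdf']
```

Nobody tagged anything here. The index comes entirely from the text, so it can be rebuilt from scratch whenever you want to search differently.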
Granted, a lot of the stability that comes from this mindset only kicks in when you start self-hosting. Otherwise, you’re at the mercy of someone else’s product staying available. But I just thought I’d share this mindset that I’ve been embracing for a few years now after trying (and failing) to organize everything in my life.