Say I want to download this discussion site in its current state. How would I achieve that? I basically want the archive to be an offline mirror of the site, with all the posts and comments available without an internet connection.
This could probably work: https://www.httrack.com/
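For a rough idea of what a mirroring tool does under the hood, here is a minimal Python sketch. It is not how HTTrack itself is implemented, just an illustration of the general crawl-and-save idea. The names `mirror`, `local_path`, `LinkCollector`, and `fetch_over_http` are made up for this example, and so is the on-disk layout. It only follows `<a href>` links on the same host, ignores query strings when naming files, does not rewrite links for offline browsing, and does not check robots.txt or rate limits, so treat it as a starting point only.

```python
from collections import deque
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def local_path(url, root):
    """Map a URL to a file under root (hypothetical layout, query strings ignored)."""
    parsed = urlparse(url)
    path = parsed.path
    if path == "" or path.endswith("/"):
        path += "index.html"
    elif "." not in path.rsplit("/", 1)[-1]:
        path += ".html"
    return Path(root) / parsed.netloc / path.lstrip("/")


def fetch_over_http(url):
    """Download a page and return its text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def mirror(start_url, root, fetch=fetch_over_http, max_pages=100):
    """Breadth-first crawl of same-host pages, saving each one under root."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    saved = []
    while queue and len(saved) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        target = local_path(url, root)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(html, encoding="utf-8")
        saved.append(url)

        parser = LinkCollector()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before comparing.
            absolute = urldefrag(urljoin(url, href)).url
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return saved
```

Calling `mirror("https://forum.example/", "archive")` (a placeholder address) would save up to 100 pages under `archive/forum.example/`. The `fetch` parameter is there so the crawl logic can be tried against a fake site without touching the network.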
I know this doesn’t help answer your question, but may I ask why you want to do this?
If you want to save a copy of the site, you can just use the Internet Archive’s Wayback Machine to take a snapshot of it.
Why would you need an offline copy of an entire forum? I can only imagine that it would be huge, and I can’t see the use for it.
It’s just for archival purposes. As for the day-to-day use case, I just want to save posts I like offline so I don’t have to worry about the site ever going down, which has happened to me way too often. The Wayback Machine does just that, but I’ve found it to be too unreliable. Maybe I’m not doing it right, but it’s not offline anyway, so why bother. Though I do wonder if I can self-host something like that and not rely on the Internet Archive…
I’m not sure this would work for sites like this one, but for wiki sites specifically you can make use of this:
Be warned, it’s slow to crawl and download the sites. But I guess it can be useful for things like the Arch Wiki that have tons of information.
I guess a valid use case for doing this may be to train an AI locally so it can answer quick questions even while offline. Not sure how useful this really is, but at least it’s a good project for getting experience with LLMs, or even as a proof of concept.
In the past I tried everything but only this software worked for my use case.
Not OP, but it probably wouldn’t be that big. This forum is text-heavy, and text is light on resources.
I thought people would have looked into this more, but you could achieve this with https://youzim.it/ and then use a ZIM client (Kiwix) to display it.