As someone who follows AFDS, the Ad-Filtering Dev Summit (read: drinks too much AFDS kool-aid), to me all those ML/AI projects, including AdGuard’s (using LLMs for view / content classification), sound inevitable for content filtering (even though they might not be).
Imo, where AdGuard’s current approach falls short is that the code to display the ad/banner/widget has already executed before the LLM has had a chance to classify anything. The trackerware / adware code that these ads/widgets load is equally (if not more) invasive. I mean, some folks recommend blocking in-browser ads for security! They’re not hoping to merely filter the ads cosmetically but to filter out the code itself.
When Gemini 2 (and subsequently GPT 4) launched with 1M token context windows, I regularly fed them webpages and the JavaScript on those webpages and asked questions about the kinds of API calls, fingerprinting, and PII violations they were seeing, and to my mild surprise, they were pretty good at it. Especially when you consider that most of the code they were looking at was obfuscated and minified. I even found a tonne of trackerware-vending domains that none of the blocklists I use had (like Badmojr’s 1Hosts and @ph00lt0’s Blocklist; ex: Block `metadata.io` · Issue #49 · ph00lt0/blocklist · GitHub).
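For illustration only, here is a rough sketch of the mechanical part of that workflow: pulling third-party hostnames out of a page and checking which ones a blocklist doesn't already cover. It is not the LLM analysis itself, and every name in it (`ResourceHostCollector`, `parse_hosts_blocklist`, `unlisted_third_party_hosts`, the `.example` domains) is made up for the example. It assumes a hosts-style or plain-domain blocklist and treats a listed domain as covering its subdomains, which real blocklist formats may or may not do.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ResourceHostCollector(HTMLParser):
    """Collects hostnames from src/href attributes of resource-loading tags."""

    LOADING_TAGS = {"script", "img", "iframe", "link"}

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag not in self.LOADING_TAGS:
            return
        for name, value in attrs:
            if name in ("src", "href") and value:
                # Relative URLs have no hostname and are skipped.
                host = urlparse(value).hostname
                if host:
                    self.hosts.add(host.lower())


def parse_hosts_blocklist(text):
    """Parses 'domain' or '0.0.0.0 domain' lines, ignoring '#' comments."""
    blocked = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            blocked.add(line.split()[-1].lower())
    return blocked


def is_blocked(host, blocked):
    """Assumption: a listed domain also covers all of its subdomains."""
    labels = host.split(".")
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))


def unlisted_third_party_hosts(html, page_host, blocked):
    """Returns third-party hosts referenced by the page that the list misses."""
    collector = ResourceHostCollector()
    collector.feed(html)
    return sorted(
        h for h in collector.hosts
        if h != page_host
        and not h.endswith("." + page_host)
        and not is_blocked(h, blocked)
    )
```

The hosts this returns are only candidates. Deciding whether one is actually trackerware still needs someone (or some model) to look at what the loaded code does, and it won't catch anything injected dynamically by scripts at runtime.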
Disclaimer: I co-develop Rethink DNS + Firewall, which is very similar to AdGuard for Android in capabilities. We do have plans to add on-device LLM-driven domain-name/text analysis & classification in Rethink, but mainly from a security point of view (domains hosting/serving phishing webpages, for example; notifications/SMS with smishing attacks, etc.).
Can you provide the text here? The page is broken for me and only lists their blocker, it’s not an article.