"A look at search engines with their own indexes"

Google search is good at protecting users from spam and phishing sites as it detects malicious URLs and removes them to prevent users from clicking on them.

This is a super old post, but I’d like to challenge this claim. Here’s an example of Google returning a malicious link as the first result when users search for KeePass, instead of the actual website.

This behavior is enabled by Google’s bizarre decision not to display Punycode in URLs, and it is made worse by Google’s inability to curtail malvertising. I’ve seen multiple Ars Technica articles about malvertising on Google. They can’t seem to get a handle on it.

Compare this to another search engine, like Mojeek, which doesn’t index Punycode URLs at all:

Our approach is not to index them. We’ve discussed this a few times before but for now we’re focussing on the large quantity of non-punycode URLs.

That’s not to say that we’d never look at this area, but this is the first time a question has been asked about it, so it’s not a priority at this time.
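To make the idea concrete, here is a rough sketch of what "don't index Punycode URLs" could look like as a filter. This is not Mojeek's (or anyone's) actual code. The function names and the candidate list are made up for illustration, and the look-alike hostname is just an example of a Punycode-encoded label. The only fact it relies on is that Punycode-encoded hostname labels start with `xn--`.

```python
from urllib.parse import urlsplit

ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded hostname label


def has_punycode_label(url: str) -> bool:
    """Return True if any label of the URL's hostname is Punycode-encoded."""
    # urlsplit lowercases the hostname; it is None when the URL has no scheme.
    host = urlsplit(url).hostname or ""
    return any(label.startswith(ACE_PREFIX) for label in host.split("."))


def display_form(host: str) -> str:
    """Decode an ASCII hostname to the Unicode form a user might be shown."""
    # Python's built-in "idna" codec decodes each xn-- label.
    return host.encode("ascii").decode("idna")


# Hypothetical crawl candidates, for illustration only.
candidates = [
    "https://keepass.info/",
    "https://xn--eepass-vbb.info/",
]

# Mojeek-style policy as described above: skip Punycode hosts entirely.
indexable = [url for url in candidates if not has_punycode_label(url)]

for url in candidates:
    host = urlsplit(url).hostname or ""
    status = "skip" if has_punycode_label(url) else "index"
    print(f"{status}: {host} (shown to users as {display_form(host)})")
```

A "mark instead of filter" policy, like the one requested in the Kagi thread, could reuse the same check and show the raw `xn--` form next to the result rather than dropping it.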

Knowing this, I don’t have much confidence that Google’s results are free of spam or malware sites.

Kagi is working on this: Filter out or mark punycode domains - Kagi Feedback

Also: DuckDuckGo uses Bing’s index, so the same criticism applies to it and to every other Bing-based search engine. Search Engine Map is a pretty good way of identifying these search engines.
