I’ve spent the last 3 months building a crawler to index the public parts of Telegram (https://telehunt.org). The native search is essentially a black box that favors the top 0.1% of bots, leaving the rest almost invisible.
The Tech: I had to deal with rate limits and the lack of a global 'sitemap'. I’m currently using a hybrid approach of metadata scraping to keep the index fresh.
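For anyone wondering what "dealing with rate limits" can look like in practice, here is a minimal sketch of retrying with exponential backoff. It is an illustration under assumptions, not the actual crawler code: `RateLimited`, `fetch_with_backoff`, and all parameter names are hypothetical, and `fetch` stands in for whatever call retrieves one page of metadata.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error a fetcher raises when the server asks it to slow down."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        # Seconds the server asked us to wait, if it said so (assumption).
        self.retry_after = retry_after


def fetch_with_backoff(fetch, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call fetch() and retry on RateLimited, waiting longer after each failure."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            if exc.retry_after is not None:
                # Prefer the server's own hint when one is available.
                delay = exc.retry_after
            else:
                # Otherwise double the wait each time, capped, with jitter
                # so parallel workers do not retry in lockstep.
                delay = min(max_delay, base_delay * 2 ** attempt)
                delay *= random.uniform(0.5, 1.0)
            sleep(delay)
```

The `sleep` parameter is injectable only so the wait can be swapped out in tests; a real crawler would leave the default.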
The Goal: It’s an experiment in making 'un-indexable' bot data discoverable.