It looks nice, and I'm pleasantly surprised by the search results I got on the demo page. The AI response is more or less similar to what I'm used to, the image results contain YouTube thumbnails from videos that are highly relevant to the topic I asked about (including one that took me a while to stumble upon, but is a huge knowledge source), and the text results are decent...
I never looked into private/self-hosted search. How does such a service gather data from the web? What's the original source, who does the scraping, and how do you update it?
Glad for your review!
About the source of the search results, both text and images, they're all from [SearXNG](https://github.com/searxng/searxng/).
> SearXNG is a free internet metasearch engine which aggregates results from more than 70 search services. Users are neither tracked nor profiled.
SearXNG by itself offers a full-stack platform for you to run searches privately (you can find public instances at <https://searx.space/>, and easily host it yourself [via Docker](https://github.com/searxng/searxng-docker)).
As for how SearXNG scrapes other search engines, it's really simple: HTTP calls and HTML parsing (for most of them).
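To make the "HTTP call plus HTML parsing" idea concrete, here is a minimal sketch using only the Python standard library. It is not SearXNG's actual engine code: the `result-link` class name, the sample markup, and the helper names are all hypothetical, since every upstream engine has its own page structure.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_html(url: str) -> str:
    """Plain HTTP GET; the User-Agent value here is an arbitrary placeholder."""
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


class ResultLinkParser(HTMLParser):
    """Collects <a class="result-link"> anchors (hypothetical markup)."""

    def __init__(self) -> None:
        super().__init__()
        self.results: list[dict[str, str]] = []
        self._href: str | None = None
        self._text: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "a" and "result-link" in classes:
            self._href = attr.get("href") or ""
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside a matching anchor.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.results.append({"url": self._href, "title": title})
            self._href = None


def parse_results(html: str) -> list[dict[str, str]]:
    parser = ResultLinkParser()
    parser.feed(html)
    parser.close()
    return parser.results


# Made-up results page, standing in for what fetch_html() would return.
SAMPLE_HTML = """
<ul>
  <li><a class="result-link" href="https://example.org/a">First <b>result</b></a></li>
  <li><a class="nav" href="/next">Next page</a></li>
  <li><a class="result-link" href="https://example.org/b">Second result</a></li>
</ul>
"""

if __name__ == "__main__":
    for item in parse_results(SAMPLE_HTML):
        print(item["title"], "->", item["url"])
```

In a real setup the string passed to `parse_results` would come from `fetch_html`, and the selector logic would differ per engine.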
In MiniSearch, I don't need to store the results myself. The scraping is done in real time by SearXNG and the results are passed to MiniSearch, which in turn runs a similarity search and filters out the textual results that don't seem useful.
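The filtering step could look roughly like the sketch below. It is only an illustration: it swaps in a simple bag-of-words cosine similarity, which may well differ from what MiniSearch actually uses, and the `title`/`content` keys, the `filter_results` name, and the `0.2` threshold are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two strings."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_results(query: str, results: list[dict], threshold: float = 0.2) -> list[dict]:
    """Keep results whose text is similar enough to the query, best first."""
    scored = [
        (cosine_similarity(query, f"{r['title']} {r['content']}"), r)
        for r in results
    ]
    kept = [(score, r) for score, r in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in kept]


if __name__ == "__main__":
    # Made-up results, shaped like a metasearch response might be.
    sample = [
        {"title": "Self-hosted search engine", "content": "Run a private search engine at home"},
        {"title": "Banana bread recipe", "content": "Flour, sugar, bananas"},
    ]
    for result in filter_results("private search engine", sample):
        print(result["title"])
```

Because nothing is stored between queries, a filter like this runs on each fresh batch of results.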
But I can say the real differentiator of MiniSearch is that it's mobile-first. From the beginning, it was made to run in mobile browsers (Chrome, Safari, Firefox), and [Wllama](https://github.com/ngxson/wllama) together with [Web-LLM](https://github.com/mlc-ai/web-llm), along with LLMs of <1B parameters, made that possible!
If you're curious, here's the HN post I made about it a year ago: https://news.ycombinator.com/item?id=37885752
Thank you for such a patient and rich response. I didn't realize that's how SearXNG works.
Fully agree with you that the mobile experience is the highlight of your project. It feels... not present. The way it should be - clean, intuitive, and focused on results, not the tool.
It really makes me want to add it to the things I self-host.