Building Observability with ClickHouse

(cmtops.dev)

50 points | by valyala 4 days ago

13 comments

  • BiteCode_dev an hour ago

    Interestingly, I recently interviewed Samuel Colvin, Pydantic's author, and he said that when designing his observability SaaS, LogFire, he tried multiple backends, including ClickHouse.

    But it didn't work out.

    One of the reasons is that LogFire lets users fetch their service data with arbitrary SQL queries.

    So they had to build their own backend in Rust, on top of DataFusion.

    I've used ClickHouse myself and it's been nice, but it's easy when you get to decide on the schema yourself. For small to medium needs, ClickHouse plus Grafana works well.
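
    For illustration, here is a minimal sketch of the "decide your own schema" approach: a hypothetical `logs` table and the kind of aggregation a Grafana panel would run against it. It assumes clickhouse-client talking to a local ClickHouse server; the table and column names are made up for the example.

    ```shell
    # Hypothetical log schema; assumes a running local ClickHouse instance.
    clickhouse-client --multiquery --query "
      CREATE TABLE IF NOT EXISTS logs (
          timestamp DateTime64(3),
          service   LowCardinality(String),
          level     LowCardinality(String),
          message   String
      ) ENGINE = MergeTree ORDER BY (service, timestamp);

      -- Errors per minute per service: the kind of query a Grafana panel issues.
      SELECT toStartOfMinute(timestamp) AS t, service, count() AS errors
      FROM logs
      WHERE level = 'error' AND timestamp > now() - INTERVAL 1 HOUR
      GROUP BY t, service
      ORDER BY t;
    "
    ```

    Sorting by `(service, timestamp)` keeps each service's logs physically together, which is what makes this kind of per-service time-range query cheap.
    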

    But I must admit that the plug-and-play aspect of great services like Sentry or LogFire makes them so easy to set up that it's tempting to skip self-hosting altogether. They are not that expensive (unlike Datadog), and maintaining your own observability code is not free.

  • zokier 4 hours ago

    I see a lot of hype around ClickHouse these days. A few years ago I remember TimescaleDB making the rounds, arguably as the predecessor of this sort of "observability on SQL" thinking. The article has a short paragraph mentioning Timescale, but unfortunately it doesn't really compare it to ClickHouse. How does HN see the situation these days: is ClickHouse simply overtaking Timescale on all axes? That seems a bit of a shame; I have used Timescale and enjoyed it, but only at such a small scale that its operational aspects never really came up.

    • valyala 3 hours ago

      ClickHouse outperforms TimescaleDB in every aspect on large volumes of data. https://benchmark.clickhouse.com/

      If you have small volumes of data (say, less than a terabyte), then TimescaleDB is fine to use, provided you are OK with not-so-fast query performance.

    • ekabod 3 hours ago

      ClickHouse has been popular for many years, even before Timescale.

    • shin_lao 2 hours ago

      Timescale "doesn't scale" - in a nutshell.

      ClickHouse performance is better because it's truly column-oriented and has powerful partitioning tools.

      However, ClickHouse has quirks and isn't great if you need low-latency data updates or if your data is mutable.
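
      Both points can be sketched concretely. The table below is hypothetical and assumes a local clickhouse-client; the `PARTITION BY` clause is the partitioning tool mentioned above, and the `ALTER TABLE ... UPDATE` shows why mutable data is awkward: updates are asynchronous "mutations" that rewrite whole data parts in the background rather than touching rows in place.

      ```shell
      # Hypothetical table; assumes a running local ClickHouse instance.
      clickhouse-client --query "
        CREATE TABLE IF NOT EXISTS events (
            ts      DateTime,
            user_id UInt64,
            status  String
        )
        ENGINE = MergeTree
        PARTITION BY toYYYYMM(ts)  -- whole months can be pruned or dropped cheaply
        ORDER BY (user_id, ts)
      "

      # Updates are async mutations that rewrite data parts in the background:
      # fine for a rare backfill, painful if your data changes frequently.
      clickhouse-client --query "
        ALTER TABLE events UPDATE status = 'archived' WHERE ts < '2024-01-01'
      "
      ```
      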

    • BiteCode_dev an hour ago

      I echo others' sentiment: ClickHouse is much more performant than Timescale.

  • ebfe1 an hour ago

    ClickHouse + Grafana is definitely a fantastic choice. Here is another blog post from ClickHouse about dogfooding their own technology and saving millions:

    https://clickhouse.com/blog/building-a-logging-platform-with...

    (Full disclosure: I work for ClickHouse and love it here!)

  • dakiol 4 hours ago

    Such a PITA. Unless you have a dedicated team to handle observability, you are in for pain, no matter the tech stack you use.

    • valyala 3 hours ago

      That's not true. There are logging solutions which are very easy to set up and operate. For example, VictoriaLogs [1] (I'm its author). It is designed from the ground up to be easy to configure and use. It ships as a single self-contained executable with no external dependencies, which runs optimally on any hardware, from a Raspberry Pi to a monster machine with hundreds of CPU cores and terabytes of RAM. It accepts logs over all the popular data ingestion protocols [2], and it provides a very easy-to-use query language for typical log-querying tasks: LogsQL [3].

      [1] https://docs.victoriametrics.com/victorialogs/

      [2] https://docs.victoriametrics.com/victorialogs/data-ingestion...

      [3] https://docs.victoriametrics.com/victorialogs/logsql/
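
      To give a flavor of the HTTP API described in the docs above, here is a sketch of ingesting and querying a log line. It assumes a VictoriaLogs instance listening on its default port 9428 on localhost; the field values are made up for the example.

      ```shell
      # Ingest one log entry as a JSON line (jsonline protocol);
      # assumes VictoriaLogs is running locally on port 9428.
      curl -s -X POST http://localhost:9428/insert/jsonline \
        -d '{"_msg":"connection refused","_time":"2024-01-01T00:00:00Z","level":"error","service":"api"}'

      # Query error logs from the last 5 minutes with LogsQL:
      curl -s http://localhost:9428/select/logsql/query \
        --data-urlencode 'query=_time:5m level:error'
      ```
      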

      • oulipo 21 minutes ago

        Interesting, although the docs are not really user-friendly and don't show many screenshots of the UI, so it's hard to get a sense of what the product can do.

  • k_bx 4 hours ago

    Another project I want to give a shout-out to is Databend. It's built around the idea of storing your data in S3-compatible storage as Parquet files and querying it with SQL or other protocols.

    Like many popular data lake solutions, but it's open source and written in Rust, which makes it quite easy to extend for the many people who already know the language.

    • valyala 3 hours ago

      Databend performance looks good! https://benchmark.clickhouse.com/#eyJzeXN0ZW0iOnsiQWxsb3lEQi...

      It looks like it has slightly worse on-disk data compression than ClickHouse, and slightly worse performance for some query types when the queried data isn't cached in the operating system page cache (e.g. when you query terabytes of data that don't fit in RAM), according to the link above.

      Are there additional features, besides S3 storage, that could convince a ClickHouse user to switch to Databend?

  • h1fra 2 hours ago

    Completely rewriting a system because you don't like JSON is a bit extreme