7 comments

  • unusual_typo 15 hours ago

    Here are the benchmark results. You can check more details in the repo.

    openai/privacy-filter on Apple M1 Max

       dtype              1k total    1k tok/s       8k total    8k tok/s
      ━━━━━━━━━━━━━━━━  ━━━━━━━━━━━  ━━━━━━━━━━  ━━━━━━━━━━━━━  ━━━━━━━━━━
       fp32              620.52 ms       1,664    4,893.86 ms       1,689
      ────────────────  ───────────  ──────────  ─────────────  ──────────
       fp16              654.56 ms       1,578    5,430.17 ms       1,521
      ────────────────  ───────────  ──────────  ─────────────  ──────────
       q4                582.13 ms       1,776    4,635.39 ms       1,784
      ────────────────  ───────────  ──────────  ─────────────  ──────────
       q4f16             648.10 ms       1,594    5,261.56 ms       1,570
      ────────────────  ───────────  ──────────  ─────────────  ──────────
       quantized int8    573.94 ms       1,801    4,594.95 ms       1,800
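
A quick sanity check on how the table's columns relate, as a sketch only. It assumes tok/s means tokens divided by total wall time, which the thread does not state. Under that assumption the "1k" input works out to roughly 1,033 tokens and the "8k" input to roughly 8,260 to 8,270 tokens. The small spread comes from the rounded tok/s figures.

```python
# Rows copied from the table above:
# dtype -> (1k total ms, 1k tok/s, 8k total ms, 8k tok/s)
ROWS = {
    "fp32": (620.52, 1664, 4893.86, 1689),
    "fp16": (654.56, 1578, 5430.17, 1521),
    "q4": (582.13, 1776, 4635.39, 1784),
    "q4f16": (648.10, 1594, 5261.56, 1570),
    "quantized int8": (573.94, 1801, 4594.95, 1800),
}


def implied_tokens(total_ms: float, tok_per_s: float) -> float:
    """Token count implied by a run, ASSUMING tok/s = tokens / total seconds."""
    return tok_per_s * total_ms / 1000.0


def summarize(rows):
    """Return dtype -> (implied 1k token count, implied 8k token count), rounded."""
    out = {}
    for dtype, (ms_1k, tps_1k, ms_8k, tps_8k) in rows.items():
        out[dtype] = (
            round(implied_tokens(ms_1k, tps_1k)),
            round(implied_tokens(ms_8k, tps_8k)),
        )
    return out


if __name__ == "__main__":
    for dtype, (n_1k, n_8k) in summarize(ROWS).items():
        print(f"{dtype:<16} ~{n_1k} tokens   ~{n_8k} tokens")
```
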
  • anoop_kumar a day ago

    I'd love an option that goes beyond redaction: swap the sensitive value for something else before it goes to the AI, then swap it back when the AI returns its response. Thanks for sharing the GitHub repo. I might submit a PR if I don't find that feature.

    • unusual_typo a day ago

      I wanted to implement that feature initially, but I realized it requires modifying the coding agents (e.g. Codex, Claude Code, opencode). Hooks or skills eventually pass the PII data to the server anyway, so I decided to share the standalone app first. Feel free to submit a PR!
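
For anyone picking up that PR, the swap-and-restore idea could look roughly like the sketch below. It is not the app's code. `PlaceholderVault` and the `(start, end, label)` span format are hypothetical, and the spans are assumed to come from whatever PII detector is in use. The sketch only covers the mapping between real values and placeholders, not the coding-agent integration mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class PlaceholderVault:
    """Hypothetical helper: maps detected PII values to placeholders and back."""

    forward: dict = field(default_factory=dict)  # real value -> placeholder
    reverse: dict = field(default_factory=dict)  # placeholder -> real value

    def mask(self, text: str, spans: list) -> str:
        """Replace each (start, end, label) span with a stable placeholder.

        Spans are assumed to be non-overlapping character offsets.
        """
        pieces, cursor = [], 0
        for start, end, label in sorted(spans):
            value = text[start:end]
            if value not in self.forward:
                # The same value always gets the same placeholder.
                placeholder = f"[{label}_{len(self.forward) + 1}]"
                self.forward[value] = placeholder
                self.reverse[placeholder] = value
            pieces.append(text[cursor:start])
            pieces.append(self.forward[value])
            cursor = end
        pieces.append(text[cursor:])
        return "".join(pieces)

    def unmask(self, text: str) -> str:
        """Put the original values back into text returned by the AI."""
        for placeholder, value in self.reverse.items():
            text = text.replace(placeholder, value)
        return text
```

Usage would be `masked = vault.mask(prompt, spans)` before the request and `vault.unmask(response)` after it. The vault stays on the local machine, so the real values never leave it.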

  • levi840714 a day ago

    Nice, local is the right call. What's the local AI model — a small NER model bundled in, or calling out to something? Curious about the size/footprint for a desktop app.

    • unusual_typo 15 hours ago

      It uses openai/privacy-filter, which is smaller than 1 GB. I haven't checked resource usage during inference. It runs at about 1k tok/s on my MacBook. I will update the repo with benchmark results. Thanks for the comment!

  • biduskamil 17 hours ago

    Local is the way. Any benchmarks on latency it has on CPU?

    • unusual_typo 15 hours ago

      I just ran the benchmark on my MacBook: 582 ms for 1k tokens and 4.64 s for 8k with the q4 variant.