
  • ailef 5 hours ago

    Hi HN!

    I built Before Upload, a small browser-based tool that checks files before you upload them anywhere, and to AI tools in particular.

    It mainly tries to catch PII and secret leaks (personal info, VAT codes, API keys/secrets) and prompt injection attacks. It runs locally in the browser: there is no backend analysis, no account, and the file is not uploaded anywhere.
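
    As a rough illustration of what a local, rule-based check for PII and secrets could look like, here is a minimal sketch. It is not Before Upload's actual code: the `Rule` and `Finding` types, the `scanText` function, and the three regexes are all assumptions made for the example, and a real rule set would need to be much broader.

```typescript
// Hypothetical rule set: names and regexes are illustrative assumptions,
// not the patterns Before Upload actually uses.
interface Rule {
  name: string;
  pattern: RegExp;
}

interface Finding {
  rule: string;
  match: string;
  index: number;
}

const RULES: Rule[] = [
  // Deliberately loose email matcher; it will miss some valid addresses.
  { name: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/ },
  // AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
  { name: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  // Italian VAT number with country prefix: "IT" followed by 11 digits.
  { name: "it-vat", pattern: /\bIT\d{11}\b/ },
];

function scanText(text: string, rules: Rule[] = RULES): Finding[] {
  const findings: Finding[] = [];
  rules.forEach((rule) => {
    // Rebuild each pattern with the "g" flag so exec() walks through every
    // match, and so no lastIndex state is shared between calls.
    const re = new RegExp(rule.pattern.source, "g");
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      findings.push({ rule: rule.name, match: m[0], index: m.index });
      // Guard against infinite loops on zero-length matches.
      if (m[0].length === 0) re.lastIndex++;
    }
  });
  return findings;
}
```

    In a browser, the input for plain-text formats could come from `await file.text()` on a `File` picked through an `<input type="file">`, which reads the content locally without any network request. Formats like DOCX or PDF would need a parsing step first, which this sketch leaves out.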

    Right now it supports:

    - DOCX

    - XLSX

    - PPTX

    - PDF

    - JPG / PNG

    - TXT / Markdown / HTML

    It's still early and definitely not perfect. I'm building an internal dataset as I go, with several examples of prompt injection attacks hidden in PDFs and Office files, and I'm working to improve detection against it. I'm also considering features like OCR and bulk analysis, but it's not yet clear which direction to take.

    Known limitations:

    - no OCR yet;

    - PDF hidden text detection is incomplete;

    - prompt injection detection is rule-based and can be noisy;

    - it can produce false positives, especially on AI/security documents.

    The current version is more of a warning tool than a cleaner. There are a few toggles in the advanced settings to customize the search behaviour. For example, you can disable prompt injection detection on visible text, which can generate a lot of false positives, especially when the text is technical and naturally contains instructions or code.
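
    To make the rule-based detection and the visible-text toggle concrete, here is a minimal sketch under stated assumptions. The `TextSpan` and `ScanSettings` types, the `checkVisibleText` flag, and the phrase list are all hypothetical and are not taken from the tool. It also assumes an earlier parsing step has already marked each span as hidden or visible.

```typescript
// Hypothetical types and phrases: illustrative assumptions only.
interface TextSpan {
  text: string;
  hidden: boolean; // set by a parser, e.g. for white-on-white or zero-size text
}

interface ScanSettings {
  checkVisibleText: boolean; // hypothetical name for the advanced-settings toggle
}

// A tiny phrase list; real rule sets would be larger and still noisy.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all |any )?(previous|prior|above) instructions/i,
  /disregard (the |your )?(system prompt|previous instructions)/i,
  /you are now [a-z]/i,
];

function findInjections(spans: TextSpan[], settings: ScanSettings): TextSpan[] {
  return spans.filter((span) => {
    // With the toggle off, visible text is skipped and only hidden spans are checked.
    if (!span.hidden && !settings.checkVisibleText) return false;
    return INJECTION_PATTERNS.some((re) => re.test(span.text));
  });
}
```

    The trade-off is the one described above: a visible sentence such as "To reset, ignore previous instructions and start over." is flagged when `checkVisibleText` is on, even though it may be harmless documentation, while the same phrase in hidden text is reported regardless of the toggle.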

    I'd really like feedback on whether people would find a tool like this useful, and in which environments.

    Thanks!