18 comments

  • wkcheng 43 minutes ago

    Does this support using the Parakeet model locally? I'm a MacWhisper user and I find that Parakeet is way better and faster than Whisper for on-device transcription. I've been using push-to-transcribe with MacWhisper through Parakeet for a while now and it's quite magical.

  • braden-w an hour ago

    For those checking out the repo this morning, I'm in the middle of a release that adds whisper.cpp support!

    https://github.com/epicenter-so/epicenter/pull/655

    After this pushes, we'll have far more extensive local transcription support. Just fixing a few more small things :)

  • dumbmrblah an hour ago

    I've been using Whispering for about a year now, and it has really changed how I interact with the computer. I make sure to buy mice and keyboards with programmable hotkeys so that I can use the shortcuts for Whispering. I can't go back to regular typing at this point; it just feels super inefficient. Thanks again for all your hard work!

  • mrs6969 21 minutes ago

    Am I not getting it correctly? It says local is possible, but I can't find any information about how to run it without an API key.

    I get the Whisper models, and then do what? How do I run it on a device without internet? There's no documentation about it...

  • glial an hour ago

    This is wonderful, thank you for sharing!

    Do you have any sense of whether this type of model would work with children's speech? There are plenty of educational applications that would value a privacy-first locally deployed model. But, my understanding is that Whisper performs pretty poorly with younger speakers.

  • solarkraft 2 hours ago

    Cool! I just started becoming interested in local transcription myself.

    If you add Deepgram listen API compatibility, you can do live transcription via either Deepgram (duh) or OWhisper: https://news.ycombinator.com/item?id=44901853

    (I haven't gotten the Deepgram JS SDK working with it yet; currently awaiting a response from the maintainers.)
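
    For anyone curious what that compatibility buys you, streaming to a Deepgram-listen-style WebSocket endpoint looks roughly like this (a sketch only; the local OWhisper URL, port, and auth header are guesses on my part):

      // Rough sketch of the Deepgram "listen" streaming shape, assuming a
      // hypothetical local OWhisper endpoint; uses the "ws" npm package.
      import fs from "node:fs";
      import WebSocket from "ws";

      const url =
        "ws://localhost:8080/v1/listen?encoding=linear16&sample_rate=16000"; // hypothetical
      const ws = new WebSocket(url, {
        headers: { Authorization: "Token local-dev" }, // Deepgram-style auth; may not be needed locally
      });

      ws.on("open", () => {
        // Send raw PCM in small chunks, the way a microphone capture would.
        const audio = fs.createReadStream("speech.raw", { highWaterMark: 4096 });
        audio.on("data", (chunk) => ws.send(chunk));
        audio.on("end", () => ws.send(JSON.stringify({ type: "CloseStream" })));
      });

      ws.on("message", (data) => {
        const msg = JSON.parse(data.toString());
        // Deepgram-shaped results carry the transcript here:
        const text = msg.channel?.alternatives?.[0]?.transcript;
        if (text) console.log(text);
      });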

  • Johnny_Bonk an hour ago

    Great work! I've been using Willow Voice, but I think I will migrate to this (much cheaper). They do have a great UI/UX, though: you just hit a key to start recording and the text goes into whatever text input you want. I haven't installed Whispering yet but will do so. P.S

    • braden-w an hour ago

      Amazing, thanks for giving it a try! Let me know how it goes and feel free to message me any time :) Happy to add any features that you miss from closed-source alternatives!

  • newman314 an hour ago

    Does Whispering support semantic correction? I was unable to find confirmation while doing a quick search.

    • braden-w an hour ago

      Hmm, we support prompts at two levels: 1. the model level (Whisper accepts a "prompt" parameter that sometimes works) and 2. the transformation level (inject the transcribed text into a prompt and get the output from an LLM of your choice). I'm not sure how else semantic correction could be implemented, but I'm always open to expanding the feature set greatly over the next few weeks!
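
      For concreteness, both levels look roughly like this as a minimal sketch with the OpenAI SDK (not the actual Whispering code; the file name, model names, and vocabulary prompt are placeholder assumptions):

        // Hedged sketch, not Whispering's implementation. Assumes the official
        // "openai" Node SDK and an ESM module with top-level await.
        import fs from "node:fs";
        import OpenAI from "openai";

        const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

        // 1. Model-level prompt: Whisper's "prompt" parameter can bias spelling
        //    and terminology, though it is not always honored.
        const transcription = await openai.audio.transcriptions.create({
          file: fs.createReadStream("recording.wav"), // placeholder file
          model: "whisper-1",
          prompt: "Vocabulary: Epicenter, Whispering, whisper.cpp, Tauri",
        });

        // 2. Transformation level: pipe the raw transcript through an LLM of
        //    your choice to fix casing, punctuation, and domain terms.
        const corrected = await openai.chat.completions.create({
          model: "gpt-4o-mini", // placeholder model
          messages: [
            { role: "system", content: "Fix transcription errors. Keep the meaning otherwise." },
            { role: "user", content: transcription.text },
          ],
        });

        console.log(corrected.choices[0].message.content);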

      • joshred an hour ago

        They might not know how Whisper works. I suspect the answer to their question is 'yes', and the reason they can't find a straightforward answer in your project is that the answer is so obvious to you that it's hardly worth documenting.

        Whisper transcribes by transforming audio into language-model output. The transcripts generally have proper casing and punctuation, and they can usually stick to a specific domain based on the surrounding context.

  • satisfice 36 minutes ago

    Windows Defender says it is infected.

    • barryfandango 19 minutes ago

      I'm no expert, but since it acts as a keyboard wedge it's likely to be unpopular with security software.

    • sa-code 21 minutes ago

      This needs to be higher; the installer in the README has a trojan.

  • codybontecou an hour ago

    Now we just need text-to-speech so we can truly interact with our computers hands-free.