OpenAI Realtime API: The Missing Manual

(latent.space)

2 points | by swyx 3 days ago

5 comments

  • swyx 3 days ago

    hey folks! I'm fresh off presenting a well-received Realtime API demo (https://x.com/swyx/status/1859607840639549871) at the 3rd OpenAI devday. There have been a lot of little road bumps along the way building with the new API, and I've benefited a lot from advice from @kwindla, so I figured it'd be worth publishing a guest post with him on everything we learned using this rather strange new audio (and soon video) API (which we think is quite different from standard text/JSON over HTTP).

    hope it is helpful to someone out there, enjoy!

    • mooreds 3 days ago

      Heya!

      I saw there was a mention of content moderation when the author discussed https://github.com/pipecat-ai/pipecat

      But when I went to the GitHub repo, I didn't see anything about that.

      I'm loosely related to the content moderation space through my employer, so wanted to learn more about that.

      • kwindla 3 days ago

        We've helped a number of Pipecat users hook into a variety of content moderation systems or use LLMs as judges.

        The most common approach is to use a `ParallelPipeline` to evaluate the output of the LLM at the same time as the TTS inference is running, then to cancel the output and call a function if a moderation condition is triggered.

        Other people have written custom frame processors to make use of the content moderation scoring in the Google and Azure APIs.
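        A rough sketch of that race-then-cancel pattern in plain asyncio (this is not the actual Pipecat API; `moderate`, `speak_tts`, and the "forbidden" trigger are made up for illustration): moderation runs concurrently with TTS, and if the check flags the text, the in-flight TTS task is cancelled mid-stream.

        ```python
        import asyncio

        async def moderate(text: str) -> bool:
            """Stand-in moderation check (hypothetical): flag a banned word."""
            await asyncio.sleep(0.01)      # simulate a moderation API round trip
            return "forbidden" in text

        async def speak_tts(text: str, spoken: list[str]) -> None:
            """Stand-in TTS: 'speaks' one word at a time."""
            for word in text.split():
                await asyncio.sleep(0.02)  # simulate streaming audio out
                spoken.append(word)

        async def guarded_output(text: str) -> list[str]:
            """Run TTS and moderation in parallel; cancel TTS if flagged."""
            spoken: list[str] = []
            tts = asyncio.create_task(speak_tts(text, spoken))
            if await moderate(text):       # moderation result arrives while TTS streams
                tts.cancel()               # stop audio output immediately
            else:
                await tts                  # clean text plays to completion
            return spoken

        print(asyncio.run(guarded_output("hello there friend")))
        print(asyncio.run(guarded_output("some forbidden words here")))
        ```

        In the real pipeline the cancel would also trigger a function call (e.g. to log the incident or substitute a canned response); the point is just that evaluation and synthesis overlap, so moderation adds little or no latency on the happy path.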

        If you're interested in building a Pipecat integration for your employer's tech, happy to support that. Feel free to DM me on Twitter.

        • mooreds a day ago

          Awesome, thanks! Will pass this along.

  • 3 days ago
    [deleted]