hey folks! I'm fresh off presenting a well-received Realtime API demo (https://x.com/swyx/status/1859607840639549871) at the 3rd OpenAI DevDay. There have been a lot of little road bumps along the way building with the new API, and I've benefited a lot from advice from @kwindla, so I figured it'd be worth publishing a guest post with him on everything we learned using this rather strange new audio (and soon video) API (which we think is quite different from standard text/JSON over HTTP).
hope it is helpful to someone out there, enjoy!
Heya!
I saw there was a mention of content moderation when the author discussed https://github.com/pipecat-ai/pipecat.
But when I went to the GitHub repo, I didn't see anything about it.
I'm loosely connected to the content moderation space through my employer, so I wanted to learn more.
We've helped a number of Pipecat users hook into a variety of content moderation systems or use LLMs as judges.
The most common approach is to use a `ParallelPipeline` to evaluate the output of the LLM at the same time as the TTS inference is running, then to cancel the output and call a function if a moderation condition is triggered.
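For concreteness, here's a minimal sketch of that shape, assuming Pipecat's Python API as of late 2024 (frame and class names may differ in your version). `check_moderation()` is a hypothetical stand-in for whatever moderation call you use, and `transport`, `stt`, `llm`, and `tts` are whatever services you've already configured:

```python
from pipecat.frames.frames import BotInterruptionFrame, Frame, TextFrame
from pipecat.pipeline.parallel_pipeline import ParallelPipeline
from pipecat.pipeline.pipeline import Pipeline
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


class ModerationGate(FrameProcessor):
    """Watches LLM text frames and interrupts the bot on a violation."""

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)
        if isinstance(frame, TextFrame) and await check_moderation(frame.text):
            # Push an interruption upstream to cancel in-flight TTS output,
            # then hand off to whatever recovery function you've registered.
            await self.push_frame(BotInterruptionFrame(), FrameDirection.UPSTREAM)
            return
        await self.push_frame(frame, direction)


# One branch speaks; the other moderates the same LLM output concurrently.
pipeline = Pipeline([
    transport.input(),
    stt,
    llm,
    ParallelPipeline(
        [tts, transport.output()],  # audio path
        [ModerationGate()],         # moderation path
    ),
])
```

The nice property is that the moderation branch sees the same LLM frames as the TTS branch, so it adds no latency to the audio path; you only pay for it when a condition actually triggers.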
Other people have written custom frame processors to make use of the content moderation scoring in the Google and Azure APIs.
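A rough skeleton of that kind of custom frame processor looks like the following; `get_moderation_scores()` is a hypothetical wrapper around the Google or Azure scoring endpoint, assumed here to return a dict of category names to confidence scores:

```python
from pipecat.frames.frames import Frame, TextFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


class ProviderModerationScorer(FrameProcessor):
    """Scores LLM text with a provider moderation API and drops violations."""

    def __init__(self, threshold: float = 0.8):
        super().__init__()
        self._threshold = threshold

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)
        if isinstance(frame, TextFrame):
            # Hypothetical wrapper around the Google/Azure moderation call.
            scores = await get_moderation_scores(frame.text)
            if max(scores.values(), default=0.0) >= self._threshold:
                # Drop (or flag, log, redact) the frame instead of
                # passing it downstream to TTS.
                return
        await self.push_frame(frame, direction)
```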
If you're interested in building a Pipecat integration for your employer's tech, happy to support that. Feel free to DM me on Twitter.
Awesome, thanks! Will pass this along.