OpenAI Realtime API: The Missing Manual

(latent.space)

2 points | by swyx 3 days ago

5 comments

  • swyx 3 days ago

    hey folks! I'm fresh off presenting a well-received Realtime API demo (https://x.com/swyx/status/1859607840639549871) at the 3rd OpenAI devday. There have been a lot of little road bumps along the way building with the new API, and I've benefited a lot from advice from @kwindla, so I figured it'd be worth publishing a guest post with him on everything we learned using this rather strange new audio (and soon video) API (which we think is quite different from standard text/JSON over HTTP).

    hope it is helpful to someone out there, enjoy!

    • mooreds 3 days ago

      Heya!

      I saw there was a mention of content moderation when the author discussed https://github.com/pipecat-ai/pipecat

      But when I went to the GitHub repo, I didn't see anything about that.

      I'm loosely related to the content moderation space through my employer, so wanted to learn more about that.

      • kwindla 3 days ago

        We've helped a number of Pipecat users hook into a variety of content moderation systems or use LLMs as judges.

        The most common approach is to use a `ParallelPipeline` to evaluate the output of the LLM at the same time as the TTS inference is running, then to cancel the output and call a function if a moderation condition is triggered.

        Other people have written custom frame processors to make use of the content moderation scoring in the Google and Azure APIs.
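        A rough sketch of that race-then-cancel pattern in plain asyncio (this is not the actual Pipecat API; `moderate`, `speak_tts`, and the "forbidden" trigger are made up for illustration): moderation runs concurrently with TTS, and if the check flags the text, the in-flight TTS task is cancelled mid-stream.

        ```python
        import asyncio

        async def moderate(text: str) -> bool:
            """Stand-in moderation check (hypothetical): flag a banned word."""
            await asyncio.sleep(0.01)      # simulate a moderation API round trip
            return "forbidden" in text

        async def speak_tts(text: str, spoken: list[str]) -> None:
            """Stand-in TTS: 'speaks' one word at a time."""
            for word in text.split():
                await asyncio.sleep(0.02)  # simulate streaming audio out
                spoken.append(word)

        async def guarded_output(text: str) -> list[str]:
            """Run TTS and moderation in parallel; cancel TTS if flagged."""
            spoken: list[str] = []
            tts = asyncio.create_task(speak_tts(text, spoken))
            if await moderate(text):       # moderation result arrives while TTS streams
                tts.cancel()               # stop audio output immediately
            else:
                await tts                  # clean text plays to completion
            return spoken

        print(asyncio.run(guarded_output("hello there friend")))
        print(asyncio.run(guarded_output("some forbidden words here")))
        ```

        In the real pipeline the cancel would also trigger a function call (e.g. to log the incident or substitute a canned response); the point is just that evaluation and synthesis overlap, so moderation adds little or no latency on the happy path.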

        If you're interested in building a Pipecat integration for your employer's tech, happy to support that. Feel free to DM me on Twitter.

        • mooreds a day ago

          Awesome, thanks! Will pass this along.

  • 3 days ago
    [deleted]