18 comments

  • foundry27 14 hours ago

    The prompting strategies in this post made me remember a funny anecdote from this Thanksgiving. My older family members had been desperately trying to get ChatGPT to write a good poem about spume (the white foam you see in waves), and no matter how many ways they explicitly prompted it to not write in rhyming couplets, it dutifully produced a rhyming couplet poem every time. There’s clearly an enormous volume of poems in the training data written in this form, and it was practically impossible to escape that local minimum within the latent space of the model, like the half-full wine glass imagery. They only succeeded at generating a poem written the way they wanted when they first prompted ChatGPT to reason through the elements of good poetry writing regardless of style, and then generate a prompt to write a poem following those guidelines. Naturally, that produced a lovely poem on the first attempt with that prompt!

    It’s pretty well known at this point, but it seems like when it comes to prompting these models, telling them what to do or not do is less effective than telling them how to go through the process of achieving the outcome. You need to get them to follow steps to reach a conclusion, or they’ll just follow the statistical path of least resistance.

    Edit: the poem: https://paste.ee/d/rIbLa/0

    • negoutputeng 11 hours ago

      thanks for this comment ! it clarifies the function of the llm well.

      ie, use it as a template-generating search-engine helper for most common things. for uncommon things, you have to prompt-guide it to get what you want.

  • mikeocool 14 hours ago

    This aligns with my experience using image generators —- I can get them to generate really weird unique combinations of things (ie “an octopus dancing with a cat”) but when asking for a relatively common image with 1-2 unique aspects they seem to just generate the common image.

    The folks who were able to get it by having ChatGPT generate a really detailed prompt is kind of interesting. When I’ve tried writing my own detailed prompts, it seems like the limit on detail is relatively low before it starts going completely off the rails.

  • mensetmanusman 14 hours ago

    It’s not possible if there are ~no images like this in the training set.

    • speedgoose 14 hours ago

      These AIs can generate images that are quite different to anything found in the training set. This case seems more about overfitting.

    • epolanski 14 hours ago

      Have you read all the comments?

      Some users achieved it by having the chatbot describe in detail a glass full of wine, and then use that output in a new context to generate the image.

      I think that's a very interesting take away.

      • JofArnold 14 hours ago

        This is the approach I use to get ChatGPT to generate images that get around copyright. E.g.

        Me: Create something in the style of Escher AI: Can't due to copyright Me: Precisely describe in detail <insert artwork here> such that I can use it as a prompt to generate an image AI: <prompt> Me: <prompt>

        9 times out of 10 it works really well.

    • jjbinx007 14 hours ago

      In all fairness there probably aren't any images of Gordon Ramsey riding an Ostrich on the moon in its training set either but it manages that.

      I tried this prompt several times in Ideogram, both as realistic and also design-based images and it couldn't do it at all.

      I haven't yet tried it with a more elaborate prompt but it's interesting to me that it can do the most incredible and amazing things but can't do something that sounds simple.

      • mensetmanusman 13 hours ago

        It wouldn’t be able to do that without an ostrich in the training set. There is a subtle but important difference between combining and what is being combined.

    • echoangle 11 hours ago

      Why would this be the case? These AI image generators can generate very weird combinations of stuff that certainly aren’t shown in a training image.

    • Topfi 14 hours ago

      Maybe take a look through the linked thread, some did manage to do so.

      • mensetmanusman 13 hours ago

        I eventually went the route of describing capillary effects of water, changing the water color, changing the type of water container.

        Managing the travel through the knowledge graphs is becoming a skill :)

  • jareklupinski 13 hours ago

    looking at the wine-red beret perched on top of the glass

    https://preview.redd.it/sije3nm0imwd1.jpeg?width=1024&format...

    I think the backend needs to be tuned on the fact that anything, not just hats, can have a brim

  • 14 hours ago
    [deleted]
  • aaron695 14 hours ago

    [dead]

  • DemocracyFTW2 12 hours ago

    sort of nonsensical to link to old.reddit.com which shows pictures only as an `<image>` placeholder

    • tverrbjelke 11 hours ago

      Saved me. I am used to open a pic-link into a new tab to see pictures. The "normal" reddit is almost dysfunctional for me. (I am using noscript to narrow down to actually really needed things, but reddit wants too much and even with scripts allowed needs pletora of clicks to even only get a normal thread of comments and it seems to load half of yet another OS into my browser, which takes too long)

      • pigeons 5 hours ago

        Everytime I somehow go to reddit without the old. subdomain I think "How does anyone use this? How is it still around." Just happened to me again yesterday.