LeJEPA

(arxiv.org)

42 points | by nothrowaways 7 hours ago

9 comments

  • cl42 5 hours ago

    This Yann LeCun lecture is a nice summary of the conceptual model behind JEPA (+ why he isn't a fan of autoregressive LLMs): https://www.youtube.com/watch?v=yUmDRxV0krg

    • krackers 5 hours ago

      Is there a summary? Every time I try to understand more about what LeCun is saying, all I see are strawmen of LLMs (like claims that LLMs cannot learn a world model or that next-token prediction is insufficient for long-range planning). There are lots of tweaks you can do to LLMs without fundamentally changing the architecture, e.g. looped latents, or adding additional models as preprocessors for input embeddings (in the way that image tokens are formed).

      I can buy that a pure next-token prediction inductive bias for training might turn out to be inefficient (e.g. there's clearly lots of information in the residual stream that's being thrown away), but it's not at all obvious a priori, to me as a layman at least, that the transformer architecture is a "dead end".

      • sbinnee an hour ago

        You don’t sound like a layman, knowing about looped latents and the rest :)

  • rfv6723 5 hours ago

    > using imagenet-1k for pretraining

    LeCun still can't show JEPA to be competitive at scale with autoregressive LLMs.

    • welferkj an hour ago

      It's ok, autoregressive LLMs are a dead end anyway.

      Source: Y. LeCun.

  • byyoung3 5 hours ago

    JEPA shows little promise over traditional objectives in my own experiments.

    • eden-u4 an hour ago

      What type of experiments did you run in less than a week to be so dismissive? (seriously curious)

      • hodgehog11 13 minutes ago

        JEPA has been around for quite a while now, so many labs have had time to assess its viability.

  • suthakamal 5 hours ago

    More optimistic signal that it’s very early innings on the architectural side of AI, with many more orders of magnitude of power-to-intelligence efficiency to come, and less certainty that today’s giants’ advantages will be durable.