gptel: a simple LLM client for Emacs

(github.com)

156 points | by michaelsbradley 3 days ago

33 comments

  • sourcepluck 3 days ago

    In this post from yesterday, https://justine.lol/lex/, there is this quote:

    > The new highlighter and chatbot interface has made llamafile so pleasant for me to use, combined with the fact that open weights models like gemma 27b it have gotten so good, that it's become increasingly rare that I'll feel tempted to use Claude these days.

    That left me more tempted than ever to see if I can integrate some sort of LLM workflow locally. I would only consider doing it locally, and I have an older computer, so I didn't think this would be possible until reading that post yesterday.

    The only thing I wondered was: how would I work it into Emacs? And then today, this post. It looks very well integrated. Has anyone any experience using gemma 27b it with llamafile and gptel? I know very little about the whole space, really.

    • alwayslikethis 2 days ago

      Gemma 27b is pretty censored; I prefer mistral-small 22b. It works fairly well with ollama and gptel, but it still doesn't compare to the commercial offerings. llama3.1 70b is quite a bit smarter but doesn't fit into my VRAM. I expect llama3.1 405b would come quite close to the commercial offerings.
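
      For reference, wiring ollama into gptel is just a backend declaration. A minimal sketch (the model names are illustrative; use whatever `ollama list` reports on your machine):

          ;; Declare a local Ollama backend and make it gptel's default.
          (setq gptel-backend (gptel-make-ollama "Ollama"
                                :host "localhost:11434" ; Ollama's default address
                                :stream t               ; stream responses in
                                :models '(mistral-small))
                gptel-model 'mistral-small)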

    • whartung 3 days ago

      I'm on an Intel iMac, and llama can't leverage its GPU. So, it's 1990s slow. It's literally like talking to a machine in the 90s, from the slow response time to the 1200-2400 baud output.

      It's easy to give tasks to, hard to have a conversation with. Just paste the task in and let it churn, come back to it later.

    • omneity 3 days ago

      Hearing Gemma 27B called “so good I’ll rarely reach for Claude” (paraphrased) is quite a surprise. Perhaps it holds for simpler workflows?

      My experience with the whole Gemma line, 27B included, has been less than stellar, especially in instruction-following ability.

  • michaelsbradley 3 days ago

    And here’s a nice writeup:

    gptel: Mindblowing integration between Emacs and ChatGPT

    https://www.blogbyben.com/2024/08/gptel-mindblowing-integrat...

  • BaculumMeumEst 3 days ago

    gptel is great because it does exactly what you would expect and stays up to date with new models from anthropic and openai. Before settling on gptel, I went through FOUR programs that had a lot of buzz but were not being kept up to date with new models!

    gptel has joined magit and undo-tree in being so damn useful and reliable that they are keeping me from ditching emacs, even though I want to.

    • __mharrison__ 3 days ago

      Mentioning that it is on par with magit is a strong recommendation. Going to try this out tomorrow.

      • marci 3 days ago

        Gptel leverages Magit's interface.

        • johanvts 3 days ago

          Yes, it's called transient, and it's pretty simple to use in your own projects. I wrote a transient for my work tasks, and now e.g. creating a new YouTrack issue is effortless.
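
          A minimal sketch of such a prefix; `my/create-youtrack-issue` and `my/search-youtrack-issues` below are hypothetical commands standing in for your own:

              (require 'transient)

              ;; Hypothetical commands; replace with your own work tasks.
              (transient-define-prefix my/work-tasks ()
                "Dispatch common work tasks."
                ["Actions"
                 ("i" "Create YouTrack issue" my/create-youtrack-issue)
                 ("s" "Search YouTrack issues" my/search-youtrack-issues)])

              ;; Bind the prefix somewhere convenient:
              (global-set-key (kbd "C-c w") #'my/work-tasks)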

    • nanna 2 days ago

      Personally I ditched undo-tree for Vundo (visual undo), which has all the functionality that I need from undo-tree whilst being significantly lighter. Give it a spin!

      https://github.com/casouri/vundo

    • jasonm23 2 days ago

      It's kmacros, wgrep (with ripgrep) and wdired that keep me using it....

      VSCode / VSCodium have really pulled me away from it a lot this year though, after ~30 years.

      • BaculumMeumEst 8 hours ago

        Yeah, I like a lot about VS Code, and it feels far less annoying to use. It just sucks that there's no equivalent to undo-tree, and the Git equivalents are not up to snuff (I just end up using the CLI, which is fine, I guess). But gptel is also super useful, and I haven't found a good VS Code equivalent for that either.

        wgrep is nice; I use it very rarely, but it's useful. I did not know about wdired; looking that up now, it looks very cool. You are NOT helping me escape this damned editor...

    • PestoDiRucola 3 days ago

      May I ask why you want to ditch Emacs, and in favor of what?

      • autumn-antlers 2 days ago

        Not OP, but I feel strongly about this.

        I've got a lot of Lisp experience, and love Emacs' values and ecosystem. I still use neovim regularly because of a tangible sense of security, a sure-footedness of action, derived from consistency and low latency. It takes a combination of confident action on the user's part and confident response from the machine to sustain the user's experience of both speed and confidence. (IMO) Emacs fails to deliver this experience plainly on latency, due to main-thread blocking (even with more than 8MB of RAM). It fails partly due to the greater variety of modes and (as an evil user) the lack of "core" support for Vim bindings, which create a higher sense of care and vigilance, but I really think that could be overcome if one's config were kept small and the ecosystem tuned more towards robustness and optimization in visible UI, tactile UI, and multi-threading.

        In favor of what, I don't know. Something that explicitly aspires to feature parity with a modern Emacs stack (vertico/corfu/marginalia/transient/tramp), but which sacrifices some aspect of the platform's flexibility (e.g. make plugins use transient and allow consistent global user prefs) to prioritize consistency, latency, and robustness for the sake of UX.

  • kleiba 3 days ago

    This is really sweet. I've only recently started dabbling in AI-assisted programming, and I think this integration into Emacs is really smooth.

    What would be really neat is to add REPL-like functionality to an LLM buffer so that code generated by the LLM can be evaluated right away in place.

    • karthink 3 days ago

      For a REPL-like interface, you could try the chatgpt-shell package. It can execute code generated by the LLM, though it too does this using org-babel; it just calls org-babel functions under the hood. It's also OpenAI-only right now, although the author plans to add support for the other major APIs.

      gptel has a buffer-centric design because it tries to get out of your way and integrate with your regular Emacs usage. (For example, it's even available _in_ the minibuffer, in that you can call it in the middle of calling another command, and fill the minibuffer prompt itself with the text from an LLM response.)

    • foobarqux 3 days ago

      gptel can output org-babel blocks.

      • TeMPOraL 3 days ago

        Indeed; fire up gptel-mode in an Org Mode buffer, and you'll get to work with Org Mode, including code blocks with whatever evaluation support you have configured in your Emacs.
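
        The evaluation support is just the standard org-babel setup. For instance, a minimal sketch enabling a couple of languages:

            ;; Allow Org to evaluate python and shell source blocks
            ;; (emacs-lisp blocks work out of the box).
            (org-babel-do-load-languages
             'org-babel-load-languages
             '((python . t)
               (shell . t)))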

        Also, I really like the design of the chat feature: the interactive chat buffer is still just a plain Markdown buffer, which you can simply save to a file to persist the conversation. Unlike with typical interactive buffers (e.g. shell), nothing actually breaks: gptel-mode just appends the chat settings to the buffer in the standard Emacs fashion (key/value comments at the bottom of the file), so to continue from where you left off, you just need to open the file and run M-x gptel.

        (This also means you can just run M-x gptel in a random Markdown buffer, or an Org Mode buffer if you want the aforementioned org-babel functionality; as long as the gptel minor mode is active, saving the buffer will also update the persisted chat configuration.)
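
        To make that concrete, the appended settings form an ordinary Emacs file-local variables block, roughly like the sketch below (the exact variable names and value formats are from memory and may differ between gptel versions):

            <!-- Local Variables: -->
            <!-- gptel-model: gpt-4o-mini -->
            <!-- gptel--backend-name: "ChatGPT" -->
            <!-- gptel--bounds: ((402 . 1523)) -->
            <!-- End: -->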

        • kleiba 3 days ago

          Org code blocks are great but not quite the same as having a REPL. But like I said above, I think this is really a great piece of software. I can definitely see this being a game changer in my daily work with Emacs.

          • TeMPOraL 3 days ago

            Used the right way, Org mode code blocks are better, though setting things up to allow this can be tricky, and so I rarely bother.

            What I mean is: the first difference between a REPL and an Org Mode block (of non-elisp code[0]) is that in a REPL, you eval code sequentially in the same runtime session; in contrast, org-babel will happily run each execution in a fresh interpreter/runtime, unless steps are taken to keep a shared, persistent session. But once you get that working (which may be more or less tricky, depending on the language), your Org Mode file effectively becomes a REPL with editable scrollback.
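
            Concretely, the shared session comes from org-babel's :session header argument. A small sketch (the language and session name are arbitrary):

                #+begin_src python :session llm-scratch
                x = 42
                #+end_src

                #+begin_src python :session llm-scratch
                x * 2  # same interpreter session, so x is still bound => 84
                #+end_src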

            This may not be what you want in many cases, but it is very helpful when you're collaborating with an LLM - being able to freely edit and reshape the entire conversation history is useful in keeping the model on point, and costs in check.

            --

            [0] - Emacs Lisp snippets run directly in your Emacs, so your current instance is your session. It's nice that you get a shared session for free, but it also sucks, as there is only ever one session, shared by all elisp code you run. Good luck keeping your variables from leaking out to the global scope and possibly overwriting something.
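
            A quick sketch of that leakage problem:

                ;; In an elisp block, `setq' on an undeclared variable
                ;; creates (or overwrites) a global in your running Emacs:
                (setq result 42)      ; `result' is now visible everywhere

                ;; Wrapping the body in `let' keeps the binding local:
                (let ((result 42))
                  (* result 2))       ; => 84, and `result' doesn't escape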

          • foobarqux 3 days ago

            org-babel-eval-in-repl

  • joeevans1000 3 days ago

    Forgive my naiveté in these questions:

    Can one get the memory of context now available in the online/web ChatGPT?

    I find the list of conversations on the left of the web version is a good way to start new lines of thinking. How can I get that workflow going in gptel? Is there a better way to organize things than what the web version provides?

    Thanks to all who have made this!

    • karthink 3 days ago

      I think the web chat history is separate from API use, so you can't combine them. OpenAI claims not to retain a history of your API queries and responses.

      For organizing LLM chat logs in Emacs, there are many solutions. Here are a few:

      As a basic solution, chats are just text buffers/files, so you can simply store your conversations as files in a single directory. You can then browse them in dired etc. -- they are ripgrep-able and can be integrated into Org-roam or your choice of knowledge-management system.

      If you use Org mode, you can have branching conversations in gptel where each path through the document's outline tree is a separate conversation branch. This way you can explore tangential topics while retaining the lineage of the conversation that led to them, while excluding the other branches. This keeps the context window from blowing up and your API inference costs (if any) down.

      If you use Org mode, you can limit the scope of the conversation to the current heading by assigning a topic (gptel-set-topic). This way you can have multiple independent conversations in one file/buffer instead of one per buffer. (This works in tandem with the previous solution.)
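
      As a rough illustration of the latter (property name as I recall it from gptel's Org support; the values are arbitrary labels), a file with two independent conversations looks something like this:

          * Debugging a segfault
          :PROPERTIES:
          :GPTEL_TOPIC: segfault-debugging
          :END:
          ...conversation about the segfault...

          * Planning a blog post
          :PROPERTIES:
          :GPTEL_TOPIC: blog-post-planning
          :END:
          ...a completely separate conversation...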

      -----

      Tools tend to compose very well in Emacs. So there are probably many other solutions folks have come up with to organize their LLM chat history. For instance, any feature to handle collections of files or give you an outline/ToC view of your Markdown/Org documents should work well with the above -- and there are dozens of extensions like these.

      • kreyenborgi 3 days ago

        > If you use Org mode, you can have branching conversations in gptel where each path through the document's outline tree is a separate conversation branch. This way you can explore tangential topics while retaining the lineage of the conversation that led to them, while excluding the other branches. This keeps the context window from blowing up and your API inference costs (if any) down.

        Can you give an example of how this looks? I see it's mentioned in https://github.com/karthink/gptel/?tab=readme-ov-file#extra-... but I feel like I need an example. It sounds quite interesting and useful; I've often done this "manually" by saving to a new buffer when I go on a tangent.

        EDIT: Nevermind, C-h v gptel-org-branching-context gives:

            Use the lineage of the current heading as the context for gptel in Org buffers.
            
            This makes each same level heading a separate conversation
            branch.
            
            By default, gptel uses a linear context: all the text up to the
            cursor is sent to the LLM.  Enabling this option makes the
            context the hierarchical lineage of the current Org heading.  In
            this example:
            
            -----
            Top level text
            
            * Heading 1
            heading 1 text
            
            * Heading 2
            heading 2 text
            
            ** Heading 2.1
            heading 2.1 text
            ** Heading 2.2
            heading 2.2 text
            -----
            
            With the cursor at the end of the buffer, the text sent to the
            LLM will be limited to
            
            -----
            Top level text
            
            * Heading 2
            heading 2 text
            
            ** Heading 2.2
            heading 2.2 text
            -----
            
            This makes it feasible to have multiple conversation branches.
        
        Cool :-D

    • marci 3 days ago

      I suppose the simplest would be to split into two windows showing the same buffer with "C-x 3", and press "Shift+TAB" to show only the headlines. That is, if all the conversations are in the same file.

      If they are not in the same file, maybe use something simple like deft (https://github.com/jrblevin/deft) or something more complex like org-roam (https://github.com/org-roam/org-roam).

  • jfdi 3 days ago

    Anyone know of a similar option for vi/m? (Not neovim etc).

    Been searching and have found some but nothing stands out yet.

    • chungy 3 days ago

      You could use Emacs in evil-mode.

      • pxc 3 days ago

        Seconding this recommendation! I've never been a super advanced vim user, but Evil is the best and most complete vim emulator I've ever used, and I try vim emulators on every editor or IDE I ever run.

    • setopt 3 days ago

      It seems some of these are for Vim too, but I haven't tried them yet: https://github.com/jkitching/awesome-vim-llm-plugins

      Scanning the list quickly, dense-analysis/neural perhaps stands out, since it's written by the author of ALE, which is a very high-quality plugin.

      Another option, perhaps more the Unix way, is to run an LLM client in a terminal split (there are lots of CLI clients) and then use vim-slime to send code and text to that split.

      Personally, I'm still using ChatGPT in the browser and mobile app. I would love to try something else, but the OpenAI API key seems to cost extra, and something like llama probably takes time to set up right.

    • hedari 3 days ago

      I’ve been happy with vim-ai plugin: https://github.com/madox2/vim-ai