Total tangent, but what vagary of HTML (or the Brave Browser, which I'm using here) causes words to be split in very odd places? The "inspect" devtools certainly didn't show anything unusual to me. (Edit: Chrome, MS Edge, and Firefox do the same thing. I also notice they're all links; wonder if that has something to do with it.)
https://i.imgur.com/HGa0i3m.png
CSS on the <a> tags:

    word-break: break-all;

See the CSS word-break property.
The thesis here seems to be that delimiters provide important context for Claude, and for that purpose we should use XML.
The article even references English's built-in delimiter, the quotation mark, which is represented as a token for Claude, part of its training data.
So are we sure the lesson isn't simply to leverage delimiters, such as quotation marks, in prompts, period? The article doesn't identify any way in which XML is superior to quotation marks in scenarios requiring the type of disambiguation quotation marks provide.
Rather, the example XML tags shown seem to be serving as a shorthand for notating sections of the prompt ("treat this part of the prompt in this particular way"). That's useful, but seems to be addressing concerns that are separate from those contemplated by the author.
Except quotation marks look like regular text. I regularly use quotes in prompts for, ya know, quotes.
The GP isn't suggesting to literally use quotes as the delimiter when prompting LLMs. They're pointing out that we humans already use delimiters in our natural language (quotation marks to delimit quotes). They're suggesting that delimiters of any kind may be helpful in the context of LLM prompting, which to me makes intuitive sense. That Claude is using XML is merely a convention.
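To make the "delimiters of any kind" point concrete, here's a minimal sketch of a section-delimited prompt. The tag names (<task>, <context>, <output_format>) are illustrative conventions, not a schema the model requires:

```python
# Minimal sketch: XML-ish tags as prompt section delimiters.
# The tag names are arbitrary conventions, not a fixed vocabulary.
document = "Q3 revenue rose 12% while churn fell to 2.1%."

prompt = (
    "<task>\n"
    "Summarize the document below in one sentence.\n"
    "</task>\n"
    "<context>\n"
    f"{document}\n"
    "</context>\n"
    "<output_format>\n"
    "Plain prose, no preamble.\n"
    "</output_format>"
)

print(prompt)
```

The delimiters carry no magic on their own; they just make it unambiguous which span is an instruction and which span is data.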
That first image, “Structure Prompts with XML”, just screams AI-written. The bullet lists don’t line up, the numbering starts at (2), random bolding. Why would anyone trust hallucinated documentation for prompting? At least with AI-generated software documentation, the context is the code itself, being regurgitated into bulleted English. But for instructions on using the LLM itself, it seems pretty lazy not to hand-type the preferred usage and human-learned tips.
No, it’s two screenshots from Anthropic documentation, stitched together: https://platform.claude.com/docs/en/build-with-claude/prompt...
The post even links to that page, although there’s a typo in the link.
Author here: I have just fixed the typo. Thank you.
And yes, these are screenshots from Anthropic’s documentation.
They're not even stitched together; there's just no padding between the two images.
It looks like a screenshot from the Claude desktop app, so I don't think the author is trying to disguise the AI origin of the material.
There must be an OpenClaw YouTube video helping people post to hacker news, or something, because the front page is overrun with AI slop like this article, that makes no sense anyway. The author literally has no idea what any of this stuff means.
You just hallucinated the content is AI generated.
"This is AI" is the new "This is 'shopped, I can tell by the pixels."
I can tell by the em dashes
I think XML is good to know for prompting (similar to how <think></think> was popular for outputs, you can do that for other sections). But I have had much better experience just writing JSON and using line breaks, colons, etc. to demarcate sections.
E.g. instead of
Just doing something like: Use case: document processing/extraction (both with Haiku and OpenAI models); the latter example works much better than the XML. N-of-1 anecdote anyway, for one use case.
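The example snippets in this comment appear to have been lost in formatting; here is a hypothetical reconstruction of the contrast being described (the field names and document text are made up):

```python
# Hypothetical reconstruction of the two prompt styles; the originals
# were omitted from the comment. Invoice text and keys are invented.
xml_style = (
    "<instructions>Extract the invoice number and total.</instructions>\n"
    "<document>ACME Corp Invoice #4417, total due $312.50</document>"
)

flat_style = (
    "Instructions: Extract the invoice number and total.\n"
    "Document:\n"
    "ACME Corp Invoice #4417, total due $312.50\n"
    'Respond as JSON: {"invoice_number": "...", "total": "..."}'
)

print(flat_style)
```

Both rely on delimiters; the second just uses labels, line breaks, and a JSON response shape instead of paired tags.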
Could you clarify: do those tags need to be tags which already exist, which we then need to learn about and how to use? Or can we put whatever we want inside them, and just by virtue of being tags, Claude understands them in a special way?
They probably don’t need to be specific values. The model is fine tuned to see the tags as signals and then interprets them
All the major foundation models will understand them implicitly, so it was popular to use <think>, but you could also use <reason> or <thinkhard> and the model would still go through the same process.
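A sketch of that point: the tag name is just a convention you pick, and the same pattern works for prompting and for stripping the reasoning out of the response afterward. The "thinkhard" name and the model response below are illustrative:

```python
import re

# Any tag name can serve as a reasoning delimiter; "thinkhard" is
# illustrative, not a special token.
TAG = "thinkhard"

instruction = (
    f"First reason step by step inside <{TAG}>...</{TAG}>, "
    "then give the final answer after the closing tag."
)

# A hypothetical model response following that instruction:
response = f"<{TAG}>12 * 12 = 144, minus 2 is 142.</{TAG}>\nAnswer: 142"

# Strip the reasoning block so only the answer is shown to the user.
answer = re.sub(rf"<{TAG}>.*?</{TAG}>\s*", "", response, flags=re.DOTALL)
print(answer)  # Answer: 142
```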
Sounds like: 1. XML is the cleanest/best-quality training data (especially compared to PDF/HTML). 2. It follows that a user providing semantic tags in XML format gets the best training alignment (hence best results). Shame they haven't quantified this assertion here.
Makes sense
This isn’t surprising: XML’s core purpose was to simplify SGML for a wider breadth of applications on the web.
HTML also descended from SGML, and it’s hard to imagine a more deeply grooved structure in these models, given their training data.
So if you want to annotate text with semantics in a way models will understand…
XML and HTML are both SGML applications.
HTML diverged from SGML pretty early on. Various standards over the years have attempted to specify it as an application of SGML but in practice almost nobody properly conformed to those standards. HTML5 gave up the pretence entirely.
A very minor porcelain layer on the agent input UX could present this structure for you. Instead of a single chat window, have four: task, context, constraints, output format.
And while we're at it, instead of wall-of-text, I also feel like outputs could be structured at least into thinking and content, maybe other sections.
I think the main advantage of the XML here is that the model is expected to have a matching end tag that is balanced, which reduces the likelihood of malformed outputs.
I thought the goal was minimal instruction to let Claude determine the best way to solve the problem. Not adding this to my workflow anytime soon.
It is not for the end user, it is more for things like wrappers and automation scripts.
Nobody expects the end user to prompt the AI using a structured language like xml
Anthropic’s tool calling was exposed as XML tags at the beginning, before they introduced the JSON API. I expect they’re still templating those tool calls into XML before passing to the model’s context
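Since the comment is speculating, here is a speculative sketch of what templating a JSON tool call into XML-style tags might look like. The tag names (<invoke>, <tool_name>) and the per-parameter layout are guesses for illustration, not Anthropic's actual internal format:

```python
# Speculative sketch: rendering a JSON-shaped tool call as XML-style tags.
# Tag names and structure are invented for illustration.
def tool_call_to_xml(name: str, arguments: dict) -> str:
    params = "\n".join(f"<{k}>{v}</{k}>" for k, v in arguments.items())
    return (
        "<invoke>\n"
        f"<tool_name>{name}</tool_name>\n"
        f"{params}\n"
        "</invoke>"
    )

print(tool_call_to_xml("get_weather", {"city": "Berlin", "unit": "celsius"}))
```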
Yeah like I remember prior to reasoning models, their guidance was to use <think> tags to give models space for reasoning prior to an answer (incidentally, also the reason I didn't quite understand the fuss with reasoning models at first). It's always been XML with Anthropic.
Exactly the same story here. I still use a tool that just asks them to use <think> instead of enabling native reasoning support, which has worked well as far back as Sonnet 3.0 (their first model with 'native' reasoning support was Sonnet 3.7).
bemused by how competently designed this is, compared to enshittified blogs and whatnot
To be realistic, this design needs more weirdly sexual etsy garbage, “one weird tip,” and “punch the monkey”
This sounds like something for harnesses, not end users. Are they really expecting us to format prompts as XML??