One threshold is "can you write a program that might not complete?" You can't in SQL, which makes it less of a programming language than, say, FORTRAN.
If you look at the HTML 5 spec it is clear that it's intended to be a substrate for applications. The HTML 5 spec could be factored into a specification of the DOM, specification of an x-language API for the DOM and a specification for a serialization format as well as bindings of that x-language API to specific languages like Javascript.
I'm not sure we can call your parent comment pedantic. They're just being correct. Is it pedantic to say that fish is not a fruit? It's just correct to do so.
If anything, it is the act of stretching the definition of "programming language" so much that it includes HTML as a programming language that we should call pedantic.
It might be, I'm usually not, but this is all xhtml.club and this footnote are about, might as well be correct :-)
Constraint validation is still descriptive (what is allowed)
All details and summary are doing is conveying information on what's a summary and what's the complete story, and it has this hidden / shown behavior.
In any case, you will probably find something procedural / programming-like in HTML, but it's not the core idea of the language, and if you are explaining what HTML is to a newbie, I feel like you should focus on the essentials. Then we can discuss the corner cases among more experienced people.
In the end, all I'm saying is: you can just avoid issues and just say "HTML" without further qualifying it.
I would really like to use XHTML. It would make my HTML emitter much simpler (as I don't need special rules for elements that are self-closing, have special closing or escaping rules and whatever else) and more secure.
However no browsers have implemented streaming XHTML parsers. This means that the performance is notably worse for XHTML and if you rely on streaming responses (I currently do for a few pages like bulk imports) it won't work.
> no browsers have implemented streaming XHTML parsers
Dang, I hadn't considered this. That's something to add to the "simplest HTML omitting noisy tags like body and head vs going full XHTML" debate I have with myself.
One for XHTML: I like that the parser catches errors, it often prevents subtle issues.
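To make the "parser catches errors" point concrete, here is a minimal sketch in Python. It assumes the standard library's `xml.etree.ElementTree` as a stand-in for a browser's XML parser and `html.parser` as a stand-in for a forgiving HTML parser (it is a lenient tokenizer, not the WHATWG parsing algorithm). The helper names `xml_accepts` and `html_start_tags` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# An unclosed <br>: fine in HTML, a well-formedness error in XML.
MARKUP = "<p>first line<br>second line</p>"


def xml_accepts(markup: str) -> bool:
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup: str) -> list:
    """Feed the markup to the lenient parser and return the start tags seen."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


print("XML parser accepts it:", xml_accepts(MARKUP))
print("HTML parser saw tags:", html_start_tags(MARKUP))
```

The XML side refuses the document outright, while the HTML side carries on and reports both tags, which is roughly the trade-off being weighed here.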
I don’t think it’s about luddites, as the website suggests. Many professions have tools that signal extensive experience, and in web development, XHTML 1.0 and the older HTML standards play that role.
It is not “your HTML”, it’s HTML 4.01 from 1999, while XHTML 1.0 is from 2000. What they have in common is the origin of their validation, which comes from the W3C validator (1).
Current websites that are valid XHTML 1.0 Strict (or Transitional)
Back to the tirreno website: it is pure transitional HTML 4.01 without JS or CSS, so making it W3C valid (2) these days poses more or less the same challenges. Have a look.
In the early 2000s I was 100% sold on the idea of strict XHTML documents and the semantic web. I loved the idea that all web pages could be XML documents which easily provided their data for other sources. If you marked your document with an XHTML 1.0 Strict or XHTML 1.1 doctype, a web browser was supposed to show an error if the page contained an XML error. Problem was, it was a bit of a pain to get this right, so effectively no one cared about making compliant XHTML. It was a nice idea, but it didn't interact well with the real world.
Decades later, I'm still mildly annoyed when I see self-closing tags in HTML. When you're not trying to build a strict XML document, they're no longer required. Now I read them as a vestigial reminder of the strict XHTML dream.
EDIT2: Hey, look, if you still want to use self-closing tags where they're not required: go nuts! I'm just explaining why I don't use them anymore and don't think others should (unless you want to make strict XML documents).
As someone who has gotten into the idea of semantic Web long after XHTML was all the rage[0], I somewhat resent that semantic Web and XML are so often lumped together[1]. After all, XML is just one serialisation mechanism for linked data.
[0] I don’t dislike XHTML. The snob in me loves the idea. Sure, had XHTML been The Standard it would have been so much more difficult to publish my first website at the age of 14 that I’m not sure I would have gotten into building for Web at all, but is it necessarily a good thing if our field is based on technology so forgiving to malformed input that a middle school pupil can pass for an engineer? and while I do omit closing tags when allowed by the spec, are the savings worth remembering these complicated rules for when they can be omitted, and is it worth maintaining all this branching that allows parsers to handle invalid markup, when barely any HTML is hand-written these days?
[1] Usually it is to the detriment of the former: the latter tends to be ill-regarded by today’s average Web developer used to JSON (even as they hail various schema-related additions on top of JSON that essentially try to make it do things XML can, but worse).
That is a good point, if you consider XSD then that is an XML connection, it starts to become a bit complicated and I see why people start to dislike it. I forget about that because to me it’s just about the idea of a graph, which is otherwise quite elegant. Why not have a graph type-free with just string literals; much richer information about what kind of values go where can be provided through constraints, vocabularies, etc.
My favourite serialisation has got to be dumb triples (maybe quads). I don’t think writing graphs by hand is the future. However, when it comes to that, Turtle’s great.
You're annoyed when people are trying to keep the dream alive?
Since HTML5 specifies how to handle all parse errors, and the handling of an XML self-closing tag is to ignore it unless it's part of an unquoted attribute value, it's valid HTML5.
I'm not annoyed by it when people are trying to make XML compatible documents, but effectively no one is. Platforms like WordPress use self-closing image tags everywhere, but almost no one using WordPress cares about document validation. This ends up meaning that the `<img ... />` is just an empty gesture.
It’s ironic that the very site in question, despite claiming XHTML compliance, is served as text/html instead of application/xhtml+xml, so the browser will never parse it as XML.
To quote [0]:
> All those “Valid XHTML 1.0!” links on the web are really saying “Invalid HTML 4.01!”.
Although the article is 20 years old now, so these days it’s actually HTML5.
Edit: Checked the other member sites. Only two are served as application/xhtml+xml.
[0]: https://webkit.org/blog/68/understanding-html-xml-and-xhtml/
And this makes the XML prolog invalid, because it's invalid to have it in HTML.
Not having it is XHTML compliant though, so it could just be removed.
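For anyone wondering why the Content-Type matters more than the doctype, here is a rough Python sketch of the decision. It is a simplification and not any browser's actual code: the function name `parser_mode` and the short list of XML media types are assumptions for illustration (real browsers also treat other `+xml` types as XML).

```python
# Simplified assumption: only these media types select the XML parser.
XML_MEDIA_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parser_mode(content_type: str) -> str:
    """Return "xml" or "html" for a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return "xml" if media_type in XML_MEDIA_TYPES else "html"


for header in ("text/html; charset=utf-8", "application/xhtml+xml"):
    print(header, "->", parser_mode(header))
```

Note that the doctype never enters the picture: a page with an XHTML 1.0 doctype served as text/html still goes down the "html" branch.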
>>these days it’s actually HTML5.
There is no HTML5. It's just a buzzword. https://html.spec.whatwg.org/dev/introduction.html#is-this-h...?
That's a stretch. Your link says
> Is this HTML5?
> In short: Yes.
See also [1].
That HTML5 was used in marketing doesn't make the technical term disappear. HTML5 is a bit more precise than HTML, it refers to the living standard that's currently in use, as opposed to HTML 4.01 and the previous versions of HTML.
[1] https://en.wikipedia.org/wiki/HTML5
It's not a technical term. Nowhere in the current HTML standard will you find a versioning of HTML. That's why it's now called a "living standard". You will never find a HTML6 or higher. That note you found is to help with any confusion.
> You will never find a HTML6 or higher
You might be right, but we don't know yet. Microsoft said that for Windows 10.
You might also be right that the current Living Standard specification doesn't really call it HTML5, but you'll find many people writing HTML for a living say HTML5 to refer to it, and telling them that HTML5 doesn't exist doesn't really help and is a bit wrong too if you have a descriptive approach to languages.
I'm still hopeful.
The next version of html should be able to do all the http verbs -- get, put, patch, post, delete online, reactively without having to use a form.
There has to be a way to figure this out, even if it requires a transition period. The best time to plant a tree was twenty years ago, the second best time is now. These things belong in the core HTML standards, not a js library you need to include in your code.
Oh that and better controls and better defaults but I guess that is something individual web browsers can implement on their own?
> that is something individual web browsers can implement on their own?
Yes, they could, but you want a standard that makes them all implement stuff in a compatible way… :-)
Telling them HTML5 does exist does even more harm cause it doesn't exist. Telling them it does exist is entirely wrong and is even a false statement, is misleading and causes confusion.
Ok, I'll bite.
Assuming you are right and HTML5 doesn't exist. What would be the actual bad outcomes of the following?
- believing HTML5 exists
- silently choosing to understand what someone mentioning HTML5 obviously meant
I am right and I gave you the proof. Understanding what one means when mentioning HTML5 has nothing to do with technically understanding that there is no HTML5 standard.
Let's just say that I don't think the truths you are pushing are as absolute as you seem to think, and are a reflection of how you view the world more than anything.
And that by correcting people that mention HTML5, you will probably just annoy people without achieving anything worth it. That would be true even if you are absolutely correct.
It's peak "well, actually", with the twist it might not even actually be.
One of the annoying things about having a living standard is that it is difficult to implement a conforming version, as additional updates mean that you are no longer conforming.
Versioned standards allow you to know that you are compliant to that version of the specification, and track the changes between versions -- i.e. what additional functionality do I need to implement.
With "living standards" you need to track the date/commit you last checked and do a manual diff to work out what has changed.
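As a rough illustration of that manual diff, here is a Python sketch using the standard `difflib` module. It assumes you keep a plain-text copy of the spec sections you care about from the snapshot you last reviewed. The function name `spec_changes` and the sample texts are made up for the example.

```python
import difflib


def spec_changes(reviewed_text: str, current_text: str) -> list:
    """Return the added ("+ ") and removed ("- ") lines between two snapshots."""
    reviewed_lines = reviewed_text.splitlines()
    current_lines = current_text.splitlines()
    # ndiff prefixes every line with a two-character marker; keep only changes.
    return [
        line
        for line in difflib.ndiff(reviewed_lines, current_lines)
        if line.startswith(("+ ", "- "))
    ]


# Hypothetical snapshot excerpts, not real spec text.
reviewed = "The foo element has no special rules.\nThe bar element is void."
current = "The foo element has no special rules.\nThe bar element is void.\nThe baz element is new."

for change in spec_changes(reviewed, current):
    print(change)
```

This only tells you which lines changed. Working out what that means for conformance is still the manual part.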
I highly recommend everyone involved in web development to read at least a small proportion of the horrors that are the HTML parser specification. It will leave you yearning for the return of XHTML.
Or you could also read web proposals where the reason for avoiding the ideal implementation is complication of updating HTML parser rules.
Or attempt to use the web features that are already hindered by the HTML parser (custom element table rows).
> It will leave you yearning for the return of XHTML.
…or be grateful you can just use an existing HTML5 parser that hides all this stuff from your innocent eyes :-)
I was in college when XHTML was all the rage and everything we wrote had to pass validation. I still get uncomfortable adding breaks without closing them.
Younger but in the same boat. Nothing reasonable, but this just feels unmatched. It itches exactly like an (unclosed parenthesis
> Validation is ignored, and most modern sites are built with little concern for structure or longevity.
I remember going online with a modem in the 90s. There was a new ISP in town, but their homepage took forever to load. I viewed the source: whatever page generator they were using rendered the page as HTML tables (this was fine back then) and added repetitive style tags to every table cell, even empty ones, instead of using CSS (although I wonder if this was before CSS). That is why their homepage was so bloated and slow to load on dial-up.
I wonder how it is nowadays. But I suppose in the age that accommodates apps like Teams and Slack, who cares?
If only the repeated inline styles and abusively nested tables were the issue…
The dozens (or hundreds! have you tried GitHub recently??) of HTTP requests.
The JavaScript bundles whose sizes are expressed in 10⁶ bytes.
The UIs that are fully recomputed and redrawn on each small interaction.
The auto playing videos. The images that are comparable to full res pictures (but usually empty of meaning because they are stock or AI generated).
I used to create a number of simple web pages in XHTML back in the days when we believed XHTML was the future. Recently, while going through and restructuring some of my old "online stuff", I learned that XHTML really isn't in a state that I'd want to use it any more:
* XHTML 1.0 and 1.1 are officially deprecated by the W3C.
* XHTML5 exists as a variant of HTML5. However, it's very clear that it's absolutely not a priority for the HTML5 working groups, and there's a statement that future features will not necessarily be supported by the XHTML5 variant.
* XHTML5 does not have a DTD, so one of the main advantages of XHTML - that you can validate its correctness with pure XML functionality - isn't there.
* If you do a 'view source' in Firefox on a completely valid XHTML 1.0/1.1 page, it'll redline the XML declaration like it's something wrong. Not sure if this is intended or possibly even a bug, but it certainly gives me a 'browser tells me this is not supposed to be there' feeling.
It pretty much seems to me XHTML has been abandoned by the web community. My personal conclusion has been that whenever I touch any of my old online things still written in XHTML, I'll convert them to HTML5.
> If you do a 'view source' in Firefox on a completely valid XHTML 1.0/1.1 page, it'll redline the XML declaration like it's something wrong
Is the page actually being served as "application/xhtml+xml"? Most xhtml sites aren't, in which case the browser is indeed interpreting those as invalid declarations in a regular old html document
If it’s served as XML, then view-source instead highlights the doctype line as an error (“Stray doctype.”).
I can confirm, I'm seeing this on my XHTML pages that are served as application/xhtml+xml. That's a shame.
> it's very clear that it's absolutely not a priority for the HTML5 working groups
I wouldn't mind as long as it keeps working, but…
> and there's a statement that future features will not necessarily be supported by the XHTML5 variant.
That's news for me, and unfortunate.
XHTML survives in ePub. Recently there was a survey to gather industry feedback for a potential addition of an HTML flavour of ePub to be added to the next version of the spec, but it soon became fairly clear that people saw a lot of value in remaining XHTML-only: https://www.w3.org/blog/2026/epub-and-html-survey-results-an...
Circa '99 a high fraction (50%-ish) of HTML in the field was invalid, so if you were making a new web browser it had to parse invalid HTML the same way as Netscape which was one more reason we didn't get competitive web browsers.
HTML 5 specified exactly how "invalid" HTML is parsed so now there is no such thing as invalid HTML. XHTML was one of those things that never quite worked:
https://friendlybit.com/html/why-xhtml-is-a-bad-idea/
> there is no such thing as invalid HTML
There is. There are things that are still considered invalid, like nesting form elements for instance.
(this doesn't take away your argument though, and you were focusing on the parsing aspect).
The things that are invalid should all have defined behaviour. For example, a <label> is not allowed to contain two form controls, but is defined as applying to the first such control.
As far as parse errors are concerned, https://html.spec.whatwg.org/multipage/parsing.html#parse-er... says:
> This specification defines the parsing rules for HTML documents, whether they are syntactically correct or not. Certain points in the parsing algorithm are said to be parse errors. The error handling for parse errors is well-defined (that's the processing rules described throughout this specification), but user agents, while parsing an HTML document, may abort the parser at the first parse error that they encounter for which they do not wish to apply the rules described in this specification.
> The things that are invalid should all have defined behaviour
100% agree.
And then I guess the philosophical question is "What's invalid when everything is defined?"
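The <label> case mentioned above is a nice example of "invalid but defined". Here is a rough Python sketch of the rule for a label without a `for` attribute: the first labelable descendant wins. It leans on the standard library's `html.parser`, which is not the WHATWG tree builder, and the names `LabeledControlFinder` and `labeled_control_id` are made up for illustration.

```python
from html.parser import HTMLParser

# Labelable elements, simplified: form-associated custom elements are left out.
LABELABLE = {"button", "input", "meter", "output", "progress", "select", "textarea"}


class LabeledControlFinder(HTMLParser):
    """Finds the first labelable element that appears inside a <label>."""

    def __init__(self):
        super().__init__()
        self.label_depth = 0  # how many <label> elements are currently open
        self.found = False
        self.control_id = None

    def handle_starttag(self, tag, attrs):
        if tag == "label":
            self.label_depth += 1
            return
        if self.label_depth == 0 or self.found or tag not in LABELABLE:
            return
        attributes = dict(attrs)
        # <input type="hidden"> is not labelable, so skip it.
        if tag == "input" and (attributes.get("type") or "").lower() == "hidden":
            return
        self.found = True
        self.control_id = attributes.get("id")

    def handle_endtag(self, tag):
        if tag == "label" and self.label_depth > 0:
            self.label_depth -= 1


def labeled_control_id(markup: str):
    """Return the id of the control a for-less <label> would apply to, if any."""
    finder = LabeledControlFinder()
    finder.feed(markup)
    finder.close()
    return finder.control_id


# Two controls in one label is a conformance error, yet the outcome is defined.
print(labeled_control_id('<label>Name <input id="first"> <input id="second"></label>'))
```

A validator would still flag the second control. The point is only that every browser ends up with the same answer.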
In the linked article:
> you should master the HTML programming¹ language
The footnote reads:
> 1. This is a common debate - but for simplicity sake I'm just calling it this.
It's not really a debate, HTML is a markup language [1], not a programming language: you annotate a document with its structure and its formatting. You are not really programming when you write HTML (the markup is not procedural) (and this is not gatekeeping, there's nothing wrong about this and doesn't make HTML a lesser language).
To avoid the issue completely, you can phrase this as: "you should master HTML" and remove the footnote. Simple, clean, concise, clear. By the way, ML already means "Markup Language", so any "HTML .* language" phrasing can feel a bit off.
[1] https://en.wikipedia.org/wiki/Markup_language
I think that it is a debate, and it depends on the role of HTML in your system.
If all you're doing is using HTML to "annotate a document with its structure and its formatting", then yes, I'll accept that it's not quite programming, but I've not seen this approach of starting with a plain non-html document and marking it up by hand done in probably over two decades. I do still occasionally see it done for marking up blog posts or documentation into markdown and then generating html from it, but even that's a minuscule part of what HTML is used for these days.
Your mileage may vary, but what I and people around me typically do is work on hundreds/thousands of loosely coupled small snippets of HTML used within e.g. React JSX, or Django/Jinja templates or htmx endpoints, in order to dynamically control data and state in a large program. In this sense, while the html itself doesn't have control flow, it is an integral part of control flow in the larger system, and it's extremely likely that I'll break something in the functionality if I carelessly change an element's type or attribute value. In this sense, I'm not putting on a different hat when I'm working on the html, but just working on a different part of the program.
> React JSX, or Django/Jinja templates
Those are not HTML. PHP neither, even when used as a templating language for HTML.
> htmx endpoints
Not really familiar with htmx, but I would say this is HTML augmented with some additional mechanisms. I don't know how I would describe this augmented HTML, but I'm not applying my "not programming" statement to htmx (I probably could, but I haven't given it enough thought to do so).
> In this sense, I'm not putting on a different hat when I'm working on the html, but just working on a different part of the program.
I agree with this actually. I wouldn't consider that writing HTML (or CSS) is really a separate activity when I'm building some web app.
> In this sense, while the html itself doesn't have control flow, it is an integral part of control flow in the larger system
That's correct but I don't see what it has got to do with the question of whether HTML is a programming language or not.
Strings do not have control flow but strings are integral part of larger programs that have control flow. So what? That doesn't make strings any closer to being programming languages.
What happens if I simply add an iterator mechanism to HTML (well, I guess we need variables too)? Is it no longer a markup language here (I won't add anything else):
<for i=0; i<1; i++> <html> </html> </for>
Better question, why don't we upgrade XML to do that?
> Better question, why don't we upgrade XML to do that?
XSLT which is an application of XML allows you to do a for-each: https://developer.mozilla.org/en-US/docs/Web/XML/XSLT/Refere...
That's basically the design of PHP with different syntax. <?for($i=0;$i<1;$i++){?> <html></html> <?}?>
Nobody uses PHP this way any more though — people treat it like Python or Node and write the entire codebase inside a big <? block
JSP is similar with different syntax again — nobody uses JSP either
I think ASP too but I never used that
You could have some client side JavaScript handle your for nodes as well. That's how I imagined what OP described actually.
> Nobody uses PHP this way any more though
Well… I have bad news.
I do, for one :-)
I ask you then: (1) how do you deal with the template that surrounds a large number of pages on a site? (2) how do you deal with the fact that the average web form might want to display something different based on the form contents (e.g. redraw the form if there's an error, draw something different on success?) (3) do you write anything that returns JSON or other results for AJAX or web services?
That's not technically HTML anymore.
But if you disagree with this, or somehow work around this statement by replacing your for element with some "for-loop" custom element (it is valid HTML to add custom tags with dashes in their names), my stronger argument is at https://news.ycombinator.com/item?id=46743219#46743554
I dunno, you're being pedantic :) Yes yes, the name clearly ends with "Markup Language" so yeah, with a very strict definition of programming languages, HTML is not one of them.
But if we use a broader definition, basically "a formal language that specifies behavior a machine must execute", then HTML is indeed a programming language.
HTML is not only about annotating documents or formatting, it can do things you expect from a "normal" programming language too, for example, you can do constraint validation:
That's not annotating, not just a "document", and not just formatting. Another example is using <details> + <summary> and you have users mutating state that reveals different branches in the page, all just using HTML and nothing else. In the end, I agree with you, HTML ultimately is a markup language, but it's deceiving, because it does more than just markup.
One threshold is "can you write a program that might not complete?" You can't in SQL, which makes it less of a programming language than, say, FORTRAN.
If you look at the HTML 5 spec it is clear that it's intended to be a substrate for applications. The HTML 5 spec could be factored into a specification of the DOM, specification of an x-language API for the DOM and a specification for a serialization format as well as bindings of that x-language API to specific languages like Javascript.
I'm not sure we can call your parent comment pedantic. They're just being correct. Is it pedantic to say that fish is not a fruit? It's just correct to do so.
If anything, it is the act of stretching the definition of "programming language" so much that it includes HTML as a programming language that we should call pedantic.
> I dunno, you're being pedantic :)
It might be, I'm usually not, but this is what xhtml.club and this footnote are all about, so we might as well be correct :-)
Constraint validation is still descriptive (what is allowed)
All details and summary are doing is conveying information on what's a summary and what's the complete story, and it has this hidden / shown behavior.
In any case, you will probably find something procedural / programming-like in HTML, but it's not the core idea of the language, and if you are explaining what HTML is to a newbie, I feel like you should focus on the essentials. Then we can discuss the corner cases among more experienced people.
In the end, all I'm saying is: you can just avoid issues and just say "HTML" without further qualifying it.
I would really like to use XHTML. It would make my HTML emitter much simpler (as I don't need special rules for elements that are self-closing, have special closing or escaping rules and whatever else) and more secure.
However no browsers have implemented streaming XHTML parsers. This means that the performance is notably worse for XHTML and if you rely on streaming responses (I currently do for a few pages like bulk imports) it won't work.
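To make the emitter point concrete, here is a minimal sketch in Python. It is not anyone's actual emitter: `emit_xml`, `emit_html` and the abbreviated element sets are hypothetical names made up for illustration, and attributes are left out entirely. It only shows that XML syntax gets by with one rule, while HTML syntax needs extra branches for void elements and raw-text elements.

```python
from html import escape

# Assumed, abbreviated lists for illustration; the HTML standard defines the full sets.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}
RAW_TEXT_ELEMENTS = {"script", "style"}


def emit_xml(tag, text=""):
    """XML syntax: the same rule applies to every element."""
    if not text:
        return f"<{tag}/>"
    return f"<{tag}>{escape(text, quote=False)}</{tag}>"


def emit_html(tag, text=""):
    """HTML syntax: void and raw-text elements need their own branches."""
    if tag in VOID_ELEMENTS:
        return f"<{tag}>"  # no end tag; any text passed in is dropped
    if tag in RAW_TEXT_ELEMENTS:
        # Content is not entity-escaped; a real emitter would also have to
        # guard against the text containing the element's own end tag.
        return f"<{tag}>{text}</{tag}>"
    return f"<{tag}>{escape(text, quote=False)}</{tag}>"
```

For example, `emit_xml("br")` gives `<br/>` while `emit_html("br")` gives `<br>`, and the text `a < b` is escaped inside `script` only by the XML variant.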
> no browsers have implemented streaming XHTML parsers
Dang, I hadn't considered this. That's something to add to the "simplest HTML omitting noisy tags like body and head vs going full XHTML" debate I have with myself.
One for XHTML: I like that the parser catches errors, it often prevents subtle issues.
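A rough illustration of that strictness difference, using Python's standard library as a stand-in for a browser. This is only a sketch under that assumption: `html.parser` is a tolerant tokenizer rather than a browser's HTML5 parser, and `xml_accepts`, `html_tags` and `TagCollector` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

BROKEN = "<p>Unclosed <b>bold</p>"          # <b> is never closed
WELL_FORMED = "<p>Closed <b>bold</b></p>"


def xml_accepts(markup):
    """Return True if an XML parser accepts the markup, False on a parse error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Records every start tag the tolerant HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(markup):
    """Feed markup to the tolerant parser and return the start tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

Here `xml_accepts(BROKEN)` is false because of the mismatched tag, while `html_tags(BROKEN)` carries on and reports both `p` and `b` without complaint.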
Does valid pure HTML 4.01 (1) made in 2025 count?
I don’t think it’s about luddites, as the website mentions. Many professions have tools that signal extensive experience, and in web development XHTML 1.0 and the older HTML standards are such tools.
1. https://www.tirreno.com
It does not? HTML 4.01 is not XML. So not XHTML. What's the confusion?
Both technologies are from the same period and share the same validation culture from the W3C.
> Both technologies are from the same period
Not really, XHTML is as current as HTML 5.
XHTML 1.0 is older and is indeed (more or less?) the XML variant of HTML 4.01.
How so? HTML 4.01 is from 1999, XHTML 1.0 from 2000.
XHTML club mentioned valid XHTML 1.0 Strict (or Transitional), not general XHTML.
The XML part of XHTML is an important feature which HTML 4.01 doesn't have though.
Writing valid HTML should be a bare minimum (I know it isn't!).
It is not “your HTML”, it’s HTML 4.01 from 1999, while XHTML 1.0 is from 2000. What they have in common is the origin of their validation, which comes from the W3C validator (1).
Same badges, same limits.
1. https://validator.w3.org/
Sorry, I edited my reply in the meantime and I probably broke your citation.
But what you are describing is XHTML 1.0, not XHTML in general.
HTML5 has its XHTML variant too, sometimes called XHTML 5.
Valid XHTML 1.0 Strict (or Transitional) is the requirement of the XHTML club, hence my comparison with HTML 4.01.
> Valid XHTML 1.0 Strict (or Transitional) is requirements of XHTML club
Where do you see this?
I see that they do use XHTML 1.0 Strict but I don't see this requirement written.
Brad, we need your clarification here, it's critical, we need you to tell us which one of us is wrong! :-)
Thank you for asking.
XHTML Members(1):
Current websites that are valid XHTML 1.0 Strict (or Transitional)
Back to the tirreno website: it is pure transitional HTML 4.01 without JS or CSS, so it faces more or less the same challenges to make it W3C valid (2) these days. Have a look.
1. https://xhtml.club/members.html
2. https://validator.w3.org/check?uri=https://www.tirreno.com/&...
Damn, it seems you are right!
Still not convinced by your proposal to extend the XHTML club to include valid HTML 4.01, not that I care much anyway :-)
In the early 2000s I was 100% sold on the idea of strict XHTML documents and the semantic web. I loved the idea that all web pages could be XML documents which easily provided their data for other sources. If you marked your document with an XHTML 1.0 Strict or XHTML 1.1 doctype, a web browser was supposed to show an error if the page contained an XML error. Problem was, it was a bit of a pain to get this right, so effectively no one cared about making compliant XHTML. It was a nice idea, but it didn't interact well with the real world.
Decades later, I'm still mildly annoyed when I see self-closing tags in HTML. When you're not trying to build a strict XML document, they're no longer required. Now I read them as a vestigial reminder of the strict XHTML dream.
EDIT: I just checked, and my site (at least the index page) still validates! https://validator.nu/?showsource=yes&doc=https%3A%2F%2Fander...
EDIT2: Hey, look, if you still want to use self-closing tags where they're not required: go nuts! I'm just explaining why I don't use them anymore and don't think others should (unless you want to make strict XML documents).
As someone who has gotten into the idea of semantic Web long after XHTML was all the rage[0], I somewhat resent that semantic Web and XML are so often lumped together[1]. After all, XML is just one serialisation mechanism for linked data.
[0] I don’t dislike XHTML. The snob in me loves the idea. Sure, had XHTML been The Standard it would have been so much more difficult to publish my first website at the age of 14 that I’m not sure I would have gotten into building for Web at all, but is it necessarily a good thing if our field is based on technology so forgiving to malformed input that a middle school pupil can pass for an engineer? And while I do omit closing tags when allowed by the spec, are the savings worth remembering these complicated rules for when they can be omitted, and is it worth maintaining all this branching that allows parsers to handle invalid markup, when barely any HTML is hand-written these days?
[1] Usually it is to the detriment of the former: the latter tends to be ill-regarded by today’s average Web developer used to JSON (even as they hail various schema-related additions on top of JSON that essentially try to make it do things XML can, but worse).
The semantic web took on the XSD data types
https://www.w3.org/TR/xmlschema-2/
even though a lot of tools and standards (I'm looking at you SPARQL) don't really support them. My favorite serialization for RDF is Turtle:
https://en.wikipedia.org/wiki/Turtle_(syntax)
That is a good point, if you consider XSD then that is an XML connection, it starts to become a bit complicated and I see why people start to dislike it. I forget about that because to me it’s just about the idea of a graph, which is otherwise quite elegant. Why not have a type-free graph with just string literals; much richer information about what kind of values go where can be provided through constraints, vocabularies, etc.
My favourite serialisation has got to be dumb triples (maybe quads). I don’t think writing graphs by hand is the future. However, when it comes to that, Turtle’s great.
> I'm still mildly annoyed when I see self-closing tags in HTML
Why? That's (mildly) bad for your health.
You're annoyed when people are trying to keep the dream alive?
Since HTML5 specifies how to handle all parse errors, and the handling of an XML self-closing tag is to ignore it unless it's part of an unquoted attribute value, it's valid HTML5.
I'm not annoyed by it when people are trying to make XML compatible documents, but effectively no one is. Platforms like WordPress use self-closing image tags everywhere, but almost no one using WordPress cares about document validation. This ends up meaning that the `<img ... />` is just an empty gesture.
I knew this HN submission would eat my Saturday afternoon and replace any other procrastination activity. Thanks, I hate it.