XHTML Club

(xhtml.club)

71 points | by bradley_taunt 2 months ago

130 comments

nathell 2 months ago

It’s ironic that the very site in question, despite claiming XHTML compliance, is served as text/html instead of application/xhtml+xml, so the browser will never parse it as XML.

To quote [0]:

> All those “Valid XHTML 1.0!” links on the web are really saying “Invalid HTML 4.01!”.

Although the article is 20 years old now, so these days it’s actually HTML5.

Edit: Checked the other member sites. Only two are served as application/xhtml+xml.

[0]: https://webkit.org/blog/68/understanding-html-xml-and-xhtml/
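
To make the point about media types concrete, here is a minimal Python sketch of how the `Content-Type` header, not the doctype or the markup itself, decides which parser a page gets. The function name `parser_for` is hypothetical, and the logic is a simplification that ignores content sniffing and other real-browser details.

```python
def parser_for(content_type: str) -> str:
    """Return 'html', 'xml', or 'other' for a Content-Type header value.

    Simplified model: real browsers apply more rules than this.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "text/html":
        return "html"  # tag-soup HTML parser, XML prolog is not understood
    if mime in {"application/xml", "text/xml"} or mime.endswith("+xml"):
        return "xml"   # strict XML parser, well-formedness errors are fatal
    return "other"


print(parser_for("text/html; charset=utf-8"))
print(parser_for("application/xhtml+xml"))
```

Under this model, a page that claims XHTML compliance but is sent as `text/html` is still handled by the HTML parser.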

jraph 2 months ago

And this makes the XML prolog invalid, because it's invalid to have it in HTML.

Not having it is XHTML compliant though, so it could just be removed.

assimpleaspossi 2 months ago

>>these days it’s actually HTML5.

There is no HTML5. It's just a buzzword. https://html.spec.whatwg.org/dev/introduction.html#is-this-h...?

jraph 2 months ago

That's a stretch. Your link says

> Is this HTML5?

> In short: Yes.

See also [1].

That HTML5 was used in marketing doesn't make the technical term disappear. HTML5 is a bit more precise than HTML, it refers to the living standard that's currently in use, as opposed to HTML 4.01 and the previous versions of HTML.

[1] https://en.wikipedia.org/wiki/HTML5

assimpleaspossi 2 months ago

It's not a technical term. Nowhere in the current HTML standard will you find a versioning of HTML. That's why it's now called a "living standard". You will never find a HTML6 or higher. That note you found is to help with any confusion.

jraph 2 months ago

> You will never find a HTML6 or higher

You might be right, but we don't know yet. Microsoft said that for Windows 10.

You might also be right that the current Living Standard specification doesn't really call it HTML5, but you'll find many people writing HTML for a living say HTML5 to refer to it, and telling them that HTML5 doesn't exist doesn't really help and is a bit wrong too if you have a descriptive approach to languages.

mcny 2 months ago

I'm still hopeful.

The next version of html should be able to do all the http verbs -- get, put, patch, post, delete online, reactively without having to use a form.

There has to be a way to figure this out, even if it requires a transition period. The best time to plant a tree was twenty years ago, the second best time is now. These things belong in the core HTML standards, not a js library you need to include in your code.

Oh that and better controls and better defaults but I guess that is something individual web browsers can implement on their own?

jraph 2 months ago

> that is something individual web browsers can implement on their own?

Yes, they could, but you want a standard that makes them all implement stuff in a compatible way… :-)

7bit 2 months ago

Microsoft never said that, that's a myth/common misconception

jraph 2 months ago

Okay, at least I haven't dreamed it: [1]

> Although Microsoft claimed Windows 10 would be the last Windows version, eventually a new major release, Windows 11, was announced in 2021.

Where does the misconception come from? Do you know where I could read about it?

edit: it seems you are right, a dev said Windows 10 was the "last version of Windows" which was true but was interpreted as being an absolute statement when he really probably meant "at this time".

Thanks for correcting me!

[1] https://en.wikipedia.org/wiki/Microsoft_Windows_version_hist...

7bit 2 months ago

Answering after your edit:

Yes, Jerry Nixon claimed something like that (he's not just a dev though). But Microsoft never confirmed that, so it's just a statement by one person.

The Wikipedia quote is problematic, because it doesn't reference any sources for its claim. Whoever the author of that paragraph is, it's bad journalistic practice not to provide any sources for that claim.

jraph 2 months ago

Yeah, we can find quotes from various articles on the web and I did between writing my comment and my edit (I was on the phone, I didn't bother citing them here).

Wikipedia articles should source everything indeed, it's not that it's bad practice, it's against the idea of Wikipedia not to.

assimpleaspossi 2 months ago

Telling them HTML5 does exist does even more harm cause it doesn't exist. Telling them it does exist is entirely wrong and is even a false statement, is misleading and causes confusion.

jraph 2 months ago

Ok, I'll bite.

Assuming you are right and HTML5 doesn't exist. What would be the actual bad outcomes of the following?

- believing HTML5 exists

- silently choosing to understand what someone mentioning HTML5 obviously meant

assimpleaspossi 2 months ago

I am right and I gave you the proof. Understanding what one means when mentioning HTML5 has nothing to do with technically understanding that there is no HTML5 standard.

jraph 2 months ago

Let's just say that I don't think the truths you are pushing are as absolute as you seem to think, and I think they are a reflection of how you view the world more than anything.

And that by correcting people that mention HTML5, you will probably just annoy people without achieving anything worth it. That would be true even if you are absolutely correct.

It's peak "well, actually", with the twist it might not even actually be.

That's not the truth, just my opinion, and I appreciate that you might not agree.

Note that OP didn't mention "The HTML5 Standard", they mentioned "HTML5".

assimpleaspossi 2 months ago

I would rather be correct and annoy people than be wrong. It's fascinating to me today to see so many people allow "good enough" over correctness. It's a disaster waiting to happen.

For example, people get annoyed when I tell them not to put closing slashes on void HTML elements. They reply that it doesn't matter because it's in the standard that it's allowed so it's perfect HTML. What they don't bother to understand, despite my pointing to online documentation, is that placing closing slashes on some elements can cause harm and that no HTML standard tells you to put one there or has ever required it. Yet they argue with me anyway. Much like you argue with me about this. And that's when I stop.

jraph 2 months ago

At this point, closing slashes for void elements is coding style, exactly like white space we use for indentation. You can't be right because this is in opinion territory. Exactly like whether one should put semicolons or not in JavaScript when you have automatic semicolon insertion. Some people have strong opinions on the matter. Putting them has drawbacks, and not putting them too, and in both cases, readability and clarity, which is subjective, is a factor.

You are right that it has drawbacks and that it can bite. OTOH, people using closing slashes usually also quote all their attributes and will virtually never be bitten by this.

But people have backgrounds and habits, there's culture around a language like HTML, and these backgrounds and cultures have been shaped by XHTML.

Whether to put or not to put the slash is a healthy conversation to have and there are valid points for both, but if you are arguing like you are doing here for HTML5, considering "they don't bother to understand", you'll lose your arguments and people will find you annoying.

Some people feel bad about not closing br with a slash because it kinda feels like unmatched parentheses, or old malformed HTML from the 90's. That's not reasonable, but for better or worse, you can't just ignore this.

Some people sometimes write XML, and when they switch to HTML, their XML habits are there, and following habits especially when they are mostly harmless is efficient.

Some people write polyglot (X)HTML for some reason, and there the slash is needed.

There are reasons to put the slash, like there are reasons not to write it, and you can't just impose your truths like this.

OptionOfT 2 months ago

Can you share some of the links you'd share?

I'm someone who still lives in the XHTML world and pedantically close all of my elements. Seems like I need a knowledge refresher.

(and by the way, I could Google this, or ask a chatbot, but I want to hear from your experience).

jraph 2 months ago

Reasons for not putting the slash in HTML:

- since the slash doesn't have any meaning in HTML, if you don't quote your attributes, you are at risk that your slash is attached to the value of your unquoted attribute: ← uh oh, class = "myclass/"!

You can test this by visiting the following URL, and inspecting the content: data:text/html,

Now guess what happens to the unquoted src attribute of the img tag followed by an unspaced stray slash… OTOH, you don't need to not quote the src attribute…

- it can give a false sense of correctness, one can reasonably consider that the closing effect of the slash is pure illusion and even potentially confusing.

For backward compatibility, a stray slash at the end of the start tag is ignored, not considered as an attribute that doesn't have a value, so there's argument to be made that it's still part of the syntax. You'll never have any issue if you always put a space before the slash (which most people who put the slashes do because of a silly bug in a browser that has not been relevant for a long time), or if you quote all your attributes.

I don't understand why they haven't decided to make the HTML5 parser parse like but I guess it is what it is.
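
The unquoted-attribute pitfall described above can be reproduced outside a browser. This is a rough sketch using Python's standard `html.parser`, which is a tolerant tokenizer rather than a full HTML5 parser, so treat it as an illustration of the same rule and not as browser behaviour. The class `AttrCollector` and the helper `attrs_of` are hypothetical names, and the `br` element is just an example.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Record every start tag together with its parsed attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))

    def handle_startendtag(self, tag, attrs):
        # Called for tags written with a trailing "/>".
        self.seen.append((tag, attrs))


def attrs_of(markup):
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


# Unquoted value directly followed by the slash: the slash joins the value.
print(attrs_of("<br class=myclass/>"))
# Quoting the value, or putting a space before the slash, avoids the problem.
print(attrs_of('<br class="myclass"/>'))
print(attrs_of("<br class=myclass />"))
```

The first call should report the class as `myclass/`, while the other two report `myclass`.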

assimpleaspossi 2 months ago

https://github.com/validator/validator/wiki/Markup-%C2%BB-Vo...

https://www.matsimon.dev/blog/to-close-or-not-to-close

https://github.com/w3c/html/issues/737

There's more.

cxr 2 months ago

Your argument is bad, and you should feel, if not bad, then at least very silly. There is an HTML5 standard.

It was developed by browser makers with input from the community, published by WHATWG, and begrudgingly accepted by W3C in 2014. That's a fact. The HTML5 Recommendation exists.

That those people went on to continue to develop the standards further, as standards bodies are wont to do, and that they call their current work the "Living Standard" doesn't erase that fact, any more than the W3C's publication of the third edition of the PNG standard last summer means that earlier editions "don't exist".

assimpleaspossi 2 months ago

Please point to any current edition of the HTML standard that is titled HTML5 published by WHATWG or the W3C. You can't. It's impossible. You can only point to past, out-of-date, no longer maintained publications. We're talking current standards. Not old ones.

cxr 2 months ago

This is either the dumbest thing I've heard all day, or the most dishonest thing. It's not even a good attempt at sleight of hand.

> Please point to any current edition of the HTML standard that is titled HTML5 published by WHATWG or the W3C. You can't. It's impossible.

No shit.

It's impossible because the current edition is very obviously not HTML5. Nor is it HTML 4.01. Or 2.0. It's the WHATWG's "Living Standard" that you very well know exists and have referenced by name in this thread.

If you want to make an argument for the non-existence of HTML6, then fine; you're making a sound, totally defensible argument that no such thing exists. (A strawman, because nobody here—besides you—actually mentioned HTML6, but a verifiably true fact nonetheless.)

But it makes for a totally asinine argument for the claim that "There is no HTML5" and that it "doesn't exist". You'll take the W3C's stamp of approval? Great, it's right there—available for review now just as it was an hour ago, or at any other time after October 2014. This is an incontrovertible fact. Feel free to actually engage with this or any of the other facts you have been confronted with, rather than setting unsatisfiable goals like asking for the "current edition" that is "titled HTML5".

assimpleaspossi 2 months ago

>>It's impossible because the current edition is very obviously not HTML5.

I find it interesting to read that you are agreeing with my entire point while insulting me and arguing that I am wrong.

cxr 2 months ago

<https://en.wikipedia.org/wiki/Moving_the_goalposts>

----

> There is no HTML5. It's just a buzzword.

> [HTML5]'s not a technical term.

> Telling them HTML5 does exist does even more harm cause it doesn't exist. Telling them it does exist is entirely wrong and is even a false statement

> there is no HTML5 standard

Source: Literally all you, here, in this thread.

If you want to switch gears now and try to rewrite the record and say that, actually, what you're really saying and have said all along is that HTML5 is no longer the latest Recommendation, go jump off a bridge.

assimpleaspossi 2 months ago

You took things out of context. You aren't following along to what I replied to. The person I replied to said, "...so these days it's actually HTML5." And, in response to "these days", I said there is no HTML5. Which is true and you agree with.

So did the other guy. Thanks to you all for your support for web standards.

cxr 2 months ago

Basic truth: if you can't manage to accurately summarize your counterparty's position in a statement that ends with "you agree with <x>" and have that person agree that that's their position rather than feeling compelled to call you out as an intellectually dishonest sack of shit, then they don't actually agree with you, it's more than likely to be an accurate charge against you, and you should knock it off immediately.

> I said there is no HTML5. Which is true and you agree with.

No, I don't.

assimpleaspossi 2 months ago

Well, I can judge the quality of person you are by your comments and see you aren't worth talking to so I'll leave you in your misery.

throwaway150 2 months ago

Don't move the goalposts and take this as an opportunity to learn from the feedback you are receiving from several people here. Perhaps learn to be more accurate in what you say and if you fail to be accurate (which happens to everyone, we are all humans), admit it gracefully, and move on.

Your original claim was:

> There is no HTML5.

Clearly false because it exists: https://html.spec.whatwg.org/multipage/introduction.html#is-...

Then you move the goalpost.

> Please point to any current edition of the HTML standard that is titled HTML5 published by WHATWG or the W3C.

But who said anything about "current edition"? Only you did. The fact that the current edition is not HTML 5 does not mean that the HTML 5 standard has stopped existing!

assimpleaspossi 2 months ago

The poster I replied to did. You're like the other guy who jumped into a thread without following the context. Technical people know better than to do that. But I'm not here to teach people how to follow a thread. Please don't reply. I'm done here

throwaway150 2 months ago

> You're like the other guy who jumped into a thread without following the context. Technical people know better than to do that. But I'm not here to teach people how to follow a thread.

Can you stop being so antagonistic already? I have been following this whole thread since afternoon. We both began commenting on this post at about the same time. I regret to say that I have wasted my whole afternoon and evening on this thread. So regrettably I have been following the context very closely actually.

Most of your subsequent comments make sense but they also keep moving the goalpost which is frustrating. I mean it is easy to be correct if you constantly keep moving the goalpost. But we must go back to where this nuisance began. It began at https://news.ycombinator.com/item?id=46743683 when you said:

> There is no HTML5.

Do you admit that your original claim "There is no HTML5." was incorrect? If you don't admit that, do you also think the HTML 4.01 standard has stopped existing? What about C89? Has that also stopped existing? Just because there are newer standards and living standards, it doesn't mean that the old standards have stopped existing.

rhdunn 2 months ago

One of the annoying things about having a living standard is that it is difficult to implement a conforming version as additional updates means that you are no longer conforming.

Versioned standards allow you to know that you are compliant to that version of the specification, and track the changes between versions -- i.e. what additional functionality do I need to implement.

With "living standards" you need to track the date/commit you last checked and do a manual diff to work out what has changed.
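
As a rough illustration of that manual diff, here is a minimal Python sketch that compares two saved text snapshots of a spec section and lists only the added and removed lines. The function name `changed_lines` and the sample snapshot text are made up for the example; nothing here reflects real spec content or any official tooling.

```python
import difflib


def changed_lines(old: str, new: str) -> list[str]:
    """Return the '+' and '-' lines of a unified diff between two snapshots."""
    diff = list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="snapshot-old",
            tofile="snapshot-new",
            lineterm="",
        )
    )
    # The first two entries are the '---' / '+++' file headers; skip them.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Hypothetical snapshots taken at two different dates.
old_snapshot = "rule one\nrule two\nrule three"
new_snapshot = "rule one\nrule two (revised)\nrule three\nrule four"

for line in changed_lines(old_snapshot, new_snapshot):
    print(line)
```

A versioned standard would hand you this list as a changelog; with a living standard you have to keep the old snapshot (or commit hash) yourself.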

hannob 2 months ago

I used to create a number of simple web pages in XHTML back in the days when we believed XHTML was the future. Recently, while going through and restructuring some of my old "online stuff", I learned that XHTML really isn't in a state that I'd want to use it any more:

* XHTML 1.0 and 1.1 are officially deprecated by the W3C.

* XHTML5 exists as a variant of HTML5. However, it's very clear that it's absolutely not a priority for the HTML5 working groups, and there's a statement that future features will not necessarily be supported by the XHTML5 variant.

* XHTML5 does not have a DTD, so one of the main advantages of XHTML - that you can validate its correctness with pure XML functionality - isn't there.

* If you do a 'view source' in Firefox on a completely valid XHTML 1.0/1.1 page, it'll redline the XML declaration like it's something wrong. Not sure if this is intended or possibly even a bug, but it certainly gives me a 'browser tells me this is not supposed to be there' feeling.

It pretty much seems to me XHTML has been abandoned by the web community. My personal conclusion has been that whenever I touch any of my old online things still written in XHTML, I'll convert them to HTML5.

swiftcoder 2 months ago

> If you do a 'view source' in Firefox on a completely valid XHTML 1.0/1.1 page, it'll redline the XML declaration like it's something wrong

Is the page actually being served as "application/xhtml+xml"? Most xhtml sites aren't, in which case the browser is indeed interpreting those as invalid declarations in a regular old html document

chrismorgan 2 months ago

If it’s served as XML, then view-source instead highlights the doctype line as an error (“Stray doctype.”).

jraph 2 months ago

I can confirm, I'm seeing this on my XHTML pages that are served as application/xhtml+xml, that's a shame.

cxr 2 months ago

Those red squiggles on view-source: pages in Gecko all have title text with diagnostics. The message (errProcessingInstruction) in recent-ish releases is given as:

> Saw “<?”. Probable cause: Attempt to use an XML processing instruction in HTML. (XML processing instructions are not supported in HTML.)

jraph 2 months ago

> it's very clear that it's absolutely not a priority for the HTML5 working groups

I wouldn't mind as long as it keeps working, but…

> and there's a statement that future features will not necessarily be supported by the XHTML5 variant.

That's news for me, and unfortunate.

al_borland 2 months ago

I was in college when XHTML was all the rage and everything we wrote had to pass validation. I still get uncomfortable adding breaks without closing them.

jraph 2 months ago

Younger but in the same boat. Nothing reasonable, but this just feels unmatched. It itches exactly like an (unclosed parenthesis

dang 2 months ago

There used to be a commenter here who would end all his comments with a closing paren, even though there had not been an opening paren. It led to a surprising number of flamewars! )

Edit: hmm, I couldn't really find any flamewars, but it did lead to objections:

https://news.ycombinator.com/item?id=8534213 (Oct 2014)

https://news.ycombinator.com/item?id=8533381 (Oct 2014)

https://news.ycombinator.com/item?id=6027549 (July 2013)

https://news.ycombinator.com/item?id=4990706 (Dec 2012)

https://news.ycombinator.com/item?id=4963264 (Dec 2012)

https://news.ycombinator.com/item?id=4943159 (Dec 2012)

https://news.ycombinator.com/item?id=4881400 (Dec 2012)

https://news.ycombinator.com/item?id=4765943 (Nov 2012)

aleksejs 2 months ago

It's a cultural difference thing: https://russian.stackexchange.com/questions/13142/what-do-or...

jraph 2 months ago

Hilarious. To fix this, one would have needed to put an open parenthesis in a sibling comment and hope to get upvoted more than the comment with the closing paren.

tptacek 2 months ago

That's demonic.

robin_reala 2 months ago

XHTML survives in ePub. Recently there was a survey to gather industry feedback for a potential addition of an HTML flavour of ePub to be added to the next version of the spec, but it soon became fairly clear that people saw a lot of value in remaining XHTML-only: https://www.w3.org/blog/2026/epub-and-html-survey-results-an...

notnullorvoid 2 months ago

I highly recommend everyone involved in web development to read at least a small proportion of the horrors that are the HTML parser specification. It will leave you yearning for the return of XHTML.

Or you could also read web proposals where the reason for avoiding the ideal implementation is complication of updating HTML parser rules.

Or attempt to use the web features that are already hindered by the HTML parser (custom element table rows).

jraph 2 months ago

> It will leave you yearning for the return of XHTML.

…or be grateful you can just use an existing HTML5 parser that hides all this stuff to your innocent eyes :-)

notnullorvoid 2 months ago

Grateful in part, but I can't help but think that if there was refusal to build parsers for an outlandish spec in the first place then we'd have fixed the problem by now.

Using existing parsers only hides the poor design up to a point.

jraph 2 months ago

I'm conflicted on this.

I mostly agree with the sentiment, I'd rather have simple parsers and sensible specs, but I'm also happy they do whatever it takes not to break anything (well, they are breaking XSLT…)

yomismoaqui 2 months ago

Nowadays you can use AI to write the parser, so it's the machine that suffers, I guess.

https://friendlybit.com/python/writing-justhtml-with-coding-...

kevincox 2 months ago

I would really like to use XHTML. It would make my HTML emitter much simpler (as I don't need special rules for elements that are self-closing, have special closing or escaping rules and whatever else) and more secure.

However no browsers have implemented streaming XHTML parsers. This means that the performance is notably worse for XHTML and if you rely on streaming responses (I currently do for a few pages like bulk imports) it won't work.
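
To show what "special rules" means for an emitter, here is a minimal Python sketch of a single-element serializer with an HTML mode and an XML mode. The `emit` function and its parameters are hypothetical, and the sketch deliberately leaves out raw-text elements such as `script` and `style`, which would add further special cases on the HTML side.

```python
from html import escape

# Void elements in HTML: they never take a closing tag.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


def emit(tag, attrs=None, text="", xml=False):
    """Serialize one element with optional attributes and text content."""
    attr_str = "".join(
        f' {name}="{escape(value, quote=True)}"'
        for name, value in (attrs or {}).items()
    )
    if xml:
        # XML mode: one uniform rule, any empty element may self-close.
        if not text:
            return f"<{tag}{attr_str}/>"
        return f"<{tag}{attr_str}>{escape(text, quote=False)}</{tag}>"
    # HTML mode: the emitter must know which elements are void.
    if tag in VOID_ELEMENTS:
        return f"<{tag}{attr_str}>"
    return f"<{tag}{attr_str}>{escape(text, quote=False)}</{tag}>"


print(emit("br"))
print(emit("br", xml=True))
print(emit("p", text="a < b"))
```

The XML branch needs no table of element names at all, which is the simplification being described.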

jraph 2 months ago

> no browsers have implemented streaming XHTML parsers

Dang, I hadn't considered this. That's something to add to the "simplest HTML omitting noisy tags like body and head vs going full XHTML" debate I have with myself.

One for XHTML: I like that the parser catches errors, it often prevents subtle issues.
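
The error-catching behaviour of an XML parser is easy to try with Python's standard library. This is only a sketch of a well-formedness check (the helper name `is_well_formed` is made up), and it says nothing about whether the document is valid XHTML beyond being well-formed XML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<p>one<br/>two</p>"))   # closed void element
print(is_well_formed("<p>one<br>two</p>"))    # HTML-style <br>, never closed
print(is_well_formed("<p><b>bold</p></b>"))   # mis-nested tags
```

An HTML parser would silently repair the last two inputs; an XML parser rejects them outright.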

jraph 2 months ago

In the linked article:

> you should master the HTML programming¹ language

The footnote reads:

> 1. This is a common debate - but for simplicity sake I'm just calling it this.

It's not really a debate, HTML is a markup language [1], not a programming language: you annotate a document with its structure and its formatting. You are not really programming when you write HTML (the markup is not procedural) (and this is not gatekeeping, there's nothing wrong about this and doesn't make HTML a lesser language).

To avoid the issue completely, you can phrase this as: "you should master HTML" and remove the footnote. Simple, clean, concise, clear. By the way, ML already means "Markup Language", so any "HTML .* language" phrasing can feel a bit off.

[1] https://en.wikipedia.org/wiki/Markup_language

MrJohz 2 months ago

Of course HTML is a programming language. It's one of the languages I use every day to program with. I'm not sure what the definition of a programming language would be beyond that.

Do you mean "Turing-complete" language? Or maybe "procedural programming language"? I agree HTML isn't either of those, but those aren't the be-all and end-all of programming now, are they?

jraph 2 months ago

I, and most of us, mean a language in which one can express a computer program, which is a set of instructions for a computer to execute. You don't execute an HTML file, you display it, render it. You can't implement fizz buzz in HTML. At best, you mark up its output. With HTML, you don't instruct, you describe. You instruct what to do with JavaScript, or Python, or whatever programming languages you use client or server side.

A programming language doesn't need to be procedural, it can be functional, or use another computationally equivalent paradigm. I'm not quite sure it needs to be Turing complete, but possibly.

A programming language lets you express to some processor that provides a set of computation primitives what to do with the memory cells you have at your disposal, and in general it lets you deal with input and output.

If you consider any language you program with to be a programming language, then CSS, JSON, YAML, XML, Markdown (that's what your readme is likely written in) and even English (that's what you use to express the specs, the bugs, maybe your notes / drafts, the comments, possibly the language the singer of the songs you're listening to while programming uses) or UML need to be programming languages too. That's not quite useful. "Program with" is too large and would make the "programming" qualifier largely useless.

https://en.wikipedia.org/wiki/Programming_language

https://en.wikipedia.org/wiki/Computer_program

https://en.wikipedia.org/wiki/List_of_programming_languages

https://stackoverflow.com/questions/14512218/is-html5-a-prog...
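
The fizz buzz remark can be made concrete with a short Python sketch: the logic lives in the programming language, and HTML only marks up the result. Both function names are hypothetical.

```python
from html import escape


def fizzbuzz(n):
    """Compute the fizz buzz sequence for 1..n as a list of strings."""
    words = []
    for i in range(1, n + 1):
        if i % 15 == 0:
            words.append("FizzBuzz")
        elif i % 3 == 0:
            words.append("Fizz")
        elif i % 5 == 0:
            words.append("Buzz")
        else:
            words.append(str(i))
    return words


def as_html_list(items):
    """Describe the already-computed items as an ordered list in HTML."""
    rows = "".join(f"<li>{escape(item)}</li>" for item in items)
    return f"<ol>{rows}</ol>"


print(as_html_list(fizzbuzz(5)))
```

The branching happens in `fizzbuzz`; `as_html_list` only describes the structure of what was already decided.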

MrJohz 2 months ago

An HTML file is a set of instructions to execute. They're very high-level, declarative instructions for describing a UI, similar to how SQL is high-level declarative instructions for describing a set of data to be loaded, or how Prolog is a high-level declarative set of instructions for describing a set of logical axioms, but they're still instructions. You pass them to an execution engine, and on the basis of the instructions you've written, the engine does something. (See e.g. the section on fourth generation PLs in the second link you gave.)

More broadly, I think this discussion is a stupid one. There is no formal, mathematically precise definition of a programming language. There are formal definitions of lots of PL-related things, and for what a language is in general (a combination of syntax and semantics), but there's no formal definition of the term "programming language" that's useful here.

So if we're not arguing about a formal definition, then we're arguing about essentially our favourite dictionaries, and how we personally interpret our favourite dictionaries. And that's just not a useful argument at all, it's not even how dictionaries are meant to work! And yet whenever someone dares to write "HTML programming language" or something similar, there is always a comment from someone demanding that the author use their personal dictionary, and correct their changes. And it is deeply grating, because whenever I see this happen:

* The original statement is never ambiguous. I have never seen a situation where referring to HTML as a programming language has ever caused some sort of confusion.

* The discussion about whether HTML is a programming language is almost always completely irrelevant to the topic at hand, and bringing it up adds no value to the discussion.

* The author's definition is usually inconsistent anyway. Which isn't a problem — I don't imagine my mental definition of a programming language is entirely consistent either — but it's dumb watching someone try and correct other people without understanding their own definition enough to be able to respond to clarifying questions.

In your original comment, you said "it's not really a debate", and that's completely correct. It's not a debate because there's no right answer. There's not even any value to a right answer. The matter is entirely a question of terminology. And if different choices of terminology make things unclear, then it might be worth clarifying that terminology, but here I don't think the author could have been any clearer at all about what they were trying to communicate.

jraph 2 months ago

> More broadly, I think this discussion is a stupid one

Mostly agree (although reasoning about these things can be interesting). More on this at the end of the comment.

> So if we're not arguing about a formal definition, then we're arguing about essentially our favourite dictionaries

It's also a matter of the most widely accepted definitions, not just what definition one prefers. And it seems to me not considering HTML as a programming language is what's most accepted and for good reasons.

We need a common understanding to communicate.

> I don't think the author could have been any clearer at all about what they were trying to communicate.

They just make their expression more confusing and more complicated by needlessly qualifying HTML and putting this footnote when they could have skipped both the footnote and the qualifier.

Here I was in fact mostly concerned about the clarity and the presentation. That page seems to be written for newcomers, qualifying HTML as a programming language doesn't seem quite optimal given the (supposed) target, I think it would do a disservice to someone who has not a great understanding of those things.

So the better way of exposing things IMHO is just not mentioning it at all, and if someone wonders whether HTML is a programming language, they can do their own research.

> An HTML file is a set of instructions to execute

I believe it's a stretch to describe HTML like this. Your explanation makes it work, but I don't think it's a usual way of viewing HTML. In any case it seems to me presenting HTML like a set of instructions to execute to a newcomer would just be weird.

Now, this discussion wouldn't matter much between people who have such a clear understanding of these things as you. When everything is this clear, deciding whether HTML is a programming language is indeed a purely intellectual exercise that can totally feel pointless and where both positions are probably reasonable depending on the perspectives, and yes, on the exact, clarified definition one uses.

So I was wrong: there is a debate. It was incautious of me to state otherwise. And the debate is mostly pointless for whoever clearly understands the involved concepts. And I should have focused on the pedagogical aspect of this stuff, not on whether it's wrong.

I will definitely handle such a discussion differently next time, if I don't outright skip it.

radicalethics 2 months ago

What happens if I simply add an iterator mechanism to HTML (well, I guess we need variables too)? Is it no longer a markup language here (I won't add anything else):

Better question, why don't we upgrade XML to do that?

jraph 2 months ago

That's not technically HTML anymore.

But if you disagree with this, or somehow work around this statement by replacing your for element with some "for-loop" custom element (it is valid HTML to add custom tags with dashes in their names), my stronger argument is at https://news.ycombinator.com/item?id=46743219#46743554
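
For the "custom tags with dashes" point, here is a simplified Python check for whether a name could be a custom element name. It is an ASCII-only approximation that is stricter than the actual specification (which also allows many non-ASCII characters), and the function name is hypothetical.

```python
import re

# Hyphenated names already taken by SVG/MathML, so not usable as custom elements.
RESERVED = {
    "annotation-xml", "color-profile", "font-face", "font-face-src",
    "font-face-uri", "font-face-format", "font-face-name", "missing-glyph",
}

# Lowercase ASCII letter first, at least one hyphen, no uppercase letters.
_NAME = re.compile(r"[a-z][a-z0-9._]*-[a-z0-9._-]*")


def looks_like_custom_element_name(name: str) -> bool:
    """ASCII-only approximation of the custom element naming rule."""
    return name not in RESERVED and _NAME.fullmatch(name) is not None


print(looks_like_custom_element_name("for-loop"))
print(looks_like_custom_element_name("for"))
```

So a `for-loop` element passes the naming rule, while a bare `for` element does not.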

mimasama 2 months ago

> Better question, why don't we upgrade XML to do that?

XSLT which is an application of XML allows you to do a for-each: https://developer.mozilla.org/en-US/docs/Web/XML/XSLT/Refere...

direwolf20 2 months ago

That's basically the design of PHP with different syntax. <?for($i=0;$i<1;$i++){?> <html></html> <?}?>

Nobody uses PHP this way any more though — people treat it like Python or Node and write the entire codebase inside a big <? block

JSP is similar with different syntax again — nobody uses JSP either

I think ASP too but I never used that

[-]

jraph 2 months ago

You could have some client side JavaScript handle your for nodes as well. That's how I imagined what OP described actually.

> Nobody uses PHP this way any more though

Well… I have bad news.

I do, for one :-)

[-]

PaulHoule 2 months ago

I ask you then: (1) how do you deal with the template that surrounds a large number of pages on a site? (2) how do you deal with the fact that the average web form might want to display something different based on the form contents (e.g. redraw the form if there's an error, draw something different on success?) (3) do you write anything that returns JSON or other results for AJAX or web services?

[-]

jraph 2 months ago

(1) What about it? (note that I don't manage websites with a large number of pages)

(2) It's easy to add if conditions that test $_GET, $_POST or $_REQUEST to display different things depending on what was submitted.

(3) Not often (but have in the past, and will probably have to soon in a personal project). What issue are you anticipating?

[-]

PaulHoule 2 months ago

(1) It's a problem if you have 2 or 3 (never mind N where N is large) different web pages that have the same stuff at the top or the bottom. I mean you can have

  <?php include("header.php") ?>
  ... body ...
  <?php include("footer.php") ?>

but...

(2) ... in either case it is just as easy to write

  <?php
  ... some "router" that tests $_GET, ... to set $body_file ...
  include("header.php");
  include($body_file);
  include("footer.php");
  ?>

where you have the option of putting headers on before you include header.php, showing a different header or footer conditionally, etc. This approach is structurally stable and scales with the complexity of your application no matter what you're doing...

(3) ... for instance say you want to write a page that might return a different format depending on the headers, the router can return JSON if that is called for, or XML if that is called for, or HTML inside the site's global template if that is called for.
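A rough sketch of that kind of router, written in Python rather than PHP so it can be run standalone. Everything here is hypothetical: `HEADER`, `FOOTER` and `render` are made-up names, the `Accept` check is a naive substring match rather than real content negotiation, and the XML branch assumes the dictionary keys are valid element names.

```python
import json
from xml.sax.saxutils import escape

# Hypothetical stand-ins for the site's global template (header.php / footer.php).
HEADER = "<html><body><h1>My site</h1>"
FOOTER = "</body></html>"


def render(data, accept):
    """Return (content_type, body) chosen from the request's Accept header."""
    if "application/json" in accept:
        return "application/json", json.dumps(data)
    if "application/xml" in accept:
        # Assumes every key is a valid XML element name.
        items = "".join(
            "<{0}>{1}</{0}>".format(key, escape(str(value)))
            for key, value in data.items()
        )
        return "application/xml", "<page>" + items + "</page>"
    # Default: HTML wrapped in the global template.
    body = "".join(
        "<p>{}: {}</p>".format(escape(str(key)), escape(str(value)))
        for key, value in data.items()
    )
    return "text/html", HEADER + body + FOOTER
```

The point of the design is that one entry point decides the output format, so the template is only applied in the branch that needs it.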

[-]

jraph 2 months ago

I'll use (1) for when I have a restricted set of pages (say, the usual pages of a site, like home, contact, about ...); body is not in a separate file; and (2) when the number of pages is dynamic, say, a site that displays recipes stored in markdown files.

(3) I don't know yet for sure how I'd do it today (I will soon, normally). I suppose I would just write different scripts that can call some shared code. For APIs, people expect something that looks like REST endpoints, and I suppose I would return JSON or XML from a REST endpoint, but the URL structure that looks good for REST wouldn't for a normal page.

radicalethics 2 months ago

If HTML was never able to be the full solution, then I guess if I had to expand on where I'm going, then what the heck are we even doing with this html thing? Either MAKE IT like PHP, ditch it, or do something, anything.

[-]

jraph 2 months ago

HTML is perfectly able to do what it was designed for: mark up documents.

There still needs to be something like HTML even when you have PHP: PHP is something you run on the server and it still needs to output something to the client in some format, and HTML is adequate for this.

The heck we are doing with HTML is taking it for building client apps. But even then, you now have UI toolkits that mimic this model: QML, whatever XML format Android has to design UIs, etc.

mediumsmart 2 months ago

> Either MAKE IT like PHP, ditch it, or do something, anything.

Do nothing so I can make websites that are accessible, secure, static and fast. HTML is the full solution unlike PHP or JS. With CSS it’s Turing complete.

embedding-shape 2 months ago

I dunno, you're being pedantic :) Yes yes, the name clearly ends with "Markup Language" so yeah, with a very strict definition of programming languages, HTML is not one of them.

But if we use a broader definition, basically "a formal language that specifies behavior a machine must execute", then HTML is indeed a programming language.

HTML is not only about annotating documents or formatting, it can do things you expect from a "normal" programming language too, for example, you can do constraints validation:

    <input name="token" required pattern="[A-Z]{3}-\d{4}" title="Must match ABC-1234 (3 uppercase letters, hyphen, 4 digits)" placeholder="ABC-1234">

That's not annotating, not just a "document", and not just formatting. Another example is using <details> + <summary>: you have users mutating state that reveals different branches in the page, all just using HTML and nothing else.

In the end, I agree with you, HTML ultimately is a markup language, but it's deceiving, because it does more than just markup.
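For comparison, here is roughly what the `required` + `pattern` pair on that input expresses if you spell it out as a check in Python. This is only a sketch under assumptions: `is_valid_token` is a made-up name, browsers evaluate `pattern` as a JavaScript regular expression matched against the whole value, and `re.ASCII` is used so `\d` only matches 0-9 as it does in JavaScript.

```python
import re

# Same expression as the pattern attribute; fullmatch mirrors the
# whole-value matching that browsers apply to pattern.
TOKEN_PATTERN = re.compile(r"[A-Z]{3}-\d{4}", re.ASCII)


def is_valid_token(value):
    """True if value is non-empty (required) and matches the pattern."""
    if value == "":
        return False  # the `required` attribute rejects empty input
    return TOKEN_PATTERN.fullmatch(value) is not None
```

Whether you read the attribute as a description or as a tiny program, the regular expression is the only part doing real work.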

[-]

jraph 2 months ago

> I dunno, you're being pedantic :)

I might be; I'm usually not, but this is all that xhtml.club and this footnote are about, so we might as well be correct :-)

Constraint validation is still descriptive (what is allowed)

All details and summary are doing is conveying information on what's a summary and what's the complete story, and it has this hidden / shown behavior.

In any case, you will probably find something procedural / programming-like in HTML, but it's not the core idea of the language, and if you are explaining what HTML is to a newbie, I feel like you should focus on the essentials. Then we can discuss the corners between more experienced people.

In the end, all I'm saying is: you can just avoid issues and just say "HTML" without further qualifying it.

JimDabell 2 months ago

> behavior a machine must execute

This is not what HTML does. Tags are not instructions, they delimit the start and end of elements. They describe content, they do not specify behaviour.

In your pattern example, that is still just a description of what is acceptable input. It doesn’t execute anything. A paper form might specify the format DD / MM / YYYY but that doesn‘t mean the form is executing a program in your brain when you fill it out.

throwaway150 2 months ago

I'm not sure we can call your parent comment pedantic. They're just being correct. Is it pedantic to say that fish is not a fruit? It's just correct to do so.

If anything, it is the act of stretching the definition of "programming language" so much that it includes HTML as a programming language that we should call pedantic.

PaulHoule 2 months ago

One threshold is "can you write a program that might not complete?" You can't in SQL, which makes it less of a programming language than, say, FORTRAN.

If you look at the HTML 5 spec it is clear that it's intended to be a substrate for applications. The HTML 5 spec could be factored into a specification of the DOM, specification of an x-language API for the DOM and a specification for a serialization format as well as bindings of that x-language API to specific languages like Javascript.

[-]

jraph 2 months ago

> If you look at the HTML 5 spec it is clear that it's intended to be a substrate for applications

That's the saddest thing I've read today.

(arguably not a terribly sad day)

[-]

PaulHoule 2 months ago

Back when it was fashionable to complain about how every Electron application has 30 MB of bloat I did an eval of all the options for x-platform applications that weren't Electron and came to the conclusion that "they all sucked" except for maybe JavaFX -- and not everybody likes Java as much as I do.

Building up to Win 8, Microsoft pushed for grid and flexbox which are the bees knees for laying out applications in HTML.

Compare the annoying nag dialogs in MacOS and Windows. MacOS nags you to buy into Apple Music and other unwanted services with 2025 reskins of the 1999 reskins of the modal dialogs from the 1984 Mac Classic. Windows does the same with ads that look like advertising which I find more visually appealing even if the services are unappealing.

Every time I think about writing a GUI application that's not a web application I think "this is a waste of time" whereas my web applications keep finding new lives as mobile applications, VR applications, etc.

[-]

jraph 2 months ago

> Back when it was fashionable to complain about how every Electron application has 30 MB of bloat

What, it's not anymore??

And yes, I do end up writing web applications every time too (I haven't bundled them though). I don't want to tie myself to a specific platform, and being able to point users to a URL and bam, they can run the thing, is convenient. I hate that this makes me dependent on tech maintained with Big Tech money though.

recursivedoubts 2 months ago

consider the following:

https://html-lang.org/

[-]

jraph 2 months ago

Oh yes, this is "HTML, the programming language", not HTML (also called "HTML, the markup language" in that page).

And it's brilliant :-)

falcor84 2 months ago

I think that it is a debate, and it depends on the role of HTML in your system.

If all you're doing is using HTML to "annotate a document with its structure and its formatting", then yes, I'll accept that it's not quite programming, but I've not seen this approach of starting with a plain non-html document and marking it up by hand done in probably over two decades. I do still occasionally see it done for marking up blog posts or documentation into markdown and then generating html from it, but even that's a minuscule part of what HTML is used for these days.

Your mileage may vary, but what I and people around me typically do is work on hundreds/thousands of loosely coupled small snippets of HTML used within e.g. React JSX, or Django/Jinja templates or htmx endpoints, in order to dynamically control data and state in a large program. In this sense, while the html itself doesn't have control flow, it is an integral part of control flow in the larger system, and it's extremely likely that I'll break something in the functionality if I carelessly change an element's type or attribute value. In this sense, I'm not putting on a different hat when I'm working on the html, but just working on a different part of the program.

[-]

jraph 2 months ago

> React JSX, or Django/Jinja templates

Those are not HTML. PHP neither, even when used as a templating language for HTML.

> htmx endpoints

Not really familiar with htmx, but I would say this is HTML augmented with some additional mechanisms. I don't know how I would describe this augmented HTML, but I'm not applying my "not programming" statement to htmx (I probably could, but I haven't given it enough thought to do it).

> In this sense, I'm not putting on a different hat when I'm working on the html, but just working on a different part of the program.

I agree with this actually. I wouldn't consider that writing HTML (or CSS) is really a separate activity when I'm building some web app.

throwaway150 2 months ago

> In this sense, while the html itself doesn't have control flow, it is an integral part of control flow in the larger system

That's correct but I don't see what it has got to do with the question of whether HTML is a programming language or not.

Strings do not have control flow but strings are integral part of larger programs that have control flow. So what? That doesn't make strings any closer to being programming languages.

[-]

falcor84 2 months ago

It's a question of semantics. What I'm saying is that the way many of us use html in practice in 2026 is less like arbitrary strings and more like db connection strings, where most of our focus is not on whether a bit of text is an article or an aside, but about how it participates in the control flow across different components in our architecture.

From another perspective, I'm not familiar with any present day company, where the html they use in their source code is sufficiently simple and distinct from the rest of the program to be managed by non-programmers. The only html that is just used as strings is that used for individual posts in a crm or marketing tool's cms, typically stored in a database rather than the source code repository.

netsharc 2 months ago

> Validation is ignored, and most modern sites are built with little concern for structure or longevity.

I remember going online with a modem in the 90s. There was a new ISP in town, but their homepage took forever to load. I viewed the source: whatever page generator they were using rendered the page as HTML tables (this was fine back then), and added repetitive style tags to every table cell instead of using CSS (although I wonder if this was before CSS) or skipping them for empty cells, so their homepage was bloated and slow to load on dial-up.

I wonder how it is nowadays. But I suppose in the age that accommodates apps like Teams and Slack, who cares?

[-]

jraph 2 months ago

If only the repeated inline styles and abusively nested tables were the issue…

The dozens (or hundreds! have you tried GitHub recently??) of HTTP requests.

The JavaScript bundles whose sizes are expressed in 10⁶ bytes.

The UIs that are fully recomputed and redrawn on each small interaction.

The auto playing videos. The images that are comparable to full res pictures (but usually empty of meaning because they are stock or AI generated).

JimDabell 2 months ago

> whatever page generator they were using rendered the page as HTML tables (this was fine back then), and added repetitive style tags to every table cell instead of using CSS

Apart from the fact that very few people understood CSS back then, there was a stupendous amount of really weird bugs. For instance, I remember having a simple th { font-size: … } rule, and some versions of Netscape 4 somehow managed to apply the font size to all <th> cells except for the third one. So workarounds like extra style attributes were added to fix things like this.

reconnecting 2 months ago

Does valid pure HTML 4.01 (1) made in 2025 count?

I don’t think it’s about luddites, as the website mentioned. Many professions have tools whose use suggests that a person has extensive experience, and in web development, XHTML 1.0 and older HTML standards are such tools.

1. https://www.tirreno.com

[-]

throwaway150 2 months ago

It does not? HTML 4.01 is not XML. So not XHTML. What's the confusion?

[-]

reconnecting 2 months ago

Both technologies are from the same period and share the same validation culture from the W3C.

[-]

jraph 2 months ago

> Both technologies are from the same period

Not really, XHTML is as current as HTML 5.

XHTML 1.0 is older and is indeed (more or less?) the XML variant of HTML 4.01.

[-]

reconnecting 2 months ago

How so? HTML 4.01 is from 1999, XHTML 1.0 from 2000.

XHTML club mentioned valid XHTML 1.0 Strict (or Transitional), not general XHTML.

jraph 2 months ago

The XML part of XHTML is an important feature which HTML 4.01 doesn't have though.

Writing valid HTML should be a bare minimum (I know it isn't!).

[-]

reconnecting 2 months ago

It is not “your HTML”, it’s HTML 4.01 from 1999, while XHTML 1.0 is from 2000. What they have in common is the origin of their validation, which comes from the W3C validator (1).

Same badges, same limits.

1. https://validator.w3.org/

[-]

jraph 2 months ago

Sorry, I edited my reply in the meantime and I probably broke your citation.

But what you are describing is XHTML 1.0, not XHTML in general.

HTML5 has its XHTML variant too, sometimes called XHTML 5.

[-]

2 months ago

[deleted]

reconnecting 2 months ago

Valid XHTML 1.0 Strict (or Transitional) is a requirement of XHTML club, thus my comparison with HTML 4.01.

[-]

jraph 2 months ago

> Valid XHTML 1.0 Strict (or Transitional) is a requirement of XHTML club

Where do you see this?

I see that they do use XHTML 1.0 Strict but I don't see this requirement written.

Brad, we need your clarification here, it's critical, we need you to tell us which one of us is wrong! :-)

[-]

reconnecting 2 months ago

Thank you for asking.

XHTML Members(1):

Current websites that are valid XHTML 1.0 Strict (or Transitional)

Back to the tirreno website: it is pure transitional HTML 4.01 without JS or CSS, so making it W3C-valid (2) these days poses more or less the same challenges. Have a look.

1. https://xhtml.club/members.html

2. https://validator.w3.org/check?uri=https://www.tirreno.com/&...

[-]

jraph 2 months ago

Damn, it seems you are right!

Still not convinced by your proposal to extend the XHTML club to include valid HTML 4.01, not that I care much anyway :-)

firefoxd 2 months ago

Just tested my xhtml website on validator.w3.org , the errors I see triggered me:

> Trailing slash on void elements has no effect and interacts badly with unquoted attribute values.

Unquoted attribute values? So help me I don't see you using unquoted attribute values.

[-]

Elfener 2 months ago

I think you ran it (or the site decided to run it) through the HTML5 validator?

GavinAnderegg 2 months ago

In the early 2000s I was 100% sold on the idea of strict XHTML documents and the semantic web. I loved the idea that all web pages could be XML documents which easily provided their data for other sources. If you marked your document with an XHTML 1.0 Strict or XHTML 1.1 doctype, a web browser was supposed to show an error if the page contained an XML error. Problem was, it was a bit of a pain to get this right, so effectively no one cared about making compliant XHTML. It was a nice idea, but it didn't interact well with the real world.
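To illustrate the strict-versus-forgiving difference, here is a small sketch using only Python's standard library. It is an analogy, not a browser: `xml.etree.ElementTree` stands in for an XML parser that stops at the first well-formedness error, and `html.parser` is a lenient tokenizer, not an implementation of the HTML5 parsing algorithm. The names `parses_as_xml` and `html_start_tags` are made up for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# <b> is never closed, so this is not well-formed XML.
MALFORMED = "<p>unclosed <b>bold</p>"


def parses_as_xml(markup):
    """True if the markup is well-formed XML, False if the parser rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class StartTagCollector(HTMLParser):
    """Records every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup):
    """Feed markup to the lenient parser and return the start tags it saw."""
    collector = StartTagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

With the malformed snippet, the XML path refuses the whole document, while the HTML path carries on and still reports both the `p` and `b` start tags.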

Decades later, I'm still mildly annoyed when I see self-closing tags in HTML. When you're not trying to build a strict XML document, they're no longer required. Now I read them as a vestigial reminder of the strict XHTML dream.

EDIT: I just checked, and my site (at least the index page) still validates! https://validator.nu/?showsource=yes&doc=https%3A%2F%2Fander...

EDIT2: Hey, look, if you still want to use self-closing tags where they're not required: go nuts! I'm just explaining why I don't use them anymore.

[-]

strogonoff 2 months ago

As someone who has gotten into the idea of semantic Web long after XHTML was all the rage[0], I somewhat resent that semantic Web and XML are so often lumped together[1]. After all, XML is just one serialisation mechanism for linked data.

[0] I don’t dislike XHTML. The snob in me loves the idea. Sure, had XHTML been The Standard it would have been so much more difficult to publish my first website at the age of 14 that I’m not sure I would have gotten into building for Web at all, but is it necessarily a good thing if our field is based on technology so forgiving to malformed input that a middle school pupil can pass for an engineer? And while I do omit closing tags when allowed by the spec, are the savings worth remembering these complicated rules for when they can be omitted, and is it worth maintaining all this branching that allows parsers to handle invalid markup, when barely any HTML is hand-written these days?

[1] Usually it is to the detriment of the former: the latter tends to be ill-regarded by today’s average Web developer used to JSON (even as they hail various schema-related additions on top of JSON that essentially try to make it do things XML can, but worse).

[-]

PaulHoule 2 months ago

The semantic web took on the XSD data types

https://www.w3.org/TR/xmlschema-2/

even though a lot of tools and standards (I'm looking at you SPARQL) don't really support them. My favorite serialization for RDF is Turtle:

https://en.wikipedia.org/wiki/Turtle_(syntax)

[-]

strogonoff 2 months ago

That is a good point, if you consider XSD then that is an XML connection, it starts to become a bit complicated and I see why people start to dislike it. I forget about that because to me it’s just about the idea of a graph, which is otherwise quite elegant. Why not have a graph type-free with just string literals; much richer information about what kind of values go where can be provided through constraints, vocabularies, etc.

My favourite serialisation has got to be dumb triples (maybe quads). I don’t think writing graphs by hand is the future. However, when it comes to that, Turtle’s great.

[-]

PaulHoule 2 months ago

Because the semantics of numbers and dates matters.

It's absurd that JSON defines numbers as strings and has no specification for dates and times.

I believe we lose a lot of small-p programming talent (people who have other skills who could put them on wheels by "learning to code") the moment people have the 0.1 + 0.2 != 0.3 experience. Decimal numbers should just be on people's fingertips, they should be the default thing that non-professional programmers get, IEEE doubles and floats should be as exotic as FP16.
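For anyone who hasn't hit it yet, the experience looks like this in Python, next to what the standard library's `decimal` module gives you instead. This is just an illustration; the variable names are arbitrary.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 have no exact representation.
float_sum = 0.1 + 0.2
floats_agree = float_sum == 0.3

# Decimal arithmetic on the same literals, built from strings so no
# binary rounding sneaks in before the addition.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimals_agree = decimal_sum == Decimal("0.3")

print(floats_agree)    # False
print(decimals_agree)  # True
```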

As for dates, everyday applications written by everyday people that use JSON frequently have 5 or more different date formats used in different parts of the application and it is an everyday occurrence that people are scratching their heads over why the system says that some event that happened on Jan 24, 2026 happened on Jan 23, 2026 or Jan 25, 2026.
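A minimal sketch of how the off-by-one-day confusion typically arises, assuming the event is stored as a UTC instant and then shown in a local offset. The `local_date` helper and the fixed offsets (UTC-5, UTC+9) are assumptions for illustration; real applications would use named time zones.

```python
from datetime import datetime, timedelta, timezone


def local_date(instant, offset_hours):
    """Calendar date of a timezone-aware instant at a fixed UTC offset."""
    zone = timezone(timedelta(hours=offset_hours))
    return instant.astimezone(zone).date()


# Two events that both happened on Jan 24, 2026 in UTC.
early_event = datetime(2026, 1, 24, 2, 30, tzinfo=timezone.utc)
late_event = datetime(2026, 1, 24, 20, 0, tzinfo=timezone.utc)

print(local_date(early_event, -5))  # 2026-01-23
print(local_date(late_event, 9))    # 2026-01-25
```

Drop the offset somewhere along the way (say, by serializing only the date part into JSON) and each part of the application ends up with its own idea of which day it was.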

Give people choices like that and they will make the wrong choices and face the consequences. Build answers for a few simple things that people screw up over and over and... they won't screw up!

[-]

strogonoff 2 months ago

> Because the semantics of numbers and dates matters.

Type semantics is only a small part of what is needed for systems and humans to know how to adequately work with and display the data. All of that information, including the type but so much more, can be supplied in established ways (more graphs!) without having to sprinkle XSD types on your values.

For example, say you have a triple where the object is a number that for whatever good reason must lie between 1 and <value from elsewhere in the graph> in 0.1 increments. Knowing that it is a number and being able to do math on it is not that useful when 99% of math operations would yield an invalid value; you need more metadata, and if you have that you also have the type.

Besides, verbatim literal, as obtained, is the least lossy format. The user typed "2.2"—today you round it to an integer but tomorrow you support decimal places, if you keep the original the system can magically get more precise and no one needs to repeat themselves. (You can obviously reject input at the entry stage if it’s outlandish, but when it comes to storage plain string is king.)
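A tiny sketch of that idea, with made-up names (`stored_literal`, `view_as_integer`, `view_as_decimal`): the stored value is the string the user typed, and each version of the system derives its own view from it.

```python
from decimal import Decimal, ROUND_HALF_UP

# What the user typed, stored verbatim.
stored_literal = "2.2"


def view_as_integer(raw):
    """Today's system: rounds the literal to a whole number."""
    return int(Decimal(raw).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def view_as_decimal(raw):
    """Tomorrow's system: keeps the decimal places that were typed."""
    return Decimal(raw)


print(view_as_integer(stored_literal))  # 2
print(view_as_decimal(stored_literal))  # 2.2
```

Had the system stored the rounded integer instead, the second view could not be recovered without asking the user again.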

jraph 2 months ago

> I'm still mildly annoyed when I see self-closing tags in HTML

Why? That's (mildly) bad for your health.

direwolf20 2 months ago

You're annoyed when people are trying to keep the dream alive?

Since HTML5 specifies how to handle all parse errors, and the handling of an XML self-closing tag is to ignore it unless it's part of an unquoted attribute value, it's valid HTML5.

[-]

GavinAnderegg 2 months ago

I'm not annoyed by it when people are trying to make XML compatible documents, but effectively no one is. Platforms like WordPress use self-closing image tags everywhere, but almost no one using WordPress cares about document validation. This ends up meaning that the `<img ... />` is just an empty gesture.

PaulHoule 2 months ago

Circa '99 a high fraction (50%-ish) of HTML in the field was invalid, so if you were making a new web browser it had to parse invalid HTML the same way as Netscape which was one more reason we didn't get competitive web browsers.

HTML 5 specified exactly how "invalid" HTML is parsed so now there is no such thing as invalid HTML. XHTML was one of those things that never quite worked:

https://friendlybit.com/html/why-xhtml-is-a-bad-idea/

[-]

jraph 2 months ago

> there is no such thing as invalid HTML

There is. There are things that are still considered invalid, like nesting form elements for instance.

(this doesn't take away your argument though, and you were focusing on the parsing aspect).

[-]

chrismorgan 2 months ago

The things that are invalid should all have defined behaviour. For example, a <label> is not allowed to contain two form controls, but is defined as applying to the first such control.

As far as parse errors are concerned, https://html.spec.whatwg.org/multipage/parsing.html#parse-er... says:

> This specification defines the parsing rules for HTML documents, whether they are syntactically correct or not. Certain points in the parsing algorithm are said to be parse errors. The error handling for parse errors is well-defined (that's the processing rules described throughout this specification), but user agents, while parsing an HTML document, may abort the parser at the first parse error that they encounter for which they do not wish to apply the rules described in this specification.

[-]

jraph 2 months ago

> The things that are invalid should all have defined behaviour

100% agree.

And then I guess the philosophical question is "What's invalid when everything is defined?"

[-]

chrismorgan 2 months ago

The idea of almost all of HTML’s errors (parsing and conformance) is that they indicate likely errors (though it’s definitely quite possible to deliberately skirt the edges, e.g. content=width=device-width,initial-scale=1).

reconnecting 2 months ago

You might avoid using inline CSS here by replacing <h2 style="font-weight:normal;"> with 

yomismoaqui 2 months ago

Is this ragebait?

I lived through the XML hype cycle and god it was awful. I still have nightmares about some XSLT I had to maintain.

Good riddance...

jraph 2 months ago

I knew this HN submission would eat my Saturday afternoon and replace any other procrastination activity. Thanks, I hate it.