SuperHTML is here to rescue you from syntax errors, and it's FOSS

(theregister.com)

52 points | by rc00 2 days ago

39 comments

PreInternet01 2 days ago

Previous discussion, sans El Reg hyperbole: https://news.ycombinator.com/item?id=41512213

notpushkin 2 days ago

> Author would like to see a switch back to plain old static HTML. Us too

Wholeheartedly agree. Can we also go to XHTML5, while on it?

alwillis 2 days ago

I used to write XHTML and it took a while to lose the muscle memory.

But I have no desire to go back.

kristoff_it 2 days ago

> Can we also go to XHTML5, while on it?

Personally, I would like that, but in practice I don't think it's ever going to be a thing, so I didn't even bother adding support for it in SuperHTML, sorry!

notpushkin 2 days ago

Totally understandable, you do you! But I think the problem is, it’s not yet a thing because of the lack of tooling. It is kind of a chicken and egg situation here.

Would you perhaps be open to a PR, maybe? (I would love to work on something like that, although I’m swamped right now and can’t promise anything.)

kristoff_it 2 days ago

Yes, I'd be open to a PR, but I think the lack of browser support raised in the other comment is the actual obstacle, and I really don't see that changing easily.

jraph 2 days ago

What's missing in browsers for XHTML5? Don't they understand and validate the XML and allow using the HTML5 elements when a page is served with an application/xhtml+xml MIME type? Don't they, at worst, degrade to HTML5 parsing?

(I only tried writing polyglot syntax, with only stuff allowed in both XHTML and HTML syntaxes)

Or are you concerned with the unrecoverable mechanism?

wizzwizz4 2 days ago

Being able to write:

  <script src="util.js" />

instead of:

  <script src="util.js"></script>

would be nice. But it's not (iirc) compatible with WHATWG's HTML5 parsing algorithm, browsers only use XML mode if served the correct content type, and in XML mode they don't recover at all from errors (when really, they should show an annoying popup and then switch to a quirks parser).

We need to fix that, before XHTML can be viable. As-is, it's like it's being sabotaged.
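
To make the XML side of this concrete, here is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` as a stand-in for a browser's XML mode (it is not a browser and does not implement the WHATWG algorithm). The helper name `summarize` is made up for illustration. It only shows that an XML parser treats the two spellings of the `script` element as the same empty element.

```python
import xml.etree.ElementTree as ET


def summarize(markup):
    """Parse markup as XML and return the parts that identify the root element."""
    element = ET.fromstring(markup)
    # tag name, attributes, text content, and number of child elements
    return (element.tag, dict(element.attrib), element.text, len(element))


# Hypothetical inputs taken from the comment above.
short_form = summarize('<script src="util.js" />')
long_form = summarize('<script src="util.js"></script>')

# To an XML parser, both spellings describe one empty element.
print(short_form == long_form)
```

An HTML parser following the WHATWG rules is a different story, which is the incompatibility the comment refers to.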

wruza 2 days ago

Maybe web shops must stop producing malformed documents. No one else expects a corrupt pdf, xlsx, exe, jpg, etc to work. Ability to show incorrect documents. What an absurd requirement. If it’s incorrect, fuken correct it man, or get a simpler job.

wizzwizz4 a day ago

Microsoft Excel has quite a bit of code dedicated to opening corrupt XLSX files. Pretty sure Adobe Reader does stuff with corrupt PDFs, too. The ability to show incorrect documents is important if you care more about reading the document than technical correctness.

notpushkin 2 days ago

> they don't recover at all from errors

This is a recurring argument I see against XHTML. Fundamentally I don’t see it as a downside (you don’t expect the browser to run malformed JavaScript – why not HTML, too?). I agree there’s a couple of possible obstacles, though:

1. It can be hard to produce 100% valid XHTML by hand – I personally run into errors from time to time. This is actually something an LSP could help with! All these errors can be detected at authoring time, given the proper tooling.

2. Naive template systems might produce invalid XHTML if you’re not careful. Personally I haven’t run into such situations, but some people say they do.

This is also a tooling problem, but it’s a bit trickier to tackle. A solution could be either (1) an LSP for the templating system, working together with the (X)HTML LSP, or (2) a templating system that is XML-native (but more ergonomic than XSLT haha).

(Personally, I’m thinking about starting something like (2) with Jinja-like syntax, but I feel it’s not something I have the time or energy for, at this point.)

> when really, they should show an annoying popup and then switch to a quirks parser

Not sure if it’s a good solution long-term, but I see how that could help with adoption, so I’m all for it.
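
Point 1 above (catching these errors at authoring time) can be sketched roughly as follows. This assumes Python's standard `xml.etree.ElementTree`, checks only XML well-formedness (not XHTML validity against any schema), and uses a made-up function name, `well_formedness_error`. It is nowhere near a real language server; it only illustrates that the fatal error comes with a position an editor could underline.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(source):
    """Return None if source is well-formed XML, else (line, column, message)."""
    try:
        ET.fromstring(source)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple pointing at the problem.
        line, column = err.position
        return (line, column, str(err))
    return None


# Hypothetical documents: the second one forgets to close its <li> elements.
print(well_formedness_error("<ul><li>item</li></ul>"))
print(well_formedness_error("<ul><li>item<li></ul>"))
```

An editor integration would run a check like this on every change and turn the tuple into a diagnostic instead of waiting for the browser to refuse the page.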

jancsika 2 days ago

Interviewer: "So what's your professional opinion about this annoying popup that blocks the main thread?"

Interviewee: "I see how that could help with adoption, at least until we are able to migrate to a red single-line error with a long-ass-scrollbar that displays instead of rendering any of the page at all."

Interviewer: "Thanks. We'll be in touch."

notpushkin a day ago

I'm assuming it's just a joke for the sake of a joke, but please re-read my comment and GP's. This is not a solution we're discussing.

> Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize.

– https://news.ycombinator.com/newsguidelines.html

wizzwizz4 2 days ago

> you don’t expect the browser to run malformed JavaScript

Refer to https://tc39.es/ecma262/2024/multipage/ecmascript-language-l... (extract below)

---

In contrast, the source

  { 1
  2 } 3

is also not a valid ECMAScript sentence, but is transformed by automatic semicolon insertion into the following:

  { 1
  ;2 ;} 3;

which is a valid ECMAScript sentence.

notpushkin a day ago

That's a weird wording, huh. I would interpret it not as "yeah, this is invalid code, but we fix it for you", but rather as "this code would be invalid if not for this weird exception we carve out for it".

I would love to say “let's abolish ASI too”, but (1) some folks genuinely use it assuming that it's a syntax feature and not a compat thing, and (2) it's a compat thing, meaning it's not getting removed without a very valid reason. A shame, though.

But JS parsing is still undeniably way more strict than HTML, no?

wizzwizz4 a day ago

How about Appendix B? https://tc39.es/ecma262/2024/multipage/additional-ecmascript...

> These features are not considered part of the core ECMAScript language. Programmers should not use or assume the existence of these features and behaviours when writing new ECMAScript code. ECMAScript implementations are discouraged from implementing these features unless the implementation is part of a web browser or is required to run the same legacy ECMAScript code that web browsers encounter.

Some of these are legacy APIs, but some of them are just plain syntax errors.

ChrisArchitect 2 days ago

Related from same creator, referenced in this article:

The Static Site Paradox

https://news.ycombinator.com/item?id=41775238

petesergeant 2 days ago

This looks awesome but I don’t really understand the complexities that mean this is new in late 2024 and wasn’t one of the first LSPs implemented? Can someone explain?

Semaphor 2 days ago

This is not plain HTML. It is HTML, but without all the shortcuts the spec allows.

In HTML, <li>item<li> is correct (it gets parsed as <li>item</li><li></li> I think). In SuperHTML it throws an error. Essentially, this is an opinionated subset of HTML, and no one cared enough before.
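
As a rough illustration of that kind of strictness (this is not SuperHTML's actual implementation, and the names `StrictTagChecker` and `find_tag_errors` are made up), here is a naive sketch using Python's standard `html.parser`. It keeps a stack of open elements and complains about anything the HTML spec would have closed implicitly.

```python
from html.parser import HTMLParser

# Void elements never take an end tag, so they are never pushed on the stack.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class StrictTagChecker(HTMLParser):
    """Collects errors for end tags that do not match the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.stack = []   # (tag, (line, column)) for each element still open
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.stack.append((tag, self.getpos()))

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
        else:
            line, column = self.getpos()
            self.errors.append(f"{line}:{column}: unexpected </{tag}>")


def find_tag_errors(source):
    """Return a list of error strings; an empty list means every tag is paired."""
    checker = StrictTagChecker()
    checker.feed(source)
    checker.close()
    for tag, (line, column) in checker.stack:
        checker.errors.append(f"{line}:{column}: <{tag}> is never closed")
    return checker.errors


for message in find_tag_errors("<li>item<li>"):
    print(message)
```

A browser accepts `<li>item<li>` without complaint; this sketch reports both `<li>` elements as unclosed, which is the sort of "probably a typo" diagnostic being discussed.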

kristoff_it 2 days ago

It's not just that, I've never seen any other HTML language server produce diagnostics at all. All the major editors use the VSCode stock HTML language server and that thing just doesn't produce any diagnostic at all.

Semaphor 2 days ago

Huh. Wow. I don’t normally use VSCode, and you are correct. JetBrains IDEs correctly report some nonsense like <p><div><img></ul>test</p> as an error, but VSCode does nothing. That’s really weird.

afavour 2 days ago

…so it’s XHTML?

taeric 2 days ago

Can seem similar, but realize that the parsing rules for HTML are far more complicated than you'd think from this simple example. It is the idea of when tags are automatically closed that is odd, here. Consider how `<li><div>Hello<li>world. How <div/>Does this</li> work?` gets turned into elements.

The common trap many run into is that `<div/>` essentially closes at the next boundary.

smaudet 20 hours ago

Not sure why I got down-voted on my other reply, but basically this.

XML != HTML, not sure what is controversial to say about that.

rozab 2 days ago

There is an LSP from Microsoft, it just doesn't report this sort of syntax error (presumably for the sake of people doing templating etc)

Semaphor 2 days ago

> doesn't report this sort of syntax error

It’s because it’s not a syntax error, but a legal construct. SuperHTML disallows things HTML allows.

OtomotO 2 days ago

So it should be called SubHTML instead?!

kristoff_it 2 days ago

I've never seen the language server that ships with VSCode produce any kind of diagnostic whatsoever, so I think it just doesn't implement any error checking.

smaudet 2 days ago

xml != html so you can't just plug an xml parser and a schema together and call it quits.

Plus, there are things like browser support, relative paths, and content embedding to consider, which always made dedicated products like Adobe's (Macromedia's?) Dreamweaver hit-or-miss (on top of being a very clunky way to edit some tiny text files).

As for why it took so long... well, nobody cared enough, and everyone "just used scripts", i.e. Django, WordPress, Ruby on Rails, etc., to get that "good enough" experience plus access to testing. There's no reason this couldn't have been done back in 2016, when LSPs were first starting to come into being...

IshKebab 2 days ago

Not many people write HTML by hand and I guess nobody bothered. I don't think there's a secret reason but you do need someone to actually do it and nobody bothered until now.

Where's the Makefile LSP server? (Ok technically there is one but it barely does anything.)

mediumsmart 2 days ago

I write sites in html and the only thing I need js for is a nav overlay for the phone when they have a nav. And I adore my typos.

AlienRobot 2 days ago

>This language server is stricter than the HTML spec whenever it would prevent potential human errors from being reported.

It doesn't support implicitly closing <li>, <p>, <dd>, <dt>, <head>, <body> tags which is one of the best features of HTML if you're handwriting it. I assume it doesn't support implicitly opening <html>, <head>, and <body> either. Why do I have to skip indenting those levels when I can simply not open the tags?

HTML is a terrible language and I'd rather write some strict superset of it that includes template tags to make it functional. If HTML features are treated as errors, that's not a superset.

Am4TIfIsER0ppos 2 days ago

I don't suppose there is a Vim version, is there?

krig 2 days ago

I’m using it in neovim right now. For vim, I guess you need an LSP plugin, but it should work fine.

kristoff_it 2 days ago

SuperHTML can be used as a normal CLI tool; if you can integrate it with Vim to run it on save (and ideally parse the error trace), then you're all set.

WhyNotHugo 2 days ago

The tool is editor agnostic. You can use an existing Vim LSP client to use SuperHTML's LSP implementation. In theory this would work with things like ALE, it just needs to be integrated.

TekMol 2 days ago

    the following snippet is correct HTML [...] 
    will still be reported as an error by
    SuperHTML [...] it's most probably a typo

Bad. A tool should do one thing and do it well. A tool to verify HTML should verify HTML, not report errors where there are none.

kristoff_it 2 days ago

This is not a tool to verify HTML, it's a tool to support the act of handwriting HTML, and that's what it does :^)