Why can't HTML alone do includes?

(frontendmasters.com)

312 points | by susam a day ago

261 comments

  • stevage 2 minutes ago
  • dwheeler 15 hours ago

    HTML was historically an application of SGML, and SGML could do includes. You could define a new "entity", and if you created a "system" entity, you could refer to it later and have it substituted in.

        <!DOCTYPE html example [
          <!ENTITY myheader SYSTEM "myheader.html">
        ]>
        ....
        &myheader;
    
    SGML is complex, so various efforts were made to simplify HTML, and that's one of the capabilities that was dropped along the way.
    • int_19h 12 hours ago

      We also had a brief detour into XML with XHTML, and XML has XInclude, although it's not a required feature.

      • tannhaeuser 4 hours ago

        The XML subset of SGML still includes most forms of entity usage SGML has, including external general entities as described by grandparent. XInclude can include any fragment, not just a complete document, but apart from that was redundant, and what remains of XInclude in HTML today (<svg href=...>) doesn't make use of fragments and also does away with the xinclude and other namespaces. For reusing fragments OTOH, SVG has the more specific <use href=...> construct. XInclude also worked really badly in the presence of XML Schema.

      • echelon 9 hours ago

        It's too bad we didn't go down the XHTML/semantic web route twenty years ago.

        Strict documents, reusable types, microformats, etc. would have put search into the hands of the masses rather than kept it in Google's unique domain.

        The web would have been more composable and P2P. We'd have been able to slurp first class article content, comments, contact details, factual information, addresses, etc., and build a wealth of tooling.

        Google / WhatWG wanted easy to author pages (~="sloppy markup, nonstandard docs") because nobody else could "organize the web" like them if it was disorganized by default.

        Once the late 2010s came to pass, Google's need for the web started to wane. They directly embed lifted facts into the search results, tried to push AMP to keep us from going to websites, etc.

        Google's decisions and technologies have been designed to keep us in their funnel. Web tech has been nudged and mutated to accomplish that. It's especially easy to see when the tides change.

        • kweingar 7 hours ago

          I don't think there was ever a sustainable route to a semantic web that would work for the masses.

          People wanted to write and publish. Only a small portion of people/institutions would have had the resources or appetite to tag factual information on their pages. Most people would have ignored the semantic taxonomies (or just wouldn't have published at all). I guess a small and insular semantic web is better than no semantic web, but I doubt there was a scenario where the web would have been as rich as it actually became, but was also rigidly organized.

        • tannhaeuser 3 hours ago

          The "semantic" part was what eventually became W3C's RDF stuff (a pet peeve of TBL's predating even the Web). When people squeeze poetry, threaded discussion, and other emergent text forms into a vocabulary for casual academic publishing and call that "semantic HTML", that still doesn't make it semantic.

          The "strict markup" part can be (and always could be) had using SGML which is just a superset of XML that also supports HTML empty elements, tag inference, attribute shortforms, etc. HTML was invented as SGML vocabulary in the first place.

          Agree though that Google derailed any meaningful standardization effort for the reasons you stated. Actually, it started already with CSS and the idiocy to pile yet another item-value syntax over SGML/HTML, when it already has attributes for formatting. The "semantic HTML" postulate is kind of just an after-the-fact justification for insane CSS complexity that could grow because it wasn't part of HTML proper and the scrutiny that goes with introducing new elements or attributes with it.

        • geoffmunn 6 hours ago

          The thing I liked the most about XHTML was how it enforced strict notation.

          Elements had to be used in their pure form, and CSS was for all visual presentation.

          It really helped me understand and be better at web development - getting the tick from the XHTML validator was always an achievement for complicated webpages.

        • int_19h 3 hours ago

          Me personally, I didn't even care that much about strict semantic web, but XML has the benefits of the entire ecosystem around it (like XPath and XSLT), composable extensibility in form of namespaces etc. It was very frustrating to see all that thrown out with HTML5, and the reasoning never made any sense to me (backwards compatibility with pre-XHTML pages would be best handled by defining a spec according to which they should be converted to XHTML).

        • riffraff 7 hours ago

          I kinda agree with you but I'd argue the "death" of microformats is unrelated to the death of XHTML (tho schema.org is still around).

          You could still use e.g. hReview today, but nobody does. In the end the problem of microformats was that "I want my content to be used outside my web property" is something nobody wants, beyond search engines that are supposed to drive traffic to you.

          The fediverse is the only chance of reviving that concept because it basically keeps attribution around.

        • safety1st an hour ago

          I'm as big a critic of Google as anyone, but I'm always surprised at modern day takes around the lost semantic web technologies - they are missing facts or jumping to conclusions in hindsight.

          Here's what people should know.

          1) The failure of XHTML was very much a multi-vendor, industry-wide affair; the problem was that the syntax of XML was stricter than the syntax of HTML, and the web was already littered with broken HTML that the browser vendors all had to implement layers of quirk handling to parse. There was simply no clear user payoff for moving to the stricter parsing rules of XML and there was basically no vendor who wanted to do the work. To my memory Google does not really stand out here, they largely avoided working on what was frequently referred to as a science project, like all the other vendors.

          2) In subsequent years, Google has actually delivered a semantic web of sorts: https://developers.google.com/search/docs/appearance/structu...

          A few things stand out as interesting. First of all, the old semantic web never had a business case. JSON-LD Structured Data does: Google will parse your structured data and use it to inform the various snippets, factoids, previews and interactive widgets they show all over their search engine and other web properties. So as a result JSON-LD has taken off massively. Millions of websites have adopted it. The data is there in the document. It is just in a JSON-LD section. If you work in SEO you know all about this. Seems to be quite rare that anyone on Hacker News is aware of it however.

          Second interesting thing, why did we end up with the semantic data being in JSON in a separate section of the file? I don't know. I think everyone just found that interleaving it within the HTML was not that useful. For the legacy reasons discussed earlier, HTML is a mess. It's difficult to parse. It's overloaded with a lot of stuff. JSON is the more modern thing. It seems reasonable to me that we ended up with this implementation. Note that Google does have some level of support for other semantic data, like RDFa which I think is directly in the HTML - it is not popular.

          Which brings us to the third interesting thing: the JSON-LD schemas Google uses are standards, or at least... standard-y. The W3C is involved. Google, Yahoo, Yandex and Microsoft have made the largest contributions to my knowledge. You can read all about it on schema.org.

          TL;DR - XHTML was not a practical technology and no browser or tool vendor wanted to support it. We eventually got the semantic web anyway!
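
          For anyone who hasn't looked at one, the structured data section described above (JSON-LD, media type application/ld+json) is just a script element carrying a JSON object that uses schema.org vocabulary. A minimal sketch of generating one follows. The helper name `jsonLdScript` and the article values are made up for illustration, and which properties a search engine actually requires is not covered here.

          ```javascript
          // Hypothetical helper: render a schema.org object as a JSON-LD script element.
          function jsonLdScript(data) {
            // Escape "<" so a string value such as "</script>" cannot end the element early.
            const json = JSON.stringify(data).replace(/</g, '\\u003c');
            return '<script type="application/ld+json">' + json + '</script>';
          }

          // Illustrative values only; "@context" and "@type" are the standard JSON-LD keys.
          const article = {
            '@context': 'https://schema.org',
            '@type': 'Article',
            headline: "Why can't HTML alone do includes?",
            author: { '@type': 'Person', name: 'Example Author' },
          };

          const tag = jsonLdScript(article);
          console.log(tag);
          ```

          The output sits anywhere in the page, separate from the visible markup, which is the "separate section of the file" arrangement discussed above.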

        • DemocracyFTW2 5 hours ago

          As someone who worked in the field of "semantic XML processing" at the time I can tell you that while the "XML processing" part was (while full of unnecessary complications) well understood, the "semantic" part was purely aspirational and never well understood. The common theme with the current flurry of LLMs and their noisy proponents is that it is, in both cases, possible to do worthwhile and impressive demos with these technologies and also real applications that do useful things, but people who have their feet on the ground know that XML doesn't engender "semantics" and LLMs are not "conscious". Yet the hype meddlers keep the fire burning by suggesting that if you just do "more XML" and build bigger LLMs, then at some point real semantics and actual consciousness will somehow emerge like a hatching chicken from the egg. And, being emergent properties, who is to say semantics and consciousness will not emerge, at some point somehow? A "heap" of grains is emergent after all, and so is the "wetness" of water. But I have strong doubts about XHTML being more semantic than HTML5.

          And anyway, even if Google had nefarious intentions and even if they managed to steer the standardization, one has also to concede that all search engines before Google were encumbered by too much structure, too rigid approaches. When you were looking for a book in a computerized library at that point it was standard to be sat in front of a search form with many, many fields; one for the author's name, one for the title and so forth, and searching was not only a pain, it was also very hard to do for a user without prior training. Google had demonstrated it could deliver far better results with a single short form field filled out by naive users that just plonked down three or five words that were on their mind et voila. They made it plausible that instead of imposing a structure onto data at creation time maybe it's more effective to discover associations in the data at search time (well, at indexing time really).

          As for the strictness of documents, I'm not sure what it will give you that we don't get with sloppy documents. OK, web browsers could refuse to display a web page if any one image tag is missing the required `alt` attribute. So now what happens, will web authors duly include alt="picture of a cat" for each picture of a cat? Maybe, to a degree, but the other 80% of alt attributes will just contain some useless drivel to appease the browser. I'm actually more for strict documents than I used to be, but on the other hand we (I mean web browsers) have become quite good at reconstructing usable HTML documents from less-than-perfect sources, and the reconstructed source is also a strictly validating source. So I doubt this is the missing piece; I think the semantic web failed because the idea never was strong, clear, compelling, well-defined and rewarding enough to catch on with enough people.

          If we're honest, we still don't know, 25 years later, what 'semantic' means after all.

        • tsimionescu 6 hours ago

          The semantic web is a silly dream of the 90s and 00s. It's not a realizable technology, and Google basically showed exactly why: as soon as you have a fixed algorithm for finding pages on the web, people will start gaming that algorithm to prioritize their content over others'. And I'm not talking about malicious actors trying to publish malware, but about every single publisher that has the money to invest in figuring out how and doing it.

          So any kind of purely algorithmic, metadata based retrieval algorithm would very quickly return almost pure garbage. What makes actual search engines work is the constant human work to change the algorithm in response to the people who are gaming it. Which goes against the idea of the semantic web somewhat, and completely against the idea of a local-first web search engine for the masses.

    • lkuty 2 hours ago

      It also existed in the DTDs (Document Type Definitions) used with HTML 4 and below, and with XML. Came from SGML too, I guess.

    • j45 11 hours ago

      Neat reference, going to look into that.

      The <object> tag appears to include/embed other html pages.

      An embedded HTML page:

      <object data="snippet.html" width="500" height="200"></object>

      https://www.w3schools.com/tags/tag_object.asp

      • nephyrin 7 hours ago

        <object> used like this is just a poor iframe in a much shakier spot in the standards, mostly for backwards compatibility.

        Like iframe, it "includes" a full subdocument as a block element, which isn't quite what the OP is hinting at.

        • gpvos 6 hours ago

          Sounds like it's good enough for headers and footers, which is 80% of what people need.

          • masklinn 5 hours ago

            That’s what lots of sites used to do in the late 90s and early aughts in order to have fixed elements.

            It was really shit. Browser navigation cues disappear, minor errors will fuck up the entire thing by navigating fixed element frames instead of contents, design flexibility disappears (even as consistent styling requires more efforts), frames don’t content-size so will clip and show scroll bars all over, debugging is absolute ass, …

            And it increases resource use.

    • timewizard 9 hours ago

      Well, that is an entire attack surface on its own.

      https://en.wikipedia.org/wiki/Billion_laughs_attack
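
      To make the risk concrete: entity substitution is recursive, so a few hundred characters of definitions can expand to gigabytes. The sketch below only counts the expanded length instead of building the string. The entity table copies the shape of the classic attack, `expandedLength` is a made-up name rather than part of any parser, and there is no cycle check because XML does not allow an entity to reference itself.

      ```javascript
      // Entity table shaped like the classic example: each level references the
      // previous level ten times. Names and sizes are illustrative.
      const entities = { lol0: 'lol' };
      for (let i = 1; i <= 9; i++) {
        entities['lol' + i] = ('&lol' + (i - 1) + ';').repeat(10);
      }

      // Compute the length of the fully expanded text without materializing it.
      function expandedLength(text, table, memo = new Map()) {
        const pattern = /&([A-Za-z0-9]+);/g;
        let total = 0;
        let last = 0;
        let match;
        while ((match = pattern.exec(text)) !== null) {
          total += match.index - last; // literal text before the reference
          const name = match[1];
          if (Object.prototype.hasOwnProperty.call(table, name)) {
            if (!memo.has(name)) {
              memo.set(name, expandedLength(table[name], table, memo));
            }
            total += memo.get(name);
          } else {
            total += match[0].length; // unknown reference stays as literal text
          }
          last = pattern.lastIndex;
        }
        return total + (text.length - last);
      }

      const size = expandedLength('&lol9;', entities);
      console.log(size); // 3000000000
      ```

      A six-character document body turning into three billion characters is why parsers that support entities usually cap expansion size or depth.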

  • dimal 21 hours ago

    This was the rabbit hole that I started down in the late 90s and still haven’t come out of. I was the webmaster of the Analog Science Fiction website and I was building tons of static pages, each with the same header and side bar. It drove me nuts. So I did some research and found out about Apache server side includes. Woo hoo! Keeping it DRY (before I knew DRY was a thing).

    Yeah, we’ve been solving this over and over in different ways. For those saying that iframes are good enough, they’re not. Iframes don’t expand to fit content. And server side solutions require a server. Why not have a simple client side method for this? I think it’s a valid question. Now that we’re fixing a lot of the irritation in web development, it seems worth considering.

    • EvanAnderson 9 hours ago

      Server-side includes FTW! When a buddy and I started making "web stuff" back in the mid-90s the idea of DRY also just made sense to us.

      My dialup ISP back then didn't disable using .htaccess files in the web space they provided to end users. That meant I could turn on server-side includes! Later I figured out how to enable CGI. (I even went so far as to code rudimentary webshells in Perl just so I could explore the webserver box...)

    • matchagaucho 13 hours ago

      I've become a fan of https://htmx.org for this reason.

      A small 10KB lib that augments HTML with the essential good stuff (like dynamic imports of static HTML)

      • HumanOstrich 9 hours ago

        Seems like overkill to bring in a framework just for inlining some static html. If that's all you're doing, a self-replacing script tag is neat:

            <script>
              function includeHTML(url) {
                const s = document.currentScript
                fetch(url).then(r => r.text()).then(h => {
                  s.insertAdjacentHTML('beforebegin', h)
                  s.remove()
                })
              }
            </script>
        
        ...

            <script>
              includeHTML('/footer.html')
            </script>
        
        The `script` element is replaced with the html from `/footer.html`.
        • librasteve 2 hours ago

          this here is the main idea of HTMX - extended to work for any tag p, div, content, aside …

          there are many examples of HTMX (since it is self-contained and tiny) being used alongside existing frameworks

          of course for some of us, since HTMX brings dynamic UX to back end frameworks, it is a way of life https://harcstack.org (warning - raku code may hurt your eyes)

    • unilynx 16 hours ago

      > Iframes don’t expand to fit content

      Actually, that was part of the original plan - https://caniuse.com/iframe-seamless

      • omneity 16 hours ago

        I used the seamless attribute extensively in the past, it still doesn't work the way GP intended, which is to fit in the layout flow, for example to take the full width provided by the parent, or automatically resize the height (the pain of years of my career)

        It worked rather like a reverse shadow DOM, allowing CSS from the parent document to leak into the child, removing borders and other visual chrome that would make it distinguishable from the host, except you still had to use fixed CSS layouts and resize it with JS.

    • codr7 17 hours ago

      The optimal solution would be using a template engine to generate static documents.

      • JadeNB 16 hours ago

        > The optimal solution would be using a template engine to generate static documents.

        This helps the creator, but not the consumer, right? That is, if I visit 100 of your static documents created with a template engine, then I'll still be downloading some identical content 100 times.

        • iainmerrick 2 hours ago

          > I'll still be downloading some identical content 100 times.

          That doesn't seem like a significant problem at all, on the consumer side.

          What is this identical content across 100 different pages? Page header, footer, sidebar? The text content of those should be small relative to the unique page content, so who cares?

          Usually most of the weight is images, scripts and CSS, and those don't need to be duplicated.

          If the common text content is large for some reason, put the small dynamic part in an iframe, or swap it out with javascript.

          If anyone has a genuine example of a site where redundant HTML content across multiple pages caused significant bloat, I'd be interested to hear about it.

        • codr7 12 hours ago

          True for any server side solution, yes.

          On the other hand it means less work for the client, which is a pretty big deal on mobile.

        • Klaster_1 7 hours ago

          Compression Dictionary Transport [0] seems like something that can potentially address this. If you squint, this looks almost like XSLT.

          [0] https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Com...

        • giantrobot 15 hours ago

          XSLT solved this problem. But it had poor tool support (DreamWeaver etc) and a bunch of anti-XML sentiment I assume as blowback from capital-E Enterprise stacks going insane with XML for everything.

          XSLT did exactly what HTML includes could do and more. The user agent could cache stylesheets or if it wanted override a linked stylesheet (like with CSS) and transform the raw data any way it wanted.

          • px1999 12 hours ago

            The Umbraco CMS was amazing during the time that it used and supported XSLT.

            While it evaluated the xslt serverside it was a really neat and simple approach.

      • keeganpoppen 16 hours ago

        macros!

    • rbanffy 7 hours ago

      > Woo hoo! Keeping it DRY (before I knew DRY was a thing)

      I still remember the script I wrote to replace thousands (literally) of slightly different headers and footers in some large websites of the 90s. How liberating to finally have that.

    • econ 21 hours ago

      You can message the page dimensions to the parent. To do it x domain you can load the same url into the parent with the height in the #location hash. It won't refresh that way.
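
      A minimal sketch of the first approach, using `window.postMessage`. The message shape (`type: 'frame-height'`), the function names, and the example origins are assumptions for illustration. The browser wiring is left in comments so the two helpers can be read on their own.

      ```javascript
      // Inside the framed page: build the message describing its own height.
      function heightMessage(doc) {
        return { type: 'frame-height', height: doc.documentElement.scrollHeight };
      }

      // In the parent page: apply a received message to the iframe element.
      // Returns true when the message was accepted.
      function applyHeight(iframe, event, expectedOrigin) {
        if (event.origin !== expectedOrigin) return false; // ignore other senders
        const data = event.data;
        if (!data || data.type !== 'frame-height') return false;
        if (!Number.isFinite(data.height) || data.height < 0) return false;
        iframe.style.height = data.height + 'px';
        return true;
      }

      // Browser wiring (not run here):
      //   child:  parent.postMessage(heightMessage(document), 'https://parent.example');
      //   parent: window.addEventListener('message', (e) =>
      //             applyHeight(document.querySelector('iframe'), e, 'https://child.example'));
      ```

      The child also has to re-send the message whenever its content changes size, which is part of why this never feels like a real include.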

      • dimal 20 hours ago

        I know it’s possible to work around it, but that’s not the point. This is such a common use case that it seems worthwhile to pave the cowpath. We’ve paved a lot of cowpaths that are far less trodden than this one. This is practically a cow superhighway.

        We’ve built an industry around solving this problem. What if, for some basic web publishing use cases, we could replace a complex web framework with one new tag?

        • econ 17 hours ago

          I couldn't agree more.

          <div src="foo.txt"></div>

          • wizzwizz4 15 hours ago

            https://www.w3.org/TR/xhtml2/introduction.html

            > XHTML 2 takes a completely different approach, by taking the premise that all images have a long description and treating the image and the text as equivalents. In XHTML 2 any element may have a @src attribute, which specifies a resource (such as an image) to load instead of the element.

    • fooker 13 hours ago

      > Why not have a simple client side method for this?

      Like writing a line of js?

      • sbarre 12 hours ago

        A line of JS that has to run through the Javascript interpreter in your browser rather than a simple I/O operation?

        If internally this gets optimized to a simple I/O operation (which it should) then why add the JS indirection in the first place?

      • rbanffy 7 hours ago

        A block of in-line JavaScript stops the renderer until it runs because its output cannot be determined before it completes.

        • bawolff 3 hours ago

          So would any form of html inclusion.

      • DemocracyFTW2 4 hours ago

        The difference between "a line of JS" and a standardized declarative solution is of course that a meek "line of $turing_complete_language" cannot, in the general case, be known and trusted to do what it purports to do, and nothing else; you've basically enabled any kind of computation, and any kind of behavior. With an include tag or attribute that's different; its behavior is described by standards, and (except for knowing what content we might be pulling in) we can 100% tell the effects from static analysis, that is, without executing the code. With "a line of JS" the only way, in the general case, to know what it does is to run it (an infinite number of times). Also, because it's not standardized, it's much harder to save to disk, to index and to archive it.

    • atoav 19 hours ago

      I mean, in 1996's Netscape you could do this (I run the server for a website that still uses this):

          <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
          <html>
            <frameset cols="1000, *">
              <frame src="FRAMESET_navigation.html" name="navigation">
              <frame src="FRAMESET_home.html" name="in">
            </frameset>
          </html>
        
      The thing that always bugged me about frames is that they are too clever. I don't want to reload only the frame html when I rightclick and reload. Sure the idea was to cache those separately, but come on — frames and caching are meant to solve two different problems and by munching them together they somewhat sucked at solving either.

      To me includes for HTML should work in the dumbest way possible. And that means: Take the text from the include and paste it where the include was and give the browser the resulting text.

      If you want to cache a nav section separately because it appears the same on every page lets add a cache attribute that solves the problem independently:

        <nav cache-id="deadbeefnav666">
          <some-content></etc>
        </nav>
        
      To tell the browser it should load the inner html or the src of that element from cache if it has it.

      Now you could convince me that the include should allow for more, but it being dumb is a feature not a bug.
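
      The "dumbest way possible" is easy to state in code: replace each include marker with the text of the named fragment and hand the result to the parser. In this sketch the `<include src="...">` tag is hypothetical (no browser implements it), `loadText` stands in for however the fragment gets read, and the depth limit is an added safeguard so a fragment that includes itself fails instead of looping.

      ```javascript
      // Paste the text of each referenced fragment where its marker was.
      // loadText(path) is an assumed callback returning the fragment as a string.
      function expandIncludes(html, loadText, depth = 0) {
        if (depth > 10) throw new Error('includes nested too deeply');
        // Hypothetical marker: <include src="path"></include> or <include src="path">
        const marker = /<include\s+src="([^"]+)"\s*>(?:\s*<\/include>)?/g;
        return html.replace(marker, (whole, path) => {
          const text = loadText(path);
          if (typeof text !== 'string') throw new Error('missing include: ' + path);
          return expandIncludes(text, loadText, depth + 1);
        });
      }

      // In-memory stand-in for files on disk or responses from the network.
      const files = {
        'nav.html': '<nav><a href="/">Home</a></nav>',
        'header.html': '<header><include src="nav.html"></include></header>',
      };

      const page = expandIncludes(
        '<body><include src="header.html"></include><p>Hi</p></body>',
        (path) => files[path],
      );
      console.log(page);
      // <body><header><nav><a href="/">Home</a></nav></header><p>Hi</p></body>
      ```

      Everything hard about a native version (load ordering, caching, cross-origin rules) lives outside these few lines, in deciding when and from where `loadText` may read.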

      • lodovic 6 hours ago

        Nitpick: the HTML4 spec was released in December 1997, and HTML4.01 only in December 1999, so it probably wouldn't have run in 1996's Netscape.

        • JimDabell 4 hours ago

          The doctype doesn’t matter in this context. Netscape Navigator 2 supported frames in 1995 and would render that page.

        • cobbaut 4 hours ago

          Back then it was common for Netscape to have features that (years) later became standard HTML.

    • api 17 hours ago

      The web seems like it was deliberately designed to make any form of composability impossible. It’s one of the worst things about it as a platform.

      I’m sure some purist argument has driven this somewhere.

      • giantrobot 14 hours ago

        I look back longingly at the promise of XML services in the early days of Web 2.0. Before the term just meant JavaScript everywhere.

        All sorts of data could be linked together to display or remix by user agents.

    • luotuoshangdui 19 hours ago

      HTML is a markup language, not a programming language. It's like asking why Markdown can't handle includes. Some Markdown editors support them (just like some server-side tools do for HTML), but not all.

      • franga2000 18 hours ago

        Including another document is much closer to a markup operation than a programming operation. We already include styles, scripts, images, videos, fonts...why not document fragments?

        Markdown can't do most of those, so it makes more sense why it doesn't have includes, but I'd still argue it definitely should. I generally dislike LaTeX, but about the only thing I liked about it when writing my thesis was that I could have each chapter in its own file and just include all of them in the main file.

      • dimal 18 hours ago

        This isn’t programming. It’s transclusion[0]. Essentially, iframes and images are already forms of transclusion, so why not transclude html and have the iframe expand to fit the content?

        As I wrote that, I realized there could be cumulative layout shift, so that’s an argument against. To avoid that, the browser would have to download all transcluded content before rendering. In the past, this would have been a dealbreaker, but maybe it’s more feasible now with http multiplexing.

        [0] https://en.m.wikipedia.org/wiki/Transclusion#Client-side_HTM...

        • PoignardAzur 18 hours ago

          With Early Hints (HTTP code 103), it seems especially feasible. You can start downloading the included content one round-trip after the first byte is sent.
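
          A rough sketch of what that could look like on a Node.js server. `writeEarlyHints` is Node's built-in method for sending a 103 response (Node 18.11 or later); `includeHints`, `sendIncludeHints`, the fragment URLs, and the choice of `as=fetch` are illustrative assumptions. Since no native include exists, this assumes the fragments are later read by a script using `fetch()`.

          ```javascript
          // Build Link header values asking the browser to preload HTML fragments.
          function includeHints(urls) {
            return urls.map((url) => '<' + url + '>; rel=preload; as=fetch; crossorigin');
          }

          // Send 103 Early Hints before the real response is ready.
          // `res` is expected to be a Node http.ServerResponse.
          function sendIncludeHints(res, urls) {
            res.writeEarlyHints({ link: includeHints(urls) });
          }

          // Server wiring (not run here):
          //   http.createServer((req, res) => {
          //     sendIncludeHints(res, ['/header.html', '/footer.html']);
          //     res.end(renderPage());   // renderPage is a placeholder
          //   });
          ```

          The hints go out before the server has produced any of the page itself, so the fragment requests can overlap with page generation.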

      • lenkite 17 hours ago

        Well, AsciiDoc, a markup language, supports includes, so the "markup languages" analogy doesn't hold.

        https://docs.asciidoctor.org/asciidoc/latest/directives/incl...

      • crazygringo 17 hours ago

        I think this is the most likely answer.

        I'm not defending it, because when I started web development this was one of the first problems I ran into as well -- how the heck do you include a common header.

        But the original concept of HTML was standalone documents, not websites with reusable components like headers and footers and navbars.

        That being said, I still don't understand why then the frames monstrosity was invented, rather than a basic include. To save on bandwidth or something?

        • int_19h 12 hours ago

          The original concept of HTML was as an SGML subset, and SGML had this functionality, precisely because it's very handy for document authoring to be able to share common snippets.

        • giantrobot 14 hours ago

          Frames were widely abused by early web apps to do dynamic interfaces before XHR was invented/widely supported. The "app" had a bunch of sub-frames with all the links and forms carefully pointing to different frames in the frameset.

          A link in a sidebar frame would open a link in the "editor" frame which loaded a page with a normal HTML form. Submitting the form reloaded it in that same frame. Often the form would have multiple submit buttons, one to save edits in progress and another to submit the completed form and move to the next step. The current app state was maintained server side and validation was often handled there save for some basic formatting client side JavaScript could handle.

          This setup allowed even the most primitive frame-supporting browsers to use CRUD web apps. IIRC early web frameworks like WebObjects leaned into that model of web app.

          • crazygringo 13 hours ago

            Oh my goodness, yes you're right, I'd forgotten entirely about those.

            They were horrible -- you'd hit the back button and only one of the frames would go back and then the app would be in an inconsistent state... it was a mess!

            • giantrobot 12 hours ago

              You needed to hit the reset button (and hoped it worked) and never the back button! Yes, I suffered through early SAP web apps built entirely with frames and HTML forms. It was terrible.

              I don't love JavaScript monstrosities but XHR and dynamic HTML were a vast improvement over HTML forms and frame/iframe abuse.

              • cr125rider 8 hours ago

                To be fair, modern SAP web apps are also terrible.

        • mattl 14 hours ago

          A lot of early HTML was about taking the output of a different system such as a mainframe and putting that output into HTML.

          Lots of gateways between systems.

      • paulddraper 19 hours ago

        That’s the Hyper part of HTML, and what makes it special.

        It’s made to pull in external resources (as opposed to other document formats like PDF).

        Scripts, stylesheets, images, objects, favicons, etc. HTML is thematically similar.

        • ummonk 17 hours ago

          No, HTML is fundamentally different because (for a static site without any JS dom manipulation) it has all the semantic content, while stylesheets, images, objects, etc. are just about presentation.

          • paulddraper 12 hours ago

            Images are content. Videos are content. Objects/iframes are content.

            The only one that is presentational is stylesheets.

          • Aloisius 17 hours ago

            Iframes exist.

      • actinium226 16 hours ago

        Markdown doesn't have this common HTML pattern of wanting to include a header/footer in all pages of a site.

  • throwup238 20 hours ago

    The feature proposal was called HTML Imports [1], created as part of the Web Components effort.

    > HTML Imports are a way to include and reuse HTML documents in other HTML documents

    There were plans for <template> tag support and everything.

    If I remember correctly, Google implemented the proposed spec in Blink but everyone else balked for various reasons. Mozilla was concerned with the complexity of the implementation and its security implications, as well as the overlap with ES6 modules. Without vendor support, the proposal was officially discontinued.

    [1] https://www.w3.org/TR/html-imports/

    • xg15 15 hours ago

      That matches with the comment [1] on the article, citing insufficient demand, no vendor enthusiasm, etc.

      The thing is that all those are non-reasons that don't really explain anything: Low demand is hard to believe if this feature is requested for 20 years straight and there are all kinds of shim implementations using scripts, backend engines, etc. (And low demand didn't stop other features that the vendors were interested in for their own reasons)

      Vendor refusal also doesn't explain why they refused it, even to the point of rolling back implementations that already existed.

      So I'd be interested to understand the "various reasons" in more detail.

      "Security implications" also seem odd as you already are perfectly able to import HTML cross origin using script tags. Why is importing a script that does document.write() fine, but a HTML tag that does exactly the same thing hugely problematic?

      (I understand the security concern that you wouldn't want to allow something like "<import src=google.com>" and get an instant clone of the Google homepage. But that issue seems trivially solvable with CORS.)

      [1] https://frontendmasters.com/blog/seeking-an-answer-why-cant-...

      • athrowaway3z 3 hours ago

        That is a bit of a large ask.

        There are various specs/semantics you can choose, which prescribe the implementation & required cross-cutting complexity. Security is only relevant in some of them.

        To give you some idea:

        - HTML load ordering is a pretty deeply held assumption. People understand JS can change those assumptions (document.write). Adding an obscure HTML tag that does so is going to be an endless parade of bugs & edge cases.

        - To keep top-to-bottom fast we could define preload semantics (Dropping the linear req-reply, define client-cache update policy when the template changes, etc). Is that added complexity truly simpler than having the server combine templates?

        - <iframe> exists

        In other words, to do the simplest thing 75% of people want, requires a few lines of code. Either client side or server side.

        To fit the other 25% (even to 'deny' it) is endlessly complex in ways few if any can oversee.

      • NoahZuniga 4 hours ago

        Maybe something that adds to this low demand:

        1. For web pages developed on the assumption that the user has JS, it is trivial to implement something that provides the same result.

        2. Web pages developed for user agents that don't run JS probably want some interaction, so they already have a server runtime that can provide this feature.

        2b. And if a page has no user interaction, it's probably a static content site, and nobody is writing content in HTML, so there is already a build step that provides this feature.

      • brundolf 6 hours ago

        JS-first developers want something that works the same way client-side and server-side, and the mainstream front-end dev community shifted to JS-first, for better or worse

    • uallo 14 hours ago

      HTML Imports went in a similar direction but they do not do what the blog post is about. HTML should be imported and displayed in a specific place of the document. HTML Imports could not do this without JavaScript.

      See https://github.com/whatwg/html/issues/2791#issuecomment-3112... for details.

    • thayne 14 hours ago

      To be fair, it was pretty complicated. IIRC, using it required using Javascript to instantiate the template after importing it, rather than just having something like <include src="myinclude.html">.

    • riedel 16 hours ago

      https://caniuse.com/imports says FF even had it as a config flag

    • paulddraper 9 hours ago

      Tbf, HTML Imports were significantly more complex than includes, which this article requests.

    • AtlasBarfed 15 hours ago

      Frames essentially could do html import

  • Lammy 18 hours ago
    • blorto 15 hours ago

      I always wondered why it was called ILAYER. Ty

  • Null-Set 21 hours ago

    The name of this feature is transclusion.

    https://en.wikipedia.org/wiki/Transclusion

    It was part of Project Xanadu, and originally considered to be an important feature of hypertext.

    Notably, mediawiki uses transclusion extensively. It sometimes feels like the wiki is the truest form of hypertext.

    • jes5199 9 hours ago

      Ward Cunningham (inventor of the Wiki) spent some time trying to invent a transclusion-first wiki, where everyone had their own wiki-space and used transclusion socially https://en.wikipedia.org/wiki/Federated_Wiki

      it never quite took off

  • Linux-Fan 21 hours ago

    Isn't this what proper framesets (not iframes) were supposed to do a long time ago (HTML 4?). At least they autoexpanded just fine and the user could even adjust the size to their preference.

    There was a lot of criticism for frames [1] but still they were successfully deployed for useful stuff like Java API documentation [2].

    In my opinion the whole thing didn't last mostly because it offered too little flexibility for designers: framesets were probably good enough for useful information pages but didn't account for all the designers' needs, with their bulky scrollbars and limited number of subspaces on the screen. Today it is too late to revive them because framesets as-is probably wouldn't work well on mobile...

    [1] <https://www.nngroup.com/articles/why-frames-suck-most-of-the...> - I love how much of it is not applicable anymore and all of these problems mentioned with frames are present in today's web in an even nastier way?

    [2] <https://www.eeng.dcu.ie/~ee553/ee402notes/html/figures/JavaD...>

    • johannes1234321 21 hours ago

      The issue with framesets was far more fundamental: no deep linking. People arriving via bookmarks or Google (or its predecessors) were left on a page without navigation, which sites then tried to work around with JavaScript, and that never made for a good experience.

      • Linux-Fan 20 hours ago

        Nowadays it is sometimes the other way around: pages are all JavaScript, so there is no good experience in the first place. I have had difficulty getting a proper “link” to something multiple times. Also, given that browsers love to reduce/hide the address bar, I wonder if it is really still that important a feature.

        Of course "back then" this was an important feature and one of the reasons for getting rid of frames :)

  • rchaud 21 hours ago

    "Includes" functionality is considered to be server-side, i.e. handled outside of the web browser. HTML is client-side, and really just a markup syntax, not a programming language.

    As the article says, the problem is a solved one. The "includes" issue is how every web design student learns about PHP. In most CMSes, "includes" become "template partials" and are one of the first things explained in the documentation.

    There really isn't any need to make includes available through just HTML. HTML is a presentation format and doesn't do anything interesting without CSS and JS anyway.

    • naasking 20 hours ago

      > "Includes" functionality is considered to be server-side, i.e. handled outside of the web browser. HTML is client-side, and really just a markup syntax, not a programming language.

      That's not an argument that client-side includes shouldn't happen. In fact HTML already has worse versions of this via frames and iframes. A client-side equivalent of a server-side include fits naturally into what people do with HTML.

    • tgv 19 hours ago

      I think it feels off because an HTML file can include scripts, fonts, images, videos, styles, and probably a few other things. But not HTML. It can probably be coded with a custom element (<include src=.../>). I would be surprised if there wasn't a github repo with something similar.

    • c-smile 10 hours ago

      > "Includes" functionality is considered to be server-side

      Exactly! Include makes perfect sense on server-side.

      But client-side include means that the client should be able to modify original DOM at unknown moment of time. Options are

      1. at HTML parse time (before even DOM is generated). This requires synchronous request to server for the inclusion. Not desirable.

      2. after DOM creation: <include src=""> (or whatever) needs to appear in the DOM, chunk loaded asynchronously and then the <include> DOM element(sic!) needs to be replaced(or how?) by external fragment. This disables any existing DOM structure validation mechanism.

      Having said that...

      I've implemented <include> in my Sciter engine using strategy #1. It works there as HTML in Sciter usually comes from local app resources / file system where price of issuing additional "get chunk" request is negligible.

      See: https://docs.sciter.com/docs/HTML/html-include

    • cantSpellSober 20 hours ago

      Well said, this is many students' intro to PHP. Why not `<include src=header.html/>` though?

      Some content is already loaded asynchronously such as images, content below the fold etc.

      > HTML is really just a markup syntax, not a programming language

      flamebait detected :) It's a declarative language, interpreted by each browser engine separately.

      • gyesxnuibh 20 hours ago

        What's the ML in HTML stand for? I think that's probably the crux of the argument. Are we gonna evolve it past its name?

        • Aloisius 18 hours ago

          If the issue is that "include" somehow makes it sound like it's not markup, the solution seems obvious. Just use the src attribute on other tags:

          <html src="/some/page.html">, <div src="/some/div.html">, <span src="/some/span.html">, etc.

          Or create a new tag that's a noun like fragment, page, document, subdoc or something.

          Surely that's no less markup than svg, img, script, video, iframe, and what not.

        • int_19h 12 hours ago

          It stands for "markup language", and was inherited from SGML, which had includes. Strictly speaking, so did early HTML (since it was just an SGML subset), it's just that browsers didn't bother implementing it, for the most part. So it's not that it didn't evolve, but rather it devolved.

          Nor is this something unique to SGML. XML is also a "markup language", yet XInclude is a thing.

          • DemocracyFTW2 3 hours ago

            > It stands for "markup language", and was inherited from SGML, which had includes

            touchay!!

        • cantSpellSober 20 hours ago

          That's why I joked about flamebait, it's hypertext though, aren't anchors essentially a goToURL() click handler in some ways? Template partials seem like a basic part of this system.

          > considered to be server-side

          Good point! Wouldn't fetching a template partial happen the same way (like fetching an image?)

        • mattl 14 hours ago

          > What's the ML in HTML stand for?

          I always assumed it stood for my initials.

    • assimpleaspossi 21 hours ago

      Agree with what you said, however, HTML is a document description language and not a presentation format. CSS is for presentation (assuming you meant styling).

      • PaulDavisThe1st 20 hours ago

        They didn't mean styling.

        HTML is a markup language that identifies the functional role of bits of text. In that sense, it is there to provide information about how to present the text, and is thus a presentation format.

        It is also a document description language, because almost all document description languages are also a presentation format.

    • amadeuspagel 18 hours ago

      This argument applies just as much to CSS and JS. Why do they include "includes" when you can just bundle on the server?

      • adregan 17 hours ago

        For caching and sharing resources across the whole site, I suppose.

        • john_the_writer an hour ago

          But that would apply to <header> and <footer> and <nav> too. We could cache them.

    • DemocracyFTW2 3 hours ago

      hearing someone assert that

      > the problem is a solved one

      is a sure-fire way to know that a problem is not solved

  • socalgal2 13 hours ago

    There are all kind of issues with HTML include as others have pointed out

    If main.html includes child/include1.html and child/include1.html has a link src="include2.html" then when the user clicks the link where does it go? If it goes to "include2.html", which by the name was meant to be included, then that page is going to be missing everything else. If it goes to main.html, how does it specify this time, use include2.html, not include1.html?

    You could do the opposite: you can have article1.html, article2.html, article3.html etc., each including header.html, footer.html, navi.html. OK, that works, but now you've made it so that a global change to the structure of your articles requires editing all articles. In other words, if you want to add comments.html to every article you have to edit all articles, and you're back to wanting to generate pages from articles based on some template, at which point you don't need the browser to support include.

    I also suspect there would be other issues, like the header wants to know the title, or the footer wants a next/prev link, which now require some way to communicate this info between includes and you're basically back to generate the pages and include not being a solution

    I think if you work through the issues you'll find an HTML include would be practically useless for most use cases.
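
    To make the link question concrete, here is a small sketch using the standard URL API. The example.com addresses and the `resolveLink` helper are made up for illustration; the point is only that the same relative link lands in two different places depending on which document counts as the base.

    ```javascript
    // Hypothetical addresses, chosen only for illustration.
    const hostPage = "https://example.com/main.html";
    const fragment = new URL("child/include1.html", hostPage).href;

    // Resolve a relative link found inside the included fragment.
    function resolveLink(href, base) {
      return new URL(href, base).href;
    }

    // Textual-paste semantics: the link resolves against the host page.
    const asPasted = resolveLink("include2.html", hostPage);
    // iframe-like semantics: the link resolves against the fragment's own URL.
    const asFramed = resolveLink("include2.html", fragment);

    console.log(asPasted); // https://example.com/include2.html
    console.log(asFramed); // https://example.com/child/include2.html
    ```

    Any include spec would have to pick one of these two answers, and authors of reusable fragments would have to know which.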

    • int_19h 12 hours ago

      These are all solvable issues with fairly obvious solutions. For example:

      > If main.html includes child/include1.html and child/include1.html has a link src="include2.html" then when the user clicks the link where does it go? If it goes to "include2.html", which by the name was meant to be included, then that page is going to be missing everything else. If it goes to main.html, how does it specify this time, use include2.html, not include1.html?

      There are two distinct use cases here: snippet reuse and embeddable self-contained islands. But the latter is already handled by iframes (the behavior being your latter case). So we only need to do the former.

      • socalgal2 6 hours ago

        > These are all solvable issues with fairly obvious solutions.

        No, they are a can of worms and decades of arguments and incompatibilities and versioning

        > But the latter is already handled by iframes

        iframes don't handle this case because the page can not adjust to the iframe's content. There have been proposals to fix this but they always run into issues.

        https://github.com/domenic/cooperatively-sized-iframes/issue...

    • john_the_writer an hour ago

      The include logic of include2.html missing everything else would also apply to all other includes.

      If a user clicked a link with src="include.css" then it'll be rubbish.

      It would be good for static data.. images, css, and static html content.

  • austin-cheney 21 hours ago

    So, HTML did have includes and they fell out of favor.

    The actual term include is an XML feature and it’s that feature the article is hoping for. HTML had an alternate approach that came into existence before XML. That approach was frames. Frames did much more than XML includes and so HTML never gained that feature. Frames lost favor due to misuse, security, accessibility, and variety of other concerns.

    • Linux-Fan 20 hours ago

      Unlike Framesets I think XML includes were never really supported in many browsers (or even any major browsers)?

      I still like to use them occasionally but it incurs a "compilation" step to evaluate them prior to handing the result of this compilation to the users/browsers.

      • LegionMammal978 20 hours ago

        As it happens, the major browsers still can do XML 'includes' to some extent, since by some miracle they haven't torn out their support for XSLT 1.0. E.g. this outputs "FizzBuzz" on Firefox:

          <!-- fizz.xml -->
          <?xml version="1.0" encoding="UTF-8"?>
          <?xml-stylesheet type="application/xslt+xml" href="style.xslt"?>
          <fizz>Fizz<buzz/></fizz>
          
          <!-- style.xslt -->
          <?xml version="1.0" encoding="UTF-8"?>
          <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
            <xsl:template match="buzz">
              <xsl:value-of select="document('buzz.xml')"/>
            </xsl:template>
          </xsl:stylesheet>
          
          <!-- buzz.xml -->
          <?xml version="1.0" encoding="UTF-8"?>
          <buzz>Buzz</buzz>
        
        You can even use XSLT for HTML5 output, if you're careful. But YMMV with which XML processors will support stylesheets.
        • ndriscoll 19 hours ago

          Yep, and this can be used to e.g. make a basically static site template and then do an include for `userdata.xml` to decorate your page with the logged in user's info (e.g. on HN, adding your username in the top right, highlighting your comments and showing the edit/delete buttons, etc.). You can for example include into a variable `<xsl:variable name="myinfo" select="document('userdata.xml')"/>` and then use it in xpath expressions like `$myinfo/user/@id`. Extremely simple, good for caching, lightweight, very high performance. Easy to fail gracefully to the logged out template. You basically get your data "API" for free since you're returning XML in your data model. I will never understand why it didn't take off.

          XML includes are blocking because XSL support hasn't been updated for 25 years, but there's no reason why we couldn't have it async by now if resources were devoted to this instead of webusb etc.

          • LegionMammal978 19 hours ago

            > if resources were devoted to this

            You'd better not jinx it: XSL support seems like just the sort of thing browser devs would want to tear out in the name of reducing attack surface. They already dislike the better-known SVG and never add any new features to it. I often worry that the status quo persists only because they haven't really thought about it in the last 20 years.

            • o11c 17 hours ago

              Fortunately, XSLT is used by far too many high-importance websites (e.g. official government legal sites) for removing it to be a real threat.

          • mr_toad 13 hours ago

            > I will never understand why it didn't take off.

            I’ve used XSLT in anger - I used it to build Excel worksheets (in XML format) using libXSLT. I found it very verbose and hard to read. And Xpath is pretty torturous.

            I wish I could have used Javascript. I wish Office objects were halfway as easy to compose as the DOM. I know a lot of people hate on Javascript and the DOM, but it’s way easier to work with than the alternatives.

            • int_19h 12 hours ago

              XQuery is basically XSLT with saner syntax.

        • Linux-Fan 19 hours ago

          Nice, didn't think of that approach and it should work very well for the purposes of static headers and footers.

  • uallo 21 hours ago

    There is an open issue about this at WHATWG (also mentioned in the comment section of the blog post):

    Client side include feature for HTML

    https://github.com/whatwg/html/issues/2791

  • rorylaitila 14 hours ago

    I'm a full stack developer. I do server side rendering. I agree that this is a 'solved problem' for that case. However there are many times I don't want to run a server or a static site generator. I manage a lot of projects. I don't want more build steps than necessary. I just want to put some HTML on the net with some basic includes, without JavaScript. But currently I would go the web component route and accept the extra JS.

  • 1718627440 17 hours ago

    This is just my own understanding, but doesn't a webpage consist of a bunch of nodes, which can be combined in any way? And an HTML document is supposed to be a complete set of nodes, so a combination of those won't be a single document anymore.

    Nodes can be addressed individually, but a document is the unit of transmission, which also contains metadata. You can combine nodes as you like, but you can't really combine two already packed and annotated documents of nodes.

    So I would say it is more due to the semantic meaning. I think there was also the idea of requesting arbitrary sets of nodes, but that was never developed, and with the shift away from the semantic document it didn't make sense anymore.

    • gugagore 17 hours ago

      I think the quickest way to say it is that there is only one head on a page, and every HTML file needs a head. So if you include one into the other, you either have two heads, or the inner document didn't have a head.

    • mr_toad 13 hours ago

      > a webpage consist of a bunch of nodes, which can be combined in any way

      More or less, but manipulating the nodes requires JavaScript, which some people would like to avoid.

  • simonjgreen 14 hours ago

    I know it’s not straight HTML, but SSI (server side includes) helped with this and back in the day made for some incredibly powerful caching solutions. You could write out chunks of your site statically and periodically refresh them in the server side, while benefitting from serving static content to your users. (This was in the pre varnish era, and before everyone was using memcached)

    I personally used this to great success on a couple of Premier League football club websites around the mid 2000s.
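
    For anyone who hasn't seen SSI: the directive is an HTML comment such as <!--#include virtual="/header.html" -->, which the server swaps for the fragment's contents before responding. Below is a toy sketch of that substitution in JavaScript. It is not how mod_include is actually implemented, and the fragment contents and the `expandSsi` name are made up.

    ```javascript
    // Made-up fragments standing in for files on disk.
    const fragments = new Map([
      ["/header.html", "<header>Site header</header>"],
      ["/footer.html", "<footer>Site footer</footer>"],
    ]);

    // Replace each <!--#include virtual="..." --> directive with the fragment text.
    // Unknown paths are left untouched so the problem stays visible in the output.
    function expandSsi(page) {
      return page.replace(
        /<!--#include virtual="([^"]+)" ?-->/g,
        (directive, path) => (fragments.has(path) ? fragments.get(path) : directive)
      );
    }

    const page =
      '<!--#include virtual="/header.html" -->' +
      "<main>Hello</main>" +
      '<!--#include virtual="/footer.html" -->';

    console.log(expandSsi(page));
    // <header>Site header</header><main>Hello</main><footer>Site footer</footer>
    ```

    The expanded page is plain static text, which is why it cached so well.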

    • thayne 14 hours ago

      One benefit of doing it on the client is that the client can cache the result of an include. So for example, instead of having to download the content of a header and footer for every page, it is downloaded just once and reused for future pages.

      • Sesse__ 13 hours ago

        How big are your headers and footers, really? If caching them is worth the extra complexity on the client plus all the pain of cache invalidation (and the two extra requests in the non-cached case).

      • youngtaff 3 hours ago

        I’m willing to bet the runtime overhead of assembly on the client is going to be larger than the download cost of the fragments being included server or edge side and cached

        • john_the_writer 35 minutes ago

          If you measure download cost in time then sure.. If you measure download cost in terms of bytes downloaded, or server costs, then nope. The cost would be smaller to cache.
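
          A back-of-the-envelope version of the byte count, as a sketch. Every number here is a made-up assumption, not a measurement, and it ignores compression, per-request overhead, and the extra round trips in the uncached case.

          ```javascript
          // All sizes and counts below are made-up assumptions, not measurements.
          const sharedBytes = 4000;  // header + footer markup
          const pageBytes = 20000;   // unique content per page
          const pagesViewed = 10;    // pages one visitor loads

          // Server-side include: the shared markup is sent with every page.
          const inlinedTotal = pagesViewed * (pageBytes + sharedBytes);
          // Client-side include with a warm cache: the shared markup is sent once.
          const cachedTotal = pagesViewed * pageBytes + sharedBytes;

          console.log(inlinedTotal);               // 240000
          console.log(cachedTotal);                // 204000
          console.log(inlinedTotal - cachedTotal); // 36000
          ```

          The saving is (pagesViewed - 1) * sharedBytes, so it only matters when the shared markup is large or visitors view many pages.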

  • Kuyawa 13 hours ago

    This is the closest we can do today:

      -- index.html
    
      <html>
      <body>
        <script src="header.js"></script>
        <main>
          <h1>Hello includes</h1>
        </main>
        <script src="footer.js"></script>
      </body>
      </html>
    
      -- header.js
    
      document.currentScript.outerHTML = `
      <header>
        <h1>Header</h1>
      </header>`
    
      -- footer.js
    
      document.currentScript.outerHTML = `
      <footer>
        <h1>Footer</h1>
      </footer>`
    
    Scripts will replace their tags with html producing a clean source, not pretty but it works on the client
    • Bjartr 12 hours ago

      You could even have a server wrap static html resources in that js if you request them a certain way, like

      <script src="/include/footer.html">

      For /footer.html

      But then you probably might as well use server side includes
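
      A rough sketch of that wrapping step, assuming the server or a build script can run JavaScript. `wrapHtmlAsScript` and the footer markup are made up for illustration.

      ```javascript
      // Turn a static HTML fragment into a script that replaces its own <script> tag.
      // JSON.stringify produces a valid JavaScript string literal, so quotes,
      // backslashes and newlines in the fragment are escaped for us.
      function wrapHtmlAsScript(html) {
        return "document.currentScript.outerHTML = " + JSON.stringify(html) + ";";
      }

      const footerHtml = '<footer>\n  <p>Footer with "quotes"</p>\n</footer>';
      const footerJs = wrapHtmlAsScript(footerHtml);

      // This is what the server would send for /include/footer.html.
      console.log(footerJs);
      ```

      The wrapped fragment has to be served with a JavaScript content type, and it only works as a classic (non-module) script, since document.currentScript is null inside modules.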

    • johnisgood 12 hours ago

      Pretty sure it is possible without JavaScript, too.

  • kyledrake 15 hours ago

    At least some of the blame here is the bias towards HTML being something that is dynamic code generated, as opposed to something that is statically handwritten by many people.

    There are features that would be good for the latter that have been removed. For example, if you need to embed HTML code examples, you can use the <xmp> tag, which makes it so you don't need to encode escapes. Sadly, the HTML5 spec is trying to obsolete the <xmp> tag even though it's the only way to make this work. All browsers seem to be supporting it anyways, but once it is removed you will always have to encode the examples.

    HTML spec developers should be more careful to consider people hand coding HTML when designing specifications, or at least decisions that will require JavaScript to accomplish something it probably shouldn't be needed for.

    • TZubiri 15 hours ago

      It's the other way around, HTML was designed to be hand written, and the feature set was defined at that stage. If it ended up being dynamically generated, that happened after the feature set was defined.

  • wodenokoto 2 hours ago

    In the nineties we fixed it with frames or CGI. I still think of it as one of those “if it was fiction it would be unrealistic” things (although, who writes fictional markup standards?)

  • tln 21 hours ago

    There used to be a thing for this

    https://caniuse.com/imports

    • tehbeard 21 hours ago

      No, HTML Imports was an idea of using the HTML document format to encapsulate the 3 distinct data types needed for custom elements:

      - JS for functionality via the custom elements API.
      - HTML for layout via <template> tags.
      - CSS for aesthetics via <style> tags.

      Not for just quickly and simply inserting the contents of header.html at a specific location in the DOM.

    • HeavyStorm 15 hours ago

      Says "superseded by ES modules". Not really the same thing, right?

  • mikewarot 9 hours ago

    The reason is simple, HTML is not a hypertext markup language. Markup is the process of adding commentary and other information on top of an existing document, and HTML is ironically incapable of doing the one thing it most definitely should be able to do.

    It's so bad that if you want to discuss marking up hypertext (i.e. putting notes on top of existing read-only text files, etc.) you'll have to Google the word "annotation" to even start to get close.

    Along with C macros, Case Sensitivity, Null terminated strings, unauthenticated email, ambient authority operating systems, HTML is one of the major mistakes of computing.

    We should have had the Memex at least a decade ago, and we've got this crap instead. 8(

  • Evidlo 15 hours ago

    You can get JS-free, client-side include functionality if you're willing to wrap your HTML in XML. Here is a demo:

    https://github.com/Evidlo/xsl-website

    • int_19h 12 hours ago

      I don't think you even need to wrap it, really. You need to make sure it's valid XML, but the root element could be <html> just fine. And then use an identity transform with <xsl:output method="html">.

    • imiric 13 hours ago

      That's interesting, thanks.

      How well supported is XSLT in modern browsers? What would be the drawbacks of using this approach for a modern website?

  • _heimdall 20 hours ago

    If I really need HTML includes for some reason, I'd reach for XSLT. I know it's old, and barely maintained at best, but that was the layer intentionally added to bring programming language features to the markup language that is HTML.

    • mark_and_sweep 17 hours ago

      I believe XSLT 1 is still working in all major browsers today. Here's a simple HTML 5 example with two pages sharing a header template: https://gist.github.com/MarkTiedemann/0e6d36c337159a3e6d5072...

      • _heimdall 15 hours ago

        My main gripe is a decade(s?) old Firefox bug related to rendering an HTML string to the DOM.

        That may be a fairly specific use case though, and largely it still works great today. I've done a few side projects with XSLT and web components for interactivity, worked great.

        • mark_and_sweep 13 hours ago

          What bug specifically?

          • _heimdall 10 hours ago

            Couldn't find a good link earlier, guess I didn't have quite the right keywords for search.

            Here we go, looks like it's 17 years old now:

            https://bugzilla.mozilla.org/show_bug.cgi?id=98168#c99

            • DemocracyFTW2 2 hours ago

              from the linked thread:

              > The only combination that fails to render these entities correctly is Firefox/XSLT.

              Which is one good reason not to adopt XSLT to implement HTML includes. You just don't know what snags you'll hit upon but you can be sure you'll be on your own.

              > Bug 98168 (doe) Opened 24 years ago Updated 21 days ago

              Well it does look like someone's still mulling over whether and how to fix it... 24 years later...

    • SvenL 16 hours ago

      I think XSLT is still a reasonable technology in itself - the lack of updated implementations is the bad part. I think modern browsers only support 1.0 (?). At least most modern programming languages should have 3.0 support.

      • _heimdall 15 hours ago

        Firefox has a very old bug related to rendering an HTML string to the DOM without escaping it; that one has bitten me a few times. Nothing a tiny inline script can't fix, but it's frustrating to have such a basic feature fail.

        Debugging is also pretty painful, or I at least haven't found a good dev setup for it.

        That said, I'm happy to reach for XSLT when it makes sense. It's pretty amazing what can be done with such an old tech; for the core use case of props and templates to HTML you really don't need React.

  • evrimoztamur 21 hours ago

    If you want to include HTML sandboxes, we have iframes. If you want it served from the server, it's just text. Putting text A inside text B is a solved problem.

    • esperent 21 hours ago

      > Putting text A inside text B is a solved problem.

      Yes, but in regards to HTML it hasn't been solved in a standard way; it's been solved in hundreds, if not thousands, of non-standard ways. The point of the article is that having one standard way could remove a lot of complexity from the ecosystem, as ES6 imports did.

    • zamadatix 21 hours ago

      The article references both of these methods with explanations of why they don't feel they answer the question posed.

  • bambax 21 hours ago

    > We’ve got <iframe>, which technically is a pure HTML solution, but they are bad for overall performance, accessibility, and generally extremely awkward here

    What does this mean? This is a pure HTML solution, not just "technically" but in reality. (And before iframe there were frames and frameset). Just because the author doesn't like them doesn't make them non-existent.

    • jadamson 20 hours ago

      What do you mean what does it mean?

      An iframe is a window into another webpage, and is bounded as such both visually and in terms of DOM interfaces. A simple example would be that an iframe header can't have drop-down menus that overlap content from the page hosting it.

      They are categorically not the same DX/UX as SSI et al. and it's absolutely bizarre to me that there's so many comments making this complaint.

    • silvestrov 20 hours ago

      The real problem with iframes is that their size is set by the parent document only.

      They would be a lot more useful if we could write e.g. <iframe src=abc.html height=auto width=100> so the height of the iframe element is set by the abc.html document instead of the parent document.

      • jefftk 17 hours ago

        You could do this with JS in the child document, if it's important to keep JS out of the parent.

      • baggy_trough 18 hours ago

        You can achieve that with js in the parent document.

        • teg4n_ 17 hours ago

          You can achieve everything with JS in the parent document, it doesn’t mean it should be required or even recommended

    • ajkjk 20 hours ago

      No way. You can't make a decent single web page by iframing a bunch of components together.

  • tanepiper 6 hours ago

    My first ever website I wrote with mod_include and .shtml - updating a website was just adding a few tags.

    Also I miss framesets - with that a proper sidebar navigation was easily possible.

    • LeonB 6 hours ago

      Same here.

      I’m not saying my first website was impressive — but as a programmer there’s no way I was copying and pasting the same header / footer stuff into each page and quickly found “shtml” and used that as much as possible.

      Then used the integrated FTP support in whatever editor it was (“HTML-kit” I think it was called?) - to upload it straight to prod. Like a true professional cowboy.

  • aquova 10 hours ago

    I 100% agree with the sentiment of this article. For my personal website, I write pretty much every page by hand, and I have a header and a footer on most of those pages. I certainly don't want to have to update every single page every time I want to add a new navigation button to the top of the page. For a while I used PHP, but I was running a PHP server literally for only this feature. I eventually switched to JavaScript, but likewise, on a majority of my pages, this was the only JavaScript I had, and I wanted to have a "pure" HTML page for a multitude of reasons.

    In the end, I settled on using a Caddy directive to do it. It still feels like a tacked on solution, but this is about as pure as I can get to just automatically "pasting" in the code, as described in the article.

  • somethingsome 21 hours ago

    I'm not an expert on this but IMO, from a language point of view, HTML is a markup language, it 'must' have no logic or processing. It is there to structure the information not to dynamically change it. Nor even to display it nicely.

    The logic is performed elsewhere. If you were to have includes directly in HTML, it means that browsers must implement logic for HTML. So it is not 'just' a parser anymore.

    Imagine for example that I create an infinite loop of includes, who is responsible to limit me? How to ensure that all other browsers implement it in the same way?

    What happens if I perform an injection from another website? Then we start to have cors policy management to write. (iframes were bad for this)

    Now imagine using Javascript I inject an include somewhere, should the website reload in some way? So we have a dynamic DOM in HTML?

    • naasking 20 hours ago

      > from a language point of view, HTML is a markup language, it 'must' have no logic or processing.

      Client-side includes are not "processing". HTML already has frames and iframes which do this, just in a worse way, so we'd be better off.

    • omoikane 14 hours ago

      > an infinite loop of includes

      We can probably copy the specs for <frameset> and deal with it the same way:

      https://www.w3.org/TR/WD-frames-970331#:~:text=Infinite%20Re...

          Any frame that attempts to assign as its SRC a URL used by any of its ancestors is treated as if it has no SRC URL at all (basically a blank frame).
      
      > How to ensure that all other browsers implement it in the same way?

      Browsers that don't implement the specs will eventually break:

      https://bugzilla.mozilla.org/show_bug.cgi?id=8065
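
      A rough sketch of how that ancestor rule could carry over to includes. Everything here is made up for illustration: <include src="..."> is an imagined tag, and `files` is an in-memory stand-in for fetching URLs.

      ```javascript
      // Sketch only: <include src="..."> is hypothetical, and `files` maps
      // URL -> markup in place of real network fetches.
      function expandIncludes(url, files, ancestors = []) {
        const chain = [...ancestors, url];
        const source = files[url] ?? '';
        return source.replace(/<include src="([^"]*)"\s*\/?>/g, (tag, src) => {
          // Same rule as the frames draft: a URL already used by an ancestor
          // is treated as if there were no src at all.
          if (chain.includes(src)) return '';
          return expandIncludes(src, files, chain);
        });
      }
      ```

      Only the ancestor chain is tracked, so the same footer can still be included twice as siblings; only a page that (directly or indirectly) includes itself gets blanked.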

    • lenkite 17 hours ago

      There is a very, very broad line in that "no logic or processing". HTML/CSS already do a lot of logic and processing. And many "markup languages" have include support. Like wikitext used in wikipedia and includes in Asciidoc.

  • jsdwarf 16 hours ago

    I'd say in 80% of the cases a pure, static html include is not enough. In a menu include, you want to disable the link to the currently shown page or show a page specific breadcrumb. In a footer include, you may want a dynamic "last updated" timestamp or the current year in the copyright notice. As all these use cases required a server-side scripting language anyway, there was no push behind an html include.

    • perilunar 3 hours ago

      > In a menu include, you want to disable the link to the currently shown page

      I’ve always just styled the link to the current page differently, not disabled it, which you can do with an id on the page and a line of CSS.

  • SJC_Hacker 21 hours ago

    Initially HTML was less about the presentation layer and more about the "document" concept. Documents should be self-contained, outside of references to other documents.

    • skydhash 21 hours ago

      I still think this is the best web. Either you are a collection of interlinked documents and forms (manual pages, wiki,...), or you are a full application (figma, gmail, google docs). But a lot of sites are trying to be both. And some are trying to be one while they are the other type.

    • spauldo 9 hours ago

      One document == one HTML page was never the idea. Documents are often way too long to comfortably read and navigate that way. Breaking them into sections and linking between them was part of the core idea of HTML.

      Includes are a standard part of many document systems. Headers and footers are a perfect example - if I update a document I certainly don't want to update the document revision number on every single page! It also allows you to add navigation between documents in a way that is easy to maintain.

      LaTeX can do it. Microsoft Word can do it (in a typically horrible Microsoftian way). Why not HTML?

  • djoldman 20 hours ago

    > Our developer brains scream at us to ensure that we’re not copying the exact code three times, we’re creating the header once then “including” it on the three (or a thousand) other pages.

    Interesting, my brain is not this way: I want to send a minimum number of files per link requested. I don't care if I include the same text because the web is generally slow and it's generally caused by a zillion files sent and a ton of JS.

  • esprehn 20 hours ago

    We discussed this back when creating web components, but the focus quickly became about SPA applications instead of MPAs and the demand for features like this was low in that space.

    I wish I would have advocated more for it though. I think it would be pretty easy to add using a new attribute on <script> since the parser already pauses there, so making something like <script transclude={url}> would likely not be too difficult.

  • ludwik 20 hours ago

    We used to have this in the form of a pair of HTML tags: <frameset> and <frame> (not to be confused with the totally separate <iframe>!). <frameset> provided the scaffolding with slots for multiple frames, letting you easily create a page made up entirely of subpages. It was once popular and, in many ways, worked quite neatly. It let you define static elements once entirely client-side (and without JS!), and reload only the necessary parts of the page - long before AJAX was a thing. You could even update multiple frames at once when needed.

    From what I remember, the main problem was that it broke URLs: you could only link to the initial state of the page, and navigating around the site wouldn't update the address bar - so deep linking wasn’t possible (early JavaScript SPA frameworks had the same issue, BTW). Another related problem was that each subframe had to be a full HTML document, so they did have their own individual URLs. These would get indexed by search engines, and users could end up on isolated subframe documents without the surrounding context the site creator intended - like just the footer, or the article content without any navigation.

  • kmoser 15 hours ago

    SHTML used to be a thing back in the 1990s: https://en.wiktionary.org/wiki/SHTML

    • Telemakhos 15 hours ago

      Or, better, "Server Side Includes" (SSI): https://en.wikipedia.org/wiki/Server_Side_Includes

      SSI is still a thing: I use it on my personal website. It isn't really part of the HTML, though: it's a server-dependent extension to HTML. It's supported by Apache and nginx, but not by every server, so you have to have control over the server stack, not just access to the documents.
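
      To make the "server-dependent" part concrete, here is a toy version of what the server does before the browser ever sees the page. It is a sketch, not how Apache or nginx implement it: `files` is a hypothetical in-memory document root, and the error marker is invented.

      ```javascript
      // Sketch only: real servers support many more directives and options.
      // `files` is a Map of path -> contents standing in for the document root.
      function expandSsi(html, files) {
        return html.replace(/<!--#include virtual="([^"]+)"\s*-->/g, (directive, path) =>
          // Unknown paths become a visible marker in this sketch.
          files.has(path) ? files.get(path) : '[include error]');
      }
      ```

      The browser only ever receives the expanded output, which is why this works without any client support but needs control over the server.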

  • cr125rider 8 hours ago

    Chris is an absolute legend in this space and I’m so glad he’s bringing this up. I feel like he might actually have pull here and start good discussions that might have actual solutions.

  • simultsop 14 hours ago

    It's a pity: of all the advancements in web resources (JS, CSS, runtimes, web engines), HTML was the most stagnant, despite the "HTML5" effing hype. My guess is they did not want to empower HTML and threaten SSRs or other solutions. I believe the biggest concern holding back any step forward is the damned backward compatibility. Some just won't budge.

    • Andrex 10 hours ago

      HTML5 hype started strong out of the gate because of the video and audio tags, and canvas slightly after. Those HTML tags were worth the hype.

      Flash's reputation was quite low at the time and people were ready to finally move on from plugins being required on the web. (Though the "battle" then shifted to open vs. closed codecs.)

  • jasoncartwright 21 hours ago

    I made this to get around pages being cached at CDN level, but still needing to get live data...

    https://github.com/jasoncartwright/clientsideinclude

  • miragecraft 15 hours ago

    I too lamented the loss of HTML imports and ended up coming up with my own JavaScript library for it.

    https://miragecraft.com/blog/replacing-html-imports

    At the end of the day it’s not something trivial to implement at the HTML spec/parser level.

    For relative links, how should the page doing the import handle them?

    Do nothing and let it break, convert to absolute links, or remap it as a new relative link?

    Should the include be done synchronously or asynchronously?

    The big benefit of traditional server side includes is that it's synchronous, thus simplifying logic for in-page JavaScript, but all browsers are trying to eliminate synchronous calls for speed, so it's hard to see them agreeing to add a new synchronous bottleneck.

    Should it be CORS restricted? If it is then it blocks offline use (file:// protocol) which really kills its utility.

    There are a lot of hurdles to it and it’s hard to get people to agree on the exact implementation, it might be best to leave it to JavaScript libraries.
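
    For the relative-link question, the "convert to absolute links" option might look roughly like the sketch below. It is an illustration, not what my library does: the regex over href/src is a shortcut (a real implementation would walk the parsed DOM), and the URLs in the usage note are made up.

    ```javascript
    // Sketch only: resolves href/src values against the fragment's own URL,
    // so links keep pointing where the fragment's author intended.
    function absolutizeLinks(fragmentHtml, fragmentUrl) {
      return fragmentHtml.replace(/\b(href|src)="([^"]*)"/g, (match, attr, value) => {
        // new URL(relative, base) applies standard URL resolution.
        return `${attr}="${new URL(value, fragmentUrl).href}"`;
      });
    }
    ```

    With a fragment at https://example.com/partials/nav/header.html, a link to ../about.html would be rewritten to https://example.com/partials/about.html regardless of which page includes it. Note that this also rewrites in-page anchors like #top to point at the fragment's URL, which is one more thing a spec would have to decide.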

    • paul_h 29 minutes ago

      Someone else made the same - https://github.com/Paul-Browne/HTMLInclude - but it's not been updated in 7 years, leaving questions. I'll try yours and theirs in due course. Err, and the fragment @HumanOstrich said elsewhere in comments.

  • hyperhello 20 hours ago

    I think it’s because it would be so easy to make a recursive page that includes itself forever. So you have to have rules when it’s okay, and that’s more complex and opaque than just programming it yourself.

  • DJHenk 21 hours ago

    My guess: no-one needs it.

    Originally, iframes were the solution, like the post mentions. By the time iframes became unfashionable, nobody was writing HTML with their bare hands anymore. Since then, people use a myriad of other tools and, as also mentioned, they all have a way to fix this.

    So the only group who would benefit from a better iframe is the group of people who don't use any tools and write their HTML with their bare hands in 2025. That is an astonishingly small group. Even if you use a script to convert markdown files to blog posts, you already fall outside of it.

    No-one needs it, so the iframe does not get reinvented.

    • Svip 21 hours ago

      No, originally frameset[0] and frame[1] were the solution to this problem. I remember building a website in the late 1990s with frameset. iframe came later, and basically allowed you to do frames without the frameset. Anyway, frameset is also the reason every browser's user agent starts with "Mozilla".

      [0] https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/...

      [1] https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/...

      • chgs 4 hours ago

        Originally my footers and navbars were included with server side includes

    • micromacrofoot 21 hours ago

      what if it could be a larger group though? modern css has been advancing rather rapidly... I don't even need a preprocessing library any more... I've got nested rules, variables, even some light data handling... why not start beefing up html too? we've got some new features but includes would be killer

  • neuroelectron 15 hours ago

    So glad I decided early in my career to not do webpages. Look how much discussion this minor feature has generated. I did make infra tools that outputted basic html, get post cgi type of stuff. What's funny is this stuff was deployed right before AWS was launched and a year later the on prem infra was sold and the warehouse services were moved to the cloud.

    • spauldo 9 hours ago

      You and me both. I did some web dev back in the early days, and noped out when IE was dragging everyone down with its refusal to change. I have never had a reason to regret that decision.

  • altogether 4 hours ago

    Just to the point: why include this stuff at all, to load more at once? The whole web has worked, ever since the CGI implementations in web servers, on asynchronicity, to load just what you actually need... nowadays most of that is fetch or XHR calls. I mean, it only makes sense for a one-pager, to keep the markup a bit more structured... but why would you want to make statically rendered homepages these days?

  • dfabulich 8 hours ago

    We know why HTML alone can't do includes! In https://github.com/whatwg/html/issues/2791 the standards committee discussed this. The issue has been open for years.

    The first naysayer was @dominic: https://github.com/whatwg/html/issues/2791#issuecomment-3113...

    > I don't think we should do this. The user experience is much better if such inclusion is done server-side ahead of time, instead of at runtime. Otherwise, you can emulate it with JavaScript, if you value developer convenience more than user experience.

    The "user experience" problem he's describing is a performance problem, adding an extra round-trip to the server to fetch the included HTML. If you request "/blog/article.html", and the article includes "/blog/header.html", you'll have to do another request to the server to fetch the header.

    It would also prevent streaming parsing and rendering, where the browser can parse and render HTML bit-by-bit as it streams in from the server.

    Before you say, "so, what's the big deal with adding another round trip and breaking the streaming parser?" go ahead and read through the hundreds of comments on that thread. "What's the big deal" has not convinced browser devs for at least eight years, so, pick another argument.

    I think there is a narrow opening, where some noble volunteer would spec out a streaming document-fragment parser.

    It would involve a lot of complicated technical specification detail. I know a thing or two about browser implementation and specification writing, and designing a streaming document-fragment parser is far, far beyond my ken.

    But, if you care about this, that's where you'd start. Good luck!

    P.S. There is another option available to you: it is kinda possible to do client-side includes using a service worker. A service worker is a client-side proxy server that the browser will talk to when requesting documents; the service worker can fetch document fragments and merge them together (even streaming fragments!) with just a bit of JS.

    But that option kinda sucks as a developer experience, because the service worker doesn't work the first time a user visits your site, so you'd have to implement server-side includes and also serve up document fragments, just for second-time visitors who already have the header cached.

    Still, if all you want is to return a fast cached header while the body of your page loads, service workers are a fantastic solution to that problem.
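
    For the curious, here is a rough sketch of that service worker approach. The fragment paths (/fragments/...) are a layout I invented for the example, and it assumes the fragments are plain HTML pieces that concatenate into a valid page.

    ```javascript
    // Hypothetical layout: shared header/footer plus one body fragment per page.
    function fragmentUrlsFor(pathname) {
      return ['/fragments/header.html', '/fragments/pages' + pathname, '/fragments/footer.html'];
    }

    // Concatenate several response bodies, in order, into one stream.
    function concatBodies(responsePromises) {
      return new ReadableStream({
        async start(controller) {
          for (const promise of responsePromises) {
            const reader = (await promise).body.getReader();
            while (true) {
              const { done, value } = await reader.read();
              if (done) break;
              controller.enqueue(value); // forward each chunk as it arrives
            }
          }
          controller.close();
        },
      });
    }

    // Only wire up the handler when actually running inside a service worker.
    if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
      self.addEventListener('fetch', (event) => {
        if (event.request.mode !== 'navigate') return; // only page loads
        const { pathname } = new URL(event.request.url);
        // Serve each fragment from the cache if present, else from the network.
        const parts = fragmentUrlsFor(pathname).map((url) =>
          caches.match(url).then((hit) => hit || fetch(url)));
        event.respondWith(new Response(concatBodies(parts), {
          headers: { 'Content-Type': 'text/html; charset=utf-8' },
        }));
      });
    }
    ```

    All fragment requests start at once, so a cached header can begin rendering while the page body is still on the wire. It does nothing for the first visit, which is the drawback described above.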

  • prkl 20 hours ago

    honestly, html can include css and javascript via link and script tags. there's no reason for it to not have an <include src="" /> tag, and let the browser parsing it fetch the content to replace it.

  • WhyNotHugo 14 hours ago

    HTML frames solved this problem just fine, but they were deprecated in favour of using AJAX to replace portions of the body as you navigate (e.g.: SPAs).

    I still feel like frames were great for their use case.

  • daveac 14 hours ago

    I desperately want to go back to crafting sites by hand and not reach for react/vue as a default. I do a lot of static and temporary sites that do very little

  • dietsche 14 hours ago

    I think the authors of htmx have the same questions :)

  • franze 16 hours ago

    Now you can include HTML in HTML, see https://include.franzai.com/ - a quick Chrome Polyfill based on the discussion here. MIT License

    Github: https://github.com/franzenzenhofer/html-include-polyfill-ext...

    SHOW HN: https://news.ycombinator.com/item?id=43881815

  • johannes1234321 21 hours ago

    I would guess back in the day having extra requests was expensive, thus discouraged. Later there were attempts via xinclude, but by then PHP and similar took over or people tolerated frames.

  • paulryanrogers 13 hours ago

    HTML does have frames and iframes, which can accomplish some of the same goals.

    • mirkodrummer 13 hours ago

      it is mentioned in the article indeed; it's an awful solution that is poor in performance and breaks accessibility

      • paulryanrogers 9 hours ago

        Thanks. I was reading too fast and missed the iframe reference in the article

      • mixmastamyk 12 hours ago

        Why would the performance be any better with another tag?

        • rhet0rica 12 hours ago

          A frame is a separate rendering context—it's (almost) as heavyweight as a new tab. The author wants to insert content from another file directly into the existing DOM, merging the two documents completely.

          • mixmastamyk 12 hours ago

            Negligible twenty years ago. But yes, if there's an improvement it should be merged automatically into the same document.

  • dheera 13 hours ago

    Seems everyone forgot HTML-SSI which worked something like this. Many servers and hosting websites of the 90s supported it.

        <!--#include virtual="header.html" -->
        Some content here
        <!--#include virtual="footer.html" -->
  • mmastrac 21 hours ago

    Do remote entity references still work in XHTML? XML had its issues but did have a decent toolbox of powerful if not insecure primitives.

  • bandrami 13 hours ago

    Because how would the browser decide it's in a fetch loop?

  • sreekotay 16 hours ago

    This has always worked for me. Pretty much the ask? https://gist.github.com/sreekotay/08f9dfcd7553abb8f1bb17375d...

    • CamouflagedKiwi 16 hours ago

      That's the first thing listed in the article? "Javascript to go fetch the HTML and insert it". What they're after is something that's _just_ HTML and not another language.

      • sreekotay 16 hours ago

        While you do need a server, I think this is the functional equivalent? The fetch-and-insert JS outlined (linked to) in the article is async. This blocks execution like you'd expect an HTML include to do. It's WAY easier to reason about, which I think is the reason for the initial ask...

      • j45 11 hours ago

        The <object> tag appears to include/embed other html pages.

        An embedded HTML page:

        <object data="snippet.html" width="500" height="200"></object>

        https://www.w3schools.com/tags/tag_object.asp

    • alecsm 16 hours ago

      But you need a server for that to work.

      • sreekotay 16 hours ago

        you need a server for HTML to work, as a practical matter. But yes. There IS a workaround to that too, if you're REALLY determined, but you have to format your HTML as a giant JS comment block (lol really :))

        [edit: I'm sure there are still some file:// workflows for docs - and yes this doesn't address that]

        • mattl 14 hours ago

          You don't need a server for HTML to work, I can just hand you a USB stick/floppy disk/MO disk for your NeXT with HTML files on it.

          • sreekotay 12 hours ago

            ( •_•) ( •_•)>⌐■-■ (⌐■_■) Deal with it.

            :)

  • bdcravens 21 hours ago

    The simplest answer is that HTML wasn't designed as a presentation language, but a hypertext document language. CSS and JavaScript were add-ons after the fact. Images weren't even in the first version. Once usage of the web grew beyond the initial vision, solutions like server-side includes and server-side languages that rendered HTML were sufficient.

  • lerp-io 8 hours ago

    just use react or nextjs or whatever and move on jeez

  • mixmastamyk 13 hours ago

    Lots of rationalization in here—it's always been needed. I complained about the lack of <include src="..."> when building my first site in '94/95, with simpletext and/or notepad!

    It was not in the early spec, and it seems someone powerful wouldn't allow it in later. So everyone else made workarounds, in any way they could, with the result that the need lessened quite a bit.

    My current best workaround is the <object data=".."> tag, which has a few better defaults than iframe. If you put a link to the same stylesheet in the include file it will match pretty well. Size with width=100%, though with height you'll need to eyeball or use javascript.

    Or, Javascript can also hoist the elements to the document level if you really need. Sample code at this site: https://www.filamentgroup.com/lab/html-includes/

  • hilti 21 hours ago

    Have you ever heard of Server Side Includes?

    https://en.wikipedia.org/wiki/Server_Side_Includes

    • rsolva 21 hours ago

      After researching this very topic earlier; SSI is the most pragmatic solution. Check out Caddy's Template Language (based on Go), it is quite capable and quite similar to building themes in Hugo. Just much more bare bones.

      I have built several sites with pure HTML+CSS, sprinkled with some light SSI with Caddy, and it is rock solid and very performant!

    • rendaw 21 hours ago

      That's mentioned in TFA under "old school web server directives".

  • drob518 20 hours ago

    Seems like this would help with caching, too.

  • dinkblam 15 hours ago

    we had no problem using <object> for headers and footers

  • steren 16 hours ago

    in the meantime, you can use <html-include> https://www.npmjs.com/package/html-include-element

  • TZubiri 15 hours ago

    Because it's HyperText, the main idea is that you link to other content, so this is not a weird feature that is being asked for, it's just a different way of doing the whole raison d'etre of the tech. In fact the tag to link stuff is the <a> tag. It just so happens that it makes you load the other "page", instead of transcluding content, the idea is that you load it.

    It wouldn't make sense to transclude the article about the United States in the article about Wyoming (and in fact modern wikipedia shows a pop up bubble doing a partial transclusion, but would benefit in no way from basic html transclusion.)

    It's a simple idea. But of course modern HTML is not at all what HTML was designed to be, but that's the canonical answer.

    The elders of HTML would just tell you to make an <a> link to whatever you wanted to transclude instead. Be it a "footer/header/table of contents" or another encyclopedic article, or whatever. Because that's how HTML works, and not the way you suggest.

    Think of what would happen if it were the case: you would transclude page A, which transcludes page B, and so on with page C, possibly recursively transcluding page B, and so on. You would transform the User Agent (browser) into a whole WWW crawler!

    It's because HTML is pass by reference, not pass by copy.

  • Imustaskforhelp 21 hours ago

    I think this is a genuinely good question that I was also wondering some time ago.

    And it is a genuinely good question!

    I think the answer from PD says feels the truest.

    JS/CSS with all their bureaucracy are nothing compared to HTML, it seems. Maybe people don't find anything wrong with HTML, or maybe if they do, they just reach for JS/CSS and try to fix HTML that way (ahem, frontend frameworks).

    That being said, I have just regurgitated what PD says has said, and I give him full credit for that. But I am also genuinely confused, because I have heard that JS/CSS are bureaucratic too. (I remember a Fireship video about types being added to JS; I think I watched it at least a year ago, though I could be wrong, and I haven't heard anything about it since. From my observation, a lot of JS proposals are just stuck.)

    And yet HTML is so bureaucratic that the answer to why HTML doesn't have a feature is its bureaucracy. Maybe someone can explain the history of it and why?

  • axelfontaine 21 hours ago

    Iframes, while not perfect, are pretty close though...

    • lelandfe 21 hours ago

      Making iframes be the right size is super awkward. I might actually use them more if they were easy to get responsive.

      This post does link to a technique (new to me) to extract iframe contents:

          <iframe src="/example.html" onload="this.before((this.contentDocument.body||this.contentDocument).children[0]);this.remove()"></iframe>
      • dleeftink 21 hours ago

        I've come across this technique here [0] to try it on <object> elements, but sizing is even more difficult there.

        [0]: https://www.filamentgroup.com/lab/html-includes/

      • cratermoon 20 hours ago

        Are we solving the information-centric transclusion problem, or the design-centric asset reuse problem? An iframe is fine for the former but is not geared towards design and layout solutions.

        • lelandfe 20 hours ago

          It kinda sucks for both! Dropping in a box of text that flatly does not resize to fit its contents does not fit the definition of "fine" for me, here.

          You can do some really silly maneuvers with `window.postMessage` to communicate an expected size between the parent and frame on resize, but that's expensive and fiddly.
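
          Roughly what that maneuver looks like, as a sketch. The message shape ({ type: 'frame-height', height }) and the origin are things I made up for the example, and it assumes parent and frame share that origin.

          ```javascript
          // Sketch only: message shape and origin are assumptions for illustration.
          const TRUSTED_ORIGIN = 'https://example.com';

          // Parent side: resize whichever iframe's window sent the message.
          function handleHeightMessage(event, iframes) {
            if (event.origin !== TRUSTED_ORIGIN) return false; // ignore strangers
            const data = event.data;
            if (!data || data.type !== 'frame-height' || !Number.isFinite(data.height)) return false;
            const frame = iframes.find((el) => el.contentWindow === event.source);
            if (!frame) return false;
            frame.style.height = `${Math.ceil(data.height)}px`;
            return true;
          }

          if (typeof window !== 'undefined') {
            if (window.parent !== window) {
              // Child side: report our height whenever the layout changes.
              const report = () => window.parent.postMessage(
                { type: 'frame-height', height: document.documentElement.scrollHeight },
                TRUSTED_ORIGIN);
              new ResizeObserver(report).observe(document.documentElement);
            } else {
              window.addEventListener('message', (event) =>
                handleHeightMessage(event, [...document.querySelectorAll('iframe')]));
            }
          }
          ```

          The same script has to ship in both documents, and every resize costs a cross-window message plus a layout in the parent, which is the "expensive and fiddly" part.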

    • webstrand 21 hours ago

      Iframes fundamentally encapsulate html documents, not fragments.

    • rendaw 21 hours ago

      Interaction between elements in different iframes is very restricted.

      • danans 21 hours ago

        IIRC, you can communicate entire JSON objects between an iframe and its host frame with postMessage.

        The host can then act as a server for the iframe client, even updating its state or DOM in response to a message from the iframe.

  • nsonha 9 hours ago

    the web platform is the tech stack version of the human concept of "failing upward". It sucks but will only get more and more vital in the modern tech scene as time goes by.

  • tiku 16 hours ago

    iFrames have a src and include other html.. We used to make sites with it way back.

  • Devasta 14 hours ago

    Honest answer: because any serious efforts to improve HTML died 20 years ago, and the web as it's envisaged today is not an infinite library of the world's knowledge but instead a JavaScript-based ad platform.

    Asking for things that the W3C had specced out in 2006 for XML tech is just not reasonable if it doesn't facilitate clicks.

  • bufferoverflow 21 hours ago

    iframe is html

  • rs186 21 hours ago

    I guess for the similar reason that Markdown does not have any "include" ability -- it is a feature not useful enough yet with too many issues to deal with. They are really intended to be used as "single" documents.

    • rs186 an hour ago

      Yeah people downvote me but can't bother to leave a comment.

      If you disagree, and you think you are in the right, you probably have a somewhat good argument you can use in a reply.

      The fact that you don't means my explanation makes sense.

  • nico 21 hours ago

    FTA

    > We’ve got <iframe>, which technically is a pure HTML solution, but

    And then on the following paragraph..

    > But none of the solutions is HTML

    > None of these are a straightforward HTML tag

    Not sure what the point is. Maybe just complaining

    • ComplexSystems 21 hours ago

      <iframe> is different from what the author is asking for, it has its own DOM and etc. He wants something like an SSI but client side. He explains some of the problems right after the part you cut off above

      "We’ve got <iframe>, which technically is a pure HTML solution, but they are bad for overall performance, accessibility, and generally extremely awkward here"

    • pjc50 21 hours ago

      Iframe is stuck in a rectangular box. It's not really suitable for things like site wide headers, footers and menus.

      • vlovich123 21 hours ago

        While I get your point, headers and footers and menus tend to all live within rectangular boxes.

        • zamadatix 21 hours ago

          Headers and their menus are often problematic for this approach, unless they are 100% static (e.g. HN would work but Reddit and Google wouldn't since they both put things in their header which can expand over the content). I.e. you can make it transparent but that doesn't solve eating the interactions. The code needed to work around that is more than just using JS to do the imports.

        • jefftk 17 hours ago

          Headers and footers, yes. Menus generally need to expand when you interact with them, especially on mobile.

        • teg4n_ 17 hours ago

          Do a drop down list of links on a header in an iframe

  • typedef_struct 17 hours ago

    You could start with something like this:

        // Defines <html-import href="..."> and swaps it for the fetched markup.
        customElements.define('html-import', class extends HTMLElement {
            connectedCallback() {
                const href = this.getAttribute('href')
                if (!href) return
                const xhr = new XMLHttpRequest()
                xhr.open('GET', href)
                // Ask the browser to parse the response as an HTML document.
                xhr.responseType = 'document'
                xhr.addEventListener('load', () => {
                    const doc = xhr.response
                    if (!doc) return // not parseable as HTML; leave the element alone
                    // Insert the body's children, not the <body> element itself.
                    this.replaceWith(...(doc.body ?? doc.documentElement).childNodes)
                })
                xhr.send()
            }
        })
    • johnfn 17 hours ago

      Well, that's not "HTML alone".

      • _benton 9 hours ago

        Could you use an object tag for this?

    • _benton 10 hours ago

      In one line

      customElements.define("html-include", class extends HTMLElement { connectedCallback() { fetch(this.getAttribute("href")).then(x => x.text()).then(x => this.outerHTML = x) } })