Whenever someone argues the uselessness or redundancy of a particular word, a helpful framework to understand their perspective is "Lumpers vs Splitters" : https://en.wikipedia.org/wiki/Lumpers_and_splitters
An extreme caricature of a "lumper" would just use the word "computer" to label all Turing-complete devices with logic gates. In that mindset, having a bunch of different words like "mainframe", "pc", "smartphone", "game console", "FPGA", etc. is redundant because they're all "computers", which makes the various other words pointless.
On the other hand, the Splitters focus on the differences and I previously commented why "transpiler" keeps being used even though it's "redundant" for the Lumpers : https://news.ycombinator.com/item?id=28602355
We're all Lumpers or Splitters to different degrees for different topics. A casual music listener who thinks of orchestral music as background sounds for the elevator would "lump" both Mozart and Bach together as "classical music". But an enthusiast would get irritated and argue "Bach is not classical music, it's Baroque music. Mozart is classical music."
The latest example of this I saw was someone complaining about the word "embedding" used in LLMs. They were asking ... if an embedding is a vector, why didn't they just re-use the word "vector"?!? Why is there an extra different word?!? Lumpers-vs-splitters.
Transpilers are compilers that translate from one programming language to another. I am not 100% sure where these "lies" come from, but it's literally in the name: it's clearly a portmanteau of "translating compiler"... Where exactly are people thinking the "-piler" suffix comes from?
Yes, I know. You could argue that a C compiler is a transpiler, because assembly language is generally considered a programming language. If this is you, you have discovered that there are sometimes concepts that are not easy to rigorously define but are easy for people to understand. This is not a rare phenomenon. For me, the difference is that a transpiler is intending to target a programming language that will be later compiled by another compiler, and not just an assembler. But, it is ultimately true that this definition is still likely not 100% rigorous, nor is it likely going to have 100% consensus. Yet, people somehow know a transpiler when they see one. The word will continue to be used because it ultimately serves a useful purpose in communication.
IMO: Transpilers are compilers, but not all compilers are transpilers.
In my book, transpilers are compilers that consume a programming language and target human-readable code, to be consumed by another compiler or interpreter (either by itself, or to be integrated in other projects).
E.g. the TypeScript compiler is a transpiler from TS to JS, the Nim compiler is a transpiler from Nim to C, and so on.
I guess if you really want to be pedantic, one can argue (with the above definition) that `clang -S` might be seen as a transpiler from C to ASM, but at that point, do words mean anything to you?
I'd probably say that "transpiler" is not a very useful word with that definition.
Why is it useless? 'Compiler' denotes the general category, within which exist various sub-categories:
For example, a 'native compiler' outputs machine code for the host system, a 'cross compiler' outputs machine code for a different system, a 'bytecode compiler' outputs a custom binary format (e.g. VM instructions), and a 'transpiler' outputs source code. These distinctions are meaningful.
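To make the sub-categories above concrete, here is a minimal sketch that sorts a compiler into one of those four buckets based only on what it outputs. The names `OutputKind` and `classify` are made up for illustration, and the rule is only as rigorous as the informal definitions in the comment.

```python
from enum import Enum


class OutputKind(Enum):
    """What a compiler emits (hypothetical categories for illustration)."""
    MACHINE_CODE = "machine code"
    BYTECODE = "bytecode"
    SOURCE_CODE = "source code"


def classify(output: OutputKind, target_is_host: bool = True) -> str:
    """Name the sub-category of compiler, judged purely by its output."""
    if output is OutputKind.SOURCE_CODE:
        return "transpiler"
    if output is OutputKind.BYTECODE:
        return "bytecode compiler"
    # Machine code: native if it targets the host system, cross otherwise.
    return "native compiler" if target_is_host else "cross compiler"


print(classify(OutputKind.SOURCE_CODE))                          # transpiler
print(classify(OutputKind.MACHINE_CODE, target_is_host=False))   # cross compiler
```

Note that the classification never looks at how the compiler is built, only at what comes out of it, which is the sense in which these are sub-categories of 'compiler'.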
I can’t see why — I do think that the word does convey some sort of useful meaning with the above definition.
> Compilers already do things that “transpilers” are supposed to do. And they do it better because they are built on the foundation of language semantics instead of syntactic manipulation.
So you do know the difference.
It would be good if we had a term that didn't confuse linking with translation. In English compiling means joining together many parts, after all.
"Transpiler" is no less well-defined a term than "compiler".
The definition of compiler I learned was “takes some code, translates it to semantically equivalent code in a different language (which might be machine language, bytecode…)”. This is also used in PLaI, a respected learning resource: https://www.plai.org/
I think this is a pretty acceptable definition, and yes, it does make the term transpiler a little useless.
What I would add to your definition, to make a distinction from the common usage of compilation, is that the target language is on an approximately equivalent level of abstraction to the source. So, for example, Rust -> machine code is not transpilation, but Rust -> C++ is.
I think this is how the word is commonly understood, and it’s not useless (even if there’s no absolute standard of when it does or does not apply).
Edit: sorry, realise I should have read the article before commenting. The article calls out my definition as one of their ‘lies’. I guess I just disagree with the article. Words can be useful even without a 100% watertight definition. They’re for communication as well as classification.
One of the problems is that you might not use the target language at the equivalent level of abstraction. For example, C is a popular target language, but the C emitted may be very unidiomatic and nothing like human-consumable code. It's not all that unusual for a language to compile all code to one big C function where the function calls in the source language become jumps, which is a way to get around the limitations of the C calling conventions and stack.
The same thing applies to compilation to Javascript: the resulting code may use only a tiny subset of the language.
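For anyone who hasn't seen output like that, here is a toy model of its shape. It's written by hand in Python rather than generated C, so treat it as a sketch under assumptions: the `label` variable stands in for `goto`, the list `stack` stands in for the compiler-managed stack, and the function name is made up. It computes a recursive factorial without ever making a host-level function call.

```python
def factorial_one_big_function(n: int) -> int:
    """Recursive factorial flattened into one function with jumps."""
    # Each frame is (label to jump to on return, saved argument).
    stack = [("halt", 0)]
    arg = n
    result = 0
    saved = 0
    label = "fact_entry"
    while True:
        if label == "fact_entry":
            if arg <= 1:
                result = 1
                label, saved = stack.pop()   # "return": jump to the saved label
            else:
                # "call fact(arg - 1)": save state, then jump back to the entry
                stack.append(("fact_after_call", arg))
                arg = arg - 1
                label = "fact_entry"
        elif label == "fact_after_call":
            result = saved * result          # continue after the "call"
            label, saved = stack.pop()
        elif label == "halt":
            return result


print(factorial_one_big_function(5))  # 120
```

Because the "calls" are just pushes onto a list, the depth is limited by memory rather than by the host's call stack, which is the kind of limitation the comment is talking about. It's also a decent illustration of why nobody would call this style human-readable.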
I don't like the word transpiler, because there is nothing useful about the distinction (unless you count people using it to denigrate compilers that don't target traditional machine code).
I could see the case of using it as a name when the transformation is reversible, like you could probably turn Javascript back into Coffeescript.
What value does the word have? When I'm writing a compiler, it doesn't matter whether I target C or asm, or Javascript, as my output language. I'll still write it the same way.
It gives you a better idea what a thing does?
To me, it doesn't. If someone says "tsc is a transpiler", it gives me nothing actionable. If you do say "it transpiles to JS", then I've got something, but that could just be "compiles to JS". It doesn't really tell me how the thing is constructed either.
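The claim a few comments up, that the compiler gets written the same way whatever the output language is, can be sketched with a tiny expression language. This is only an illustration under assumptions: the AST classes, `emit_js`, `emit_stack_code`, and the `PUSH`/`ADD`/`MUL` instruction names are all invented here and don't come from any real tool.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Num:
    value: int


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


@dataclass
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add, Mul]


def emit_js(e: Expr) -> str:
    """Back end 1: emit JavaScript source text (the 'transpiler' case)."""
    if isinstance(e, Num):
        return str(e.value)
    op = "+" if isinstance(e, Add) else "*"
    return f"({emit_js(e.left)} {op} {emit_js(e.right)})"


def emit_stack_code(e: Expr) -> List[str]:
    """Back end 2: emit instructions for a made-up stack machine."""
    if isinstance(e, Num):
        return [f"PUSH {e.value}"]
    op = "ADD" if isinstance(e, Add) else "MUL"
    return emit_stack_code(e.left) + emit_stack_code(e.right) + [op]


# The same tree feeds both back ends.
tree = Add(Num(1), Mul(Num(2), Num(3)))
print(emit_js(tree))          # (1 + (2 * 3))
print(emit_stack_code(tree))  # ['PUSH 1', 'PUSH 2', 'PUSH 3', 'MUL', 'ADD']
```

Both back ends are the same recursive walk over the same tree; only the strings differ. Whether that makes the word "transpiler" uninformative, or just means it describes the output rather than the construction, is exactly what the thread disagrees about.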