This raises the question: what information did Amazon Q ingest to be able to write C64 BASIC, and from where – OCR'd books and magazines off Google Books? Online tutorials? Knowing that would tell us whether this workflow could be adapted to other relatively obscure platforms whose limited documentation is certainly not available online in easily parsable HTML: e.g. PDP-11 assembly, Turbo Pascal, classic Macintosh/Macintosh Toolbox, etc.
Who knows, it might be a shot in the arm for retrocomputing enthusiasts.
> classic Macintosh/Macintosh Toolbox
This one's probably pretty well covered, actually. All of Apple's Inside Macintosh documentation is available in PDF format, and there's plenty of old programming books and magazines which have been scanned and OCRed.
I don't think I've met a retrocomputing enthusiast yet who has a positive opinion of generative AI.
Makes sense to me. Part/most of the appeal of coding 6502 or z80 for a retro platform is just how deterministic and predictable they are down to the clock cycle. AI is the opposite.
Yeah, it lowers the barrier to getting restarted. It did for me, at least.
I've noticed that LLMs have a hard time remembering all the constraints of 8-bit programming. Sometimes they assume that 6502 registers can hold a value above 255, or that ints in C have 32 or 64 bits.
Also, if you have an array living in the zero page and you use Zeropage,X or Zeropage,Y addressing to try to access beyond address $FF, it will wrap back to $00, because a zero-page instruction can't reach outside the zero page.
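To make the wraparound concrete, here is a minimal Python sketch of the address arithmetic. It is not an emulator: the function names are made up for illustration, and `add_8bit` ignores the incoming carry that a real ADC would include.

```python
def zp_indexed_address(base: int, index: int) -> int:
    """Effective address for a Zeropage,X or Zeropage,Y access.

    The sum is truncated to 8 bits, so the access never leaves page zero.
    """
    return (base + index) & 0xFF


def abs_indexed_address(base: int, index: int) -> int:
    """Effective address for an Absolute,X or Absolute,Y access (16-bit)."""
    return (base + index) & 0xFFFF


def add_8bit(a: int, b: int) -> tuple[int, int]:
    """Add two values as an 8-bit register would; return (result, carry)."""
    total = a + b
    return total & 0xFF, 1 if total > 0xFF else 0


if __name__ == "__main__":
    # Zero-page base $F0 indexed by $20 wraps around inside page zero.
    print(hex(zp_indexed_address(0xF0, 0x20)))
    # The same base and index with absolute addressing crosses into page one.
    print(hex(abs_indexed_address(0x00F0, 0x20)))
    # 200 + 100 does not fit in 8 bits, so the register wraps and carry is set.
    print(add_8bit(200, 100))
```

Masking with `0xFF` is the whole trick: any code an LLM generates that treats a register or a zero-page offset as an unbounded integer will disagree with this model as soon as a sum passes 255.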
$101? $104? This looks familiar. If you try to train an LLM on a series of lottery ticket numbers, say 1-50, and then ask for a set of numbers based on the training data, LLMs will sometimes give you numbers like 110 or 101.
Author of the post here. Just reading the comments, so apologies for getting some of the terminology wrong. The intention was never to mislead folk; I just wanted to share my enthusiasm for emulation and the fact that you could get working code.
Lovely post. Thanks for sharing.
Thank you!
Interesting article, but the title is a bit off. "Assembler" actually refers to the tool that converts assembly language into machine code, not the language itself. So "writing 6502 assembler" would technically mean writing the assembler software, not writing assembly code for the 6502 processor.
It's a small distinction, but surprising to see this mix-up, as assembly language enthusiasts tend to be sticklers for these details!
As someone who's been writing assembly language for decades going back to Univacs, most people used "assembler" as shorthand for "assembly language" or even "assembler language". It's usually quite unambiguous.
Here are examples of such usage, from 1967 and more recent:
http://www.bitsavers.org/pdf/ibm/360/asm/C28-6514-5_IBM_Syst...
https://www.ibm.com/docs/en/zos-basic-skills?topic=zos-assem...
Seems like they always start with "the assembler language," which I take as "the language of the assembler," and then they sometimes get sloppy after that. I've never heard someone say "assembler language" (or maybe I just tuned it out.)
It's definitely a thing that happens. I believe it comes from the phrase "assembler language" that IBM manuals used instead of "assembly".
I agree; I read the headline as meaning a 6502 assembler was written, as opposed to 6502 assembly being written.
Writing compilers gets difficult, quickly. But writing assemblers is/was common enough for simple architectures and it's quite fun and relatively easy to prove/test for correctness. At least compared to most compilers.
One of my first gigs out of school involved writing a Z80 assembler because I "needed" nonstandard (Sharp/DMG/8080) instructions to be handled during a codebase port. It was enjoyable! I recommend everyone write an 8-bit assembler at least once!
Still, TFA is very interesting and I appreciate OP's share. :-)
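For anyone curious how small a toy assembler can be, here is a rough Python sketch for a handful of 6502 instructions. It is only an illustration: it has no labels, no directives, and supports just the immediate, absolute, and implied forms listed in the table. The function names and the two-pattern operand syntax are assumptions made for the example.

```python
import re

# (mnemonic, addressing mode) -> opcode byte, for a deliberately tiny subset.
OPCODES = {
    ("LDA", "imm"): 0xA9,
    ("LDX", "imm"): 0xA2,
    ("STA", "abs"): 0x8D,
    ("JMP", "abs"): 0x4C,
    ("INX", "imp"): 0xE8,
    ("NOP", "imp"): 0xEA,
    ("RTS", "imp"): 0x60,
}


def assemble_line(line: str) -> bytes:
    """Assemble one source line; text after ';' is treated as a comment."""
    text = line.split(";", 1)[0].strip()
    if not text:
        return b""
    parts = text.split()
    mnemonic = parts[0].upper()
    operand = parts[1] if len(parts) > 1 else ""
    if not operand:
        return bytes([OPCODES[(mnemonic, "imp")]])
    match = re.fullmatch(r"#\$([0-9A-Fa-f]{2})", operand)
    if match:
        return bytes([OPCODES[(mnemonic, "imm")], int(match.group(1), 16)])
    match = re.fullmatch(r"\$([0-9A-Fa-f]{4})", operand)
    if match:
        addr = int(match.group(1), 16)
        # 16-bit operands are stored low byte first.
        return bytes([OPCODES[(mnemonic, "abs")], addr & 0xFF, addr >> 8])
    raise ValueError(f"unsupported operand: {operand!r}")


def assemble(source: str) -> bytes:
    """Assemble a multi-line program into machine code bytes."""
    return b"".join(assemble_line(line) for line in source.splitlines())


if __name__ == "__main__":
    program = "LDA #$01\nSTA $D020 ; store A\nRTS"
    print(assemble(program).hex(" "))
```

Because the output is just a byte string, checking correctness mostly means comparing it against known-good encodings, which is a large part of why assemblers are so much easier to test than compilers.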
I wrote my own 6809 assembler [1] largely just because, and I ended up adding a 6809 emulator to it, so I can run tests during the assembly phase. I'm only aware of one other assembler (a 6502 one) that can do this. It's a fun project.
[1] https://github.com/spc476/a09
There was the time I wrote a “data-assembler” for the AVR8 because I was packing up rather complicated data structures to represent parts of graphics that appear on a persistence of vision display and how the parts are assembled into images. If you look at any assembler you see there are a lot of facilities for constructing data as well as for constructing code and I took one approach to that problem, outputting C code for an array that the system can read out of the (comparatively large) ROM.
In my experience, this is a common enough usage variation that I'm not sure how helpful it is to treat it as an error. In particular, "assembler language" seems to have been IBM's preferred phrasing at one point.
I had the same reaction. I first thought, well it's not that difficult to write an assembler for a simple 8-bit instruction set, but then upon reading the article it looks like it's instead writing programs in assembly language. Totally different.
I made a living writing assembler as early as 1986. We used assembler and assembly language interchangeably, even though it’s true that the actual executable software that parses and generates code is called an assembler.
I don't know anyone who doesn't say "I wrote X assembler" with complete understanding by all involved, and I definitely don't know anyone so pedantic they said "acksually, it's 'I wrote X assembly code'". I guess none of the dozens of assembly code makers or whatever I've known over the last 40 years was enough of a stickler. Or cared one way or another.
I also understood the title to mean writing an assembler rather than writing assembly language code, and I've never heard anyone refer to writing assembly as writing assembler (or heard anyone who writes assembly referred to as an "assembly code maker", nor anyone who writes in any language referred to as an "<language> code maker").
I could imagine such phrasing being used by non-native English speakers, of whom I have no doubt there are a significant number.
My (unresearched) guess is that this is simply different dialects of speakers emerging with respect to informal references over the decades.
It seems to be pretty common even among native English speakers to use "writing assembler" and "writing assembly" interchangeably. If one were writing the tool that assembles to machine code, you'd say "writing an assembler".
Normally I'd agree (as a native English speaker with about twenty years of writing assembly under my belt), but the title tripped me up, too. I figured they forgot an "an" or an "s" at the end of "assembler", and I was surprised to find that no 6502 assembler was produced. It could be because I've written three different assemblers over the last two years, though, so maybe I was just projecting my own interests.
Still, I don't really care what it's called.
I actually prefer "writing assembly", and also think "writing assembler" is a bit confusing. But it feels like I'm several decades too late to complain about it ;)
Using “assembler” instead of “assembly” was common enough back in the day that there was no confusion. There were 100x more people writing “assembler” than writing actual “assemblers” so you know, the odds were good.
> I've never heard anyone refer to writing assembly as writing assembler
I used 'assembler' back in high school, when I was learning about the 80x86. I remember because I was 'corrected' by a fellow student who had never touched assembler, assembly language, machine code mnemonics, or whatever you want to call it.
I have no idea where I got the terminology, but I was reading a lot of books and Usenet posts on the subject at the time. I'm a native English speaker, for what it's worth.
> I used 'assembler' back in high school, when I was learning about the 80x86. I remember because I was 'corrected' by a fellow student who had never touched assembler, assembly language, machine code mnemonics, or whatever you want to call it.
Wow, you got the Hacker News experience 30 years in advance!
And if you write an assembler in C#, you could have an assembler in an assembly.
Other mistakes everyone makes: Referring to an instruction as an Opcode.
Reminds me a lot of BASIC I wrote back in the day, particularly the code for the bouncing sprite.
Seriously though it makes me think of how hit-or-miss Microsoft Copilot is at writing code (we have a special license to use it at work.)
For certain things, such as writing short bash, CMD.EXE and PowerShell scripts, it does great. It writes great list comprehensions in Python. It can convert code defining a set of typed dicts to a set of dataclasses. It can write a SQL query using an obscure (to me) feature and then rewrite it in jOOQ.
But write a CTE expression in jOOQ? It doesn't understand how to break the circularity.
Configure Vite? It will insist on the same wrong answers ceaselessly. On the other hand, if you look at Stack Overflow the answer seems to be "you can't get here from there" or "there is this plugin that might help if it worked but it doesn't."
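To give a sense of the kind of mechanical task mentioned above that tends to go well, here is a small Python sketch of a typed dict rewritten as a dataclass, with a list comprehension doing the conversion. The `SpriteDict` and `Sprite` names and their fields are invented for illustration and are not from any real codebase.

```python
from dataclasses import dataclass
from typing import TypedDict


class SpriteDict(TypedDict):
    """Hypothetical typed dict as it might exist before the conversion."""
    x: int
    y: int
    color: int


@dataclass
class Sprite:
    """The same shape expressed as a dataclass."""
    x: int
    y: int
    color: int


def sprites_from_dicts(rows: list[SpriteDict]) -> list[Sprite]:
    """Convert typed dicts to dataclass instances with a list comprehension."""
    return [Sprite(**row) for row in rows]


if __name__ == "__main__":
    rows: list[SpriteDict] = [{"x": 1, "y": 2, "color": 3}]
    print(sprites_from_dicts(rows))
```

The transformation is purely structural, with one field per key and the same types, which is plausibly why it is an easier job for a code assistant than something like a recursive CTE.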
Where does the future come into this? This title seems like a stretch.
Shout out to Ben Eater for enabling me to understand this article. Everything I know about Assembly and the 6502, I learned from his videos.
https://eater.net/
Same. After finishing my breadboard computer, I followed along with Nand to Tetris: https://www.nand2tetris.org/
Combined, hardware is finally approachable down to the gate level, even when modern systems add many layers of abstraction.
Omg these are amazing. I have not seen these resources before.
Which LLM is Amazon Q based on?
It's an internal model, as far as I remember. It used to be called "Olympus". As for its public status, I have no clue.
https://www.pymnts.com/artificial-intelligence-2/2023/amazon...
Edit: link