Solving the Mystery of ARM7TDMI Multiply Carry Flag

(bmchtech.github.io)

43 points | by skrrtww 6 hours ago

17 comments

  • userbinator 6 hours ago

    > And just to get this out of the way, the carry flag’s behavior after multiplication isn’t an important detail to emulate at all. Software doesn’t rely on it.

    On hardware as fixed as a game console, and with the accompanying anti-piracy/anti-cheating/anti-emulation efforts of that industry, I'd expect it to be. From the history of emulating previous consoles, we know that any deterministic difference can and will be exploited, either to determine whether the hardware is authentic, or incidentally as a result of unintentional bugs.

    This reminds me of the Z80, where two undefined flags resisted analysis for several decades; a 2-year-old set of slides on the state of that is here: https://archive.fosdem.org/2022/schedule/event/z80/attachmen...

    • f1shy 25 minutes ago

      >> the carry flag’s behavior after multiplication isn’t an important detail to emulate at all. Software doesn’t rely on it.

      Famous last words:

      https://www.hyrumslaw.com/

    • comex an hour ago

      While it’s a bit newer than the GBA, there is at least one Wii game with intentional anti-emulation measures:

      https://tcrf.net/Cars_2_(PlayStation_3,_Xbox_360,_Windows,_W...

    • DevilStuff an hour ago

      In the GBA scene, people didn't actually tend to exploit the carry flag at all. If there was any anti-emulation, it was usually flashcart-related or CPU-timing-related.

  • ujikoluk 3 hours ago

    > Seriously, they decided that the program counter should be a general purpose register. Why???

    Don't really understand this reaction. Why not? Seems to make for a nice regular design that the PC is just another register.

    • DevilStuff an hour ago

      The big issue is that it doesn't really need to be a GPR. You never find yourself using the PC in instructions other than in, say, the occasional add instruction for switch-case jump tables, or pushes / pops. So it ends up wasting instruction space, when you could've had an additional register, or encoded a zero register (which is what AArch64 does nowadays).

      • joosters an hour ago

        But it is (or was originally) used in lots of places, not just jump tables, generally to do relative addressing, for example when you want to refer to data nearby, e.g.

        ADD r0, r15, #200

        LDR r1, [r15, #-100]

        etc

        • DevilStuff an hour ago

          Ah, I miscommunicated. I still think PC can and should be used in places like the operand of an LDR / ADD. It's using it as the output of certain instructions (and allowing it to be used as such) that I take issue with. ARMv4T allowed you to set PC as the output of basically any instruction, allowing you to create cursed instructions like this lol:

          eor pc, pc, pc

          • immibis 28 minutes ago

            Isn't writing to it except by a branch instruction undefined behaviour?

            If you can use it as an operand, it has a register number, so you can use it as a result, unless you special-case one or the other, which ARM didn't do because it was supposed to be simple. They could have ignored it by omitting some write decode circuitry, but why?

            • DevilStuff 18 minutes ago

              It's not really UB, I've seen games do things like this before. Basically, all data processing instructions can now act as branch instructions, simply by having their dest be PC. Bowser's Inside Story on the DS for example liked to use EOR to write to PC, as a form of encrypting their pointers.

              Yeah, I think AArch64 special-cases it? Not too familiar with their encoding or how they achieved it. My guess as to why is that it allows you to use more helpful registers (e.g. a zero register) in data processing instructions.

              I think I can see your point though - from the perspective of ARMv4T's design, which was to be a simple yet effective CPU, making the PC a GPR does its job. Nowadays the standards are different, but I can see why it made sense at the time.
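A minimal sketch of the idea discussed above: a data-processing instruction whose destination is r15 behaves like a branch, so an XOR can both decode an obfuscated pointer and jump to it. The function names, the key, and the addresses are hypothetical and only for illustration; how any particular game lays out its obfuscated pointers is not described in the thread. The sketch also assumes ARM state, where a read of r15 yields the instruction's address + 8, and it assumes the low two bits of a value written to r15 are dropped.

```python
MASK32 = 0xFFFFFFFF
PC = 15  # r15 is the program counter


def read_reg(regs, n, instr_addr):
    """Read register n as seen by the instruction located at instr_addr."""
    # Assumption: ARM state, where reading r15 gives instruction address + 8.
    if n == PC:
        return (instr_addr + 8) & MASK32
    return regs[n]


def eor(regs, rd, rn, rm, instr_addr):
    """Model `EOR rd, rn, rm`; return the address of the next instruction."""
    result = read_reg(regs, rn, instr_addr) ^ read_reg(regs, rm, instr_addr)
    if rd == PC:
        # Writing r15 redirects execution, so the EOR acts as a branch.
        # Assumption: the target is word-aligned by dropping bits [1:0].
        return result & 0xFFFFFFFC
    regs[rd] = result
    return (instr_addr + 4) & MASK32


if __name__ == "__main__":
    KEY = 0x5A5A5A5A          # hypothetical obfuscation key
    TARGET = 0x02001230       # hypothetical word-aligned code address
    regs = [0] * 16
    regs[0] = TARGET ^ KEY    # the pointer as stored, obfuscated
    regs[1] = KEY
    # `eor pc, r0, r1` decodes the pointer and jumps to it in one instruction.
    print(hex(eor(regs, PC, 0, 1, instr_addr=0x02000000)))   # 0x2001230
    # `eor pc, pc, pc` XORs a value with itself, so it jumps to address 0.
    print(hex(eor(regs, PC, PC, PC, instr_addr=0x02000000)))  # 0x0
```

XOR is its own inverse, which is why the same operation serves to hide the pointer and to recover it at the jump.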

    • joosters an hour ago

      Yeah, it was that way for all previous ARM processors too, for exactly that reason. Adding special cases would have increased the transistor count, for no great benefit.

      The only downside was that it exposed internal details of the pipelining IIRC. In the ARM2, a read of the PC would give the current instruction's location + 8, rather than its actual location, because by the time the instruction 'took place' the PC had moved on. So if/when you change the pipelining for future processors, you either make older code break, or have to special case the current behaviour of returning +8.

      Anyway, I don't like their reaction. What they mean is 'this decision makes writing an emulator more tricky' but the author decides that this makes the chip designers stupid. If the author's reaction to problems is 'the chip designers were stupid and wrong, I'll write a blog post insulting them' then the problem is with the author.
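The pipelining detail described above (a read of the PC returning the current instruction's location + 8) can be sketched as follows. This is an illustrative model, not code from any emulator: the function names and the base address are hypothetical, and it assumes 4-byte ARM-state instructions with the fetch stage two instructions ahead of the one executing. The two offsets are taken from the `ADD r0, r15, #200` and `LDR r1, [r15, #-100]` examples earlier in the thread.

```python
MASK32 = 0xFFFFFFFF
INSTR_SIZE = 4      # ARM-state instructions are 4 bytes
STAGES_AHEAD = 2    # assumption: fetch runs two instructions ahead of execute


def pc_as_read(instr_addr):
    """Value an executing instruction sees when it reads r15."""
    return (instr_addr + STAGES_AHEAD * INSTR_SIZE) & MASK32


def pc_relative(instr_addr, offset):
    """Address produced by an operand of the form r15 + offset."""
    return (pc_as_read(instr_addr) + offset) & MASK32


if __name__ == "__main__":
    base = 0x8000  # hypothetical address of the instruction being executed
    print(hex(pc_as_read(base)))         # 0x8008
    print(hex(pc_relative(base, 200)))   # ADD r0, r15, #200     -> 0x80d0
    print(hex(pc_relative(base, -100)))  # LDR r1, [r15, #-100]  -> 0x7fa4
```

An assembler has to fold the same +8 into the offsets it emits, which is why changing the pipeline depth in a later chip would break existing code unless the old read value is preserved.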

      • DevilStuff an hour ago

        Hey, I'm the author, sorry it came off that way, I was really just poking fun. I should've definitely phrased that better!

        But no, I really think that making the program counter a GPR isn't a good design decision - there are pretty good reasons why no modern arches do things that way anymore. I admittedly was originally in the same boat when I first heard of ARMv4T - I thought putting the PC as a GPR was quite clean, but I soon realized it just wastes instruction space, makes branch prediction slightly more complex, and decreases the number of available registers (increasing register pressure), all while providing marginal benefit to the programmer.

        • joosters 40 minutes ago

          I think I misread your tone, sorry.

          It's a good article though, the explanation of how multiplies work is nicely written.

          • DevilStuff 23 minutes ago

            No worries, I'm glad you brought it up so I could amend the article to be more respectful to those that I look up to :)

        • DevilStuff an hour ago

          Anyway, I took the time to rewrite that paragraph a bit to be more respectful. :P Hopefully that'll come off better. The website takes a few minutes to update.

  • skrrtww 6 hours ago

    https://shonumi.github.io/blog/nds_rolling.html

    More context on how this value affects (at least one) DS game: see the post from December 27th, 2019.

    • userbinator 5 hours ago

      I'm not really familiar with ARM Asm, but do you think that's handwritten Asm whose author overlooked the effects of the carry and it just happened to work, or a clever "emulator trap" added by Nintendo's compiler?