The more and more bit-hacks I see, the more convinced I am that they should be handled by a cleverer compiler. Some intrinsic(s) where you say “do x, y, to these bits” and it just figures out the optimum way of doing it for whatever platform you compile for.
That's a really cool side. Sad that it doesn't allow duplicate indices.
I tried implementing the same thing but with support for duplicates and it was definitely a humbling exercise.
I'm curious if there is any formal mathematical theory for this (shifted-off bits would probably make it less elegant than pure rotation)
that page + the book Hacker's Delight are a big part of the reason I am so convinced of this :)
If compilers/languages/standard libraries provided these bit permutations, and it was just something ~everyone had learned, it would be a lot easier to work with bits without needing to come up with the bitwise ops (or use that generator). In addition it would probably make better use of the hardware: sure, people like to pretend that we’re still programming C for PDP11, but modern hardware supports more operations than C has operators for (RISC-V B extension and co have the right idea <3)
Modern compilers are probably pretty good? but I doubt they are perfect at turning code like that in OP into the best instructions.
It is probably a bit late for C/C++ though. maaaybe possible to get it into LLVM and Rust.
That's more like the opposite, OP wants to tell the compiler what transform to do and have it figure out a solution, but in your example you give the compiler a solution and it works backwards to figure out the intent and rewrite it to a completely different solution.
Most languages do have intrinsics for ROL/ROR at least, which you should generally use instead of relying on optimizer magic. I've certainly run into cases where those magic patterns don't get optimized (looking at you MSVC) but intrinsics always work.
Not per file, per block of 16 pixels. They are transferred quite often - BCn is a texture compression scheme for realtime use in GPUs. Those textures are stored and processed in compressed form.
Nice bit twiddling hack! There are also the PDEP and PEXT instructions on x86 that could be used for this purpose, but it's possibly not worth it here.
But is easily generalized to inserting k bits:
The more and more bit-hacks I see, the more convinced I am that they should be handled by a cleverer compiler. Some intrinsic(s) where you say “do x, y, to these bits” and it just figures out the optimum way of doing it for whatever platform you compile for.
This tool is handy if you want a specific bit shuffle:
https://programming.sirrida.de/calcperm.php
It doesn't always find the optimal solution but it usually gets close.
That's a really cool side. Sad that it doesn't allow duplicate indices. I tried implementing the same thing but with support for duplicates and it was definitely a humbling exercise. I'm curious if there is any formal mathematical theory for this (shifted-off bits would probably make it less elegant than pure rotation)
that page + the book Hacker's Delight are a big part of the reason I am so convinced of this :)
If compilers/languages/standard libraries provided these bit permutations, and it was just something ~everyone had learned, it would be a lot easier to work with bits without needing to come up with the bitwise ops (or use that generator). In addition it would probably make better use of the hardware: sure, people like to pretend that we’re still programming C for PDP11, but modern hardware supports more operations than C has operators for (RISC-V B extension and co have the right idea <3)
Modern compilers are probably pretty good? but I doubt they are perfect at turning code like that in OP into the best instructions. It is probably a bit late for C/C++ though. maaaybe possible to get it into LLVM and Rust.
Well it is already the case, for example things like ROR or ROL are generated by optimizers[1] from certain patterns of bitops.
[1]: https://godbolt.org/z/eE1cx9GT4
That's more like the opposite, OP wants to tell the compiler what transform to do and have it figure out a solution, but in your example you give the compiler a solution and it works backwards to figure out the intent and rewrite it to a completely different solution.
Most languages do have intrinsics for ROL/ROR at least, which you should generally use instead of relying on optimizer magic. I've certainly run into cases where those magic patterns don't get optimized (looking at you MSVC) but intrinsics always work.
> there’s only ever between 1 and 3 anchor bits
How often are these texturers transfered and how small are they?
Doing all this to avoid sending at most three bits for a texture file sounds like a colossal waste of human life and of money.
Not per file, per block of 16 pixels. They are transferred quite often - BCn is a texture compression scheme for realtime use in GPUs. Those textures are stored and processed in compressed form.
https://en.wikipedia.org/wiki/S3_Texture_Compression
Nice bit twiddling hack! There are also the PDEP and PEXT instructions on x86 that could be used for this purpose, but it's possibly not worth it here.
On arm you can use https://developer.arm.com/documentation/dui0489/i/arm-and-th...
which is pretty convenient for this use-case.
Am I missing something or does that instruction only clear bits in the field and not shift the adjacent bits to contract the field to be empty?