If you don't want a hard coded table, you could always do this.
unsigned kCastagnoli[256];
void InitializeCrc32(unsigned table[256], unsigned polynomial) {
unsigned d, i, r;
for (d = 0; d < 256; ++d) {
r = d;
for (i = 0; i < 8; ++i)
r = r >> 1 ^ (r & 1 ? polynomial : 0);
table[d] = r;
}
}
unsigned Castagnoli(unsigned h, unsigned long w, long n) {
long i;
static int once;
if (!once) {
InitializeCrc32(kCastagnoli, 0x82f63b78);
once = 1;
}
for (i = 0; i < n; ++i) {
h = h >> 8 ^ kCastagnoli[(h & 255) ^ (w & 255)];
w >>= 8;
}
return h;
}
That does the same thing as the crc32 instructions in x86.
I’d probably just go ahead on the original implementation and this and use uint32_t. The submitters because you might as well be more cache friendly on 64 bit systems, and here unsigned only guarantees 16 bits. People still use AVRs and the like where sizeof unsigned == 2. Granted, you’d probably use a 16 bit table there, but the portability costs nothing and gains clarity.
This version has the caveat that it is technically not thread safe.
Fair enough (though a Unisys 2200 might be a stronger example, people still use them, but neither support C99 AFAIK). But by “break” here, it's only fair to point out, if such a compiler even existed, it would have the property of definitely failing with an error at compile time for uint32_t not existing, it’s not going to compile and overflow at runtime. (And yes you will practically get a warning with the pasted code, and shame on anyone for not using -Werror, but pedantically this isn't even guaranteed sadly enough). They are completely different classes of portability defects and only one is a correctness issue.
Setting aside whether there is even a C99 compiler for anything Symbolics, or anything where uint32_t doesn’t exist (already assuming C99 or above here). Is there?
Meanwhile there are a few existing supported gcc architectures where sizeof int == 2.
Pick what level of pragmatism you prefer I guess. I prefer avoiding the standard integer types unless I can comfortably fit in their guaranteed sizes. I don’t really care about obsolete machines that will never have a C99 compiler outside a novelty.
Not sure why you're so triggered. I didn't even say you were wrong or anything. I originally thought the more pressing issue is filling half a table up with 0s. It's some idle commentary on a piece of code in a comment, relax?
this simply matches the `crc_reflected()` in section 18 of "A Painless Guide to CRC Error Detection Algorithms" [1] with hardcoded init and final xor's to match one specific CRC-32 spec. nothing wrong with that, but it isn't original at all either.
i recommend the "painless guide" for anyone constructing CRC algorithms in software. it breaks down the entire algorithm, including various trade-offs and choices, as well as for different polynomials and other parameters.
then you also have the catalogs of parameters [2] and [3]
OT: I copied the error detection code that AcuRite uses in their temperature sensors for use in my rain gauge. (I got the code by taking a look at the decoder for AcuRite temperature sensors that is included in rtl_433).
At the time I thought it was a CRC, but later I realized that it isn't. I've been trying to find out what the name is for that kind of code but have failed. I wonder if anyone here happens to know?
Here is the code:
byte check_code(byte const message[], unsigned nbytes, byte gen, byte key)
{
byte sum = 0;
for (unsigned k = 0; k < nbytes; ++k) {
byte data = message[k];
for (int i = 7; i >= 0; --i) {
if ((data >> i) & 1)
sum ^= key;
if (key & 1)
key = (key >> 1) ^ gen;
else
key = (key >> 1);
}
}
return sum;
}
I thought it was a CRC because I saw it shifting and saw it xoring when it shifted out a 1. That's quite reminiscent of the method of CRC computation that treats bit strings as representing polynomials in GF(2) and computes the CRC by doing polynomial division essentially the same way we would do it by hand.
But when computing a CRC that way what you xor the data with is a constant (the representation of the generator polynomial). In the algorithm above the constant is not xor'ed with the data. In fact nothing is xor'ed with the data.
The constant, named 'gen' which is another reason I thought at first it was a CRC, is actually the feedback polynomial for a Galois linear feedback shift register (LFSR), which is seeded with the 'key' parameter. That LFSR generates one byte for every bit of the message, and the bytes that are generated for the message bits that are set are xor'ed together, and it is the result of that that is the check code.
In higher level terms the general approach it is using can be described like this in a Pythonish pseudocode, assuming PRNG() is a psuedorandom byte generator, and seed_PRNG() is a function to seed that generator:
def check_code(message, key):
sum = 0
seed_PRNG(key)
foreach bit in message:
next_random = PRNG()
if bit == 1:
sum ^= next_random
return sum
Pick a Galois LSFR with feedback polynomial 'gen' as your PRNG and that matches the AcuRite code.
A normal table-based crc32. Since we don't see the generator, we cannot see the polynom. You usually put the generator in the code
If you don't want a hard coded table, you could always do this.
That does the same thing as the crc32 instructions in x86.I’d probably just go ahead on the original implementation and this and use uint32_t. The submitters because you might as well be more cache friendly on 64 bit systems, and here unsigned only guarantees 16 bits. People still use AVRs and the like where sizeof unsigned == 2. Granted, you’d probably use a 16 bit table there, but the portability costs nothing and gains clarity.
This version has the caveat that it is technically not thread safe.
What if you've got a Symbolics 3600 which has 36 bit words?
Your suggestion will break on that.
We clearly want uint_least32_t.
Fair enough (though a Unisys 2200 might be a stronger example, people still use them, but neither support C99 AFAIK). But by “break” here, it's only fair to point out, if such a compiler even existed, it would have the property of definitely failing with an error at compile time for uint32_t not existing, it’s not going to compile and overflow at runtime. (And yes you will practically get a warning with the pasted code, and shame on anyone for not using -Werror, but pedantically this isn't even guaranteed sadly enough). They are completely different classes of portability defects and only one is a correctness issue.
Setting aside whether there is even a C99 compiler for anything Symbolics, or anything where uint32_t doesn’t exist (already assuming C99 or above here). Is there? Meanwhile there are a few existing supported gcc architectures where sizeof int == 2.
Pick what level of pragmatism you prefer I guess. I prefer avoiding the standard integer types unless I can comfortably fit in their guaranteed sizes. I don’t really care about obsolete machines that will never have a C99 compiler outside a novelty.
> I don’t really care about obsolete machines that will never have a C99 compiler outside a novelty.
And you know what? I don't care about people who use C on 16-bit computers, unless I'm being paid to.
I've written plenty of assembly they can consume, like https://justine.lol/sectorlisp2/
Not sure why you're so triggered. I didn't even say you were wrong or anything. I originally thought the more pressing issue is filling half a table up with 0s. It's some idle commentary on a piece of code in a comment, relax?
I suggest you write a few test cases to verify that it works reliably. I have my doubts about that, given this code fragment:
I replaced that code with
Would this be better?How would you suggest I replace that? I am a C novice but I have heard of the buffer overflows caused by scanf.
FTR, an implementation in “low-level” JavaScript:
From https://GitHub.com/PaulCapron/pwa2uwp/blob/master/src/zip.jsthis simply matches the `crc_reflected()` in section 18 of "A Painless Guide to CRC Error Detection Algorithms" [1] with hardcoded init and final xor's to match one specific CRC-32 spec. nothing wrong with that, but it isn't original at all either.
i recommend the "painless guide" for anyone constructing CRC algorithms in software. it breaks down the entire algorithm, including various trade-offs and choices, as well as for different polynomials and other parameters.
then you also have the catalogs of parameters [2] and [3]
[1] https://www.zlib.net/crc_v3.txt [2] https://reveng.sourceforge.io/crc-catalogue/ [3] https://users.ece.cmu.edu/~koopman/crc/
OT: I copied the error detection code that AcuRite uses in their temperature sensors for use in my rain gauge. (I got the code by taking a look at the decoder for AcuRite temperature sensors that is included in rtl_433).
At the time I thought it was a CRC, but later I realized that it isn't. I've been trying to find out what the name is for that kind of code but have failed. I wonder if anyone here happens to know?
Here is the code:
I thought it was a CRC because I saw it shifting and saw it xoring when it shifted out a 1. That's quite reminiscent of the method of CRC computation that treats bit strings as representing polynomials in GF(2) and computes the CRC by doing polynomial division essentially the same way we would do it by hand.But when computing a CRC that way what you xor the data with is a constant (the representation of the generator polynomial). In the algorithm above the constant is not xor'ed with the data. In fact nothing is xor'ed with the data.
The constant, named 'gen' which is another reason I thought at first it was a CRC, is actually the feedback polynomial for a Galois linear feedback shift register (LFSR), which is seeded with the 'key' parameter. That LFSR generates one byte for every bit of the message, and the bytes that are generated for the message bits that are set are xor'ed together, and it is the result of that that is the check code.
In higher level terms the general approach it is using can be described like this in a Pythonish pseudocode, assuming PRNG() is a psuedorandom byte generator, and seed_PRNG() is a function to seed that generator:
Pick a Galois LSFR with feedback polynomial 'gen' as your PRNG and that matches the AcuRite code.So there are many different flavors of crc32. Usually you can always change the initialization value. This algorithm is only one use case.
Check this table out https://www.crccalc.com/?crc=123456789&method=&datatype=0&ou...
I would really appreciate someone documenting exactly which variant of CRC32 is being used here.
CRC-32/ISO-HDLC
https://www.crccalc.com/?crc=123456789&method=CRC-32%2FISO-H...