> You might ask: how do you know whether or not software has been fuzzed?
zbar has great barcode reading performance! I've seen far newer software that's nowhere near as good in terms of real-world performance.
But it seems the original developer hasn't updated it since 2009 [1] - and fuzz testing only rose to prominence in ~2012 with the rise of tools like afl-fuzz.
I would be absolutely astonished if it had ever been fuzzed.
> Cut out any unnecessary features to limit attack vectors. ZBar by default scans all code types, which means that an attacker can trigger a bug in any of the scanners. If you only need to scan QR codes for instance, then ZBar can be configured to do so in the code
Absolutely sensible, yes.
Not just for security, but also because packages sometimes have extra barcodes. If you're scanning an EAN-13 on a pack of pasta, decoding a QR code for a pasta recipe website is just going to confuse things :)
[1] https://sourceforge.net/projects/zbar/files/zbar/
I've seen the "overzealous barcode scanner" issue happen with some gas station POS systems, to the point where the seasoned cashiers know to cover the QR codes with their fingers before attempting to scan an item.
Sounds like the POS software isn't controlling the reader well, maybe because it wasn't adjusted for this model of reader. Or the reader's firmware could have been misconfigured relative to what that POS setup expects.
Modern reader firmware tends to have multiple modes and many options. Some modes are as simple as "scan whatever you see out of the many formats you support, and spit out the decoded value of something as USB Serial". Or, worse, "...as USB Keyboard".
You can imagine how easy those modes are to integrate with POS software, without implementing the proprietary protocol for that device, and you can also imagine how poorly that can work out.
If you owned a store with a POS setup with flaky reader behavior like this, and were stuck with it, you could try reconfiguring the reader (to, say, disable QR support). This reprogramming can sometimes be done via a documented protocol, via sketchy Windows software, or via... barcode... Careful you don't make it worse.
(Our startup used modern readers (multiple 1D formats, QR, NFC) for a factory station, and had to do a lot of experimenting with different brands and models, to get the behavior and speed we needed. We even managed to brick a reader, just with configuration changes, not flashing firmware.)
I went to a meeting the other day in a building with a touch screen registration system. The woman in front of me was struggling with it. Every time she tapped the register button the system decided that some part of her was a badly formed barcode, printed an error message and exited back to the menu. She eventually got it working by moving to the side until it wanted to take her picture.
Absolutely. I helped with a physical inventory count project using smartphones as the "terminals". The barcode app we used didn't allow us to selectively turn off symbologies. We ended up with a ton of links to recipes, websites, etc in the data.
Reminds me of the Jurassic Park novel where they ask the computer to find 10 velociraptors on the island and it finds 10. And they actually have 20.
Can you really blame the computer tho? That sounds more like a case of PEBCAC, if you ask me...
It's also a common annoyance in grocery store apps.
Kroger, for example, has an app that allows you to scan items to add them to a virtual cart as you shop and avoid scanning them at the register... however the same app is used to read QR codes on in-store coupons, which are "helpfully" placed very close to the price tags with UPC barcodes on them.
If I want to scan one of those coupon QR codes, I need to either start with the camera very close to the QR code or cover the barcode with my finger.
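For situations like the inventory count above, where the scanning app can't turn symbologies off, one workaround is to filter after decoding. This is only a sketch: the `Scan` record and the symbology names are hypothetical and not taken from any particular scanner SDK. It also does nothing for attack surface, because every decoder has already run by the time the filter sees the result; for that, disabling symbologies in the library or reader itself is the better fix.

```python
from typing import Iterable, List, NamedTuple


class Scan(NamedTuple):
    """Hypothetical decoded result: a symbology name plus the decoded text."""
    symbology: str
    data: str


# Assumption: the inventory only cares about retail product codes.
ALLOWED_SYMBOLOGIES = frozenset({"EAN13", "UPCA"})


def keep_allowed(scans: Iterable[Scan],
                 allowed: frozenset = ALLOWED_SYMBOLOGIES) -> List[Scan]:
    """Drop every scan whose symbology is not on the allowlist."""
    return [scan for scan in scans if scan.symbology in allowed]


scans = [
    Scan("EAN13", "4006381333931"),
    Scan("QRCODE", "https://example.com/pasta-recipe"),
]
filtered = keep_allowed(scans)
print(filtered)
```

An allowlist rather than a blocklist means a symbology nobody thought about is dropped by default.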
It appears to have been forked: https://github.com/mchehab/zbar
I once reported a bug to a barcode decoding library: it crashed when the barcode contained a zero byte. They responded that they wouldn't fix it because barcodes aren't supposed to contain zero bytes.
"But it crashed. That's bad. I can't stop people scanning bad barcodes."
Do you by chance remember which library, and which barcode symbology? (barcode library developer here :-)
"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook
> They responded that they wouldn't fix it because barcodes aren't supposed to contain zero bytes.
Sad. What a poor understanding of our field.
The number one rule of them all is: "Never trust (user) input".
A slightly more powerful variation being: "assume all input is malicious until proven otherwise".
I mean: on one hand there are people who fuzz, who test, who think about edge cases, who think about security, who think about uptime, etc. And OTOH you have people saying "such input shouldn't happen". It's just really pathetic.
I think a difference between an application and a library (or module, etc) is that it is ok for the latter to expect sanitized input and be wrapped in try/catch blocks. The world is less finite than code and a module might be deployed in a variety of contexts which might make some checks undesirable.
In computing, the robustness principle is a design guideline for software that states: "be conservative in what you do, be liberal in what you accept from others". It is often reworded as: "be conservative in what you send, be liberal in what you accept". The principle is also known as Postel's law, after Jon Postel, who used the wording in an early specification of TCP.
https://en.wikipedia.org/wiki/Robustness_principle
If that's the case, the library should also have another function or method that can validate the barcode if the application should so choose. The library is the barcode expert, the app is the business logic expert. Expecting every app to now become barcode experts doesn't make sense.
Also, that law gets quoted a lot, and IMO it is a rather large design mistake.
The library also has the best chance to fix and prevent security issues systemically. I have played this game for a while now. Library engineers often want to pass the buck onto users of their tools. That is not good developer or user experience. Also crashing is the opposite of robust.
Malformed data is a fact of life. A parser should gracefully fail when this eventuality happens.
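A minimal sketch of what "gracefully fail" could look like for the zero-byte case discussed above, using EAN-13 as the example symbology. The `ParseResult` type and `parse_ean13` function are made up for illustration and are not any real library's API; only the EAN-13 check digit rule (weights 1 and 3 alternating over the first 12 digits) is standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParseResult:
    """Either a decoded value or an error message, never an exception."""
    value: Optional[str]
    error: Optional[str]


def parse_ean13(payload: bytes) -> ParseResult:
    """Validate a raw EAN-13 payload and report problems instead of crashing."""
    if b"\x00" in payload:
        return ParseResult(None, "payload contains a NUL byte")
    if len(payload) != 13 or not payload.isdigit():
        return ParseResult(None, "expected exactly 13 ASCII digits")
    digits = [byte - ord("0") for byte in payload]
    # Weights alternate 1, 3, 1, 3, ... over the first 12 digits.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    expected = (10 - total % 10) % 10
    if digits[12] != expected:
        return ParseResult(None, "check digit mismatch")
    return ParseResult(payload.decode("ascii"), None)


print(parse_ean13(b"4006381333931"))
print(parse_ean13(b"40063\x0081333931"))
```

The caller gets a value or a reason in both cases, so the application can decide what to do with a bad scan without having to become a barcode expert.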
> Surprisingly, libFuzzer struggled to figure out that input should be of size 1024 and couldn’t start fuzzing.
Is this surprising? Does libFuzzer support Redqueen or laf-intel like AFL++ [0][1] which will pick up on any comparisons (like a comparison to size=1024) and fuzz with the intention of changing that comparison to become true or false (to put it overly simply)?
0: https://github.com/AFLplusplus/AFLplusplus/blob/stable/instr...
1: https://github.com/AFLplusplus/AFLplusplus/blob/stable/instr...
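To make the size-gate problem concrete, here is a toy mutation loop. It is a deliberately naive sketch and not how libFuzzer or AFL++ work internally: there is no coverage feedback and no corpus growth, and the `target` function with its simulated crash is hypothetical. It only illustrates why an exact `len == 1024` check is hard to reach by mutating a tiny seed, and why starting from a right-sized seed gets past the gate.

```python
import random

TARGET_SIZE = 1024  # the size gate mentioned in the quoted article


def target(data: bytes) -> str:
    """Hypothetical parser: rejects wrong-sized input, 'crashes' on one value."""
    if len(data) != TARGET_SIZE:
        return "rejected"
    if data[0] == 0xFF:
        raise ValueError("simulated crash")
    return "parsed"


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random single-byte overwrite, insertion, or deletion."""
    buf = bytearray(data)
    op = rng.choice(("overwrite", "insert", "delete"))
    if op == "overwrite" and buf:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    elif op == "insert":
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif op == "delete" and buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(seed: bytes, iterations: int, rng: random.Random) -> dict:
    """Mutate the seed repeatedly and count how the target reacts."""
    counts = {"rejected": 0, "parsed": 0, "crashed": 0}
    for _ in range(iterations):
        candidate = mutate(seed, rng)  # always from the seed; no corpus growth
        try:
            outcome = target(candidate)
        except ValueError:
            outcome = "crashed"
        counts[outcome] += 1
    return counts


from_empty = fuzz(b"", 300, random.Random(0))
from_sized = fuzz(bytes(TARGET_SIZE), 300, random.Random(0))
print("empty seed:", from_empty)
print("1024-byte seed:", from_sized)
```

From the empty seed every candidate is 0 or 1 bytes long, so nothing ever gets past the length check. From the 1024-byte seed, every overwrite mutation keeps the length and reaches the parser.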
If I wanted to learn more about fuzzing, does anyone have suggestions?
I'd love to get to a point I could fuzz a program but the gulf of execution is vast -- I enjoyed attempting OSCP, but I can't keep paying for lab extensions.
(I also have a gut feeling there's a lot of unfuzzed apps which people don't look at because they're utilitarian and don't use the network much. So if I can phish you, then leverage some innocuous tool for RCE or whatever... useful.)
But I've struggled to find resources on this topic -- anyone know of a book, course, or wiki?
The authors of this blog (FD: my company) have a testing handbook[1], which has a full chapter dedicated to fuzzing[2]. We're always open to feedback on it!
[1]: https://appsec.guide/
[2]: https://appsec.guide/docs/fuzzing/
I'm learning about fuzzing too, and I just wrote a tutorial about what I learned so far.[0]
The issue I found with a lot of fuzzing tutorials is that they're difficult to reproduce because there's a lot of work in setting up the environment and toolchain. In my tutorial, you can kick off fuzzing with one command, but I also walk through how I created the workflow step by step.
[0] https://mtlynch.io/nix-fuzz-testing-1/
I would start with the AFL++ documentation (https://aflplus.plus/features/), and an open source program that you want to fuzz. The easiest programs to fuzz with AFL are ones that parse a file format from the command line, the smaller the better and written in C or C++ (just for ease of recompiling with instrumentation).
Parsing network protocols and ABIs is possible, but usually requires a fair amount of coding.
https://github.com/antonio-morales/Fuzzing101 is a good course.
I'm working with barcode scanners and the difficulties of handling a variety of inputs.
My boss keeps telling me "it's not that difficult". I keep telling him "it's more difficult than you believe".
I don't quite follow the input - does this mean they created Barcodes or Data Codes that crashed the library? I.e. something that I can print out and that might break a few devices if printed on, for example, my luggage before checking it in?
You got it. Crashing the device where the barcode is being interpreted (and possibly getting arbitrary code execution).
Secondarily, there's probably also a rich vein to be mined scanning barcodes like "'); DROP TABLE Item" that would exploit systems further up the chain. That's not what this article is covering (since they're just looking at the barcode scanning library).
There would be some fun in carrying around a bunch of "edge case" barcodes ("programming" barcodes for various kinds of scanners, SQL injection attacks, etc) and feeding them to unsupervised barcode scanners "in the wild" to see what happens.
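On the "further up the chain" point: whatever system stores the scan should treat the decoded text purely as data. A minimal sketch with Python's built-in `sqlite3` module follows; the `scans` table and the `record_scan` helper are hypothetical names chosen for the example.

```python
import sqlite3


def record_scan(conn: sqlite3.Connection, payload: str) -> None:
    """Store a decoded barcode payload using a bound parameter."""
    # The "?" placeholder keeps the payload out of the SQL text entirely.
    conn.execute("INSERT INTO scans (payload) VALUES (?)", (payload,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (id INTEGER PRIMARY KEY, payload TEXT)")

hostile = "'); DROP TABLE scans; --"
record_scan(conn, hostile)

rows = conn.execute("SELECT payload FROM scans").fetchall()
print(rows)
```

The hostile string ends up stored verbatim as a row, and the table is still there afterwards. Building the statement by string concatenation instead is what makes this kind of barcode dangerous.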
My interpretation of the original article is they use the fuzzer to find an arbitrary very small bitmap input which when passed to the library causes it to crash. It’s unclear if the input image is even a valid bitmap image format that would correctly open in an image viewer.
This is definitely still a problem because there might be situations where you’re allowing an end user to pass an image file in and are then passing it unmodified to this library to interpret the barcode in it, but it’s not the same as some special barcode that encodes data that crashes the library.
So for example this blog entry does not describe a situation where you can just print out a barcode and when you scan the barcode then the library crashes or has the opportunity for arbitrary code execution. That would be a very exciting exploit. They don’t actually rule out the possibility, but they didn’t get anywhere near fuzzing at that level in this blog post.
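If you do have to hand end-user images to a native decoding library, note that a try/except in the calling code cannot catch a crash inside native code; the whole process goes down. One common mitigation is to run the decoder in a separate process. The sketch below assumes a stand-in child script that aborts on a NUL byte to simulate such a crash; it does not call ZBar or any real decoder, and `decode_isolated` is a hypothetical helper.

```python
import subprocess
import sys
from typing import Optional

# Stand-in for a native decoder: aborts the process when it sees a NUL byte.
# Raw string so the child script contains the escape sequence, not a real NUL.
CHILD_SCRIPT = r"""
import os, sys
data = sys.stdin.buffer.read()
if b"\x00" in data:
    os.abort()  # simulates a crash inside native code
sys.stdout.write(data.decode("ascii", "replace"))
"""


def decode_isolated(data: bytes, timeout: float = 5.0) -> Optional[str]:
    """Run the stand-in decoder in a child process; None on crash or hang."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CHILD_SCRIPT],
            input=data,
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode != 0:
        return None  # the child crashed; the parent keeps running
    return proc.stdout.decode("ascii", "replace")


print(decode_isolated(b"4006381333931"))
print(decode_isolated(b"bad\x00input"))
```

This contains a crash or a hang. It does not contain code execution inside the child, which would need real sandboxing on top.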
Crashing the library - and potential arbitrary code execution!
However, zbar isn't used all that widely in industry. The airport's baggage handling system is much more likely to have a self-contained scanner from Cognex or Omron or Zebra running proprietary, closed-source software.
Only slightly related, but on the topic of barcodes and security I'd like to recommend this excellent talk by Felix Lindner. It is quite a few years old, but I'd guess stuff like barcode scanners are not the most frequently updated things:
Toying with barcodes - https://www.youtube.com/watch?v=QCtdEYnlykA
Kind of sad to see that the library "custodian", as it were, is seemingly uninterested in fixing the software in question. This may not affect most commercial scanners, but the fact that it is even out there in the wild is a bit disconcerting to say the least. Just another "brick in the wall" insofar as supply-chain (in)security goes....
There could be any number of reasons for that apart from negligence. AFAIK it’s a single person, so "bus factor" comes to mind.
See, I've never tried to do barcode decoding in software via images - I've always used an imager with internal decoding.