I've got to imagine it's frustrating for people who rely on their phone cameras for daily work to find out that upgrading a phone necessarily means relearning its foibles and adjusting how you shoot to accommodate it. Granted, I mostly take smartphone photos in situations where I'd rather not be neurotic about the result (candids, memories, reminders, etc.), but surely there are professionals out there who can speak to this.
The cameras themselves might not, but in order to get a decent picture you will need to apply demosaicing and gamma correction in software at the very least, even with high end cameras.
Right, and the point ppl are making upthread is that deterministic signal processing and probabilistic reconstruction approaches are apples and oranges.
"The Neural network Image Processing features in this camera are arguably even more important here than they are in the R5 Mark II. A combination of deep learning and algorithmic AI is used to power In-Camera Upscaling, which transforms the pedestrian-resolution 24.2MP images into pixel-packed 96MP photos – immediately outclassing every full-frame camera on the market, and effectively hitting GFX and Hasselblad territory.
"On top of that is High ISO Noise Reduction, which uses AI to denoise images by 2 stops. It works wonders when you're pushing those higher ISOs, which are already way cleaner than you'd expect thanks to the flagship image sensor and modest pixel count."
The paper says that reconstructing an actual image from the raw data produced by the sensor takes ~58ms of computation, so doing it for 10,000 sensors would naively take around ten minutes, though I'm sure there's room for optimization and parallelization.
The sensors produce 720x720px images, so a 100x100 array of them would produce 72,000x72,000px images, or ~5 gigapixels. That's a lot of pixels for a smartphone to push around and process and store.
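For anyone who wants to check the arithmetic in the two paragraphs above, here is a rough sketch. The 58 ms and 720x720 figures are the ones quoted above; the 100x100 grid, the strictly serial processing, and the no-overlap tiling are assumptions made purely for illustration.

```python
# Back-of-the-envelope numbers for a hypothetical 100x100 array of nano-cameras.
PER_SENSOR_MS = 58      # reconstruction time per sensor, as quoted above
SENSOR_SIDE_PX = 720    # each sensor yields a 720x720 image, as quoted above
ARRAY_SIDE = 100        # hypothetical 100x100 grid of sensors


def serial_reconstruction_minutes(n_sensors: int, per_sensor_ms: float) -> float:
    """Total time if every sensor is reconstructed one after another."""
    return n_sensors * per_sensor_ms / 1000 / 60


def mosaic_gigapixels(array_side: int, sensor_side_px: int) -> float:
    """Pixel count of the tiled image, assuming no overlap between sensors."""
    side = array_side * sensor_side_px
    return side * side / 1e9


minutes = serial_reconstruction_minutes(ARRAY_SIDE ** 2, PER_SENSOR_MS)
gigapixels = mosaic_gigapixels(ARRAY_SIDE, SENSOR_SIDE_PX)
print(f"{minutes:.1f} min, {gigapixels:.2f} gigapixels")  # 9.7 min, 5.18 gigapixels
```

Since each sensor's reconstruction is independent of the others, the ten minutes is an upper bound that parallel hardware would cut down.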
Sensor size is super important for the resulting quality; that's why pros still lug around huge full-frame (even if mirrorless) cameras instead of running around with phones. There are other reasons, e.g. speed for sports, but let's keep it simple (speed is also affected by the amount of data processed, which goes back to resolution).
Plus, higher-resolution sensors have this nasty habit of producing files that are too large: processing them slows devices down compared to smaller, crisper photos, and they take up much more space, even more so for video. That's probably why Apple stuck with a 12 MP main camera for so long, even though 200 MP sensors were available if they had wanted one.
The watchers would be able to blackmail/control anybody who engages in private activities that they don't want to be public. So who watches the watchers? And who watches them? No. Privacy is 100% required in a free society.
Unless you're in your own home, I think it's basically a guarantee at this point that you're being recorded. Could be CCTV, trail cameras, some random recording a TikTok, etc...
That will only hold while being watched is rare. See Clarke and Baxter's Light of Other Days for an examination of the consequences of ubiquitous surveillance.
The rayban metaglasshole comes to mind. Now it's just journalists who fool people in the street with AI face recognition tricks, and it's all still fun and games. But this is clearly a horror invention, merrily introduced by jolly zuck, boss of facelook.
Grains of rice are pretty big, and the images they demonstrate are NOT that impressive. There are cameras you can BUY right now whose size is 1x1x2 mm (smaller than a grain of rice) which produce images that compare. Here is one example: https://www.digikey.com/en/products/detail/ams-osram-usa-inc...
Maybe I'm being too skeptical, and certainly I am only a layman in this field, but the amount of ANN-based post-processing it takes to produce the final image seems to cast suspicion on the meaning of the result.
At what point do you reduce the signal to the equivalent of an LLM prompt, with most of the resulting image being explained by the training data?
Yeah, I know that modern phone cameras are also heavily post-processed, but the hardware is at least producing a reasonable optical image to begin with. There's some correspondence between input and output; at least they're comparable.
I've seen someone on this site comment to the effect that if they could use a tool like dall-e to generate a picture of "their dog" that looked better than a photo they could take themselves, they would happily take it over a photo.
The future is going to become difficult for people who find value in creative activities, beyond just a raw audio/visual/textual signal at the output. I think most people who really care about a creative medium would say there's some kind of value in the process and the human intentionality that creates works, both for the creator who engages in it and the audience who is aware of it.
In my opinion most AI creative tools don't actually benefit serious creators, they just provide a competitive edge for companies to sell new products and enable more dilettantes to enter the scene and flood us with mediocrity
It is possible to create realistic images and videos with AI, making anyone do anything. Whether a photo or video is real or not will soon be impossible to distinguish, and it won't matter to those who want to cause harm
We've always had this sort of stuff. Back in the 70s you had cameras the size of lighters. There's solutions for anyone determined enough. Even with authoritarian states, you'll find counter measures with sufficient demand. It's reed in the wind shit. Hopefully we won't kill ourselves in the process.
Can we agree that in the field of cameras we surpassed science fiction?
I can remember watching a TV series as a child where a time traveler went back to the 80s and some person told him that everything is about miniaturization. Then he pointed to a little pin on the time traveler's jacket, which was actually a camera, and said: "This little pin for example could one day hold a full video camera", which seemed a bit ridiculous at that time.
It's interesting how they mention beneficial impacts on medicine and science in general, but everyone knows that the first applications will likely be military and surveillance.
And since it's AI improved, all of this will hurt people because of hallucinations.
I don't trust humans to avoid taking shortcuts once the tech is available; it's too convenient to have "information" for so cheap, and less costly to silence the occasional scandal.
This kind of thing -- that humans can do today with current technology -- is why if an ET intelligence that could travel interstellar distances wanted to observe us we would never know unless they wanted us to know.
Their probes could be the size of sand grains, maybe even dust. Maybe not quite sophons, but not much better as far as our odds of finding anything. I suppose there would have to be something larger to receive signals from these things and send them back (because physics), but that could be hanging out somewhere we'd be unlikely to see it.
Yet another Fermi paradox answer: we are looking for big spacecraft when the universe is full of smart dust.
> produce full-color images that are equal in quality to those produced by conventional cameras
I was really skeptical of this since the article conveniently doesn't include any photos taken by the nano-camera, but there are examples [1] in the original paper that are pretty impressive.
[1] https://www.nature.com/articles/s41467-021-26443-0/figures/2
Those images are certainly impressive, but I certainly don't agree with the statement "equal in quality to those produced by conventional cameras": they're quite obviously lacking in sharpness and color.
conventional ultra thin lens cameras are mostly endoscopes, so it's up against this: https://www.endoscopy-campus.com/wp-content/uploads/Neuroend...
Just curious, what am I looking at here?
my education is on the imaging side not the medical side but I believe this: https://www.mayoclinic.org/diseases-conditions/neuroendocrin... + this: https://emedicine.medscape.com/article/176036-overview?form=... - looks like it was shot with this: https://vet-trade.eu/enteroscope/218-olympus-enteroscope-sif...
There's one of those Taboola type ads going around with a similar image that suggests it is a close up of belly fat. Given the source and their propensity for using images unrelated to topic, so not sure if that's what it really is.
Tiny cameras will always be limited in aperture, so low light and depth of field will be a challenge.
I wonder how they took pictures with four different cameras from the exact same position at the exact same point in time. Maybe the chameleon was staying very still, and maybe the flowers were indoors and that's why they didn't move in the breeze, and they used a special rock-solid mount that kept all three cameras perfectly aligned with microscopic precision. Or maybe these aren't genuine demonstrations, just mock-ups, and they didn't even really have a chameleon.
They didn't really have a chameleon. See "Experimental setup" in the linked paper [emphasis mine]:
> After fabrication of the meta-optic, we account for fabrication error by performing a PSF calibration step. This is accomplished by using an optical relay system to image a pinhole illuminated by fiber-coupled LEDs. We then conduct imaging experiments by replacing the pinhole with an OLED monitor. The OLED monitor is used to display images that will be captured by our nano-optic imager.
But shooting a real chameleon is irrelevant to what they're trying to demonstrate here.
At the scales they're working at here ("nano-optics"), there's no travel distance for chromatic distortion to take place within the lens. Therefore, whether they're shooting a 3D scene (a chameleon) or a 2D scene (an OLED monitor showing a picture of a chameleon), the light that makes it through their tiny lens to hit the sensor is going to be the same.
(That's the intuitive explanation, at least; the technical explanation is a bit stranger, as the lens is sub-wavelength – and shaped into structures that act as antennae for specific light frequencies. You might say that all the lens is doing is chromatic distortion — but in a very controlled manner, "funnelling" each frequency of inbound light to a specific part of the sensor, somewhat like a MIMO antenna "funnels" each frequency-band of signal to a specific ADC+DSP. Which amounts to the same thing: this lens doesn't "see" any difference between 3D scenes and 2D images of those scenes.)
Given the size of their camera, you could glue it to the center of another camera’s lens with relatively insignificant effect on the larger camera’s performance.
Camera rigs exist for this exact reason.
what happens when you go too far from trusting what you see/read/hear on the internet? simple logic gets tossed out like a baby in the bathwater.
now, here's the rig I'd love to see with this: take a hundred of them and position them like a bug's eye to see what could be done with that. there'd be so much overlapping coverage that 3D would be possible, yet the parallax would be so small that it makes me wonder how much depth would be discernible
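The "how much depth would be discernible" question can be roughed out with the standard two-view stereo relation Z = f·B/d, which gives a depth uncertainty of about Z²·Δd/(f·B) for a disparity error Δd. A minimal sketch follows. The focal length in pixels, the 5 mm spacing, and the one-pixel matching error are all made-up values for illustration, not anything from the paper.

```python
def depth_error_m(distance_m: float, focal_px: float, baseline_m: float,
                  disparity_error_px: float = 1.0) -> float:
    """First-order depth uncertainty of a two-view stereo pair.

    From Z = f * B / d it follows that |dZ| ~= Z**2 * |dd| / (f * B).
    """
    return distance_m ** 2 * disparity_error_px / (focal_px * baseline_m)


FOCAL_PX = 500.0  # hypothetical focal length expressed in pixels

for label, baseline_m in [("nano-cameras 5 mm apart", 0.005),
                          ("human-eye spacing, 65 mm", 0.065)]:
    for distance_m in (0.1, 1.0, 10.0):
        err = depth_error_m(distance_m, FOCAL_PX, baseline_m)
        print(f"{label}: at {distance_m} m, depth error ~ {err:.3g} m")
```

Under these assumptions the error grows with the square of the distance, so a millimetre-scale baseline would mostly be useful for close-up subjects unless the matching is done to well below a pixel.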
Also interesting: the paper is from 2021.
> Ultrathin meta-optics utilize subwavelength nano-antennas to modulate incident light with greater design freedom and space-bandwidth product over conventional diffractive optical elements (DOEs).
Is this basically a visible-wavelength beamsteering phased array?
Yup. It's also passive. The nanostructures act like delay lines.
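To make the "delay lines" picture concrete: a textbook flat lens applies a radius-dependent phase delay so that light from every point of the aperture arrives at the focus in step. The sketch below computes that ideal hyperbolic profile. It is the generic textbook formula, not necessarily the profile the paper uses, and the 1 mm focal length and sample radii are hypothetical.

```python
import math


def focusing_phase(radius_m: float, wavelength_m: float, focal_m: float) -> float:
    """Phase delay in radians (wrapped to one 2*pi cycle) that an ideal flat
    lens applies at a given radius so a plane wave converges at the focus."""
    phase = 2 * math.pi / wavelength_m * (focal_m - math.hypot(radius_m, focal_m))
    return phase % (2 * math.pi)


FOCAL_M = 1e-3  # hypothetical 1 mm focal length

for name, wavelength_m in [("blue", 450e-9), ("green", 550e-9), ("red", 650e-9)]:
    phases = [focusing_phase(r * 1e-6, wavelength_m, FOCAL_M) for r in (0, 50, 100, 150)]
    print(name, [f"{p:.2f}" for p in phases])
```

The required delay at a given radius differs per wavelength, which is one way to see why a fixed set of nanostructures produces color-dependent aberrations that then have to be corrected downstream.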
Interesting. This idea appears pretty much exactly at the end of Bob Shaw's 1972 SFnal collection Other Days, Other Eyes. The starting premise is the invention of "slow glass" that looks like an irrelevant gimmick but ends up revolutionizing all sorts of things, and the final bits envisage a disturbing surveillance society with these tiny passive cameras spread everywhere.
It's a good read; I don't think the extrapolation of one technical advance has ever been done better.
How does this work? If it's just reconstructing the images with nn, a la Samsung pasting a picture of the moon when it detected a white disc on the image, it's not very impressive.
I had the same thought, but it sounds like this operates at a much lower level than that kind of thing:
> Then, a physics-based neural network was used to process the images captured by the meta-optics camera. Because the neural network was trained on metasurface physics, it can remove aberrations produced by the camera.
I'd like to see some examples showing how it does when taking a picture of completely random fractal noise. That should show it's not just trained to reconstruct known image patterns.
Generally it's probably wise to be skeptical of anything that appears to get around the diffraction limit.
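For a sense of scale, the usual rule of thumb is the Rayleigh criterion, θ ≈ 1.22·λ/D for a circular aperture of diameter D. A quick sketch; both aperture sizes are assumed values picked for comparison, not measurements from the paper.

```python
import math


def rayleigh_limit_rad(wavelength_m: float, aperture_m: float) -> float:
    """Smallest resolvable angle for a circular aperture (Rayleigh criterion)."""
    return 1.22 * wavelength_m / aperture_m


GREEN_M = 550e-9  # mid-visible wavelength

for label, aperture_m in [("0.5 mm aperture (hypothetical nano-camera)", 0.5e-3),
                          ("5 mm aperture (hypothetical phone-sized lens)", 5e-3)]:
    theta = rayleigh_limit_rad(GREEN_M, aperture_m)
    arcmin = math.degrees(theta) * 60
    print(f"{label}: {theta:.2e} rad ({arcmin:.2f} arcmin)")
```

A ten times smaller aperture gives a ten times coarser angular limit, so any detail finer than that in the output has to come from the reconstruction rather than the optics.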
I believe the claim is that the NN is trained to reconstruct pixels, not images. As in so many areas, the diffraction limit is probabilistic so combining information from multiple overlapping samples and NNs trained on known diffracted -> accurate pairs may well recover information.
You’re right that it might fail on noise with resolution fine enough to break assumptions from the NN training set. But that’s not a super common application for cameras, and traditional cameras have their own limitations.
Not saying we shouldn’t be skeptical, just that there is a plausible mechanism here.
we've had very good chromatic aberration correction since I got a degree in imaging technology and that was over 20 years ago so I'd imagine it's not particularly difficult for name your flavour of ML.
My concern would be that if it can't produce accurate results on a random noise test, then how do we trust that it actually produces accurate results (as opposed to merely plausible results) on normal images?
Multilevel fractal noise specifically would give an indication of how fine you can go.
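A test target along these lines is easy to generate. The sketch below sums random grids at doubling resolution, halving the amplitude at each level. It is blocky value noise rather than Perlin or simplex noise, and the function name, image size, and octave count are arbitrary choices for illustration.

```python
import random


def fractal_noise(size: int, octaves: int, seed: int = 0) -> list[list[float]]:
    """Multilevel noise image with values in [0, 1].

    Each octave is a random grid with twice the resolution and half the
    amplitude of the previous one, looked up per pixel without interpolation.
    """
    rng = random.Random(seed)
    image = [[0.0] * size for _ in range(size)]
    amplitude, total = 1.0, 0.0
    for octave in range(octaves):
        cells = 2 ** (octave + 1)  # grid resolution for this octave
        grid = [[rng.random() for _ in range(cells)] for _ in range(cells)]
        for y in range(size):
            for x in range(size):
                image[y][x] += amplitude * grid[y * cells // size][x * cells // size]
        total += amplitude
        amplitude /= 2
    # Normalize by the summed amplitudes so the result stays within [0, 1].
    return [[value / total for value in row] for row in image]


target = fractal_noise(size=64, octaves=5, seed=1)
print(len(target), len(target[0]), min(map(min, target)), max(map(max, target)))
```

Displaying such a target and comparing the reconstruction octave by octave would show at which scale the recovered detail stops tracking the input.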
Years ago I saw an interview with a futurist that mentioned the following:
"One day, your kids will go to the toy store and get a sheet of stickers. Each sticker is actually a camera with an IPv6 address. That means they can put a sticker somewhere, go and point a browser at that address and see a live camera feed.
I should point out: all of the technology to do this already exists, it just hasn't gotten cheap enough to mass market. When economies of scale do kick in, society is going to have to deal with a dramatic change in what they think 'physical privacy' means."
Maybe it's possible, but I can't seem to think of an energy-harvesting method that would fit that system without direct sunlight.
I'm very skeptical this technology already exists. Maybe if you vastly change the meaning of "sticker"
"PCB-with-onboard-battery-and-adhesive-backing-icker"
Wow.
Given the tiny dimensions, and wide field, adding regular lenses over an array could create extreme wide field, like 160x160 degrees, for everyday phone cameras. Or very small 360x180 degree stand-alone cameras. AR glasses with a few cameras could operate with 360x160 degrees and be extremely situationally aware!
Another application would be small light field cameras. I don't know enough to judge if this is directly applicable, or adaptable to that. But it would be wonderful to finally have small cheap light field cameras. Both for post-focus adjustment and (better than stereo) 3D image sensing and scene reconstruction.
Chalk another one up for Vernor Vinge. This tech seems like it could directly enable the “ubiquitous surveillance” from _A Deepness in the Sky_. Definitely something to watch closely.
Also the scatterable surveillance cameras used in his other great novel, 'The Peace War' [0]. Although IIRC they were the size of seeds or similar.
[0] https://en.wikipedia.org/wiki/The_Peace_War
3 or 4 mm in diameter, according to a scene in chapter 6, big enough to have similar resolution to that of a human eye, according to Paul, but able to look in any direction without physically rotating.
In chapter 13 the enemy describes them as using Fourier optics, though that seemed to be their speculation - not sure whether it was right.
I've been interested in smart dust for a while; recently the news seems to have dried up, and while that may have been other stuff taking up all the attention (and investment money), I suspect that many R&D teams went under government NDAs because they are now good enough to be interesting.
I wonder if someone tried to build a localizer how small they could actually be made?
PS It's "Vernor"
The other side to the localizers is the communication / mesh networking, and the extremely effective security partitioning. Even Anne couldn't crack them! It's certainly a lot to package in such a small form
Thanks, I typed that on my phone and it "fixed" it for me without me noticing.
Or Rudy Rucker’s Postsingular, where the “orphidnet” utility fog enables universal perception/visualization.
I haven't read A Deepness in the Sky but it's interesting how wrong a lot of sci-fi got this. Cameras are always considerably bigger than grains of sand
Well, Deepness is set a few thousand years in the future, so we've got some time to work on it.
Everyone here is thinking about privacy and surveillance and here I am wondering if this is what lets us speed up nano cameras to relativistic speeds with lasers to image other solar systems up close.
Thank you!
It's been a while since I've heard anyone talk about the Starshot project[0]. Maybe this would help revitalize it.
Also even without aiming for Proxima Centauri, it would be great to have more cameras in our own planetary system.
--
[0] - https://en.wikipedia.org/wiki/Breakthrough_Starshot
Gilster writes about it every few months
https://www.centauri-dreams.org/2024/01/19/data-return-from-...
we would also need a transmitter of equivalent size to send those images back. also an energy source
Just do round trip!
We'll need even bigger[1] breakthroughs in propulsion if it's going to be self-propelling itself back to Sol at relativistic speeds.
1. A "simpler" sci-fi solution for a 1-way trip that's still out of our reach is a large light sail and a huge Earth-based laser, but this requires "smaller" breakthroughs in material science
Well if you can propel something forward you can propel it backwards as well.
I'm assuming some sort of fixed laser type propulsion mechanism would leverage a type of solar sail technology. Maybe you could send a phased laser signal that "vibrates" a solar sail towards the source of energy instead of away.
> Well if you can propel something forward you can propel it backwards as well
Not necessarily - at least with currently known science. Light sails work OK transferring momentum from photons, allowing positive acceleration from a giant laser on Earth. A return trip requires a giant laser on the other side.
As well as a way around Newton's Third Law.
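The momentum argument can be put in numbers. An absorbed beam of power P pushes with force P/c and a perfectly reflected one with 2P/c, always directed away from the source. A rough sketch, where the 100 GW beam and 1 g craft are hypothetical values and the whole beam is assumed to hit the sail head-on:

```python
C_M_PER_S = 299_792_458  # speed of light


def sail_force_newtons(laser_power_w: float, reflectivity: float = 1.0) -> float:
    """Radiation-pressure force on a sail facing the beam.

    Absorbed photons transfer P/c; reflected photons transfer twice that.
    The force points along the beam, away from the laser.
    """
    return (1.0 + reflectivity) * laser_power_w / C_M_PER_S


LASER_POWER_W = 100e9  # hypothetical 100 GW beam
CRAFT_MASS_KG = 1e-3   # hypothetical 1 g sail plus camera

force = sail_force_newtons(LASER_POWER_W)
acceleration = force / CRAFT_MASS_KG
print(f"force ~ {force:.0f} N, acceleration ~ {acceleration:.2e} m/s^2")
```

Nothing in this expression can be tuned to flip the sign of the force, which is why braking or returning needs a second light source or some other momentum exchange at the far end.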
I meant to say the "simpler" (but still very complicated) solar sail approach was for a one-way trip. On paper, our civilization can muster the energy required to accelerate tiny masses to relativistic speeds. A return trip at those speeds would require a new type of science to concentrate that amount of energy in a small mass and use it for controlled propulsion.
> The meta-optics camera is the first device of its kind to produce full-color images that are equal in quality to those produced by conventional cameras, which are an order of magnitude larger. In fact, the meta-optics camera is 500,000 times smaller than conventional cameras that capture the same level of image quality.
That would make them 6 orders of magnitude larger.
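Quick check of that figure: the number of orders of magnitude is just the base-10 logarithm of the ratio. The 500,000 comes from the quoted article, which does not say whether it refers to length, area, or volume.

```python
import math

size_ratio = 500_000  # "500,000 times smaller", from the quoted article
orders = math.log10(size_ratio)
print(f"{orders:.2f}")  # 5.70
```

So a bit under six orders of magnitude, and either way nowhere near "an order of magnitude".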
> as well as implementing unique AI-powered image post-processing to create high-quality images from the camera.
They're not comparable, in the intuitive sense, to conventional cameras.
Are they not? Every modern camera does the same thing. Upscaling, denoising, deblurring, adjusting colors, bumping and dropping shadows and highlights, pretty much no aspect of the picture is the way the sensor sees it once the rest of the pipeline is done. Phone cameras do this to a more extreme degree than say pro cameras, but they all do it.
To point out the obvious, film cameras don't, nor do many digital cameras. Unless you mean modern in the sense of "cameras you can buy from best buy right now", of course. But that isn't very interesting: best buy has terrible taste in cameras.
There are a lot of steps like that, provided you want an image to show to the user (i.e. JPEG). You do have to somehow merge the 3 Bayer filter detections on a rectangular grid, which involves interpolation. You do have to subtract some sort of bias in a detector, and possibly correct for different sensitivity across the detector. You have to map the raw 'electron counts' onto the JPEG scale, which involves another set of decisions/image processing steps.
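To make those steps concrete, here is a rough sketch in plain Python of bias subtraction, sensitivity correction, mapping to an 8-bit scale, and one bilinear demosaicing step. It is only an illustration: the function names and the calibration values (bias, gain, white_level) are made up, and the sRGB curve stands in for whatever tone mapping a real camera uses.

```python
def srgb_encode(linear: float) -> float:
    """Apply the sRGB transfer function to a linear value in [0, 1]."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1 / 2.4) - 0.055

def raw_to_8bit(raw_count: float, bias: float, gain: float, white_level: float) -> int:
    """Map one raw sensor count to an 8-bit display value.

    bias, gain and white_level are hypothetical per-pixel calibration values.
    """
    corrected = (raw_count - bias) * gain                 # remove bias, fix sensitivity
    linear = min(max(corrected / white_level, 0.0), 1.0)  # normalise and clip to [0, 1]
    return round(srgb_encode(linear) * 255)

def green_at(bayer, row: int, col: int) -> float:
    """Bilinear green estimate at a red or blue site: mean of the 4 green neighbours."""
    return (bayer[row - 1][col] + bayer[row + 1][col]
            + bayer[row][col - 1] + bayer[row][col + 1]) / 4

# 3x3 patch of an RGGB mosaic; the centre is a blue site surrounded by greens.
patch = [[5, 10, 5],
         [20, 7, 30],
         [5, 40, 5]]
print(green_at(patch, 1, 1))  # prints 25.0
```

Every one of these steps is a fixed formula applied to measured values, which is the distinction the replies below draw against learned reconstruction.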
There is clear processing in terms of interpreting the raw sensor data as you're describing. Then there are blurrier processes still, like "denoising" and "upscaling", which straddle the line between bias-correction and alteration. Then there's modification of actual color and luminance as the parent was describing. Now we're seeing full alterations applied automatically with neural nets, literally altering shapes and shadows and natural lighting phenomena.
I think it's useful to distinguish all of these even if they are desired. I really love my iPhone camera, but there's something deeply unsettling about how it alters the photos. It's fundamentally producing a different image than you can get with either film or through your eyes. Naturally this is true for all digital sensors but we once could point out specifically how and why the resulting image differs from what our eyes see. It's no longer easy to even enumerate the possible alterations that go on via software, let alone control many of them, and I think there will be backlash at some point (or stated differently, a market for cameras that allow controlling this).
I've got to imagine it's frustrating for people who rely on their phone cameras for daily work to find out that upgrading a phone necessarily means relearning its foibles and adjusting how you shoot to accommodate it. Granted, I mostly take smartphone photos in situations where i'd rather not be neurotic about the result (candids, memories, reminders, etc) but surely there are professionals out there who can speak to this.
Huh, I like your comment. It's such a nice way of pointing out someone equating marketability to quality.
Pro cameras do not do this to any degree.
Edit: by default.
The cameras themselves might not, but in order to get a decent picture you will need to apply demosaicing and gamma correction in software at the very least, even with high end cameras.
Right, and the point ppl are making upthread is that deterministic signal processing and probabilistic reconstruction approaches are apples and oranges.
It's trivial to make most AI implementations deterministic; just use a constant RNG seed.
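A toy illustration of the constant-seed point, using only Python's standard library. The "model" here is a made-up stand-in that adds Gaussian noise, not anything from the paper:

```python
import random

def noisy_reconstruction(pixels, seed):
    """Toy stand-in for a stochastic model: adds seeded Gaussian noise."""
    rng = random.Random(seed)  # private generator, leaves global state alone
    return [p + rng.gauss(0.0, 1.0) for p in pixels]

pixels = [10.0, 20.0, 30.0]
first = noisy_reconstruction(pixels, seed=42)
second = noisy_reconstruction(pixels, seed=42)
print(first == second)  # prints True: same seed, same output
```

Note that this only makes the output repeatable. It does not change the fact that the result is drawn from a learned or random process rather than computed by a fixed formula, and on GPUs some parallel floating-point operations can still vary between runs unless deterministic modes are enabled.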
"AI-powered image post-processing" is only done in smartphones I believe.
Not anymore. DSLR makers are already using AI (in-camera neural network processing) for things like upscaling and noise removal. https://www.digitalcameraworld.com/reviews/canon-eos-r1-revi...
"The Neural network Image Processing features in this camera are arguably even more important here than they are in the R5 Mark II. A combination of deep learning and algorithmic AI is used to power In-Camera Upscaling, which transforms the pedestrian-resolution 24.2MP images into pixel-packed 96MP photos – immediately outclassing every full-frame camera on the market, and effectively hitting GFX and Hasselblad territory.
"On top of that is High ISO Noise Reduction, which uses AI to denoise images by 2 stops. It works wonders when you're pushing those higher ISOs, which are already way cleaner than you'd expect thanks to the flagship image sensor and modest pixel count."
All kinds of exciting implications for small cameras and lens assemblies in VR/AR
How would someone detect sensors so small?
How would someone excrete an array of these cameras if ingested?
If you eat something the size of a grain of salt that isn't digestible, excreting it poses no problem.
you could detect the supporting electronics with a nonlinear junction detector but they are not cheap
(2021) The real story: https://light.princeton.edu/publication/neural-nano-optics/
If that’s true, maybe it would allow you to put a 10,000 camera array (100x100) on a smartphone, and do interesting things with computational imaging?
Some rough numbers:
The paper says that reconstructing an actual image from the raw data produced by the sensor takes ~58ms of computation, so doing it for 10,000 sensors would naively take around ten minutes, though I'm sure there's room for optimization and parallelization.
The sensors produce 720x720px images, so a 100x100 array of them would produce 72,000x72,000px images, or ~5 gigapixels. That's a lot of pixels for a smartphone to push around and process and store.
72,000*72,000* say, 24 bits per color * 3 colors, equals ~43 GiB per image.
edit: mixed up bits and bytes
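For anyone who wants to re-run these rough numbers, a small Python sketch using the figures from the comment above (the 24 bits x 3 colors depth is that comment's own "say" assumption):

```python
SENSORS_PER_SIDE = 100        # 100x100 array, from the comment above
PIXELS_PER_SENSOR_SIDE = 720  # each sensor yields a 720x720 image
RECON_SECONDS = 0.058         # ~58 ms per reconstruction, per the paper
BITS_PER_PIXEL = 24 * 3       # assumed depth: 24 bits x 3 colors

sensor_count = SENSORS_PER_SIDE ** 2
recon_minutes = sensor_count * RECON_SECONDS / 60  # naive, fully serial

side_pixels = SENSORS_PER_SIDE * PIXELS_PER_SENSOR_SIDE
total_pixels = side_pixels ** 2
image_gib = total_pixels * BITS_PER_PIXEL / 8 / 2**30  # bits -> bytes -> GiB

print(f"{recon_minutes:.1f} minutes")          # prints 9.7 minutes
print(f"{total_pixels / 1e9:.2f} gigapixels")  # prints 5.18 gigapixels
print(f"{image_gib:.1f} GiB")                  # prints 43.5 GiB
```

With the more common 8 bits per channel (24 bits per pixel in total), the same formula gives roughly a third of that size.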
Careful with your bits vs bytes there
edited, thanks!
Sensor size is super important for resulting quality, that's why pros still lug around huge full frame (even if mirrorless) cameras and not run around with phones. There are other reasons ie speed for sports but lets keep it simple (also speed is affected by data amount processed, which goes back to resolution).
Plus higher resolution sensors have this nasty habit of producing too large files, processing of which slows down given devices compared to smaller, crisper photos and they take much more space, even more so for videos. That's probably why Apple held to 12mpix main camera for so long, there were even 200mpix sensors available around if wanted.
That's a nice innovation that I'm not that happy about, as there would be even less privacy...
Maybe on the other side it's good news as ppl are usually their best selves when they are being watched.
The watchers would be able to blackmail/control anybody who engages in private activities that they don't want to be public. So who watches the watchers? And who watches them? No. Privacy is 100% required in a free society.
Unless you're in your own home, I think it's basically a guarantee at this point that you're being recorded. Could be CCTV, trail cameras, some random recording a TikTok, etc...
> ppl are usually their best selves when they are being watched.
I don't think that view holds up.
A, it very much depends on who is watching, what their incentives are, and what power they hold.
And B, it also depends on who is being watched - not everyone thrives under a microscope. Are they the type to feel stifled? Or rebellious?
Also, whose definition of "best self" are we using, that of the person being watched or of the person controlling the camera?
That will only hold while being watched is rare. See Clarke and Baxter's Light of Other Days for an examination of the consequences of ubiquitous surveillance.
This is no news?
Has been published in 2021. Also here https://news.ycombinator.com/item?id=29399828
This won’t be good for society.
How will it be all that different than the ubiquitous imaging we have now?
You can sometimes find a hidden camera today.
The rayban metaglasshole comes to mind. Now it's just journalists who fool people in the street with AI face recognition tricks, and it's all still fun and games. But this is clearly a horror invention, merrily introduced by jolly zuck, boss of facelook.
It will be the same, but worse.
First thought that came to mind - insect-sized killer drones. I guess that's the informational context we are in right now.
The Air Force was already publicly talking about such things in 2009: https://m.youtube.com/watch?v=_5YkQ9w3PJ4
You would still have to power the thing and store the data etc. This is just about the lens.
Grains of rice are pretty big, and the images they demonstrate are NOT that impressive. There are cameras you can BUY right now whose size is 1x1x2 mm (smaller than a grain of rice) which produce images that compare. Here is one example: https://www.digikey.com/en/products/detail/ams-osram-usa-inc...
It is pretty easy to interface with too - i did it with a pi pico microcontroller: https://x.com/dmitrygr/status/1753585604971917313
The OP describes them as the size of a grain of salt, not a grain of rice.
Maybe I'm being too skeptical, and certainly I am only a layman in this field, but the amount of ANN-based post-processing it takes to produce the final image seems to cast suspicion on the meaning of the result.
At what point do you reduce the signal to the equivalent of an LLM prompt, with most of the resulting image being explained by the training data?
Yeah, I know that modern phone cameras are also heavily post-processed, but the hardware is at least producing a reasonable optical image to begin with. There's some correspondence between input and output; at least they're comparable.
I've seen someone on this site comment to the effect that if they could use a tool like dall-e to generate a picture of "their dog" that looked better than a photo they could take themselves, they would happily take it over a photo.
The future is going to become difficult for people who find value in creative activities, beyond just a raw audio/visual/textual signal at the output. I think most people who really care about a creative medium would say there's some kind of value in the process and the human intentionality that creates works, both for the creator who engages in it and the audience who is aware of it.
In my opinion most AI creative tools don't actually benefit serious creators, they just provide a competitive edge for companies to sell new products and enable more dilettantes to enter the scene and flood us with mediocrity
This seems like it's going to be a serious problem for privacy... not that anyone cares.
It is possible to create realistic images and videos with AI, making anyone do anything. Whether a photo or video is real or not will soon be impossible to distinguish, and it won't matter to those who want to cause harm
Is it possible to make an orbital death laser with this?
I mean. If you find a way of harnessing enough energy.
No link in the article to the actual paper?
It’s in the "Further Reading" section at the bottom: https://www.nature.com/articles/s41467-021-26443-0
There is some optics thing that looks cool, but it doesn't say how the image is actually recorded.
Then there is the whole "neural" part. Do these get "enhanced" by a generative AI that fills the blur based on the most statistically likely pixels?
The article is pretty bad.
I don't want a camera the size of a grain of salt! At least not while surveillance capitalism and creeping authoritarianism are in full swing...
We've always had this sort of stuff. Back in the 70s you had cameras the size of lighters. There's solutions for anyone determined enough. Even with authoritarian states, you'll find counter measures with sufficient demand. It's reed in the wind shit. Hopefully we won't kill ourselves in the process.
For those who want to cause harm (discredit), they don't need a real photo; AI is enough
Just file a complaint with the United Nations Ethics Czar. Oh.. wait.
Not a single mention of the obvious privacy concerns in the article
Can we agree that in the field of cameras we surpassed science fiction?
I can remember watching a TV series as a child where a time traveler went back to the 80s and some person told him that everything is about miniaturization. Then he pointed to a little pin on the time traveler's jacket, which was actually a camera, and said: "This little pin for example could one day hold a full video camera", which seemed a bit ridiculous at that time.
It's interesting how they mention beneficial impacts on medicine and science in general, but everyone knows that the first applications will likely be military and surveillance.
And since it's AI improved, all of this will hurt people because of hallucinations.
I don't trust humans to avoid taking shortcuts once the tech is available, it's too convenient to have "information" for so cheap, and less costly to silence the occasional scandal.
> but everyone knows that the first applications will likely be military and surveillance.
military, surveillance and porn
The article was published in 2021. Why do they repost this as "news" three years later?
This kind of thing -- that humans can do today with current technology -- is why if an ET intelligence that could travel interstellar distances wanted to observe us we would never know unless they wanted us to know.
Their probes could be the size of sand grains, maybe even dust. Maybe not quite sophons, but not much better as far as our odds of finding anything. I suppose there would have to be something larger to receive signals from these things and send them back (because physics), but that could be hanging out somewhere we'd be unlikely to see it.
Yet another Fermi paradox answer: we are looking for big spacecraft when the universe is full of smart dust.
as an optics scientist i would protest my work being lumped together with the psychedelics of AIchemists