A “meta-optics” camera that is the size of a grain of salt

(cacm.acm.org)

171 points | by rbanffy 8 hours ago

109 comments

  • bhaney 7 hours ago

    > produce full-color images that are equal in quality to those produced by conventional cameras

    I was really skeptical of this since the article conveniently doesn't include any photos taken by the nano-camera, but there are examples [1] in the original paper that are pretty impressive.

    [1] https://www.nature.com/articles/s41467-021-26443-0/figures/2

    • roelschroeven 6 hours ago

      Those images are certainly impressive, but I certainly don't agree with the statement "equal in quality to those produced by conventional cameras": they're quite obviously lacking in sharpness and color.

      • neom 5 hours ago

        conventional ultra thin lens cameras are mostly endoscopes, so it's up against this: https://www.endoscopy-campus.com/wp-content/uploads/Neuroend...

      • queuebert 32 minutes ago

        Tiny cameras will always be limited in aperture, so low light and depth of field will be a challenge.
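
The aperture point can be put in rough numbers with the Rayleigh criterion, theta ≈ 1.22 λ / D, the textbook estimate of the smallest angle a circular aperture can resolve. A minimal sketch follows; the wavelength and both aperture diameters are assumed values chosen for illustration, not figures from the article or the paper.

```python
import math


def rayleigh_limit_rad(wavelength_m: float, aperture_m: float) -> float:
    """Smallest resolvable angle (radians) for a circular aperture."""
    return 1.22 * wavelength_m / aperture_m


# Assumed values for illustration only.
wavelength = 550e-9       # green light, 550 nm
tiny_aperture = 0.5e-3    # 0.5 mm, hypothetical sub-millimetre camera
phone_aperture = 5e-3     # 5 mm, hypothetical phone-camera scale

tiny = rayleigh_limit_rad(wavelength, tiny_aperture)
phone = rayleigh_limit_rad(wavelength, phone_aperture)

# Collected light scales with aperture area, i.e. with diameter squared.
light_ratio = (phone_aperture / tiny_aperture) ** 2

print(f"tiny aperture:  {math.degrees(tiny) * 3600:.0f} arcsec")
print(f"phone aperture: {math.degrees(phone) * 3600:.0f} arcsec")
print(f"light ratio:    {light_ratio:.0f}x")
```

With these assumed sizes, a 10x smaller aperture gives a 10x coarser angular limit and collects about 100x less light per unit time.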

      • card_zero 5 hours ago

        I wonder how they took pictures with four different cameras from the exact same position at the exact same point in time. Maybe the chameleon was staying very still, and maybe the flowers were indoors and that's why they didn't move in the breeze, and they used a special rock-solid mount that kept all three cameras perfectly aligned with microscopic precision. Or maybe these aren't genuine demonstrations, just mock-ups, and they didn't even really have a chameleon.

        • derefr an hour ago

          They didn't really have a chameleon. See "Experimental setup" in the linked paper [emphasis mine]:

          > After fabrication of the meta-optic, we account for fabrication error by performing a PSF calibration step. This is accomplished by using an optical relay system to image a pinhole illuminated by fiber-coupled LEDs. We then conduct imaging experiments by replacing the pinhole with an OLED monitor. The OLED monitor is used to display images that will be captured by our nano-optic imager.

          But shooting a real chameleon is irrelevant to what they're trying to demonstrate here.

          At the scales they're working at here ("nano-optics"), there's no travel distance for chromatic distortion to take place within the lens. Therefore, whether they're shooting a 3D scene (a chameleon) or a 2D scene (an OLED monitor showing a picture of a chameleon), the light that makes it through their tiny lens to hit the sensor is going to be the same.

          (That's the intuitive explanation, at least; the technical explanation is a bit stranger, as the lens is sub-wavelength – and shaped into structures that act as antennae for specific light frequencies. You might say that all the lens is doing is chromatic distortion — but in a very controlled manner, "funnelling" each frequency of inbound light to a specific part of the sensor, somewhat like a MIMO antenna "funnels" each frequency-band of signal to a specific ADC+DSP. Which amounts to the same thing: this lens doesn't "see" any difference between 3D scenes and 2D images of those scenes.)

        • gcanyon 3 hours ago

          Given the size of their camera, you could glue it to the center of another camera’s lens with relatively insignificant effect on the larger camera’s performance.

        • cliffy 5 hours ago

          Camera rigs exist for this exact reason.

          • dylan604 3 hours ago

            what happens when you go too far from trusting what you see/read/hear on the internet? simple logic gets tossed out like a baby in the bathwater.

            now, here's the rig I'd love to see with this: take a hundred of them and position them like a bug's eye to see what could be done with that. there'd be so much overlapping coverage that 3D would be possible, yet the parallax would be so small that makes me wonder how much depth would be discernible
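
For the parallax question, the usual pinhole stereo relation is Z = f·B / d (depth Z, focal length f in pixels, baseline B, disparity d in pixels), so a disparity error δd gives a depth error of roughly Z²·δd / (f·B). The sketch below compares a millimetre-scale baseline with an eye-like one; the focal length, both baselines, and the one-pixel matching error are all assumptions.

```python
def depth_error_m(depth_m, baseline_m, focal_px, disparity_error_px=1.0):
    """Approximate depth uncertainty for a two-camera rig (pinhole model)."""
    return depth_m ** 2 * disparity_error_px / (focal_px * baseline_m)


focal_px = 700.0        # assumed focal length expressed in pixels
tiny_baseline = 0.001   # 1 mm between neighbouring cameras (assumed)
eye_baseline = 0.065    # 65 mm, roughly human eye spacing (assumed)

for depth in (0.1, 1.0, 10.0):
    tiny = depth_error_m(depth, tiny_baseline, focal_px)
    eye = depth_error_m(depth, eye_baseline, focal_px)
    print(f"depth {depth:5.1f} m: +/-{tiny:.3f} m (1 mm) vs +/-{eye:.4f} m (65 mm)")
```

Under these assumptions the uncertainty grows with the square of the distance and shrinks linearly with the baseline, so a tightly packed array would mainly resolve depth at close range.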

    • baxtr 6 hours ago

      Also interesting: the paper is from 2021.

    • Intralexical 6 hours ago

      > Ultrathin meta-optics utilize subwavelength nano-antennas to modulate incident light with greater design freedom and space-bandwidth product over conventional diffractive optical elements (DOEs).

      Is this basically a visible-wavelength beamsteering phased array?

      • itishappy 5 hours ago

        Yup. It's also passive. The nanostructures act like delay lines.
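
One textbook way to picture the "delay line" idea: an ideal flat lens applies a phase delay at each radius that cancels the extra path length to the focal point, φ(r) = -(2π/λ)(√(r² + f²) - f), wrapped to one cycle. The sketch below evaluates that generic hyperbolic profile. It is not necessarily the design used in the paper, and the wavelength and focal length are assumed values.

```python
import math


def focusing_phase(radius_m, focal_length_m, wavelength_m):
    """Phase delay (radians, wrapped to one cycle) an ideal flat lens needs at a radius."""
    # Extra distance light from this radius travels to reach the focal point.
    path_difference = math.hypot(radius_m, focal_length_m) - focal_length_m
    return (-2 * math.pi * path_difference / wavelength_m) % (2 * math.pi)


wavelength = 550e-9    # assumed design wavelength (green light)
focal_length = 1e-3    # assumed focal length of 1 mm

phases = []
for radius_um in (0, 50, 100, 150, 200, 250):
    phase = focusing_phase(radius_um * 1e-6, focal_length, wavelength)
    phases.append(phase)
    print(f"r = {radius_um:3d} um -> delay {phase:.2f} rad")
```

Each nanostructure only has to supply a delay within one cycle, which is why a sub-wavelength-thick surface can stand in for a curved lens at a single wavelength.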

        • mrec 3 hours ago

          Interesting. This idea appears pretty much exactly at the end of Bob Shaw's 1972 SFnal collection Other Days, Other Eyes. The starting premise is the invention of "slow glass" that looks like an irrelevant gimmick but ends up revolutionizing all sorts of things, and the final bits envisage a disturbing surveillance society with these tiny passive cameras spread everywhere.

          It's a good read; I don't think the extrapolation of one technical advance has ever been done better.

    • andrepd 7 hours ago

      How does this work? If it's just reconstructing the images with nn, a la Samsung pasting a picture of the moon when it detected a white disc on the image, it's not very impressive.

      • nateroling 7 hours ago

        I had the same thought, but it sounds like this operates at a much lower level than that kind of thing:

        > Then, a physics-based neural network was used to process the images captured by the meta-optics camera. Because the neural network was trained on metasurface physics, it can remove aberrations produced by the camera.

        • Intralexical 6 hours ago

          I'd like to see some examples showing how it does when taking a picture of completely random fractal noise. That should show it's not just trained to reconstruct known image patterns.

          Generally it's probably wise to be skeptical of anything that appears to get around the diffraction limit.

          • brookst 6 hours ago

            I believe the claim is that the NN is trained to reconstruct pixels, not images. As in so many areas, the diffraction limit is probabilistic, so combining information from multiple overlapping samples and NNs trained on known diffracted -> accurate pairs may well recover information.

            You’re right that it might fail on noise with resolution fine enough to break assumptions from the NN training set. But that’s not a super common application for cameras, and traditional cameras have their own limitations.

            Not saying we shouldn’t be skeptical, just that there is a plausible mechanism here.

            • neom 5 hours ago

              we've had very good chromatic aberration correction since I got a degree in imaging technology and that was over 20 years ago so I'd imagine it's not particularly difficult for name your flavour of ML.

            • Intralexical 2 hours ago

              My concern would be that if it can't produce accurate results on a random noise test, then how do we trust that it actually produces accurate results (as opposed to merely plausible results) on normal images?

              Multilevel fractal noise specifically would give an indication of how fine you can go.

  • alexpotato 5 hours ago

    Years ago I saw an interview with a futurist that mentioned the following:

    "One day, your kids will go to the toy store and get a sheet of stickers. Each sticker is actually a camera with an IPv6 address. That means they can put a sticker somewhere, go and point a browser at that address and see a live camera feed.

    I should point out: all of the technology to do this already exists, it just hasn't gotten cheap enough to mass market. When economies of scale do kick in, society is going to have to deal with a dramatic change in what they think 'physical privacy' means."

    • petra 2 hours ago

      Maybe it's possible, but I can't seem to think of an energy harvesting method that would fit that system without direct sunlight.

    • brokensegue 3 hours ago

      I'm very skeptical this technology already exists. Maybe if you vastly change the meaning of "sticker"

      • Workaccount2 3 hours ago

        "PCB-with-onboard-battery-and-adhesive-backing-icker"

  • Nevermark an hour ago

    Wow.

    Given the tiny dimensions, and wide field, adding regular lenses over an array could create extreme wide field, like 160x160 degrees, for everyday phone cameras. Or very small 360x180 degree stand-alone cameras. AR glasses with a few cameras could operate with 360x160 degrees and be extremely situationally aware!

    Another application would be small light field cameras. I don't know enough to judge if this is directly applicable, or adaptable to that. But it would be wonderful to finally have small cheap light field cameras. Both for post-focus adjustment and (better than stereo) 3D image sensing and scene reconstruction.

  • mwigdahl 7 hours ago

    Chalk another one up for Vernor Vinge. This tech seems like it could directly enable the “ubiquitous surveillance” from _A Deepness in the Sky_. Definitely something to watch closely.

    • KineticLensman 6 hours ago

      Also the scatterable surveillance cameras used in his other great novel, 'The Peace War' [0]. Although IIRC they were the size of seeds or similar.

      [0] https://en.wikipedia.org/wiki/The_Peace_War

      • EdwardCoffin 5 hours ago

        3 or 4 mm in diameter, according to a scene in chapter 6, big enough to have similar resolution to that of a human eye, according to Paul, but able to look in any direction without physically rotating.

        In chapter 13 the enemy describes them as using Fourier optics, though that seemed to be their speculation - not sure whether it was right.

    • ben_w 7 hours ago

      I've been interested in smart dust for a while; recently the news seems to have dried up, and while that may have been other stuff taking up all the attention (and investment money), I suspect that many R&D teams went under government NDAs because they are now good enough to be interesting.

    • arethuza 7 hours ago

      I wonder if someone tried to build a localizer how small they could actually be made?

      PS It's "Vernor"

      • cmpb 5 hours ago

        The other side to the localizers is the communication / mesh networking, and the extremely effective security partitioning. Even Anne couldn't crack them! It's certainly a lot to package in such a small form

      • mwigdahl 7 hours ago

        Thanks, I typed that on my phone and it "fixed" it for me without me noticing.

    • gcanyon 3 hours ago

      Or Rudy Rucker’s Postsingular, where the “orphidnet” utility fog enables universal perception/visualization.

    • 12907835202 7 hours ago

      I haven't read deepness in the sky but it's interesting how wrong a lot of scifi got this. Cameras are always considerably bigger than grains of sand

      • cmpb 5 hours ago

        Well, Deepness is set a few thousand years in the future, so we've got some time to work on it.

  • ep_jhu 6 hours ago

    Everyone here is thinking about privacy and surveillance and here I am wondering if this is what lets us speed up nano cameras to relativistic speeds with lasers to image other solar systems up close.

    • TeMPOraL 6 hours ago

      Thank you!

      It's been a while since I've heard anyone talk about the Starshot project[0]. Maybe this would help revitalize it.

      Also even without aiming for Proxima Centauri, it would be great to have more cameras in our own planetary system.

      --

      [0] - https://en.wikipedia.org/wiki/Breakthrough_Starshot

    • skandinaff 6 hours ago

      we would also need a transmitter of equivalent size to send those images back. also an energy source

      • Workaccount2 3 hours ago

        Just do round trip!

        • sangnoir an hour ago

          We'll need even bigger[1] breakthroughs in propulsion if it's going to be self-propelling itself back to Sol at relativistic speeds.

          1. A "simpler" sci-fi solution for a 1-way trip that's still out of our reach is a large light sail and a huge Earth-based laser, but this requires "smaller" breakthroughs in material science

          • DCH3416 29 minutes ago

            Well if you can propel something forward you can propel it backwards as well.

            I'm assuming some sort of fixed laser type propulsion mechanism would leverage a type of solar sail technology. Maybe you could send a phased laser signal that "vibrates" a solar sail towards the source of energy instead of away.

            • sangnoir 17 minutes ago

              > Well if you can propel something forward you can propel it backwards as well

              Not necessarily - at least with currently known science. Light sails work OK at transferring momentum from photons, allowing positive acceleration from a giant laser on Earth. A return trip requires a giant laser on the other side.

          • SoftTalker an hour ago

            As well as a way around Newton's Third Law.

            • sangnoir an hour ago

              I meant to say the "simpler" (but still very complicated) solar sail approach was for a one-way trip. On paper, our civilization can muster the energy required to accelerate tiny masses to relativistic speeds. A return trip at those speeds would require a new type of science to concentrate that amount of energy in a small mass and use it for controlled propulsion.

  • 1024core 40 minutes ago

    > The meta-optics camera is the first device of its kind to produce full-color images that are equal in quality to those produced by conventional cameras, which are an order of magnitude larger. In fact, the meta-optics camera is 500,000 times smaller than conventional cameras that capture the same level of image quality.

    That would make them 6 orders of magnitude larger.
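
A quick check of that arithmetic, taking the article's 500,000x figure at face value:

```python
import math

size_ratio = 500_000                  # "500,000 times smaller", as quoted from the article
orders = math.log10(size_ratio)       # orders of magnitude = base-10 logarithm of the ratio

print(f"{size_ratio:,}x is about {orders:.1f} orders of magnitude")
```

So the ratio is a little under six orders of magnitude, far from the single order of magnitude stated in the quoted sentence.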

  • pizza234 7 hours ago

    > as well as implementing unique AI-powered image post-processing to create high-quality images from the camera.

    They're not comparable, in the intuitive sense, to conventional cameras.

    • Etheryte 6 hours ago

      Are they not? Every modern camera does the same thing. Upscaling, denoising, deblurring, adjusting colors, bumping and dropping shadows and highlights, pretty much no aspect of the picture is the way the sensor sees it once the rest of the pipeline is done. Phone cameras do this to a more extreme degree than say pro cameras, but they all do it.

      • PittleyDunkin 6 hours ago

        To point out the obvious, film cameras don't, nor do many digital cameras. Unless you mean modern in the sense of "cameras you can buy from best buy right now", of course. But that isn't very interesting: best buy has terrible taste in cameras.

        • sega_sai 6 hours ago

          There are a lot of steps like that provided you want an image that you want to show to the user (i.e. Jpeg). You do have to somehow merge the 3 Bayer filter detections on rectangular grid, which involves interpolation. You do have to subtract some sort of bias in a detector, possibly correct for different sensitivity across the detector. You have to map the raw 'electron counts' into Jpeg scale which involves another set of decisions/image processing steps
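
A minimal sketch of those steps on a toy sensor, assuming an RGGB Bayer layout. To stay short it swaps real interpolation for the simpler "superpixel" approach (each 2x2 block becomes one RGB pixel), and it skips the per-pixel sensitivity correction. The function name, bias, and white level are hypothetical.

```python
def develop_rggb(raw, bias, white_level):
    """Turn an RGGB Bayer mosaic (list of rows) into half-resolution 8-bit RGB."""
    height, width = len(raw), len(raw[0])
    scale = 255.0 / (white_level - bias)

    def to_8bit(count):
        # Subtract the sensor bias, then map raw counts linearly onto 0..255.
        return max(0, min(255, round((count - bias) * scale)))

    image = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            r = raw[y][x]
            g = (raw[y][x + 1] + raw[y + 1][x]) / 2   # two green samples per block
            b = raw[y + 1][x + 1]
            row.append((to_8bit(r), to_8bit(g), to_8bit(b)))
        image.append(row)
    return image


# One 2x2 block of made-up raw counts: R, G / G, B.
raw = [
    [1064, 464],
    [464, 64],
]
developed = develop_rggb(raw, bias=64, white_level=1064)
print(developed)
```

Even this stripped-down version already embeds choices (how to combine greens, where black and white sit), which is the commenter's point about "another set of decisions".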

          • PittleyDunkin 2 hours ago

            There is clear processing in terms of interpreting the raw sensor data as you're describing. Then there are blurrier processes still, like "denoising" and "upscaling", which straddle the line between bias-correction and alteration. Then there's modification of actual color and luminance as the parent was describing. Now we're seeing full alterations applied automatically with neural nets, literally altering shapes and shadows and natural lighting phenomena.

            I think it's useful to distinguish all of these even if they are desired. I really love my iPhone camera, but there's something deeply unsettling about how it alters the photos. It's fundamentally producing a different image than you can get with either film or through your eyes. Naturally this is true for all digital sensors but we once could point out specifically how and why the resulting image differs from what our eyes see. It's no longer easy to even enumerate the possible alterations that go on via software, let alone control many of them, and I think there will be backlash at some point (or stated differently, a market for cameras that allow controlling this).

            I've got to imagine it's frustrating for people who rely on their phone cameras for daily work to find out that upgrading a phone necessarily means relearning its foibles and adjusting how you shoot to accommodate it. Granted, I mostly take smartphone photos in situations where i'd rather not be neurotic about the result (candids, memories, reminders, etc) but surely there are professionals out there who can speak to this.

        • kristjank 6 hours ago

          Huh, I like your comment. It's such a nice way of pointing out someone equating marketability to quality.

      • stevenae 6 hours ago

        Pro cameras do not do this to any degree.

        Edit: by default.

        • vlabakje90 6 hours ago

          The cameras themselves might not, but in order to get a decent picture you will need to apply demosaicing and gamma correction in software at the very least, even with high end cameras.
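
For the gamma-correction half of that, a short sketch using the standard sRGB transfer curve (a linear segment near black, then a 1/2.4 power law). It covers only the tone mapping of already-demosaiced linear values; the sample inputs are arbitrary.

```python
def srgb_encode(linear):
    """Map a linear-light value in [0, 1] to its sRGB-encoded value in [0, 1]."""
    linear = min(1.0, max(0.0, linear))   # clamp out-of-range input
    if linear <= 0.0031308:
        return 12.92 * linear             # linear toe near black
    return 1.055 * linear ** (1 / 2.4) - 0.055


for value in (0.0, 0.01, 0.18, 0.5, 1.0):
    print(f"linear {value:.2f} -> 8-bit {round(255 * srgb_encode(value))}")
```

Without this step, linear sensor data displayed directly looks far too dark, since mid-grey scene values sit well below the middle of the encoded range.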

          • gyomu 5 hours ago

            Right, and the point ppl are making upthread is that deterministic signal processing and probabilistic reconstruction approaches are apples and oranges.

            • oasisaimlessly an hour ago

              It's trivial to make most AI implementations deterministic; just use a constant RNG seed.
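
A toy illustration of the seeding point, using only the standard library. `toy_stochastic_restore` is a hypothetical stand-in for any pipeline that draws random numbers; it is not a real restoration model.

```python
import random


def toy_stochastic_restore(pixels, seed):
    """Stand-in for a sampling-based step: perturb each pixel with seeded noise."""
    rng = random.Random(seed)             # private generator, fixed by the seed
    return [p + rng.gauss(0.0, 1.0) for p in pixels]


pixels = [10.0, 20.0, 30.0]
first = toy_stochastic_restore(pixels, seed=42)
second = toy_stochastic_restore(pixels, seed=42)
other = toy_stochastic_restore(pixels, seed=43)

print(first == second)   # same seed, same output
print(first == other)    # different seed, different output
```

Fixing the seed makes the output repeatable, though the result still depends on which seed was chosen.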

      • cubefox 6 hours ago

        "AI-powered image post-processing" is only done in smartphones I believe.

        • CharlesW 2 hours ago

          Not anymore. DSLR makers are already using AI (in-camera neural network processing) for things like upscaling and noise removal. https://www.digitalcameraworld.com/reviews/canon-eos-r1-revi...

          "The Neural network Image Processing features in this camera are arguably even more important here than they are in the R5 Mark II. A combination of deep learning and algorithmic AI is used to power In-Camera Upscaling, which transforms the pedestrian-resolution 24.2MP images into pixel-packed 96MP photos – immediately outclassing every full-frame camera on the market, and effectively hitting GFX and Hasselblad territory.

          "On top of that is High ISO Noise Reduction, which uses AI to denoise images by 2 stops. It works wonders when you're pushing those higher ISOs, which are already way cleaner than you'd expect thanks to the flagship image sensor and modest pixel count."

  • gatkinso 30 minutes ago

    All kinds of exciting implications for small cameras and lens assemblies in VR/AR

  • foul 7 hours ago

    How would someone detect sensors so small?

    How would someone excrete an array of these cameras if ingested?

    • thfuran 6 hours ago

      If you eat something the size of a grain of salt that isn't digestible, excreting it poses no problem.

    • kaimac 6 hours ago

      you could detect the supporting electronics with a nonlinear junction detector but they are not cheap

  • burnt-resistor 5 hours ago
  • curiousObject 7 hours ago

    If that’s true, maybe it would allow you to put a 10,000 camera array (100x100) on a smartphone, and do interesting things with computational imaging?

    • bhaney 7 hours ago

      Some rough numbers:

      The paper says that reconstructing an actual image from the raw data produced by the sensor takes ~58ms of computation, so doing it for 10,000 sensors would naively take around ten minutes, though I'm sure there's room for optimization and parallelization.

      The sensors produce 720x720px images, so a 100x100 array of them would produce 72,000x72,000px images, or ~5 gigapixels. That's a lot of pixels for a smartphone to push around and process and store.
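
The same rough numbers as a few lines of arithmetic, assuming (as the comment does) fully serial processing and simple tiling with no overlap between sensors:

```python
sensors = 100 * 100                 # 100x100 array
seconds_per_image = 0.058           # ~58 ms reconstruction per sensor, per the comment

total_seconds = sensors * seconds_per_image
side_px = 720 * 100                 # 720 px per sensor, 100 sensors per side
total_pixels = side_px ** 2

print(f"serial reconstruction: {total_seconds:.0f} s ({total_seconds / 60:.1f} min)")
print(f"tiled image: {side_px:,} x {side_px:,} px = {total_pixels / 1e9:.2f} gigapixels")
```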

      • fragmede 7 hours ago

        72,000*72,000* say, 24 bits per color * 3 colors, equals ~43 GiB per image.

        edit: mixed up bits and bytes
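
Spelling out the bits-to-bytes conversion, using the comment's own assumption of 24 bits per colour channel and 3 channels, with no compression:

```python
pixels = 72_000 * 72_000
bits_per_pixel = 24 * 3             # 24 bits per channel x 3 channels (commenter's figure)

total_bits = pixels * bits_per_pixel
total_bytes = total_bits // 8       # 8 bits per byte
gib = total_bytes / 2 ** 30         # 1 GiB = 2**30 bytes

print(f"{total_bytes:,} bytes is about {gib:.1f} GiB")
```

At a more typical 8 bits per channel the same image would be a third of that size.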

        • bhaney 6 hours ago

          Careful with your bits vs bytes there

    • jajko 7 hours ago

      Sensor size is super important for the resulting quality; that's why pros still lug around huge full-frame (even if mirrorless) cameras and don't run around with phones. There are other reasons, e.g. speed for sports, but let's keep it simple (also, speed is affected by the amount of data processed, which goes back to resolution).

      Plus higher resolution sensors have this nasty habit of producing too large files, processing of which slows down given devices compared to smaller, crisper photos and they take much more space, even more so for videos. That's probably why Apple held to 12mpix main camera for so long, there were even 200mpix sensors available around if wanted.

  • taosx 7 hours ago

    That's a nice innovation that I'm not that happy about, as there would be even less privacy...

    Maybe on the other side it's good news as ppl are usually their best selves when they are being watched.

    • krunck 4 hours ago

      The watchers would be able to blackmail/control anybody who engages in private activities that they don't want to be public. So who watches the watchers? And who watches them? No. Privacy is 100% required in a free society.

    • hypeatei 7 hours ago

      Unless you're in your own home, I think it's basically a guarantee at this point that you're being recorded. Could be CCTV, trail cameras, some random recording a TikTok, etc...

    • mandmandam 7 hours ago

      > ppl are usually their best selves when they are being watched.

      I don't think that view holds up.

      A, it very much depends on who is watching, what their incentives are, and what power they hold.

      And B, it also depends on who is being watched - not everyone thrives under a microscope. Are they the type to feel stifled? Or rebellious?

      • Scarblac 7 hours ago

        Also, whose definition of "best self" are we using, that of the person being watched or of the person controlling the camera?

    • ninalanyon 6 hours ago

      That will only hold while being watched is rare. See Clarke and Baxter's Light of Other Days for an examination of the consequences of ubiquitous surveillance.

  • roflmaostc 7 hours ago

    This is no news?

    Has been published in 2021. Also here https://news.ycombinator.com/item?id=29399828

  • jdalgetty 7 hours ago

    This won’t be good for society.

    • whynotminot 7 hours ago

      How will it be all that different than the ubiquitous imaging we have now?

      • timdiggerm 7 hours ago

        You can sometimes find a hidden camera today.

      • rapnie 6 hours ago

        The rayban metaglasshole comes to mind. Now it's just journalists who fool people in the street with AI face recognition tricks, and it's all still fun and games. But this is clearly a horror invention, merrily introduced by jolly zuck, boss of facelook.

      • hk__2 7 hours ago

        It will be the same, but worse.

  • delegate 7 hours ago

    First thought that came to mind - insect-sized killer drones. I guess that's the informational context we are in right now.

  • dmitrygr an hour ago

    Grains of rice are pretty big, and the images they demonstrate are NOT that impressive. There are cameras you can BUY right now whose size is 1x1x2 mm (smaller than a grain of rice) which produce images that compare. Here is one example: https://www.digikey.com/en/products/detail/ams-osram-usa-inc...

    It is pretty easy to interface with too - i did it with a pi pico microcontroller: https://x.com/dmitrygr/status/1753585604971917313

    • eurleif 40 minutes ago

      The OP describes them as the size of a grain of salt, not a grain of rice.

  • ripe 7 hours ago

    Maybe I'm being too skeptical, and certainly I am only a layman in this field, but the amount of ANN-based post-processing it takes to produce the final image seems to cast suspicion on the meaning of the result.

    At what point do you reduce the signal to the equivalent of an LLM prompt, with most of the resulting image being explained by the training data?

    Yeah, I know that modern phone cameras are also heavily post-processed, but the hardware is at least producing a reasonable optical image to begin with. There's some correspondence between input and output; at least they're comparable.

    • mahoho 6 hours ago

      I've seen someone on this site comment to the effect that if they could use a tool like dall-e to generate a picture of "their dog" that looked better than a photo they could take themselves, they would happily take it over a photo.

      The future is going to become difficult for people who find value in creative activities, beyond just a raw audio/visual/textual signal at the output. I think most people who really care about a creative medium would say there's some kind of value in the process and the human intentionality that creates works, both for the creator who engages in it and the audience who is aware of it.

      In my opinion most AI creative tools don't actually benefit serious creators, they just provide a competitive edge for companies to sell new products and enable more dilettantes to enter the scene and flood us with mediocrity

  • guerrilla 7 hours ago

    This seems like it's going to be a serious problem for privacy... not that anyone cares.

    • nachox999 4 hours ago

      It is possible to create realistic images and videos with AI, making anyone do anything. Whether a photo or video is real or not will soon be impossible to distinguish, and it won't matter to those who want to cause harm

  • casey2 3 hours ago

    Is it possible to make an orbital death laser with this?

    • DCH3416 6 minutes ago

      I mean. If you find a way of harnessing enough energy.

  • dcreater 7 hours ago

    No link in the article to the actual paper?

  • d--b 6 hours ago

    There is some optics thing that looks cool, but it doesn't say how the image is actually recorded.

    Then there is the whole "neural" part. Do these get "enhanced" by a generative AI that fills the blur based on the most statistically likely pixels?

    The article is pretty bad.

  • xg15 7 hours ago

    I don't want a camera the size of a grain of salt! At least not while surveillance capitalism and creeping authoritarianism are in full swing...

    • DCH3416 9 minutes ago

      We've always had this sort of stuff. Back in the 70s you had cameras the size of lighters. There's solutions for anyone determined enough. Even with authoritarian states, you'll find counter measures with sufficient demand. It's reed in the wind shit. Hopefully we won't kill ourselves in the process.

    • nachox999 4 hours ago

      For those who want to cause harm (discredit), they don't need a real photo; AI is enough

    • rapnie 6 hours ago

      Just file a complaint with the United Nations Ethics Czar. Oh.. wait.

  • kaimac 6 hours ago

    Not a single mention of the obvious privacy concerns in the article

  • jansan 7 hours ago

    Can we agree that in the field of cameras we surpassed science fiction?

    I can remember watching a TV series as a child where a time traveler went back to the 80s and some person told him that everything is about miniaturization. Then he pointed to a little pin on the time traveler's jacket, which was actually a camera, and said: "This little pin for example could one day hold a full video camera", which seemed a bit ridiculous at that time.

  • rippeltippel 7 hours ago

    It's interesting how they mention beneficial impacts on medicine and science in general, but everyone knows that the first applications will likely be military and surveillance.

    • BiteCode_dev 7 hours ago

      And since it's AI improved, all of this will hurt people because of hallucinations.

      I don't trust humans to avoid taking shortcuts once the tech is available; it's too convenient to have "information" for so cheap, and less costly to silence the occasional scandal.

    • meiraleal 7 hours ago

      > but everyone knows that the first applications will likely be military and surveillance.

      military, surveillance and porn

  • cubefox 6 hours ago

    The article was published in 2021. Why do they repost this as "news" three years later?

  • api 6 hours ago

    This kind of thing -- that humans can do today with current technology -- is why if an ET intelligence that could travel interstellar distances wanted to observe us we would never know unless they wanted us to know.

    Their probes could be the size of sand grains, maybe even dust. Maybe not quite sophons, but not much better as far as our odds of finding anything. I suppose there would have to be something larger to receive signals from these things and send them back (because physics), but that could be hanging out somewhere we'd be unlikely to see it.

    Yet another Fermi paradox answer: we are looking for big spacecraft when the universe is full of smart dust.

  • ashoeafoot 2 hours ago

    as an optics scientist I would protest my work being lumped together with the psychedelics of AIchemists