First, I want to say that I sincerely appreciate you working on this problem. The proliferation of deepfakes is something that virtually every part of the technology industry is dealing with right now.
Suppose deepfake technology progresses to the point where it is still detectable by your technology but impossible to spot with the naked eye. In that scenario (which many would call an eventuality), wouldn't you also be compelled to serve as an authoritative entity on the detection of deepfakes?
Imagine a future politician who is caught on video doing something scandalous, or a court case where someone is questioning the veracity of some video evidence. Are the creators of deepfake detection algorithms going to testify as expert witnesses, and how could they convince a human judge/jury that the output of their black box algorithm isn't a false positive?
Yeah, but does it actually work, though? There have been a lot of online tools claiming to be "AI detectors" and they all seem pretty unreliable. Can you talk us through what you look for, the most common failure modes, and (at a suitably high level) how you dealt with them?
We've actually deployed to several Tier 1 banks and large enterprises already for various use cases (verification, fraud detection, threat intelligence, etc.). The feedback we've gotten so far is that our technology is highly accurate and provides a useful signal.
In terms of how our technology works, our research team has trained multiple detection models to look for specific visual and audio artifacts that the major generative models leave behind. These artifacts aren't perceptible to the human eye or ear, but they are very detectable to computer vision and audio models.
These expert models are combined into an ensemble system that weights the individual model outputs to reach a final conclusion.
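To make the ensemble idea concrete, here is a minimal sketch of how per-expert scores could be combined with a weighted average. The detector names, weights, and threshold are illustrative assumptions, not Reality Defender's actual models or numbers.

    # Illustrative only: expert names, weights, and the 0.5 threshold are assumptions,
    # not Reality Defender's actual architecture.
    from dataclasses import dataclass

    @dataclass
    class ExpertResult:
        name: str      # which specialised detector produced this score
        score: float   # probability the input is synthetic, in [0, 1]
        weight: float  # how much the ensemble trusts this detector

    def ensemble_verdict(results: list[ExpertResult], threshold: float = 0.5) -> dict:
        """Weighted average of per-expert scores -> a single real/fake call."""
        total = sum(r.weight for r in results)
        combined = sum(r.score * r.weight for r in results) / total
        label = "likely manipulated" if combined >= threshold else "likely authentic"
        return {"score": round(combined, 3), "label": label,
                "experts": {r.name: r.score for r in results}}

    # Three hypothetical experts, each watching a different artifact family.
    print(ensemble_verdict([
        ExpertResult("face_blend_artifacts", 0.91, weight=0.40),
        ExpertResult("frequency_domain",     0.74, weight=0.35),
        ExpertResult("audio_vocoder_traces", 0.30, weight=0.25),
    ]))

In practice the weighting would presumably be learned rather than hand-set, but the shape matches the description above: many narrow experts, one combined verdict.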
We've got a rigorous process of collecting data from new generators, benchmarking against them, and retraining our models when necessary. Retrains often aren't needed, though, since our accuracy tends to transfer well within a given deepfake technique. So even if new diffusion or autoregressive models come out, for example, the artifacts tend to be similar and are still caught by our models.
I will say that our models are most heavily benchmarked on convincing audio/video/image impersonations of humans. While we can return results for items outside that scope, we've tended to focus training and benchmarking on human impersonations since that's typically the most dangerous risk for businesses.
So that's a caveat to keep in mind if you decide to try out our Developer Free Plan.
Give it a try for yourself. It's free!
We have been working on this problem since 2020 and have created and trained an ensemble of AI detection models working together to tell you what is real and what is fake!
How do you prevent bad actors from using your tools as a feedback loop to tune models that can evade detection?
We monitor who signs up for Reality Defender and can quickly spot traffic patterns and other abnormalities that tell us if an account is in violation of our terms of service. Also, our free tier is capped at 50 scans a month, which will not allow attackers to glean any tangible learnings or tactics they could use to bypass our detection models.
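As a rough illustration, a scan quota plus a basic traffic-pattern check could look like the toy sketch below. Only the 50-scans-per-month cap comes from the comment above; the burst window and limit are invented for the example.

    # Toy sketch: monthly free-tier quota plus a crude burst check.
    # The 50-scan cap is from the comment above; BURST_* values are made up.
    from collections import defaultdict
    from datetime import datetime, timedelta, timezone

    MONTHLY_FREE_SCANS = 50
    BURST_WINDOW = timedelta(minutes=5)
    BURST_LIMIT = 20  # assumed: this many scans in 5 minutes looks automated

    scan_log = defaultdict(list)  # account_id -> list of scan timestamps

    def allow_scan(account_id: str) -> tuple[bool, str]:
        now = datetime.now(timezone.utc)
        history = scan_log[account_id]
        used_this_month = sum(1 for t in history if (t.year, t.month) == (now.year, now.month))
        if used_this_month >= MONTHLY_FREE_SCANS:
            return False, "monthly free-tier quota exhausted"
        recent = sum(1 for t in history if now - t <= BURST_WINDOW)
        if recent >= BURST_LIMIT:
            return False, "flagged: burst of automated-looking traffic"
        history.append(now)
        return True, "ok"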
You would need thousands to tens of thousands of images, not just 50, to produce an adversarial network that could use the API as a check.
If someone wanted to buy access, I'm sure Reality Defender has protections in place, especially because you can predict adversarial guesses.
It would be trivial for them to build a check for "this user is sending progressively more realistic submissions at a rapid rate," if they haven't built that already.
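A heuristic like that could be as simple as watching whether an account's detection scores trend steadily toward "real" over successive uploads. The window size and drop threshold below are assumptions for illustration, not anyone's production logic.

    # Sketch of the heuristic described above: flag accounts whose successive
    # submissions score progressively "more real". Thresholds are assumptions.
    def looks_like_probing(scores: list[float], window: int = 10, min_drop: float = 0.3) -> bool:
        """scores: detector outputs in submission order (1.0 = confidently fake)."""
        if len(scores) < window:
            return False
        recent = scores[-window:]
        # Allow a couple of upticks, but require a broadly monotonic decline.
        declining_steps = sum(b <= a for a, b in zip(recent, recent[1:]))
        return declining_steps >= window - 2 and (recent[0] - recent[-1]) >= min_drop

    print(looks_like_probing([0.95, 0.90, 0.86, 0.80, 0.74, 0.66, 0.60, 0.52, 0.45, 0.40]))  # True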
Won't this just become the fitness function for training future models?
Just based on the first post, where they talk about their API a bit, it sounds like a system hosted on their machines(?). So, I assume AI trainers won’t be able to run it locally, to train off it.
Although, I always get a bad smell from that sort of logic, because it feels vaguely similar to security through obscurity in the sense that it relies on the opposition now knowing what you are doing.
now -> not
True, haha. Although “the opposition now knowing what you are doing” is the big danger for this sort of scheme!
I feel like a much easier solution is enforcing data provenance: SSL for a media hash, attached to the metadata. The problem with AI isn't the fact that it's AI, it's that people can invest little effort to sway things with undue leverage. A single person can look like hundreds with significantly less effort than previously. The problem with AI content is that it makes abuse of public spaces much easier. Forcing people to take credit for the work they produce makes things easier (not solved), kind of like email. Being able to block media by domain would be a dream, but spam remains an issue.
So, tie content to domains. A domain vouching for content works like that content having been a webpage or email from said domain. A signed hash in metadata is backwards compatible, and it's easy to make browsers, etc. display warnings on unsigned content, content from new domains, blacklisted domains, and so on.
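For what it's worth, the signed-hash part of this is straightforward to prototype. Here's a bare-bones sketch using Ed25519 via the third-party cryptography package; how the domain publishes its key (DNS, .well-known, a certificate), how the signature is embedded in metadata, and revocation are all hand-waved.

    # Bare-bones sketch of "a domain signs the hash of its media".
    # Requires the `cryptography` package; key publication and metadata
    # embedding are out of scope here.
    import hashlib
    from cryptography.hazmat.primitives.asymmetric import ed25519
    from cryptography.exceptions import InvalidSignature

    def sign_media(media: bytes, domain_key: ed25519.Ed25519PrivateKey) -> dict:
        digest = hashlib.sha256(media).digest()
        return {"alg": "ed25519+sha256",
                "hash": digest.hex(),
                "sig": domain_key.sign(digest).hex()}  # would live in the file's metadata

    def verify_media(media: bytes, claim: dict, domain_pub: ed25519.Ed25519PublicKey) -> bool:
        digest = hashlib.sha256(media).digest()
        if digest.hex() != claim["hash"]:
            return False  # file altered after signing
        try:
            domain_pub.verify(bytes.fromhex(claim["sig"]), digest)
            return True
        except InvalidSignature:
            return False  # signature isn't from this domain

    key = ed25519.Ed25519PrivateKey.generate()
    claim = sign_media(b"example media bytes", key)
    print(verify_media(b"example media bytes", claim, key.public_key()))  # True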
The benefit here is that while we'll have more false negatives, unlike something like this tool, it doesn't cause real harm on false positives, which will be numerous if it wants to be better than simply making someone accountable for media.
AI detection cannot work, will not work, and will cause more harm than it prevents. Stuff like this is irresponsible and dangerous.
I understand the appeal of hashing-based provenance techniques, though they've faced significant challenges in practice that render them largely ineffective. While many model developers have explored these approaches with good intentions, we've seen that they can be easily circumvented or manipulated, particularly by sophisticated bad actors who may not follow voluntary standards.
We recognize that no detection solution is 100% accurate. There will be occasional false positives and negatives. That said, our independently verified and internal testing shows we've achieved the lowest error rates currently available for deepfake detection.
I’d respectfully suggest that dismissing AI detection entirely might be premature, especially without hands-on evaluation. If you’re interested, I’d be happy to arrange a test environment where you could evaluate our solution’s performance firsthand and see how it might fit your specific use case.
I worked in the fraud space and could see this being a useful tool for identifying AI generated IDs + liveness checks. Will give it a try.
It's sadly not often enough I see a young company doing work that I feel only benefits society, but this is one of those times, so thank you and congratulations.
Thank you! We've been working on this since 2021 (and some of us a bit before that), and we're reminded every day that we are ultimately working on something that helps people on both the macro and micro level. We want a world free of the malevolent uses of deepfakes for ourselves, our loved ones, and everyone beyond, and feel everyone should have access to such protection.
How easy is it to fool Reality Defender into making false positives?
Whenever I'm openly performing nefarious illegal acts in public, I always wear my Sixfinger, so if anyone takes a photo of me, I can plausibly deny it by pointing out (while not wearing it) that the photo shows six fingers, and obviously must have been AI generated.
In support of said nefarious illegal acts, the Sixfinger includes a cap-loaded grenade launcher, gun, fragmentation bomb, ballpoint pen, code signaler, and message missile launcher. It's like a Swiss Army Finger! You can 3d print a cool roach clip attachment too.
"How did I ever get along with five???"
https://www.youtube.com/watch?v=ElVzs0lEULs
https://www.museumofplay.org/blog/sixfinger-sixfinger-man-al...
https://www.museumofplay.org/app/uploads/2010/11/Sixfinger-p...
I feel like this will be the next big cat-and-mouse saga after ad blockers:
1) Produce AI tool
2) Tool gets used for bad
3) Use anti-AI/AI detection to avoid/check for AI tool
4) AI tool introduces anti-anti-AI/detection tools
5) Repeat
This is definitely a concern, but this is more or less how the cybersecurity space already works. Having dedicated researchers and a good business model helps a lot for keeping detectors like RD on the forefront of capabilities.
About time. Much needed. I just wish this was open source and built in public.
On my todo list to build a bot that finds sly AI responses for engagement farming
On a 2K desktop using Chrome, your website's font/layout is way too big, especially the consent banner, which takes up a third of the screen.
I suspect most companies consider that a feature, not a bug. People are more likely to hit accept if they can only see a small chunk of the content.
And the scrolling behavior is infuriating
please do not hijack the scroll wheel
It's like scrolling through molasses.