We need to remember that creating fakes is an application, not a tool -- and that malicious applications are not the whole story.
By Ben Lorica and Mike Loukides. [A version of this post appears on the O'Reilly Radar.]
Deepfakes have been very much in the news for the past two years. It's time to think about what deepfakes are and what they mean. Where do they come from? Why now? Is this just a natural evolution in the history of technology?
Deepfakes are media that are created by AI. They appear to be genuine (e.g., a video of President Obama) but have limited connection to reality. An audio track can be created that sounds indistinguishable from the victim, saying something the victim would never have said. Video that matches the soundtrack can be generated from existing videos or photos, so that the mouth moves correctly and the facial expressions look natural. It's not surprising that humans have trouble detecting fakes; with the current technology, even shallow fakes are too good.
Deepfakes are the logical extension of older AI research. It wasn't long ago that we read about AI generating new paintings in the style of Rembrandt and other Dutch Masters, stylizing pictures in the style of Van Gogh and Picasso, and so on. At the time, there was more concern about the future of human creativity: would we still need artists? Would we live in a world full of fake Van Goghs? We shrugged those "fakes" off because we were asking the wrong questions. We don't need more Van Goghs any more than we need more Elvises on velvet. We may end up with a few fake Rembrandts where they shouldn't be, but the art world will survive.
If that's the wrong question, what's the right one? The problem with deepfakes is that the technology for simulating an artist's style has collided with the rise of fake news. Fake news isn't new by any means; there have always been conspiracy theorists who are marvelously skeptical of "traditional" media, but completely unskeptical of their own sources, whether those sources claim that Tibetans are spying on us through a system of underground tunnels or that vaccinations cause autism.
To this collision, add three more factors: the democratization of AI, the decrease in the cost of computing power, and the phenomenon of virality. Deepfakes jumped out of the lab and into the streets. You don't need a Ph.D. to generate fake media, nor do you need the resources of a nation state to acquire enough computing power. Some easily available tools and a credit card to buy time on AWS are all you need. In some cases, it only takes an app: in China, a popular iPhone app lets you put your face into movie clips. (Ironically, the backlash against this app came not because of the fakes but because of the app's privacy policy.) Once you've created a fake, you can use social media to propagate it. YouTube's and Facebook's algorithms for optimizing "engagement" can make any content viral in seconds.
That all adds up to a scary picture. We will certainly see deepfakes in politics, though as security expert @thegrugq points out, cheap fakes are better than deepfakes for shaping public opinion. Deepfakes might be more dangerous in computer security, where they can be used to circumvent authentication or perform high-quality phishing attacks. Symantec has reported that it has seen such attacks in the field, and recently an AI-generated voice that mimicked a CEO was used in a major fraud.
Deepfakes for good
The scary story has been covered in many places, and it's not necessary to repeat it here. What's more interesting is to realize that deepfakes are, at bottom, an application of high-quality media synthesis. "Fakes" are a matter of context; they are specific applications of technologies for synthesizing video and other media. There are many contexts in which synthetic video can be used for good. Here are a few of those applications.

Synthesia creates videos with translations, in which the video is altered so that the speaker's movements match the translation. It provides an easy way to create multilingual public service announcements that feel natural: you don't have to find and film actors capable of getting your message across in many languages.
One of the biggest expenses in video games is creating compelling video. Landscapes are important, but so are dialog and facial expressions. Synthetic video is useful for creating and animating anime characters; Nvidia has used generative adversarial networks (GANs) to create visuals that can be used in video games.
There are many fields, such as medicine, in which collecting labeled training data is difficult. In one experiment, synthetic MRI images showing brain cancers were created to train neural networks to analyze MRIs. This technique has two advantages. First, cancer diagnoses are relatively rare, so it's difficult to find enough images; and second, using synthetic images raises few privacy issues, if any. A large set of synthetic cancerous MRIs can be created from a small set of actual MRIs without compromising patient data because the synthetic MRIs don't match any real person.
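To make the idea concrete, here is a minimal sketch of GAN-based data augmentation, assuming PyTorch, a toy fully connected architecture, and 64x64 grayscale images. These are our placeholder assumptions, not the architecture or data used in the MRI experiment; the point is only the shape of the technique: train a generator against a discriminator, then sample new images to enlarge a small training set.

```python
# Minimal GAN sketch for augmenting a small image dataset (PyTorch).
# Illustrative assumptions: 64x64 grayscale images and tiny networks.
import torch
import torch.nn as nn

LATENT_DIM, IMG_SIZE = 100, 64

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, IMG_SIZE * IMG_SIZE), nn.Tanh(),
        )
    def forward(self, z):
        return self.net(z).view(-1, 1, IMG_SIZE, IMG_SIZE)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(IMG_SIZE * IMG_SIZE, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1), nn.Sigmoid(),
        )
    def forward(self, x):
        return self.net(x)

def train_step(gen, disc, real_batch, opt_g, opt_d, loss_fn):
    n = real_batch.size(0)
    ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)
    fake = gen(torch.randn(n, LATENT_DIM))

    # Discriminator: real scans labeled 1, generated scans labeled 0.
    d_loss = loss_fn(disc(real_batch), ones) + loss_fn(disc(fake.detach()), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to make the discriminator label fakes as real.
    g_loss = loss_fn(disc(fake), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Typical wiring: loss_fn = nn.BCELoss(); opt_g and opt_d are Adam optimizers
# over gen.parameters() and disc.parameters(). After training, calling
# gen(torch.randn(n, LATENT_DIM)) yields n synthetic images that can be mixed
# into a classifier's training set without exposing any patient's actual scan.
```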
Another medical application is creating synthetic voices for people who have lost the ability to speak. Project ReVoice can create synthetic voices for ALS patients based on recordings of their own voice, rather than using mechanical-sounding synthetic voices. Remember hearing Stephen Hawking "speak" with his robotic computer-generated voice? That was state-of-the-art technology a few years ago. ReVoice could give a patient their own voice back.
Many online shopping sites are designed to make it easier to find clothes that you like and that fit. Deepfake technologies can be used to take images of customers and edit in the clothing they are looking at. The images could even be animated, so customers can see how an outfit moves as they walk.
Policies and protections
We will see a lot of fakes: some deep, some shallow, some innocuous, some serious. The more important question is what should be done about it. So far, social media companies have done little to detect and alert us to fakes, whether they are deep or shallow. Facebook has admitted that it was slow to detect a fake video of Nancy Pelosi--and that video was an unsophisticated shallow fake. You could argue that any photoshopped picture is a "shallow fake," and it isn't hard to find social media "influencers" whose influence depends, in part, on Photoshop. Deepfakes will be even harder to detect. What role should social media companies such as Facebook and YouTube have in detecting and policing fakes?

Social media companies, not users, have the computing resources and the technical expertise needed to detect fakes. For the time being, the best detectors are very hard to fool. And Facebook has just announced the Deepfake Detection Challenge, in partnership with Microsoft and a number of universities and research groups, to "catalyze more research and development" in detecting fakes.
Hany Farid estimates that people working on video synthesis outnumber people working on detection 100:1, but the ratio isn't the real problem. The future of deepfake fraud will be similar to what we've already seen with cybersecurity, which is dominated by "script kiddies" who use tools developed by others, but who can't generate their own exploits. Regardless of the sophistication of the tools, fakes coming from "fake kiddies" will be easily detectable, just because those tools are used so frequently. Any signatures they leave in the fakes will show up everywhere and be easily caught. That's how we deal with email spam now: if spam were uncommon, it would be much harder to detect. It also wouldn't be a problem.
In addition to the "fake kiddies," there will be a small number of serious researchers who build the tools. They are a bigger concern. However, it's not clear that they have an economic advantage. Media giants like Facebook and Google have the deep pockets needed to build state-of-the-art detection tools. They have practically unlimited computing resources, an army of researchers, and the ability to pay much more than a crooked advertising agency. The real problem is that media sites make more money from serving fake media than from blocking it; they emphasize convenience and speed over rigorous screening. And, given the number of posts that they screen, even a 0.1% false positive rate is going to create a lot of alerts.
When fake detection tools are deployed, the time needed to detect a fake is important. Fake media does its damage almost instantly. Once a fake video has entered a social network, it will circulate indefinitely. Announcing after the fact that it is a fake does little good, and may even help the fake to propagate. Given the nature of virality, fakes have to be stopped before they're allowed to circulate. And given the number of videos posted on social media, even with Facebook- or Google-like resources, responding quickly enough to stop a fake from propagating will be very difficult. We haven't seen any data on the CPU resources required to detect fakes with the current technology, but researchers working on detection tools will need to take speed into account.
In addition to direct fake detection, it should be possible to use metadata to help detect and limit the spread of fakes. Renée DiResta has argued that spam detection techniques could work; and older research into USENET posting patterns has shown that it's possible to identify the role users take using only metadata from their posts, not the content. While techniques like these won't be the whole solution, they represent an important possibility: can we identify bad actors by the way they act, not the content they post? If we can, that would be a powerful tool.
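To show what identifying bad actors by behavior might look like, here is a minimal sketch, assuming scikit-learn and a handful of invented metadata features (posting rate, reshare fraction, posting cadence, account age). The feature names, values, and labels are purely hypothetical; the point is only that a classifier can be trained on behavior without ever looking at content.

```python
# Sketch: flagging likely coordinated accounts from posting metadata alone.
# All features, values, and labels below are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# One row per account: posts per day, fraction of posts that are reshares,
# median seconds between posts, fraction posted within a single 4-hour window,
# account age in days.
X = np.array([
    [250, 0.95,    12, 0.90,   30],   # burst-posting, mostly reshares, new account
    [  4, 0.20,  9000, 0.25, 2100],   # ordinary long-lived account
    [180, 0.88,    20, 0.85,   15],
    [  7, 0.10,  6500, 0.30, 1500],
    [300, 0.92,     8, 0.95,   10],
    [  2, 0.05, 12000, 0.20, 3000],
])
y = np.array([1, 0, 1, 0, 1, 0])  # 1 = known coordinated/inauthentic account

model = LogisticRegression(max_iter=1000).fit(X, y)

# Score a new account; a high probability is a signal to review, not a verdict.
print(model.predict_proba([[220, 0.90, 15, 0.88, 20]])[:, 1])
```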
Since many fakes take the form of political advertisements, the organizations that run these advertisements must bear some accountability. Facebook is tightening up its requirements for political ads, requiring tax ID numbers and other documentation, along with "paid for" disclaimers. These stricter requirements could still be spoofed, but they are an improvement. Facebook's new rules go at least part way toward Edward Docx's three suggestions for regulation:
1. Nobody should be allowed to advertise on social media during election campaigns unless strongly authenticated--with passports, certificates of company registration, declarations of ultimate beneficial ownership.
2. The source and application of funds need to be clear and easily visible.
3. All ads should be recorded--as should the search terms used to target people.

The danger is that online advertising is searching for engagement and virality, and it's much easier to maximize engagement metrics with faked extreme content. Media companies and their customers--the advertisers--must wean themselves from the engagement habit. Docx's suggestions would at least leave an audit trail, so it would be possible to reconstruct who showed which advertisement to whom. They don't, however, address the bigger technical problem of detecting fakes in real time. We'd add a fourth suggestion: social media companies should not pass any video on to their consumers until it has been tested, even if that delays posting. While Facebook is obviously interested in tightening up authentication requirements, we doubt they will be interested in adding delays in the path between those who post video and their audiences.
Is regulation a solution? Regulation brings its own problems. Regulators may not adequately understand what they're regulating, leading to ineffective (or even harmful) regulation with easy technical workarounds. Regulators are likely to be unduly influenced by the companies they regulate, who may suggest rules that sound good but don't require them to change their practices. Compliance also places a bigger burden on upstarts who want to compete with established media companies such as Facebook and Google.
Defending against disinformation
What can individuals do against a technology that's designed to confuse them? It's an important question, regardless of whether some sort of regulation "saves the day." It's entirely too easy to imagine a dystopia where we're surrounded by so many fakes that it's impossible to tell what's real. However, there are some basic steps you can take to become more aware of fakes and to prevent propagating them.

Perhaps most important, never share or "like" content that you haven't actually read or watched. Too many people pass along links to content they haven't seen themselves. They're going entirely by a clickbait title, and those titles are designed to be misleading. It's also better to watch entire videos rather than short clips; watching the entire video gives context that you'd otherwise miss. It's very easy to extract misleading video clips from larger pieces without creating a single frame of fake video!
When something goes viral, avoid piling on; virality is almost always harmful. Virality depends on getting thousands of people in a feedback loop of narcissistic self-validation that has almost nothing to do with the content itself.
It's important to use critical thinking; it's also important to think critically about all your media, especially media that supports your point of view. Confirmation bias is one of the most subtle and powerful ways of deceiving yourself. Skepticism is necessary, but it has to be applied evenly. It's useful to compare sources and to rely on well-known facts. For example, if someone shares a video of "Boris Johnson in Thailand in June 2014" with you, you can dismiss the video without watching it because you know Boris was not in Thailand at that time. Strong claims require stronger evidence, and rejecting evidence because you don't like what it implies is a great way to be taken in by fake media.
While most discussions of deepfakes have focused on social media consumption, they're perhaps more dangerous in other forms of fraud, such as phishing. Defending yourself against this kind of fraud is not fundamentally difficult: use two-factor authentication (2FA), and make sure there are other channels to verify any communication. If you receive a voicemail asking you to do something, there should be an independent way to confirm that the message is genuine--perhaps by making a call back to a prearranged number. Don't do anything simply because a voice tells you to. That voice may not be what you think it is.
If you're very observant, you can detect fakery in a video itself. Real people blink frequently, every 2 to 10 seconds. Blinks are hard to simulate because synthetic video is usually derived from still photographs, and there are few photographs of people blinking. Therefore, people in fake video may not blink, or they may blink infrequently. There may be slight errors in synchronization between the sound and the video; do the lips match the words? Lighting and shadows may be off in subtle but noticeable ways. There may be other minor but detectable errors: noses that don't point in quite the right direction, distortions or blurred areas on an image that's otherwise in focus, and the like. However, blinking, synchronization, and other cues show how quickly deepfakes are evolving. After the problem with blinking was publicized, the next generation of software incorporated the ability to synthesize blinking. That doesn't mean these cues are useless; we can expect that many garden-variety fakes won't be using the latest software. But the organizations building detection tools are in an escalating arms race with bad actors on technology's leading edge.
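The blinking cue can even be checked automatically. The sketch below estimates a speaker's blink rate from the "eye aspect ratio" computed over facial landmarks. It is a minimal sketch, assuming OpenCV, dlib, and the standard 68-point landmark model file; the closed-eye threshold is a rough guess rather than a validated value.

```python
# Sketch: using blink rate as one (imperfect) cue for synthetic video.
# Assumes OpenCV, dlib, and the 68-point landmark file on disk.
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
EAR_THRESHOLD = 0.21  # eye treated as closed below this (rough guess)
RIGHT_EYE, LEFT_EYE = range(36, 42), range(42, 48)

def eye_aspect_ratio(pts):
    # EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|); it drops sharply during a blink.
    p = [np.array(pt, dtype=float) for pt in pts]
    return (np.linalg.norm(p[1] - p[5]) + np.linalg.norm(p[2] - p[4])) / (
        2.0 * np.linalg.norm(p[0] - p[3]))

def blinks_per_minute(video_path):
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    blinks, closed, frames = 0, False, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames += 1
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector(gray)
        if not faces:
            continue
        shape = predictor(gray, faces[0])
        ears = [eye_aspect_ratio([(shape.part(i).x, shape.part(i).y) for i in eye])
                for eye in (RIGHT_EYE, LEFT_EYE)]
        ear = sum(ears) / 2.0
        if ear < EAR_THRESHOLD and not closed:
            blinks, closed = blinks + 1, True
        elif ear >= EAR_THRESHOLD:
            closed = False
    cap.release()
    minutes = frames / fps / 60.0
    return blinks / minutes if minutes else 0.0
```

A subject who blinks far less often than the every-2-to-10-seconds range mentioned above is worth a closer look. But as noted, newer synthesis tools reproduce blinking, so treat this as one weak signal among many, not a verdict.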
We don't expect many people to inspect every video or audio clip they see in such detail. We do expect fakes to get better, we expect both deep and shallow fakes to proliferate, and we expect people to charge genuine video with being faked. After all, with fake news, the real goal isn't to spread disinformation; it's to nurture an attitude of suspicion and distrust. If everything is under a cloud of suspicion, the bad actors win.
Therefore, we need to be wary and careful. Skepticism is useful--after all, it's the basis for science--but denial isn't skepticism. Some kind of regulation may help social media come to terms with fakes, but it's naive to pretend that regulating media will solve the problem. Better tools for detecting fakes will help, but exposing a fake frequently does little to change people's minds, and we expect the ability to generate fakes will at least keep pace with the technology for detecting them. Detection may not be enough; the gap between the time a fake is posted and the time it's detected may well be enough for disinformation to take hold and go viral.
Above all, though, we need to remember that creating fakes is an application, not a tool. The ability to synthesize video, audio, text, and other information sources can be used for good or ill. The creators of OpenAI's powerful tool for generating fake text concluded that, after careful monitoring, they had not yet found any attempts at malicious use, but had seen multiple beneficial applications, including code autocompletion, grammar help, and question-answering systems for medical assistance. Malicious applications are not the whole story. The question is whether we will change our own attitudes toward our information sources and become more informed, rather than less. Will we evolve into users of information who are more careful and aware? The fear is that fakes will evolve faster than we can; the hope is that we'll grow beyond media that exists only to feed our fears and superstitions.