Deepfakes leveled up in 2025 – here’s what’s coming next

Over the course of 2025, deepfakes improved dramatically. AI-generated faces, voices and full-body performances that mimic real people rose in quality far beyond what even many experts expected just a few years ago. They were also increasingly used to deceive people.

For many everyday scenarios, especially low-resolution video calls and media shared on social platforms, their realism is now high enough to reliably fool nonexpert viewers. In practical terms, synthetic media have become indistinguishable from authentic recordings for ordinary people and, in some cases, even for institutions.

And this surge is not limited to quality. The volume of deepfakes has grown explosively: Cybersecurity firm DeepStrike estimates an increase from roughly 500,000 online deepfakes in 2023 to about 8 million in 2025, with annual growth nearing 900%.

I’m a computer scientist who researches deepfakes and other synthetic media. From my vantage point, I see that the situation is likely to get worse in 2026 as deepfakes become synthetic performers capable of reacting to people in real time.

Dramatic improvements

Several technical shifts underlie this dramatic escalation. First, video realism made a significant leap thanks to video generation models designed specifically to maintain temporal consistency. These models produce videos that have coherent motion, consistent identities of the people portrayed, and content that makes sense from one frame to the next. The models disentangle the information that represents a person’s identity from the information about motion, so that the same motion can be mapped onto different identities, or the same identity can perform many different motions.

These models produce stable, coherent faces without the flicker, warping or structural distortions around the eyes and jawline that once served as reliable forensic evidence of deepfakes.
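
The separation of identity from motion described above can be pictured with a deliberately tiny sketch. It is not how any real video model is built: real systems learn these codes with neural networks, whereas here the `Identity` and `MotionFrame` records, their fields, and the `render` and `animate` helpers are all hypothetical stand-ins invented for illustration. The only point is that once the two kinds of information are held apart, one motion track can be replayed on any identity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Identity:
    """Hypothetical identity code: who is on screen."""
    name: str
    face_width: float


@dataclass(frozen=True)
class MotionFrame:
    """Hypothetical per-frame motion code: what the face is doing."""
    head_yaw: float
    mouth_open: float


Frame = Tuple[str, float, float, float]


def render(identity: Identity, motion: MotionFrame) -> Frame:
    """Toy 'decoder': combine one identity code and one motion code into a frame."""
    return (identity.name, identity.face_width, motion.head_yaw, motion.mouth_open)


def animate(identity: Identity, motion_track: List[MotionFrame]) -> List[Frame]:
    """Replay a motion track on a chosen identity, one frame per motion code."""
    return [render(identity, motion) for motion in motion_track]


# One recorded performance (assumed values), applied to two different people.
motion_track = [MotionFrame(0.0, 0.1), MotionFrame(5.0, 0.6), MotionFrame(10.0, 0.2)]
alice = Identity("alice", 14.0)
bob = Identity("bob", 16.5)

alice_clip = animate(alice, motion_track)
bob_clip = animate(bob, motion_track)

# The motion part of every frame is identical; only the identity part differs.
same_motion = [f[2:] for f in alice_clip] == [f[2:] for f in bob_clip]
```

Because the identity code never changes within a clip in this toy, the person portrayed stays the same from the first frame to the last, which loosely mirrors the consistency the article describes.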

Second, voice cloning has crossed what I would call the “indistinguishable threshold.” A few seconds of audio now suffice to generate a convincing clone – complete with natural intonation, rhythm, emphasis, emotion, pauses and breathing noise. This capability is already fueling large-scale fraud. Some major retailers report receiving over 1,000 AI-generated scam calls per day. The perceptual tells that once gave away synthetic voices have largely disappeared.

Third, consumer tools have pushed the technical barrier almost to zero. Upgrades from OpenAI’s Sora 2 and Google’s Veo 3, along with a wave of startups, mean that anyone can describe an idea, let a large language model such as OpenAI’s ChatGPT or Google’s Gemini draft a script, and generate polished audio-visual media in minutes. AI agents can automate the entire process. The capacity to generate coherent, storyline-driven deepfakes at a large scale has…
