Unveiling Deepfake Detection Using Vision Transformers: A Survey and Experimental Study

Pritesh Patil; Govind Dayma; Sujay Farkade; Harshvardhan Pawar; Swayam Pilare

doi:https://www.doi.org/10.59256/indjcst.20260501005

Original Article

Pritesh Patil¹ Govind Dayma² Sujay Farkade³ Harshvardhan Pawar⁴ Swayam Pilare⁵

¹ Professor, Department of Information Technology, AISSMS Institute of Information Technology, Pune, Maharashtra, India.
²,³,⁴,⁵ Department of Information Technology, AISSMS Institute of Information Technology, Pune, Maharashtra, India.

Published Online: January-April 2026

Pages: 29-40

Abstract

The speed at which artificial intelligence (AI) and machine learning now enable the production of fake but highly realistic synthetic media (deepfakes) is a growing concern. Deepfakes threaten the trustworthiness of media, individuals' privacy rights, and national security. Generative models, especially diffusion-based models, are advancing rapidly and leave fewer noticeable artifacts in manipulated photos, so convolutional neural networks (CNNs) may no longer detect these manipulations as effectively as they once did. This paper provides a structured review of image-based deepfake detection methods built on Vision Transformer architectures, which use self-attention to capture semantic relationships globally across the entire image, together with an experimental evaluation of an image-based Vision Transformer architecture for detecting deepfakes generated by current generative models. Experimental results on well-established benchmarks and diffusion-generated images indicate that the accuracy of our approach ranges between 80% and 85%, showing the ability of transformer-based models to detect global inconsistencies in deepfakes. We also discuss challenges in deepfake detection, including data quality, generalization to new forms of manipulation, adversarial robustness, and the ethics of deepfakes. Additionally, we highlight emerging areas of research, specifically Explainable Artificial Intelligence (XAI), to support the development of fully transparent deepfake detection systems. Ultimately, this work highlights the need for Vision Transformer-based approaches to build robust and future-ready deepfake detection systems.
