The Universitat Oberta de Catalunya is leading a project with Japanese and Polish researchers to automatically differentiate between original and fake multimedia content, using techniques from digital content forensics analysis, watermarking and AI.
Social media represents a major channel for the spreading of fake news and disinformation. This situation has been made worse with recent advances in photo and video editing and artificial intelligence tools, which make it easy to tamper with audiovisual files – for example with so-called deepfakes, which combine and superimpose images, audio and video clips to create montages that look like real footage.
Researchers from the K-riptography and Information Security for Open Networks (KISON) and the Communication Networks & Social Change (CNSC) groups of the Internet Interdisciplinary Institute (IN3) at the Universitat Oberta de Catalunya (UOC) have launched a new project to develop innovative technology based on artificial intelligence and data concealment techniques. The technology should help users to automatically differentiate between original and adulterated multimedia content, and so help minimise the reposting of fake news. DISSIMILAR is an international initiative headed by the UOC that also includes researchers from the Warsaw University of Technology (Poland) and Okayama University (Japan).
‘The project has two objectives: firstly, to provide content creators with tools to watermark their creations, thus making any modification easily detectable; and secondly, to offer social media users tools based on latest-generation signal processing and machine learning methods to detect fake digital content,’ explained Professor David Megías, KISON lead researcher and director of the IN3. Furthermore, DISSIMILAR aims to include ‘the cultural dimension and the viewpoint of the end user throughout the entire project’, from the designing of the tools to the study of usability in the different stages.
The danger of bias
Currently, there are two main types of tools to detect fake news. Firstly, there are automatic ones based on machine learning, of which only a few prototypes exist so far. Secondly, there are fake news detection platforms that rely on human involvement, as is the case with Facebook and Twitter, which require people to ascertain whether specific content is genuine or fake.
According to David Megías, this centralised solution could be affected by ‘different biases’ and encourage censorship. ‘We believe that an objective assessment based on technological tools might be a better option, provided that users have the last word on deciding, on the basis of a pre-evaluation, whether they can trust certain content or not,’ he explained.
For Megías, there is no ‘single silver bullet’ that can detect fake news: rather, detection needs to be carried out with a combination of different tools. ‘That’s why we’ve opted to explore the concealment of information (watermarks), digital content forensics analysis techniques (to a great extent based on signal processing) and, it goes without saying, machine learning,’ he noted.
Automatically verifying multimedia files
Digital watermarking comprises a series of techniques in the field of data concealment that embed imperceptible information in the original file, making it possible to verify a multimedia file ‘easily and automatically’.
‘It can be used to indicate a content’s legitimacy by, for example, confirming that a video or photo has been distributed by an official news agency, and can also be used as an authentication mark, which would be deleted in the case of modification of the content, or to trace the origin of the data. In other words, it can tell if the source of the information (e.g. a Twitter account) is spreading fake content,’ explained Megías.
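To illustrate the idea of an authentication mark that disappears when content is modified, here is a minimal sketch of a so-called fragile watermark. It is not the DISSIMILAR project's method, which the article does not describe in detail. The sketch assumes an image represented as a flat sequence of 8-bit grayscale values and a secret key held by the publisher; the function names `embed_watermark` and `verify_watermark` are hypothetical. A keyed hash of the content is hidden in the least significant bits of the first 256 pixels, so each pixel changes by at most one intensity level.

```python
import hashlib
import hmac

MARK_BITS = 256  # length of a SHA-256 digest in bits


def _clear_mark_area(pixels: bytes) -> bytearray:
    """Zero the least significant bit of the pixels that will carry the mark."""
    cleared = bytearray(pixels)
    for i in range(MARK_BITS):
        cleared[i] &= 0xFE
    return cleared


def _digest_bits(key: bytes, carrier: bytes) -> list[int]:
    """Keyed hash (HMAC-SHA256) of the content, returned as a list of 0/1 bits."""
    digest = hmac.new(key, carrier, hashlib.sha256).digest()
    return [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]


def embed_watermark(pixels: bytes, key: bytes) -> bytes:
    """Hide a keyed hash of the image in the LSBs of its first 256 pixels."""
    if len(pixels) < MARK_BITS:
        raise ValueError("image too small to carry the mark")
    carrier = _clear_mark_area(pixels)
    for i, bit in enumerate(_digest_bits(key, bytes(carrier))):
        carrier[i] |= bit
    return bytes(carrier)


def verify_watermark(pixels: bytes, key: bytes) -> bool:
    """Return True only if the hidden hash still matches the content."""
    if len(pixels) < MARK_BITS:
        return False
    found = bytes(value & 1 for value in pixels[:MARK_BITS])
    expected = bytes(_digest_bits(key, bytes(_clear_mark_area(pixels))))
    return hmac.compare_digest(found, expected)
```

In this sketch, changing any pixel after embedding makes `verify_watermark` return `False`, which is the behaviour Megías describes for an authentication mark. A practical scheme would also need to survive harmless operations such as recompression, which this simplified version does not attempt.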
Digital content forensics analysis techniques
The project will combine the development of watermarks with the application of digital content forensics analysis techniques. The goal is to leverage signal processing technology to detect the intrinsic distortions produced by the devices and programs used when creating or modifying any audiovisual file.
These processes give rise to a range of alterations, such as sensor noise or optical distortion, which could be detected by means of machine learning models. ‘The idea is that the combination of all these tools improves outcomes when compared with the use of single solutions,’ stated Megías.
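As a rough illustration of how sensor noise can act as a signature, the sketch below extracts a high-frequency residual from an image and correlates it with a reference noise pattern attributed to a device. This is a simplified stand-in under stated assumptions, not the project's pipeline: the image is a small grayscale grid of floats, the reference pattern is assumed to be already known, and the helper names (`noise_residual`, `normalized_correlation`, `synthetic_pattern`) are hypothetical. Real forensic tools use far more careful denoising and, as the article notes, machine learning models.

```python
import random
from statistics import fmean


def noise_residual(image):
    """Subtract a 3x3 local mean from each pixel, keeping the high-frequency part."""
    height, width = len(image), len(image[0])
    residual = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(image[y][x] - fmean(window))
        residual.append(row)
    return residual


def normalized_correlation(a, b):
    """Pearson correlation between two equally sized 2D grids."""
    flat_a = [value for row in a for value in row]
    flat_b = [value for row in b for value in row]
    mean_a, mean_b = fmean(flat_a), fmean(flat_b)
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(flat_a, flat_b))
    den = (
        sum((p - mean_a) ** 2 for p in flat_a)
        * sum((q - mean_b) ** 2 for q in flat_b)
    ) ** 0.5
    return num / den if den else 0.0


def synthetic_pattern(size, seed):
    """Stand-in for a device noise pattern (illustrative data, not a real sensor)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 2.0) for _ in range(size)] for _ in range(size)]
```

A high correlation between an image's residual and a device's reference pattern suggests that the image carries that device's noise, while a value near zero suggests it does not. Where to set the decision threshold is a design choice that this sketch leaves open.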
Studies with users in Spain, Poland and Japan
One of the key characteristics of DISSIMILAR is its ‘holistic’ approach and its gathering of the ‘perceptions and cultural components around fake news’. With this in mind, different user-focused studies will be carried out, broken down into different stages.