When you snap a digital photo with your camera or phone, it stores more than just the pixels and colors that make up the image. Each image file also contains metadata. This includes details ranging from creation date and copyright info to the location where the photo was taken.
The same goes for images modified with photo editing programs, which often add metadata such as modification timestamps, system info, and tracked changes.
Metadata can pose a privacy threat to people who share and post photos online. Although some social networks and photo storage and sharing sites scrub metadata from uploaded photos, Comparitech researchers say that many fail to do so. This could allow attackers to gather personal information from images posted online. For example, if someone posts a vacation photo with GPS coordinates and a timestamp in the metadata, an attacker could easily find when and where they traveled.
Metadata falls into three broad categories:
- System metadata is generated when the image is stored (i.e. when a photo is taken or edits are saved). It includes specific labeled criteria, like the date and time the image was created and details about the camera and/or editing software used
- Substantive metadata include the contents of the actual file, such as tracked changes to an edited image
- Embedded metadata include data entered into a document that is not normally visible, such as formulas in an Excel spreadsheet
Image metadata can be embedded internally in common image file formats like JPEG and PNG. Such metadata is usually stored in the Exif (exchangeable image file format) standard. But metadata can also live outside the image file, for example in a digital asset management (DAM) system. These external metadata files are sometimes referred to as “sidecar” files, and are often stored in the XMP format.
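To make "embedded" concrete, here is a minimal sketch of how you might check a JPEG for embedded metadata using only the Python standard library. It relies on the JPEG layout, in which header segments come before the compressed image data and Exif and XMP usually travel in APP1 segments. The function names are our own, and the parser is deliberately simple: it assumes a well-formed file with no padding bytes between segments.

```python
import struct

# Signatures that identify what an APP1 segment carries.
EXIF_HEADER = b"Exif\x00\x00"
XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"


def iter_jpeg_segments(data: bytes):
    """Yield (marker, payload) for each header segment before the image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:  # SOS: compressed image data starts here
            return
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def find_metadata_segments(data: bytes) -> list[str]:
    """Return the kinds of metadata segments found, in file order."""
    found = []
    for marker, payload in iter_jpeg_segments(data):
        if marker == 0xE1 and payload.startswith(EXIF_HEADER):
            found.append("Exif")
        elif marker == 0xE1 and payload.startswith(XMP_HEADER):
            found.append("XMP")
        elif marker == 0xFE:  # COM: free-text comment
            found.append("Comment")
    return found
```

You could run it on a file with something like `find_metadata_segments(open("photo.jpg", "rb").read())`, where `photo.jpg` is a placeholder name. An empty list suggests no Exif, XMP, or comment segments, though other formats such as IPTC are not covered by this sketch.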
Metadata has three broad use cases:
- Describing the contents of the file, including keywords, names of persons pictured, and location coordinates
- Copyright info includes the creator’s attribution, licensing restrictions, credits, and terms of use
- Administrative data can include the creation date, modification date, location, and other system metadata mentioned above.
Which image sharing services scrub metadata and which ones don’t?
Comparitech researchers analyzed the metadata scrubbing practices of 12 popular image storage and sharing services online. They uploaded an image of the Mona Lisa loaded with metadata to each of the services. After the upload, they then downloaded the image from each respective service to see if the metadata remained intact or not.
Let’s start with the most popular places to share images on the web. Imgur, Facebook and Instagram all scrub all metadata from photos upon upload. You don’t have to worry about leaking metadata when uploading images to these sites. Bear in mind, however, that even though users of those sites don’t have access to metadata, the sites themselves do.
Flickr keeps all of the original metadata and even displays a lot of it on each photo’s web page.
Photobox.co.uk tags photos in the metadata comments section to indicate that uploaded images are compressed. The rest of the metadata is intact. It was the only service that actually added or modified data.
The remaining image sharing and storage services we examined didn’t remove or modify any metadata except for “date modified” timestamps:
- Pasteboard.co
- Turboimagehost.com
- Linkpicture.com
- 8upload.com
- Imgpile.com
- Postimages.org
- Imgbb.com
- Imageupload.io
- Gifyu
If you don’t want to expose EXIF metadata on those sites, you’ll have to scrub images beforehand. More on how to do that below.
How you can be tracked using EXIF metadata: research examples
Comparitech researchers proved the sensitivity of image metadata by using publicly available images to track down image subjects and creators. (Note: we’ve scrubbed all of the following images of their original metadata).
Let’s start with a simple example. Using the GPS metadata in the above photo, we determined it was taken near Sørstranda, Norway.
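Turning GPS metadata into a point on a map takes very little work. Exif stores latitude and longitude as three rational numbers each (degrees, minutes, seconds) plus a reference letter (N/S or E/W). The sketch below converts that form to the decimal degrees most mapping tools accept. The function name and the sample values are illustrative only; they are not the coordinates from the photo discussed here.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref: str) -> float:
    """Convert Exif-style degrees/minutes/seconds to signed decimal degrees.

    `dms` is three (numerator, denominator) pairs, the way Exif stores
    rationals. `ref` is "N", "S", "E", or "W".
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Illustrative values only (not taken from any real photo).
latitude = dms_to_decimal(((68, 1), (7, 1), (30, 1)), "N")   # 68.125
longitude = dms_to_decimal(((13, 1), (30, 1), (0, 1)), "E")  # 13.5
print(f"{latitude}, {longitude}")
```

Pasting the resulting pair into any online map is enough to show where a photo was taken, which is why location data is usually the first thing worth scrubbing.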
The next subject was a photo of a man’s face. Using the image metadata, reverse image search, and a bit of open-source intelligence (OSINT), researchers were able to identify him as a previous game-show contestant. They found his country, date of birth, wedding date, spouse’s name, Facebook profile, Twitter account, LinkedIn page, Instagram account, work experience, skills, education, and interests. Researchers were also able to identify and find info about the subject’s game-show teammates.
Another subject was a passport-style headshot featuring a man in what appear to be military fatigues. Researchers were able to track down the image to a site with photos of the subject’s school graduation. Using the school name and graduation gallery, researchers retrieved the names of everyone in his graduating class. With the possibilities narrowed down, they found a man with a name similar to that of the image filename. Researchers went on to find the man’s Facebook and Instagram profiles. Using these images, they further discovered he was indeed a soldier. They learned his division and brigade, and info about his closest relatives.
Lastly, researchers identified a Philippine national using a photo of herself posted on an image-sharing site. The subject is holding up photo identification. Such photos are often used to verify the subject’s identity to a digital service, such as an online bank. Researchers were able to find out the subject’s country, birth date, weight, height, blood type, address, Facebook profile, job, education, and YouTube channel, as well as the fact that she recently had Covid-19.
Metadata used as court evidence
Metadata from images and other files has been used as evidence in courts of law and police investigations, demonstrating metadata’s value from a privacy perspective. Here are a few prominent examples:
- In 2016, two Harvard students used GPS coordinates stored in the metadata of photos posted on the dark web to identify 229 drug dealers. Dark web drug dealers often post images of their products online to help prove their credibility, but they often forget to scrub EXIF data beforehand.
- In 2017, an employee of Bio-Rad Laboratories filed a suit against his employer alleging he was fired for telling authorities about potential bribery in China. A performance review with a metadata timestamp dated after he was fired served as evidence in the case, resulting in a higher payout for violating laws against firing whistleblowers. This is the biggest metadata-linked payout to date at $10.8 million in damages.
- In 2015, a judge threw out a case in which a woman accused her spouse of physical abuse. The plaintiff provided several photos as evidence of abuse, but the metadata showed that the photos were taken three months before the date on which she claimed the abuse occurred.
- In 2021, metadata was used to solve a disagreement over the validity of the Wills left by a recently deceased woman. The first Will left the estate to the woman’s niece. A second Will appeared to leave the estate to her grandchildren. Analysis of the second Will’s metadata revealed that it had been forged by one of the grandchildren.
- In a 2019 court case involving SPV-LS, LLC v. Transamerica Life Ins. Co., metadata showed that an agreement “purportedly signed” on one date was not actually created until almost a year later — two days before being presented to the Court.
How to remove metadata from images
Cameras and camera apps vary quite a bit, but many of them have an option to turn off or limit the generation of metadata. Check your camera or app settings.
Most cameras and image editing programs store image metadata in the EXIF format. You might be able to edit EXIF data on existing images through your camera or photo editing app.
Some programs are specifically designed to work with metadata. ExifTool and Image Scrubber are two great open-source options.
Windows 11 comes with a built-in option to remove metadata. However, this will only remove metadata that Windows 11 understands, which means it could leave some metadata behind. Still, it should at least help minimize the information stored in images. If you’re a PC user, just follow these steps:
- Right-click the file you’d like to remove metadata from and select Properties to open a new window
- Click the Details tab at the top
- Click the link that says Remove Properties and Personal Information at the bottom. Another new window will pop up.
- In the Remove Properties window, select Remove the following properties from this file:
- Click Select All, then OK
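If you would rather script the cleanup, the idea behind the steps above can be sketched in a few lines of Python for JPEG files. This is a rough illustration, not a replacement for a dedicated tool like ExifTool: it drops only APP1 (Exif and XMP), APP13 (typically IPTC), and comment segments, assumes a well-formed file, and leaves other formats such as PNG untouched. The function name and the choice of which segments to strip are our own assumptions.

```python
import struct

# Segments treated as metadata in this sketch:
# APP1 (Exif/XMP), APP13 (typically IPTC), COM (comments).
STRIP_MARKERS = {0xE1, 0xED, 0xFE}


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG with its metadata segments removed."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(data[:2])
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:
            # SOS: copy the compressed image data through to the end as-is.
            out += data[pos:]
            return bytes(out)
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if marker not in STRIP_MARKERS:
            out += data[pos:end]  # keep JFIF, quantization tables, etc.
        pos = end
    raise ValueError("no image data (SOS marker) found")
```

A typical use would read the original bytes, call `strip_jpeg_metadata`, and write the result to a new file so the original stays intact. Bear in mind that removing Exif also removes the orientation tag, so some photos may display rotated afterward.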
How to remove metadata from documents
Microsoft Office has a Document Inspector tool that makes it easy to check open documents for metadata. It’s available in Word, Excel, and PowerPoint.
To open the Document Inspector:
- Choose the File tab, and then choose Info.
- Choose Check for Issues.
- Choose Inspect Document.
- Select Document Properties and Personal Information.
- Click the Inspect button
- Click Remove All
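To check what is left in a document, you can also peek inside the file yourself. Modern Office files (.docx, .xlsx, .pptx) are ZIP archives, and document properties such as the author's name are normally kept in a part named `docProps/core.xml`. The sketch below lists whatever properties that part contains. The function name is our own, and it covers only core properties, not comments, tracked changes, or custom properties.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_core_properties(document_bytes: bytes) -> dict[str, str]:
    """Return the core properties stored in an Office Open XML file."""
    with zipfile.ZipFile(io.BytesIO(document_bytes)) as archive:
        try:
            xml_bytes = archive.read("docProps/core.xml")
        except KeyError:
            return {}  # no core properties part in this archive
    root = ET.fromstring(xml_bytes)
    properties = {}
    for child in root:
        # Tags look like "{namespace}creator"; keep only the local name.
        name = child.tag.rsplit("}", 1)[-1]
        if child.text:
            properties[name] = child.text
    return properties
```

For example, `read_core_properties(open("report.docx", "rb").read())` (with `report.docx` as a placeholder) might return entries such as `creator` and `lastModifiedBy` before cleaning. Running it again afterward is a quick way to confirm that the fields were actually removed.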