A checksum is a string of numbers and letters that’s used to “check” whether data or a file has been altered during storage or transmission. Checksums often accompany software downloaded from the web so that users can ensure the file or files were not modified in transit. If the checksum from the software vendor matches the checksum of the downloaded installation files on your computer, then no errors or modifications were made. If the checksum values don’t match up, the download might have been corrupted or compromised by hackers.
This article will explain how to use checksums to validate files on both a Mac and a PC. First, we’ll explain how to use a checksum, and then go into more detail about how it works.
For demonstration purposes, we’ll download VLC Media Player, a free and open source program that comes with a checksum.
How to use a checksum on Windows
There are many tools and utilities out there for validating checksums on Windows, but we’ll use built-in tools that come with Windows 7, 8, 10 and 11.
Start by downloading the file you want to check as usual. Remember that if it’s a compressed (zipped) file, you’ll want to run the checksum on the compressed file itself, before extracting the contents.
VLC’s website allows you to simply click a link to view the checksum right on the download page. Other software vendors might require you to download the checksum in a text file, in which case you can open it using Notepad or a similar text editor.
The checksum is a long string of seemingly random numbers and letters. Once you can see it, follow these steps:
- Open Command Prompt by holding Windows Key and pressing ‘R’. Type “cmd” into the text field and press Enter.
- Navigate to the folder where your file is located. If you use the default settings, this command should work:
cd Downloads
- Enter the following command, replacing [FILENAME] with the file you want to validate, including its extension, and [HASH] with the hash algorithm specified by the software vendor. In this case, the VLC download page says the hash algorithm is SHA256.
certutil -hashfile [FILENAME] [HASH]
- Press Enter to generate the checksum. Compare the checksum from the software vendor to the one you just created.
If the two checksums match, you’re good to go. The file hasn’t been corrupted or modified from the original version.
If the checksums don’t match, there’s a problem. It might not have downloaded properly, or a hacker could have hijacked your connection to make you download a corrupt file from a malicious server. The modified version could contain malware or other flaws. We do not recommend installing any software that does not have a validated checksum.
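If you have Python installed, the comparison can also be scripted instead of done by eye. The following is a minimal sketch rather than an official tool: the function names file_checksum and checksums_match are our own, and the file name and vendor hash in the usage comment are placeholders. The comparison ignores case and whitespace, since some tools may print the hash in uppercase or with spaces between bytes.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so large installers do not have to fit in memory.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_match(expected, actual):
    """Compare two hex checksums, ignoring case and any whitespace."""
    def normalize(value):
        return "".join(value.split()).lower()
    return normalize(expected) == normalize(actual)


# Usage (placeholder values, replace with your own file and the vendor's hash):
#   vendor_hash = "paste the checksum from the download page here"
#   print(checksums_match(vendor_hash, file_checksum("installer.exe")))
```

If the function prints False, treat the file the same way as a mismatch in Command Prompt and do not install it.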
Windows’ certutil command can use the following hash algorithms to generate a checksum:
- MD2
- MD4
- MD5
- SHA1
- SHA256
- SHA384
- SHA512
How to use a checksum on Mac OS
You can validate a checksum on Mac using built-in functions in Terminal. Start by downloading the file you want to validate along with the checksum from the vendor. Again, we’ll use VLC Media Player as an example.
When you download VLC, the checksum can be viewed right on the download page, but some software might require you to download the checksum in a separate text file. You can open such a file in TextEdit to view the checksum.
With the software vendor’s checksum in hand, follow these instructions:
- Open a Terminal by clicking the magnifying glass icon in the top right corner, searching for “terminal”, and clicking the first result.
- Assuming you downloaded the file you want to check into your default Downloads folder, navigate to that folder using the cd command in the Terminal:
cd Downloads
- The command to generate a checksum varies depending on the hash algorithm. In this case, that’s SHA256. Enter the following command into the terminal:
shasum -a 256 vlc-3.0.6.dmg
- The checksum will appear on the next line of the terminal. Compare it with the checksum generated by the software vendor and ensure it matches.
If the two checksums match, then the file hasn’t been corrupted or modified from the original version, and you’re good to go.
If the checksums don’t match up, then don’t install it. It might not have downloaded properly, or the connection could have been hijacked to make you download a malicious file. We do not recommend installing any software that does not have a validated checksum.
If you’re using a hash algorithm other than SHA256 on Mac, here are the commands you’ll need, replacing [filename] with the name of the file you wish to validate:
- MD5:
md5 [filename]
- SHA1:
shasum -a 1 [filename]
- SHA256:
shasum -a 256 [filename]
- SHA384:
shasum -a 384 [filename]
- SHA512:
shasum -a 512 [filename]
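We recommend using SHA256 or higher when possible. MD5 and SHA1 are vulnerable to collision attacks, in which two different files produce the same hash, so they are no longer considered secure for detecting deliberate tampering.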
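If a vendor publishes a checksum without naming the algorithm, the length of the string is a useful hint. The sketch below, written in Python with a helper name (guess_algorithm) of our own choosing, maps the length of a hex checksum to the algorithms listed above. It is only a heuristic: other algorithms not covered here (SHA3-256, for example) produce strings of the same lengths.

```python
import hashlib

# The algorithms covered by the commands above.
ALGORITHMS = ["md5", "sha1", "sha256", "sha384", "sha512"]

# Map hex-string length -> algorithm name (each digest byte is two hex characters).
LENGTH_TO_ALGORITHM = {
    hashlib.new(name).digest_size * 2: name for name in ALGORITHMS
}


def guess_algorithm(hex_checksum):
    """Guess the algorithm from the checksum's length; None if unrecognized.

    Heuristic only: unrelated algorithms can share the same output length.
    """
    cleaned = "".join(hex_checksum.split())
    return LENGTH_TO_ALGORITHM.get(len(cleaned))


for length, name in sorted(LENGTH_TO_ALGORITHM.items()):
    print(f"{name}: {length} hex characters")
# md5: 32 hex characters
# sha1: 40 hex characters
# sha256: 64 hex characters
# sha384: 96 hex characters
# sha512: 128 hex characters
```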
How checksums work
Hashing is a one-way function that takes in data of any size and outputs a value of fixed size. Unlike encryption, it cannot be reversed to recover the original data. The SHA256 hashing algorithm used above, for example, gives you a 256-bit value, written as a sequence of 64 hexadecimal characters (the digits 0-9 and the letters a-f) known as a “hash”. Whether the input is a text file with one sentence or an entire operating system, the output length will always be 64 characters. The hash will be the same every time so long as the data put into the hashing algorithm remains constant.
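Both properties, fixed output length and repeatability, are easy to confirm for yourself. Here is a short sketch using Python’s standard hashlib module; the sample inputs are arbitrary stand-ins for a small text file and a large download.

```python
import hashlib

short_input = b"One sentence."
large_input = b"\x00" * 10_000_000  # about 10 MB, standing in for a large file

short_hash = hashlib.sha256(short_input).hexdigest()
large_hash = hashlib.sha256(large_input).hexdigest()

# The output length is the same regardless of input size.
print(len(short_hash), len(large_hash))  # 64 64

# Hashing the same data again gives the same result.
print(short_hash == hashlib.sha256(short_input).hexdigest())  # True
```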
When downloading software, the hash value is used as the checksum. Hashing is also used by companies to verify users’ passwords without storing the password in plain text on a server. In email systems that use digital signatures, hashing is used to ensure emails haven’t been modified in transit, where the hash value is called a “message digest” instead of a checksum.
Checksums are also an integral part of the Internet Protocol (IP), the underlying technology that enables the internet. When data is transmitted across the internet in IP packets, checksums are used to detect packets that were corrupted along the way. Unlike software downloads, these protocols automate the validation process without the need for user input. To learn more, read about the TCP/IP and UDP/IP protocols.
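The checksum used in IPv4 headers, and by TCP and UDP, is much simpler than a cryptographic hash: it is a 16-bit ones’ complement sum, described in RFC 1071. The following Python sketch shows the idea. The function name and the sample header bytes are illustrative only; real network stacks compute this in the operating system or in hardware.

```python
def internet_checksum(data: bytes) -> int:
    """Compute the 16-bit ones' complement checksum described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


# A sample 20-byte IPv4 header with its checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("4500 0073 0000 4000 4011 0000 c0a8 0001 c0a8 00c7")
checksum = internet_checksum(header)
print(hex(checksum))  # 0xb861

# The receiver runs the same sum over the header including the checksum.
# A result of zero means no error was detected.
received = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(internet_checksum(received))  # 0
```

This kind of checksum catches accidental corruption but offers no protection against deliberate tampering, which is why software downloads rely on hash algorithms such as SHA256 instead.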
If even one bit of the original data is altered, then the hash value, checksum, or message digest will be drastically different. Therefore, if a piece of downloaded software contains any errors or modifications that make it different from what the software vendor officially published, then the hash values, checksums, or message digests will not match.
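To see this effect in practice, the short Python sketch below flips a single bit in a sample message and compares the two SHA256 hashes. The message is arbitrary; any input behaves the same way.

```python
import hashlib

original = bytearray(b"The quick brown fox jumps over the lazy dog")

altered = bytearray(original)
altered[0] ^= 0b00000001  # flip the lowest bit of the first byte

hash_original = hashlib.sha256(original).hexdigest()
hash_altered = hashlib.sha256(altered).hexdigest()

# Count how many of the 64 hex characters changed.
differing = sum(a != b for a, b in zip(hash_original, hash_altered))

print(hash_original)
print(hash_altered)
print(f"{differing} of {len(hash_original)} hex characters differ")
```

Running it shows two hashes with no visible resemblance to each other, even though the inputs differ by only one bit.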
What checksum algorithms are there?
A parity bit is the simplest form of checksum algorithm. It involves adding an extra bit (the parity bit) to a string of binary code. The parity bit is chosen so that the total number of bits with the value of 1 is either even or odd, depending on the chosen parity scheme.
For an even parity scheme, you set the parity bit to 0 if the data already contains an even number of 1s. If the data has an odd number of 1s, you set the parity bit to 1, so that the total becomes even.
For an odd parity scheme, you set the parity bit to 1 if the data contains an even number of 1s. If the data has an odd number of 1s, you set the parity bit to 0, so that the total becomes odd.
When the recipient receives the data, they can count the number of 1s in the received data and parity bit. If the total matches the expected parity (even or odd, depending on the scheme), it indicates that there are no single-bit errors in the data. However, if the total doesn’t match the expected parity, the recipient knows that there is an issue.
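The parity rules above can be sketched in a few lines of Python. The function names and the sample bit string are our own, chosen purely for illustration; bits are represented as text strings of 0s and 1s to keep the example readable.

```python
def parity_bit(bits: str, scheme: str = "even") -> str:
    """Return the parity bit ("0" or "1") for a string of data bits."""
    odd_ones = bits.count("1") % 2 == 1
    if scheme == "even":
        return "1" if odd_ones else "0"  # make the total number of 1s even
    if scheme == "odd":
        return "0" if odd_ones else "1"  # make the total number of 1s odd
    raise ValueError("scheme must be 'even' or 'odd'")


def check_parity(received: str, scheme: str = "even") -> bool:
    """Return True if the received bits (data plus parity bit) pass the check."""
    expected_remainder = 0 if scheme == "even" else 1
    return received.count("1") % 2 == expected_remainder


data = "1011001"                   # contains four 1s
sent = data + parity_bit(data)     # even parity appends "0"
print(sent)                        # 10110010
print(check_parity(sent))          # True

corrupted = "00110010"             # first bit flipped in transit
print(check_parity(corrupted))     # False
```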
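The problem with parity bit checksums is that errors affecting several bits at once may not be detected: if an even number of bits are flipped, the total number of 1s keeps the same parity and the error goes unnoticed. Checksum algorithms such as Adler-32 and Fletcher’s checksum address this, and other, weaknesses.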
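The difference can be demonstrated with a two-bit error. The Python sketch below includes a plain implementation of Adler-32 for illustration (Python’s standard zlib module provides the same calculation as zlib.adler32) and shows an error that a parity check misses but Adler-32 catches. The sample data and helper names are our own.

```python
import zlib


def adler32(data: bytes) -> int:
    """Plain Adler-32: two running sums modulo 65521, combined into 32 bits."""
    modulus = 65521  # largest prime below 2**16
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % modulus  # sum of all bytes, plus one
        b = (b + a) % modulus     # sum of the successive values of a
    return (b << 16) | a


def count_ones(data: bytes) -> int:
    """Count the bits set to 1 across all bytes."""
    return sum(bin(byte).count("1") for byte in data)


original = b"Wikipedia"
# Flip two bits in the first byte, simulating a multi-bit transmission error.
corrupted = bytes([original[0] ^ 0b00000011]) + original[1:]

# Parity is unchanged, so a parity bit would not reveal the error.
print(count_ones(original) % 2 == count_ones(corrupted) % 2)  # True

# Adler-32 produces a different value, so the error is detected.
print(hex(adler32(original)))                    # 0x11e60398
print(adler32(original) == adler32(corrupted))   # False

# The hand-written version agrees with the standard library.
print(adler32(original) == zlib.adler32(original))  # True
```

Like the parity bit, Adler-32 is designed to catch accidental errors and is not a substitute for a cryptographic hash such as SHA256 when checking for tampering.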