Python File MD5 Checksum

We need the MD5 checksum of a file on quite a few occasions in day-to-day work.

Python can compute the MD5 checksum of a file using only its standard library.

Python File MD5 Checksum

Python's hashlib module provides a reliable way to get the MD5 checksum of a file.

The following is a small script that returns the MD5 checksum.

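import hashlib

def findMD5(targetFile):
    # Create a new MD5 hash object
    m = hashlib.md5()
    # Open in binary mode so the file is read as raw bytes
    with open(targetFile, 'rb') as target_file:
        # Read the entire file into memory and feed it to the hash
        data = target_file.read()
        m.update(data)
    # Return the digest as a hexadecimal string
    return m.hexdigest()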

The above function returns the MD5 checksum of targetFile, the path of the file that has to be supplied when calling it.
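As a quick usage illustration, the sketch below repeats the function and calls it on a throwaway temporary file. The file contents and the sample_path name are assumptions made only for this example.

```python
import hashlib
import os
import tempfile

def findMD5(targetFile):
    # Same approach as above: hash the whole file in a single read.
    m = hashlib.md5()
    with open(targetFile, 'rb') as target_file:
        m.update(target_file.read())
    return m.hexdigest()

# Create a small temporary file so the example can run anywhere.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
    sample_path = tmp.name  # hypothetical path, used only for this demo

try:
    checksum = findMD5(sample_path)
    print(checksum)  # prints a 32-character hexadecimal string
finally:
    # Clean up the temporary file.
    os.remove(sample_path)
```

The function takes a path, not an open file object, so any file the script can read is a valid argument.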

In the above code, Python opens the file in rb mode, which means that the target file is read in binary mode. This is because the MD5 function needs to read the file as a sequence of bytes. It also lets the function read any kind of file, not just text files.
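To see why bytes matter, here is a minimal sketch showing that the hash object accepts bytes but rejects a str on Python 3. The sample content and the str_accepted flag are made up for the illustration.

```python
import hashlib

m = hashlib.md5()

# Bytes, which is what a file opened in 'rb' mode yields, are accepted directly.
m.update(b"some file content")

# A str is rejected with a TypeError; it would have to be encoded first.
try:
    m.update("some file content")
    str_accepted = True
except TypeError:
    str_accepted = False

print(str_accepted)  # False
```

Reading in binary mode avoids this problem entirely, because read() then returns bytes regardless of what the file contains.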

Python File MD5 Checksum for Large Files

Please note that the read() function in the above example is called without any arguments. This means that read() reads the entire contents of the file and loads them into memory. This can be dangerous, as the system might run out of memory while computing the MD5 of a large file.

To take care of this problem, we can pass a block size to the read() function.

Let’s consider the following version of the above code.

import hashlib

def findMD5(targetFile):
    m = hashlib.md5()
    with open(targetFile, 'rb') as target_file:
        # Read the first block of up to 8192 bytes
        block = target_file.read(8192)
        # read() returns an empty bytes object at end of file
        while len(block) > 0:
            m.update(block)
            block = target_file.read(8192)
    return m.hexdigest()

In the above function, Python reads targetFile in blocks of 8192 bytes instead of loading the entire contents of the file into memory. Because update() can be called repeatedly, feeding the hash block by block gives the same result as hashing the whole file at once.
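The same loop can also be written with the two-argument form of iter(). The sketch below is one possible variant, not the only way to do it; the find_md5_chunked name, the block_size parameter, and the random test data are assumptions for this example. It also checks that the chunked digest matches the digest of the same data hashed in one go.

```python
import hashlib
import os
import tempfile
from functools import partial

def find_md5_chunked(target_file, block_size=8192):
    m = hashlib.md5()
    with open(target_file, 'rb') as f:
        # iter() keeps calling f.read(block_size) until it returns b'' (end of file).
        for block in iter(partial(f.read, block_size), b''):
            m.update(block)
    return m.hexdigest()

# Build a test file that spans several blocks.
data = os.urandom(50000)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    sample_path = tmp.name  # hypothetical path, used only for this demo

try:
    chunked = find_md5_chunked(sample_path)
    whole = hashlib.md5(data).hexdigest()
    print(chunked == whole)  # True
finally:
    # Clean up the temporary file.
    os.remove(sample_path)
```

Making the block size a parameter keeps memory use bounded while leaving room to tune it for a particular disk or file size.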
