

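Purpose

  • Day-to-day development often requires protecting sensitive values (private information such as passwords). This is often loosely called "MD5 encryption", although MD5 is a one-way hash rather than reversible encryption.
  • With a few Python modules, it is easy to MD5-hash a column of a pandas DataFrame.
  • Python 2 and Python 3 handle this with different modules (for example, the md5 module in Python 2 versus the hashlib module in Python 3).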
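The column-hashing idea in the second bullet can be sketched as follows. This is only a minimal sketch for Python 3: the helper name `md5_hex`, the use of `str()` to serialize each value, the UTF-8 encoding, and the sample values are all assumptions for illustration, not part of the original article. A plain list stands in for the DataFrame column so the snippet runs without pandas.

```python
import hashlib


def md5_hex(value):
    """Return the MD5 hex digest of a value's text form (UTF-8 encoded)."""
    # Hypothetical helper: converting with str() is an assumption about
    # how each cell should be serialized before hashing.
    return hashlib.md5(str(value).encode('utf-8')).hexdigest()


# Stand-in for one DataFrame column; the values are illustrative only.
column = ['alice', 'bob', 'alice']
hashed = [md5_hex(v) for v in column]
print(hashed)
```

With pandas, the same helper could presumably be applied to a column with `df['name'].map(md5_hex)`, where the column name `name` is hypothetical. Equal inputs map to equal digests, so the hashed column can still be used for grouping or joining.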

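Hashing methods

hashlib module (Py3 docs)
  • hashlib ships with the standard library in both Python 2 (2.5 and later) and Python 3, so no separate installation is needed.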
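import platform
import hashlib

# Show which interpreter is running (the results below compare Python 2 and 3).
pv = platform.python_version()
print(pv)

demo_val = 'kngines'
# hashlib operates on bytes, so encode the text before hashing.
md5_val = hashlib.md5(demo_val.encode('utf8')).hexdigest()
print('src_val : %s \nmd5_val : %s' % (demo_val, md5_val))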
  • Py 2 output: (screenshot not preserved)
  • Py 3 output: (screenshot not preserved)
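One difference between the two versions is worth spelling out: in Python 3, `hashlib.md5()` accepts only bytes, so passing a `str` raises `TypeError`. That is why the hashlib snippet calls `.encode('utf8')` before hashing. A minimal sketch for Python 3, where the helper name `to_bytes` and its default encoding are assumptions for illustration:

```python
import hashlib


def to_bytes(value, encoding='utf-8'):
    """Return value unchanged if it is bytes, otherwise encode the text."""
    # Hypothetical helper; UTF-8 as the default encoding is an assumption.
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


# Passing text directly is rejected on Python 3.
try:
    hashlib.md5('kngines')  # str, not bytes
    raised = False
except TypeError:
    raised = True

# Encoded text and the equivalent bytes literal give the same digest.
same = (hashlib.md5(to_bytes('kngines')).hexdigest()
        == hashlib.md5(b'kngines').hexdigest())
print(raised, same)
```

Normalizing input in one place keeps the choice of encoding explicit, which matters because the same text encoded differently produces a different digest.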


  • Help on module hashlib
help('hashlib')  # show the module's documentation
Help on module hashlib:

NAME
hashlib - hashlib module - A common interface to many hash functions.

MODULE REFERENCE
https://docs.python.org/3.6/library/hashlib

The following documentation is automatically generated from the Python
source files. It may be incomplete, incorrect or include features that
are considered implementation detail and may vary between Python
implementations. When in doubt, consult the module reference at the
location listed above.

DESCRIPTION
new(name, data=b'', **kwargs) - returns a new hash object implementing the
given hash function; initializing the hash
using the given binary data.

Named constructor functions are also available, these are faster
than using new(name):

md5(), sha1(), sha224(), sha256(), sha384(), sha512(), blake2b(), blake2s(),
sha3_224, sha3_256, sha3_384, sha3_512, shake_128, and shake_256.

More algorithms may be available on your platform but the above are guaranteed
to exist. See the algorithms_guaranteed and algorithms_available attributes
to find out what algorithm names can be passed to new().

NOTE: If you want the adler32 or crc32 hash functions they are available in
the zlib module.

Choose your hash function wisely. Some have known collision weaknesses.
sha384 and sha512 will be slow on 32 bit platforms.

Hash objects have these methods:
- update(arg): Update the hash object with the bytes in arg. Repeated calls
are equivalent to a single call with the concatenation of all
the arguments.
- digest(): Return the digest of the bytes passed to the update() method
so far.
- hexdigest(): Like digest() except the digest is returned as a unicode
object of double length, containing only hexadecimal digits.
- copy(): Return a copy (clone) of the hash object. This can be used to
efficiently compute the digests of strings that share a common
initial substring.

For example, to obtain the digest of the string 'Nobody inspects the
spammish repetition':

>>> import hashlib
>>> m = hashlib.md5()
>>> m.update(b"Nobody inspects")
>>> m.update(b" the spammish repetition")
>>> m.digest()
b'\xbbd\x9c\x83\xdd\x1e\xa5\xc9\xd9\xde\xc9\xa1\x8d\xf0\xff\xe9'

More condensed:

>>> hashlib.sha224(b"Nobody inspects the spammish repetition").hexdigest()
'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'

CLASSES
builtins.object
_blake2.blake2b
_blake2.blake2s
_sha3.sha3_224
_sha3.sha3_256
_sha3.sha3_384
_sha3.sha3_512
_sha3.shake_128
_sha3.shake_256

FUNCTIONS
md5 = openssl_md5(...)
Returns a md5 hash object; optionally initialized with a string

new = __hash_new(name, data=b'', **kwargs)
new(name, data=b'') - Return a new hashing object using the named algorithm;
optionally initialized with data (which must be bytes).

pbkdf2_hmac(...)
pbkdf2_hmac(hash_name, password, salt, iterations, dklen=None) -> key

Password based key derivation function 2 (PKCS #5 v2.0) with HMAC as
pseudorandom function.

sha1 = openssl_sha1(...)
Returns a sha1 hash object; optionally initialized with a string

sha224 = openssl_sha224(...)
Returns a sha224 hash object; optionally initialized with a string

sha256 = openssl_sha256(...)
Returns a sha256 hash object; optionally initialized with a string

sha384 = openssl_sha384(...)
Returns a sha384 hash object; optionally initialized with a string

sha512 = openssl_sha512(...)
Returns a sha512 hash object; optionally initialized with a string

DATA
__all__ = ('md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512', 'bla...
algorithms_available = {'DSA', 'DSA-SHA', 'MD4', 'MD5', 'MDC2', 'RIPEM...
algorithms_guaranteed = {'blake2b', 'blake2s', 'md5', 'sha1', 'sha224'...
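The hash object methods described in the help text (`update()`, `copy()`) and the `new()` constructor can be exercised together. The following is a minimal sketch that reuses the sample string from the help output; the variable names and the second suffix are illustrative only.

```python
import hashlib

text = b"Nobody inspects the spammish repetition"

# update() calls accumulate: two partial updates equal one full call.
m = hashlib.md5()
m.update(b"Nobody inspects")
m.update(b" the spammish repetition")
incremental = m.hexdigest()
one_shot = hashlib.md5(text).hexdigest()

# copy() clones the current state, so a shared prefix is hashed only once.
prefix = hashlib.md5(b"Nobody inspects")
branch_a = prefix.copy()
branch_a.update(b" the spammish repetition")
branch_b = prefix.copy()
branch_b.update(b" anything else")

# new(name) selects the algorithm by name instead of a named constructor.
by_name = hashlib.new('md5', text).hexdigest()

print(incremental == one_shot == branch_a.hexdigest() == by_name)
print('md5' in hashlib.algorithms_guaranteed)
```

`hexdigest()` here is the hexadecimal form of the `digest()` bytes shown in the help example, so all four ways of hashing the full string agree, while `branch_b` diverges after the shared prefix.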

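md5 module (built into Python 2)
  • Note the warning: the md5 module is deprecated in favor of hashlib and was removed in Python 3, so this approach is not recommended.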
import md5  # Python 2 only; removed in Python 3

src_val = 'kngines'
mnew = md5.new()  # returns an md5 hash object, optionally initialized with a string
mnew.update(src_val)  # update the hash object's state with the provided string
print(mnew.hexdigest())  # return the digest value as a string of hexadecimal digits
  • Output: (screenshot not preserved)


Hashing / verifying files

  • Linux command md5sum
[root@localhost xxxx]# md5sum test.log 
9e05895ce1f42385c407f71e5bb84105 test.log
[root@localhost synway]# md5sum --h
Usage: md5sum [OPTION]... [FILE]...
Print or check MD5 (128-bit) checksums.
With no FILE, or when FILE is -, read standard input.

-b, --binary read in binary mode
-c, --check read MD5 sums from the FILEs and check them
--tag create a BSD-style checksum
-t, --text read in text mode (default)
Note: There is no difference between binary and text mode option on GNU system.

The following four options are useful only when verifying checksums:
--quiet don't print OK for each successfully verified file
--status don't output anything, status code shows success
--strict exit non-zero for improperly formatted checksum lines
-w, --warn warn about improperly formatted checksum lines

--help display this help and exit
--version output version information and exit

The sums are computed as described in RFC 1321. When checking, the input
should be a former output of this program. The default mode is to print
a line with checksum, a character indicating input mode ('*' for binary,
space for text), and name for each FILE.

GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
For complete documentation, run: info coreutils 'md5sum invocation'
  • Hashing a file in Python
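import hashlib

# Read the file as raw bytes so the digest matches `md5sum test.log`.
with open('./test.log', 'rb') as f:
    f_md5 = hashlib.md5()
    f_md5.update(f.read())
print(f_md5.hexdigest())

# Text-mode alternative: the content is decoded and re-encoded, so the
# digest matches md5sum only if the file is UTF-8 and newlines are unchanged.
# with open('./test.log', 'r') as f:
#     f_md5 = hashlib.md5()
#     f_md5.update(f.read().encode('utf8'))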
  • Output: (screenshot not preserved)
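Calling `f.read()` loads the whole file into memory at once. For large files, a common alternative is to feed the hash object in chunks, which relies on repeated `update()` calls being equivalent to a single call on the concatenated data. The following is a minimal sketch: the function name `file_md5`, the 64 KiB chunk size, and the temporary demo file are assumptions for illustration, not part of the original article.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in binary chunks."""
    # Hypothetical helper; the 64 KiB chunk size is an arbitrary choice.
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstrate on a temporary file so the sketch runs without test.log.
payload = b'hello\n' * 100000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

chunked = file_md5(tmp_path)
whole = hashlib.md5(payload).hexdigest()
os.remove(tmp_path)
print(chunked == whole)
```

Because the file is read in binary mode, `file_md5('./test.log')` should match the first field printed by `md5sum test.log`.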

