Implementing AWS Authentication in Python


I. Image processing on top of object storage

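Image-processing interfaces built on object storage are very simple to use. They give you watermarking, cropping, format conversion, thumbnails and similar features on the fly, with the result discarded after use. Taking Huawei Cloud as an example, you download the object just as you would over the S3 protocol and add an x-image-process parameter to get the processed image:
https://obs.cn-southwest-2.myhuaweicloud.com/{bucket}/{image object name}?x-image-process=image/sharpen,100/blur,r_1,s_1/resize,m_lfit,h_400,w_400,limit_1
Some cloud platforms provide their own SDKs for various languages, but some of these SDKs do not support image processing, and their object operations are wrapped so rigidly that there is no entry point for it, even though the feature only needs one extra GET parameter. All of this runs over plain HTTP; the only addition is the AWS-style ak/sk authentication. Once we implement the AWS signing ourselves, the door is open to the various AWS-style interfaces, and later video and audio processing on object storage can be handled the same way.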
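As a minimal sketch of how such a URL is put together, the helper below chains processing operations after the fixed image prefix. The function name build_image_url, the bucket my-bucket and the object key photos/cat.jpg are hypothetical, and the URL is unsigned, so it would only work as-is if the object is publicly readable.

```python
from urllib.parse import quote


def build_image_url(endpoint, bucket, object_key, operations):
    """Return a download URL carrying an x-image-process query parameter."""
    # Operations are chained with "/" after the fixed "image" prefix.
    process = "/".join(["image"] + list(operations))
    # Percent-encode the object key but keep "/" so nested keys stay intact.
    path = "/" + bucket + "/" + quote(object_key, safe="/")
    return endpoint.rstrip("/") + path + "?x-image-process=" + process


url = build_image_url(
    "https://obs.cn-southwest-2.myhuaweicloud.com",
    "my-bucket",        # hypothetical bucket name
    "photos/cat.jpg",   # hypothetical object key
    ["sharpen,100", "blur,r_1,s_1", "resize,m_lfit,h_400,w_400,limit_1"],
)
print(url)
```

For private objects the same request has to carry a signature, which is what the rest of this post is about.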

II. The AWS-sign authentication flow

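1. The user's identity: ak & sk

You can think of the ak (access key) as the user name and the sk (secret key) as the password. The server issues the ak/sk pair. When a user wants to interact with the server, the pair is used for authentication, and that is where the AWS signing process begins.
What an ak/sk pair looks like: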

access_key = 'XXXXXXXXXXXXXXXXXXXX'
secret_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

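2. The signing process

The official site also has a detailed description and code samples, see https://docs.amazonaws.cn/general/latest/gr/signing_aws_api_requests.html; this section covers the main points.

step1: First build a canonical request from the HTTP request that is about to be sent. That is, from the request method, URL, parameters, time and certain request headers, produce a string in a strictly specified format. AWS defines this format. A canonical request looks roughly like this: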

GET
/zgctest1
acl=
host:obs.cn-southwest-2.myhuaweicloud.com
x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
x-amz-date:20220630T164106Z

host;x-amz-content-sha256;x-amz-date
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

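Not a single character above may be wrong, and there must be no extra spaces. Line breaks, URL encoding and ordering are all strictly specified. From top to bottom the lines are:

The request method
The part of the URL after the host and before the ? of the query string; it must start with /
The GET query parameters and their values; a parameter without a value still ends with =
The headers that take part in the signature and their values, one header:value per line (header1:value, header2:value, header3:value and so on). host and x-amz-date are required, and some newer versions also require x-amz-content-sha256. The x-amz-content-sha256 value is the SHA-256 hash of the body; for an empty body it is the fixed hash of the empty string
A fixed empty line
The names of the signed headers, all lowercase, in lexicographic order, separated by ; and matching the header:value lines one to one
The SHA-256 hash of the request body, the same value as x-amz-content-sha256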
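The layout described above can be sketched as a small helper. This is an illustration under stated assumptions, not the official implementation: build_canonical_request is a hypothetical name, and the path and query string are assumed to be already URL-encoded and sorted.

```python
import hashlib


def build_canonical_request(method, path, canonical_query, headers, body=b""):
    """Assemble a canonical request; returns (canonical_request, signed_headers)."""
    # Lowercase header names, collapse whitespace in values, sort by name.
    normalized = sorted(
        (name.lower(), " ".join(value.split())) for name, value in headers.items()
    )
    canonical_headers = "".join(n + ":" + v + "\n" for n, v in normalized)
    signed_headers = ";".join(n for n, _ in normalized)
    payload_hash = hashlib.sha256(body).hexdigest()
    # canonical_headers already ends with "\n", so joining with "\n"
    # produces the fixed empty line before the signed header names.
    request = "\n".join(
        [method, path, canonical_query, canonical_headers, signed_headers, payload_hash]
    )
    return request, signed_headers


empty_hash = hashlib.sha256(b"").hexdigest()
request, signed = build_canonical_request(
    "GET",
    "/zgctest1",
    "acl=",
    {
        "Host": "obs.cn-southwest-2.myhuaweicloud.com",
        "X-Amz-Date": "20220630T164106Z",
        "X-Amz-Content-Sha256": empty_hash,
    },
)
print(request)
```

With these inputs the printed text should reproduce the canonical request example shown earlier line for line.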

step2: From the canonical request generated above, build the string to sign, roughly as follows:

AWS4-HMAC-SHA256
20220630T164106Z
20220630/region/service/aws4_request
fe36c5a06a80c2212e570eddc82053624c655dae2b96ada44a5c187c9c747ea7

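The lines mean:

Signing algorithm: the fixed value AWS4-HMAC-SHA256
Time: the value of the x-amz-date request header
Request date/region/service name/the fixed protocol version string aws4_request. With S3-compatible servers like the one used here, region and service name generally have no fixed requirement, because the server usually does not validate them; AWS itself expects them to match the target region and service
The SHA-256 hash of the canonical request from step1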
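The four lines can be assembled as in the sketch below. The function name build_string_to_sign is hypothetical, and the canonical request passed in is a shortened placeholder rather than a real one.

```python
import hashlib


def build_string_to_sign(amz_date, region, service, canonical_request):
    """Return the four-line string to sign for a canonical request."""
    # The credential scope date is the YYYYMMDD prefix of the x-amz-date value.
    date_stamp = amz_date[:8]
    scope = "/".join([date_stamp, region, service, "aws4_request"])
    digest = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()
    return "\n".join(["AWS4-HMAC-SHA256", amz_date, scope, digest])


# Placeholder canonical request, used only to show the resulting layout.
example = build_string_to_sign("20220630T164106Z", "region", "service", "GET\n/\n")
print(example)
```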

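step3: Compute the signature. This is where the ak/sk pair comes into play: the sk and the string to sign are mixed together. If either one is wrong, the server, running the same algorithm with its own copy of the sk, will not arrive at the same signature value, and that is how the request is authenticated. The algorithm in pseudocode: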

kSecret = SK
kDate = HMAC("AWS4" + kSecret, Date)
kRegion = HMAC(kDate, Region)
kService = HMAC(kRegion, Service)
kSigning = HMAC(kService, "aws4_request")
signature = HexEncode(HMAC(kSigning, string to sign))

signature is the signature we need.
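To illustrate why a wrong sk or a tampered string to sign fails, here is a rough sketch of the check a server could perform: recompute the signature with its own copy of the sk and compare. The names compute_signature and verify_signature are hypothetical, and a real server also checks things such as the request time, which are omitted here.

```python
import hashlib
import hmac


def _hmac_sha256(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def compute_signature(secret_key, date_stamp, region, service, string_to_sign):
    """Derive the signing key step by step and sign the string to sign."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    k_signing = _hmac_sha256(k_service, "aws4_request")
    return hmac.new(k_signing, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_signature(secret_key, date_stamp, region, service, string_to_sign, presented):
    """Recompute the signature and compare it in constant time."""
    expected = compute_signature(secret_key, date_stamp, region, service, string_to_sign)
    return hmac.compare_digest(expected, presented)


# Dummy values for illustration only.
sig = compute_signature("dummy-secret", "20220630", "region", "s3", "string to sign")
print(verify_signature("dummy-secret", "20220630", "region", "s3", "string to sign", sig))
print(verify_signature("wrong-secret", "20220630", "region", "s3", "string to sign", sig))
```

hmac.compare_digest is used instead of == so the comparison time does not depend on how many leading characters match.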

step4: Add the computed signature and the related information to the Authorization HTTP header, in the following format:

Authorization:AWS4-HMAC-SHA256 Credential=N4OHZY6DXW4YTLYTPZJE/20220630/stack-test-18/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=44c9f46837092bf1759f1249eefd5ab7a4aa5d87eb6e3adccb3f7397575dba9f

Authorization:AWS4-HMAC-SHA256 Credential={AK value}/{date}/{region}/{service}/aws4_request, SignedHeaders={name of signed header1};{name of signed header2}…, Signature={signature from step3}
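The template can be filled in with a few lines of string formatting. The sketch below uses a hypothetical helper name, build_authorization_header, and the values from the concrete example above.

```python
def build_authorization_header(access_key, date_stamp, region, service,
                               signed_headers, signature):
    """Format the value of the Authorization header."""
    scope = "/".join([date_stamp, region, service, "aws4_request"])
    return (
        "AWS4-HMAC-SHA256 "
        + "Credential=" + access_key + "/" + scope + ", "
        + "SignedHeaders=" + ";".join(signed_headers) + ", "
        + "Signature=" + signature
    )


value = build_authorization_header(
    "N4OHZY6DXW4YTLYTPZJE",
    "20220630",
    "stack-test-18",
    "s3",
    ["host", "x-amz-content-sha256", "x-amz-date"],
    "44c9f46837092bf1759f1249eefd5ab7a4aa5d87eb6e3adccb3f7397575dba9f",
)
print("Authorization:" + value)
```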

That completes our AWS request; the rest can be handed over to any general-purpose HTTP library.

III. Python code for an AWS GET request

# https://docs.aws.amazon.com/general/latest/gr/sigv4-signed-request-examples.html
# AWS Version 4 signing example
# See: http://docs.aws.amazon.com/general/latest/gr/sigv4_signing.html
# This version makes a GET request and passes the signature
# in the Authorization header.
import datetime, hashlib, hmac
import requests


# ************* REQUEST VALUES *************
method = 'GET'
service = 's3'
host = 'obs.cn-southwest-2.myhuaweicloud.com'
region = 'guiyang'
endpoint = 'http://obs.cn-southwest-2.myhuaweicloud.com'
request_parameters = 'acl='

# Key derivation functions. See:
# http://docs.aws.amazon.com/general/latest/gr/signature-v4-examples.html#signature-v4-examples-python
def sign(key, msg):
    return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()

def getSignatureKey(key, dateStamp, regionName, serviceName):
    kDate = sign(('AWS4' + key).encode('utf-8'), dateStamp)
    kRegion = sign(kDate, regionName)
    kService = sign(kRegion, serviceName)
    kSigning = sign(kService, 'aws4_request')
    return kSigning

access_key = 'XXXXXXXXXXXXXXXXXXXX'
secret_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

# Create a date for headers and the credential string
t = datetime.datetime.now(datetime.timezone.utc)
amzdate = t.strftime('%Y%m%dT%H%M%SZ')
datestamp = t.strftime('%Y%m%d') # Date w/o time, used in credential scope


# ************* TASK 1: CREATE A CANONICAL REQUEST *************
# http://docs.aws.amazon.com/general/latest/gr/sigv4-create-canonical-request.html

# Step 1 is to define the verb (GET, POST, etc.)--already done.

# Step 2: Create canonical URI--the part of the URI from domain to query
# string (use '/' if no path)
canonical_uri = '/zgctest1'

# Step 3: Create the canonical query string. In this example (a GET request),
# request parameters are in the query string. Query string values must
# be URL-encoded (space=%20). The parameters must be sorted by name.
# For this example, the query string is pre-formatted in the request_parameters variable.
canonical_querystring = request_parameters

# Step 4: Create the canonical headers and signed headers. Header names
# must be trimmed and lowercase, and sorted in code point order from
# low to high. Note that there is a trailing \n.
payload_hash = hashlib.sha256(('').encode('utf-8')).hexdigest()
canonical_headers = 'host:' + host + '\n' + 'x-amz-content-sha256:' + payload_hash  + '\n' + 'x-amz-date:' + amzdate + '\n'

# Step 5: Create the list of signed headers. This lists the headers
# in the canonical_headers list, delimited with ";" and in alpha order.
# Note: The request can include any headers; canonical_headers and
# signed_headers lists those that you want to be included in the
# hash of the request. "Host" and "x-amz-date" are always required.
signed_headers = 'host;x-amz-content-sha256;x-amz-date'

# Step 6: Create payload hash (hash of the request body content). For GET
# requests, the payload is an empty string (""). It was already computed
# in Step 4, because it is also the x-amz-content-sha256 header value.

# Step 7: Combine elements to create canonical request
canonical_request = method + '\n' + canonical_uri + '\n' + canonical_querystring + '\n' + canonical_headers + '\n' + signed_headers + '\n' + payload_hash

print('Canonical request:\n' + canonical_request)

# ************* TASK 2: CREATE THE STRING TO SIGN*************
# The algorithm name must match the hash function used; Signature
# Version 4 uses HMAC-SHA256.
algorithm = 'AWS4-HMAC-SHA256'
credential_scope = datestamp + '/' + region + '/' + service + '/' + 'aws4_request'
string_to_sign = algorithm + '\n' +  amzdate + '\n' +  credential_scope + '\n' +  hashlib.sha256(canonical_request.encode('utf-8')).hexdigest()

print('String to sign:\n' + string_to_sign)

# ************* TASK 3: CALCULATE THE SIGNATURE *************
# Create the signing key using the function defined above.
signing_key = getSignatureKey(secret_key, datestamp, region, service)

# Sign the string_to_sign using the signing_key
signature = hmac.new(signing_key, (string_to_sign).encode('utf-8'), hashlib.sha256).hexdigest()

print('Signature:\n' + signature)

# ************* TASK 4: ADD SIGNING INFORMATION TO THE REQUEST *************
# The signing information can be either in a query string value or in
# a header named Authorization. This code shows how to use a header.
# Create authorization header and add to request headers
authorization_header = algorithm + ' ' + 'Credential=' + access_key + '/' + credential_scope + ', ' +  'SignedHeaders=' + signed_headers + ', ' + 'Signature=' + signature

# The request can include any headers, but MUST include "host", "x-amz-date",
# and (for this scenario) "Authorization". "host" and "x-amz-date" must
# be included in the canonical_headers and signed_headers, as noted
# earlier. Order here is not significant.
# Python note: The 'host' header is added automatically by the Python 'requests' library.
headers = {'x-amz-date':amzdate, 'Authorization':authorization_header}
headers.update({'x-amz-content-sha256': payload_hash})
print('header:' + str(headers))

# ************* SEND THE REQUEST *************
request_url = endpoint + canonical_uri + '?' + canonical_querystring

print('Request URL = ' + request_url)
r = requests.get(request_url, headers=headers)

print('Response code: %d\n' % r.status_code)
print(r.text)
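The script above uses a pre-formatted query string (acl=). To sign an image-processing request instead, the query string has to be put into canonical form first: names and values percent-encoded, then sorted by name. The following is a minimal sketch of that rule; canonical_query_string is a hypothetical helper, and whether a particular server accepts a signed x-image-process request is an assumption that has to be tested against that server.

```python
from urllib.parse import quote


def canonical_query_string(params):
    """Percent-encode names and values, then sort by encoded name."""
    # Only unreserved characters (letters, digits, "-", "_", ".", "~") stay as-is.
    encoded = sorted(
        (quote(name, safe="-_.~"), quote(value, safe="-_.~"))
        for name, value in params.items()
    )
    return "&".join(name + "=" + value for name, value in encoded)


request_parameters = canonical_query_string(
    {"x-image-process": "image/resize,m_lfit,h_400,w_400"}
)
print(request_parameters)
# prints: x-image-process=image%2Fresize%2Cm_lfit%2Ch_400%2Cw_400
```

The result could replace the request_parameters value in the script, with canonical_uri pointing at the object, for example /{bucket}/{image object name}.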