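Symptoms
Some requests are encrypted by the front end before being sent to the gateway. The gateway decrypts them, forwards them to the actual microservice, and encrypts the result before returning it to the front end.
After implementing encryption at the gateway, we found that an unencrypted GET request sent immediately after an encrypted request fails with a 400 error. Sending the same GET request again succeeds. The access log of the backend microservice that receives the gateway's requests shows that the failing request was not parsed correctly: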
## Request that returned 400
- - - [04/Jan/2018:19:48:30 +0800] "-" 400 - 0 0.000 - "-" null null 10.120.242.152
## Normal request
- - - [04/Jan/2018:19:50:18 +0800] "GET /v1/api/XXX HTTP/1.1" 200 156 11 0.011 http://www.xxx.com "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36" http-nio-8111-exec-28 10.120.242.151 10.120.242.152
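Locating the problem
First, look at the packet capture of the request that returned 400. The HTTP message itself is structurally complete: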
19:48:30.224244 52:54:00:32:c5:5e > 52:54:00:66:bc:63, ethertype IPv4 (0x0800), length 1762: (tos 0x0, ttl 64, id 50111, offset 0, flags [DF], proto TCP (6), length 1748)
10.120.242.152.27725 > 10.120.242.151.8111: Flags [P.], cksum 0x00e7 (incorrect -> 0xfdf0), seq 917602625:917604321, ack 2125955651, win 29, options [nop,nop,TS val 793264903 ecr 3278809206], length 1696
0x0000: 4500 06d4 c3bf 4000 4006 7644 0a78 f298 E.....@.@.vD.x..
0x0010: 0a78 f297 6c4d 1faf 36b1 8141 7eb7 8243 .x..lM..6..A~..C
0x0020: 8018 001d 00e7 0000 0101 080a 2f48 4307 ............/HC.
0x0030: c36e a876 4745 5420 2f76 312f 6669 6e41 .n.vGET./v1/finA
0x0040: 6363 732f 6669 6e41 6363 2f75 7365 7242 ccs/finAcc/userB
0x0050: 616c 2f4b 4553 2048 5454 502f 312e 310d al/KES.HTTP/1.1.
......
0x06d0: 0d0a 0d0a ....
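Setting a breakpoint in the Tomcat container code shows that the content Tomcat actually reads is incomplete:
the leading GET and the path are missing.
Now look at the packet of the preceding encrypted request: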
11:03:27.703518 52:54:00:32:c5:5e > 52:54:00:66:bc:63, ethertype IPv4 (0x0800), length 1832: (tos 0x0, ttl 64, id 12872, offset 0, flags [DF], proto TCP (6), length 1818)
10.120.242.152.15124 > 10.120.242.151.8111: Flags [P.], cksum 0x012d (incorrect -> 0xd94b), seq 84397903:84399669, ack 2813208375, win 33, options [nop,nop,TS val 777069391 ecr 3262603428], length 1766
0x0000: 4500 071a 3248 4000 4006 0776 0a78 f298 E...2H@.@..v.x..
0x0010: 0a78 f297 3b14 1faf 0507 cf4f a7ae 2737 .x..;......O..'7
......
0x06b0: 436f 6e74 656e 742d 4c65 6e67 7468 3a20 Content-Length:.
0x06c0: 3630 0d0a 436f 6e6e 6563 7469 6f6e 3a20 108.Connection:.
0x06d0: 4b65 6570 2d41 6c69 7665 0d0a 0d0a 7b22 Keep-Alive....{"
0x06e0: 7068 6f6e 654e 6f22 3a22 3235 3437 3635 phoneNo":"254765
0x06f0: 3433 3231 3030 222c 2270 6179 416d 6f75 432100","payAmou
0x0700: 6e74 223a 3130 3030 3030 3030 2c22 7061 nt":10000000,"pa
0x0710: 7943 6849 6422 3a31 307d yChId":10}
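The Content-Length at the end of the headers is wrong: it should be 60, not 108.
The body is 108 bytes long before decryption and 60 bytes long after. This is the likely cause: Tomcat expects 108 bytes of body on the keep-alive connection, so it presumably consumes the first 48 bytes of the next request as the rest of this one, which would explain why the GET and the path go missing.
Changing Content-Length to 60 in the debugger makes the problem disappear, which confirms the cause.
When we decrypted and rewrote the request, we did not actually update Content-Length.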
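The effect of a stale Content-Length on a reused connection can be sketched with a small simulation. This is only a model of a length-delimited parser, not Tomcat's code: the class name, the POST request line, and the Host header are assumptions made for illustration, while the body is the 60-byte JSON from the capture above.

```java
import java.nio.charset.StandardCharsets;

class StaleContentLengthDemo {
    // Decrypted body from the capture above (60 bytes).
    static final String BODY =
            "{\"phoneNo\":\"254765432100\",\"payAmount\":10000000,\"payChId\":10}";
    // Hypothetical follow-up request sent on the same keep-alive connection.
    static final String NEXT =
            "GET /v1/api/XXX HTTP/1.1\r\nHost: www.xxx.com\r\nConnection: Keep-Alive\r\n\r\n";

    /**
     * Builds two back-to-back requests, then returns what a parser would treat
     * as the second request after skipping the first request's headers plus
     * exactly Content-Length bytes of body.
     */
    static String secondRequestAsSeen(int declaredLength) {
        String wire = "POST /v1/api/XXX HTTP/1.1\r\n"   // hypothetical request line
                + "Content-Length: " + declaredLength + "\r\n"
                + "Connection: Keep-Alive\r\n\r\n"
                + BODY + NEXT;
        byte[] bytes = wire.getBytes(StandardCharsets.ISO_8859_1);

        // End of the first request's header block.
        int headerEnd = wire.indexOf("\r\n\r\n") + 4;

        // Read the declared length back from the header, as a parser would.
        String key = "Content-Length: ";
        int from = wire.indexOf(key) + key.length();
        int length = Integer.parseInt(wire.substring(from, wire.indexOf("\r\n", from)));

        // Everything up to bodyEnd is consumed as the first request's body.
        int bodyEnd = Math.min(headerEnd + length, bytes.length);
        return new String(bytes, bodyEnd, bytes.length - bodyEnd, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println("declared 60:  " + secondRequestAsSeen(60).split("\r\n")[0]);
        System.out.println("declared 108: " + secondRequestAsSeen(108).split("\r\n")[0]);
    }
}
```

With the correct length the second request starts at its request line. With the stale length of 108, the first 48 bytes of the second request are consumed as body, so the request line is gone by the time parsing resumes.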
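Solutions
1. Switch containers. With Jetty the problem disappears, because Jetty's NIO connector does not process the Content-Length field. Switching containers is a large change, though, and our workload (a large number of short, small requests) suits Tomcat.
2. Open a new HttpClient connection for every request. Across separate connections Tomcat's NIO connector does not lose data, but this costs performance and is not recommended.
3. Set Content-Length correctly. This is clearly the best option, although finding the right place to change it took some time. The core code is shown here:
Every request through the Zuul gateway is a Ribbon call. A Ribbon call carries a context, and one of its fields is contentLength: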
RibbonRoutingFilter.java
protected RibbonCommandContext buildCommandContext(RequestContext context) {
    HttpServletRequest request = context.getRequest();
    MultiValueMap<String, String> headers = this.helper
            .buildZuulRequestHeaders(request);
    MultiValueMap<String, String> params = this.helper
            .buildZuulRequestQueryParams(request);
    String verb = getVerb(request);
    InputStream requestEntity = getRequestBody(request);
    if (request.getContentLength() < 0 && !verb.equalsIgnoreCase("GET")) {
        context.setChunkedRequestBody();
    }
    String serviceId = (String) context.get(SERVICE_ID_KEY);
    Boolean retryable = (Boolean) context.get(RETRYABLE_KEY);
    String uri = this.helper.buildZuulRequestURI(request);
    // remove double slashes
    uri = uri.replace("//", "/");
    long contentLength = useServlet31 ? request.getContentLengthLong() : request.getContentLength();
    return new RibbonCommandContext(serviceId, verb, uri, retryable, headers, params,
            requestEntity, this.requestCustomizers, contentLength);
}
Note the line long contentLength = useServlet31 ? request.getContentLengthLong(): request.getContentLength();
For this call, on Tomcat, request is an instance of org.apache.catalina.connector.Request:
@Override
public long getContentLengthLong() {
    return coyoteRequest.getContentLengthLong();
}

@Override
public int getContentLength() {
    return coyoteRequest.getContentLength();
}
Looking one level deeper, at the corresponding methods of coyoteRequest:
public int getContentLength() {
    long length = getContentLengthLong();
    if (length < Integer.MAX_VALUE) {
        return (int) length;
    }
    return -1;
}

public long getContentLengthLong() {
    if( contentLength > -1 ) {
        return contentLength;
    }
    MessageBytes clB = headers.getUniqueValue("content-length");
    contentLength = (clB == null || clB.isNull()) ? -1 : clB.getLong();
    return contentLength;
}
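The caching in getContentLengthLong() is the detail that matters: once the field holds a non-negative value, the header is not consulted again. The following is a simplified stand-in that copies only that caching logic. The class CachedLengthModel and its setHeader method are hypothetical and are not part of Tomcat.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class CachedLengthModel {
    private final Map<String, String> headers = new HashMap<>();
    private long contentLength = -1; // -1 means "not read yet"

    // Hypothetical helper: stores a header under a lower-cased name.
    void setHeader(String name, String value) {
        headers.put(name.toLowerCase(Locale.ROOT), value);
    }

    // Mirrors the idea of overwriting the cached field directly.
    void setContentLength(long length) {
        this.contentLength = length;
    }

    // Same shape as the quoted method: return the cached value if present,
    // otherwise parse the header once and cache the result.
    long getContentLengthLong() {
        if (contentLength > -1) {
            return contentLength;
        }
        String value = headers.get("content-length");
        contentLength = (value == null) ? -1 : Long.parseLong(value);
        return contentLength;
    }

    public static void main(String[] args) {
        CachedLengthModel request = new CachedLengthModel();
        request.setHeader("Content-Length", "108");
        System.out.println(request.getContentLengthLong()); // first read caches the header value

        request.setHeader("Content-Length", "60");
        System.out.println(request.getContentLengthLong()); // header change alone is not picked up

        request.setContentLength(60);
        System.out.println(request.getContentLengthLong()); // cached field overwritten
    }
}
```

In this model, rewriting the header after the first read has no effect, while overwriting the cached field does. That is why the fix targets the field on the underlying request rather than the header.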
So after decrypting the body, on Tomcat we need to update the content length. To do that, add the following code to the Wrapper or Filter you use for decryption:
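// Tomcat only: fix the content length, otherwise the next request on the connection breaks.
// Needs: java.lang.reflect.Field, org.apache.catalina.connector.Request,
//        org.apache.catalina.connector.RequestFacade
if (request instanceof com.netflix.zuul.http.HttpServletRequestWrapper) {
    com.netflix.zuul.http.HttpServletRequestWrapper request1 = (com.netflix.zuul.http.HttpServletRequestWrapper) request;
    RequestFacade requestFacade = (RequestFacade) request1.getRequest();
    try {
        // RequestFacade holds the real connector Request in a non-public field named "request"
        Field field = RequestFacade.class.getDeclaredField("request");
        field.setAccessible(true);
        Request o = (Request) field.get(requestFacade);
        // Store the length of the decrypted body in the underlying coyote request
        o.getCoyoteRequest().setContentLength(this.contentLength);
    } catch (Exception e) {
        log.warn("failed to fix Content-Length on the Tomcat request: ", e);
    }
}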
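The reflection step in the snippet above can be tried in isolation, without Tomcat or Zuul on the classpath. In this minimal sketch, Facade and InnerRequest are hypothetical stand-ins for RequestFacade and the connector Request. Only the java.lang.reflect.Field calls are the real API.

```java
import java.lang.reflect.Field;

class ReflectionPatchDemo {
    // Stand-in for the object that really stores the content length.
    static class InnerRequest {
        private long contentLength = 108; // stale, pre-decryption length
        void setContentLength(long length) { this.contentLength = length; }
        long getContentLength() { return contentLength; }
    }

    // Stand-in for a facade that hides the inner request in a private field.
    static class Facade {
        private final InnerRequest request;
        Facade(InnerRequest request) { this.request = request; }
        long getContentLength() { return request.getContentLength(); }
    }

    // Reaches through the facade by field name and updates the inner object.
    static void patchContentLength(Facade facade, long newLength) {
        try {
            Field field = Facade.class.getDeclaredField("request");
            field.setAccessible(true);
            InnerRequest inner = (InnerRequest) field.get(facade);
            inner.setContentLength(newLength);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not patch content length", e);
        }
    }

    public static void main(String[] args) {
        Facade facade = new Facade(new InnerRequest());
        patchContentLength(facade, 60);
        System.out.println(facade.getContentLength());
    }
}
```

Because this approach depends on a field name inside another library's class, it can break when that library is upgraded, so it may be worth covering with a test that fails loudly if the field disappears.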