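Integrating Apache 2.2 and Tomcat 5.0 (Part 2): Fixing Chinese Paths in REDIRECT_URL
Symptom
When Apache and Tomcat are integrated through mod_jk, a Servlet can call request.getAttribute("REDIRECT_URL") to obtain the original URL that triggered Apache's custom error handling.
Unfortunately, if REDIRECT_URL contains a Chinese path, the value comes back garbled.
For example, requesting the nonexistent page http://localhost/rp/数字故宫/hh.html
returns: /rp/??°?-???????/hh.html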
Locating the Problem
Where does the problem come from?
Run the following test.
Add these two lines to Apache's httpd.conf:
JkEnvVar DPM1 %e6%95%b0%e5%ad%97%e6%95%85%e5%ae%ab
JkEnvVar DPM2 数字故宫
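The variable from the first line comes back as: %e6%95%b0%e5%ad%97%e6%95%85%e5%ae%ab
The variable from the second line comes back as: ??°?-???????
In other words, jk itself does not deliver a Chinese environment variable value intact to request.getAttribute.
Download the mod_jk source code from http://tomcat.apache.org/download-connectors.cgi
Code analysis and log tracing confirm that the REDIRECT_URL value Apache hands to jk is UTF-8 encoded.
The value is written into the message buffer by jk_b_append_string and sent to Tomcat over the socket on port 8009.
Only after Tomcat has received and parsed it does the value come out garbled.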
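The garbled value is consistent with UTF-8 bytes being decoded one byte per character. The Java sketch below reproduces that effect outside Tomcat. That the receiving side decodes with ISO-8859-1 is an assumption made for this illustration, not something the trace above establishes, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of mod_jk or Tomcat.
class MojibakeDemo {
    // Encode the text as UTF-8, then decode every byte as one ISO-8859-1 character.
    // ASSUMPTION: this mimics how the attribute bytes are misread on the Java side.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The four Chinese characters of the example directory name, as Unicode escapes.
        String chinese = "\u6570\u5b57\u6545\u5bab";
        // The same bytes, percent-encoded (the DPM1 value from httpd.conf).
        String escaped = "%e6%95%b0%e5%ad%97%e6%95%85%e5%ae%ab";

        String garbled = misdecode(chinese);
        System.out.println(garbled.length());                    // prints 12
        System.out.println(misdecode(escaped).equals(escaped));  // prints true
    }
}
```

Under this assumption the four characters become twelve, matching the twelve-character garbled value, with the degree sign (byte 0xB0) in the third position. The percent-encoded value is pure ASCII and passes through unchanged.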
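Solution
Since jk handles Western (ASCII) text without trouble, why not URL-encode the Chinese URL before it is sent?
Here is a C implementation of urlencode found on the web: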
#include <ctype.h>   /* isalnum, used by the EBCDIC branch */
#include <stdlib.h>  /* malloc */
#include <string.h>  /* strlen, strchr */

static unsigned char hexchars[] = "0123456789ABCDEF";

/* Returns a newly allocated, percent-encoded copy of s, or NULL if malloc fails.
   The caller must free() the result. */
char *urlencode(char *s)
{
    register int x, y;
    unsigned char *str;
    int len = strlen(s);

    /* Worst case: every input byte expands to "%XX". */
    str = (unsigned char *) malloc(3 * strlen(s) + 1);
    if (str == NULL) {
        return NULL;
    }
    for (x = 0, y = 0; len--; x++, y++) {
        str[y] = (unsigned char) s[x];
        if (str[y] == ' ') {
            str[y] = '+';
#ifndef CHARSET_EBCDIC
        } else if ((str[y] < '0' && str[y] != '-' && str[y] != '.') ||
                   (str[y] < 'A' && str[y] > '9') ||
                   (str[y] > 'Z' && str[y] < 'a' && str[y] != '_') ||
                   (str[y] > 'z')) {
            /* Escape everything except alphanumerics, '_', '-' and '.' */
            str[y++] = '%';
            str[y++] = hexchars[(unsigned char) s[x] >> 4];
            str[y] = hexchars[(unsigned char) s[x] & 15];
        }
#else /*CHARSET_EBCDIC*/
        } else if (!isalnum(str[y]) && strchr("_-.", str[y]) == NULL) {
            /* Allow only alphanumeric chars and '_', '-', '.'; escape the rest */
            str[y++] = '%';
            str[y++] = hexchars[os_toascii[(unsigned char) s[x]] >> 4];
            str[y] = hexchars[os_toascii[(unsigned char) s[x]] & 0x0F];
        }
#endif /*CHARSET_EBCDIC*/
    }

    str[y] = '\0';
    return ((char *) str);
}
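To see what this function should produce for the example path without rebuilding mod_jk, the non-EBCDIC branch can be ported to a few lines of Java. This is only a sketch for checking values: the class name is hypothetical, and it assumes the first comparison in the C code is against a space character and that the input bytes are UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical Java port of the non-EBCDIC branch of urlencode, for checking values only.
class UrlEncodeSketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String urlencode(String s) {
        StringBuilder out = new StringBuilder();
        for (byte raw : s.getBytes(StandardCharsets.UTF_8)) {
            int c = raw & 0xFF; // treat the byte as unsigned, like the C cast
            if (c == ' ') {
                out.append('+');
            } else if ((c < '0' && c != '-' && c != '.')
                    || (c < 'A' && c > '9')
                    || (c > 'Z' && c < 'a' && c != '_')
                    || (c > 'z')) {
                // Everything except alphanumerics, '_', '-' and '.' becomes %XX.
                out.append('%').append(HEX[c >> 4]).append(HEX[c & 15]);
            } else {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Example path with the Chinese directory name written as Unicode escapes.
        System.out.println(urlencode("/rp/\u6570\u5b57\u6545\u5bab/hh.html"));
        // prints %2Frp%2F%E6%95%B0%E5%AD%97%E6%95%85%E5%AE%AB%2Fhh.html
    }
}
```

Note that the slashes are escaped as well (%2F), so the encoded value is not a usable path until it has been decoded again.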
Next, modify the part of the ajp_marshal_into_msgb function body that sends the attributes. The modified code is:
    if (s->num_attributes > 0) {
        for (i = 0; i < s->num_attributes; i++) {
            /* c4w: URL-encode REDIRECT_URL so that only ASCII is sent to Tomcat */
            char *pval = s->attributes_values[i];
            char *url = NULL;
            if (strcmp(s->attributes_names[i], "REDIRECT_URL") == 0) {
                url = urlencode(s->attributes_values[i]);
                if (url != NULL) {
                    pval = url;  /* send the encoded value instead of the raw one */
                }
            }
            if (jk_b_append_byte(msg, SC_A_REQ_ATTRIBUTE) ||
                jk_b_append_string(msg, s->attributes_names[i]) ||
                jk_b_append_string(msg, pval)) {
                jk_log(l, JK_LOG_ERROR,
                       "failed appending attribute %s=%s",
                       s->attributes_names[i], pval);
                free(url);  /* free(NULL) is a no-op */
                JK_TRACE_EXIT(l);
                return JK_FALSE;
            }
            free(url);
        }
    }
Finally, decode REDIRECT_URL in the JSP:
String url = URLDecoder.decode((String) request.getAttribute("REDIRECT_URL"), "utf-8");
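The decoding step can be tried on its own, outside a servlet container. The sketch below wraps the same URLDecoder call in a null-safe helper, since the attribute is absent on requests that did not go through Apache's error handling. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper; in a JSP the argument would be request.getAttribute("REDIRECT_URL").
class RedirectUrlDecoder {
    static String decode(Object attribute) {
        if (attribute == null) {
            return null; // attribute not set for this request
        }
        try {
            return URLDecoder.decode(attribute.toString(), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = "%2Frp%2F%E6%95%B0%E5%AD%97%E6%95%85%E5%AE%AB%2Fhh.html";
        // Expected path, with the Chinese directory name written as Unicode escapes.
        String expected = "/rp/\u6570\u5b57\u6545\u5bab/hh.html";
        System.out.println(decode(encoded).equals(expected)); // prints true
    }
}
```

URLDecoder also turns '+' back into a space, which matches the way the C urlencode function writes spaces.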
Conclusion
Compile, start Tomcat, and request the nonexistent page http://localhost/rp/数字故宫/hh.html again. Where the result used to be /rp/??°?-???????/hh.html,
the correct result now appears: /rp/数字故宫/hh.html