Garbled characters when downloading web pages
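I use URLConnection to download web pages, and sometimes the Chinese text comes out garbled. It does not happen often, but it always shows up once enough pages have been downloaded; for example, "下载" turns into "?载?". What causes this, and how can I fix it? Thanks.
My download code is as follows: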
public StringBuffer download(String strUrl, String encoding) {
    StringBuffer strContent = new StringBuffer();
    URL myurl;
    try {
        myurl = new URL(strUrl);
        URLConnection ucon = myurl.openConnection();
        ucon.setConnectTimeout(httpConnectTimeOut);
        ucon.setReadTimeout(httpReadTimeOut);
        ucon.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Maxthon; .NET CLR 1.1.4322)");
        InputStream in = ucon.getInputStream();
        byte buffer[] = new byte[1024];
        while (true) {
            int n = in.read(buffer);
            if (n < 0)
                break;
            strContent.append(new String(buffer, 0, n, encoding));
        }
        in.close();
    } catch (MalformedURLException e) {
        //e.printStackTrace();
        Out.print(Out.ERROR_INFO, "False Url!");
    } catch (IOException e) {
        //e.printStackTrace();
        //Out.print(Out.ERROR_INFO, "Http error!" + strUrl);
    }
    return strContent;
}
The encoding argument I pass in depends on the site: sometimes utf-8, sometimes GB2312.
[Solution]
Following along to learn...
up
[Solution]
Use GB2312 as the encoding everywhere...
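[Solution]
Changing the charset alone may not help. The symptom (rare, and only a character or two damaged) is what you would expect when a multi-byte character is split between two read() calls and each chunk is decoded separately with new String(buffer, 0, n, encoding). The sketch below reproduces that effect in memory; the class name ChunkDecodeDemo and the helper decodePerChunk are made up for illustration, and a chunk size of 4 is chosen only to force a split.

```java
import java.nio.charset.StandardCharsets;

final class ChunkDecodeDemo {
    // Decodes the bytes chunk by chunk, mirroring the loop in the question.
    // A character whose bytes straddle two chunks cannot be decoded in either one.
    static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int n = Math.min(chunkSize, data.length - off);
            sb.append(new String(data, off, n, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u4e0b\u8f7d"; // "下载", 3 bytes per character in UTF-8
        byte[] data = text.getBytes(StandardCharsets.UTF_8);

        // Chunks of 3 bytes line up with character boundaries.
        System.out.println(decodePerChunk(data, 3).equals(text)); // true
        // Chunks of 4 bytes cut the second character in two.
        System.out.println(decodePerChunk(data, 4).equals(text)); // false
    }
}
```

With a 1024-byte buffer the split only happens when a character happens to sit on a read boundary, which would explain why the problem is infrequent. The same applies to GB2312, where Chinese characters take two bytes.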
[Solution]
Try adding this line: ucon.setRequestProperty("Accept-Language", "zh-cn");
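[Solution]
One thing worth trying is to stop decoding each byte chunk on its own and let an InputStreamReader do the decoding instead, since the reader holds on to the bytes of an incomplete character until the rest arrive. The following is only a sketch under that assumption; the class name TextDownload and the method readAll are made up here, and the caller still has to pass the correct encoding for the site.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

final class TextDownload {
    // Reads the whole stream as text in the given encoding.
    // Decoding happens inside the reader, so a character split across
    // two underlying reads is still decoded as one character.
    static String readAll(InputStream in, String encoding) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, Charset.forName(encoding))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for ucon.getInputStream().
        byte[] data = "\u4e0b\u8f7d\u7f51\u9875".getBytes("UTF-8");
        System.out.println(readAll(new ByteArrayInputStream(data), "UTF-8"));
    }
}
```

In the original download method, the read loop would then become something like strContent.append(TextDownload.readAll(ucon.getInputStream(), encoding)). An alternative with the same effect is to collect all bytes in a ByteArrayOutputStream and call new String(bytes, encoding) once at the end.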