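I'd like to learn how to write a web crawler. Could someone write an example program that fetches a page's HTML? How should I go about it?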

2012-03-12

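I'd like to learn how to write a web crawler. Could someone write an example program that fetches a page's HTML?
For example, given a specified URL such as "www.baidu.com", the program should save that page to an HTML file.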

[Solution]
I happen to have one on hand; hope it helps.

Java code

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.ConnectException;
    import java.net.MalformedURLException;
    import java.net.URL;
    import java.net.URLConnection;

    private String getListHtml(String listUrl) throws IOException {
        // StringBuilder avoids re-copying the whole page on every appended line
        StringBuilder sHtml = new StringBuilder();
        BufferedReader br = null;
        try {
            URL url = new URL(listUrl);
            URLConnection uc = url.openConnection();
            // Send a browser-like User-Agent so that sites which block unknown clients still respond
            uc.setRequestProperty("User-Agent",
                    "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
            uc.connect();
            // Adjust the encoding to match the target page
            br = new BufferedReader(new InputStreamReader(uc.getInputStream(), "UTF-8"));
            String line;
            while ((line = br.readLine()) != null) {
                System.out.println(line); // process each line of HTML here as needed
                sHtml.append(line).append("\r\n");
            }
        } catch (MalformedURLException e) {
            e.printStackTrace();
            throw new IOException("The URL is malformed!", e);
        } catch (ConnectException e) {
            e.printStackTrace();
            throw new IOException("The address is unreachable!", e);
        } finally {
            try {
                if (br != null)
                    br.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        return sHtml.toString();
    }
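
Since the question asks for the page to end up in an HTML file, here is a minimal sketch of that last step. It is not part of the answer above: the class name `PageSaver`, the method `savePage`, and the 10-second timeouts are assumptions made for illustration. It copies the raw response bytes straight to disk, so no character encoding has to be guessed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class PageSaver {

    // Hypothetical helper: streams the body returned for pageUrl into target
    // and returns the number of bytes written.
    static long savePage(String pageUrl, Path target) throws IOException {
        URLConnection uc = new URL(pageUrl).openConnection();
        // Same browser-like User-Agent as in the answer above
        uc.setRequestProperty("User-Agent",
                "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
        // Assumed timeouts (milliseconds) so a dead host does not hang the program
        uc.setConnectTimeout(10_000);
        uc.setReadTimeout(10_000);
        try (InputStream in = uc.getInputStream()) {
            // Overwrite the file if it already exists
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("Usage: java PageSaver <url> <output-file>");
            return;
        }
        long bytes = savePage(args[0], Paths.get(args[1]));
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

It could be run as `java PageSaver http://www.baidu.com baidu.html`. Note that the URL needs its scheme: a bare "www.baidu.com" makes `new URL(...)` throw a MalformedURLException.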
[Solution]
The OP could also consider using httpClient.jar.
