
Group-Buying Aggregator XML Parsing: Study Notes

2012-12-27 

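A group-buying aggregator site, such as http://www.tuanp.com or http://www.tuan800.com, first has to analyze each group-buying site it wants to crawl, because many group-buying sites do not expose a public API (that is, an XML feed). For sites that do, the next step is to analyze the XML format and register that site's XML tag names in the back end. Finally, a timer fetches the feed at a fixed interval.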
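The timer step can be sketched with the JDK's `ScheduledExecutorService`. This is only a minimal sketch: the class name `FeedPoller` and the `FeedTask` hook are hypothetical names introduced here, and the original post does not say which timer mechanism it used.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FeedPoller {

    /** Hypothetical hook: fetch one site's XML feed, parse it, store the deals. */
    public interface FeedTask {
        void run() throws Exception;
    }

    /** Runs the task immediately, then again after each delay, on one background thread. */
    public static ScheduledExecutorService start(FeedTask task, long delay, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                task.run();
            } catch (Exception e) {
                // An uncaught exception would cancel all future runs, so log and continue.
                e.printStackTrace();
            }
        }, 0, delay, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler =
                start(() -> System.out.println("fetching feed..."), 1, TimeUnit.SECONDS);
        Thread.sleep(3500);      // let a few runs happen
        scheduler.shutdownNow(); // stop polling
    }
}
```

A fixed delay (rather than a fixed rate) is chosen here so that a slow site cannot cause fetches to pile up; in practice the interval would more likely be minutes than seconds.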

1. Group-buying site feed: Test.xml

<?xml version="1.0" encoding="utf-8"?>
<result>
  <error>0</error>
  <data>
    <site>
      <site_name>团购网站</site_name>
      <site_title>团购标题</site_title>
      <site_url>http://xxx.xxx.com</site_url>
    </site>
    <teams>
      <team>
        <id>21</id>
        <link>http://xxx.xxx.com/team.php?id=21</link>
        <large_image_url>http://xxx.xxx.com/upfile/team/2010/1013/12869663322493.jpg</large_image_url>
        <small_image_url>http://xxx.xxx.com/upfile/team/2010/1013/12869663322493_index.jpg</small_image_url>
        <title> Title1……</title>
        <product> Product1…… </product>
        <team_price>37.50</team_price>
        <market_price>60.90</market_price>
        <rebate>6.16</rebate>
        <start_date>2010-10-14T00:00:00+08:00</start_date>
        <end_date>2010-10-20T00:00:00+08:00</end_date>
        <state>success</state>
        <tipped>true</tipped>
        <tipped_date>2010-10-15T10:55:14+08:00</tipped_date>
        <tipping_point>10</tipping_point>
        <current_point>192</current_point>
        <conditions>
          <limited_quantity>false</limited_quantity>
          <maximum_purchase>0</maximum_purchase>
          <expiration_date>2011-01-14T00:00:00+08:00</expiration_date>
        </conditions>
        <city></city>
        <group></group>
      </team>
      <team>
        <id>23</id>
        <link>http://xxx.xxx.com/team.php?id=23</link>
        <large_image_url>http://xxx.xxx.com/upfile/team/2010/1015/12870801418707.jpg</large_image_url>
        <small_image_url>http://xxx.xxx.com/upfile/team/2010/1015/12870801418707_index.jpg</small_image_url>
        <title> Title2…… </title>
        <product> Product2……</product>
        <team_price>58.00</team_price>
        <market_price>298.00</market_price>
        <rebate>1.95</rebate>
        <start_date>2010-10-16T00:00:00+08:00</start_date>
        <end_date>2010-10-19T00:00:00+08:00</end_date>
        <state>success</state>
        <tipped>true</tipped>
        <tipped_date>2010-10-16T10:45:10+08:00</tipped_date>
        <tipping_point>10</tipping_point>
        <current_point>138</current_point>
        <conditions>
          <limited_quantity>false</limited_quantity>
          <maximum_purchase>0</maximum_purchase>
          <expiration_date>2011-01-16T00:00:00+08:00</expiration_date>
        </conditions>
        <city></city>
        <group></group>
      </team>
    </teams>
  </data>
</result>
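Once the text values are extracted, the price and date fields can be converted to typed values. The sketch below is an illustration only: the class `DealFields` and its methods are hypothetical names. The rebate formula (team_price / market_price × 10, rounded to two decimals) is an inference from the two sample deals, which it reproduces exactly; the feed itself does not document it.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.Duration;
import java.time.OffsetDateTime;

public class DealFields {

    /** Assumed formula: discount in tenths of the market price, two decimals. */
    public static BigDecimal rebate(String teamPrice, String marketPrice) {
        return new BigDecimal(teamPrice)
                .multiply(BigDecimal.TEN)
                .divide(new BigDecimal(marketPrice), 2, RoundingMode.HALF_UP);
    }

    /** Whole days between two ISO-8601 timestamps with a UTC offset, as used in the feed. */
    public static long saleDays(String startDate, String endDate) {
        OffsetDateTime start = OffsetDateTime.parse(startDate);
        OffsetDateTime end = OffsetDateTime.parse(endDate);
        return Duration.between(start, end).toDays();
    }

    public static void main(String[] args) {
        // Values copied from the two <team> entries in Test.xml
        System.out.println(rebate("37.50", "60.90"));   // 6.16, matches <rebate> of deal 21
        System.out.println(rebate("58.00", "298.00"));  // 1.95, matches <rebate> of deal 23
        System.out.println(saleDays("2010-10-14T00:00:00+08:00",
                                    "2010-10-20T00:00:00+08:00")); // 6
    }
}
```

Prices are kept as `BigDecimal` rather than `double` so that values such as 37.50 are not subject to binary floating-point rounding.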


2. Back-end Java parser

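Path-based (XPath) parsing with dom4j. Note that XPath calls such as selectNodes require the jaxen library on the classpath.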

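import java.io.File;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.io.SAXReader;

// The class name is a placeholder; the original listing showed only the two methods.
public class TeamFeedParser {

    public static void main(String[] args) {
        try {
            SAXReader reader = new SAXReader();
            Document doc = reader.read(new File("Test.xml"));
            Element root = doc.getRootElement();
            // Select every <team> element under data/teams
            List<?> list = root.selectNodes("//data/teams/team");
            for (int i = 0; i < list.size(); i++) {
                Element element = (Element) list.get(i);
                String id = getNodeText(element, "id");
                System.out.println("id:" + id);
                String title = getNodeText(element, "title");
                System.out.println("title:" + title);
                String image = getNodeText(element, "large_image_url");
                System.out.println("image:" + image);
                String dealUrl = getNodeText(element, "link");
                System.out.println("dealUrl:" + dealUrl);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Returns the text of the named child element; fails fast if the tag is missing.
    public static String getNodeText(Element element, String nodename) throws Exception {
        Node node = element.selectSingleNode(nodename);
        if (node == null) {
            throw new RuntimeException("Tag not found: " + nodename);
        }
        return node.getText();
    }
}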
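Because each site's feed may use different tag names, the tag names registered in the back end can be passed to the parser as a mapping instead of being hard-coded. The following is a rough sketch of that idea, not the author's implementation: it swaps dom4j for the JDK's built-in `javax.xml.xpath` API so that it runs without extra libraries, and `MappedFeedParser`, `itemPath` and `tagMap` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MappedFeedParser {

    /**
     * itemPath selects one node per deal; tagMap maps our own field names
     * to the site's tag names (XPath relative to each deal node).
     */
    public static List<Map<String, String>> parse(
            String xml, String itemPath, Map<String, String> tagMap) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList items = (NodeList) xpath.evaluate(itemPath, doc, XPathConstants.NODESET);

        List<Map<String, String>> deals = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Node item = items.item(i);
            Map<String, String> deal = new LinkedHashMap<>();
            for (Map.Entry<String, String> entry : tagMap.entrySet()) {
                // Unlike getNodeText above, a missing tag yields "" instead of an exception.
                deal.put(entry.getKey(), xpath.evaluate(entry.getValue(), item).trim());
            }
            deals.add(deal);
        }
        return deals;
    }

    public static void main(String[] args) throws Exception {
        // Shortened version of Test.xml
        String xml = "<result><data><teams>"
                + "<team><id>21</id><link>http://xxx.xxx.com/team.php?id=21</link></team>"
                + "<team><id>23</id><link>http://xxx.xxx.com/team.php?id=23</link></team>"
                + "</teams></data></result>";

        Map<String, String> tagMap = new LinkedHashMap<>();
        tagMap.put("dealId", "id");
        tagMap.put("dealUrl", "link");

        for (Map<String, String> deal : parse(xml, "//data/teams/team", tagMap)) {
            System.out.println(deal);
        }
    }
}
```

With this shape, supporting another site's feed would mean adding a new item path and tag map rather than changing the parsing code.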


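Comment 1, xiangfei209, 2011-04-07: OP, have you worked on category classification for a group-buying aggregator, for example dining, hairdressing, and so on?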
