A problem scraping an ASP page with Python

2012-12-30

I want to scrape the data from a page on a website, using the Python code I normally use, shown below:
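import StringIO  # Python 2 module; provides the in-memory buffer used below
import pycurl

# Target page (the stray leading space in the URL has been removed)
address = 'http://xin.cz3.nus.edu.sg/group/cjttd/ZFTTDDRUG.asp?ID=DAP000001'
print address
c = pycurl.Curl()
html = StringIO.StringIO()  # collects the response body
c.setopt(pycurl.URL, address)
c.setopt(pycurl.WRITEFUNCTION, html.write)
c.setopt(pycurl.FOLLOWLOCATION, 1)  # follow redirects
c.setopt(pycurl.MAXREDIRS, 5)       # but no more than 5 of them
c.perform()
page = html.getvalue()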

However, what I get back is not what the page shows when opened normally in a browser.
The returned content is:

Sorry, the page cannot be displayed
There is a problem with the page you are trying to reach and it cannot be displayed.
--------------------------------------------

Please try the following:

• Contact the Web site administrator to let them know that this error has occured for this URL address.
HTTP 500.100 - Internal server error: ASP error.
Internet Information Services

--------------------------------------------

Technical Information (for support personnel)
Error Type:
SessionID, ASP 0164 (0x80004005)
An invalid TimeOut value was specified.
/LM/W3SVC/1/ROOT/global.asa, line 33
• Browser Type:
PycURL/7.18.2
• Page:
GET /group/cjttd/ZFTTDDRUG.asp

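The same thing happens with urllib2.
It is strange: scraping other pages works fine, and only this site's pages fail. Have they configured something to block scraping? Or can ASP pages simply not be fetched this way?
Could someone who knows this well advise? Thanks!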
[Solution]
Set a user agent (the error page shows the request was identified as PycURL/7.18.2):

c.setopt(pycurl.USERAGENT, 'any')
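
Since the question says urllib2 fails in the same way, the same idea can probably be applied there as well. Below is a minimal sketch, assuming Python 3, where urllib2 has become urllib.request. The helper names build_request and fetch, the default 'Mozilla/5.0' string, and the timeout value are illustrative choices that do not come from the thread, and whether this particular server accepts any given user agent has not been verified here.

```python
import urllib.request

# URL taken from the question
ADDRESS = 'http://xin.cz3.nus.edu.sg/group/cjttd/ZFTTDDRUG.asp?ID=DAP000001'


def build_request(url, user_agent='Mozilla/5.0'):
    """Return a Request that carries an explicit User-Agent header.

    build_request and the default user_agent value are hypothetical,
    chosen only for illustration.
    """
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


def fetch(url, user_agent='Mozilla/5.0', timeout=30):
    """Fetch the page body as bytes, sending the explicit User-Agent."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

Calling fetch(ADDRESS) would then send the request with the chosen user agent instead of the library default, which mirrors what the pycurl.USERAGENT option does in the answer above.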
