I'm trying to extract a few numbers from a web page. I got halfway and now I'm stuck.
Page source:
...
<tr>
<td>..月销售数量..</td>
...
<td>...</td>
<td>
<ul>
<li row="1" class="" onclick="OZ.r(this)" style="cursor:pointer" >696</li>
<li row="1" class="" onclick="OZ.r(this)" style="cursor:pointer" >225</li>
<li row="1" class="" onclick="OZ.r(this)" style="cursor:pointer" >236</li>
</ul>
<ul>
<li row="1" class="" onclick="OZ.r(this)" style="cursor:pointer" >639</li>
<li row="1" class="" onclick="OZ.r(this)" style="cursor:pointer" >210</li>
<li row="1" class="" onclick="OZ.r(this)" style="cursor:pointer" >267</li>
</ul>
</td>
<td>...</td>
...
<td>...</td>
</tr>
...
C# code:
IHTMLDocument2 iHTMLDocument2 = GetIHTMLDocument2(str);
HTMLDocumentClass document = (HTMLDocumentClass)(iHTMLDocument2.body.document);
foreach (IHTMLElement tr in document.getElementsByTagName("tr"))
{
    // Only the row whose text contains "销售数量" (sales quantity) is of interest.
    if (tr.innerText.Contains("销售数量"))
    {
    }
}
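I need to extract those 6 numbers, but the tr.innerText string runs them all together, roughly like this:
"月销售数量...696225236 639210267..." (月销售数量 means "monthly sales quantity"). Can Split() separate them, or is a regular expression the only way? I'm not a programmer and don't know regex, so any pointers would be appreciated!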
[Solution]
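// Requires: using System.Text.RegularExpressions;
// html must be the raw markup (tags included), not innerText.
// Matches the digits that sit between 'pointer" >' and '</li>'.
var matches = Regex.Matches(html, @"(?<=pointer""\s\>)\d+(?=\<\/li\>)");
// MatchCollection enumerates as object, so declare the loop variable as Match
// to make item.Value compile.
foreach (Match item in matches)
{
    Console.WriteLine(item.Value);
}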
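If the two <ul> groups need to stay separate (696/225/236 versus 639/210/267), the same regex idea can be applied in two passes: first match each <ul>...</ul> block, then pull the digits out of each <li> inside it. The following is only a rough sketch under some assumptions: the names SalesExtractor and ExtractGroups are made up for illustration, the input is assumed to be the raw markup of the row (for example tr.innerHTML rather than tr.innerText), and the patterns ignore case and tolerate arbitrary attributes in case the browser DOM hands back markup formatted differently from the page source.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original post.
static class SalesExtractor
{
    // Returns one list of numbers per <ul> block found in the HTML fragment.
    public static List<List<int>> ExtractGroups(string html)
    {
        var groups = new List<List<int>>();
        var options = RegexOptions.IgnoreCase | RegexOptions.Singleline;

        // Pass 1: capture the body of each <ul>...</ul> block (lazy match,
        // Singleline so that '.' also spans line breaks).
        foreach (Match ul in Regex.Matches(html, @"<ul\b[^>]*>(.*?)</ul>", options))
        {
            var numbers = new List<int>();

            // Pass 2: capture the digits inside each <li ...>digits</li>.
            foreach (Match li in Regex.Matches(ul.Groups[1].Value,
                                               @"<li\b[^>]*>\s*(\d+)\s*</li>", options))
            {
                numbers.Add(int.Parse(li.Groups[1].Value));
            }
            groups.Add(numbers);
        }
        return groups;
    }
}

static class Program
{
    static void Main()
    {
        // Sample fragment copied from the page source in the question.
        string html = @"<ul>
<li row=""1"" class="""" onclick=""OZ.r(this)"" style=""cursor:pointer"" >696</li>
<li row=""1"" class="""" onclick=""OZ.r(this)"" style=""cursor:pointer"" >225</li>
<li row=""1"" class="""" onclick=""OZ.r(this)"" style=""cursor:pointer"" >236</li>
</ul>
<ul>
<li row=""1"" class="""" onclick=""OZ.r(this)"" style=""cursor:pointer"" >639</li>
<li row=""1"" class="""" onclick=""OZ.r(this)"" style=""cursor:pointer"" >210</li>
<li row=""1"" class="""" onclick=""OZ.r(this)"" style=""cursor:pointer"" >267</li>
</ul>";

        // Print each group on its own line.
        foreach (var group in SalesExtractor.ExtractGroups(html))
        {
            Console.WriteLine(string.Join(", ", group));
        }
    }
}
```

As for Split(): once the tags have been stripped by innerText, the digits within a group have no separator left between them, so Split() has nothing to cut on. Working from the markup keeps each number inside its own <li>.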
[Solution]