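File read/write efficiency in VB
The program processes every TXT file in the Output folder. These files contain strings of the form TRIGSTR_XXXX (X is a digit, 1 to 4 digits in total), and the replacement text lives in one of four XML files chosen by digit count: 1.XML (TRIGSTR_XXXX), 2.XML (TRIGSTR_XXX), 3.XML (TRIGSTR_XX), 4.XML (TRIGSTR_X). Using regular expressions, the program replaces each TRIGSTR_XXXX in the TXT files with the matching content from the XML file (the XML format is <item><old>TRIGSTR_XXXX</old><new>string</new></item>). The program is written and, as far as I can tell, has no logic errors, but it runs very slowly. The whole Output folder holds about 500 KB of TXT files, and another tool (written by someone else) produces the result in about 90 seconds. My own program needs 65 seconds even for a 31 KB file. Where is the speed bottleneck?
I suspect that either the file operations or the regular expressions are too slow. Does anyone know which it is?
Source code: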
Option Explicit

Private Sub Command1_Click()
    Dim ofso As FileSystemObject
    Dim fo As Folder
    Dim f As File
    Dim str1 As String
    Dim str2 As String
    Dim str3 As String
    Dim Count As Long
    Dim regex1 As RegExp, regex2 As RegExp
    Dim objMatch1 As Match, objMatch2 As Match
    Dim colMatches1 As MatchCollection, colMatches2 As MatchCollection
    Dim CountFind As Long, CountReplace As Long
    Dim zerostart As Boolean

    ' The button's design-time caption is assumed to be "Convert".
    If Command1.Caption <> "Convert" Then
        MsgBox "Conversion is already done. To convert again, delete the newly generated files and restart this program."
        Exit Sub
    End If
    Command1.Caption = "Converting..."
    Set ofso = New FileSystemObject
    Set fo = ofso.GetFolder("Output")
    'Set fo = ofso.GetFolder("E:\Output")
    Set regex1 = New RegExp
    Set regex2 = New RegExp
    For Each f In fo.Files
        Open "Output\" & f.Name For Input As #1
        'Open "E:\Output\" & f.Name For Input As #1
        Open "Output\New_" & f.Name For Output As #3
        'Open "E:\Output\New_" & f.Name For Output As #3
        'Open "E:\Output\Debug.txt" For Output As #4
        Do While Not EOF(1)
            Line Input #1, str1
            'MsgBox "TXT line read: " & str1
            regex1.Pattern = "TRIGSTR_([0-9]+)"
            ' Global has to be set before Execute, otherwise only the first match is returned.
            regex1.Global = True
            Set colMatches1 = regex1.Execute(str1)
            For Each objMatch1 In colMatches1
                CountFind = CountFind + 1
                ' 4 digits -> 1.xml, 3 digits -> 2.xml, 2 digits -> 3.xml, 1 digit -> 4.xml
                Count = 4 - Len(objMatch1.SubMatches(0)) + 1
                ' The XML file is reopened and scanned from the top for every single match.
                Open Count & ".xml" For Input As #2
                'Open "E:\" & Count & ".xml" For Input As #2
                Do While Not EOF(2)
                    Line Input #2, str2
                    'If objMatch1 = "TRIGSTR_196" Then MsgBox str2
                    ' Start the regular-expression comparison
                    regex2.Pattern = objMatch1.Value & "</old><new>(.+?)</new>"
                    If regex2.Test(str2) Then
                        Set colMatches2 = regex2.Execute(str2)
                        Set objMatch2 = colMatches2(0)
                        str3 = objMatch2.SubMatches(0)
                        'Print #4, "Original string: " & str1
                        'Print #4, "String found: " & str2
                        'Print #4, "Replacement string: " & str3
                        ' regex1 is global, so this replaces every TRIGSTR_ token on the line with str3.
                        str1 = regex1.Replace(str1, str3)
                        CountReplace = CountReplace + 1
                        'Print #4, "Result after replacement: " & str1 & vbCrLf
                    End If
                Loop
                Close #2
            Next
            Print #3, str1
        Loop
        'Close #4
        Close #3
        Close #1
    Next
    If CountFind > CountReplace Then
        MsgBox "Found " & CountFind & " matching strings in total, replaced " & CountReplace & "." & vbCrLf & "Please check that the supplied files are correct!"
    Else
        MsgBox "Conversion succeeded!"
    End If
    Command1.Caption = "Conversion finished"
End Sub
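[Solution]
You keep opening and closing files (the XML file is reopened for every match), so of course it's slow.
[Solution]
The poster above is right: open everything once at the start and close everything once at the end. Opening and closing a file involves disk access, which naturally takes time.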
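One way to follow this advice is to read the four XML files exactly once, before the main loop, and keep the old/new pairs in memory. The sketch below is only an illustration under a few assumptions: the project references Microsoft Scripting Runtime and Microsoft VBScript Regular Expressions 5.5 (the original code already uses types from both), each <old>/<new> pair sits on a single line, and the files are named 1.xml to 4.xml in the current directory. LoadLookup is a hypothetical helper name, not part of the original program.

```vb
' Sketch: read 1.xml .. 4.xml once and keep every old/new pair in memory.
' LoadLookup is a hypothetical helper; the pattern assumes one pair per line.
Private Function LoadLookup() As Dictionary
    Dim dict As Dictionary
    Dim re As RegExp
    Dim m As Match
    Dim fnum As Integer
    Dim i As Long
    Dim xmlLine As String

    Set dict = New Dictionary
    Set re = New RegExp
    re.Global = True
    ' Tolerates optional whitespace around the tags.
    re.Pattern = "<old>\s*(TRIGSTR_[0-9]+)\s*</old>\s*<new>(.+?)</new>"

    For i = 1 To 4
        fnum = FreeFile
        Open i & ".xml" For Input As #fnum
        Do While Not EOF(fnum)
            Line Input #fnum, xmlLine
            For Each m In re.Execute(xmlLine)
                ' Key: the TRIGSTR_ token, value: its replacement text.
                dict(CStr(m.SubMatches(0))) = CStr(m.SubMatches(1))
            Next
        Loop
        Close #fnum
    Next

    Set LoadLookup = dict
End Function
```

With the table loaded, looking up a token becomes a single dict.Exists(token) check instead of a fresh scan through an XML file.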
[Solution]
It's an algorithm problem. Doing file I/O over and over inside a loop is bound to be slow.
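As a rough sketch of what the replacement loop could look like with no file I/O in it at all: the function below takes text that is already in memory and a lookup table, and builds the result in one pass. ReplaceTokens is a hypothetical helper name, and lookup is assumed to be a Scripting.Dictionary that maps each TRIGSTR_ token to its replacement and was filled before the loop started. Tokens missing from the table are left unchanged.

```vb
' Sketch: replace every TRIGSTR_ token in src using an in-memory table.
' ReplaceTokens and lookup are illustrative names, not from the original code.
Private Function ReplaceTokens(ByVal src As String, ByVal lookup As Dictionary) As String
    Dim re As RegExp
    Dim matches As MatchCollection
    Dim m As Match
    Dim parts() As String
    Dim i As Long
    Dim pos As Long

    Set re = New RegExp
    re.Global = True
    re.Pattern = "TRIGSTR_[0-9]+"
    Set matches = re.Execute(src)
    If matches.Count = 0 Then
        ReplaceTokens = src
        Exit Function
    End If

    ' Even slots hold the text between tokens, odd slots hold replacements.
    ReDim parts(0 To matches.Count * 2)
    pos = 1
    For i = 0 To matches.Count - 1
        Set m = matches(i)
        ' FirstIndex is zero-based, Mid$ is one-based.
        parts(i * 2) = Mid$(src, pos, m.FirstIndex + 1 - pos)
        If lookup.Exists(m.Value) Then
            parts(i * 2 + 1) = lookup(m.Value)
        Else
            parts(i * 2 + 1) = m.Value
        End If
        pos = m.FirstIndex + 1 + m.Length
    Next
    parts(matches.Count * 2) = Mid$(src, pos)

    ReplaceTokens = Join(parts, "")
End Function
```

Because each token is replaced individually, a line that contains several different tokens gets the right replacement for each of them.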
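[Solution]
Heh, the two posters above have impressive reputation scores.
As for the OP's problem, before I even opened the thread I was guessing it would be reading the file line by line...
And sure enough, there it is: Line Input #1, str1
One suggestion:
Open the file in binary mode, read the whole content in one go into a string array, and change the loop to work on that array. It will be much faster.
Also, for the output, build the result in memory first and write it to the file in one go once everything is done.
It's really hot today...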
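The suggestion above (binary read in one go, process in memory, write once) might look roughly like the sketch below. ReadAllText, WriteAllText and ConvertFile are hypothetical helper names, and the code assumes the files are ANSI text in the system code page with CRLF line endings. Reading into a Byte array and converting with StrConv avoids the byte/character count mismatch that double-byte text can cause.

```vb
' Sketch: whole-file read and write, with all processing done in memory.
' All three routine names are illustrative, not from the original program.
Private Function ReadAllText(ByVal path As String) As String
    Dim fnum As Integer
    Dim bytes() As Byte

    fnum = FreeFile
    Open path For Binary Access Read As #fnum
    If LOF(fnum) > 0 Then
        ReDim bytes(0 To LOF(fnum) - 1)
        Get #fnum, , bytes                      ' one read for the whole file
        ReadAllText = StrConv(bytes, vbUnicode) ' ANSI bytes -> VB string
    End If
    Close #fnum
End Function

Private Sub WriteAllText(ByVal path As String, ByVal content As String)
    Dim fnum As Integer

    fnum = FreeFile
    Open path For Output As #fnum
    Print #fnum, content;                       ' trailing ; suppresses an extra newline
    Close #fnum
End Sub

Private Sub ConvertFile(ByVal inPath As String, ByVal outPath As String)
    Dim lines() As String
    Dim i As Long

    lines = Split(ReadAllText(inPath), vbCrLf)
    For i = LBound(lines) To UBound(lines)
        ' Process lines(i) in memory here, for example the TRIGSTR_ replacement.
    Next
    WriteAllText outPath, Join(lines, vbCrLf)
End Sub
```

For files of a few hundred KB, holding the whole content in a string array is cheap, and each file is then opened only once for reading and once for writing.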