Not sure if this is the right spot for this question but I'm here!

I'm trying to make a vb script that opens large text files, one at time, searches each line for Null values (there can be many) and then replaces them with a space. My coworker put the following together which takes a very long time because it is basically iterating through millions of characters in the file:

Do Until objFile.AtEndOfStream 
    strCharacters = objFile.Read(1)
    'Wscript.Echo strCharacters
   ' Wscript.Echo Asc(strCharacters)

    If Asc(strCharacters) = 0 Then
        objOutFile.Write(" ")
    Else
        objOutFile.Write(strCharacters)
    End If

Loop 

What I was hoping to do was read each line and perform the Replace function if there is an ASCII value of 0 but I keep getting an error with:

Do Until objFile.AtEndOfStream
    strCharacters = objFile.ReadLine
    newStrCharacters = Replace(strCharacters,Chr(Asc(0))," ")
    'Wscript.Echo strCharacters
    'Wscript.Echo newStrCharacters
    objOutFile.Write(newStrCharacters)

Loop

However I get prompted with an Invalid Call or Argument throw and I don't understand why? Both variables in my loop, before and after (strCharacters & newStrCharacters) display correctly, I just can't seem to write to the file, which looks like:

strTemp = "C:\Documents and Settings\...\SpecialCharRemoval\out1.txt"
Set objOutFile = objFSO.CreateTextFile(strTemp,True)

Can someone please nudge me in the right direction?

Recommended Answers

All 6 Replies

I guess the problem is in the Chr(Asc(0)) call. Asc expects a string and you are providing a number.
You only need to put Chr(0).
Or possibly vbNullChar or whatever it is (can't remember).

Hi SalmiSoft and thanks. I did realize that about the Asc call and switched Chr with Asc but it seems it is the Write that's erroring because the newStrCharacters holds data, it's just not writing it to the new file.

I'm still trying to solve this but to no avail. I've tried the following functions:

Do until objFile.AtEndOfStream
    strCharacters = objFile.ReadLine
    newStrCharacters = strClean3(strCharacters)
    on error resume next
    f.WriteLine newStrCharacters
    on error goto 0
Loop

f.Close
Wscript.Echo "done"

Function strClean (strToClean)
    Dim objRegExp, outputStr, s

    Set objRegExp = New Regexp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    objRegExp.Pattern = ""
    outputStr = objRegExp.Replace(strToClean," ")
    strClean = outputStr
End Function

Function strClean2(strToClean)
    tempstr = strToClean
    charToLose = ""
    s = replace(tempstr,charToLose," ")
    strClean2 = s
End Function

Function strClean3(strToClean)
    intStarter = Len(strToClean)
    s = strToClean
    For i = intStarter to 2 Step -1
        ch = ""
        s = Replace(s,ch, " ")
    Next
    wscript.echo strToclean

End Function

However, the end results of strClean2 and 3 are lines printed where regular letters are replaced with null characters (that I cannot even display here but they look like little squares).

The following works:

Do Until objFile.AtEndOfStream 
    strCharacters = objFile.Read(1) 
    'Wscript.Echo strCharacters
    'Wscript.Echo Asc(strCharacters)

    If Asc(strCharacters) = 0 Then
        objOutFile.Write(" ")
    Else
        objOutFile.Write(strCharacters)
    End If
Loop 

However, each line can have hundreds, if not a 1000 plus characters in it and each file can have up to about 150,000 lines so going about it this way is not too quick or efficient.

Can anyone help?

What do you mean by

quick or efficient?

    Function replaceNull()
        Dim filename = "a.txt"
        Dim regex As Regex = New Regex(New String(vbNullChar))
        Dim replacement As String = " "

        Dim reader As StreamReader = My.Computer.FileSystem.OpenTextFileReader(filename)
        Dim writer As StreamWriter = New StreamWriter("out_" + filename, True)
        Dim line As String

        Do
            line = reader.ReadLine
            Dim result As String = line
            Try
                result = regex.Replace(line, replacement)
            Catch
            Finally
                writer.WriteLine(result)
            End Try
        Loop Until line Is Nothing
        writer.Close()
        reader.Close()
    End Function

    Sub Main()

        Dim TimerStart As DateTime
        TimerStart = Now

        replaceNull()

        Dim TimeSpent As System.TimeSpan
        TimeSpent = Now.Subtract(TimerStart)
        MsgBox(TimeSpent.TotalSeconds & " seconds.")

    End Sub

a.txt contains 203114 lines, each one with 868 characters per line. The null character is at the beginning and middle of the line. Here's the line:

Hello world [null][null] yadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayada[null][null][null] yadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadayadav

It took @3-4 seconds to complete the task.

I mean the vbscript takes about 15 minutes to go through each line and do it's thing. I tried your suggestion but I get "Expected End of Statement" errors. Thanks though.

Well my solution has been built in the form of VBA. The following does what I need and it works way faster than the script I have.

Sub RnR_Nulls()

    Dim ws As Worksheet
    Dim Data() As Byte
    Dim Filename As String
    Dim fText As String, TextLines As Variant
    Dim i As Long
    Set ws = ActiveSheet

    Application.ScreenUpdating = False
    Filename = "***\Desktop\VBA\Dev\***\SpecialCharRemoval\AGET4.1.TXT"
    Open Filename For Binary Access Read As #1
    ReDim Data(LOF(1))
    Get #1, , Data
    Close #1

    Text = StrConv(Data, vbUnicode)
    Text = Replace(Text, Chr(0), Chr(32))
    TextLines = Split(Text, vbCrLf)

    For i = 0 To UBound(TextLines)
        ws.Cells(ws.Rows.Count, 1).End(xlUp).Offset(1, 0).Value = TextLines(i)
    Next i
    Application.ScreenUpdating = True

End Sub
Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.