Hello all,

I am a VB.NET newbie, and I am trying to write code that, when executed, pulls the latest file from an FTP server (based on creation date). So far, I've been able to list all the files in the remote directory. I know this can be done using regular expressions, but unfortunately I have no idea how to implement that in code. Any help is appreciated.

Here is a regular expression I found somewhere that is supposed to get the oldest file [but I need one that gets the newest file :( ]:

(\d{2}-\d{2}-\d{2}\s{1,10}\d{2}:\d{2}\w{2})\s{1,21}(\d{1,10})\s{1,10}(.+)

My progress so far.

Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click

        Dim ftpRequest As FtpWebRequest = DirectCast(WebRequest.Create("ftp://ftp.sitename.com/logs"), FtpWebRequest)

        ftpRequest.Method = WebRequestMethods.Ftp.ListDirectoryDetails

        ftpRequest.Credentials = New NetworkCredential("username", "password")

        Dim ftpResponse As FtpWebResponse = DirectCast(ftpRequest.GetResponse(), FtpWebResponse)

        Dim ftpResponseStream As Stream = ftpResponse.GetResponseStream()

        Dim ftpReader As New StreamReader(ftpResponseStream)

        Console.WriteLine(ftpReader.ReadToEnd())

        MsgBox("Directory List Complete...!")

        ftpReader.Close()
        ftpResponse.Close()

    End Sub

Thanks for the reply, cgeier. The link you posted was really helpful. But right now, after implementing the regex in my code, I can't get it to print any output (the Console.WriteLine does nothing). I don't receive any errors; nothing happens at all. The output stream doesn't create the file either. So I assume the problem lies in the regex or in the way I am reading the response stream for use with the regex. If anyone can shed some light on what I am doing wrong and suggest corrections, that would be great.

Here's my updated code:

Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click

        Dim ftpRequest As FtpWebRequest = DirectCast(WebRequest.Create("ftp://ftp.sitename.com/logs"), FtpWebRequest)

        ftpRequest.Method = WebRequestMethods.Ftp.ListDirectoryDetails

        ftpRequest.Credentials = New NetworkCredential("username", "password")

        Dim ftpResponse As FtpWebResponse = DirectCast(ftpRequest.GetResponse(), FtpWebResponse)

        Dim ftpResponseStream As Stream = ftpResponse.GetResponseStream()

        Dim ftpReader As New StreamReader(ftpResponseStream)

        Dim result As String = ftpReader.ReadToEnd

        Dim pattern As String = "(\d{2}-\d{2}-\d{2}\s{1,10}\d{2}:\d{2}\w{2})\s{1,21}(\d{1,10})\s{1,10}(.+)"


            For Each match As Match In Regex.Matches(result, pattern)

                Console.WriteLine(match.Value())

                Dim output As IO.Stream

                ftpRequest.Method = WebRequestMethods.Ftp.DownloadFile(match.Value)

            output = System.IO.File.Create("E:\new.zip")

                output.Close()

            Next

        MsgBox("Can't see sh*t!")

        ftpReader.Close()
        ftpResponse.Close()

    End Sub

P.S.: I also tried to catch exceptions, but there were none.

I can see the FTP directory listing in the console, but I can't figure out how to select and download the most recently modified/created file. I tried to use GetDateTimestamp, but the server threw a 550 (file/directory not found) error. I even tried the WebClient approach and tried to parse the HTML elements after a POST login, but the download links were wrapped inside a JavaScript-enabled HTML container, so I couldn't get it to work.

Here's the FTP directory listing:
http://i.imgur.com/iXjKvoC.png
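
A note on why the updated code prints nothing and throws nothing: the pattern in the question looks like it was written for Windows/IIS-style listings (date, time with AM/PM, size, name), while the sample data quoted later in this thread is Unix-style (permissions, owner, size, month, day, time, name). If the server returns the Unix format, Regex.Matches finds no matches, so the For Each body never runs. The sketch below checks that assumption. The Windows-style sample line is made up for illustration; the Unix-style line is copied from the sample data in this thread. The module and function names are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ListingFormatCheck

    ' Pattern from the question: date, time with AM/PM, size, name.
    Private Const WindowsStylePattern As String = "(\d{2}-\d{2}-\d{2}\s{1,10}\d{2}:\d{2}\w{2})\s{1,21}(\d{1,10})\s{1,10}(.+)"

    ' Returns True when the pattern from the question matches the listing line.
    Public Function MatchesWindowsStyle(ByVal line As String) As Boolean
        Return Regex.IsMatch(line, WindowsStylePattern)
    End Function

    Sub Main()
        ' Hypothetical Windows/IIS-style line (not taken from the thread).
        Dim windowsLine As String = "07-03-14  07:10PM                14490 2014-07-03-1800.txt.tgz"

        ' Unix-style line taken from the sample data in this thread.
        Dim unixLine As String = "-rw-r--r--   1 0     0      14490 Jul  3 19:10 2014-07-03-1800.txt.tgz"

        Console.WriteLine(MatchesWindowsStyle(windowsLine)) ' prints True
        Console.WriteLine(MatchesWindowsStyle(unixLine))    ' prints False
    End Sub

End Module
```

If the second line prints False for a line copied from your own console output, the pattern has to be rewritten for the format your server actually returns.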

Edited 2 Years Ago by devdevil10

Try one of the following:

Version 1:

Dim pattern As String = "[-rwx]{10}\s+\d\s+\d\s+\d\s+\d+\s+(?<monthName>[A-Z][a-z]+)\s+(?<monthNumber>\d+)\s+(?<timeOfDay>\d{2}:\d{2})\s+(?<filename>\d{4}-\d{2}-\d{2}-\d{4}\.txt\.tgz).*"

Version 2:

Dim pattern As String = "[-rwx]{10}\s+\d\s+\d\s+\d\s+\d+\s+(?<monthName>[A-Z][a-z]+)\s+(?<monthNumber>\d+)\s+(?<timeOfDay>\d{2}:\d{2})\s+(?<filenameDatePart>\d{4}-\d{2}-\d{2})-(?<filenameTimePart>\d{4})\.txt\.tgz.*"

Version 3:

Dim pattern As String = "[-rwx]{10}\s+\d\s+\d\s+\d\s+\d+\s+(?<monthName>[A-Z][a-z]+)\s+(?<monthNumber>\d+)\s+(?<timeOfDay>\d{2}:\d{2})\s+(?<filenameDateYear>\d{4})-(?<filenameDateMonth>\d{2})-(?<filenameDateDay>\d{2})-(?<filenameTimePart>\d{4})\.txt\.tgz.*"

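Group names are "monthName", "monthNumber" (which captures the day of the month in this listing format), "timeOfDay", etc. Read a single group by name from a successful match: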

Console.WriteLine("monthName: '" + myMatch.Groups("monthName").Value + "'")

or

For Each g As Group In myGroups
    Console.WriteLine(g.Value)
Next
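
To make the named groups concrete, here is a small self-contained sketch that runs the Version 2 pattern (with the dots in ".txt.tgz" escaped) against one line of the sample listing from this thread and prints each group. The module and function names are hypothetical. Note that the pattern assumes single-digit link count, owner, and group columns, as in the sample data.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NamedGroupsDemo

    ' Version 2 pattern from above, with the literal dots escaped.
    Private Const Pattern As String = "[-rwx]{10}\s+\d\s+\d\s+\d\s+\d+\s+(?<monthName>[A-Z][a-z]+)\s+(?<monthNumber>\d+)\s+(?<timeOfDay>\d{2}:\d{2})\s+(?<filenameDatePart>\d{4}-\d{2}-\d{2})-(?<filenameTimePart>\d{4})\.txt\.tgz.*"

    ' Returns the value of one named group, or an empty string if the line does not match.
    Public Function GetGroupValue(ByVal line As String, ByVal groupName As String) As String
        Dim myMatch As Match = Regex.Match(line, Pattern)
        If Not myMatch.Success Then
            Return String.Empty
        End If
        Return myMatch.Groups(groupName).Value
    End Function

    Sub Main()
        ' Line taken from the sample data in this thread.
        Dim line As String = "-rw-r--r--   1 0     0      14490 Jul  3 19:10 2014-07-03-1800.txt.tgz"

        For Each name As String In New String() {"monthName", "monthNumber", "timeOfDay", "filenameDatePart", "filenameTimePart"}
            Console.WriteLine(name & ": '" & GetGroupValue(line, name) & "'")
        Next
    End Sub

End Module
```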

The code you provided was helpful for finding files that were created during a specified time window. But what I really want is to automatically get the last modified/created file from the FTP server. I don't want to keep modifying the regex every time I need a file. As mentioned in my earlier post, GetDateTimestamp didn't work either. Is there any alternative route to achieve this? Also, thanks for your effort, cgeier.

I don't understand why you would have to "modify the regex every time". From the data you provided, the last filename is the newest filename, so you just have to get the last filename. If the filenames could be listed in a different order, then you need to search through the list of files and determine which one is the newest.

The following code will find the newest filename given the filename format that you have provided (the names start with yyyy-MM-dd-HHmm, so a plain string comparison orders them chronologically):

    Public Function extractData(ByVal myData As String) As String

        Dim output As String = String.Empty
        Dim lastFilename As String = String.Empty
        Dim newestFilename As String = String.Empty

        'Dim pattern As String = "[-rwx]{10}\s+\d\s+\d\s+\d\s+\d+\s+(?<monthName>[A-Z][a-z]+)\s+(?<monthNumber>\d+)\s+(?<timeOfDay>\d{2}:\d{2})\s+(?<filenameDateYear>\d{4})-(?<filenameDateMonth>\d{2})-(?<filenameDateDay>\d{2})-(?<filenameTimePart>\d{4}).txt.tgz.*"
        'Dim pattern As String = "[-rwx]{10}\s+\d\s+\d\s+\d\s+\d+\s+(?<monthName>[A-Z][a-z]+)\s+(?<monthNumber>\d+)\s+(?<timeOfDay>\d{2}:\d{2})\s+(?<filename>\d{4}-\d{2}-\d{2}-\d{4}.txt.tgz).*"
        'Dim pattern As String = ".*(?<monthName>[A-Z][a-z]+)\s+(?<monthNumber>\d+)\s+(?<timeOfDay>\d{2}:\d{2})\s+(?<filename>\d{4}-\d{2}-\d{2}-\d{4}.txt.tgz).*"

        Dim pattern As String = ".*(?<filename>\d{4}-\d{2}-\d{2}-\d{4}\.txt\.tgz).*"

        'create new instance of RegEx
        Dim myRegex As New Regex(pattern)

        'create new instance of Match
        Dim myMatch As Match = myRegex.Match(myData)

        'keep looking for matches until no more are found
        While myMatch.Success

            'get groups
            Dim myGroups As GroupCollection = myMatch.Groups

            'For Each groupName As String In myRegex.GetGroupNames
            'output += String.Format("Group: '{0}' Value: {1}", groupName, myGroups(groupName).Value)
            'output += System.Environment.NewLine
            'Next

            'Console.WriteLine(output)

            lastFilename = myGroups("filename").Value

            If Not String.IsNullOrEmpty(newestFilename) Then
                If lastFilename > newestFilename Then
                    newestFilename = lastFilename
                End If
            Else
                newestFilename = myGroups("filename").Value
            End If

            'Console.WriteLine("lastFilename: " + lastFilename + " newestFilename: " + newestFilename)


            'get next match
            myMatch = myMatch.NextMatch()
        End While

        Return newestFilename
    End Function

Usage:

Dim lastFilename as String = extractData(myData)

Example usage:

        Dim myData As String = String.Empty

        myData = "-rw-r--r--   1 0     0      14490 Jul  3 19:10 2014-07-03-1800.txt.tgz" + System.Environment.NewLine
        myData += "-rw-r--r--   1 0     0     30250 Jul  3 20:10 2014-07-03-1900.txt.tgz" + System.Environment.NewLine
        myData += "-rw-r--r--   1 0     0      8365 Jul  5 11:10 2014-07-05-1000.txt.tgz" + System.Environment.NewLine
        myData += "-rw-r--r--   1 0     0     31712 Jul  5 12:10 2014-07-05-1100.txt.tgz" + System.Environment.NewLine
        myData += "-rw-r--r--   1 0     0      7870 Jul  6 09:10 2014-07-06-0800.txt.tgz" + System.Environment.NewLine
        myData += "-rw-r--r--   1 0     0     19665 Jul  6 18:10 2014-07-05-1700.txt.tgz" + System.Environment.NewLine
        myData += "-rw-r--r--   1 0     0      6309 Jul  6 08:10 2014-07-06-0700.txt.tgz" + System.Environment.NewLine


        Dim lastFilename as String = extractData(myData)
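
The string comparison in extractData works because the filenames start with a zero-padded yyyy-MM-dd-HHmm stamp. As an alternative sketch (not the code from the attachment), the stamp can be parsed into a DateTime and compared as a date, which also skips names whose digits do not form a valid date. The module name, function name, and the "stamp" group name are hypothetical; the sample lines are taken from the example data above.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module NewestByParsedDate

    ' Returns the filename with the latest embedded timestamp,
    ' or an empty string if no filename matches.
    Public Function FindNewestFilename(ByVal listing As String) As String
        Dim newestName As String = String.Empty
        Dim newestStamp As DateTime = DateTime.MinValue

        For Each m As Match In Regex.Matches(listing, "(?<stamp>\d{4}-\d{2}-\d{2}-\d{4})\.txt\.tgz")
            Dim stamp As DateTime

            ' Skip names whose digits do not form a valid date (for example month 13).
            If DateTime.TryParseExact(m.Groups("stamp").Value, "yyyy-MM-dd-HHmm", CultureInfo.InvariantCulture, DateTimeStyles.None, stamp) Then
                If newestName.Length = 0 OrElse stamp > newestStamp Then
                    newestStamp = stamp
                    newestName = m.Value
                End If
            End If
        Next

        Return newestName
    End Function

    Sub Main()
        ' Three lines from the example data above, deliberately out of order.
        Dim listing As String = "-rw-r--r--   1 0     0      7870 Jul  6 09:10 2014-07-06-0800.txt.tgz" & Environment.NewLine &
                                "-rw-r--r--   1 0     0     19665 Jul  6 18:10 2014-07-05-1700.txt.tgz" & Environment.NewLine &
                                "-rw-r--r--   1 0     0      6309 Jul  6 08:10 2014-07-06-0700.txt.tgz"

        Console.WriteLine(FindNewestFilename(listing)) ' prints 2014-07-06-0800.txt.tgz
    End Sub

End Module
```

This picks the newest file by the date in its name, not by the modification time shown in the listing; for the sample data the two differ for 2014-07-05-1700.txt.tgz.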

Usage of code in "Module1":

Dim lastFilename as String = extractData(getTestData())

Edited 2 Years Ago by cgeier

Apologies for misunderstanding your code (I am still learning :)). I am on a very tight schedule at the moment, so I'll check out your code this weekend and report back with the results. Once again, thank you for your help, cgeier.

The following should work:

    Private _filenamePattern As String = ".*(?<filename>\d{4}-\d{2}-\d{2}-\d{4}\.txt\.tgz).*"

    Private _ftpRequest As FtpWebRequest = Nothing

    Private _downloadDirectory As String = String.Empty
    Private _ftpUrl As String = String.Empty
    Private _password As String = "user@name.com"
    Private _username As String = "anonymous"

    'Enum
    Private Enum FtpMethod
        ListFiles = 0
        Download = 1
    End Enum

    Private Function getNewestFilename(ByVal directoryData As String) As String

        Dim output As String = String.Empty
        Dim lastFilename As String = String.Empty
        Dim newestFilename As String = String.Empty

        'create new instance of RegEx
        Dim myRegex As New Regex(_filenamePattern)

        'create new instance of Match
        Dim myMatch As Match = myRegex.Match(directoryData)

        'keep looking for matches until no more are found
        While myMatch.Success

            'get groups
            Dim myGroups As GroupCollection = myMatch.Groups

            'For Each groupName As String In myRegex.GetGroupNames
            'output += String.Format("Group: '{0}' Value: {1}", groupName, myGroups(groupName).Value)
            'output += System.Environment.NewLine
            'Next

            'Console.WriteLine(output)

            lastFilename = myGroups("filename").Value

            If Not String.IsNullOrEmpty(newestFilename) Then
                If lastFilename > newestFilename Then
                    newestFilename = lastFilename
                End If
            Else
                newestFilename = myGroups("filename").Value
            End If

            'Console.WriteLine("lastFilename: " + lastFilename + " newestFilename: " + newestFilename)


            'get next match
            myMatch = myMatch.NextMatch()
        End While

        If String.IsNullOrEmpty(newestFilename) Then
            newestFilename = "No matches found for specified pattern."
        End If

        Return newestFilename
    End Function

    Private Function CreateFtpWebRequest(ByVal ftpUrl As String, ByVal username As String, ByVal password As String, ByVal method As FtpMethod, ByVal keepAlive As Boolean) As Stream

        'defined as Private
        _ftpRequest = DirectCast(WebRequest.Create(New Uri(ftpUrl)), FtpWebRequest)

        'either download the file or
        'list the files in the directory
        If method = FtpMethod.ListFiles Then
            'list files in directory
            _ftpRequest.Method = WebRequestMethods.Ftp.ListDirectoryDetails
        ElseIf method = FtpMethod.Download Then
            'download file
            _ftpRequest.Method = WebRequestMethods.Ftp.DownloadFile
        End If

        'keep the control connection open (or not) after the request completes
        _ftpRequest.KeepAlive = keepAlive

        'set username and password
        _ftpRequest.Credentials = New NetworkCredential(username, password)

        'return Stream
        Return _ftpRequest.GetResponse().GetResponseStream()

    End Function

    Public Function getFileFromFtpServer() As String

        'set bufferSize to 256 kb
        Dim bufferSize As Integer = 256 * 1024

        'allocate buffer of bufferSize bytes
        '(VB array bounds are inclusive, hence bufferSize - 1)
        Dim buffer(bufferSize - 1) As Byte

        'number of bytes read to buffer
        Dim bytesIn As Integer = 0

        Dim ftpResponseStream As Stream = Nothing
        Dim ftpReader As StreamReader = Nothing
        Dim directoryListing As String = String.Empty
        Dim filename As String = String.Empty
        Dim fqFilename As String = String.Empty
        Dim fileLength As Long = 0
        Dim output As IO.Stream = Nothing
        Dim outputFilename As String = String.Empty

        'total bytes received
        Dim totalBytesIn As Integer = 0

        Dim errMsg As String = String.Empty

        If String.IsNullOrEmpty(_ftpUrl) Then
            errMsg = "Error: " + "FtpUrl not set."
            Return errMsg
        End If

        If String.IsNullOrEmpty(_downloadDirectory) Then
            errMsg = "Error: " + "DownloadDirectory not set."
            Return errMsg
        End If

        'create new request to get list of files
        'FtpMethod is an Enum created above
        ftpResponseStream = CreateFtpWebRequest(FtpUrl, Username, Password, FtpMethod.ListFiles, False)

        ftpReader = New StreamReader(ftpResponseStream)

        'get list of files
        directoryListing = ftpReader.ReadToEnd

        'close StreamReader
        ftpReader.Close()

        'close Stream
        ftpResponseStream.Close()

        'get newest filename
        filename = getNewestFilename(directoryListing)

        If filename.Length > 0 AndAlso Not filename.StartsWith("No matches found") Then

            'need to append filename to ftpUrl
            'in order to download the file

            If _ftpUrl.EndsWith("/") Then
                fqFilename = FtpUrl + filename
            Else
                fqFilename = FtpUrl + "/" + filename
            End If


            Console.WriteLine("fqFilename: " & fqFilename)

            'create new request to download fqFilename
            'FtpMethod is an Enum created above
            ftpResponseStream = CreateFtpWebRequest(fqFilename, Username, Password, FtpMethod.Download, True)

            'get file length
            fileLength = _ftpRequest.GetResponse().ContentLength

            Console.WriteLine("file length: " & fileLength)

            'file to save to
            If DownloadDirectory.EndsWith("\") Then
                outputFilename = DownloadDirectory + filename
            Else
                outputFilename = DownloadDirectory + "\" + filename
            End If

            'create filename on local computer
            output = System.IO.File.Create(outputFilename)

            Do

                'read bytes to buffer
                'and get actual # of bytes read
                bytesIn = ftpResponseStream.Read(buffer, 0, bufferSize)

                'Console.WriteLine("bytesIn: " & bytesIn)

                If bytesIn > 0 Then
                    'write buffer to file
                    output.Write(buffer, 0, bytesIn)

                    totalBytesIn += bytesIn

                    Console.WriteLine("Downloaded: " & totalBytesIn & " of " & fileLength)

                    'prevent unresponsiveness
                    'it is better to run this Function
                    'in a BackgroundWorkerThread.
                    'Then 'Application.DoEvents()' can be deleted
                    Application.DoEvents()
                End If
            Loop Until bytesIn < 1

        Else
            Console.WriteLine(filename)
        End If

        'close Stream(s)
        If Not output Is Nothing Then
            output.Close()
        End If

        If Not ftpResponseStream Is Nothing Then
            ftpResponseStream.Close()
        End If


        buffer = Nothing
        _ftpRequest = Nothing

        Return totalBytesIn.ToString()
    End Function
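
The read/write loop in getFileFromFtpServer can also be pulled out into a small helper, which makes it easy to try without an FTP server. The sketch below is one possible way to do that, not the code from the attachment: CopyStream and the module name are hypothetical, and in-memory streams stand in for the FTP response stream and the local file. Wrapping the streams in Using blocks closes them even if an exception is thrown partway through, which the function above only does on the success path.

```vbnet
Imports System
Imports System.IO

Module StreamCopyDemo

    ' Copies everything from source to destination and returns the number of bytes copied.
    Public Function CopyStream(ByVal source As Stream, ByVal destination As Stream, Optional ByVal bufferSize As Integer = 256 * 1024) As Long
        ' VB array bounds are inclusive, so (bufferSize - 1) gives exactly bufferSize bytes.
        Dim buffer(bufferSize - 1) As Byte
        Dim total As Long = 0

        ' Read returns 0 once the end of the stream is reached.
        Dim bytesIn As Integer = source.Read(buffer, 0, buffer.Length)
        While bytesIn > 0
            destination.Write(buffer, 0, bytesIn)
            total += bytesIn
            bytesIn = source.Read(buffer, 0, buffer.Length)
        End While

        Return total
    End Function

    Sub Main()
        Dim payload() As Byte = {1, 2, 3, 4, 5}

        ' In-memory streams stand in for the FTP response stream and the local file.
        Using source As New MemoryStream(payload), destination As New MemoryStream()
            ' A tiny buffer forces several loop iterations.
            Dim copied As Long = CopyStream(source, destination, 2)
            Console.WriteLine("Copied " & copied & " bytes") ' prints Copied 5 bytes
        End Using
    End Sub

End Module
```

In the real download, source would be the stream returned by CreateFtpWebRequest and destination the stream from System.IO.File.Create.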

I've attached the files. There is a version that uses backgroundWorker. Just call "downloadFile" (in Module1.vb).


Wow, that was incredible code, and the resource links were also spot on. I have come back to confirm that the code works like a charm! This must have taken a lot of your time, and I am grateful for your help. Thanks a lot, cgeier.
