Nested Recursion is Fun! v.Help me find redundancy

Discussion in 'OT Technology' started by CodeX, Feb 5, 2010.

  1. CodeX

    CodeX Guest

    In my program I display a hierarchical list of all valid files on the computer, that is, files that can be loaded by the application. I do this by recursively searching the entire hard disk.

    This in and of itself was not too hard to figure out, but I recently modified it to run this recursive search constantly, in a background thread, to actively monitor changes to the hard drive and update the file list in real time. It works beautifully, with any change I make in windows explorer appearing almost instantly in my application. However, now I have several levels of nested recursion to accomplish this and for some reason I think there is redundancy in the algorithm that could be eliminated to reduce processor usage.

    So, I'm going to post this, feel free to use this technique if you want I don't care, but I hope someone who is good with recursion will look this over and see if anything can be cut out of it.

    TIA.

    Code:
    
    Public HDDMonitor As Thread
    
    Delegate Function TreeViewNodesAdd_Delegate(ByVal path As String, ByVal name As String, ByVal IconIndex As Integer) As Windows.Forms.TreeNode
    
    Delegate Function TreeViewNodesAddIndirect_Delegate(ByVal ParentNode As Windows.Forms.TreeNode, ByVal path As String, ByVal name As String, ByVal IconIndex As Integer) As Windows.Forms.TreeNode
    
    
        Public Sub AfterMainLoad(ByVal sender As Object, ByVal e As System.EventArgs) Handles MainLoadTimer.Tick
    
            .....
    
            HDDMonitor = New Thread(AddressOf HDDMonitorMain)
            HDDMonitor.Priority = ThreadPriority.Lowest
            HDDMonitor.IsBackground = True
            HDDMonitor.Start()
    
            ....
    
        End Sub 'Delayed start after form load (to make sure it draws quickly)
    
    
         Public Sub HDDMonitorMain()
            While 1
                'Check for new files and files that no longer exist on the disk and update the file list as necessary
                'The Sleeps prevent 100% CPU usage, file list is still updated within ~1 second of change to disk (depends on number of files in list)
                UpdateFileList(TreeView1.Nodes(0), "###")
                Thread.Sleep(100)
                UpdateFileList(TreeView1.Nodes(1), ".OSA")
                Thread.Sleep(100)
                UpdateFileList(TreeView1.Nodes(2), ".LTS")
                Thread.Sleep(100)
                UpdateFileList(TreeView1.Nodes(3), ".BMP")
                Thread.Sleep(100)
            End While
        End Sub 
    
        Public Sub UpdateFileList(ByVal ParentNode As TreeNode, ByVal TypeExt As String)
            Dim Files() As String
            Dim Folders() As String
    
            Try
    
                'Recurse directories, remove nodes that no longer reference a valid file, remove directory nodes that do not contain valid files
                For Each ChildNode As TreeNode In ParentNode.Nodes
                    If Directory.Exists(ChildNode.Name) Then
                        If ProbeForValidFiles(ChildNode.Name, TypeExt) Then
                            UpdateFileList(ChildNode, TypeExt) 'Recursive component of this, digs down to the lowest subdirectory, work is done from the lowest subdirectory up to the root node that was passed in initially
                        Else
                            ChildNode.Remove()
                        End If
                    ElseIf Not File.Exists(ChildNode.Name) Then 'Removes files/folders that no longer exist/are empty
                        If ChildNode.Parent.Nodes.Count = 1 Then
                            ChildNode.Parent.Remove() 'Handles removal of empty directories
                        Else
                            ChildNode.Remove() 'Remove node that no longer links to a valid file
                        End If
                    End If
                Next
    
                'Get list of files and folders in the current directory
                Folders = Directory.GetDirectories(ParentNode.Name)
                Files = Directory.GetFiles(ParentNode.Name)
    
                'For each subfolder in this folder check to see if it is in the file list
                ' If not, check to see if it contains valid files, if so add it
                For Each Folder As String In Folders
                    Dim Found As Boolean = False
    
                    'Check to see if this folder exists in the treeview
                    For Each Node As TreeNode In ParentNode.Nodes
                        If String.Compare(Folder, Node.Name) Then Continue For
                        Found = True
                        Exit For
                    Next
                    If Found Then Continue For
    
                    'If we get here then the folder was not found in the tree, check it for files and add if necessary
                    If ProbeForValidFiles(Folder, TypeExt) Then
                        AddNewFolder(Folder, ParentNode, TypeExt)
                    End If
                Next
    
                'For each file in this directory check to see if it is a valid file and exists in the tree
                For Each File As String In Files
                    Dim Found As Boolean = False
    
                    If Not ExtensionEquals(File, TypeExt) Then Continue For
    
                    'This "File" should be a child of "ParentNode"... so look for it, if not then add it
                    For Each Node As TreeNode In ParentNode.Nodes
                        If String.Compare(File, Node.Name) Then Continue For
                        Found = True
                        Exit For
                    Next
    
                    'If it was not found, add it.
                    If Not Found Then
                        TreeView1.Invoke(New TreeViewNodesAddIndirect_Delegate(AddressOf TreeViewNodesAddIndirect), New Object() {ParentNode, System.IO.Path.GetFullPath(File), System.IO.Path.GetFileName(File), 12})
                    End If
                Next
    
            Catch
            End Try
        End Sub 'Recursive HDD monitor function
    
        'Adds a new folder and all of its valid contents to the tree
        Private Sub AddNewFolder(ByVal FolderPath As String, ByVal ParentNode As TreeNode, ByVal TypeExt As String)
    
            Dim SubFolders() As String = Directory.GetDirectories(FolderPath)
            Dim Files() As String = Directory.GetFiles(FolderPath)
    
            'Add this folder to the tree
            ParentNode = TreeView1.Invoke(New TreeViewNodesAddIndirect_Delegate(AddressOf TreeViewNodesAddIndirect), New Object() {ParentNode, System.IO.Path.GetFullPath(FolderPath), Path.GetFileNameWithoutExtension(FolderPath), 10})
    
            'Recurse subdirectories
            For Each SubFolder As String In SubFolders
                AddNewFolder(SubFolder, ParentNode, TypeExt)
            Next
    
            'Add valid Files
            For Each File As String In Files
                If Not ExtensionEquals(File, TypeExt) Then Continue For
                TreeView1.Invoke(New TreeViewNodesAddIndirect_Delegate(AddressOf TreeViewNodesAddIndirect), New Object() {ParentNode, System.IO.Path.GetFullPath(File), System.IO.Path.GetFileName(File), 12})
            Next
    
        End Sub
    
        'Returns true if recursive search of this directory path finds valid files
        Private Function ProbeForValidFiles(ByVal FolderPath, ByVal TypeExt) As Boolean
    
            'Get list of folders and files contained in this folder
            Dim SubFolders() As String = Directory.GetDirectories(FolderPath)
            Dim Files() As String = Directory.GetFiles(FolderPath)
    
            'If no folders or files in this folder return false
            If SubFolders.Length = 0 And Files.Length = 0 Then Return False
    
            'If there are folders in this folder, recursively search each of them for valid files, return true as soon as one is found
            If SubFolders.Length > 0 Then
                For Each SubFolder As String In SubFolders
                    If ProbeForValidFiles(SubFolder, TypeExt) Then Return True
                Next
            End If
    
            'If there are files in this folder check each of them against the type we are looking for, return true as soon as one valid file found
            If Files.Length > 0 Then
                For Each File As String In Files
                    If Not ExtensionEquals(File, TypeExt) Then Continue For
                    Return True
                Next
                Return False
            End If
    
        End Function
    
        Public Function TreeViewNodesAdd(ByVal path As String, ByVal name As String, ByVal IconIndex As Integer) As System.Windows.Forms.TreeNode
            Return TreeView1.Nodes.Add(path, name, IconIndex, IconIndex)
        End Function 'Delegate to add node to TreeView root
    
        Public Function TreeViewNodesAddIndirect(ByVal ParentNode As Windows.Forms.TreeNode, ByVal path As String, ByVal name As String, ByVal IconIndex As Integer) As Windows.Forms.TreeNode
            Return ParentNode.Nodes.Add(path, name, IconIndex, IconIndex)
        End Function 'Delegate to add node to TreeView child node
    
        Public Function ExtensionEquals(ByVal FilePath As String, ByVal Compare As String) As Boolean
    
            Dim Extension As String = Path.GetExtension(FilePath).ToUpper
            If Extension.Length <> 4 Then Return False
    
            If Compare = "###" Then
                If String.Compare(Extension, ".SOR") = 0 Or (Char.IsDigit(Extension(1)) And Char.IsDigit(Extension(2)) And Char.IsDigit(Extension(3))) Then Return True
                Return False
            End If
    
            If String.Compare(Extension, Compare) = 0 Then Return True
            Return False
    
        End Function
    
     
    Last edited by a moderator: Feb 5, 2010
  2. CodeX

    CodeX Guest

    One thing I see is that the ProbeForValidFiles function only really needs to check each leaf node, because once the leaf evaluates true for that function then all of its parent nodes will be true as well.

    The problem is, I don't know where the leaf nodes are, because I could simultaneously be adding new ones... What I could do is run a pre-process that drills straight down each path until it finds a directory where that function would return false, and then somehow remember that each node on the tree above that point should return true. By remembering this I would no longer have to run that function for each node on the tree, which is what it is doing now...

    Of course, like I said, this ignores new leaf nodes that may be added in the same iteration... but that's okay since they will be picked up on the next, it will just add a delay when a file is created on the disk that would end up being a leaf node in this tree structure. I guess the question is whether this delay is worth the marginal reduction in processor usage from this optimization.
     
  3. GOGZILLA

    GOGZILLA Double-Uranium Member

    Joined:
    Jan 16, 2003
    Messages:
    10,760
    Likes Received:
    3
    Location:
    Plantation, FL
  4. ChosenGSR

    ChosenGSR Mama always said you'd be the chosen one

    Joined:
    Oct 24, 2001
    Messages:
    50,971
    Likes Received:
    209
    Location:
    HoCo, MD
    I didn't take the time to look into your code but the approach seems wrong to me all together. I would probably do an initial scan to get the list of available files, then instantiate a file watcher that would watch the entire drive if so you desire. Otherwise you're hammering away at your hard drive non stop.
     
  5. CodeX

    CodeX Guest

    That is exactly what it does...
     
  6. CodeX

    CodeX Guest

  7. ChosenGSR

    ChosenGSR Mama always said you'd be the chosen one

    Joined:
    Oct 24, 2001
    Messages:
    50,971
    Likes Received:
    209
    Location:
    HoCo, MD
    I am confused by your two last posts :rofl:
     
  8. CodeX

    CodeX Guest

    Well, what do you think that FileSystemWatcher class does? It probably does exactly what I am doing, it's just pre-built.

    Also, I didn't go into detail about your last post, but no, it does not use the hard drive very much at all. When the app starts there is a flurry of hard drive activity as it scans it for the first time, after that it is almost only read access and even then it almost never occurs as the data is cached until something on the file system changes.

    I'm almost positive that what I am doing is the same thing that the FileSystemWatcher does, except the OS handles it for you so I am just doubling up on the resources used.

    So anyway it was a helpful suggestion and I didn't know about it and appreciate the advice :wavey:
     
  9. GOGZILLA

    GOGZILLA Double-Uranium Member

    Joined:
    Jan 16, 2003
    Messages:
    10,760
    Likes Received:
    3
    Location:
    Plantation, FL
    probably what the filewatcher does is have a hook into the OS that generates an event in the case of a file write which is what youre looking for. Then it probably looks at the list of dirs/files you want to be notified of and if theres a match lets you know.
     
  10. CodeX

    CodeX Guest

    Right, and what my implementation does is read the contents of the directories I care about, which is then cached until a change is made, and then new data is read. There really is no hard drive activity caused by my implementation until files are changed on the disk. So while they may be implemented differently I don't think there is an appreciable difference in hard drive usage, but I am sure using already-existent system notifications would be more efficient processor usage wise.

    In any case I'll definitely be looking into it on monday, thanks again!
     
  11. ChosenGSR

    ChosenGSR Mama always said you'd be the chosen one

    Joined:
    Oct 24, 2001
    Messages:
    50,971
    Likes Received:
    209
    Location:
    HoCo, MD
    How do you know changes have been made?
     
  12. ChosenGSR

    ChosenGSR Mama always said you'd be the chosen one

    Joined:
    Oct 24, 2001
    Messages:
    50,971
    Likes Received:
    209
    Location:
    HoCo, MD
    That's exactly how it works. It's a managed wrapper for windows apis that raise file i/o events. It's how every single system that needs file watching capability is written in .NET.
     
  13. CodeX

    CodeX Guest

    The caching I am referring to must be handled by the OS... I am only speculating about this. The code I wrote would appear to read from the hard drive constantly, but it does not, only when a file is changed in a directory that it is monitoring, so I assume when I request the file lists from a directory the OS provides my application with a cached copy in memory if nothing has changed since the last time I asked.
     
  14. GOGZILLA

    GOGZILLA Double-Uranium Member

    Joined:
    Jan 16, 2003
    Messages:
    10,760
    Likes Received:
    3
    Location:
    Plantation, FL
    btw thats vb.net right?
     
  15. CodeX

    CodeX Guest

    yeah, should have mentioned that in the first post probably :o
     
  16. ChosenGSR

    ChosenGSR Mama always said you'd be the chosen one

    Joined:
    Oct 24, 2001
    Messages:
    50,971
    Likes Received:
    209
    Location:
    HoCo, MD
    Well as it has been stated the proper solution is to use a file watcher. My only advice to you is in the future if you're trying to solve a problem first do a little research, it's almost guaranteed that there is a solution out there, one that is typically recognized as the "tried and true" or a "standard". This is something that I did not understand when I was a junior developer for the first few years working for a small company.
     
  17. CodeX

    CodeX Guest

    That's good advice, but don't discount the value in learning to implement things on your own, you'll find you learn a lot more and end up much more capable when you don't build all of your applications out of lego bricks. I'm an embedded C/asm developer in a VB.NET drag-and-drop world here. I see the value in the simplicity of it, but despise those that rely on pre-built implementations to the point that those implementations might as well be magic to them.

    I'm sure there are many many developers out there who rely on the tried and true solution so often they wouldn't be able to implement or modify them if the need arose. The fact is, the way I was doing it worked, it is less efficient than the method provided by microsoft, but it does not constantly exercise the hard drive (as was suggested) and it has low processing overhead. If this FileWatcher class was not pointed out to me I would have released what I had and I am sure no end user would have ever known the difference.
     
    Last edited by a moderator: Feb 10, 2010

Share This Page