fileutil_traverse(n) 0.6 tcllib "file utilities"

Name

fileutil_traverse - Iterative directory traversal

Synopsis

  • package require Tcl 8.3
  • package require fileutil::traverse ?0.6?
  • package require fileutil
  • package require control

Description

This package provides objects for the programmable traversal of directory hierarchies. The main command exported by the package is:

::fileutil::traverse ?objectName? path ?option value...?

The command creates a new traversal object with an associated global Tcl command whose name is objectName. This command may be used to invoke various operations on the traverser. If the string %AUTO% is used as the objectName then a unique name will be generated by the package itself.

Regarding the recognized options see section OPTIONS. Note that all these options can be set only during the creation of the traversal object. Changing them later is not possible and causes errors to be thrown if attempted.
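
As a rough illustration of object creation, the following sketch lets the package choose the command name via %AUTO% and then checks that a later attempt to change an option is rejected. The base directory [pwd] and the callback name AcceptAll are assumptions made only for this example.

```tcl
package require fileutil::traverse

# Let the package pick a unique command name and keep it in a variable.
# [pwd] is only an illustrative base directory.
set t [fileutil::traverse %AUTO% [pwd]]

# AcceptAll is a hypothetical callback used only for this demonstration.
proc AcceptAll {path} { return 1 }

# Options are fixed at creation time, so this is expected to raise an error.
set failed [catch {$t configure -filter AcceptAll} msg]
puts "reconfiguration rejected: $failed"

# Release the object and its command when done.
$t destroy
```

Keeping the generated name in a variable avoids clashes with existing global commands when several traversers are alive at the same time.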

The object command has the following general form:

$traverser command ?arg arg ...?

Command and its arguments determine the exact behavior of the object.

The following commands are possible for traversal objects:

$traverser files

This is the highest-level method provided by traversal objects. When invoked, it returns a list containing the names of all files and directories matching the current configuration of the traverser.
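
A minimal sketch of the files method, assuming the caller simply wants a sorted listing. The helper name ListAll and the use of [pwd] as the base directory are illustrative choices, not part of the package.

```tcl
package require fileutil::traverse

# Return a sorted list of every path found below dir.
# ListAll is a hypothetical helper name, not part of the package.
proc ListAll {dir} {
    set t [fileutil::traverse %AUTO% $dir]
    set paths [$t files]
    $t destroy
    return [lsort $paths]
}

foreach p [ListAll [pwd]] {
    puts $p
}
```

Because the whole result is built in memory before it is returned, this form is probably best suited to hierarchies of moderate size.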

$traverser foreach filevar script

The high-level files method (see above) is built on this mid-level method. When invoked, it finds all files and directories matching the current configuration and executes script for each path. The path currently under consideration is stored in the variable named by filevar. Both the variable and the script belong to the context of the caller of the method, which is also where the script is executed. In the files method the script simply appends each found path to the list to return.
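
Since the script runs in the caller's context, it can update the caller's local variables directly. The sketch below uses that to add up file sizes. TotalSize is a hypothetical helper, and [pwd] is only an example base directory.

```tcl
package require fileutil::traverse

# Sum the sizes of all regular files found below dir.
# TotalSize is a hypothetical helper, not part of the package.
proc TotalSize {dir} {
    set total 0
    set t [fileutil::traverse %AUTO% $dir]
    # The script runs in this proc's scope, so it can update total directly.
    $t foreach path {
        if {[file isfile $path]} {
            incr total [file size $path]
        }
    }
    $t destroy
    return $total
}

puts "[TotalSize [pwd]] bytes"
```

Note that a file reachable through several symbolic links may be visited, and therefore counted, more than once.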

$traverser next filevar

This is the lowest-level interface to the traverser, the core on which all higher-level methods are built. When invoked, it returns a boolean value indicating whether it found a path matching the current configuration (True), or not (False). If a path was found, it is stored in the variable named by filevar, in the context of the caller.

The foreach method simply calls this method in a loop until it returns False. This method is exposed so that a directory hierarchy can also be traversed incrementally, in an event-based manner.
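
One possible shape for such an event-based traversal is sketched below: each event-loop callback fetches a single path and then reschedules itself, so other events can run in between. The names Step, ::seen and ::done are assumptions made for this example.

```tcl
package require fileutil::traverse

# Handle one path per event-loop callback so other events can interleave.
# Step, ::seen and ::done are hypothetical names used only in this sketch.
proc Step {t} {
    if {[$t next path]} {
        lappend ::seen $path
        # Reschedule ourselves for the next path.
        after 0 [list Step $t]
    } else {
        # Traversal finished: clean up and wake the waiting vwait.
        $t destroy
        set ::done 1
    }
}

set ::seen {}
set t [fileutil::traverse %AUTO% [pwd]]
after 0 [list Step $t]
vwait ::done
puts "visited [llength $::seen] paths"
```

In a long-running application the vwait would normally be replaced by the existing event loop, for example the one provided by Tk.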

Note that the traverser does follow symbolic links, except when doing so would cause it to enter a link-cycle. In other words, the command takes care to not lose itself in infinite loops upon encountering circular link structures. Note that even links which are not followed will still appear in the result.

OPTIONS

-prefilter command_prefix

This callback is executed for directories. Its result determines if the traverser recurses into the directory or not. The default is to always recurse into all directories. The callback is invoked with a single argument, the absolute path of the directory, and has to return a boolean value, True when the directory passes the filter, and False if not.
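
As an illustration, the following sketch keeps the traverser out of version-control directories. The callback name SkipVcs and the list of directory names are assumptions chosen for this example.

```tcl
package require fileutil::traverse

# Prefilter callback: return False for directories the traverser
# should not descend into. SkipVcs and the name list are illustrative.
proc SkipVcs {path} {
    set skip {.git .svn CVS}
    return [expr {[lsearch -exact $skip [file tail $path]] < 0}]
}

set t [fileutil::traverse %AUTO% [pwd] -prefilter SkipVcs]
$t foreach p {
    puts $p
}
$t destroy
```

The prefilter only controls recursion. Whether a skipped directory itself is reported is presumably still decided by -filter.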

-filter command_prefix

This callback is executed for all paths. Its result determines whether the current path is a valid result, to be returned by next. The default is to accept all paths as valid. The callback is invoked with a single argument, the absolute path to check, and has to return a boolean value, True when the path passes the filter, and False if not.
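
Because the option takes a command prefix, fixed leading arguments can be supplied and the path is appended as the last argument. The sketch below uses this to report only paths with a given extension. HasExt is a hypothetical callback name.

```tcl
package require fileutil::traverse

# Filter callback with one fixed leading argument (the extension).
# The traverser appends the path to check as the final argument.
# HasExt is a hypothetical name used only in this sketch.
proc HasExt {ext path} {
    return [string equal [file extension $path] $ext]
}

set t [fileutil::traverse %AUTO% [pwd] -filter [list HasExt .tcl]]
set scripts [$t files]
$t destroy

foreach p [lsort $scripts] {
    puts $p
}
```

Directories rejected by the filter are left out of the result but are still descended into, as recursion is governed by -prefilter.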

-errorcmd command_prefix

This callback is executed for all paths the traverser has trouble with, such as being unable to change into them or to get their status. The default is to ignore any such problems. The callback is invoked with two arguments, the absolute path for which the error occurred, and the error message. Errors thrown by the filter callbacks are handled through this callback too. Errors thrown by the error callback itself are not caught and ignored, but are allowed to pass to the caller, i.e. whoever invoked next. Any other results from the callback are ignored.
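
A minimal sketch of an error callback that records problems instead of silently dropping them. The callback name NoteProblem and the global variable ::problems are assumptions made for this example.

```tcl
package require fileutil::traverse

# Error callback: remember each problematic path and its message.
# NoteProblem and ::problems are hypothetical names for this sketch.
proc NoteProblem {path msg} {
    lappend ::problems $path $msg
}

set ::problems {}
set t [fileutil::traverse %AUTO% [pwd] -errorcmd NoteProblem]
set paths [$t files]
$t destroy

# Report what was collected once the traversal has finished.
foreach {path msg} $::problems {
    puts stderr "$path: $msg"
}
```

Collecting the problems and reporting them afterwards keeps the callback itself trivial, which matters because errors raised inside it propagate to the caller.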

Warnings and Incompatibilities

0.4.4

In this version the traverser's broken system for handling symlinks was replaced with one working correctly and properly enumerating all the legal non-cyclic paths under a base directory.

While correct, this means that certain pathological directory hierarchies with cross-linked symbolic links will now take about O(n**2) time to enumerate, whereas the original broken code managed O(n) only because it missed paths.

A concrete example and extreme case is the "/sys" hierarchy under Linux where some hundred devices exist under both "/sys/devices" and "/sys/class" with the two sub-hierarchies linking to the other, generating millions of legal paths to enumerate. The structure, reduced to three devices, roughly looks like

	/sys/class/tty/tty0 --> ../../dev/tty0
	/sys/class/tty/tty1 --> ../../dev/tty1
	/sys/class/tty/tty2 --> ../../dev/tty2
	/sys/dev/tty0/bus
	/sys/dev/tty0/subsystem --> ../../class/tty
	/sys/dev/tty1/bus
	/sys/dev/tty1/subsystem --> ../../class/tty
	/sys/dev/tty2/bus
	/sys/dev/tty2/subsystem --> ../../class/tty

When having to handle such a pathological hierarchy it is recommended to use the -prefilter option to prevent the traverser from following symbolic links, like so:

    package require fileutil::traverse
    # Prefilter callback: refuse to recurse into symbolic links.
    proc NoLinks {fileName} {
        if {[string equal [file type $fileName] link]} {
            return 0
        }
        return 1
    }
    fileutil::traverse T /sys/devices -prefilter NoLinks
    T foreach p {
        puts $p
    }
    T destroy

Bugs, Ideas, Feedback

This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category fileutil of the Tcllib Trackers. Please also report any ideas for enhancements you may have for either package and/or documentation.

When proposing code changes, please provide unified diffs, i.e. the output of diff -u.

Note further that attachments are strongly preferred over inlined patches. Attachments can be made by going to the Edit form of the ticket immediately after its creation, and then using the left-most button in the secondary navigation bar.

Keywords

directory traversal, traversal

Category

Programming tools