#### List Processing

For most of the history of DOS, list processing was impossible. It only became practical with DOS 6, which introduced provisions for loading so much stuff high that some older programs broke because they couldn't run properly in the lowest 64K of memory. Microsoft introduced LOADFIX.COM as a kludge to fix that problem, and as a side effect it gave us a means of executing a program (even a batch file) for each item in the kinds of list that we can generate with batch techniques. There are other elements in the solution as well.

This came to my attention in a very imperfect form on Usenet (it is said to have been published in PC-Magazine, but since I refuse to read any magazine that is more cardboard than content I have no personal knowledge of this). As it came to my attention, the process required that LOADFIX.COM exist in c:\dos\ and that there not be any preexisting program named ENTER.COM or NEW.BAT in the default directory, in c:\dos\, or in the PATH ahead of c:\dos. I find these restrictions unacceptable and have restyled the approach to avoid them. My approach does have restrictions of its own: the current (and any given) directory must be writable, the convention that anything with a name beginning with '}' can be treated as transient and reusable must be observed, and LOADFIX must exist somewhere in the PATH.

Obviously, if we are building the list manually or with AWK or another high level language we can make our list entries have the form


call foo arguments

which allows the list to be used as a batch file that calls foo.BAT for each element in the list. However, it has not been possible to simulate that in batch generated lists because we couldn't prefix the line with "call foo" or induce CALL to respond to some other name. We have long been able to prefix each item in a list with things like the "Enter new ..." prompts from DATE and TIME and to create a batch file named ENTER.BAT that would execute for the first, but only the first, element in the resulting list: the invocation is a jump instead of a call, so control never returns to the original list file executing as a batch file. This trick has its uses of course, but none of them allowed processing multiple items.
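When the list is built by a program rather than by batch techniques, adding the "call foo" prefix is trivial. A minimal sketch in Python, standing in for AWK or any other high level language; `foo` and the item names are placeholders, not real programs:

```python
def make_call_list(program, items):
    """Return batch lines of the form 'call <program> <item>', skipping blank items."""
    return ["call " + program + " " + item for item in items if item.strip()]


if __name__ == "__main__":
    # Hypothetical list entries; in practice they would come from a file or a listing.
    for line in make_call_list("foo", ["alpha.txt", "beta.txt"]):
        print(line)
```

Writing the returned lines to a .BAT file gives a list that can be executed directly, which is exactly what the batch-only techniques below have to work so hard to imitate.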

The key elements in the LOADFIX solution to list processing are

(01) The output of some batch functions generates lists we would like to process
(02) DATE can prefix each line with a fixed string, the first two fields of which can be used as valid file names
(03) DATE hangs unless we put a valid date or a blank line at the end of the list file
(04) FIND "Enter" can clean up the output from DATE by passing only the prefixed list lines
(05) The filtered output from FIND is in a format that is valid batch language syntax, provided we make a program named ENTER(.BAT, .COM, or .EXE) available
(06) LOADFIX.COM can be copied and renamed ENTER.COM
(07) Since ENTER is a .COM program, it will return to the invoking batch file (ENTER.BAT would not)
(08) LOADFIX can execute a batch file if the name of the batch file is the first argument passed to it
(09) The first argument passed to ENTER.COM (LOADFIX.COM) is "new"
(10) We can provide a NEW.BAT to respond to the NEW command executed by ENTER.COM
(11) The remainder of the command tail passed to ENTER.COM will be passed to NEW.BAT
(12) The first two arguments passed to NEW.BAT are the remaining parts of the DATE prompt and must be discarded
(13) Two SHIFT commands will discard the first two arguments in NEW.BAT, or we can simply refer to the remaining arguments as %3 instead of %1, %4 instead of %2, etc.
(14) The remaining arguments are the original line generated in item (01)
(15) If we didn't make any mistakes implementing the previous 14 items, NEW.BAT will process each of the original list items, one at a time
(16) There must be at least two fields in the string passed to DATE

Now, that may be like going around your elbow to get to your thumb, but it does work, as will be demonstrated in the examples that follow.
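The text transformation in items (02) through (05) can be modelled outside DOS. The following Python sketch is only a model, not something to run under DOS: it assumes the English MS-DOS 6.22 message texts ("Current date is", "Invalid date") and that DATE echoes each redirected input line after its prompt. The function names are made up for illustration.

```python
# ASSUMPTION: English MS-DOS 6.22 message texts; other versions word them differently.
PROMPT = "Enter new date (mm-dd-yy): "


def date_output(list_lines):
    """Model `date < listfile`: each input line is echoed after the prompt."""
    out = ["Current date is Thu 07-04-1996"]   # illustrative banner line
    for line in list_lines:
        out.append(PROMPT + line)
        if line.strip() == "":
            break                               # a blank line ends DATE, item (03)
        out.append("Invalid date")              # a non-date is rejected; DATE asks again
    return out


def find_enter(lines):
    """Model `find "Enter"`: keep only the lines that contain the string, item (04)."""
    return [line for line in lines if "Enter" in line]


def list_to_batch(items):
    """List items plus the blank terminator -> lines of the generated batch file."""
    return find_enter(date_output(list(items) + [""]))


if __name__ == "__main__":
    for line in list_to_batch(["ALPHA", "BETA"]):
        print(line)
```

In this model the terminating blank line also produces a prompt line, one that carries no list item. If real DATE behaves the same way, the program run for each line should be prepared to be called once with no item at all.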

One thing to keep in mind while reading this is that when I describe an example, I haven't yet explored the thing that makes it different. This leads to placing the information about gotchas in the text for the examples rather than up front. And there are definitely gotchas here, though they are mostly predictable if one takes the time to thoroughly think the situation out. That is time I just don't have: the first version of this essay was written in what little time I could spare from digging flower beds over the four day 4th of July holiday ('96). I did come back to write this paragraph, and I will mention one serious gotcha: each call to NEW(.BAT, .COM, or .EXE) gets a separate environment with little or no free space.

A somewhat different approach to list processing, which uses the command processor instead of LOADFIX.COM and works in real DOS, Win95, and NT4, is illustrated in the page on Multi-OS Batch Programs.

Tom Levadas is doing some interesting things with PROMPT and broken commands to the command processor.

This first example is fairly trivial - it generates a list of subdirectories in a given directory in date order and does a DIR on each in turn.

As is often the case in these examples, I have made certain compromises in order to keep line length under 80 characters (to prevent line wrap on most displays) and commands under 64 characters.

If the program barfs with a syntax error on the FOR command, your PATH is longer than it needs to be and you should look into the concept of using batch wrappers around your programs.


@echo off
if !%1==!}{ goto %2
for %%a in (COM EXE BAT) do if exist enter.%%a %0 }{ err1 %%a
for %%a in (COM EXE BAT) do if exist new.%%a %0 }{ err2 %%a
echo if not exist %%1\loadfix.com goto end > }{.bat
echo copy %%1\loadfix.com enter.com >> }{.bat
echo :end >> }{.bat
for %%a in (%path%) do call }{ %%a
del }{.bat
if not exist enter.com goto err3
echo dir %%3> new.bat
dir %1 /b /a:d /o:d > }1{.dat
echo.>> }1{.dat
date < }1{.dat | find "Enter" > }2{.bat
call }2{
for %%a in (1{.dat 2{.bat) do del }%%a
for %%a in (new.bat enter.com) do del %%a
goto end
:err1
echo Unable to continue: ENTER.%3 exists in the default directory.
goto end
:err2
echo Unable to continue: NEW.%3 exists in the default directory.
goto end
:err3
echo Unable to continue: LOADFIX.COM is not in the PATH.
:end


This can be generalized to a module that accepts the names of a list file and a program to execute for each item in the list, and then does just that. The program argument must be the name of a .BAT, .COM, or .EXE file and must be the complete filespec, including the extension (if the file is in the default directory, it need not have the path part of the filespec, but it still must have the extension). The list file can have any extension, but must also be specified in such a way that it can be identified by the COPY command. Whatever the executable format (exe, com, or bat), the program must recognize that its third and following arguments are the list entry and that the first two arguments are just byproducts of the process of getting the list into usable form.


@echo off
if !%1==!}{ goto %2
for %%a in (COM EXE BAT) do if exist enter.%%a %0 }{ err1 %%a
for %%a in (COM EXE BAT) do if exist new.%%a %0 }{ err2 %%a
echo if not exist %%1\loadfix.com goto end > }{.bat
echo copy %%1\loadfix.com enter.com >> }{.bat
echo :end >> }{.bat
for %%a in (%path%) do call }{ %%a
del }{.bat
if not exist enter.com goto err3
copy %2 new.* > nul
copy %1 }1{.dat
echo.>> }1{.dat
date < }1{.dat | find "Enter" > }2{.bat
call }2{
for %%a in (1{.dat 2{.bat) do del }%%a
for %%a in (bat com exe) do if exist new.%%a del new.%%a
del enter.com
goto end
:err1
echo Unable to continue: ENTER.%3 exists in the default directory.
goto end
:err2
echo Unable to continue: NEW.%3 exists in the default directory.
goto end
:err3
echo Unable to continue: LOADFIX.COM is not in the PATH.
:end


Again, the program will fail if the path is too long - if the program is to be used only on one or more machines with LOADFIX in a known location, the search and copy code can be replaced with a simple absolute COPY command.

The program to execute can, of course, be a batch wrapper for any executable - the extra arguments can be discarded there:


@echo off
shift
shift
foo %1 %2 %3 %4 %5 %6 %7 %8 %9
:end


where foo is the program to execute.

The RENnnn.BAT example can be used to rename a sorted list of files generated by DIR with the /b switch and the /a: and/or /o: switches.

The COPY of the given program to NEW.* can also be eliminated by simply making the wrapper have the name NEW.BAT. The next example uses a hard coded location for LOADFIX and a wrapper named NEW.BAT with the RENnnn program to rename the list of files to the form BILLnnn.DOC.

NEW.BAT


@echo off
shift
shift
rennnn %1 m:\960704\x BILL DOC 999


LISTPROC.BAT


@echo off
if not exist enter.com goto err1
if not exist new.bat goto err2
if !%1==! goto err3
copy %1 }1{.dat > nul
echo.>> }1{.dat
date < }1{.dat | find "Enter" > }2{.bat
call }2{
for %%a in (1{.dat 2{.bat) do del }%%a
goto end
:err1
echo Unable to continue: ENTER.COM not in the default directory.
goto end
:err2
echo Unable to continue: NEW.BAT not in the default directory.
goto end
:err3
echo Unable to continue: No file list passed as an argument.
:end


Note that this version does not produce any "1 file(s) copied" messages: "> nul" following a COPY command suppresses them. It isn't possible to do the same for a COPY in a batch file created by use of the ECHO command, since the '>' is interpreted as the redirection operator on the ECHO line itself.

We can also deal with the need for a program named NEW(.COM, .EXE, or .BAT) by accepting the name of the actual program, FOOBAR.BAT for example, and creating a wrapper named NEW.BAT around it with the list processor:


if not exist new.bat goto err2
if !%1==! goto err3
copy %1 }1{.dat > nul
echo.>> }1{.dat
date < }1{.dat | find "Enter" > }2{.bat
call }2{
for %%a in (1{.dat 2{.bat) do del }%%a
goto end

becomes

if !%1==! goto err3
copy %1 }1{.dat > nul
echo.>> }1{.dat
date < }1{.dat | find "Enter" > }2{.bat
echo shift > new.bat
echo shift >> new.bat
echo %2 %%1 %%2 %%3 %%4 %%5 %%6 %%7 %%8 %%9 >> new.bat
call }2{
for %%a in (}1{.dat }2{.bat new.bat) do del %%a
goto end


That takes two arguments: the list file name and the program name and arranges for the program to receive up to nine fields from the original list as arguments (nine is the maximum we can reference in a batch file, so NEW.BAT is limited to passing just those). Note that the structure of the next to last line is subtly different.

One useful program based on the above is SWEEP.BAT, which takes the name of a program or command and a list of arguments for it (a variation, shown further on, also takes the name of a directory). As written below, it executes the given program or command once for each subdirectory in the branch below the starting directory; the starting directory itself does not appear in the DIR /s /a:d listing. For example


sweep dir /b /a:-d


would display a list of all the files in the default branch of the directory tree. This code is derived from that immediately above and omits the error checking. This is just about the minimum usable list processing program.

SWEEP1.BAT


@echo off
dir /b /s /a:d > }1{.dat
echo.>> }1{.dat
date < }1{.dat | find "Enter" > }2{.bat
echo @shift > new.bat
echo @shift >> new.bat
echo %1 %%1 %2 %3 %4 %5 %6 %7 %8 %9 >> new.bat
call }2{
for %%a in (}1{.dat }2{.bat new.bat) do del %%a


Note that this passes the directory to the program as an argument - it does not execute the program in the directory. A slight variation accepts a given directory as the base from which to sweep - use '.' to mean the current directory. The directory is the first argument passed to the batch file, which moves the program name to the second, and so forth.

SWEEP2.BAT


@echo off
dir %1 /b /s /a:d > }1{.dat
echo.>> }1{.dat
date < }1{.dat | find "Enter" > }2{.bat
echo @shift > new.bat
echo @shift >> new.bat
echo %2 %%1 %3 %4 %5 %6 %7 %8 %9 >> new.bat
call }2{
for %%a in (}1{.dat }2{.bat new.bat) do del %%a


It appears possible to actually change to the directory and execute from there, but I haven't worked it out (there are limits to my time, patience, and interest - and I feel I should leave something for the reader to do).

A more refined class of list processing is to work on specific list entries rather than on the whole list. A real world example that has been suggested is to delete everything except the newest file in a directory. The approach is to use the list processing scheme above to set an environment variable with the current file name - obviously only the last one will remain when the list processing ends. By making the list with DIR /b /o:d we get a list of filenames in order of date, newest last. We will assume that LOADFIX.COM has already been copied to C:\BIN\ENTER.COM - if you are going to actually use list processing batch files, it would make things easier if you just went ahead and did that (assuming that the instance of ENTER.COM in the \bin directory is the only or first ENTER(.COM, .EXE, or .BAT) in the path).

In this program, we put all our batch files in c:\bin, which is in the path.

FOO.BAT


@echo off
echo on
if  "%3"=="" goto end
echo set fname=%3> %temp%\}2{.bat
:end


BAR.BAT


@echo off
shift
shift
if not !%1 == !%fname% if exist %1 del %1


KEEPLAST.BAT


@echo off
dir %1 /b /o:d /a:-d > %temp%\flist.dat
echo.>> %temp%\flist.dat
date < %temp%\flist.dat | find "Enter" > c:\bin\}{.bat
cd %1
copy c:\bin\foo.bat c:\bin\new.bat
call c:\bin\}{
call %temp%\}2{
copy c:\bin\bar.bat c:\bin\new.bat
call c:\bin\}{
for %%a in (new }{) do del c:\bin\%%a.bat
for %%a in (}2{.bat flist.dat) do del %temp%\%%a


Note that the program changes to the working directory, which must not be the default at the time the program starts (normally it will be the parent of the target directory). Also, the target directory can't be the TEMP directory.

That program gets a clean list of the files in the directory before changing to it and running the list batch file twice: once to leave the name of the last file in the environment variable and then to delete all files that don't have that name. It is essential that there not be a NEW(.BAT, .EXE, or .COM) in the target directory.
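The two-pass logic, and the reason the first pass writes a "set fname=" line to a scratch batch file instead of setting the variable directly, can be modelled in a few lines. This Python sketch is a model only: the function names are invented, the scratch list stands in for the scratch batch file in the TEMP directory, and giving each call a throwaway copy of the environment is an assumption taken from the gotcha described earlier.

```python
def run_list(new_bat, items, master_env):
    """Run new_bat once per list item, each time with a COPY of the environment."""
    for item in items:
        new_bat(item, dict(master_env))     # changes made by new_bat are thrown away


def keep_last(files_oldest_first):
    """Return the files that survive: only the last (newest) list entry."""
    env = {}                                # the master environment
    scratch = []                            # stands in for the scratch batch file
    remaining = list(files_oldest_first)
    items = remaining + [""]                # the list ends with an item-less line

    def foo(item, child_env):               # pass 1: FOO.BAT acting as NEW.BAT
        if item:
            scratch[:] = ["set fname=" + item]   # each item overwrites the file

    def bar(item, child_env):               # pass 2: BAR.BAT acting as NEW.BAT
        if item and item != child_env.get("fname"):
            remaining.remove(item)          # stands in for DEL

    run_list(foo, items, env)
    for line in scratch:                    # the master CALLs the scratch file
        name, value = line[len("set "):].split("=", 1)
        env[name] = value
    run_list(bar, items, env)
    return remaining


if __name__ == "__main__":
    print(keep_last(["OLD.ZIP", "MID.ZIP", "NEW.ZIP"]))
```

The point of the design is that the second pass can read the variable (each copy of the environment inherits it) even though the first pass could never have set it directly.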

How about the first list entry? We don't need the list processor for that - just use ENTER.BAT instead of NEW.BAT, use %4 instead of %3, and execute the list (after processing with DATE). This will execute ENTER.BAT only once, for the first line in the list file.

A real-world example of this is a scheme for adding a backup file to a directory and deleting the oldest, thereby maintaining a constant number of backups in the directory. For this example we will assume that the backups are zips and that there are other files in the directory. We will also assume that the files to back up are all the *.db files in the parent of the backup directory. The target (backup) directory is also the working directory. The backup file will be given the date as its name. We'll use YYMMDDG.BAT to get the date and leave it in the DDATE environment variable. YYMMDDG and DBBACKUP are somewhere in the path; ENTER.BAT is in the current directory.

ENTER.BAT


@del %4


DBBACKUP.BAT


@echo off
call yymmddg
call pkzip %ddate% ..\*.db
dir *.zip /b /o:d > flist.dat
echo.>> flist.dat
date < flist.dat | find "Enter" > }{.bat
call }{
for %%a in ( }{.bat flist.dat ) do del %%a
set ddate=


Note that since we didn't use LOADFIX, the list was processed into ENTER.BAT and its fourth rather than third argument was used as the filename.

Another example was prompted by a reader's question about how to find and execute a program without knowing where it is on the disk and without it being in the path. The exchange of messages led to the following - NOTE CAREFULLY: this example uses the fifth instead of the fourth argument to ENTER.BAT because the program is for Windows NT 3.5, NOT for MSDOS 6.22 as is usual for the code here. "FOO" is, as usual, any file name or other syntactically appropriate string.

It would be tempting to use


dir c:\foo.exe /s /b > }{.bat
call }{

but what if there are multiple copies of the target file?

We can get around that with


dir c:\foo.exe /s /b | find "FOO" > }1{.dat
echo.>> }1{.dat
date < }1{.dat | find "Enter" > }{.bat
echo %%5> enter.bat
call }{

The flow there is the parent CALLs }{.BAT; }{.BAT jumps to ENTER.BAT; ENTER.BAT terminates and returns to the parent (since the CALL return is pending and ENTER.BAT is the first thing to terminate and therefore activate the return).


DIR foo.exe /s /b

generates a line of the form "C:\BAR\BAZ\FOO.EXE" (upper case) for each FOO.EXE on the default drive; if other copies exist, their lines differ only in the path part.

ECHO.>>

adds a blank line to the end of the file containing the strings (which are directory entries, but have a suitable format for use as commands). DATE prefixes the strings with "Enter the new date (mm-dd-yy): " giving us "Enter the new date (mm-dd-yy): C:\BAR\BAZ\FOO.EXE". When this line is executed (}{.BAT is called) it jumps to ENTER.BAT and passes it the argument string "the new date (mm-dd-yy): C:\BAR\BAZ\FOO.EXE". Since the only command in ENTER.BAT is "%5" it issues the fifth argument "C:\BAR\BAZ\FOO.EXE" as a command. Extension to all available, but unknown, drives is possible, but a bit ugly - make that very ugly if the possibility of removable media exists. If removable media are not possible (meaning the "Abort, retry, fail?" is not possible) something like this can be used (not tested):

@echo off
if !%1 == !}{ goto %2
rem> }1{.dat
for %%a in ( c d e f g h i j k l m n o p q r s t u v w x y z ) do call %0 }{ pass2 %%a
echo.>> }1{.dat
date < }1{.dat | find "Enter" > }{.bat
echo %%5> enter.bat
call }{
goto end
:pass2
dir %3:\foo.exe /s /b | find "FOO" >> }1{.dat
:end

Again, the first instance found will be the one invoked. Adding code to deal with removable media could be as simple as prefixing the DIR command with "COMMAND /f /c", but that will fail if there are two logical drives for one physical drive and the one asked for is not the current one. I don't know of any way around that. It will not fail if there is no disk in the drive though.

Note that the prompt string from DATE under NT is not the same as the one from MSDOS: it uses "Enter the new date ..." instead of "Enter new date ..." - I am told that this is true with both COMMAND.COM and CMD.EXE under NT.
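Which argument number ends up holding the list item follows mechanically from the prompt text and from whether the line goes straight to ENTER.BAT or through ENTER.COM to NEW.BAT. A small Python model for checking the arithmetic, under the assumption that blanks, tabs, commas, semicolons and '=' separate batch arguments; the function names are invented for the illustration and the prompt strings are the two quoted in the text:

```python
import re


def batch_args(command_line):
    """Split a command line into [%0, %1, %2, ...].
    ASSUMPTION: blanks, tabs, commas, semicolons and '=' separate arguments."""
    return [tok for tok in re.split(r"[ \t,;=]+", command_line) if tok]


def item_position(prompt, via_enter_com):
    """Return n such that %n holds the first field of the list item."""
    args = batch_args(prompt + "ITEM")   # args[0] is "Enter", the program that runs
    if via_enter_com:
        args = args[1:]                  # ENTER.COM (LOADFIX) runs its own first argument
    return args.index("ITEM")


MSDOS_PROMPT = "Enter new date (mm-dd-yy): "
NT_PROMPT = "Enter the new date (mm-dd-yy): "

if __name__ == "__main__":
    print(item_position(MSDOS_PROMPT, via_enter_com=False))  # 4: ENTER.BAT under MS-DOS
    print(item_position(MSDOS_PROMPT, via_enter_com=True))   # 3: NEW.BAT via ENTER.COM
    print(item_position(NT_PROMPT, via_enter_com=False))     # 5: ENTER.BAT under NT
```

The same count is worth redoing by hand for any other DOS version or language edition, since the prompt text is the only thing that fixes the number.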

Another approach is to actually count the entries to keep. We'll use a bang counter but any counter type would do.

In these "real_world" examples, it makes sense to put ENTER.COM (a renamed copy of LOADFIX.COM) in the path, in c:\bin for these examples since the examples used here are pretty much the sorts of programs that would be run under known conditions - mostly on a single machine with a hard drive.

This example also assumes that NEW.BAT is in the current directory and that the target directory is the current default and is the same directory as the one with the files to back up. The latter condition implies that the source files and target archives have unique filename extensions - that is, all the archives are .ZIP and all .ZIP files are archives, and that all .DB files are to be backed up and that all files to be backed up have .DB extensions. There is nothing magic about .DB, I just made it up for testing the programs.

This program keeps exactly five (the newest five) .ZIP files in the directory, regardless of how many are in the directory when the program is run.

The really big gotcha is that we get a fresh copy of the environment, with little or no free space in it, for each call to NEW.BAT - this is a serious handicap. If the root environment were available, we could use

NEW.BAT


@echo off
if not $%count%==$%target% goto inc
del %3
goto end
:inc
set count=%count%!
:end


or if we got a limited environment, but it were the same for each call to NEW.BAT we could create a dummy variable with enough space in the master program and delete it in NEW.BAT to make room - but no such luck - we have to do it the hard way. Note that a byproduct is that we can't change the master (root) environment from NEW.BAT.

For this, we have to assume that there are no files named }!{.DAT in the working directory - we are going to use a file as the variable: we'll count the number of lines in the file, and if it is less than our target, we'll add one more '!' line. We have to use FIND to count the lines and unfortunately, it doesn't return a unique ERRORLEVEL that tells us how many matches it found. It does (MS-DOS 6.22) return a zero if it finds any match, so we can use that feature on a second FIND filter to recognize the preset count. To make the count work properly, we have to set the target count to one less than we really want (4 in this case, since we want five .ZIP files to remain in the directory).

NEW.BAT


@echo off
find /c "!" }!{.dat | find "%target%" > nul
if errorlevel 1 goto inc
del %3
goto end
:inc
echo !>> }!{.dat
:end


KEEPFIVE


@echo off
set target=4
REM> }!{.dat
call yymmddg
pkzip %ddate% *.db
set ddate=
dir *.zip /b /o:-d > flist.dat
echo.>> flist.dat
date < flist.dat | find "Enter" > }1{.bat
call }1{
del flist.dat
del }1{.bat
del }!{.dat
set target=
set count=


Note that this, as the previous example, does not delete the files backed up and that, unlike the previous example, the directory listing is sorted with the newest files first (/o:-d). If you want to delete the files as they are archived, change


pkzip %ddate% *.db

to

pkzip -m %ddate% *.db


Here's an example that doesn't use the contents of the passed DIR listing - it uses the number of arguments. The idea is that files can be sorted into size categories based on the number of fields in the DIR listing, which depends on the number of commas in the size fields. This code assumes that all files of interest have extensions.

TOOBIG.BAT


@echo off
set pattern=*.*
if not %1!==! set pattern=%1
dir /a:-d %pattern% | find /v "e" | find ":" > }{.dat
echo.>> }{.dat
date < }{.dat | find "En" > }{.bat
echo @echo off> new.bat
echo shift>> new.bat
echo shift>> new.bat
echo set f=%%7>> new.bat
echo if not %%f%%!==! del %%1.%%2>> new.bat
call }{
for %%a in ( }{.bat }{.dat new.bat ) do del %%a
for %%a in ( pattern f ) do set %%a=


The default search pattern is "*.*" - if an argument is given, it must be a valid DIR search pattern. This example is hard coded to delete all files that match the pattern and are over 999,999 bytes long. The size group selected is controlled by the argument number following the "%%" in the line "echo set f=%%7>> new.bat". Things get a bit stickier if we need to use %10 - there isn't any %10. We can't just stick in a bunch of SHIFT commands - we have to save the filename first. I'll omit the obvious code. The two SHIFT commands already present get rid of the two fields left over from the DATE "Enter" string, which puts the file's base name in the first argument: %1.
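Why the argument count tracks the file size is easier to see with the splitting written out. The Python model below assumes that commas separate batch arguments just as blanks do, and that a DIR line is laid out as name, extension, size with comma separators, date, and time; the sample lines and function names are made up for the illustration.

```python
import re


def batch_args(text):
    """ASSUMPTION: blanks, tabs, commas, semicolons and '=' separate batch arguments."""
    return [tok for tok in re.split(r"[ \t,;=]+", text) if tok]


def arg(dir_line, n):
    """%n as NEW.BAT sees it after the two SHIFTs ('' if there is no such argument)."""
    args = batch_args(dir_line)
    return args[n - 1] if n <= len(args) else ""


def would_delete(dir_line):
    """Mirror the generated NEW.BAT: delete when %7 is not empty."""
    return arg(dir_line, 7) != ""


# ASSUMPTION: illustrative DIR lines, not captured from a real listing.
BIG = "BIGFILE  DAT     1,234,567 07-04-96   1:23p"
MEDIUM = "MEDIUM   TXT       999,999 07-04-96   1:23p"
SMALL = "SMALL    TXT           512 07-04-96   1:23p"

if __name__ == "__main__":
    for line in (BIG, MEDIUM, SMALL):
        print(batch_args(line), would_delete(line))
```

A size with two commas splits into three arguments, so only those lines reach a seventh argument. A file with no extension would have one field fewer, which is why the code assumes that all files of interest have extensions.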

So far all these examples have been contrived to illustrate points about list processing and the only list source shown was DIR. This example is a real program - one in continuous use as part of a student-proofing scheme for a package running on other machines. This program gets the user IDs of any users logged in from the four machines running the package and monitors the user's home directory root for a flag file created by a watchdog on the machine running the package being monitored. When found, it changes the user's password to a random string, effectively locking the user's account, and logs the event. In the following, D: is a RAMDRIVE, O: is the daemon's home directory, R: is the drive with the user's directories, USERS.EXE is a utility that generates a list of users meeting certain criteria (here the criterion is the type of Network Interface Card in the user's machine, which limits the search to a certain class of machines), and the command to change the password is in the flag file (~ALARM~.BAT), which is executed as the payload of ME280.BAT.

ME280.BAT

@echo off
if %0==D:\ME280 goto cont
break on
if not exist d:\me280.bat copy o:\me280\me280.bat d:\
D:\ME280
:cont
copy o:\me280\enter.com d:\
copy o:\nogame\users.exe d:\
copy o:\me280\new.bat d:\
:loop
d:\users /a=0080488187* /l=d:\userlist.dat,s
find < d:\userlist.dat "00804881874B" > d:\short.lst
find < d:\userlist.dat "00804881875C" >> d:\short.lst
find < d:\userlist.dat "008048818775" >> d:\short.lst
find < d:\userlist.dat "0080488187BF" >> d:\short.lst
echo.>> d:\short.lst
date < d:\short.lst | find "Enter" > d:\}{.bat
d:
cd \
call }{
del d:\short.lst
del d:\}{.bat
del d:\userlist.dat
if exist o:\me280\me280.stp goto end
goto loop
:end


The format of the list is such that the user's ID is the fifth argument when the list entry is passed to ENTER.COM and the fourth when passed to NEW.BAT.

NEW.BAT


@echo off
if not exist r:\home\%4\~alarm~.bat goto end
call r:\home\%4\~alarm~.bat
echo. >> o:\me280\me280.log
echo %4 >> o:\me280\me280.log
echo.| date >> o:\me280\me280.log
echo.| time >> o:\me280\me280.log
del r:\home\%4\~alarm~.bat
:end


ENTER.COM is of course a renamed copy of LOADFIX.COM. Since all the drives except the RAMDRIVE are network drives the files are not cached and the RAMDRIVE acts as the cache. Note that the first thing the program does when it is invoked from its original directory is to copy itself and the other programs it uses to the RAMDRIVE - the second is to transfer control to the copy of itself that it just made in the RAMDRIVE. This batch program was easier to write and debug than the .EXE it replaced, and it runs at least as fast (though admittedly, the .EXE did not have access to a RAMDRIVE for its transient files). It is also interesting to note that this program runs in a DOS window, and even though it changes the TEMP setting, other daemons running in other DOS windows are unaffected. This is of course the expected behavior, but it should be noted anyway.

One might wonder why I published such an important part of a security scheme where the students involved can get at it. The server log shows that they don't read this book and it would be interesting to see if they can find a way to beat it - there is one, but it is very subtle and sure to attract administrative attention (and it's not likely to be particularly effective anyway).

  ** Copyright 1996, Ted Davis - all rights reserved **

Input and feedback from readers are welcome. NOTE: the subject of the message must contain the word "batch" for the message to get past the spam filter.