Execute commands for multiple files using fish
Recently I downloaded some submissions from a course I am currently teaching. Luckily the tool being used allows downloading an archive with all submissions at once, but unfortunately the structure of the archive is not really handy. It contained a folder for every student, and in each folder there was another zip archive, which contained the actual submission of that student.
In addition to that, every single archive contained a JavaScript project, and in each of them I had to execute npm install in order to test the submission afterwards. So I could either go into each folder, unzip the archive and install the NPM packages, for each and every one of them. Or I could take some time and figure out how to unzip all archives and install the required packages with two small command line scripts.
Of course I chose the latter!
First of all I had to find a glob that matches all the files I want to execute the command (unzip in my case) for. The ls command is great for testing this:
$ ls */*.zip
ls is usually used to list the content of entire directories, but it can also list individual files that are passed to it. This is really useful in combination with the wildcards the fish shell provides. fish expands the */*.zip expression to every file that matches it, whereby each * can stand for a string of any length not containing /, which is the path separator on Linux. That means it will find any arbitrarily named file ending in .zip that is located directly in a sub folder of the current directory. So the above command returned file names like erika_mustermann/submission.zip and john_doe/assignment.zip for me, which indicated that I was on the right track. I always start tasks like this that way, because I want to make sure that I am not accidentally executing the commands for the wrong files.
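To see which files the glob picks up and which it skips, it can help to try it in a throwaway directory first. The following is only a sketch: the sandbox variable and the extra file names (too_deep.zip, top_level.zip) are made up for illustration and were not part of the actual archive.

```fish
# Build a small throwaway tree; all names here are hypothetical.
set sandbox (mktemp -d)
mkdir -p $sandbox/erika_mustermann $sandbox/john_doe/nested
touch $sandbox/erika_mustermann/submission.zip
touch $sandbox/john_doe/assignment.zip
touch $sandbox/john_doe/nested/too_deep.zip   # two folders down
touch $sandbox/top_level.zip                  # not inside a sub folder

cd $sandbox
# Storing the expanded glob in a list makes it easy to inspect.
set matches */*.zip
printf '%s\n' $matches
# prints:
# erika_mustermann/submission.zip
# john_doe/assignment.zip
```

Only the two archives that sit exactly one folder below the current directory show up, because each * stops at a /.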
Next step: figure out how to use the for loop in fish. I still preferred to play it safe, so I decided to only output the names of the files using the echo command within the for loop.
$ for file in */*.zip;
      echo $file;
  end
Basically the for loop in fish is a simple for .. in construct, whereby the name after the for is used as the variable name (mind that it is defined without a prefix, but a $ is required when accessing it), and the part between in and ; is the list that is being looped over, which is every file that matches the glob in our case. Other than some differences in whitespace, the output should be pretty much the same as that of the previous ls command.
So I knew that the for loop was receiving the correct values, and I could start writing the actual command I wanted to execute for all of these files. I ended up with the following:
$ for file in */*.zip;
      unzip $file -d (dirname $file);
  end
The unzip command takes the name of the archive file you want to extract. The only problem is that it will extract the archive’s content into the current directory instead of right next to the archive itself. Fortunately the unzip command comes with a -d option, which lets you specify the directory you want the archive to be extracted to. However, this option expects a directory, not a file. All I’ve got so far is the file name of the archive, but thankfully dirname exists as well. That’s a command that will remove everything after the last / in the given string, including the / itself. So it is a great utility to get the parent directory of another file or directory. I use that command in combination with fish’s command substitution, which allows me to execute another command and use its output as a parameter of an outer command. That is done by putting the inner command in parentheses.
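The combination of dirname and command substitution can be tried on its own before handing it to unzip. This is a small sketch with a made-up path; the space in the folder name is there on purpose, to show that fish passes a variable as a single argument without any quoting.

```fish
# Hypothetical path, not taken from the real archive.
set file 'erika mustermann/submission.zip'

# The inner command runs first; its output replaces the parentheses.
set target (dirname $file)
echo $target
# prints: erika mustermann

# Printing the command instead of running it works as a cheap dry run.
echo unzip $file -d (dirname $file)
# prints: unzip erika mustermann/submission.zip -d erika mustermann
```

Because fish does not split variables on spaces, and splits the output of a command substitution only on line breaks, $target stays a single value here.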
The above command extracted all zip files right where they are located. But some of the students added another root folder in their archive, and others didn’t. So the structure was not always the same, but I still wanted to run npm install for all of them at once. Therefore I decided to use another for loop with a different glob. The ** wildcard is similar to the * wildcard, but with one important difference: it also matches /, meaning that it matches an arbitrary number of sub folders, not just a single one. This way it does not matter if another folder has been added as root in the archives.
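The difference between the two wildcards is easy to check in a scratch directory. Again only a sketch, with made-up folder names: one project sits directly in the student folder, the other one has an extra root folder called project.

```fish
# Throwaway tree; the folder names are hypothetical.
set sandbox (mktemp -d)
mkdir -p $sandbox/erika_mustermann $sandbox/john_doe/project
touch $sandbox/erika_mustermann/package.json
touch $sandbox/john_doe/project/package.json

cd $sandbox
set one_level */package.json
set any_depth **/package.json
echo "one level:" (count $one_level)
echo "any depth:" (count $any_depth)
# prints:
# one level: 1
# any depth: 2
```

The single * only finds the project without the extra root folder, while ** finds both.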
In addition to that I also make use of the fact that multiple commands can be used within a for loop if they are delimited by a ;.
$ for file in **/package.json;
      cd (dirname $file);
      npm install;
      cd -;
  end
This loop finds all package.json files in any sub directory of the current directory and changes to the directory containing each file by again using the dirname command with the $file variable. Afterwards npm install is executed to install all the dependencies, and once that is finished the loop goes back to the previous working directory (that’s what the - stands for).
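One thing to keep in mind when running such a loop a second time: after npm install, every project contains a node_modules folder whose packages bring their own package.json files, and the ** glob would match those as well. A possible variation that skips them is sketched below. The function name for_each_project is made up for this example and is not part of fish or npm, and the sketch assumes that dependencies live in folders named node_modules.

```fish
# Hypothetical helper: runs the given command in every project folder
# below the current directory.
function for_each_project
    for file in **/package.json
        # Skip manifests that belong to installed dependencies.
        if string match -qr '(^|/)node_modules/' $file
            continue
        end

        # Enter the project folder; move on to the next file if that fails.
        pushd (dirname $file); or continue
        # Run whatever command was passed to the function.
        $argv
        # Return to the directory the loop started in.
        popd
    end
end
```

Calling for_each_project pwd first only prints the folders that would be visited, in the same spirit as the echo dry run earlier. for_each_project npm install then does the actual work.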
For some reason I was always a bit hesitant when using loops in the command line. Somehow the command line always felt to me like something into which I enter a single command and get a result in return, which is a pretty simple protocol. Using stuff like pipes still felt ok to me, but a loop might need to be broken into multiple lines quite quickly, which does not feel natural to me. But loops are probably a much bigger time saver than other features of shells, so checking them out is highly recommended!