Loop over file names from `find`?


6 Answers

90%

The best way to loop over file names depends quite a bit on what you actually want to do with it, but unless you can guarantee no files have any whitespace in their name, this isn't a great way to do it. So what do you want to do in looping over the files? – Kevin Mar 8 '12 at 2:26

This is the best answer! Works with: spaces in filenames; no matching files; exit when looping over the results. – EM0 Feb 8 '18 at 12:37

This would separate the printed items with a \0 character that isn't allowed in any of the file systems in file or folder names, to my knowledge, and therefore should cover all bases. xargs picks them up one by one then ...

Shell option globstar: "If set, the pattern ‘**’ used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a ‘/’, only directories and subdirectories match." See Bash Manual, Shopt Builtin.

TL;DR: If you're just here for the most correct answer, you probably want my personal preference (see the bottom of this post):

# execute `process` once for each file
find . -name '*.txt' -exec process {} \;

The best way depends on what you want to do, but here are a few options. As long as no file or folder in the subtree has whitespace in its name, you can just loop over the files:

for i in $x; do # Not recommended, will break on whitespace
    process "$i"
done

Marginally better, cut out the temporary variable x:

for i in $(find -name \*.txt); do # Not recommended, will break on whitespace
    process "$i"
done
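To see the whitespace problem concretely, here is a minimal sketch. It assumes a scratch directory created with mktemp and a made-up file name, `my notes.txt`; the counter variable exists only for illustration.

```shell
# Create a scratch directory holding one file whose name contains a space.
tmp=$(mktemp -d)
touch "$tmp/my notes.txt"
cd "$tmp" || exit 1

# Unquoted command substitution: the shell splits find's output on whitespace.
words=0
for i in $(find . -name '*.txt'); do
    printf 'loop got: %s\n' "$i"
    words=$((words + 1))
done

cd - >/dev/null || exit 1
rm -rf "$tmp"
echo "files on disk: 1, loop iterations: $words"
```

With the single file above, the loop body runs twice, once for `./my` and once for `notes.txt`, so `process` would never see the real file name.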

It is much better to glob when you can. White-space safe, for files in the current directory:

for i in *.txt; do # Whitespace-safe but not recursive.
    process "$i"
done

By enabling the globstar option, you can glob all matching files in this directory and all subdirectories:

# Make sure globstar is enabled
shopt -s globstar
for i in **/*.txt; do # Whitespace-safe and recursive
    process "$i"
done
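One caveat with glob loops: when nothing matches, bash leaves the pattern unexpanded and the loop body runs once with the literal pattern as its value. The sketch below shows one possible guard, assuming bash 4 or later and enabling the `nullglob` option alongside `globstar`. The directory layout and file names are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Requires bash 4+ for globstar.
shopt -s globstar nullglob

tmp=$(mktemp -d)
mkdir -p "$tmp/sub dir/deeper"
touch "$tmp/top.txt" "$tmp/sub dir/with space.txt" \
      "$tmp/sub dir/deeper/x.txt" "$tmp/skip.log"
cd "$tmp" || exit 1

# Recursive, whitespace-safe match.
matched=0
for i in **/*.txt; do
    printf 'found: %s\n' "$i"
    matched=$((matched + 1))
done

# With nullglob set, a pattern with no matches expands to nothing,
# so this loop body never runs.
unmatched=0
for i in **/*.nomatch; do
    unmatched=$((unmatched + 1))
done

cd - >/dev/null || exit 1
rm -rf "$tmp"
shopt -u globstar nullglob
echo "matched: $matched, unmatched: $unmatched"
```

Without `nullglob`, an alternative is to test each item inside the loop, for example with `[ -e "$i" ] || continue`.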

In some cases, e.g. if the file names are already in a file, you may need to use read:

# IFS= makes sure it doesn't trim leading and trailing whitespace
# -r prevents interpretation of \ escapes.
while IFS= read -r line; do # Whitespace-safe EXCEPT newlines
    process "$line"
done < filename

read can be used safely in combination with find by setting the delimiter appropriately:

find . -name '*.txt' -print0 |
    while IFS= read -r -d '' line; do
        process "$line"
    done
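The difference between newline-delimited and NUL-delimited reading can be checked with a small experiment. This is only a sketch, assuming bash (for `read -d ''` and `$'\n'`) and a find that supports `-print0`; the file names are made up, and one of them deliberately contains a newline.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
touch "$tmp/plain.txt" "$tmp/with space.txt" "$tmp/two"$'\n'"lines.txt"

# Newline-delimited: the name containing a newline arrives as two records.
newline_records=$(find "$tmp" -name '*.txt' -print |
    while IFS= read -r line; do echo x; done | wc -l | tr -d ' ')

# NUL-delimited: exactly one record per file.
nul_records=$(find "$tmp" -name '*.txt' -print0 |
    while IFS= read -r -d '' line; do echo x; done | wc -l | tr -d ' ')

rm -rf "$tmp"
echo "files: 3, newline-delimited: $newline_records, NUL-delimited: $nul_records"
```

Three files produce four records when split on newlines but three when split on NUL bytes, which is why the `-print0` form is the safe one.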

For more complex searches, you will probably want to use find, either with its -exec option or with -print0 | xargs -0:

# execute `process` once for each file
find . -name \*.txt -exec process {} \;

# execute `process` once with all the files as arguments*:
find . -name \*.txt -exec process {} +

# using xargs*
find . -name \*.txt -print0 | xargs -0 process

# using xargs with arguments after each filename (implies one run per filename)
find . -name \*.txt -print0 | xargs -0 -I{} process {} argument
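To make the "once per file" versus "once with all files" distinction visible, the following sketch counts invocations. The `process` script here is a hypothetical stand-in that only logs how many arguments it received, and the `LOG` variable and file names are assumptions made for the demonstration.

```shell
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b c.txt" "$tmp/d.txt"

# Hypothetical stand-in for `process`: appends its argument count to $LOG.
cat > "$tmp/process" <<'EOF'
echo "$#" >> "$LOG"
EOF

export LOG="$tmp/per_file.log"
find "$tmp" -name '*.txt' -exec sh "$tmp/process" {} \;

export LOG="$tmp/batched.log"
find "$tmp" -name '*.txt' -exec sh "$tmp/process" {} +

export LOG="$tmp/xargs.log"
find "$tmp" -name '*.txt' -print0 | xargs -0 sh "$tmp/process"

# One log line per invocation.
per_file_runs=$(wc -l < "$tmp/per_file.log" | tr -d ' ')
batched_runs=$(wc -l < "$tmp/batched.log" | tr -d ' ')
xargs_runs=$(wc -l < "$tmp/xargs.log" | tr -d ' ')
batched_args=$(cat "$tmp/batched.log")

rm -rf "$tmp"
echo "-exec \\; runs: $per_file_runs, -exec + runs: $batched_runs, xargs runs: $xargs_runs"
```

For three small files, the `\;` form starts the command three times, while the `+` and `xargs -0` forms each start it once with all three names as arguments. With very long file lists, both batching forms may split the work across several invocations.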
88%

This method returns a list containing the names of the entries in the directory given by path. The list is in arbitrary order, and does not include the special entries '.' and '..' even if they are present in the directory.

This method will iterate over all descendant files in subdirectories. Consider the example above, but in this case, this method recursively prints all images in C:\Users\admin directory.

This tutorial will show you some ways to iterate files in a given directory and do some actions on them using Python.

The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell, although results are returned in arbitrary order.

Example: print out all paths to files that have jpg or png extension in C:\Users\admin directory

import os

directory = r'C:\Users\admin'
for filename in os.listdir(directory):
    if filename.endswith(".jpg") or filename.endswith(".png"):
        print(os.path.join(directory, filename))
    else:
        continue
72%

Source: linux – Loop over file names from `find`? – Stack Overflow

Summarizing the comments saying that -exec … + is better, I prefer xargs because it is more versatile:

works with other commands than just find
allows ‘batching’ (grouping) in command lines, say xargs -n 10 (ten at a time)

Some examples…

find . -name '*.mp3' -exec cmd {} \;

or

find . -name '*.mp3' -print0 | xargs -0 cmd
sudo find -name '*.mp3' -print0 | sudo xargs -0 require_root.sh
sudo find -name '*.mp3' -print0 | xargs -0 nonroot.sh
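The batching that `xargs -n` provides can be sketched without touching the file system. In the example below, the five names are made up and fed in as NUL-terminated strings, and the inline `sh -c` script is just an illustrative way to print how many arguments each invocation received.

```shell
# Feed five NUL-terminated names to xargs and allow at most two
# arguments per invocation. The trailing `sh` fills the $0 slot.
batches=$(printf '%s\0' 'a.mp3' 'b c.mp3' 'd.mp3' 'e.mp3' 'f.mp3' |
    xargs -0 -n 2 sh -c 'echo "$#"' sh)
echo "$batches"
```

This prints 2, 2 and 1 on separate lines: two full batches followed by the leftover name. The name containing a space still counts as a single argument because of `-0`.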
65%


Short answer (closest to your answer, but handles spaces)

OIFS="$IFS"
IFS=$'\n'
for file in `find . -type f -name "*.csv"`
do
    echo "file = $file"
    diff "$file" "/some/other/path/$file"
    read line
done
IFS="$OIFS"

Better answer (also handles wildcards and newlines in file names)

find . -type f -name "*.csv" -print0 | while IFS= read -r -d '' file; do
    echo "file = $file"
    diff "$file" "/some/other/path/$file"
    read line </dev/tty
done

Best answer (based on Gilles' answer)

find . -type f -name '*.csv' -exec sh -c '
    # exec-sh fills $0, so the file name found by find is $1
    file="$1"
    echo "$file"
    diff "$file" "/some/other/path/$file"
    read line </dev/tty
' exec-sh {} ';'

Or even better, to avoid running one sh per file:

find . -type f -name '*.csv' -exec sh -c '
    for file do
        echo "$file"
        diff "$file" "/some/other/path/$file"
        read line </dev/tty
    done
' exec-sh {} +
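The `exec-sh` word in these commands fills the `$0` slot of the inline script, so the file names supplied by find start at `$1`. A small sketch of how `sh -c` assigns its positional parameters, using made-up file names in place of find's output:

```shell
# The first word after the script text becomes $0; the rest become $1, $2, ...
out=$(sh -c 'printf "%s|%s|%s\n" "$0" "$1" "$#"' exec-sh 'my file.csv')
echo "$out"

# `for file do` iterates over "$@", so the $0 placeholder is skipped.
looped=$(sh -c 'for file do printf "[%s]" "$file"; done' exec-sh 'a.csv' 'b c.csv')
echo "$looped"
```

The first command prints `exec-sh|my file.csv|1` and the second prints `[a.csv][b c.csv]`, which shows that the placeholder never reaches the loop body.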

So it's effectively doing this:

for file in "zquery" "-" "abc" ...

To tell it to only split the input on newlines, you need to do

IFS=$'\n'

If you are using sh or dash instead of ksh93, bash or zsh, you need to write IFS=$'\n' like this instead:

IFS='
'
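The effect of the IFS setting can be checked with a short sketch. The two names in `list` are made up, the counters are only for illustration, and `set -f` is added as an assumption on my part so that globbing does not interfere while the list is expanded unquoted.

```shell
# Two made-up names, one per line; the first contains a space.
list=$(printf '%s\n' 'a b.csv' 'c.csv')

# Default IFS: splits on spaces, tabs and newlines.
default_count=0
for file in $list; do
    default_count=$((default_count + 1))
done

# Newline-only IFS, written in the portable literal form.
old_ifs=$IFS
IFS='
'
set -f                      # disable globbing while the list is unquoted
count=0
for file in $list; do
    count=$((count + 1))
done
set +f
IFS=$old_ifs

echo "default IFS: $default_count items, newline IFS: $count items"
```

The default setting yields three items because `a b.csv` is split in two, while the newline-only setting yields the expected two.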

Inside the loop where you do

diff $file /some/other/path/$file

To tell the shell not to expand wildcard characters, put the variable inside double quotes, e.g.

diff "$file" "/some/other/path/$file"

The same problem could also bite us in

for file in `find . -name "*.csv"`

For example, if you had these three files

file1.csv
file2.csv
*.csv

It would be as if you had run

for file in file1.csv file2.csv *.csv

which will get expanded to

for file in file1.csv file2.csv *.csv file1.csv file2.csv
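This re-expansion can be reproduced in a scratch directory. The sketch below recreates the three files from the example and counts the words the shell produces from an unquoted command substitution, first with pathname expansion on and then with it switched off via `set -f`. The variable names are invented for the demonstration.

```shell
tmp=$(mktemp -d)
touch "$tmp/file1.csv" "$tmp/file2.csv" "$tmp/*.csv"
cd "$tmp" || exit 1

# Unquoted substitution: the word ./*.csv is treated as a pattern
# and expanded again against the directory contents.
set -- $(find . -name '*.csv')
unquoted_words=$#

# Same list with pathname expansion switched off.
set -f
set -- $(find . -name '*.csv')
noglob_words=$#
set +f

cd - >/dev/null || exit 1
rm -rf "$tmp"
echo "with globbing: $unquoted_words words, without: $noglob_words words"
```

Three files turn into five words when globbing is active, because the file literally named `*.csv` matches all three names, including itself.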

Instead, we have to do

find . -name "*.csv" -print | while IFS= read -r file; do
    echo "file = $file"
    diff "$file" "/some/other/path/$file"
    read line </dev/tty
done

To handle newlines in file names as well, change -print to -print0 and use read -d '' at the end of the pipeline:

find . -name "*.csv" -print0 | while IFS= read -r -d '' file; do
    echo "file = $file"
    diff "$file" "/some/other/path/$file"
    read char </dev/tty
done

A more portable way of writing this that doesn't require bash or zsh or remembering all the above rules about null bytes (again, thanks to Gilles):

find . -name '*.csv' -exec sh -c '
    # exec-sh fills $0, so the file name found by find is $1
    file="$1"
    echo "$file"
    diff "$file" "/some/other/path/$file"
    read char </dev/tty
' exec-sh {} ';'

*3. Skipping directories whose names end in .csv

find . -name "*.csv"

To avoid matching directories, add -type f to the find command.

find . -type f -name '*.csv' -exec sh -c '
    # exec-sh fills $0, so the file name found by find is $1
    file="$1"
    echo "$file"
    diff "$file" "/some/other/path/$file"
    read line </dev/tty
' exec-sh {} ';'

If you need to set variables and have them still set at the end of the loop, you can rewrite it to use process substitution like this:

i=0
while IFS= read -r -d '' file; do
    echo "file = $file"
    diff "$file" "/some/other/path/$file"
    read line </dev/tty
    i=$((i+1))
done < <(find . -type f -name '*.csv' -print0)
echo "$i files processed"
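The reason process substitution matters here is that, in bash's default configuration, each part of a pipeline runs in a subshell, so variables changed inside a piped `while` loop are lost afterwards. The following sketch contrasts the two forms. It assumes bash with the `lastpipe` option left off, and the file names and counter names are made up.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
touch "$tmp/a.csv" "$tmp/b c.csv"

# Pipeline: the loop runs in a subshell, so the increments are
# discarded when the pipeline finishes.
piped=0
find "$tmp" -type f -name '*.csv' -print0 |
    while IFS= read -r -d '' file; do
        piped=$((piped + 1))
    done

# Process substitution: the loop runs in the current shell.
kept=0
while IFS= read -r -d '' file; do
    kept=$((kept + 1))
done < <(find "$tmp" -type f -name '*.csv' -print0)

rm -rf "$tmp"
echo "after pipeline: $piped, after process substitution: $kept"
```

Under those assumptions the pipeline counter is still 0 after the loop, while the process-substitution counter holds the number of files found.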
75%

Where find command options are:
-type f : Only search files

The output of find command is piped out to the xargs command, and xargs options are:

HINT: find replaces {} with the current filename and \; marks the end of the command.

for f in file1 file2 file3 file5
do
    echo "Processing $f"
    # do something on $f
done
40%

In this method we will use the os.listdir() function from the os library. This function returns a list of the names of the files present in the directory, in no particular order.

This method uses the os.scandir() function, which returns an iterator used to access the files. The entries are yielded in arbitrary order. It lists the directories or files immediately under that directory.

So to get a specific type of file from a particular directory, we need to iterate through the directory and its subdirectories and print the files with the specific extension.

Import the os library and pass the directory to the os.listdir() function.
