comp.unix.shell Using and programming the Unix shell.

find - exec vs xargs

17/03/2008, 03:43   #1
freightcar

I have a small script that will find some files and then grep for
specific keywords in them. The number of files is ~ 20,000 and they
are all small 1-2 KB. If I use exec grep it takes way longer (20-30s)
to complete than xargs grep (2-3s).

I would love to know why.

17/03/2008, 03:52   #2
freightcar

On Mar 16, 10:43pm, freightcar <freight...@gmail.com> wrote:
> I have a small script that will find some files and then grep for
> specific keywords in them. The number of files is ~ 20,000 and they
> are all small 1-2 KB. If I use exec grep it takes way longer (20-30s)
> to complete than xargs grep (2-3s).
>
> I would love to know why.


I forgot to mention that I am using "find".

17/03/2008, 04:16   #3
Wayne

freightcar wrote:
> I have a small script that will find some files and then grep for
> specific keywords in them. The number of files is ~ 20,000 and they
> are all small 1-2 KB. If I use exec grep it takes way longer (20-30s)
> to complete than xargs grep (2-3s).
>
> I would love to know why.


It is likely that the time of this command is dominated by the
time it takes to create a new process (for the grep command).
The common way to use find with -exec is:
find ... -exec command '{}' \;
That will be slower than:
find ... |xargs command
because the first way runs the command once per file. In
your case, that means starting 20,000 processes.

With xargs (or a more sophisticated "find ... -exec" command)
far fewer processes are started, perhaps only one.
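
The difference in process count can be made visible with a small experiment. This is only a sketch: the scratch directory, the file names, the count of 200 files and the log file names are all made up for illustration, and a tiny sh -c stand-in replaces grep so that every start of the command leaves one line in a log.

```shell
#!/bin/sh
# Scratch directory with 200 small files; path and names are illustrative.
dir=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
    echo "keyword $i" > "$dir/file$i.txt"
    i=$((i + 1))
done

# Stand-in for grep: every time it is started it appends one line to a log.
# Inside sh -c, $0 is the log path and the file names arrive as "$@".
count_runs='echo run >> "$0"'

# One process per file.
find "$dir" -name '*.txt' -exec sh -c "$count_runs" "$dir/per_file.log" {} \;
# As many file names per process as will fit.
find "$dir" -name '*.txt' -exec sh -c "$count_runs" "$dir/batched.log" {} +
# Same batching, done by xargs (safe here only because the names are tame).
find "$dir" -name '*.txt' | xargs sh -c "$count_runs" "$dir/piped.log"

per_file=$(wc -l < "$dir/per_file.log" | tr -d ' ')
batched=$(wc -l < "$dir/batched.log" | tr -d ' ')
piped=$(wc -l < "$dir/piped.log" | tr -d ' ')
printf 'per file: %s  batched: %s  xargs: %s\n' "$per_file" "$batched" "$piped"

rm -rf "$dir"
```

With only 200 short names everything fits into a single command line, so the batched forms start one process where the \; form starts 200. Scaled to 20,000 files, that gap is the likely source of the timing difference.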

-Wayne

17/03/2008, 09:38   #4
Stephane CHAZELAS

2008-03-16, 23:16(-04), Wayne:
> freightcar wrote:
>> I have a small script that will find some files and then grep for
>> specific keywords in them. The number of files is ~ 20,000 and they
>> are all small 1-2 KB. If I use exec grep it takes way longer (20-30s)
>> to complete than xargs grep (2-3s).
>>
>> I would love to know why.

>
> It is likely that the time of this command is dominated by the
> time it takes to create a new process (for the grep command).
> The common way to use find with -exec is:
> find ... -exec command '{}' \;
> That will be slower than:
> find ... |xargs command
> because the first way runs the command once per file. In
> your case, that means starting 20,000 processes.
>
> With xargs (or a more sophisticated "find ... -exec" command)
> far fewer processes are started, perhaps only one.

[...]

The output of find -print cannot be post-processed reliably,
because it is a list of file names separated by NL characters,
while NL is as valid as any other character in a file name.

And the default format expected by xargs is not a
newline-separated list; it is a blank-separated list (blanks
including NL) in which quotes and backslashes also have their
role to play.

xargs also has stupid limitations on the length of the
arguments.

All that makes it very difficult to use xargs reliably unless
you use GNU's -0 option.

Standard implementations of find have -exec cmd {} +
which will run fewer commands, so you don't need xargs.
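
A small sketch of the problem, using three made-up file names (one plain, one with a space, one with an embedded newline) in a scratch directory. The -print0 and -0 options in the last command are extensions that not every find and xargs implementation provides, so treat that line as an assumption about the local tools.

```shell
#!/bin/sh
# Three files that all contain the word "needle"; names are illustrative.
dir=$(mktemp -d)
echo needle > "$dir/plain.txt"
echo needle > "$dir/with space.txt"
echo needle > "$dir/with
newline.txt"

# Plain pipe: xargs splits the names on blanks and newlines, so grep is
# handed pieces of names that do not exist. xargs then reports failure,
# hence the "|| true".
naive=$(find "$dir" -type f | { xargs grep -h needle 2>/dev/null || true; } \
    | wc -l | tr -d ' ')

# -exec ... {} + hands each name to grep intact, with no text parsing.
with_plus=$(find "$dir" -type f -exec grep -h needle {} + | wc -l | tr -d ' ')

# NUL-separated names survive the pipe (needs -print0 and xargs -0).
with_null=$(find "$dir" -type f -print0 | xargs -0 grep -h needle \
    | wc -l | tr -d ' ')

printf 'naive: %s  -exec +: %s  -print0/-0: %s\n' \
    "$naive" "$with_plus" "$with_null"

rm -rf "$dir"
```

The two robust forms find a match in all three files, while the plain pipe misses the files whose names it split apart.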

--
Stéphane

17/03/2008, 13:07   #5
Maxwell Lol

Stephane CHAZELAS <this.address@is.invalid> writes:

> All that makes it very difficult to use xargs reliably unless
> you use GNU's -0 option.


Unless you have control over the filenames...


17/03/2008, 16:27   #6
Edward Rosten

On Mar 17, 2:38 am, Stephane CHAZELAS <this.addr...@is.invalid> wrote:
> 2008-03-16, 23:16(-04), Wayne:


> xargs also has stupid limitations on the length of the
> arguments.


I do not believe it's xargs, but rather the kernel. BASH has similar
limits for `` or $() on non internal commands. Copying a little from
my xterm:

~ $yes | tr -d '\n' | head -c 1000000 | xargs /bin/true
xargs: argument line too long
~ $/bin/true `yes | tr -d '\n' | head -c 1000000`
bash: /bin/true: Argument list too long
~ $true `yes | tr -d '\n' | head -c 1000000`

note that bash has a builtin true and has no arbitrary limits on the
length of command line arguments.

-Ed
--
(You can't go wrong with psycho-rats.)(http://mi.eng.cam.ac.uk/~er258)

/d{def}def/f{/Times s selectfont}d/s{11}d/r{roll}d f 2/m{moveto}d -1
r 230 350 m 0 1 179{ 1 index show 88 rotate 4 mul 0 rmoveto}for/s 12
d f pop 235 420 translate 0 0 moveto 1 2 scale show showpage

17/03/2008, 16:59   #7
Stephane CHAZELAS

2008-03-17, 08:27(-07), Edward Rosten:
> On Mar 17, 2:38 am, Stephane CHAZELAS <this.addr...@is.invalid> wrote:
>> 2008-03-16, 23:16(-04), Wayne:

>
>> xargs also has stupid limitations on the length of the
>> arguments.

>
> I do not believe it's xargs, but rather the kernel. BASH has similar
> limits for `` or $() on non internal commands. Copying a little from
> my xterm:
>
> ~ $yes | tr -d '\n' | head -c 1000000 | xargs /bin/true
> xargs: argument line too long
> ~ $/bin/true `yes | tr -d '\n' | head -c 1000000`
> bash: /bin/true: Argument list too long
> ~ $true `yes | tr -d '\n' | head -c 1000000`
>
> note that bash has a builtin true and has no arbitrary limits on the
> length of command line arguments.

[...]

You must be referring to the execve(2) system call's limit on
the combined size of envp+argv (which of course doesn't affect
shell builtins, as there's no execve for them).

xargs is the tool to overcome that limitation: it splits the
argument list and runs as many commands as necessary so that
the execve(2) limit is never reached.
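
The splitting can be observed directly. In this sketch, awk generates 300,000 dummy arguments (about 3 MB, a size picked only so that it is well above xargs's default command-line budget on common systems), and each command started by xargs prints how many arguments it was given.

```shell
#!/bin/sh
# Each sh started by xargs prints its argument count ("$#") on its own line.
counts=$(awk 'BEGIN { for (i = 0; i < 300000; i++) print "abcdefghi" }' \
    | xargs sh -c 'echo "$#"' sh)

# Number of commands xargs had to start.
runs=$(printf '%s\n' "$counts" | wc -l | tr -d ' ')

# Sum of the per-command argument counts: nothing should be lost.
total=0
for n in $counts; do
    total=$((total + n))
done

printf 'xargs started %s commands for %s arguments\n' "$runs" "$total"
```

The number of commands depends on the system's limits and on the xargs implementation, but every argument is delivered exactly once, and none of the commands fails with "Argument list too long".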

But that's not the limitation I was thinking of. I know some
xargs implementations have a very low limit (around 255 bytes)
on the size of an argument, lower than LINE_MAX or the max
length of a path for instance, but now that I'm looking for
supporting information, it may be for the -I option only (where
POSIX limits it to 255 bytes).

--
Stéphane

18/03/2008, 03:16   #8
Maxwell Lol

Edward Rosten <Edward.Rosten@gmail.com> writes:

> > xargs also has stupid limitations on the length of the
> > arguments.

>
> I do not believe it's xargs, but rather the kernel.



Yup. Check the value of ARG_MAX in /usr/include/linux/limits.h
On my system it's:

#define ARG_MAX 131072 /* # bytes of args + environ for exec() */
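
The header holds a compile-time constant. The value that applies to the running system can also be queried with the POSIX getconf utility, as in the sketch below. As far as I know, Linux kernels from 2.6.23 onward derive the effective limit from the stack size limit, so the number printed may be larger than the one in the header; that detail comes from general knowledge rather than from this thread.

```shell
#!/bin/sh
# Limit on the combined size of arguments and environment for exec().
arg_max=$(getconf ARG_MAX)

# Rough size of the current environment, which counts against that limit.
env_bytes=$(env | wc -c | tr -d ' ')

printf 'ARG_MAX: %s bytes\n' "$arg_max"
printf 'environment: about %s bytes\n' "$env_bytes"
```

Since the environment shares the same budget, a process with a very large environment has correspondingly less room for arguments.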

