comp.unix.shell: Using and programming the Unix shell.
#1
How do I search a compressed log file for a given regexp pattern, and
print only the first line of results?

At first I considered piping "grep -Z" to "head -1". But grep still has
to process the entire file for all matches. When only one is required,
that doesn't seem to be very efficient.

My second thought was to use gawk since it could do a pattern match and
print only the first matching line. It's more elegant for this purpose,
but it doesn't decompress files (to my knowledge), so I'd have to
resort to piping the results of "zcat" (that is, the entire
uncompressed file) to "gawk", which is also quite inefficient.

It would be nice if grep had an option to only print N results, but
such a feature doesn't seem to be implemented. Am I missing something?
Is it maybe possible that "head -1" actually terminates after printing
the first line, thus forcing "grep -Z" to receive a SIGPIPE and quit
immediately? This has me quite curious.

TIA,
--Randall
#2
2006-12-6, 12:19(-08), R Krause:
> How do I search a compressed log file for a given regexp pattern, and
> print only the first line of results?
>
> At first I considered piping "grep -Z" to "head -1". But grep still has
> to process the entire file for all matches. When only one is required,
> that doesn't seem to be very efficient.

Not necessarily the whole file. It will probably stop at the second
occurrence (as it will get a SIGPIPE for writing to its stdout, whose
reader is now gone), unless it buffers its output, in which case it
will stop when it writes its second buffer of output.

> My second thought was to use gawk since it could do a pattern match and
> print only the first matching line. It's more elegant for this purpose,
> but it doesn't decompress files (to my knowledge), so I'd have to
> resort to piping the results of "zcat" (that is, the entire
> uncompressed file) to "gawk", which is also quite inefficient.

zcat would uncompress chunks at a time. For me, grep -Z appends a '\0'
instead of a '\n' to each filename in the output of grep -l. But if it's
for uncompressing for you, then I'd bet grep calls uncompress
internally.

In

    uncompress < file.Z | awk '/pattern/ {print; exit}'
    uncompress < file.Z | sed '/pattern/!d;q'

uncompress will again probably write a buffer's worth of data at a
time. Once awk (or sed) exits, uncompress will terminate (killed by a
SIGPIPE) the next time it tries to write to the pipe.

> It would be nice if grep had an option to only print N results, but
> such a feature doesn't seem to be implemented. Am I missing something?
> Is it maybe possible that "head -1" actually terminates after printing
> the first line, thus forcing "grep -Z" to receive a SIGPIPE and quit
> immediately? This has me quite curious.
[...]

That's what head does. head will stop after it has read one line from
the pipe (it may read more than one line if more than one is available,
though). The problem is that grep will wait until it has a full buffer
of output before sending anything to head (you don't notice this when
you run grep without | head because grep's buffering behavior is
different when its stdout is a terminal).

GNU grep has a --line-buffered option to not buffer. But GNU grep also
has the -m <n> option you were after.

    -m, --max-count=NUM    stop after NUM matches
    --line-buffered        flush output on every line

So

-- Stéphane
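
The approaches discussed in this thread can be tried side by side with a small sketch like the one below. It is only an illustration under a few assumptions: the sample log, its contents and the pattern ERROR are made up for the test, the file is compressed with gzip (read back with "gzip -dc") rather than compress/uncompress, and the grep in use is assumed to support -m and --line-buffered as GNU grep does.

```shell
#!/bin/sh
# Build a small gzip-compressed sample log (hypothetical data) in a temp dir.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
log="$tmpdir/sample.log"

i=1
while [ "$i" -le 1000 ]; do
    if [ $((i % 100)) -eq 0 ]; then
        echo "line $i ERROR disk full"
    else
        echo "line $i ok"
    fi
    i=$((i + 1))
done > "$log"
gzip "$log"    # leaves $log.gz in place of $log

# 1. grep -m 1: grep itself exits after the first match
#    (assumes a grep with -m, such as GNU grep).
first_grep=$(gzip -dc "$log.gz" | grep -m 1 'ERROR')

# 2. awk: print the first matching line, then exit.
first_awk=$(gzip -dc "$log.gz" | awk '/ERROR/ {print; exit}')

# 3. sed: delete non-matching lines; on the first match, print it and quit.
first_sed=$(gzip -dc "$log.gz" | sed '/ERROR/!d;q')

# 4. grep | head: --line-buffered makes grep flush each match right away,
#    so head can exit early and grep gets SIGPIPE on a later write.
first_head=$(gzip -dc "$log.gz" | grep --line-buffered 'ERROR' | head -n 1)

echo "grep -m 1 : $first_grep"
echo "awk       : $first_awk"
echo "sed       : $first_sed"
echo "grep|head : $first_head"
```

In each pipeline the decompressor keeps writing only until the reader has gone away, so on a large file it should stop early rather than decompress everything. On a file this small the whole output fits in the pipe buffer, so the sketch shows that the results agree, not the time saved.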