Posted 25/08/2006, 23:52   #6
Bill Seivert
Re: finding consecutive repetitive lines



osiris@abydos.kmt wrote:
> On 22 Aug 2006 18:02:41 -0700, "Xicheng Jia" <xicheng@gmail.com> wrote:
>
>>Ed Morton wrote:
>>
>>>osiris@abydos.kmt wrote:
>>>
>>>
>>>>I have 100+ files of about 150-200 lines each.
>>>>Each file has 8 aligned fields and could be separated with /usr/bin/cut.
>>>>
>>>>After reading a line in a loop, I cut FIELD1, FIELD4 and awk FIELD5.
>>>>I then use FIELD1 to get the line that was just read and
>>>>the following 4 lines with:
>>>>
>>>> /usr/local/bin/grep -A4 "^$FIELD1" "$FILE"
>>>>
>>>>which I then pipe to grep with
>>>>
>>>> |grep "${FIELD4}.*${FIELD5}" |wc -l
>>>>
>>>>I assign the count to a variable
>>>>and then test whether the variable is equal to 5.
>>>>
>>>>I am getting some accurate results but not perfect results.
>>>>
>>>>To clarify, I want all lines printed if $FIELD4 and $FIELD5
>>>>are the same in five successive lines.
>>>>
>>>>Can anyone see the fault in my logic or is there a better way?
>>>>Thanks for any help.
>>>
>>>Something like this (untested) should do it:
>>>
>>>awk '{
>>>    key = $4 FS $5
>>>    if (key == prev) {
>>>        cnt++; saved = saved $0 ORS
>>>        if (cnt == 5) {
>>>            printf "%s", saved
>>>            cnt = 0; saved = ""
>>>        }
>>>    } else {
>>>        cnt = 1; saved = $0 ORS    # a new run starts with this line
>>>    }
>>>    prev = key
>>>}' file
>>>
>>>Think about what you want to do for the 6th consecutive matching line....
>>>
>>> Ed.

>>
>>A version which will print all records having an identical key across
>>5 or more consecutive lines.
>>
>>{
>>    key = $4 FS $5
>>    if (key == prev) {
>>        cnt++; saved = saved $0 ORS
>>        next;
>>    } else if (cnt >= 5) {
>>        printf "%s", saved;
>>    }
>>    cnt = 1; saved = $0 ORS
>>    prev = key
>>}
>>
>>--
>>XC

>
>
> This worked almost perfectly. For some reason it skipped a known occurrence
> of one of the repetitive fields that was at EOF. Would it be easy to also
> include the line before and the line following (just to see the change)?
> Thanks a bunch.


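Sounds like you need an END block to handle the run still held in
"saved" when the input ends. The script only prints a run when it sees
the key change, so a run that reaches the last record is never flushed.
Printing the "saved" value in END, when cnt >= 5, should suffice.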
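
A minimal sketch of that idea, untested against your data: it is XC's
script with an END block added. The wrapper name print_runs is only
for illustration, and I am assuming whitespace-separated fields as in
the scripts above.

```shell
# Hypothetical wrapper (the name is illustrative only): prints every run
# of 5 or more consecutive lines whose fields 4 and 5 are identical,
# including a run that ends on the last line of the input.
print_runs() {
    awk '
    {
        key = $4 FS $5
        if (key == prev) {           # same key: extend the current run
            cnt++; saved = saved $0 ORS
            next
        }
        if (cnt >= 5)                # key changed: flush the finished run
            printf "%s", saved
        cnt = 1; saved = $0 ORS      # start a new run with this line
        prev = key
    }
    END {
        if (cnt >= 5)                # flush a run that reaches EOF
            printf "%s", saved
    }' "$@"
}
```

Call it as print_runs "$FILE", or pipe the data into it. The cnt >= 5
test in END matters: without it, a short run at the end of the file
would be printed as well.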

Bill Seivert
