comp.unix.shell Using and programming the Unix shell.

finding consecutive repetitive lines

22/08/2006, 07:05   #1
osiris@abydos.kmt
finding consecutive repetitive lines

I have 100+ files of about 150-200 lines each.
Each file has 8 aligned fields and could be separated with /usr/bin/cut.

After reading a line in a loop, I cut FIELD1, FIELD4 and awk FIELD5.
I then use FIELD1 to get the line that was just read and
the following 4 lines with:

/usr/local/bin/grep -A4 "^$FIELD1" "$FILE"

which I then pipe to grep with

|grep "${FIELD4}.*${FIELD5}" |wc -l

I assign the count to a variable
and then test whether the variable is equal to 5.

I am getting some accurate results but not perfect results.

To clarify, I want all lines printed if $FIELD4 and $FIELD5
are the same in five successive lines.

Can anyone see the fault in my logic or is there a better way?
Thanks for any help.
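
One possible reason for the imperfect counts, sketched under assumptions because the post does not show the data: the pattern "^$FIELD1" is an unanchored prefix, so it also matches any other line whose first field merely starts with the same characters, and grep -A4 then returns several groups of lines. The sample file and the values 1, X and Y below are hypothetical; -A is a GNU grep option (also available in BSD grep).

```shell
# Hypothetical sample: 12 lines of 8 space-separated fields.
# Lines 1-5 and 10-12 carry "X Y" in fields 4 and 5.
sample=$(mktemp)
awk 'BEGIN {
    for (i = 1; i <= 12; i++) {
        k = "P" i " Q" i
        if (i <= 5 || i >= 10) k = "X Y"
        print i, "a", "b", k, "c", "d", "e"
    }
}' > "$sample"

FIELD1=1; FIELD4=X; FIELD5=Y

# Prefix match as in the post: "^1" also matches lines 10, 11 and 12.
loose=$(grep -A4 "^$FIELD1" "$sample" | grep -c "${FIELD4}.*${FIELD5}")

# Requiring the delimiter after the field restricts the match to line 1.
strict=$(grep -A4 "^$FIELD1 " "$sample" | grep -c "${FIELD4}.*${FIELD5}")

echo "loose=$loose strict=$strict"
rm -f "$sample"
```

With this sample the prefix pattern counts 8 and the space-terminated pattern counts 5, so a test for exactly 5 would miss a genuine run. The field values are also interpreted as regular expressions, and "${FIELD4}.*${FIELD5}" is not tied to columns 4 and 5, which may cause further mismatches.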

22/08/2006, 13:13   #2
Ed Morton
Re: finding consecutive repetitive lines

osiris@abydos.kmt wrote:

> To clarify, I want all lines printed if $FIELD4 and $FIELD5
> are the same in five successive lines.
>
> Can anyone see the fault in my logic or is there a better way?
> Thanks for any help.


Something like this (untested) should do it:

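awk '{
    key = $4 FS $5
    if (key == prev) {
        cnt++; saved = saved $0 ORS
        if (cnt == 5) {
            # five consecutive lines share the key: print them and start over
            printf "%s", saved
            cnt = 0; saved = ""
        }
    } else {
        # key changed: this line is the first of a possible new run
        cnt = 1; saved = $0 ORS
    }
    prev = key
}' file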

Think about what you want to do for the 6th consecutive matching line....

Ed.
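
The hint about the 6th line can be made concrete. The sketch below is an illustration, not necessarily the exact script from the post: it counts the first line of a run as 1, prints in batches of five, and is run on a hypothetical six-line sample in which every line has the same 4th and 5th fields.

```shell
# Hypothetical sample: six lines whose 4th and 5th fields are identical.
sample=$(mktemp)
for i in 1 2 3 4 5 6; do
    echo "id$i a b X Y c d e"
done > "$sample"

# Batch-of-five variant: print each complete group of five, then reset.
out=$(awk '{
    key = $4 FS $5
    if (key == prev) {
        cnt++; saved = saved $0 ORS
        if (cnt == 5) { printf "%s", saved; cnt = 0; saved = "" }
    } else {
        cnt = 1; saved = $0 ORS
    }
    prev = key
}' "$sample")

echo "$out"
count=$(printf '%s\n' "$out" | grep -c .)
echo "printed $count of 6 lines"
rm -f "$sample"
```

Lines id1 to id5 are printed; id6 starts a new batch that never reaches five, so it is dropped. Whether that is acceptable depends on what "five successive lines" should mean for longer runs.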

23/08/2006, 02:02   #3
Xicheng Jia
Re: finding consecutive repetitive lines

Ed Morton wrote:
> Something like this (untested) should do it:
>
> [awk script snipped]
>
> Think about what you want to do for the 6th consecutive matching line....


A version that prints all records sharing an identical key across
5 or more consecutive lines:

{
    key = $4 FS $5
    if (key == prev) {
        cnt++; saved = saved $0 ORS
        next;
    } else if (cnt >= 5) {
        printf "%s", saved;
    }
    cnt = 1; saved = $0 ORS
    prev = key
}
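
The script above is given without an invocation. One way to run it, as a sketch: store it in a file and pass it with awk -f. The temporary file names and the eight-line sample (two "A B" lines, five "X Y" lines, one "C D" line) are made up for the demonstration.

```shell
# Hypothetical sample data.
sample=$(mktemp)
awk 'BEGIN {
    for (i = 1; i <= 8; i++) {
        k = "X Y"
        if (i <= 2) k = "A B"
        if (i == 8) k = "C D"
        print "id" i, "a", "b", k, "c", "d", "e"
    }
}' > "$sample"

# The awk program from the post, saved to a file.
script=$(mktemp)
cat > "$script" <<'EOF'
{
    key = $4 FS $5
    if (key == prev) {
        cnt++; saved = saved $0 ORS
        next;
    } else if (cnt >= 5) {
        printf "%s", saved;
    }
    cnt = 1; saved = $0 ORS
    prev = key
}
EOF

out=$(awk -f "$script" "$sample")
echo "$out"
rm -f "$sample" "$script"
```

The run of five "X Y" lines (id3 to id7) is printed when the "C D" line arrives; the two-line "A B" run is too short and is skipped.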

--
XC

23/08/2006, 08:04   #4
Rakesh Sharma
Re: finding consecutive repetitive lines


osiris@abydos.kmt wrote:
> I have 100+ files of about 150-200 lines each.
> Each file has 8 aligned fields and could be separated with /usr/bin/cut.
>
> After reading a line in a loop, I cut FIELD1, FIELD4 and awk FIELD5.
> I then use FIELD1 to get the line that was just read and
> the following 4 lines with:

......[snip]

> To clarify, I want all lines printed if $FIELD4 and $FIELD5
> are the same in five successive lines.
>




sed -e '
/regex/!d
s/^/\
/

:loop
$d;N
/.*\n[^ \n]* [^ \n]* [^ \n]* \([^ \n]* [^ \n]*\) [^\n]*\n[^ \n]* [^ \n]* [^ \n]* \1 [^\n]*$/!d
s/\n/&/5;tend
bloop

:end
s/.//
' yourfile

25/08/2006, 10:27   #5
osiris@abydos.kmt
Re: finding consecutive repetitive lines


On 22 Aug 2006 18:02:41 -0700, "Xicheng Jia" <xicheng@gmail.com> wrote:
>
>A version that prints all records sharing an identical key across
>5 or more consecutive lines.
>
>[awk script snipped]


This worked almost perfectly. For some reason it skipped a known occurrence
of one of the repetitive fields that was at EOF. Would it be easy to also
include the line before and the line following (just to see the change)?
Thanks a bunch.

25/08/2006, 23:52   #6
Bill Seivert
Re: finding consecutive repetitive lines



osiris@abydos.kmt wrote:
> This worked almost perfectly. For some reason it skipped a known occurrence
> of one of the repetitive fields that was at EOF. Would it be easy to also
> include the line before and the line following (just to see the change)?
> Thanks a bunch.


Sounds like you should have an END block to handle any remaining
fields (i.e., from the last record). Probably printing the "saved"
value will suffice.
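
The effect can be reproduced on a small case. In this sketch the sample file is hypothetical and ends in a run of five matching lines; the same main block is run once as posted and once with an END block. The END block is guarded with cnt >= 5, an addition to the suggestion above, so that a short trailing run is not printed.

```shell
# Hypothetical sample: one "A B" line followed by five "X Y" lines at EOF.
sample=$(mktemp)
awk 'BEGIN {
    print "id1", "a", "b", "A", "B", "c", "d", "e"
    for (i = 2; i <= 6; i++) print "id" i, "a", "b", "X", "Y", "c", "d", "e"
}' > "$sample"

# Main block: a run is only printed when a line with a different key arrives.
body='{
    key = $4 FS $5
    if (key == prev) { cnt++; saved = saved $0 ORS; next }
    if (cnt >= 5) printf "%s", saved
    cnt = 1; saved = $0 ORS
    prev = key
}'

without_end=$(awk "$body" "$sample")
with_end=$(awk "$body"' END { if (cnt >= 5) printf "%s", saved }' "$sample")

echo "without END: [$without_end]"
echo "with END:"
echo "$with_end"
rm -f "$sample"
```

Without the END block nothing is printed, because no later line ever triggers the flush; with it, id2 to id6 appear.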

Bill Seivert

26/08/2006, 21:27   #7
Xicheng Jia
Re: finding consecutive repetitive lines

osiris@abydos.kmt wrote:
> This worked almost perfectly. For some reason it skipped a known occurrence
> of one of the repetitive fields that was at EOF. Would it be easy to also
> include the line before and the line following (just to see the change)?
> Thanks a bunch.


Yes, you need to add an END block, like:

{
    key = $4 FS $5
    if (key == prev) {
        cnt++; saved = saved $0 ORS
        next;
    } else if (cnt >= 5) {
        printf "%s", saved;
    }
    cnt = 1; saved = $0 ORS
    prev = key
}
END {
    if (cnt >= 5) printf "%s", saved;
}
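
For the other request, showing the line before and the line after each run, one possible extension is sketched below. The helper emit, the variables before and last, and the "--" separator are names and choices made up for this sketch, and the 12-line sample is hypothetical.

```shell
# Hypothetical sample: id1 "A B", id2-id6 "X Y", id7 "C D", id8-id12 "X Y".
sample=$(mktemp)
awk 'BEGIN {
    for (i = 1; i <= 12; i++) {
        k = "X Y"
        if (i == 1) k = "A B"
        if (i == 7) k = "C D"
        print "id" i, "a", "b", k, "c", "d", "e"
    }
}' > "$sample"

out=$(awk '
# Print a finished run of 5+ lines with the line before it and the
# line after it (either may be absent at the start or end of the file).
function emit(after) {
    if (cnt < 5) return
    if (before != "") print before
    printf "%s", saved
    if (after != "") print after
    print "--"
}
{
    key = $4 FS $5
    if (key == prev) {
        cnt++; saved = saved $0 ORS
    } else {
        emit($0)          # current line is the one following the old run
        before = last     # previous line precedes the new run
        cnt = 1; saved = $0 ORS
        prev = key
    }
    last = $0
}
END { emit("") }
' "$sample")

echo "$out"
rm -f "$sample"
```

On the sample this prints id1 through id7, a "--" separator, then id7 through id12 and a second separator. The run at EOF has no following line, so only its preceding line is added.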

--
Xicheng
