comp.unix.shell: Using and programming the Unix shell.
#1
I have 100+ files of about 150-200 lines each.
Each file has 8 aligned fields and could be separated with /usr/bin/cut.

After reading a line in a loop, I cut FIELD1, FIELD4 and awk FIELD5.
I then use FIELD1 to get the line that was just read and
the following 4 lines with:

    /usr/local/bin/grep -A4 "^$FIELD1" "$FILE"

which I then pipe to grep with

    |grep "${FIELD4}.*${FIELD5}" |wc -l

I assign the count to a variable
and then test whether the variable is equal to 5.

I am getting some accurate results but not perfect results.

To clarify, I want all lines printed if $FIELD4 and $FIELD5
are the same in five successive lines.

Can anyone see the fault in my logic or is there a better way?
Thanks for any help.
#2
osiris@abydos.kmt wrote:
> I have 100+ files of about 150-200 lines each.
> Each file has 8 aligned fields and could be separated with /usr/bin/cut.
[snip]
> To clarify, I want all lines printed if $FIELD4 and $FIELD5
> are the same in five successive lines.
>
> Can anyone see the fault in my logic or is there a better way?

Something like this (untested) should do it:

    awk '{
        key = $4 FS $5
        if (key == prev) {
            cnt++; saved = saved $0 ORS
            if (cnt == 5) {
                printf "%s", saved
                cnt = 0; saved = ""
            }
        } else {
            # new key: this line starts a fresh run
            cnt = 1; saved = $0 ORS
        }
        prev = key
    }' file

Think about what you want to do for the 6th consecutive matching line....

    Ed.
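One possible answer to the 6th-line question is to keep printing for as long as the run continues. The following is only a sketch under stated assumptions: fields are whitespace-separated, a run is counted from its first line (so five matching lines are enough), and the function name `print_runs` and the `min` variable are illustrative names that do not appear in the thread.

```shell
# print_runs: hypothetical helper, not from the thread.
# Reads records on stdin and prints every line that belongs to a run of
# at least MIN consecutive lines sharing fields 4 and 5 (MIN defaults to 5).
print_runs() {
    awk -v min="${1:-5}" '
    {
        key = $4 FS $5
        if (key != prev) {          # key changed: start a new run
            cnt = 0
            saved = ""
        }
        cnt++
        if (cnt < min) {
            saved = saved $0 ORS    # run still too short: buffer the line
        } else if (cnt == min) {
            printf "%s", saved      # run just qualified: flush the buffer
            print                   # ... and the current line
        } else {
            print                   # run already qualified: print as we go
        }
        prev = key
    }'
}

# Usage (FILE is whatever input file you are checking):
#   print_runs 5 < "$FILE"
```

Because qualifying lines are written as soon as the run reaches the threshold, nothing is left buffered for a run that ends at the last line of the input.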
#3
Ed Morton wrote:
> osiris@abydos.kmt wrote:
> > To clarify, I want all lines printed if $FIELD4 and $FIELD5
> > are the same in five successive lines.
>
> Something like this (untested) should do it:
[snip]
> Think about what you want to do for the 6th consecutive matching line....

A version that prints all records sharing an identical key across
five or more consecutive lines:

    {
        key = $4 FS $5
        if (key == prev) {
            # same key: extend the current run
            cnt++; saved = saved $0 ORS
            next;
        } else if (cnt >= 5) {
            # key changed: print the previous run if it was long enough
            printf "%s", saved;
        }
        cnt = 1; saved = $0 ORS
        prev = key
    }

--
XC
#4
osiris@abydos.kmt wrote:
> I have 100+ files of about 150-200 lines each.
> Each file has 8 aligned fields and could be separated with /usr/bin/cut.
>
> After reading a line in a loop, I cut FIELD1, FIELD4 and awk FIELD5.
> I then use FIELD1 to get the line that was just read and
> the following 4 lines with:
......[snip]
> To clarify, I want all lines printed if $FIELD4 and $FIELD5
> are the same in five successive lines.
>

sed -e '
/regex/!d
s/^/\
/
:loop
$d;N
/.*\n[^ \n]* [^ \n]* [^ \n]* \([^ \n]* [^ \n]*\) [^\n]*\n[^ \n]* [^ \n]* [^ \n]* \1 [^\n]*$/!d
s/\n/&/5;tend
bloop
:end
s/.//
' yourfile
#5
On 22 Aug 2006 18:02:41 -0700, "Xicheng Jia" <xicheng@gmail.com> wrote:
>
>Ed Morton wrote:
[snip]
>A version that prints all records sharing an identical key across
>five or more consecutive lines:
>
>{
>    key = $4 FS $5
>    if (key == prev) {
>        cnt++; saved = saved $0 ORS
>        next;
>    } else if (cnt >= 5) {
>        printf "%s", saved;
>    }
>    cnt = 1; saved = $0 ORS
>    prev = key
>}
>
>--
>XC

This worked almost perfectly. For some reason it skipped a known occurrence
of one of the repetitive fields that was at EOF. Would it be easy to also
include the line before and the line following (just to see the change)?
Thanks a bunch.
#6
osiris@abydos.kmt wrote:
> On 22 Aug 2006 18:02:41 -0700, "Xicheng Jia" <xicheng@gmail.com> wrote:
[snip]
> This worked almost perfectly. For some reason it skipped a known occurrence
> of one of the repetitive fields that was at EOF. Would it be easy to also
> include the line before and the line following (just to see the change)?
> Thanks a bunch.

Sounds like you should have an END block to handle any remaining
fields (i.e., from the last record).
Probably printing the "saved" value will suffice.

Bill Seivert
#7
osiris@abydos.kmt wrote:
> On 22 Aug 2006 18:02:41 -0700, "Xicheng Jia" <xicheng@gmail.com> wrote:
[snip]
> This worked almost perfectly. For some reason it skipped a known occurrence
> of one of the repetitive fields that was at EOF. Would it be easy to also
> include the line before and the line following (just to see the change)?
> Thanks a bunch.

Yes, you need to add an END block, like:

    {
        key = $4 FS $5
        if (key == prev) {
            cnt++; saved = saved $0 ORS
            next;
        } else if (cnt >= 5) {
            printf "%s", saved;
        }
        cnt = 1; saved = $0 ORS
        prev = key
    }
    # the last run never sees a key change, so flush it here
    END { if (cnt >= 5) printf "%s", saved; }

--
Xicheng
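The request for the line before and the line after each run is not answered in the thread. One way it might be done is sketched below, keeping the same run-tracking logic and remembering the line that preceded the current run. This is an assumption-laden sketch, not a tested solution from any poster: `runs_with_context`, `before`, `last` and `min` are hypothetical names, and fields are assumed to be whitespace-separated.

```shell
# runs_with_context: hypothetical helper, not from the thread.
# Prints each run of at least MIN consecutive lines sharing fields 4 and 5
# (MIN defaults to 5), plus the line before and the line after the run.
runs_with_context() {
    awk -v min="${1:-5}" '
    {
        key = $4 FS $5
        if (key == prev) {              # same key: extend the current run
            cnt++
            saved = saved $0 ORS
        } else {                        # key changed: this line follows the old run
            if (cnt >= min) {
                printf "%s%s", before, saved
                print                   # the line after the run
            }
            before = last               # line preceding the new run ("" at start of input)
            cnt = 1
            saved = $0 ORS
            prev = key
        }
        last = $0 ORS
    }
    END {
        if (cnt >= min)                 # run that reaches end of input: no following line
            printf "%s%s", before, saved
    }'
}

# Usage (FILE is whatever input file you are checking):
#   runs_with_context 5 < "$FILE"
```

Note that when two qualifying runs are directly adjacent, the boundary lines are printed twice, once as context for each run; whether that is acceptable depends on how the output is going to be read.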