comp.unix.shell: Using and programming the Unix shell.
#1
Hi All,
I'm looking to put something together that would compare and integrate
columns of data from two files. In one file I have load average that I
got from sar and prettied up.

from file: load_avg
date time 1min 5min 15min
2006-11-07 10:00:01 0.02 0.02 0.00
2006-11-07 10:10:01 0.03 0.02 0.00
2006-11-07 10:20:01 0.01 0.02 0.00
2006-11-07 10:30:01 0.01 0.00 0.00
2006-11-07 10:40:01 0.00 0.00 0.00
2006-11-07 10:50:01 0.02 0.03 0.00
2006-11-07 11:00:01 0.07 0.06 0.01
2006-11-07 11:10:01 0.05 0.05 0.00
2006-11-07 11:20:01 0.01 0.04 0.00
2006-11-07 11:30:01 0.02 0.04 0.00
2006-11-07 11:40:01 0.24 0.06 0.02
2006-11-07 11:50:01 0.06 0.04 0.00

The other file is queries per second from apache logs.

from file: hits_per_second
date time qps
2006-11-07 10:59:36 2
2006-11-07 10:59:37 1
2006-11-07 10:59:38 1
2006-11-07 10:59:40 1
2006-11-07 10:59:41 1
2006-11-07 10:59:43 1
2006-11-07 10:59:44 1
2006-11-07 10:59:45 1
2006-11-07 10:59:46 1
2006-11-07 10:59:47 1
2006-11-07 10:59:49 1
2006-11-07 10:59:50 1
2006-11-07 10:59:51 2
2006-11-07 10:59:52 1
2006-11-07 10:59:53 1
2006-11-07 10:59:54 2
2006-11-07 11:00:40 2
2006-11-07 11:00:41 3
2006-11-07 11:00:43 1
2006-11-07 11:00:44 3
2006-11-07 11:00:45 2
2006-11-07 11:00:46 4
2006-11-07 11:00:48 1
2006-11-07 11:00:49 2
2006-11-07 11:00:50 4

I'd like to find a way (my attempts have been with awk) to get the load
averages added on after the qps column. Since my load avg is only done
every 10 minutes I want to put the load average for 10:50 tacked on to
any qps result in the 10:50:00-10:59:59 range. Example of what I'd
like to see.
(since 0.03 0.00 0.00 is the load avg from 10:50 & 0.07 0.06 0.01 is
the load avg from 11:00)
2006-11-07 10:59:47 1 0.03 0.00 0.00
2006-11-07 10:59:49 1 0.02 0.03 0.00
2006-11-07 10:59:50 1 0.02 0.03 0.00
2006-11-07 10:59:51 2 0.02 0.03 0.00
2006-11-07 10:59:52 1 0.02 0.03 0.00
2006-11-07 10:59:53 1 0.02 0.03 0.00
2006-11-07 10:59:54 2 0.02 0.03 0.00
2006-11-07 11:00:40 2 0.07 0.06 0.01
2006-11-07 11:00:41 3 0.07 0.06 0.01
2006-11-07 11:00:43 1 0.07 0.06 0.01
2006-11-07 11:00:44 3 0.07 0.06 0.01
2006-11-07 11:00:45 2 0.07 0.06 0.01
2006-11-07 11:00:46 4 0.07 0.06 0.01

I've tried several variations on the following, but I have no idea how
to do the range matching.
gawk 'NR==FNR{b[$2]=$3;next}{print $0 OFS b[$4;}' hits_per_second load_avg

Ultimately I'm looking to get all of this data put into a graph where I
can compare the queries per second on the web server to the load
average on the server (each of the 3 load average columns will have a
peak and the qps will have a peak). I still have no idea how I'm going
to pull that off.

Any thoughts on this would be a great help.
#2
In comp.unix.shell rick <devrick88@gmail.com>:
> Hi All,
> I'm looking to put something together that would compare and integrate
> columns of data from two files. In one file I have load average that I
> got from sar and prettied up.
> from file: load_avg
> date time 1min 5min 15min
> 2006-11-07 10:00:01 0.02 0.02 0.00
> 2006-11-07 10:10:01 0.03 0.02 0.00
> 2006-11-07 10:20:01 0.01 0.02 0.00
> 2006-11-07 10:30:01 0.01 0.00 0.00
> 2006-11-07 10:40:01 0.00 0.00 0.00
[..]
> The other file is queries per second from apache logs.
> from file: hits_per_second
> date time qps
> 2006-11-07 10:59:36 2
> 2006-11-07 10:59:37 1
> 2006-11-07 10:59:38 1
> 2006-11-07 10:59:40 1
> 2006-11-07 10:59:41 1
> the load avg from 11:00)
> 2006-11-07 10:59:47 1 0.03 0.00 0.00
> 2006-11-07 10:59:49 1 0.02 0.03 0.00
> 2006-11-07 10:59:50 1 0.02 0.03 0.00

man join

Good luck

btw
Your example data doesn't match at all?

--
Michael Heiming (X-PGP-Sig > GPG-Key ID: EDD27B94)
mail: echo zvpunry@urvzvat.qr | perl -pe 'y/a-z/n-za-m/'
#bofh excuse 114: electro-magnetic pulses from French above
ground nuke testing.
#3
2006-11-8, 14:13(-08), rick:
> Hi All,
>
> I'm looking to put something together that would compare and integrate
> columns of data from two files. In one file I have load average that I
> got from sar and prettied up.
> from file: load_avg
> date time 1min 5min 15min
[...]
> 2006-11-07 10:50:01 0.02 0.03 0.00
> 2006-11-07 11:00:01 0.07 0.06 0.01
[...]
>
> The other file is queries per second from apache logs.
> from file: hits_per_second
> date time qps
> 2006-11-07 10:59:36 2
> 2006-11-07 10:59:37 1
[...]
> I'd like to find a way (my attempts have been with awk) to get the load
> averages added on after the qps column. Since my load avg is only done
> every 10 minutes I want to put the load average for 10:50 tacked on to
> any qps result in the 10:50:00-10:59:59 range. Example of what I'd
> like to see.
> (since 0.03 0.00 0.00 is the load avg from 10:50 & 0.07 0.06 0.01 is
> the load avg from 11:00)
> 2006-11-07 10:59:47 1 0.03 0.00 0.00
> 2006-11-07 10:59:49 1 0.02 0.03 0.00
[...]

If your shell supports process substitution (zsh, bash, some
kshs):

join -t, <(sed 's/:./&,/' < hits_per_second) \
         <(sed 's/\(:.\)[^ ]*/\1,/' < load_avg) | tr -d ,

--
Stéphane
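To see why the join works, it helps to look at what each sed produces. The sketch below is not from the thread: it uses two made-up two-line sample files in a temporary directory, and temporary files instead of process substitution so that it should also run in a plain POSIX sh. Both inputs are assumed to be header-less and already sorted by timestamp, which join expects.

```shell
#!/bin/sh
# Sketch of the join approach on a tiny sample. File names and data are
# illustrative only; header lines are left out on purpose.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

cat > "$tmp/load_avg" <<'EOF'
2006-11-07 10:50:01 0.02 0.03 0.00
2006-11-07 11:00:01 0.07 0.06 0.01
EOF

cat > "$tmp/hits_per_second" <<'EOF'
2006-11-07 10:59:54 2
2006-11-07 11:00:40 2
EOF

# Put a comma after the tens-of-minutes digit: "10:59:54" -> "10:5,9:54".
# Everything before the comma ("2006-11-07 10:5") becomes the join key.
sed 's/:./&,/' < "$tmp/hits_per_second" > "$tmp/hits.keyed"

# Same key for the load averages, but drop the rest of the time:
# "10:50:01 0.02" -> "10:5, 0.02".
sed 's/\(:.\)[^ ]*/\1,/' < "$tmp/load_avg" > "$tmp/load.keyed"

# Join on the text before the comma, then delete the commas again.
merged=$(join -t, "$tmp/hits.keyed" "$tmp/load.keyed" | tr -d ,)
printf '%s\n' "$merged"
```

Each hits line picks up the load averages of the 10-minute bucket it falls into, and deleting the commas restores the original timestamps.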
#4
rick wrote:
> Hi All,
>
> I'm looking to put something together that would compare and integrate
> columns of data from two files. In one file I have load average that I
> got from sar and prettied up.
> from file: load_avg
> date time 1min 5min 15min
> 2006-11-07 10:00:01 0.02 0.02 0.00
> 2006-11-07 10:10:01 0.03 0.02 0.00
> 2006-11-07 10:20:01 0.01 0.02 0.00
> 2006-11-07 10:30:01 0.01 0.00 0.00
> 2006-11-07 10:40:01 0.00 0.00 0.00
> 2006-11-07 10:50:01 0.02 0.03 0.00
> 2006-11-07 11:00:01 0.07 0.06 0.01
> 2006-11-07 11:10:01 0.05 0.05 0.00
> 2006-11-07 11:20:01 0.01 0.04 0.00
> 2006-11-07 11:30:01 0.02 0.04 0.00
> 2006-11-07 11:40:01 0.24 0.06 0.02
> 2006-11-07 11:50:01 0.06 0.04 0.00
>
> The other file is queries per second from apache logs.
> from file: hits_per_second
> date time qps
> 2006-11-07 10:59:36 2
> 2006-11-07 10:59:37 1
> 2006-11-07 10:59:38 1
> 2006-11-07 10:59:40 1
> 2006-11-07 10:59:41 1
> 2006-11-07 10:59:43 1
> 2006-11-07 10:59:44 1
> 2006-11-07 10:59:45 1
> 2006-11-07 10:59:46 1
> 2006-11-07 10:59:47 1
> 2006-11-07 10:59:49 1
> 2006-11-07 10:59:50 1
> 2006-11-07 10:59:51 2
> 2006-11-07 10:59:52 1
> 2006-11-07 10:59:53 1
> 2006-11-07 10:59:54 2
> 2006-11-07 11:00:40 2
> 2006-11-07 11:00:41 3
> 2006-11-07 11:00:43 1
> 2006-11-07 11:00:44 3
> 2006-11-07 11:00:45 2
> 2006-11-07 11:00:46 4
> 2006-11-07 11:00:48 1
> 2006-11-07 11:00:49 2
> 2006-11-07 11:00:50 4
>
> I'd like to find a way (my attempts have been with awk) to get the load
> averages added on after the qps column. Since my load avg is only done
> every 10 minutes I want to put the load average for 10:50 tacked on to
> any qps result in the 10:50:00-10:59:59 range. Example of what I'd
> like to see.
> (since 0.03 0.00 0.00 is the load avg from 10:50 & 0.07 0.06 0.01 is
> the load avg from 11:00)
> 2006-11-07 10:59:47 1 0.03 0.00 0.00

I think you made a couple of mistakes in the 3 lines above.

> 2006-11-07 10:59:49 1 0.02 0.03 0.00
> 2006-11-07 10:59:50 1 0.02 0.03 0.00
> 2006-11-07 10:59:51 2 0.02 0.03 0.00
> 2006-11-07 10:59:52 1 0.02 0.03 0.00
> 2006-11-07 10:59:53 1 0.02 0.03 0.00
> 2006-11-07 10:59:54 2 0.02 0.03 0.00
> 2006-11-07 11:00:40 2 0.07 0.06 0.01
> 2006-11-07 11:00:41 3 0.07 0.06 0.01
> 2006-11-07 11:00:43 1 0.07 0.06 0.01
> 2006-11-07 11:00:44 3 0.07 0.06 0.01
> 2006-11-07 11:00:45 2 0.07 0.06 0.01
> 2006-11-07 11:00:46 4 0.07 0.06 0.01
>
> I've tried several variations on the following, but I have no idea how
> to do the range matching.
> gawk 'NR==FNR{b[$2]=$3;next}{print $0 OFS b[$4;}' hits_per_second
> load_avg
>
> Ultimately I'm looking to get all of this data put into a graph where I
> can compare the queries per second on the web server to the load
> average on the server (each of the 3 load average columns will have a
> peak and the qps will have a peak). I still have no idea how I'm going
> to pull that off.
>
> Any thoughts on this would be a great help.
>

Just strip the non-significant parts of the time stamp and use that as a
key to populate the averages from the first file, then to access them
for the second, e.g.:

awk '{t=$1$2; sub(/[0-9]:[^:]*$/,"",t)}
     NR==FNR{avg[t]=$3" "$4" "$5;next}
     {print $0,avg[t]}' load_avg hits_per_second

Regards,

Ed.
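A commented version of the same idea, run against a small sample, may make the key handling easier to follow. The sample files and the temporary directory are assumptions for illustration only; the awk program is the one above, just spread out over more lines.

```shell
#!/bin/sh
# Sketch: the awk one-liner above with comments, run on made-up sample files.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

cat > "$tmp/load_avg" <<'EOF'
date time 1min 5min 15min
2006-11-07 10:50:01 0.02 0.03 0.00
2006-11-07 11:00:01 0.07 0.06 0.01
EOF

cat > "$tmp/hits_per_second" <<'EOF'
date time qps
2006-11-07 10:59:54 2
2006-11-07 11:00:40 2
EOF

merged=$(awk '
    # Build a key from date and time, then cut it after the tens-of-minutes
    # digit: "2006-11-0710:59:54" becomes "2006-11-0710:5".
    { t = $1 $2; sub(/[0-9]:[^:]*$/, "", t) }

    # First file (load_avg): remember the three averages under that key.
    NR == FNR { avg[t] = $3 " " $4 " " $5; next }

    # Second file (hits_per_second): append the stored averages.
    { print $0, avg[t] }
' "$tmp/load_avg" "$tmp/hits_per_second")
printf '%s\n' "$merged"
```

The header lines contain no colon, so sub() leaves their key ("datetime") alone and the two headers end up merged as well. Unlike join, this does not need sorted input, but the load_avg file has to be given first.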
#5
Ed Morton wrote:
> rick wrote:
<snip>
>> Ultimately I'm looking to get all of this data put into a graph where I
>> can compare the queries per second on the web server to the load
>> average on the server (each of the 3 load average columns will have a
>> peak and the qps will have a peak). I still have no idea how I'm going
>> to pull that off.
>> Any thoughts on this would be a great help.
>>
>
> Just strip the non-significant parts of the time stamp and use that as a
> key to populate the averages from the first file, then to access them
> for the second, e.g.:
>
> awk '{t=$1$2; sub(/[0-9]:[^:]*$/,"",t)}
>      NR==FNR{avg[t]=$3" "$4" "$5;next}
>      {print $0,avg[t]}' load_avg hits_per_second
>
> Regards,
>
> Ed.

Oh, and for the graph, you could use "gnuplot", which has a home page
and a newsgroup; many gnuplot applications use awk for data
processing. Google...

Ed.
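As a rough starting point for the graph, here is a sketch that writes a small gnuplot script from the shell. Nothing in it comes from the thread: the file names (merged.dat, plot.gp, load_vs_qps.png), the PNG terminal and the choice of a second y axis for the load averages are all assumptions, and the merged data is assumed to have no header line.

```shell
#!/bin/sh
# Sketch: plot qps against the three load averages with gnuplot.
# All file names here are illustrative.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1

# Merged data without a header: date time qps 1min 5min 15min
cat > merged.dat <<'EOF'
2006-11-07 10:59:54 2 0.02 0.03 0.00
2006-11-07 11:00:40 2 0.07 0.06 0.01
2006-11-07 11:00:46 4 0.07 0.06 0.01
EOF

gp_script=$tmp/plot.gp
cat > "$gp_script" <<'EOF'
set terminal png size 800,400
set output "load_vs_qps.png"
set xdata time
# The timestamp spans columns 1 and 2, so qps is column 3.
set timefmt "%Y-%m-%d %H:%M:%S"
set format x "%H:%M"
set ylabel "queries per second"
set y2label "load average"
set ytics nomirror
set y2tics
plot "merged.dat" using 1:3 with lines title "qps", \
     "" using 1:4 axes x1y2 with lines title "load 1min", \
     "" using 1:5 axes x1y2 with lines title "load 5min", \
     "" using 1:6 axes x1y2 with lines title "load 15min"
EOF

# Only try to render if gnuplot is actually installed.
if command -v gnuplot >/dev/null 2>&1; then
    gnuplot "$gp_script"
fi
```

Putting the load averages on the second y axis keeps their small values from being flattened by the qps scale, so the peaks of both can be compared by eye.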
#6
>If your shell supports process substitution (zsh, bash, some
>kshs):
>
>join -t, <(sed 's/:./&,/' < hits_per_second) \
>         <(sed 's/\(:.\)[^ ]*/\1,/' < load_avg) | tr -d ,
>
>--
>Stéphane

---------

>awk '{t=$1$2; sub(/[0-9]:[^:]*$/,"",t)}
>     NR==FNR{avg[t]=$3" "$4" "$5;next}
>     {print $0,avg[t]}' load_avg hits_per_second
>
>Regards,
>
> Ed.

----------

Both examples worked great guys, thanks! I'll check out gnuplot too,
thanks for the tip.