Barry Margolin wrote:
> In article
> <a4ec4cd7-9bed-4a9f-92f9-77ddee436644@d21g2000prf.googlegroups.com>,
> Architect <x31337@gmail.com> wrote:
>
>
>>Hello everyone and happy new year,
>>
>>I have a file in the following format
>>
>>Thu Jul 18 03:44:03 2007
>>xxx
>>yyy
>>zzz
>>Sun Jun 1 01:00:13 2007
>>a
>>b
>>Mon Jun 2 02:04:00 2007
>>x = xxxx
>>z = zzzz
>>zx = xz
>>
>>The data between the dates is arbitrary; it can be anything. I want it in
>>this format:
>>
>>Thu Jul 18 03:44:03 2007 xxx yyy zzz
>>Sun Jun 1 01:00:13 2007 a b
>>Mon Jun 2 02:04:00 2007 x = xxxx z = zzzz zx = xz
>>
>>Simply put: for every date, join every line up to the next date. My
>>problem is how to join up to the next date. Let me show you what I did.
>>
>>1) I turned the file into one line using tr '\n' ' ', so now I have the
>>whole file joined together:
>>
>>Thu Jul 18 03:44:03 2007 xxx yyy zzz Sun Jun 1 01:00:13 2007 a b Mon
>>^                                    ^                           ^
>>Jun 2 02:04:00 2007 x = xxxx ....
>>Then I thought I would put a newline before the first letter of every
>>date. I did this in sed:
>>
>>doit.sed:
>>/^[A-Z][a-z]\{3\} [A-Z][a-z]\{3\} [0-9]\{1,2\} [0-9]\{1,2\}:[0-9]\{1,2\}:[0-9]\{1,2\} [0-9]\{4\}/i\
>>\n
>>
>>Then I run cat INPUT_FILE | tr '\n' ' ' | sed -f doit.sed, but I get the
>>same line back. I don't know if I am missing something here. Is there a
>>better way to accomplish this?
>>
>>
>>Thanks in advance
>
>
> Day and month names are one uppercase letter followed by two lowercase
> letters. Your regexp looks for THREE lowercase letters, so it doesn't
> match them.
>
> Anyway, it would probably be easier to do this using awk:
>
> awk 'NR > 1 && /^([A-Z][a-z]{2} ){2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/ \
> {printf("\n%s ", $0)} \
> {printf("%s " $0)} \
> END {printf("\n")}'
>
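On the regexp point: a quick way to see it is to test both letter counts against one of the sample date lines with grep. This is only a sketch; the patterns are cut down to the day and month names, and the variable names are mine.

```shell
line='Thu Jul 18 03:44:03 2007'

# OP's pattern: one uppercase plus THREE lowercase letters per name.
# grep -c exits non-zero on zero matches, hence the "|| true".
three=$(printf '%s\n' "$line" | grep -c '^[A-Z][a-z]\{3\} [A-Z][a-z]\{3\} ' || true)

# One uppercase plus TWO lowercase letters per name.
two=$(printf '%s\n' "$line" | grep -c '^[A-Z][a-z]\{2\} [A-Z][a-z]\{2\} ' || true)

# prints: three lowercase: 0, two lowercase: 1
echo "three lowercase: $three, two lowercase: $two"
```

Note also that after the tr step the whole file is a single line, so a pattern anchored with ^ can only ever match at the very start of it, and i\ puts its text before the whole line rather than in the middle of it.
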
For the OP, just be aware that you need an awk that supports RE intervals
for that to work (e.g. a POSIX awk). If you're using GNU awk you'd add
"--re-interval" as an option to enable that functionality. I also tweaked
it slightly to fix one bug, remove a little redundant code and avoid
adding trailing whitespace:

gawk --re-interval '
NR > 1 && /^([A-Z][a-z]{2} ){2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/ {s=ORS}
{printf "%s%s",s,$0; s=OFS}
END {print ""}'
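
If your awk has no interval support at all, the same test can be spelled out without {n}. The following is a sketch under a few assumptions of mine: the day of the month (one or two digits, possibly space-padded) sits between the month and the time, as in the sample data, which the pattern as posted does not seem to allow for; and join_dates is just a made-up function name.

```shell
# join_dates: hypothetical helper; reads the file on stdin and
# writes one line per date record on stdout.
join_dates() {
    awk '
    # A date line such as "Thu Jul 18 03:44:03 2007" starts a new record.
    # No {n} intervals are used, so this should work in any awk.
    NR > 1 && /^[A-Z][a-z][a-z] [A-Z][a-z][a-z] +[0-9][0-9]? [0-9][0-9]:[0-9][0-9]:[0-9][0-9] [0-9][0-9][0-9][0-9]/ { s = ORS }

    # Print the separator first, then the line, so nothing trails the last field.
    { printf "%s%s", s, $0; s = OFS }

    # Terminate the final record.
    END { print "" }
    '
}

printf '%s\n' 'Thu Jul 18 03:44:03 2007' xxx yyy zzz \
              'Sun Jun 1 01:00:13 2007' a b | join_dates
```

For real use that would be join_dates < INPUT_FILE. With the two sample records fed in above it prints:

Thu Jul 18 03:44:03 2007 xxx yyy zzz
Sun Jun 1 01:00:13 2007 a b
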
Regards,
Ed.