comp.unix.shell: Using and programming the Unix shell.
#1

Hello,

I have a datafile that looks like this:

003 68441896104 8 CHEERLEADER BEAR 0510 00001599 00000000
003 68441896108 8 VARSITY BEAR 0510 00001599 00000000
003 68441896106 SAILOR BEAR WHITES 0510 00001599 00000000
003 336037204651 SENSI WMN 3.4 SPR 0592 00006000 00000000
003 76968229001 111404 0597 00001299 00000000

I want to be able to define a specific id (in this case 003) and, for all those records, replace the spaces that separate each value with a comma. Problem is the third value (description) has spaces in it.

I know awk can get each field, but I honestly don't know the possible number of spaces that could be in the description field.

Can anyone point me in the right direction? I think this is pretty complex but I could be wrong.

Thanks for your time,
Carlton
#2

2006-12-6, 11:38(-08), CarltonG:
> Hello,
> I have a datafile that looks like this:
> 003 68441896104 8 CHEERLEADER BEAR 0510 00001599 00000000
> 003 68441896108 8 VARSITY BEAR 0510 00001599
> 00000000
> 003 68441896106 SAILOR BEAR WHITES 0510 00001599
> 00000000
> 003 336037204651 SENSI WMN 3.4 SPR 0592 00006000
> 00000000
> 003 76968229001 111404 0597
> 00001299 00000000
>
> I want to be able to define a specific id (in this case 003) and for
> all those records
> replace the spaces that separate each value with a comma.
> Problem is the third value (description) has spaces in it.
>
> I know awk can get each field but I honestly don't know the possible
> number of spaces that could be in the description field.
>
> Can anyone point me in the right direction? I think this is pretty
> complex but I could be wrong.
[...]

If the number of fields before and after the 3rd field is always the same, I'd go for sed:

  b='[[:blank:]]\{1,\}' # blanks
  f='\([^[:blank:]]\{1,\}\)' # single word field

  sed "s/^$f$b$f$b\(.*[^[:blank:]]\)$b$f$b$f$b$f$/\1,\2,\3,\4,\5,\6/"

--
Stéphane
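The sed command above rewrites every line it matches, while the original question asks for one specific id only. A minimal sketch of one way to add that restriction, assuming a POSIX sh and sed: the function name `commaize` and its id argument are made up for illustration and are not part of the thread, and the id is assumed to contain digits only.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): commaize ID
# Converts only the records whose first field is ID; every other
# line is passed through unchanged.
commaize() {
    id=$1                           # assumed to be plain digits, e.g. 003
    b='[[:blank:]]\{1,\}'           # one or more blanks
    f='\([^[:blank:]]\{1,\}\)'      # one blank-free field, captured
    # The address /^ID<blank>/ limits the substitution to matching records.
    sed "/^${id}[[:blank:]]/s/^$f$b$f$b\(.*[^[:blank:]]\)$b$f$b$f$b$f\$/\1,\2,\3,\4,\5,\6/"
}

printf '%s\n' \
    '003 68441896104 8 CHEERLEADER BEAR 0510 00001599 00000000' \
    '004 68441896108 8 VARSITY BEAR 0510 00001599 00000000' | commaize 003
# prints:
# 003,68441896104,8 CHEERLEADER BEAR,0510,00001599,00000000
# 004 68441896108 8 VARSITY BEAR 0510 00001599 00000000
```

Because the pattern is anchored at both ends, a matching record that has trailing blanks or fewer than six fields is left as it is.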
#3

On 2006-12-06, CarltonG wrote:
> Hello,
> I have a datafile that looks like this:

I doubt that it looks like this. Even with Google Groups, it is possible to prevent inappropriate line wrapping.

> 003 68441896104 8 CHEERLEADER BEAR 0510 00001599 00000000
> 003 68441896108 8 VARSITY BEAR 0510 00001599
> 00000000
> 003 68441896106 SAILOR BEAR WHITES 0510 00001599
> 00000000
> 003 336037204651 SENSI WMN 3.4 SPR 0592 00006000
> 00000000
> 003 76968229001 111404 0597
> 00001299 00000000

Are the fields separated by spaces or tabs?

> I want to be able to define a specific id (in this case 003) and
> for all those records replace the spaces that separate each value
> with a comma.
> Problem is the third value (description) has spaces in it.

How do you define where the description ends? Is it a tab? Is it multiple spaces?

> I know awk can get each field but I honestly don't know the possible
> number of spaces that could be in the description field.
>
> Can anyone point me in the right direction? I think this is pretty
> complex but I could be wrong.

If there are always three fields after the description:

  awk '{ printf "%s\t%s\t", $1, $2
         n = 3
         # description words, each followed by "_", except the last one
         while ( n < (NF - 3) ) { printf "%s_", $n; ++n }
         # last description word, then the three trailing fields
         printf "%s\t%s\t%s\t%s\n", $(NF-3), $(NF-2), $(NF-1), $NF
       }' ~/txt

I've used tabs to separate the fields. If that's not what you want, replace all instances of \t with a space (or whatever).

--
Chris F.A. Johnson, author <http://cfaj.freeshell.org/shell>
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
===== My code in this post, if any, assumes the POSIX locale
===== and is released under the GNU General Public Licence
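The awk loop above joins the description words with underscores and separates fields with tabs. A variation on the same idea, sketched here under assumptions: it keeps single spaces inside the description, writes commas between fields, and only touches records with a given id. The function name `join_desc` and its two arguments (the id and the number of fields after the description) are hypothetical and do not come from the thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): join_desc ID TAIL
#   ID   - first field of the records to convert
#   TAIL - number of fields that follow the description
join_desc() {
    awk -v id="$1" -v tail="$2" '
    # Compare as strings so that "003" and "3" are not treated as equal.
    # Records with another id, or with too few fields, pass through as is.
    ($1 "") != (id "") || NF < tail + 3 { print; next }
    {
        last = NF - tail                 # index of the last description word
        desc = $3
        for (n = 4; n <= last; n++) desc = desc " " $n
        out = $1 "," $2 "," desc
        for (n = last + 1; n <= NF; n++) out = out "," $n
        print out
    }'
}

printf '%s\n' '003 336037204651 SENSI WMN 3.4 SPR 0592 00006000 00000000' |
    join_desc 003 3
# prints: 003,336037204651,SENSI WMN 3.4 SPR,0592,00006000,00000000
```

Since awk splits on runs of blanks, any multiple spaces inside a description collapse to one. Passing a different TAIL value covers layouts with another number of trailing fields.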
#4

There will always be three fields after the description EXCEPT I am looking for the group of records that has only TWO after the description. I don't care about the other records. It will be records that start with 052. They will only have two fields after the description. And they will be the only ones delimited by spaces. The other records have commas.

The fields for 052 records are separated by spaces, not tabs.

The fields for all records are length determinable by position. Meaning if I took out the commas I still would know where they start and end.
1-3 zone
5-15 code
etc etc

I will have to preview my post next time so it won't wrap. Every line starts with a three digit id always, so 003 is the start of a new line.

This is a dump from an Oracle DB so I would have to find the PL/SQL code for tab ( chr(9) I think ) as I am using the PL/SQL file I/O API to write this file.

Thanks for your help, this gets me started.

Stephane CHAZELAS wrote:
> 2006-12-6, 11:38(-08), CarltonG:
> > Hello,
> > I have a datafile that looks like this:
> > 003 68441896104 8 CHEERLEADER BEAR 0510 00001599 00000000
> > 003 68441896108 8 VARSITY BEAR 0510 00001599
> > 00000000
> > 003 68441896106 SAILOR BEAR WHITES 0510 00001599
> > 00000000
> > 003 336037204651 SENSI WMN 3.4 SPR 0592 00006000
> > 00000000
> > 003 76968229001 111404 0597
> > 00001299 00000000
> >
> > I want to be able to define a specific id (in this case 003) and for
> > all those records
> > replace the spaces that separate each value with a comma.
> > Problem is the third value (description) has spaces in it.
> >
> > I know awk can get each field but I honestly don't know the possible
> > number of spaces that could be in the description field.
> >
> > Can anyone point me in the right direction? I think this is pretty
> > complex but I could be wrong.
> [...]
>
> If the number of fields before and after the 3rd field is
> always the same, I'd go for sed:
>
> b='[[:blank:]]\{1,\}' # blanks
> f='\([^[:blank:]]\{1,\}\)' # single word field
>
> sed "s/^$f$b$f$b\(.*[^[:blank:]]\)$b$f$b$f$b$f$/\1,\2,\3,\4,\5,\6/"
>
> --
> Stéphane
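If the fields really are fixed by column position, as the post above says, they can be cut out by position instead of being guessed from the blanks. A rough sketch, assuming POSIX awk: only the two column ranges given in the post (1-3 zone, 5-15 code) are used, because the rest of the layout is not stated, and the function name `zone_code` is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the zone and code
# columns of each record, separated by a comma.
zone_code() {
    awk '{
        zone = substr($0, 1, 3)     # columns 1-3
        code = substr($0, 5, 11)    # columns 5-15, i.e. 11 characters
        sub(/ +$/, "", code)        # drop right padding, if any
        print zone "," code
    }'
}

printf '%s\n' '052 68441896104 8 CHEERLEADER BEAR 0510 00001599' | zone_code
# prints: 052,68441896104
```

The remaining fields would be extracted the same way once their column ranges are known.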
#5

CarltonG wrote:
> Hello,
> I have a datafile that looks like this:
> 003 68441896104 8 CHEERLEADER BEAR 0510 00001599 00000000
> 003 68441896108 8 VARSITY BEAR 0510 00001599
> 00000000
> 003 68441896106 SAILOR BEAR WHITES 0510 00001599
> 00000000
> 003 336037204651 SENSI WMN 3.4 SPR 0592 00006000
> 00000000
> 003 76968229001 111404 0597
> 00001299 00000000
>
> I want to be able to define a specific id (in this case 003) and for
> all those records
> replace the spaces that separate each value with a comma.
> Problem is the third value (description) has spaces in it.

  perl -pe's/^(003)\s+(\S+)\s+(.+?)\s+(\S+)\s+(\S+)\s+(\S+)$/$1,$2,$3,$4,$5,$6/'

John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order. -- Larry Wall
#6

Thanks all. John, yours worked. I used yours only because you wrote to the more specific details. I will test all the suggestions. Thanks again.

John W. Krahn wrote:
> CarltonG wrote:
> > Hello,
> > I have a datafile that looks like this:
> > 003 68441896104 8 CHEERLEADER BEAR 0510 00001599 00000000
> > 003 68441896108 8 VARSITY BEAR 0510 00001599
> > 00000000
> > 003 68441896106 SAILOR BEAR WHITES 0510 00001599
> > 00000000
> > 003 336037204651 SENSI WMN 3.4 SPR 0592 00006000
> > 00000000
> > 003 76968229001 111404 0597
> > 00001299 00000000
> >
> > I want to be able to define a specific id (in this case 003) and for
> > all those records
> > replace the spaces that separate each value with a comma.
> > Problem is the third value (description) has spaces in it.
>
> perl -pe's/^(003)\s+(\S+)\s+(.+?)\s+(\S+)\s+(\S+)\s+(\S+)$/$1,$2,$3,$4,$5,$6/'
>
> John
> --
> Perl isn't a toolbox, but a small machine shop where you can special-order
> certain sorts of tools at low cost and in short order. -- Larry Wall