comp.unix.shell Using and programming the Unix shell.

delete using sed and line number file ....

Posted 20/03/2008, 16:28   #1
LionelAndJen@gmail.com

I have File A with about 600 000 rows.
File B contains all the line numbers I need to delete, one line number
per line, 87 rows this time (could be 200 rows tomorrow).

How do I use sed to delete each of those lines from File A to create
File C?

the manual command is simple and works perfectly:

sed -e '5865d
7754d
12406d
..
..
..
488596d
490322d
492259d
493646d' FileA >> FileC

but I want to automate the process and sed gives me fits.

Posted 20/03/2008, 16:46   #2
LionelAndJen@gmail.com

On Mar 20, 10:56am, pk <p...@pk.pk> wrote:
> pk wrote:
> > sed 's/.*/&d/g' FileB > delete.sed

>
> Or also
>
> sed 's/$/d/g' FileB > delete.sed
>
> --
> All the commands are tested with bash and GNU tools, so they may use
> nonstandard features. I try to mention when something is nonstandard (if
> I'm aware of that), but I may miss something. Corrections are welcome.


DAMN ... that was FAST.....

thank you so much, sed 's/$/d/g' FileB > delete.sed worked like a
champ.

Posted 20/03/2008, 16:54   #3
pk

LionelAndJen@gmail.com wrote:

> I have File A with about 600 000 rows.
> File B contains all the line numbers I need to delete, one line number
> per line, 87 rows this time (could be 200 rows tomorrow).
>
> How do I use sed to delete each of those lines from File A to create
> File C?
>
> the manual command is simple and works perfectly:
>
> sed -e '5865d
> 7754d
> 12406d
> .
> .
> .
> 488596d
> 490322d
> 492259d
> 493646d' FileA >> FileC
>
> but I want to automate the process and sed gives me fits.


Assuming FileB has the format

123
145
233
2689
....

you can generate a sed command file starting from FileB (using sed, of
course!)

sed 's/.*/&d/g' FileB > delete.sed

and then

sed -f delete.sed FileA >> FileC

--
All the commands are tested with bash and GNU tools, so they may use
nonstandard features. I try to mention when something is nonstandard (if
I'm aware of that), but I may miss something. Corrections are welcome.
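
The two-step approach above can be sketched end to end on toy data. This is only an illustration: the file names match the thread, but the contents are made up, the temporary directory is an assumption of the sketch, and the substitution is written without the `g` flag since one substitution per line is all that is wanted.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory (not from the thread).
workdir=$(mktemp -d)

# FileA: ten numbered rows standing in for the 600 000-row file.
awk 'BEGIN { for (i = 1; i <= 10; i++) print "row " i }' > "$workdir/FileA"

# FileB: the line numbers to delete, one per line.
printf '%s\n' 2 5 9 > "$workdir/FileB"

# Step 1: turn each line number N into the sed command "Nd".
sed 's/.*/&d/' "$workdir/FileB" > "$workdir/delete.sed"

# Step 2: run the generated script against FileA.
# ">" truncates FileC first, so rerunning the script does not append
# to the output of an earlier run the way ">>" would.
sed -f "$workdir/delete.sed" "$workdir/FileA" > "$workdir/FileC"

cat "$workdir/FileC"
```

One thing to watch for: an empty line in FileB turns into a bare `d` command, which deletes every line of FileA, so it may be worth checking FileB for blank lines first.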

Posted 20/03/2008, 16:56   #4
pk

pk wrote:

> sed 's/.*/&d/g' FileB > delete.sed


Or also

sed 's/$/d/g' FileB > delete.sed

Posted 20/03/2008, 22:01   #5
Maxwell Lol

pk <pk@pk.pk> writes:

> pk wrote:
>
> > sed 's/.*/&d/g' FileB > delete.sed

>
> Or also
>
> sed 's/$/d/g' FileB > delete.sed


or

sed 's/$/d/' FileB > delete.sed
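
A quick way to check that the generator commands are interchangeable is to compare their output. The sketch below uses the sample FileB contents shown earlier in the thread; the scratch directory and the `delete1.sed`/`delete2.sed` names are made up for the comparison. Since `$` can match only once per line, the `g` flag has nothing extra to do there.

```shell
#!/bin/sh
# Scratch directory and output names are hypothetical.
workdir=$(mktemp -d)
printf '%s\n' 123 145 233 2689 > "$workdir/FileB"

# Append "d" by replacing the whole line with itself plus "d" ...
sed 's/.*/&d/' "$workdir/FileB" > "$workdir/delete1.sed"

# ... or by substituting at the end-of-line anchor.
sed 's/$/d/' "$workdir/FileB" > "$workdir/delete2.sed"

# cmp -s is silent and reports the result through its exit status.
if cmp -s "$workdir/delete1.sed" "$workdir/delete2.sed"; then
    same=yes
else
    same=no
fi

echo "identical: $same"
cat "$workdir/delete2.sed"
```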

Posted 21/03/2008, 03:56   #6
Ed Morton

On 3/20/2008 10:28 AM, LionelAndJen@gmail.com wrote:
> I have File A with about 600 000 rows.
> File B contains all the line numbers I need to delete, one line number
> per line, 87 rows this time (could be 200 rows tomorrow).
>
> How do I use sed to delete each of those lines from File A to create
> File C?
>
> the manual command is simple and works perfectly:
>
> sed -e '5865d
> 7754d
> 12406d
> .
> .
> .
> 488596d
> 490322d
> 492259d
> 493646d' FileA >> FileC
>
> but I want to automate the process and sed gives me fits.


That's not a good job for sed, but it's trivial in awk:

awk 'NR==FNR{skip[$0];next}!(FNR in skip)' FileB FileA > FileC

Ed.
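
For readers less used to the `NR==FNR` idiom, here is the same one-liner spread out with comments and run on toy data. The data and the scratch directory are made up for the sketch; the awk program itself is the one above.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory (not from the thread).
workdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 10; i++) print "row " i }' > "$workdir/FileA"
printf '%s\n' 2 5 9 > "$workdir/FileB"

awk '
    # NR counts records across all input files, FNR restarts at 1 for
    # each file, so NR == FNR holds only while the first file is read.
    # For FileB: record each line number as an array key, then move on.
    NR == FNR { skip[$0]; next }

    # For FileA: print the record unless its line number is a key.
    !(FNR in skip)
' "$workdir/FileB" "$workdir/FileA" > "$workdir/FileC"

cat "$workdir/FileC"
```

One caveat with this idiom: if FileB is empty, awk never sees a record from it, so `NR == FNR` stays true for all of FileA and nothing is printed.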

Posted 21/03/2008, 09:40   #7
pk

Ed Morton wrote:

> awk 'NR==FNR{skip[$0];next}!(FNR in skip)' FileB FileA > FileC


This is something I've been wondering about for a long time. If FileA and
FileB are very large, isn't the (FNR in skip) check inefficient? I mean, that
seems to imply a walk over the entire array to see whether the element exists
each time the condition is checked. (I may be wrong of course, probably due to
my ignorance about the inner workings of awk.) Wouldn't something like this

awk 'NR==FNR{skip[$0]++;next} skip[FNR]==0' FileB FileA > FileC

be more efficient?

Thanks
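
Speed aside, the two forms differ in one way that is easy to observe: in awk, merely referencing `skip[FNR]` creates that element if it does not exist yet, while `in` only tests membership. The sketch below counts the array keys at the end of each run on made-up toy data; the variable names `keys_in` and `keys_ref` are just for this illustration.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory (not from the thread).
workdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 10; i++) print "row " i }' > "$workdir/FileA"
printf '%s\n' 2 5 9 > "$workdir/FileB"

# Membership test: the array keeps only the keys read from FileB.
# Keys are counted with a for-in loop, which any awk supports.
keys_in=$(awk '
    NR == FNR { skip[$0]; next }
    !(FNR in skip) { kept++ }
    END { n = 0; for (k in skip) n++; print n }
' "$workdir/FileB" "$workdir/FileA")

# Element reference: every FNR that is looked up becomes a key.
keys_ref=$(awk '
    NR == FNR { skip[$0]++; next }
    skip[FNR] == 0 { kept++ }
    END { n = 0; for (k in skip) n++; print n }
' "$workdir/FileB" "$workdir/FileA")

echo "keys after (FNR in skip): $keys_in"
echo "keys after skip[FNR]==0:  $keys_ref"
```

With a 600 000-row FileA the second form would end up holding one key per row, which costs memory on top of whatever the lookup itself costs.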

Posted 21/03/2008, 12:24   #8
Ed Morton

On 3/21/2008 3:40 AM, pk wrote:
> Ed Morton wrote:
>
>> awk 'NR==FNR{skip[$0];next}!(FNR in skip)' FileB FileA > FileC
>
> This is something I've been wondering about for a long time. If FileA and
> FileB are very large, isn't the (FNR in skip) check inefficient? I mean, that
> seems to imply a walk over the entire array to see whether the element exists
> each time the condition is checked. (I may be wrong of course, probably due to
> my ignorance about the inner workings of awk.) Wouldn't something like this
>
> awk 'NR==FNR{skip[$0]++;next} skip[FNR]==0' FileB FileA > FileC
>
> be more efficient?


Could be, though I expect the "in" operator uses hashing, so it'd be close:
you're trading a hash lookup for an arithmetic increment plus an index plus a
comparison.

Here's the result of running both scripts twice, deleting every odd-numbered
line in a 1-million-line file, using GNU awk 3.1.6:

$ wc -l FileB FileA
500000 FileB
1000000 FileA
1500000 total
$ time awk 'NR==FNR{skip[$0];next}!(FNR in skip)' FileB FileA > FileC

real 0m29.016s
user 0m28.546s
sys 0m0.328s
$ time awk 'NR==FNR{skip[$0]++;next} skip[FNR]==0' FileB FileA > FileC

real 0m29.558s
user 0m29.015s
sys 0m0.436s
$ time awk 'NR==FNR{skip[$0];next}!(FNR in skip)' FileB FileA > FileC

real 0m28.915s
user 0m28.484s
sys 0m0.483s
$ time awk 'NR==FNR{skip[$0]++;next} skip[FNR]==0' FileB FileA > FileC

real 0m29.502s
user 0m29.186s
sys 0m0.327s

Regards,

Ed.
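
For anyone who wants to repeat the comparison, the test files can be generated along these lines. This is a sketch under assumptions: the row contents, the scratch directory, the output names `FileC.in` and `FileC.ref`, and the default size of 100000 lines are all made up (the runs above used 1000000), and the script only checks that both programs produce the same output.

```shell
#!/bin/sh
# Hypothetical size; override with e.g. n=1000000 to match the thread.
n=${n:-100000}
workdir=$(mktemp -d)

# FileA: n made-up rows. FileB: every odd line number.
awk -v n="$n" 'BEGIN { for (i = 1; i <= n; i++) print "row " i }' > "$workdir/FileA"
awk -v n="$n" 'BEGIN { for (i = 1; i <= n; i += 2) print i }' > "$workdir/FileB"

# The two candidate programs from the thread, unchanged.
awk 'NR==FNR{skip[$0];next}!(FNR in skip)' \
    "$workdir/FileB" "$workdir/FileA" > "$workdir/FileC.in"
awk 'NR==FNR{skip[$0]++;next} skip[FNR]==0' \
    "$workdir/FileB" "$workdir/FileA" > "$workdir/FileC.ref"

# Sanity check before timing anything: both outputs should match.
if cmp -s "$workdir/FileC.in" "$workdir/FileC.ref"; then
    outputs=same
else
    outputs=differ
fi
kept=$(wc -l < "$workdir/FileC.in" | tr -d ' ')

echo "outputs: $outputs, lines kept: $kept"
```

Prefixing each of the two awk commands with `time` then gives figures comparable to the ones above; the absolute numbers will depend on the machine and the awk version.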

Posted 21/03/2008, 13:22   #9
pk

Ed Morton wrote:

> Here's the result of running both scripts twice deleting every
> odd-numbered line in a 1-million line file using GNU awk 3.1.6:


Yeah, I also consistently see the same results with GNU awk 3.1.5, so,
contrary to what I thought, the (FNR in skip) test seems to be slightly more
efficient.

Thanks!

All times are GMT +1.

