comp.unix.shell Using and programming the Unix shell.

comparing two files

Posted 30/12/2007, 01:55   #1
sonal10july@gmail.com

Guys,

I would like to describe the current scenario.
I have two types of files, primary and secondary. There is only one
primary file and around a hundred secondary files.

Primary.txt contains two columns, i.e. name and number of rows
===========================================================
Currency_exchange|25000
Sales|21000
instruments|120000
===========================================================

Secondary1.txt contains two columns, i.e. name and number of rows
===========================================================
Currency_exchange|21000
Sales|21000
instruments|120000
===========================================================

Secondary2.txt contains two columns, i.e. name and number of rows
===========================================================
Currency_exchange|23100
Sales|21000
instruments|120000
===========================================================

There are 100 more secondary files like
Secondary3.txt, Secondary4.txt ... Secondary100.txt.
The first column (name) contains the same values in all files, but the second
column (number of rows) may contain different values.

Now, I want to compare each secondary file (i.e.
Secondary1.txt, Secondary2.txt ... and so on) with Primary.txt and copy
those rows in another file where the number of rows does not match.
In other words, I want to figure out where the number of rows in the
secondary files differs from the primary (Primary.txt).

What is the best way to do this? I would be heartily thankful to all for
any assistance regarding this.

Thanks in advance

SS


Posted 30/12/2007, 02:26   #2
Icarus Sparry

On Sat, 29 Dec 2007 17:55:07 -0800, sonal10july wrote:

> Guys,
>
> I would like to describe the current scenario.
> I have two types of files, primary and secondary. There is only one primary
> file and around a hundred secondary files.
>
> Primary.txt contains two columns, i.e. name and number of rows
> ===========================================================
> Currency_exchange|25000
> Sales|21000
> instruments|120000
> ===========================================================
>
> Secondary1.txt contains two columns, i.e. name and number of rows
> ===========================================================
> Currency_exchange|21000
> Sales|21000
> instruments|120000
> ===========================================================
>
> Secondary2.txt contains two columns, i.e. name and number of rows
> ===========================================================
> Currency_exchange|23100
> Sales|21000
> instruments|120000
> ===========================================================



Good so far, you have shown us some typical input files.

> There are 100 more secondary files like
> Secondary3.txt, Secondary4.txt ... Secondary100.txt. The first column (name)
> contains the same values in all files, but the second column (number of
> rows) may contain different values.

Useful information - again helpful.

> Now, I want to compare each secondary file (i.e.
> Secondary1.txt, Secondary2.txt ... and so on) with Primary.txt and copy those
> rows in another file where the number of rows does not match. In other
> words, I want to figure out where the number of rows in the secondary
> files differs from the primary (Primary.txt).

At this point your request becomes less helpful. You didn't show us the
required output. For instance, you say "copy those rows in another file":
do you want a single "another file", or one file for each secondary? Do
you want some information on which secondary the mismatched row came from?


awk -F'|' 'NR==FNR {v[$1]=$2;}
v[$1]!=$2 {print FILENAME,$0}' Primary.txt Secondary*.txt > out

may do what you want.

Posted 30/12/2007, 02:47   #3
Barry Margolin

In article
<c32538df-82de-4d2b-9ed9-5b67070d1d12@y5g2000hsf.googlegroups.com>,
sonal10july@gmail.com wrote:

> Now, I want to compare each secondary file (i.e.
> Secondary1.txt, Secondary2.txt ... and so on) with Primary.txt and copy
> those rows in another file where the number of rows does not match.
> In other words, I want to figure out where the number of rows in the
> secondary files differs from the primary (Primary.txt).
>
> What is the best way to do this?


This seems like a good starting point:

for file in Secondary*.txt
do
    diff Primary.txt "$file"
done

Depending on your specific needs, you may want to use options to diff
and/or pipe the output to something to grab the parts you want.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
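
One way the suggestion to pipe the diff output "to something to grab the parts you want" might look is sketched below. This is not from the thread: the helper name `report_diffs` and the scratch directory are assumptions for illustration, and the sample files are the ones shown in the first post.

```shell
#!/bin/sh
# Sketch only: report_diffs is a hypothetical helper name, not from the thread.
# It keeps the lines that default-format diff marks with "> " (lines taken
# from the second file) and prefixes each with the secondary file's name.
report_diffs() {
    primary=$1
    shift
    for file in "$@"
    do
        diff "$primary" "$file" |
            awk -v f="$file" '/^> / { print f "|" substr($0, 3) }'
    done
}

# Recreate the sample files from the first post in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'Currency_exchange|25000' 'Sales|21000' 'instruments|120000' > Primary.txt
printf '%s\n' 'Currency_exchange|21000' 'Sales|21000' 'instruments|120000' > Secondary1.txt
printf '%s\n' 'Currency_exchange|23100' 'Sales|21000' 'instruments|120000' > Secondary2.txt

report_diffs Primary.txt Secondary*.txt
```

Note that diff compares the files line by line in order, so this sketch assumes every secondary file lists the names in the same order as Primary.txt.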
Posted 30/12/2007, 03:30   #4
sonal10july@gmail.com

Thanks for your quick reply.
Here is the answer to your question:
"Do you want a single "another file", or one file for each secondary"

Yes, I want to create a single output file, and I also want to know which
secondary the mismatched row came from.


The following will be my output:

Secondary1.txt|Currency_exchange|21000
Secondary2.txt|Currency_exchange|23100
Posted 30/12/2007, 07:07   #5
Icarus Sparry

On Sat, 29 Dec 2007 19:30:49 -0800, sonal10july wrote:

> Thanks for your quick reply.
> Here is the answer to your question:
> "Do you want a single "another file", or one file for each secondary"
>
> Yes, I want to create a single output file, and I also want to know which
> secondary the mismatched row came from.
>
>
> The following will be my output:
>
> Secondary1.txt|Currency_exchange|21000
> Secondary2.txt|Currency_exchange|23100


Did you try the two lines I suggested? They will do what you ask for, except
that there will be a space after the filename rather than a "|".

awk -F'|' 'NR==FNR {v[$1]=$2;}
v[$1]!=$2 {print FILENAME "|" $0}' Primary.txt Secondary*.txt > out

is a fix for this problem. Make sure you are using a 'sh' family shell
(sh, ksh, bash, zsh) when you type this, rather than a csh family (csh,
tcsh) or something even more exotic (rc, scsh, es, .....).
Posted 30/12/2007, 20:22   #6
sonal10july@gmail.com

I copied the above command into a file and ran the script, but it
shows all the records from all files. I'm not very familiar
with the awk command, so the following are the steps I performed.

1. Copied the above command in a file called 'main_script.sh'

#########################################################
$ cat main_script.sh
#!/bin/ksh
awk -F'|' 'NR==FNR {v[$1]=$2;}
v[$1]!=$2 {print FILENAME "|" $0}' Primary.txt Secondary*
#########################################################

2. Ran the script.

#########################################################
$ sh main_script.sh

Primary.txt|Currency_exchange|25000
Primary.txt|Sales|21000
Primary.txt|instruments|120000
Secondary1.txt|Currency_exchange|25000
Secondary1.txt|Sales|20000
Secondary1.txt|instruments|120000
Secondary2.txt|Currency_exchange|25000
Secondary2.txt|Sales|20000
Secondary2.txt|instruments|110000
Secondary3.txt|Currency_exchange|25000
Secondary3.txt|Sales|6600
Secondary3.txt|instruments|9000
#########################################################

Basically, it printed the whole contents of all four files. Thanks in
advance for your help.

I'm using korn shell.
$ echo $SHELL
/bin/ksh


Best Regards
SS
Posted 30/12/2007, 23:07   #7
Icarus Sparry

On Sun, 30 Dec 2007 12:22:56 -0800, sonal10july wrote:

> I copied the above command into a file and ran the script, but it
> shows all the records from all files. I'm not very familiar with the
> awk command, so the following are the steps I performed.
>
> 1. Copied the above command in a file called 'main_script.sh'
>
> $ cat main_script.sh
> #!/bin/ksh
> awk -F'|' 'NR==FNR {v[$1]=$2;}
> v[$1]!=$2 {print FILENAME "|" $0}' Primary.txt Secondary*
>
> 2. Ran the script.
>
> $ sh main_script.sh
>
> Basically, it printed the whole contents of all four files.
>
> I'm using korn shell.
> $ echo $SHELL
> /bin/ksh


OK, something is very wrong.

The -F'|' sets the field delimiter to be a vertical bar, which is the
correct value for the data you have shown us.

The "NR==FNR" is an awk idiom which is true for the first file and
false for the second and later files, because NR counts records across all
files while FNR restarts at 1 for each file. So "NR==FNR {v[$1]=$2;}" says
"save in the array 'v', in the element indexed by the first field, the
value of the second field".

The second line "v[$1]!=$2" says "If the value stored in the 'v' array
for the first field is not the same as the second field, then do the
action", and the action is "{print FILENAME "|" $0}" which is "print out
the filename, a vertical bar, and the line from the file".

The second line, by definition, must be false for the first file, as the
first line has just set the matching element of the 'v' array to that same
value.

When I copy your files I get the following output

Secondary1.txt|Currency_exchange|21000
Secondary2.txt|Currency_exchange|23100

Can you send me your files by email (the email address of this post is
valid)?

You might try changing the program to
#!/bin/ksh
awk -F'|' 'NR==FNR {v[$1]=$2; print "Setting v[" $1 "] to <" $2 ">"}
v[$1]!=$2 {print FILENAME "|" $0 "(" $1 "," $2 ")"}' Primary.txt Secondary*

as a debugging aid, and letting us see the output (either here or via
email).
Icarus
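
The idiom explained in this post can be tried on the sample data from the first post. The following is a minimal sketch, not part of the original thread: the helper name `compare_counts` and the scratch directory are assumptions for illustration, and the `next` statement is an addition that makes the skip of the comparison for the first file explicit.

```shell
#!/bin/sh
# Sketch only: compare_counts is a hypothetical helper name, not from the thread.
compare_counts() {
    primary=$1
    shift
    # While reading the first file (NR==FNR), remember each count by name;
    # "next" skips the comparison for those lines. For later files, print
    # the file name and the row when its count differs from the stored one.
    awk -F'|' 'NR==FNR { v[$1] = $2; next }
               v[$1] != $2 { print FILENAME "|" $0 }' "$primary" "$@"
}

# Recreate the sample files from the first post in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'Currency_exchange|25000' 'Sales|21000' 'instruments|120000' > Primary.txt
printf '%s\n' 'Currency_exchange|21000' 'Sales|21000' 'instruments|120000' > Secondary1.txt
printf '%s\n' 'Currency_exchange|23100' 'Sales|21000' 'instruments|120000' > Secondary2.txt

compare_counts Primary.txt Secondary*.txt
```

Unlike a line-by-line comparison, this matches rows by name, so the order of rows in the secondary files should not matter.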
Posted 31/12/2007, 02:01   #8
sonal10july@gmail.com

Here are the full details, with the content of each file:

lyca /home/sukumar/testing:cat Primary.txt
Currency_exchange|25000
Sales|21000
instruments|120000

lyca /home/sukumar/testing:cat Secondary1.txt
Currency_exchange|25000
Sales|20000
instruments|120000

lyca /home/sukumar/testing:cat Secondary2.txt
Currency_exchange|25000
Sales|21000
instruments|120000

lyca /home/sukumar/testing:cat Secondary3.txt
Currency_exchange|25000
Sales|6600
instruments|9000

lyca /home/sukumar/testing:cat main_script.sh
#!/bin/ksh
awk -F'|' 'NR==FNR {v[$1]=$2; print "Setting v[" $1 "] to <" $2 ">"}
v[$1]!=$2 {print FILENAME "|" $0 "(" $1 "," $2 ")"}' Primary.txt Secondary*



lyca /home/sukumar/testing:ksh main_script.sh

Primary.txt|Currency_exchange|25000(Currency_exchange,25000)
Primary.txt|Sales|21000(Sales,21000)
Primary.txt|instruments|120000(instruments,120000)
Secondary1.txt|Currency_exchange|25000(Currency_exchange,25000)
Secondary1.txt|Sales|20000(Sales,20000)
Secondary1.txt|instruments|120000(instruments,120000)
Secondary2.txt|Currency_exchange|25000(Currency_exchange,25000)
Secondary2.txt|Sales|21000(Sales,21000)
Secondary2.txt|instruments|120000(instruments,120000)
Secondary3.txt|Currency_exchange|25000(Currency_exchange,25000)
Secondary3.txt|Sales|6600(Sales,6600)
Secondary3.txt|instruments|9000(instruments,9000)


Can you please run the command for this input?
My output should be

#############################################################

Secondary1.txt|Sales|20000(Sales,20000)
Secondary3.txt|Sales|6600(Sales,6600)
Secondary3.txt|instruments|9000(instruments,9000)

#############################################################

Thanks for your help.

Best Regards
SS
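
The debugging output in this post contains no "Setting v[...]" lines, so the `NR==FNR` block apparently never ran. One possible explanation, which the thread does not confirm, is that the `awk` on that system is an older implementation that lacks `FNR` (on some systems a newer one is installed as `nawk`). Under that assumption, a sketch that avoids `FNR` by testing `FILENAME` instead might look like this; the helper name `mismatches` and the scratch directory are hypothetical, and the files are the ones listed in this post.

```shell
#!/bin/sh
# Sketch only: mismatches is a hypothetical helper name, not from the thread.
# It identifies the primary file by name instead of relying on FNR.
mismatches() {
    awk -F'|' 'FILENAME == "Primary.txt" { v[$1] = $2; next }
               v[$1] != $2 { print FILENAME "|" $0 }' Primary.txt Secondary*.txt
}

# Recreate the files listed in this post in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'Currency_exchange|25000' 'Sales|21000' 'instruments|120000' > Primary.txt
printf '%s\n' 'Currency_exchange|25000' 'Sales|20000' 'instruments|120000' > Secondary1.txt
printf '%s\n' 'Currency_exchange|25000' 'Sales|21000' 'instruments|120000' > Secondary2.txt
printf '%s\n' 'Currency_exchange|25000' 'Sales|6600' 'instruments|9000' > Secondary3.txt

mismatches
```

With these files the sketch should report the Sales row of Secondary1.txt and the Sales and instruments rows of Secondary3.txt, which are the three rows requested in this post, without the debugging suffix.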