comp.unix.shell: Using and programming the Unix shell.
#1 |
I have a directory with >500 files. I want to divide the 500-odd files into 4 groups, each group containing a random assortment of filenames.

I've tried using awk with rand() and srand() until my head was ready to explode. I'm obviously getting the syntax wrong somewhere, but seeing as how I know very little about awk, it's not surprising. I've never professed to be a coder or programmer and I never will. I just can't seem to wrap my head around scripting.

-- 
Rob
http://www.aspir8or.com
Linux means productivity and fun. NT means 'Not Today'.
#2 |
Rob S wrote:
> I have a directory with >500 files. I want to divide the 500 odd files
> into 4 groups, each group containing a random assortment of filenames.
> I've tried using awk with rand() and srand() until my head was ready to
> explode. I'm obviously getting the syntax wrong somewhere, but seeing as
> how I know very little about awk, it's not surprising. I've never
> professed to be a coder or programmer and I never will. I
> just can't seem to wrap my head around scripting.

If you don't need to have the same number of files in each set...

    { print > ("fileset-" int(rand()*4)+1) }

...will create four files, fileset-1 .. fileset-4, each of which receives random lines of the input. In a Unix environment you can call it as

    ls your_dir | awk '{ print > ("fileset-" int(rand()*4)+1) }'

Use a BEGIN section of awk to set the random seed if that matters:

    BEGIN { srand() }

Janis
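For comparison, here is a rough Ruby sketch of what the awk one-liner above does. It is not Janis's code: the helper name `random_buckets`, the stand-in file names, and the fixed seed are all assumptions made for the illustration. Each name is dropped into one of four buckets independently, so the bucket sizes will usually differ.

```ruby
# Hypothetical helper: assign every name to one of `groups` buckets,
# choosing the bucket independently at random for each name.
def random_buckets(names, groups = 4, rng = Random.new)
  buckets = Array.new(groups) { [] }
  names.each { |name| buckets[rng.rand(groups)] << name }
  buckets
end

# Stand-in names; in practice this would be the directory listing.
names = (1..500).map { |n| "file#{n}" }
buckets = random_buckets(names, 4, Random.new(42))

# Every name lands in exactly one bucket, but sizes are not guaranteed equal.
buckets.each_with_index do |bucket, i|
  puts "fileset-#{i + 1}: #{bucket.size} names"
end
```

If equal group sizes matter, shuffling first and then dealing the names out in turn is probably the better fit.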
#3 |
Rob S <Here@home> wrote:
> I have a directory with >500 files. I want to divide the 500 odd files
> into 4 groups, each group containing a random assortment of filenames.
> I've tried using awk with rand() and srand() until my head was ready to
> explode. I'm obviously getting the syntax wrong somewhere, but seeing as
> how I know very little about awk, it's not surprising. I've never
> professed to be a coder or programmer and I never will. I
> just can't seem to wrap my head around scripting.

If performance is not an issue, then this pipeline should do it:

    ls | while IFS= read -r FILE; do
        echo `head -7 /dev/urandom | wc -c` "$FILE"
    done \
    | sort -n \
    | cut -d' ' -f2- \
    | awk '{
        logfile = "/tmp/list." NR % 4 + 1
        print > logfile
    }'

You should execute it from within the directory that contains your 500+ files. It will create four files named list.1, list.2, list.3 and list.4 in the /tmp directory, each containing a random assortment of files.

-- 
Kenan Kalajdzic
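The pipeline above has two steps: tag each name with a random key and sort on it (a shuffle), then deal the shuffled names out in turn. A minimal Ruby sketch of the same two steps follows. The helper name `deal_round_robin`, the stand-in names, and the seed are assumptions for the example, and it keeps the lists in memory instead of writing to /tmp.

```ruby
# Hypothetical helper: shuffle by sorting on a random key, then deal the
# shuffled names out round-robin, similar to `NR % 4 + 1` in the awk stage.
def deal_round_robin(names, groups = 4, rng = Random.new)
  lists = Array.new(groups) { [] }
  shuffled = names.sort_by { rng.rand }   # one random sort key per name
  shuffled.each_with_index { |name, i| lists[i % groups] << name }
  lists
end

# Stand-in names; in practice this would be the directory listing.
names = (1..503).map { |n| "file#{n}" }
lists = deal_round_robin(names, 4, Random.new(7))

# Dealing in turn keeps the list sizes within one of each other.
p lists.map(&:size)
```

Because the names are dealt in turn after the shuffle, the group sizes can differ by at most one, whatever the total.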
#4 |
On Jul 29, 9:24 am, Rob S <Here@home> wrote:
> I have a directory with >500 files. I want to divide the 500 odd files
> into 4 groups, each group containing a random assortment of filenames.
> I've tried using awk with rand() and srand() until my head was ready to
> explode. I'm obviously getting the syntax wrong somewhere, but seeing as
> how I know very little about awk, it's not surprising. I've never
> professed to be a coder or programmer and I never will. I
> just can't seem to wrap my head around scripting.
>
> --
> Rob
> http://www.aspir8or.com
> Linux means productivity and fun. NT means 'Not Today'.

This will divide the 500+ files as evenly as possible among 4 groups. The filenames in each group will be sorted and printed to these files:

    __list_0
    __list_1
    __list_2
    __list_3

    #!ruby
    list = Dir[ "*" ].sort_by{ rand }
    chunk = ( list.size / 4.0 ).ceil
    4.times{ |i|
      File.open( "__list_#{i}", "w" ){ |f|
        f.puts( list[ i * chunk, chunk ].sort )
      }
    }
#6 |
Thank you all. The awk script was where I was heading when I got stuck. All 3 scripts work well, but the Ruby script was ideal. A good assortment, equal number in each list, and lightning fast, all of 2 seconds. Beautiful.

I think it's about time I tried drumming Ruby into my old grey matter.

Cheers all.

-- 
Rob
http://www.aspir8or.com
Name one nice thing about Windows? It doesn't just crash, it displays a dialog box and lets you press 'OK' first.
#7 |
On Jul 29, 8:24 pm, Rob S <Here@home> wrote:
> Thank you all. The awk script was where I was heading when I got stuck.
> All 3 scripts work well, but the ruby script was ideal. A good
> assortment, equal number in each list, and lightning fast, all of 2
> seconds. beautiful.
>
> I think it's about time I tried drumming ruby into my old grey matter.
>
> Cheers all.
>
> --
> Rob

Ruby is very nice. You don't have to create any classes if you don't want to; you can use it much as you would use awk or Perl.

I'm afraid that my code didn't do a perfect job of evenly distributing the files. This version is probably correct.

    #!ruby
    list = Dir[ "*" ].sort_by{ rand }
    chunk, remainder = list.size.divmod( 4 )
    start = 0
    4.times{ |i|
      amount = chunk
      if remainder > 0
        amount += 1
        remainder -= 1
      end
      File.open( "__list_#{i}", "w" ){ |f|
        f.puts( list[ start, amount ].sort )
      }
      start += amount
    }
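To see why the first version was not perfectly even, it may help to compare the group sizes the two approaches produce without touching any files. The sketch below is only an illustration: the helper names `ceil_chunk_sizes` and `divmod_sizes` and the total of 501 are made up for the example.

```ruby
# Hypothetical helper: slice lengths when every group takes a fixed chunk
# of ceil(total / groups), as in the first script. The last group gets
# whatever is left over.
def ceil_chunk_sizes(total, groups = 4)
  chunk = (total / groups.to_f).ceil
  Array.new(groups) { |i| [[total - i * chunk, chunk].min, 0].max }
end

# Hypothetical helper: slice lengths with divmod, as in the second script.
# The first `remainder` groups get one extra name each.
def divmod_sizes(total, groups = 4)
  chunk, remainder = total.divmod(groups)
  Array.new(groups) { |i| i < remainder ? chunk + 1 : chunk }
end

p ceil_chunk_sizes(501)  # => [126, 126, 126, 123]
p divmod_sizes(501)      # => [126, 125, 125, 125]
```

With 501 names the fixed-chunk approach leaves the last group three short, while the divmod approach keeps all groups within one name of each other.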
#8 |
On Mon, 30 Jul 2007 02:24:16 +1200, Rob S wrote:
> I have a directory with >500 files. I want to divide the 500 odd files
> into 4 groups, each group containing a random assortment of filenames.
> I've tried using awk with rand() and srand() until my head was ready to
> explode. I'm obviously getting the syntax wrong somewhere, but seeing as
> how I know very little about awk, it's not surprising. I've never
> professed to be a coder or programmer and I never will. I
> just can't seem to wrap my head around scripting.

You already have several nice solutions. Still, in case equal group sizes matter more than 'randomness', here is a simple idea for an algorithm:

    '{v[$0]=NR%4} END{ for(f in v){print f v[f] } }'

Sample here, in the form of a generated script of mv commands:

    $ seq 17 | awk '{v[$0]=NR%4} END{ for(f in v){print "mv " f " " f "_in_set_" v[f] } }'
    mv 17 17_in_set_1
    mv 4 4_in_set_0
    mv 5 5_in_set_1
    mv 6 6_in_set_2
    mv 7 7_in_set_3
    mv 8 8_in_set_0
    mv 9 9_in_set_1
    mv 10 10_in_set_2
    mv 11 11_in_set_3
    mv 12 12_in_set_0
    mv 13 13_in_set_1
    mv 14 14_in_set_2
    mv 1 1_in_set_1
    mv 15 15_in_set_3
    mv 2 2_in_set_2
    mv 16 16_in_set_0
    mv 3 3_in_set_3