|
|
|
#1 |
|
Hi,
I'm totally new to Ruby, so please forgive me if I'm asking a trivial question. :-)

I've written this simple program to demonstrate my first major astonishment with Ruby:

    $ cat test-hash.rb
    hash={}
    key=[1]
    hash[key]=1
    p hash

    key << 2
    hash[key]=2
    p hash
    puts
    $ ruby ./test-hash.rb
    {[1]=>1}
    {[1, 2]=>2, [1, 2]=>1}

    $

This suggests to me that the keys are actually stored in the hash by reference, not by value as I would expect from my prior experience with other languages.

I know I can use key.clone() to overcome this, but the question is: what is the reason for the hash to store its keys by reference?

My Ruby and system versions are as follows:

    $ ruby --version
    ruby 1.8.6 (2007-06-07 patchlevel 36) [i486-linux]
    $ uname -a
    Linux zrbite 2.6.21-2-686 #1 SMP Wed Jul 11 03:53:02 UTC 2007 i686 GNU/Linux

Cheers,
Alex
|
|
|
#2 |
|
Hi,
I'd also be interested in renaming one or more hash keys.

My use case: ActiveWarehouse-ETL (http://activewarehouse.rubyforge.org/) is an ETL tool that handles rows of data as Hashes, and it's very common to have to rename a field in that case.

So far here's what I do:

    # rename a bunch of fields
    def rename_fields!(row,fields_to_rename,fields_new_names)
      # raise, not throw: throw belongs to catch/throw control flow, not error signalling
      raise "Array size mismatch" unless fields_new_names.size == fields_to_rename.size
      mapping = Hash[*fields_to_rename.zip(fields_new_names).flatten]
      mapping.each { |old_name,new_name| row[new_name] = row[old_name]; row.delete(old_name) }
    end

Does anyone know a built-in way of achieving something similar?

best

Thibaut
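One possible shortening of the helper above, as a minimal sketch: Hash#delete returns the removed value, so the copy and the delete can be a single step. The name `rename_keys!` and the sample row are made up for illustration; this is not a built-in method and not part of ActiveWarehouse-ETL.

```ruby
# Hypothetical helper (not a built-in): renames keys in place.
# mapping is a Hash of old_name => new_name.
def rename_keys!(row, mapping)
  mapping.each do |old_name, new_name|
    next unless row.key?(old_name)        # skip fields the row does not have
    row[new_name] = row.delete(old_name)  # delete returns the removed value
  end
  row
end

# Made-up sample data for illustration.
row = { "fname" => "Ada", "lname" => "Lovelace", "age" => 36 }
rename_keys!(row, "fname" => "first_name", "lname" => "last_name")
p row
```

Because the renames run one after another, a mapping that swaps two names (a to b and b to a) would not behave as intended with this sketch. On Ruby 3.0 and later, Hash#transform_keys also accepts a mapping hash, which may cover the same need without a helper.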
|
|
|
#3 |
|
On Sep 17, 3:42 am, Yukihiro Matsumoto <m...@ruby-lang.org> wrote:
> Hi,
>
> In message "Re: Modifying a hash key"
> on Sun, 16 Sep 2007 17:10:05 +0900, Alex Shulgin <alex.shul...@gmail.com> writes:
>
> |This suggests me that the keys are actually stored in the hash by
> |reference, not by value as I would expect from my prior experience
> |with other languages.
>
> Hash stores its values, as its name suggests, by hash values from
> keys. So, if you modifies the key, and subsequently the hash value of
> the key, it screws up. As a general rule, don't modify the keys, or
> if you really need to modify the key for some unknown reason, call
> rehash method on the hash.

So the real answer to my question (what is the reason for the hash to store its keys by reference?) might be: trading a rule of thumb for speed and memory? If no one is ever going to modify the hash key, there is no reason to copy it...

OK, I think I got it. :-)

If anyone wonders how I came to modify hash keys: this is my second program in Ruby, and I was coding a Ruby version of the speech generator example from Kernighan & Pike's "The Practice of Programming". Here is the code snippet:

    state={}
    prefix=[]
    while $stdin.eof?
      w=... # read a word
      state[prefix] << w
      ...
      prefix << w # <-- Bang! The hash key is changed... :-)
    end

Cheers,
Alex
|
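The rehash advice quoted in the post above can be tried out with a small sketch. It reuses the array key from the first post; the variable names `stale` and `fresh` are illustrative only.

```ruby
hash = {}
key = [1]
hash[key] = 1

key << 2            # mutating the key changes the value of key.hash

# The entry is still filed under the old hash value, so a lookup with the
# mutated key normally misses and returns nil.
stale = hash[key]

hash.rehash         # re-index every entry using the keys' current hash values
fresh = hash[key]   # the entry can be found again

p stale
p fresh
```

After `rehash` the lookup returns the stored value again. Avoiding the mutation in the first place, as recommended in the thread, remains the simpler rule.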
|
|
#4 |
|
2007/9/17, Alex Shulgin <alex.shulgin@gmail.com>:
> On Sep 17, 3:42 am, Yukihiro Matsumoto <m...@ruby-lang.org> wrote:
> > Hi,
> >
> > In message "Re: Modifying a hash key"
> > on Sun, 16 Sep 2007 17:10:05 +0900, Alex Shulgin <alex.shul...@gmail.com> writes:
> >
> > |This suggests me that the keys are actually stored in the hash by
> > |reference, not by value as I would expect from my prior experience
> > |with other languages.
> >
> > Hash stores its values, as its name suggests, by hash values from
> > keys. So, if you modifies the key, and subsequently the hash value of
> > the key, it screws up. As a general rule, don't modify the keys, or
> > if you really need to modify the key for some unknown reason, call
> > rehash method on the hash.
>
> So the real answer to my question (what is the reason for hash to
> store it's keys by reference?) might be: trading rule of a thumb for
> speed and memory?

No, the reason is that you can only store by reference in Ruby (there are some internal optimizations for Fixnum and the like, but the code still basically behaves the same).

> If no one is ever going to modify the hash key,
> there is no reason to copy it...

There is also the issue that a key's hash value will likely change if you change a Hash key. So it is *never* a good idea to modify a Hash key. Note though that there is an exception: Strings are cloned if they are not frozen, to avoid nasty effects.

> OK, I think I got it. :-)
>
> If someone wonder how could I come up with modifying the hash keys:
> this is my second program in Ruby and was coding a Ruby version of a
> speech generator example from the Pike & Kernigan's "Practical
> programming". Here is the code snippet:
>
> state={}
> prefix=[]
> while $stdin.eof?

Are you serious about the line above? I'd rather have expected "until" there.

I'd do this; note all the freezing in order to make errors with changing keys obvious.

    state = Hash.new {|h,k| h[k]=[]}
    prefix = [].freeze

    ARGF.each do |line|
      line.scan /\w+/ do |word|
        state[prefix] << word.freeze
        (prefix += [word]).freeze
        # or: prefix = (prefix.dup << word).freeze
      end
    end

Btw, there's probably a more efficient way of storing this if you introduce a specialized class for prefix chaining. Probably like this:

    Prefix = Struct.new :word, :previous

> w=... # read a word
> state[prefix] << w
> ...
> prefix << w # <-- Bang! The hash key is changed... :-)

You can #dup the prefix or use +:

    prefix += [w]

This will implicitly create a new Array.

Kind regards

robert
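The String exception mentioned in this post can be observed directly. The following is a minimal sketch with made-up keys and values: an unfrozen String key is stored as a frozen copy, while an Array key is stored as the very same object.

```ruby
str_key = String.new("abc")   # an explicitly mutable String
arr_key = [1]

h = {}
h[str_key] = :string_entry
h[arr_key] = :array_entry

stored_str = h.keys.find { |k| k.is_a?(String) }
stored_arr = h.keys.find { |k| k.is_a?(Array) }

p stored_str.equal?(str_key)  # is the stored String key the same object?
p stored_str.frozen?          # is the stored copy frozen?
p stored_arr.equal?(arr_key)  # is the stored Array key the same object?

str_key << "def"              # mutating the original String...
p h["abc"]                    # ...does not disturb the stored key
```

The Array key is shared with the caller, which is why appending to it later corrupts the lookup, whereas the String key is isolated by the copy.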
|
|
|
#5 |
|
On Sep 17, 12:53 pm, "Robert Klemme" <shortcut...@googlemail.com> wrote:
>
> > So the real answer to my question (what is the reason for hash to
> > store it's keys by reference?) might be: trading rule of a thumb for
> > speed and memory?
>
> No, the reason is that you can only store by reference in Ruby (there
> are some internal optimizations for Fixnum and the like but the code
> still basically behaves the same).

Yes, but the hash implementation could make a copy of the key when it inserts new elements. Of course, this would slow down the code in a significant share of programs, so there is the trade-off I talked about. :-)

> > state={}
> > prefix=[]
> > while $stdin.eof?
>
> Are you serious about the line above? I'd rather have expected "until" there.

Sorry, of course not. It was pulled off the top of my head, since no real code was at hand (I posted that from work, and the Ruby code is something I keep at home (-: ).

> I'd do this, note all the freezing in order to make errors with
> changing keys obvious.
>
> state = Hash.new {|h,k| h[k]=[]}
> prefix = [].freeze
>
> ARGF.each do |line|
>   line.scan /\w+/ do |word|
>     state[prefix] << word.freeze
>     (prefix += [word]).freeze
>     # or: prefix = (prefix.dup << word).freeze
>   end
> end

Uh-oh... this freeze stuff seems overly complicated to me.

> Btw, there's probably a more efficient way of storing this if you
> introduce a specialized class for prefix chaining. Probably like
> this:
>
> Prefix = Struct.new :word, :previous
>
> > w=... # read a word
> > state[prefix] << w
> > ...
> > prefix << w # <-- Bang! The hash key is changed... :-)
>
> You can #dup the prefix or use +:
> prefix += [w]

My real code is as follows:

    require 'scanf'

    NPREFIX = 2

    $nwords = ARGV[0] ? ARGV[0].to_i() : 1000

    #
    # acquire knowledge
    #
    state = {}
    prefix = []
    while not $stdin.eof? do
      # w, = scanf("%s")
      words = $stdin.gets().scan(/[^\s]+/)
      words.each do |w|
        suf = state[prefix]
        if not suf
          suf = state[prefix.clone()] = []
        end
        suf << w
        if prefix.length >= NPREFIX
          prefix.shift
        end
        prefix << w
      end
    end
    state[prefix] = []

    #
    # generate pseudo-text
    #
    prefix = []
    count = 0
    while count < $nwords do
      suf = state[prefix]
      if suf.empty?
        break
      end
      w = suf[rand(suf.length)]
      print w + " "
      if prefix.length >= NPREFIX
        prefix.shift
      end
      prefix << w
      count += 1
    end

    puts

Maybe the eye of an experienced programmer could catch some more odd places in my code? See, I'm just a Ruby newbie... Please do not waste more of your time than really necessary on this. :-)

Cheers,
Alex
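For comparison, here is a sketch of the same prefix-table idea that never mutates a key: each step builds a new frozen prefix instead of shifting and appending in place. It is neither poster's code; the method names `build_table` and `generate` and the sample sentence are assumptions made for illustration.

```ruby
NPREFIX = 2

# Map each prefix (a frozen array of up to nprefix words) to the list of
# words that followed it in the input.
def build_table(words, nprefix = NPREFIX)
  table = Hash.new { |h, k| h[k] = [] }
  prefix = [].freeze
  words.each do |w|
    table[prefix] << w
    # Build a new array for the next prefix; the stored key is never touched.
    prefix = (prefix + [w]).last(nprefix).freeze
  end
  table
end

# Walk the table from the empty prefix, picking a random follower each time.
def generate(table, max_words, nprefix = NPREFIX)
  out = []
  prefix = [].freeze
  max_words.times do
    suffixes = table.fetch(prefix, [])   # fetch does not trigger the default block
    break if suffixes.empty?
    w = suffixes[rand(suffixes.length)]
    out << w
    prefix = (prefix + [w]).last(nprefix).freeze
  end
  out
end

words = "the quick brown fox jumps over the lazy dog".split
table = build_table(words)
text = generate(table, 20)
puts text.join(" ")
```

In this sample sentence every prefix has exactly one follower, so the generated text simply reproduces the input. With a longer input, repeated prefixes would give the random choice something to work with.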
|
|
|
#6 |
|
On 17.09.2007 20:19, Alex Shulgin wrote:
> On Sep 17, 12:53 pm, "Robert Klemme" <shortcut...@googlemail.com>
> wrote:
>>> So the real answer to my question (what is the reason for hash to
>>> store it's keys by reference?) might be: trading rule of a thumb for
>>> speed and memory?
>> No, the reason is that you can only store by reference in Ruby (there
>> are some internal optimizations for Fixnum and the like but the code
>> still basically behaves the same).
>
> Yes, but hash implementation could make a copy of the key when it
> inserts new elements. Of course, this will slow down the code in
> significant part of programs, so there is the trade-off I've talked
> about. :-)

That's the exact reason why this optimization was chosen for Strings only.

>>> state={}
>>> prefix=[]
>>> while $stdin.eof?
>> Are you serious about the line above? I'd rather have expected "until" there.
>
> Sorry, of course not. It was pulled off the top of my head, since no
> real code was at hand (I've posted that from work, and the Ruby code
> is something I keep at home (-: ).
>
>> I'd do this, note all the freezing in order to make errors with
>> changing keys obvious.
>>
>> state = Hash.new {|h,k| h[k]=[]}
>> prefix = [].freeze
>>
>> ARGF.each do |line|
>>   line.scan /\w+/ do |word|
>>     state[prefix] << word.freeze
>>     (prefix += [word]).freeze
>>     # or: prefix = (prefix.dup << word).freeze
>>   end
>> end
>
> Uh-oh... this freeze stuff seems overly complicated to me.

Well, it's not necessary - I just put it there in order to find bugs.

>> Btw, there's probably a more efficient way of storing this if you
>> introduce a specialized class for prefix chaining. Probably like
>> this:
>>
>> Prefix = Struct.new :word, :previous
>>
>>> w=... # read a word
>>> state[prefix] << w
>>> ...
>>> prefix << w # <-- Bang! The hash key is changed... :-)
>> You can #dup the prefix or use +:
>> prefix += [w]
>
> My real code is as follows:
>
> require 'scanf'
>
> NPREFIX = 2
>
> $nwords = ARGV[0] ? ARGV[0].to_i() : 1000
>
> #
> # acquire knowledge
> #
> state = {}
> prefix = []
> while not $stdin.eof? do
>   # w, = scanf("%s")
>   words = $stdin.gets().scan(/[^\s]+/)
>   words.each do |w|
>     suf = state[prefix]
>     if not suf
>       suf = state[prefix.clone()] = []
>     end
>     suf << w
>     if prefix.length >= NPREFIX
>       prefix.shift
>     end
>     prefix << w
>   end
> end
> state[prefix] = []
>
> #
> # generate pseudo-text
> #
> prefix = []
> count = 0
> while count < $nwords do
>   suf = state[prefix]
>   if suf.empty?
>     break
>   end
>   w = suf[rand(suf.length)]
>   print w + " "
>   if prefix.length >= NPREFIX
>     prefix.shift
>   end
>   prefix << w
>   count += 1
> end
>
> puts

You find my code "overly complicated"? Amazing...

Cheers

robert
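The bug-finding role of #freeze described in this post can be shown with a short sketch (variable names are illustrative): an in-place append to a frozen prefix raises immediately, while `+` builds a new array and leaves the frozen one alone.

```ruby
prefix = [].freeze

error = nil
begin
  prefix << "word"      # in-place modification of a frozen array
rescue => e
  error = e             # the exception class depends on the Ruby version
end
p error.class

next_prefix = prefix + ["word"]   # + returns a new, unfrozen array
p next_prefix
p prefix                          # the frozen original is unchanged
```

The accidental `prefix << w` from the original snippet would therefore fail loudly at the point of the mistake instead of silently corrupting the hash.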
|
|
|
#7 |
|
On Sep 17, 10:41 pm, Robert Klemme <shortcut...@googlemail.com> wrote:
> On 17.09.2007 20:19, Alex Shulgin wrote:
>
> >> I'd do this, note all the freezing in order to make errors with
> >> changing keys obvious.
>
> >> state = Hash.new {|h,k| h[k]=[]}
> >> prefix = [].freeze
>
> >> ARGF.each do |line|
> >>   line.scan /\w+/ do |word|
> >>     state[prefix] << word.freeze
> >>     (prefix += [word]).freeze
> >>     # or: prefix = (prefix.dup << word).freeze
> >>   end
> >> end
>
> > Uh-oh... this freeze stuff seems overly complicated to me.
>
> Well, it's not necessary - I just put it there in order to find bugs.
>
[snip]
>
> You find my code "overly complicated"? Amazing...

Oh, sorry, I didn't want to hurt anyone... First of all, your code and mine do different things, and most importantly that freeze stuff _really_ scared me. I thought it was some kind of garbage-collection voodoo. ;-)

Now I see it may be safely removed after debugging the code. This way your code looks much better, thanks!

Alex
|
|
|
#8 |
|
On 18.09.2007 20:05, Alex Shulgin wrote:
> On Sep 17, 10:41 pm, Robert Klemme <shortcut...@googlemail.com> wrote:
>> On 17.09.2007 20:19, Alex Shulgin wrote:
>>
>>>> I'd do this, note all the freezing in order to make errors with
>>>> changing keys obvious.
>>>> state = Hash.new {|h,k| h[k]=[]}
>>>> prefix = [].freeze
>>>> ARGF.each do |line|
>>>>   line.scan /\w+/ do |word|
>>>>     state[prefix] << word.freeze
>>>>     (prefix += [word]).freeze
>>>>     # or: prefix = (prefix.dup << word).freeze
>>>>   end
>>>> end
>>> Uh-oh... this freeze stuff seems overly complicated to me.
>> Well, it's not necessary - I just put it there in order to find bugs.
>>
> [snip]
>> You find my code "overly complicated"? Amazing...
>
> Oh, sorry, I didn't want to hurt anyone...

Not hurt, just astonished. :-)

> First of all your and mine code do different things

Yes and no: your code does more, but as far as I can see the gathering does basically the same thing in different ways.

>, and most
> importantly that freeze stuff _really_ scared me. I thought it was
> some kind of garbage-collection voodoo. ;-)

No, it just prevents changing an instance. #freeze has nothing to do with GC (unless you count not being able to overwrite a reference to an instance with a reference to nil).

> Now I see it may be safely removed after debugging the code. This way
> your code looks much better, thanks!

:-)

Cheers

robert
|
|
|