#1 |
|
Messages: n/a
Hébergeur: |
[Note: parts of this message were removed to make it a legal post.]
I am still running out of memory with my Ruby application. For some reason, there are objects that are not getting reclaimed long after their use is up. I understand the mark-and-sweep GC algorithm. Is there a method in Ruby to set an object's memory space to be collected?

-- Joey Marino
#2
Joey Marino wrote:
> I am still running out of memory with my ruby application. For some reason,
> there are objects that are not getting reclaimed long after their use is up.
> I understand the mark and sweep GC algorithm. Is there a method in Ruby to
> set an object's memory space to be collected?

Ruby's GC will eventually collect unreferenced objects without any intervention on your part. It may not collect them on the first sweep, but it will collect them. You can force GC to run earlier than normal by calling GC.start, but that doesn't necessarily mean that every unreferenced object will be collected on that sweep.

In one of your earlier posts you said you were using a 3rd-party library? Have you determined that it's not responsible for the memory allocations?

Have you determined that no references exist to objects that you think should have been cleaned up?

--
RMagick: http://rmagick.rubyforge.org/
RMagick 2: http://rmagick.rubyforge.org/rmagick2.html
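A minimal sketch of forcing a collection, for readers who want to try the GC.start call mentioned above. GC.start and GC.count are core Ruby methods; the helper name `force_gc` and the one-megabyte string are made up for illustration and are not taken from the original application.

```ruby
# Hypothetical helper: request a collection and report how many GC runs
# happened as a result.
def force_gc
  before = GC.count
  GC.start   # asks for a full collection; it does not promise that every
             # unreferenced object is reclaimed on this particular pass
  GC.count - before
end

big = "x" * 1_000_000   # stand-in for one downloaded image held as a String
big = nil               # drop the only reference so the String becomes garbage
runs = force_gc
puts "GC ran #{runs} time(s)"
```

Calling this after every image only changes when collection happens. It cannot free a string that something else still references.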
#3
[Note: parts of this message were removed to make it a legal post.]
I found out it is not the 3rd-party code. I am downloading images (~320,000 ct.) as strings, and saving each to a file. The file handlers are getting closed. Using the Dike gem, I think I determined the bulk of the leaked memory is in these string variables. But what I don't understand is that the variables containing the strings are local to a method of an object and should be unreferenced when that method is done, right? I've invoked GC at the end of the method and after each iteration in which the method gets called. I've even set the strings to nil after they've been saved to a file and before GC is called. The app is using 2G of RAM and 4G of swap before it runs out of memory and crashes about 1/3rd of the way through. I'm really starting to doubt Ruby's ability to do memory-intensive work. Any ideas?

On Sun, Mar 30, 2008 at 3:53 PM, Tim Hunter <TimHunter@nc.rr.com> wrote:
> [...]

-- Joey Marino
#4
Joey Marino wrote:
> I found out it is not the 3rd party code. I am downloading images (~320,000
> ct.) as strings, and saving each to a file.

Why? If the images come from the Internet and go to disk, why do you need to read them into Ruby working storage? Can't you just shell out to "wget" or "curl"?
#5
[Note: parts of this message were removed to make it a legal post.]
Good question. Unfortunately, this is not a simple HTTP server. It's an industry-standard client/server protocol called RETS (Real Estate Transaction Standard). It requires a third-party library in order to interface with the server. I wish it were that simple!!

On Sun, Mar 30, 2008 at 8:16 PM, M. Edward (Ed) Borasky <znmeb@cesmail.net> wrote:
> [...]

-- Joey Marino
#6
Joey Marino wrote:
> I found out it is not the 3rd party code. I am downloading images (~320,000
> ct.) as strings, and saving each to a file. The file handlers are getting
> closed. Using the Dike gem, I think I determined the bulk of the leaked
> memory is in these string variables. But what I don't understand is that the
> variables containing the strings are local to a method of an object and
> should be unreferenced when that method is done, right? I've invoked GC at
> the end of the method and after each iteration that the method gets called.
> I've even set the strings to nil after they've been saved to a file and
> before gc is called. The app is using 2G of ram and 4G of swap before it
> runs of out memory and crashes about 1/3rd of the way through. I'm really
> starting to doubt Ruby's ability to do memory intensive work. Any ideas?

Many people have used Ruby for long-running tasks that use a lot of memory. If Ruby was not collecting unused strings, don't you think somebody would have noticed it by now?

Actually, I think you've ruled out Ruby being the problem. What else is running? Could it be using up the memory?

--
RMagick: http://rmagick.rubyforge.org/
RMagick 2: http://rmagick.rubyforge.org/rmagick2.html
#7
From: "Joey Marino" <joey.da3rd@gmail.com>
> I found out it is not the 3rd party code. I am downloading images (~320,000
> ct.) as strings, and saving each to a file. The file handlers are getting
> closed. Using the Dike gem, I think I determined the bulk of the leaked
> memory is in these string variables. But what I don't understand is that the
> variables containing the strings are local to a method of an object and
> should be unreferenced when that method is done, right? I've invoked GC at
> the end of the method and after each iteration that the method gets called.
> I've even set the strings to nil after they've been saved to a file and
> before gc is called. The app is using 2G of ram and 4G of swap before it
> runs of out memory and crashes about 1/3rd of the way through. I'm really
> starting to doubt Ruby's ability to do memory intensive work. Any ideas?

Do you have a small bit of code that reproduces the problem?

How are you downloading the images? Net::HTTP? Or...?

If you can provide a small program that exhibits the memory leak I'm sure others here would be happy to try it on their systems as well.

Regards,

Bill
#8
[Note: parts of this message were removed to make it a legal post.]
I shouldn't have criticized Ruby like that; it actually is my second favorite language, right behind PHP. If I learned RoR, it might be my favorite. It really does a good job, is easy to write, and has a lot of features. Not to mention a great community. I am just really frustrated about this stupid app that I can't get working right. I am working on a workaround, since it is only going to do a major download once and then update daily. Thanks for all the help though, it feels good that there is a place to turn to for help.

On Sun, Mar 30, 2008 at 10:31 PM, Bill Kelly <billk@cts.com> wrote:
> [...]

-- Joey Marino
#9
On Sun, Mar 30, 2008 at 5:07 PM, Joey Marino <joey.da3rd@gmail.com> wrote:
> I found out it is not the 3rd party code. I am downloading images (~320,000
> ct.) as strings, and saving each to a file. The file handlers are getting
> closed. Using the Dike gem, I think I determined the bulk of the leaked
> memory is in these string variables. But what I don't understand is that the
> variables containing the strings are local to a method of an object and
> should be unreferenced when that method is done, right?

Usually, yes. However, there are ways a reference could be kept live beyond the end of the method. Obviously, if you pass the string out of the method to something else that keeps a reference to it, that would do it. But the same happens if you pass along a closure (e.g., a proc object) that refers to the method invocation's binding (I think that's the right terminology) and that closure gets stored somewhere else: that would keep the method's local variables "live" even after the method exits.

It's hard to say if something like that might be happening without the code.
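To make the closure case concrete, here is a small sketch. The method `process_image` and its variable names are hypothetical and are not taken from the poster's code. Proc#binding and Binding#local_variable_get are core Ruby methods, but they need a far newer Ruby than the 1.8 series discussed in this thread.

```ruby
# Hypothetical stand-in for a method that handles one downloaded image.
# The returned proc never mentions `data`, yet the proc's binding still
# keeps all of the method's local variables reachable.
def process_image
  data = "x" * 1_000_000      # pretend this is the image string
  done = proc { :finished }   # closure created while `data` is in scope
  done
end

callbacks = []                # any long-lived container behaves the same way
callbacks << process_image

# The local is still reachable through the stored proc, so the GC must keep
# the one-megabyte string alive for as long as `callbacks` holds the proc.
kept = callbacks.first.binding.local_variable_get(:data)
puts "still holding #{kept.bytesize} bytes"
```

If a library stores a block that was passed in from inside the download method, the same thing can happen without any explicit reference to the string.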
#10
On 31/03/2008, Tim Hunter <TimHunter@nc.rr.com> wrote:
> Joey Marino wrote:
> > I found out it is not the 3rd party code. I am downloading images (~320,000
> > ct.) as strings, and saving each to a file. The file handlers are getting
> > closed. Using the Dike gem, I think I determined the bulk of the leaked
> > memory is in these string variables. But what I don't understand is that the
> > variables containing the strings are local to a method of an object and
> > should be unreferenced when that method is done, right? I've invoked GC at
> > the end of the method and after each iteration that the method gets called.
> > I've even set the strings to nil after they've been saved to a file and
> > before gc is called. The app is using 2G of ram and 4G of swap before it
> > runs of out memory and crashes about 1/3rd of the way through. I'm really
> > starting to doubt Ruby's ability to do memory intensive work. Any ideas?
>
> Many people have used Ruby for long-running tasks that use a lot of memory.
> If Ruby was not collecting unused strings, don't you think somebody would
> have noticed it by now?

I have noticed :->

Michal
#11
[Note: parts of this message were removed to make it a legal post.]
Ok, I was able to get it all into one class. The problem lies in this class:

class Picture

  def initialize(db,rets,rets_class)
    @db = db
    @rets = rets
    @rets_class = rets_class
    @attempts = 0
  end

  def getPic(key)
    begin
      get_object_request = GetObjectRequest.new(@rets_class, "Photo")
      get_object_request.add_all_objects(key)
      get_object_response = @rets.session.get_object(get_object_request)
      content_type_suffixes = { "image/jpeg" => "jpg"}
      makePicDir(key)
      get_object_response.each_object do |object_descriptor|
        object_key = object_descriptor.object_key
        obj_id = object_descriptor.object_id
        content_type = object_descriptor.content_type
        description = object_descriptor.description
        #print "#{object_key} object \##{object_id}"
        #print ", description: #{description}" if !description.empty?
        #puts
        suffix = content_type_suffixes[content_type]
        pic = object_descriptor.data_as_string
        savePic(key,obj_id.to_s,suffix,description,pic)
      end
      get_object_response = nil
    rescue => e
      puts "Error retrieving pictures for #{key}: " + e
      if @attempts <= 5
        @attempts += 1
        puts "retrying"
        retry
      else
        puts "failed"
        @attempts = 0
      end
    end
    @attempts = 0
  end

  def getThumb(key)
    begin
      get_object_request = GetObjectRequest.new(@rets_class, "Thumbnail")
      get_object_request.add_all_objects(key)
      get_object_response = @rets.session.get_object(get_object_request)
      content_type_suffixes = { "image/jpeg" => "jpg"}
      get_object_response.each_object do |object_descriptor|
        object_key = object_descriptor.object_key
        obj_id = object_descriptor.object_id
        content_type = object_descriptor.content_type
        description = object_descriptor.description
        #print "#{object_key} object \##{object_id}"
        #print ", description: #{description}" if !description.empty?
        #puts
        suffix = content_type_suffixes[content_type]
        pic = object_descriptor.data_as_string
        savePic(key,obj_id.to_s,suffix,description,pic,true)
      end
      get_object_response = nil
    rescue => e
      puts "Error retrieving thumbs for #{key}: " + e
      if @attempts <= 5
        @attempts += 1
        puts "retrying"
        retry
      else
        puts "failed"
        @attempts = 0
      end
    end
    @attempts = 0
  end

  def makePicDir(key)
    FileUtils.mkpath("#{$pic_dir}#{key}/thumb")
  end

  def savePic(key,id,suffix,desc,pic,thumb_bool=false)
    if thumb_bool
      file_name = $pic_dir + key + "/thumb/" + id + "." + suffix
      location = "/" + key + "/thumb/" + id + "." + suffix
    else
      file_name = $pic_dir + key + "/" + id + "." + suffix
      location = "/" + key + "/" + id + "." + suffix
    end
    self.savePicFile(file_name,pic)
    size = File.size(file_name)
    if thumb_bool
      self.insertThumbDB(key,id,location)
    else
      self.insertPicDB(key,id,desc,size,location)
    end
  end

  def savePicFile(file_name,pic)
    f = File.open(file_name, "wb")
    f << pic
    f.close
  end

  def insertPicDB(key,id,desc,size,location)
    description = @db.database.escape_string(desc)
    if @db.DBinsert("PICS","pkey,id,description,size,location","#{key},#{id},'#{description}','#{size}','#{location}'")
      # puts "#{key} #{id} pic added"
      print ":"
    end
  end

  def insertThumbDB(key,id,location)
    if @db.DBupdate("PICS","thumb = '#{location}'"," pkey = #{key} and id = #{id}")
      # puts "#{key} #{id} thumb added"
      print "."
    end
  end

  def deletePic(key)
    self.deletePicDir(key)
    self.deletePicDB(key)
  end

  def deletePicDir(key)
    if File.exists?("#{$pic_dir}#{key}")
      FileUtils.remove_dir("#{$pic_dir}#{key}")
    end
  end

  def deletePicDB(key)
    if @db.DBdelete("PICS","pkey = #{key}")
      print "-"
      # puts "#{key} pics deleted from db"
    end
  end

end #end class

On Wed, Apr 2, 2008 at 6:02 AM, Michal Suchanek <hramrach@centrum.cz> wrote:
> [...]

-- Joey Marino
#12
Few suggestions, some not really related to the problem:

On Wed, Apr 2, 2008 at 6:52 PM, Joey Marino <joey.da3rd@gmail.com> wrote:
> Ok, I was able to get it all into one class. The problem lies in this class:
> class Picture
> [...]
>   def getPic(key)
>     begin
>       get_object_request = GetObjectRequest.new(@rets_class, "Photo")
>       get_object_request.add_all_objects(key)
>       get_object_response = @rets.session.get_object(get_object_request)
>       content_type_suffixes = { "image/jpeg" => "jpg"}

Make content_type_suffixes a class constant, or a member if you need to append to it. As written, you are constructing and destroying the object on each method call.

>       makePicDir(key)
>       get_object_response.each_object do |object_descriptor|
> [...]
>         #print "#{object_key} object \##{object_id}"
-         #print ", description: #{description}" if !description.empty?
+         #print ", description: #{description}" unless description.empty?   # a matter of taste/style
> [...]
>   end

It seems that you could refactor the common code of these two methods into a new one. The benefit would be shorter, more readable code and a better split of responsibilities; the drawback would be slower execution.

>   def getThumb(key)
> [...]
>   def savePic(key,id,suffix,desc,pic,thumb_bool=false)
> [...]
>     self.savePicFile(file_name,pic)

self is not necessary; the same applies below.

> [...]
>   def savePicFile(file_name,pic)
-     f = File.open(file_name, "wb")
+     File.open(file_name, "wb") do |f|
>       f << pic
-     f.close
+     end   # automatic close on exceptions
>   end
> [...]
>   def deletePicDir(key)
-     if File.exists?("#{$pic_dir}#{key}")
-       FileUtils.remove_dir("#{$pic_dir}#{key}")
-     end
+     pic_dir = $pic_dir + key   # save one allocation
+     FileUtils.remove_dir pic_dir if File.directory? pic_dir
>   end
> [...]
> end #end class

I'd suspect the database code is keeping some cache. This code seems fine.

I see you are using libRETS. Did you try to rule it out by replacing calls to libRETS (especially to data_as_string) with some stubs (create a random long string on the fly)? If you are on Unix, you can use IO.read("/dev/random", 100000). If you are on Windows, choose another long enough file to read.

The documentation says that GetData() abandons ownership of the object it returns. It's possible that the SWIG-generated wrapper doesn't handle this properly.

Jano
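The stub suggestion above could be sketched roughly as follows. Everything here is an assumption made for illustration: `fake_image_data` stands in for data_as_string, `save_pic_file` mirrors savePicFile using the block form of File.open, and SecureRandom is used instead of reading /dev/random so the sketch also runs on Windows and never blocks waiting for entropy. No libRETS or database calls are involved.

```ruby
require "securerandom"
require "tmpdir"

# Hypothetical stub for object_descriptor.data_as_string: returns random
# bytes so the loop exercises only Ruby strings and file I/O.
def fake_image_data(size = 100_000)
  SecureRandom.random_bytes(size)
end

# Same job as savePicFile; the block form closes the file even if the
# write raises.
def save_pic_file(file_name, pic)
  File.open(file_name, "wb") { |f| f << pic }
end

# Write `count` fake images into `dir` and return how many files exist.
def run_stubbed_download(dir, count)
  count.times do |i|
    save_pic_file(File.join(dir, "#{i}.jpg"), fake_image_data)
  end
  Dir.glob(File.join(dir, "*.jpg")).size
end

Dir.mktmpdir do |dir|
  written = run_stubbed_download(dir, 50)
  puts "wrote #{written} files"
end
```

If memory stays flat over a long stubbed run but grows with the real data_as_string calls, that points toward the binding rather than toward Ruby's string handling. Stubbing the database calls instead would test the cache suspicion in the same way.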
#13
>> Many people have used Ruby for long-running tasks that use a lot of memory.
>> If Ruby was not collecting unused strings, don't you think somebody would
>> have noticed it by now?
>
> I have noticed :->
>
> Michal

I have too, and it drives me crazy when my Mongrel instances eat up 600MB of memory. I'd be willing to offer a bounty of $150 to anyone able to clear this up. It happens especially often when you run multiple threads, it seems. Probably a Rails thing, but anyway. Enough ranting. Have a good one.
-R
--
Posted via http://www.ruby-forum.com/.
#14
On 4/5/08, Roger Pack <rogerpack2005@gmail.com> wrote:
> >> Many people have used Ruby for long-running tasks that use a lot of memory.
> >> If Ruby was not collecting unused strings, don't you think somebody would
> >> have noticed it by now?
> >
> > I have noticed :->
> >
> > Michal
>
> I have too, and it drives me crazy when my mongrel instances eat up
> 600MB of memory.
> I'd be willing to offer a bounty of $150 to anyone able to clear this
> up. It happens especially often when you run multiple threads, it
> seems.

Send me the 150 and I'll send you a copy of Ramaze

> Probably a rails thing, but anyway.
> Enough ranting.
> Have a good one.
> -R
> --
> Posted via http://www.ruby-forum.com/.
#15
What version of Ruby are you running? I recently saw an issue using 1.8.6 with a low patch level. Upgrading to p114 might solve things for you.

On Fri, Apr 4, 2008 at 7:26 PM, Roger Pack <rogerpack2005@gmail.com> wrote:
> [...]
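For anyone unsure which build they are on, a quick check from inside a script might look like this. RUBY_VERSION and RUBY_PATCHLEVEL are constants the interpreter defines; the helper name `ruby_build_info` is made up for illustration.

```ruby
# Report the interpreter version and patch level of the running Ruby.
def ruby_build_info
  "ruby #{RUBY_VERSION} patchlevel #{RUBY_PATCHLEVEL}"
end

puts ruby_build_info
```

Running `ruby -v` from the shell reports the same information.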