#1 |
|
Posts: n/a
Host: |
Quoting Charles Oliver Nutter <charles.nutter / sun.com>:
> znmeb / cesmail.net wrote:
>> Quoting Charles Oliver Nutter <charles.nutter / sun.com>:
>>
>>> Many people believed we'd never be faster than the C implementation,
>>> and many still think we're slower. Now that I've set that record
>>> straight, any questions?
>>
>> 1. How long will it be before Alioth has some *reasonable* numbers
>> for jRuby? As of yesterday, they still have you significantly
>> slower than MRI. So I need to take jRuby out of my slides for
>> RubyConf ...
>
> The current published Alioth numbers are based on JRuby 1.0(ish), which
> was generally 2-3x slower than MRI. I'm hoping the numbers will be
> updated soon after the 1.1 releases...but it probably won't happen
> until 1.1 final comes out in December. If someone else wants to re-run
> them for us, it would make us very happy

An "Update Programming Language" "Feature Request" will usually get our attention.

Coincidentally, I did grab 1.1b1 so the benchmarks game has new measurements:

http://shootout.alioth.debian.org/gp...all&lang=jruby
|
|
|
#2 |
|
> Coincidentally, I did grab 1.1b1 so the benchmarks game has new
> measurements
>
> http://shootout.alioth.debian.org/gp...all&lang=jruby

Wow, it would appear that jruby is indeed faster, and indeed uses a lot more memory (or maybe that's just startup overhead). Thanks for a good program!

--
Posted via http://www.ruby-forum.com/.
|
|
|
#3 |
|
Roger Pack wrote:
>> Coincidentally, I did grab 1.1b1 so the benchmarks game has new
>> measurements
>>
>> http://shootout.alioth.debian.org/gp...all&lang=jruby
>
> Wow it would appear that jruby is indeed faster, and indeed uses a lot
> more memory (or maybe that's just startup overhead). thanks for a
> good program!

It's another well-known fact about running on the JVM that we have to suck it up and accept there's an initial memory chunk eaten up by every JVM process. If one excludes that initial cost, most measurements have us using less memory than C Ruby...so for very large apps we end up coming out ahead. But for small, short apps, the initial slow startup and high memory usage is going to be a battle we fight for a long time.

- Charlie
|
|
|
#4 |
|
> It's another well-known fact about running on the JVM that we have to
> suck it up and accept there's an initial memory chunk eaten up by every
> JVM process. If one excludes that initial cost, most measurements have
> us using less memory than C Ruby...so for very large apps we end up
> coming out ahead. But for small, short apps, the initial slow startup
> and high memory usage is going to be a battle we fight for a long time.
>
> - Charlie

If you run multiple threads I assume there isn't an extra memory cost for that--is that right?
|
|
|
#5 |
|
Roger Pack wrote:
> If you run multiple threads I assume there isn't an extra memory cost
> for that--is that right?

Every thread is going to need its own stack, but that'll be small compared to the startup overhead. I'm sure Charles will elaborate.

I'm guessing that JRuby still doesn't support continuations...? I think that would require the "spaghetti stack" model, which would remove most of the per-thread initial stack overhead.

Clifford Heath.
|
|
|
#6 |
|
Clifford Heath wrote:
> Roger Pack wrote:
>> If you run multiple threads I assume there isn't an extra memory cost
>> for that--is that right?
>
> Every thread is going to need its own stack, but that'll be small
> compared to the startup overhead. I'm sure Charles will elaborate.

Our threads will be a lot more expensive than Ruby's, but a lot cheaper than a separate process in either world.

> I'm guessing that JRuby still doesn't support continuations...? I
> think that would require the "spaghetti stack" model, which would
> remove most of the per-thread initial stack overhead.

Our official stance is that JRuby won't support continuations until the JVM does. We could emulate them by forcing a stackless implementation, but it would be *drastically* slower than what we have now.

- Charlie
|
|
|
#7 |
|
Roger Pack wrote:
>> It's another well-known fact about running on the JVM that we have to
>> suck it up and accept there's an initial memory chunk eaten up by every
>> JVM process. If one excludes that initial cost, most measurements have
>> us using less memory than C Ruby...so for very large apps we end up
>> coming out ahead. But for small, short apps, the initial slow startup
>> and high memory usage is going to be a battle we fight for a long time.
>>
>> - Charlie
>
> If you run multiple threads I assume there isn't an extra memory cost
> for that--is that right?

Yes, generally. They won't be as light as Ruby's green threads, but then Ruby's threads can't actually run in parallel anyway.

- Charlie
|
|
|
#8 |
|
Roger Pack wrote:
>> Coincidentally, I did grab 1.1b1 so the benchmarks game has new
>> measurements
>>
>> http://shootout.alioth.debian.org/gp...all&lang=jruby
>
> Wow it would appear that jruby is indeed faster, and indeed uses a lot
> more memory (or maybe that's just startup overhead). thanks for a
> good program!

I just committed an addition to JRuby that allows you to spin up a "server" JRuby instance (using "Nailgun") in the background and feed it commands. See the startup difference using this:

normal:

    ~/NetBeansProjects/jruby $ time jruby -e "puts 'hello'"
    hello

    real 0m1.944s
    user 0m1.511s
    sys 0m0.138s

nailgun:

    ~/NetBeansProjects/jruby $ time jruby-ng -e "puts 'hello'"
    hello

    real 0m0.103s
    user 0m0.006s
    sys 0m0.009s

Here's a post from the JRuby list describing how to use this, for those of you that are interested. Also, this allows you to avoid the startup memory cost for every command you run since you can just issue commands to that running server and it will re-use memory. After running a bunch of commands on my system, that server process was still happily under 60M, and never went any higher.

...

I've got Nailgun working with JRuby just great now.

    bin/jruby-ng-server
    bin/jruby-ng

If you want to use the server, say if you're going to be running a lot of command-line tools, just spin it up in the background somewhere.

    jruby-ng-server > /dev/null 2> /dev/null &

And then use the jruby-ng command instead, or alias it to "jruby":

    alias jruby=jruby-ng

You'll need to make the ng client command on your platform, by running 'make' under bin/nailgun, but then everything should function correctly.

    jruby-ng -e "puts 'here'"

The idea is that users will have a new option to try. For JRuby, where we have no global variables, no dependencies on static fields, and already depend on our ability to spin up many JRuby instances in a single JVM, this ends up working very well. It's building off features we already provide, and giving users the benefit of a fast, pre-initialized JVM without the startup hit.

I think we're probably going to ship with this for JRuby 1.1 now. It's working really well. I've managed to resolve the CWD issue by defining my own "nailMain" next to our existing "main", and ENV vars are being passed along as well.

The one big remaining complication I don't have an answer for just yet is OS signals; they get registered only in the server process, so signals from the client don't propagate through. It's fixable of course, by having the client register and the server just listen for client signal events, but that isn't supported in the current NG. So there's some work to do.

All the NG stuff is in JRuby trunk right now. Give it a shot. I'm interested in hearing opinions on it.

- Charlie
|
|
|
#9 |
|
> Wow it would appear that jruby is indeed faster, and indeed uses a lot
> more memory (or maybe that's just startup overhead). thanks for a
> good program!

I wonder if jruby uses reference counting for its Ruby objects (or if it even matters), and if not, maybe someday it would. I'm just in a pro-reference-counting mood these days.

-Roger
|
|
|
#10 |
|
On 11/9/07, Roger Pack <rogerpack2005@gmail.com> wrote:
> > Wow it would appear that jruby is indeed faster, and indeed uses a lot
> > more memory (or maybe that's just startup overhead). thanks for a
> > good program!
>
> I wonder if jruby uses reference counting for its ruby objects (or if it
> even matters), and if not maybe someday it would I'm just in a pro-
> reference counting mood these days

I very much doubt it.

Roger, you REALLY need to read the literature on GC which has been accumulating for the past 50 years.

Reference counting is pretty much an obsolete approach to GC. It was probably the first approach taken for Lisp back in the 1950s. Other language implementations usually started with reference counting (e.g. the first Smalltalk).

Its main advantage is that it's easy to understand. On the other hand, it incurs a large overhead since counts need to be incremented/decremented on every assignment. It can't detect circular lists of dead objects. In early Smalltalk programs, when reference counting was used, you needed to explicitly nil out references to break such chains. There's also the issue of the overhead for storing the reference count, and how many bits to allocate. Most reference counting implementations punt when the reference count overflows: they treat a 'full' count as an infinite count and no longer decrement it, leading to more uncollectable objects.

Mark and sweep, such as is used in the Ruby 1.8 implementation, quickly replaced reference counting as the simplest GC considered for real use.

More modern GCs tend to use copying GCs which move live objects to new heap blocks, leaving the dead ones behind. And most use generational scavenging, which takes advantage of the observation that most objects either die quite young, or live a long time. This approach was pioneered by David Ungar in the Berkeley implementation of Smalltalk-80. And this is the kind of GC typically used in JVMs today.

Which particular GC approach is best for Ruby is subject to some study. Many of the usages of Ruby aren't quite like those of Java, or Smalltalk. I had dinner with a former colleague, who happens to be the lead developer of the IBM J9 Java virtual machine, and he made the observation that Java, and Smalltalk before it, have a long history of having their VMs tuned for long-running processes. On the other hand, many Ruby usages are get in and get out. These use cases mean that it's more valuable to have rapid startup than perfect GC in the sense that all dead objects are reclaimed quickly, not that any of the current GCs guarantee the latter.

So the best GC for Ruby might not be the same as would be used for a JVM or Smalltalk VM, but I'm almost certain it would be a reference counter.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/
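The difference Rick describes between reference counting and mark-and-sweep can be seen in a toy model. The sketch below is only an illustration: `Obj`, `Heap`, and every method name are made up for this example and do not reflect how MRI, JRuby, or any JVM manages memory. It builds a two-object cycle, drops the only outside reference, and shows that the counts never reach zero while a mark-and-sweep pass from the roots reclaims both objects.

```python
class Obj:
    """A heap object in the toy model (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.refs = []   # outgoing references to other Obj instances
        self.count = 0   # reference count maintained by the toy heap


class Heap:
    def __init__(self):
        self.objects = []  # every allocated, not yet reclaimed object
        self.roots = []    # references held directly by the "program"

    def alloc(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def add_ref(self, holder, target):
        """Store a reference; holder=None means a root reference."""
        (self.roots if holder is None else holder.refs).append(target)
        target.count += 1

    def drop_ref(self, holder, target):
        """Remove a reference and free the target if its count hits zero."""
        (self.roots if holder is None else holder.refs).remove(target)
        target.count -= 1
        if target.count == 0:
            self._free(target)

    def _free(self, obj):
        self.objects.remove(obj)
        for child in list(obj.refs):
            self.drop_ref(obj, child)

    def mark_and_sweep(self):
        """Mark everything reachable from the roots, then sweep the rest."""
        marked = set()
        stack = list(self.roots)
        while stack:
            obj = stack.pop()
            if id(obj) not in marked:
                marked.add(id(obj))
                stack.extend(obj.refs)
        self.objects = [o for o in self.objects if id(o) in marked]


heap = Heap()
a, b = heap.alloc("a"), heap.alloc("b")
heap.add_ref(None, a)    # the program holds a
heap.add_ref(a, b)       # a -> b
heap.add_ref(b, a)       # b -> a, closing the cycle
heap.drop_ref(None, a)   # the program lets go of a

after_refcount = [o.name for o in heap.objects]  # cycle is still allocated
heap.mark_and_sweep()
after_sweep = [o.name for o in heap.objects]     # cycle has been reclaimed
print(after_refcount, after_sweep)
```

In this model the only cost of mark-and-sweep is the traversal itself; a real collector also has to pause or coordinate with the running program, which is the trade-off the rest of the thread argues about.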
|
|
|
#11 |
|
Rick DeNatale wrote:
> Reference counting is pretty much an obsolete approach to GC. It was
> probably the first approach taken for lisp back in the 1950s. Other
> language implementations usually started with reference counting (e.g.
> the first Smalltalk).
>
> Its main advantage is that it's easy to understand.

I don't think reference counting is any easier to understand than pure mark-and-sweep or pure stop-and-copy. The main advantage of reference counting in my opinion is that its restrictions force you to kick some features out of your language design if you want to use it.

> Mark and sweep, such as is used in the Ruby 1.8 implementation quickly
> replaced reference counting as the simplest GC considered for real
> use.

My recollection is that mark-and-sweep was the original, and that reference counting came later.

> More modern GCs tend to use copying GCs which move live objects to new
> heap blocks leaving the dead ones behind. And most use generational
> scavenging which takes advantage of the observation that most objects
> either die quite young, or live a long time. This approach was
> pioneered by David Ungar in the Berkeley implementation of
> Smalltalk-80. And this is the kind of GC typically used in JVMs
> today.

Bah ... I actually found a reference a couple of days ago on this (http://portal.acm.org/citation.cfm?id=91597). If you're not signed up for the ACM library it will cost you money to read it. But essentially "pure" mark-and-sweep was replaced by stop-and-copy, which compacts the heap. Then generational mark-and-sweep came along and "rehabilitated" mark-and-sweep. Note the publication date -- 1990. The abstract is free -- it reads:

"Stop-and-copy garbage collection has been preferred to mark-and-sweep collection in the last decade because its collection time is proportional to the size of reachable data and not to the memory size. This paper compares the CPU overhead and the memory requirements of the two collection algorithms extended with generations, and finds that mark-and-sweep collection requires at most a small amount of additional CPU overhead (3-6%) but, requires an average of 20% (and up to 40%) less memory to achieve the same page fault rate. The comparison is based on results obtained using trace-driven simulation with large Common Lisp programs."

> Which particular GC approach is best for Ruby is subject to some study.

I think at least for Rails on Linux, someone (assuming funding) could collect and analyze plenty of data. I'd actually be surprised if someone *isn't* doing it, although I know *I'm* not.

> Many of the usages of ruby aren't quite like those of Java, or
> Smalltalk. I had dinner with a former colleague, who happens to be
> the lead developer of the IBM J9 java virtual machine, and he made the
> observation that Java, and Smalltalk before it have a long history of
> having their VMs tuned for long running processes. On the other hand
> many Ruby usages are get in and get out. These use cases mean that
> it's more valuable to have rapid startup than perfect GC in the sense
> that all dead objects are reclaimed quickly, not that any of the
> current GCs guarantee the latter.

Well ... OK. If you want to distinguish between long running (server) and rapid startup (client), that's fine. But look at the marketplace. We have servers, we have laptop clients, we have desktop clients, we have mobile clients, and we have bazillions of non-user-programmable computers like DVD players, iPods, in-vehicle navigation systems, etc.

Now while the hard-core hackers like me wouldn't buy an iPod or a DVD player, preferring instead to add hard drive space to a real computer, Apple isn't exactly going broke making iPods and iPhones that are (for the moment, anyhow) closed to "outsiders". And I'm guessing that, while you *can* run Ruby on, say, an embedded ARM/Linux platform, most of the software in those gizmos is written in C and heavily optimized. I've got a couple of embedded toolkits, and I've actually built Ruby for them, but when you only have 32 MB of RAM, you don't want to collect garbage -- you don't even want to *generate* garbage!

So I wouldn't personally spend much time thinking about garbage collection for rapid startup. If you want rapid startup, you're going to have as much binding as possible done at compile time -- you aren't even going to compile a Ruby script to an AST when you start a process up.

> So the best GC for Ruby might not be the same as would be used for a
> JVM or Smalltalk VM, but I'm almost certain it would be a reference
> counter.

Did you mean to say, "not be a reference counter"?
|
|
|
#12 |
|
M. Edward (Ed) Borasky wrote:
> Rick DeNatale wrote:
>> Its main advantage is that it's easy to understand.
>
> I don't think reference counting is any easier to understand than pure
> mark-and-sweep or pure stop-and-copy. The main advantage of reference
> counting in my opinion is that its restrictions force you to kick some
> features out of your language design if you want to use it.

Those features being "finalizers"? IMHO its main advantage is being prompt, so you don't have to worry about resources hanging around after they're no longer needed.
|
|
|
#13 |
|
> Those features being "finalizers"? IMHO its main advantage is being
> prompt, so you don't have to worry about resources hanging around after
> they're no longer needed.

I agree--it seems that the promptness would allow it to take advantage of the CPU caches to still be fast.

The disadvantage, as some people above have pointed out, is that you may lose compactness of the heap space.

Also, it requires an extension's 'container' objects (those that include references to other objects that might somehow create cycles) to provide a 'traverse' function which yields a list of accessible pointers, so that you can traverse containers and stomp cycles every so often. This is very similar to today's 'gc_mark' function that they already provide. Today's extensions would also have to be slightly rewritten to use the 'dec' and 'inc' functions for the reference count of contained objects (similar to their gc_mark function, again).

So anyway I agree--promptness is good. I don't know too much on the subject, though, having never read a paper on it.

-Roger
|
|
|
#14 |
|
Roger Pack wrote:
> I agree--it seems that the promptness would allow it to take advantage
> of the cpu caches to still be fast.
> The disadvantage, as some people above have pointed out, is that you may
> lose compactness of the heap space.

I'm not sure on this one. Given that a compacting collector needs several times as much RAM available as in use to be efficient, and that a reference-counting collector probably gives no more fragmentation than malloc, it's hard to say which way locality would go.

The traditional objection to reference counting is that you spend a lot of time adjusting reference counts. But with CPUs so much faster than RAM nowadays, that may matter less. Anyways, for more than you ever wanted to know about GC, here's a slightly-dated but still excellent survey paper:

ftp://ftp.cs.utexas.edu/pub/garbage/bigsurv.ps
|
|
|
#15 |
|
> I'm not sure on this one. Given that a compacting collector needs
> several times as much RAM available as in use to be efficient, and that
> a reference-counting collector probably gives no more fragmentation than
> malloc, it's hard to say which way locality would go.

No joke, sometimes I agree and malloc is just 'good enough'.

> The traditional objection to reference counting is that you spend a lot
> of time adjusting reference counts. But with CPUs are so much faster
> than RAM nowadays, that may matter less. Anyways, for more than you
> ever wanted to know about GC, here's a slightly-dated but still
> excellent survey paper:
>
> ftp://ftp.cs.utexas.edu/pub/garbage/bigsurv.ps

Thank you. I've wondered about this myself, as, to my limited knowledge, a generational GC would need to 'alias' everything that's allocated (so it could move them to different generations), which would involve a memory redirection. I could be wrong. If so, then that's a drawback to it.

Whereas for RC, like you said, the objects themselves are already in cache, so the CPU can inc them quickly, and, IMO, in the lifetime of an object, how many times is it going to be inc'ed? Maybe a few times, plus once per scope change where it is assigned? Seems not too often, as typically few objects are within a given scope, AFAIK--maybe class variables and local variables. I would imagine that the counts aren't changed all that much, and, if they are, at least it's not changing the counts on all objects in memory (like mark and sweep), and it spreads the GC over time instead of huge show stoppers.

Just my latest $.02 spouting off steam.
Have a good evening.
-Roger
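Roger's question of how often a count actually changes can be probed on an implementation that does use reference counting. The sketch below assumes CPython and uses `sys.getrefcount`, which is a real standard-library function; the absolute numbers it returns are an implementation detail (the call itself temporarily holds a reference), so only the differences are meaningful.

```python
import sys

obj = object()
before = sys.getrefcount(obj)

holder = [obj]             # storing a reference in a container adds one
during = sys.getrefcount(obj)

holder.clear()             # removing it from the container gives it back
after = sys.getrefcount(obj)

print(during - before, after - before)
```

Each store into a container or attribute costs one adjustment, so the total depends on how often a program rebinds references rather than on how many objects are alive, which is the point both sides of the exchange are circling.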
|
|
|
#16 |
|
On Nov 10, 9:45 pm, Roger Pack <rogerpack2...@gmail.com> wrote:
> > I'm not sure on this one. Given that a compacting collector needs
> > several times as much RAM available as in use to be efficient, and that
> > a reference-counting collector probably gives no more fragmentation than
> > malloc, it's hard to say which way locality would go.
>
> No joke sometimes I agree and malloc is just 'good enough'

Heap fragmentation is quite a big problem with malloc; you can see that just by the number of malloc and other memory allocation frameworks that have been written over the years.

> > The traditional objection to reference counting is that you spend a lot
> > of time adjusting reference counts. But with CPUs are so much faster
> > than RAM nowadays, that may matter less. Anyways, for more than you
> > ever wanted to know about GC, here's a slightly-dated but still
> > excellent survey paper:
> >
> > ftp://ftp.cs.utexas.edu/pub/garbage/bigsurv.ps
>
> Thank you. I've wondered about this, myself, as, to my limited
> knowledge, a generational GC would need to 'alias' everything that's
> allocated (so it could move them to different generations), which would
> involve a memory redirection. I could be wrong.

You are. Generational GCs (I wrote one for Rubinius) do not need double the memory as I assume you're implying. They use what's called a write barrier (a small chunk of code) that runs whenever an object reference is stored in another object. This code is very small and simply updates a small table. That table is used by the GC to make sure that it runs properly and can update object references as objects move around.

> If so then that's a
> drawback to it. Whereas for RC, like you said, the objects themselves
> are already in cache, so the cpu can inc them quickly, and, IMO in the
> lifetime of an object, how many times is it going to be inc'ed? Maybe a
> few times plus once per scope change where it is assigned? Seems not
> too often, as typically few objects are within a given scope,
> AFAIK--maybe class variables and local variables. I would imagine that
> the counts aren't changed all that much, and, if they are, at least it's
> not changing the counts on all objects in memory (like mark and sweep),
> and it spreads the GC over time instead of huge show stoppers.

I suggest you look at all the research done on reference counting algorithms versus sweep ones. Most if not all research shows that reference counting is slower and more prone to bugs than modern techniques.
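The write barrier and "small table" described in this reply can be sketched in a few lines. This is a simplified model under stated assumptions, not the Rubinius implementation: `Obj`, `remembered`, `write_field`, and `minor_collect` are hypothetical names, there are only two generations, and the barrier records any old object that receives a reference to a young one so that a young-generation collection does not have to scan the whole old generation.

```python
class Obj:
    """Toy heap object; `old` marks membership in the old generation."""

    def __init__(self, name, old=False):
        self.name = name
        self.old = old
        self.fields = {}


remembered = set()  # the "small table": old objects that point at young ones


def write_field(holder, field, value):
    """Write barrier: every reference store goes through this function."""
    if holder.old and isinstance(value, Obj) and not value.old:
        remembered.add(holder)
    holder.fields[field] = value


def young_children(obj):
    return [v for v in obj.fields.values() if isinstance(v, Obj) and not v.old]


def minor_collect(roots, young):
    """Return surviving young objects, scanning only roots and the table."""
    live = set()
    stack = [r for r in roots if not r.old]
    for holder in remembered:
        stack.extend(young_children(holder))
    while stack:
        obj = stack.pop()
        if obj in live:
            continue
        live.add(obj)
        stack.extend(young_children(obj))
    return [o for o in young if o in live]


cache = Obj("cache", old=True)   # long-lived object in the old generation
entry = Obj("entry")             # young, reachable only through cache
temp = Obj("temp")               # young and unreachable
write_field(cache, "latest", entry)

survivors = [o.name for o in minor_collect(roots=[], young=[entry, temp])]
print(survivors)
```

Without the barrier, the collector would have to walk every old object to discover that `entry` is still referenced; with it, the cost is one cheap check per store.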
|
|
|
#17 |
|
On Nov 11, 2007 9:35 AM, evanwebb@gmail.com <evanwebb@gmail.com> wrote:
> On Nov 10, 9:45 pm, Roger Pack <rogerpack2...@gmail.com> wrote:
> [...]
> > Thank you. I've wondered about this, myself, as, to my limited
> > knowledge, a generational GC would need to 'alias' everything that's
> > allocated (so it could move them to different generations), which would
> > involve a memory redirection. I could be wrong.
>
> You are. Generational GCs (I wrote one for Rubinius) do not need double
> the memory as I assume you're implying. They use what's called a write
> barrier (a small chunk of code) that runs whenever an object reference
> is stored in another object. This code is very small and simply updates
> a small table. That table is used by the GC to make sure that it runs
> properly and can update object references as objects move around.

On a completely unrelated note, I was wondering... how did you manage to keep compatibility with existing C extensions without requiring the developer to explicitly set write barriers, in some cases?

(Sorry for being off-topic.)

Laurent
|
|
|
#18 |
|
On Nov 11, 6:11 am, Laurent Sansonetti <laurent.sansone...@gmail.com> wrote:
> On Nov 11, 2007 9:35 AM, evanw...@gmail.com <evanw...@gmail.com> wrote:
> [...]
> On a completely unrelated note, I was wondering... how did you manage
> to keep compatibility with existing C extensions without requiring the
> developer to explicitly set write barriers, in some cases?
>
> (Sorry for being off-topic.)

The key is that C extensions don't have direct access to object references. A C extension accesses all objects via a handle table. A handle is what a C extension sees as an object. This lets the GC mutate objects (which are also in the handle table) but keep the handles at constant addresses (so they can be stored on the C stack).

The big problem with this approach is the RARRAY(), RSTRING(), etc. macros, which access an object directly as a C data structure. That's the main reason for trying to move MRI away from using these macros and to using something that looks like a function call that we in Rubinius can implement differently.

- Evan
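The handle-table idea in this reply can be modelled as one extra level of indirection. The sketch below is a toy under assumptions, not Rubinius code: `HandleTable`, its methods, the integer "addresses", and the dictionary standing in for the heap are all invented for illustration. It shows an object being moved by a simulated compaction while the handle given out earlier stays valid.

```python
class HandleTable:
    """Maps stable integer handles to objects the collector is free to move."""

    def __init__(self):
        self._slots = {}   # handle -> current heap address
        self._next = 0

    def new_handle(self, address):
        handle = self._next
        self._next += 1
        self._slots[handle] = address
        return handle

    def deref(self, handle):
        """What an accessor function would do on behalf of the extension."""
        return self._slots[handle]

    def relocate(self, old_address, new_address):
        """Called by the collector after it moves an object."""
        for handle, address in self._slots.items():
            if address == old_address:
                self._slots[handle] = new_address


heap = {100: "a Ruby string"}    # address -> object payload
table = HandleTable()
h = table.new_handle(100)        # the extension only ever sees `h`

heap[200] = heap.pop(100)        # simulated compaction moves the object
table.relocate(100, 200)

value = heap[table.deref(h)]     # the old handle still resolves correctly
print(h, table.deref(h), value)
```

A macro that hands the extension the raw address (the role RARRAY() and RSTRING() play in the post) would bypass `deref`, and the address it returned would be stale after the move; that is why function-style accessors are easier for a moving collector to support.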
|
|
|
#19 |
|
evanwebb@gmail.com wrote:
> On Nov 11, 6:11 am, Laurent Sansonetti <laurent.sansone...@gmail.com>
...
>> On a completely unrelated note, I was wondering... how did you manage
>> to keep compatibility with existing C extensions without requiring the
>> developer to explicitly set write barriers, in some cases?
>
> The key is that C extensions don't have direct access to object
> references. A C extension accesses all objects via a handle table. A
> handle is what a C extension sees as an object. This lets the GC
> mutate objects (which are also in the handle table) but keep the
> handles at constant addresses (so they can be stored on the C stack).
>
> The big problem with this approach is the RARRAY(), RSTRING(), etc
> macros, that access an object directly as C data structure. Thats the
> main reason for trying to move MRI away from using these macros and to
> using something that looks like a function call that we in rubinius
> can implement differently.

The extension code must also lock the handle and unlock in rb_ensure?

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
|
|
|
#20 |
|
evanwebb@gmail.com wrote:
> The big problem with this approach is the RARRAY(), RSTRING(), etc
> macros, that access an object directly as C data structure. Thats the
> main reason for trying to move MRI away from using these macros and to
> using something that looks like a function call that we in rubinius
> can implement differently.

This is also, incidentally, why JRuby doesn't support extensions yet. The same techniques in Rubinius would apply equally well to JRuby through a JNI-level Ruby API. But so long as extensions abuse their direct memory access privileges, neither Rubinius nor JRuby can run them.

- Charlie
|
|
|
#21 |
|
> I suggest you look at all the research done on reference counting
> algorithms versus sweep ones. Most if not all research shows that
> reference counting is slower and more prone to bugs than modern
> techniques.

Just throwing this thought out for public feedback.

I noticed that some other scripting languages use reference counting. OK, just Python. Here are its reasons (for feedback).

==Begin quote

Why doesn’t Python use a more traditional garbage collection scheme?

For one thing, this is not a C standard feature and hence it’s not portable. (Yes, we know about the Boehm GC library. It has bits of assembler code for most common platforms, not for all of them, and although it is mostly transparent, it isn’t completely transparent; patches are required to get Python to work with it.)

Traditional GC also becomes a problem when Python is embedded into other applications. While in a standalone Python it’s fine to replace the standard malloc() and free() with versions provided by the GC library, an application embedding Python may want to have its own substitute for malloc() and free(), and may not want Python’s. Right now, Python works with anything that implements malloc() and free() properly.

Note that on systems using traditional GC, code that uses external resources without explicitly releasing them may run out of resources before the GC kicks in. Consider this example:

    class Resource:
        def __init__(self, name):
            self.handle = allocate_resource(name)

        def __del__(self):
            if self.handle:
                self.close()

        def close(self):
            release_resource(self.handle)
            self.handle = None

    ...

    for name in big_list:
        x = Resource(name)
        do something with x

In current releases of CPython, each new assignment to x inside the loop will release the previously allocated resource. Using GC, this is not guaranteed.

==End Quote

Oh, except that in current Ruby it is still guaranteed to free (in its own clunky way).

Any thoughts?
|
|
|
#22 |
|
Rick Denatale wrote:
> Roger, you REALLY need to read the literature on GC which has been
> accumulating for the past 50 years.
>
> Reference counting is pretty much an obsolete approach to GC. It was
> probably the first approach taken for lisp back in the 1950s. Other
> language implementations usually started with reference counting (e.g.
> the first Smalltalk).
>
> Its main advantage is that it's easy to understand. On the other hand
> it incurs a large overhead since counts need to be
> incremented/decremented on every assignment. It can't detect circular
> lists of dead objects. In early Smalltalk programs when reference
> counting was used, you needed to explicitly nil out references to
> break such chains. There's also the issue of the overhead for storing
> the reference count, and how many bits to allocate. Most reference
> counting implementations punt when the reference count overflows, they
> treat a 'full' count as an infinite count and no longer decrement it,
> leading to more uncollectable objects.
>
> Mark and sweep, such as is used in the Ruby 1.8 implementation quickly
> replaced reference counting as the simplest GC considered for real
> use.

This is somewhat confusing for me, because it seems that using a mark and sweep generational style GC slows down Ruby, as per Matz's observations: http://www.ruby-forum.com/topic/123747

On my own I tried integrating Boehm's GC as a drop-in replacement and it seemed to cause a serious slowdown [possibly because I didn't integrate it right :P ]

Also, macruby reports that their runtime is slightly slower. I'll make a big assumption: that's because it uses a conservative GC, so it can't allocate as cheaply [however, their garbage collection is faster, and runs in a different thread so doesn't cause a program halt--so not all bad there].

Performance-wise, Python uses reference counting and seems to have 'acceptable' performance. They avoid the circular problems by running a full mark and sweep every so often.

I suppose I'm just naive, but it doesn't seem clear to me which is the 'best' GC style, from the above observations. There appears to be no clear answer.

Cheers!
-R

> Which particular GC approach is best for Ruby is subject to some study.
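The Python behaviour mentioned in this post can be observed directly. The sketch below assumes CPython and uses only the standard `gc` and `weakref` modules; the `Node` class is a throwaway name for the example. It creates a self-referencing object, drops the last outside reference, and checks whether the object still exists before and after an explicit run of CPython's supplementary cycle collector.

```python
import gc
import weakref


class Node:
    pass


gc.disable()                   # keep the cycle collector from running by itself
node = Node()
node.me = node                 # self-referencing cycle
probe = weakref.ref(node)      # observes the object without keeping it alive

del node                       # the last external reference is gone
alive_after_del = probe() is not None

gc.collect()                   # explicit pass of the cycle collector
alive_after_collect = probe() is not None
gc.enable()

print(alive_after_del, alive_after_collect)
```

On CPython the object survives the `del` because its own attribute still counts as a reference, and it goes away only once the collector runs. Other Python implementations manage memory differently, so this is a statement about CPython rather than about the language.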
|
|
|
#23 |
|
[Note: parts of this message were removed to make it a legal post.]

On Sun, 2008-07-20 at 09:43 +0900, Roger Pack wrote:
> This is somewhat confusing for me, because it seems that using a mark
> and sweep generational style GC slows down Ruby, as per
>
> Matz's observations: http://www.ruby-forum.com/topic/123747

Badly-written garbage collection slows things down. This is no surprise. Badly written string handling or maths handling slows things down too.

> On my own I tried integrating Boehm's GC as a drop in replacement and it
> seemed to cause a serious slowdown [possibly because I didn't integrate
> it right :P ]

Whereas I snuck Boehm's GC into an employer's product (to find memory leaks, initially, but later just kept it in as a memory manager) and nobody noticed it until long after it was released. Nobody noticed because it actually improved performance, in that persistent memory leaks vanished and as a result swapping was reduced (among other memory-related bottlenecks).

Of course they then removed the GC because "garbage collection is slow", and the next release after that died a horrible death of a million paper cuts (read: memory leaks).

> I suppose I'm just naive, but it doesn't seem clear to me which is the
> 'best'