#1
I have an XML file and sometimes I call the find_first_recursive method. When the file is small it works fine, but when it has ~900 lines I wait ~15 seconds for the wanted node to be returned, and I want something faster. How can I get a better time?

I would have tried libxml, but I had some problems installing it under Windows.

-- Posted via http://www.ruby-forum.com/.

#2
Have you tried Hpricot?

On Sat, Mar 29, 2008 at 6:46 PM, Bu Mihai <mihai.bulhac@yahoo.com> wrote:
> I have an XML file and sometimes I call the find_first_recursive method.
> When the file is small it works fine, but when it has ~900 lines I wait
> ~15 seconds for the wanted node to be returned, and I want something
> faster. How can I get a better time?
>
> I would have tried libxml, but I had some problems installing it under
> Windows.

#3
Mark Ryall wrote:
> Have you tried Hpricot?

Not yet; is it faster?

#4
Bu Mihai wrote:
>> Have you tried Hpricot?
>
> Not yet; is it faster?

In general, yes. REXML (the name stands for "Ruby Electric XML") is written in pure Ruby and leans heavily on regular expressions, and regexps are very slow when you abuse them. Even a simple hand-written parser in pure Ruby could have been faster. Hpricot trades a weaker parser for code optimized with C. It is also more forgiving; sometimes crappy HTML needs that.

http://www.oreillynet.com/onlamp/blo...hpricot_1.html

-- Phlip

#5
On 29.03.2008 08:46, Bu Mihai wrote:
> I have an XML file and sometimes I call the find_first_recursive method.
> When the file is small it works fine, but when it has ~900 lines I wait
> ~15 seconds for the wanted node to be returned, and I want something
> faster. How can I get a better time?

What are the find criteria you use? Maybe you can use XPath. 900 lines does not really sound large, so I suspect there might be an algorithmic or design error.

Kind regards

robert

#6
Robert Klemme wrote:
> What are the find criteria you use? Maybe you can use XPath. 900 lines
> does not really sound large, so I suspect there might be an algorithmic
> or design error.

This is the criterion:

    node = rexml_element.find_first_recursive { |n| n.attributes["again"] == "yes" }

#7
On 30.03.2008 20:23, Bu Mihai wrote:
> This is the criterion:
>
>     node = rexml_element.find_first_recursive { |n| n.attributes["again"] == "yes" }

That's easy:

    doc.elements.each('//*[@again="yes"]') do |node|
      # any node that has attribute again with value yes
    end

And I am pretty sure that this is faster than your approach. What does your program do? With more context we can come up with further suggestions.

Kind regards

robert
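
For anyone following along, here is a minimal sketch of the XPath lookup suggested above. The helper names `flagged_texts` and `first_flagged` and the sample document are made up for illustration; `REXML::XPath.match` and `REXML::XPath.first` are the regular REXML calls. Whether this is actually faster than find_first_recursive on a given file is something you would have to measure.

```ruby
require "rexml/document"

# Hypothetical helper: texts of every element whose again attribute is "yes",
# in document order.
def flagged_texts(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//*[@again="yes"]').map { |el| el.text.to_s.strip }
end

# Hypothetical helper: the first such element, or nil if there is none.
def first_flagged(xml)
  REXML::XPath.first(REXML::Document.new(xml), '//*[@again="yes"]')
end

# Made-up sample, loosely modelled on the structure discussed in this thread.
sample = <<~XML
  <root>
    <pages>
      <page again="yes">page1</page>
      <page again="no">page2</page>
      <pages>
        <page again="yes">page3</page>
      </pages>
    </pages>
  </root>
XML

puts first_flagged(sample).text
puts flagged_texts(sample).inspect
```

Note the `*` in `//*[@again="yes"]`: the predicate needs a node test in front of it, and `*` matches elements of any name.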

#8
Robert Klemme wrote:
> That's easy:
>
>     doc.elements.each('//*[@again="yes"]') do |node|
>       # any node that has attribute again with value yes
>     end
>
> And I am pretty sure that this is faster than your approach. What does
> your program do? With more context we can come up with further
> suggestions.

I'm not sure that will work. I have an XML file with this structure (and it must stay like this; the following is a simplified sample of the original):

    <root>
      <new_section>
        <pages>
          <page again="yes">page1</page>
          <page again="no">page2</page>
          <page again="yes">page3
            <pages>
              <page again="no">page4</page>
              <page again="yes">
                <pages>....and so on
                </pages>
              </page>
            </pages>
          </page>
        </pages>
      </new_section>
      <new_section>
      </new_section>
    </root>

I have a recursive function to find all 'page' nodes whose 'again' attribute is 'yes', but I need to start the search from the beginning of the file or from the current node and then display all subnodes with 'yes'. After all the nodes have been found I need to search for them again from the beginning of the file. It's something like this:

    def find(xml_file)
      node = xml_file.find_first_recursive { |n| n.attributes["again"] == "yes" }
      if node
        puts node.text
        find(xml_file.elements[node])
      else
        find(xml_file.elements["//"])
      end
    end

In this example the find function is an endless loop; somewhere I must put a return, but I need something like that. When my file is big (~900 lines) I wait ~10 seconds for this command (but not always, only when I start searching from the beginning of the file):

    node = xml_file.find_first_recursive { |n| n.attributes["again"] == "yes" }

Many thanks for your help, Robert.

#9
On 30.03.2008 22:21, Bu Mihai wrote:
> I have a recursive function to find all 'page' nodes whose 'again'
> attribute is 'yes', but I need to start the search from the beginning of
> the file or from the current node and then display all subnodes with 'yes'.

You can use the XPath from the root and, I believe, also from a particular node.

> After all the nodes have been found I need to search for them again
> from the beginning of the file.

When I asked what your program does, I really meant: can you explain in non-technical words what this program is supposed to do? Since you seem to traverse the same nodes over and over again, I have the strong feeling that there is a better alternative, but for that we need to know the purpose of the program.

> Many thanks for your help, Robert.

You're welcome.

Kind regards

robert
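
A small sketch of what "XPath from a particular node" could look like, assuming a relative path starting with `.//` so that only the subtree below the starting element is searched. The helper name `flagged_below` and the `section`/`id` markup are hypothetical and not taken from the original file.

```ruby
require "rexml/document"

# Hypothetical helper: texts of the elements below `start` (any element)
# whose again attribute is "yes".
def flagged_below(start)
  REXML::XPath.match(start, './/*[@again="yes"]').map { |el| el.text.to_s.strip }
end

# Made-up document with two sections.
doc = REXML::Document.new(<<~XML)
  <root>
    <section id="a">
      <page again="yes">page1</page>
    </section>
    <section id="b">
      <page again="no">page2</page>
      <page again="yes">page3</page>
    </section>
  </root>
XML

section_b = REXML::XPath.first(doc, '//section[@id="b"]')

puts flagged_below(doc.root).inspect   # search below the root element
puts flagged_below(section_b).inspect  # search only below section "b"
```

The same expression is reused in both calls; only the starting element changes.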

#10
I'm trying to build a map and to memorize all routes. I have a root node which generates some roads, and each road generates further roads, and I have to follow all roads until no road is left unchecked. If I'm on a road and that road generates new roads, then to follow each generated road I must begin my route from the beginning, not from the road that generated its child roads.

    <root>
      <roads>
        <road testit="yes" again="yes" duplicate_road="no">road1</road>
        <roads>
          <road testit="no" again="yes" duplicate_road="no">road3</road>
        </roads>
        <road testit="no" again="yes" duplicate_road="no">road2</road>
        <roads>
          <road testit="no" again="no" duplicate_road="yes">road3</road>
        </roads>
      </roads>
    </root>

I have the root node, which generates two roads: road1 and road2. I must verify these roads and check whether each one generates new roads; if it does, I must set again="yes", because that road has children that must be checked. So, for example, road1 generates road3, but to get to road3 I must go root->road1->road3, and so on (if road3 generates three more roads, then to follow one of them I must go root->road1->road3->road3_1, or road3_2, or road3_3).

I must also have a duplicate_road attribute. For example, if road2 also generates road3, then I compare it with all roads checked up to that moment, and if it is found, it is a duplicate road, so I must not check it again (again="no").

That way I can generate a map of roads in XML (for the moment I don't care which path is shorter, only about finding a path from the root to road_x based on the XML map).

Thanks.

#11
2008/3/31, Bu Mihai <mihai.bulhac@yahoo.com>:
> I'm trying to build a map and to memorize all routes. I have a root node
> which generates some roads, and each road generates further roads, and I
> have to follow all roads until no road is left unchecked.
> If I'm on a road and that road generates new roads, then to follow each
> generated road I must begin my route from the beginning, not from the
> road that generated its child roads.

OK, a pretty straightforward graph problem. It is a bad idea to do that on the raw XML data. You should create a representation of the road data that suits your algorithm better. Then read the whole XML only once, create that representation and implement your algorithm on your internal representation. Doing it on the XML is certainly the worst option.

Kind regards

robert

--
use.inject do |as, often| as.you_can - without end
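
One possible reading of that advice, as a rough sketch rather than the recommended implementation: parse the XML once into a plain Hash (road name => roads it generates) and run the search on the Hash. The helper names `collect_roads`, `load_graph` and `routes` are hypothetical. The code assumes the layout from the road sample earlier in the thread, where a nested `<roads>` block belongs to the `<road>` just before it, and it uses a simple breadth-first walk, which is a choice made here and not something prescribed in the thread.

```ruby
require "rexml/document"

# Hypothetical helper: walk one <roads> element and record which roads each
# road generates. Assumes a nested <roads> block belongs to the <road>
# element just before it.
def collect_roads(roads_el, parent, graph)
  current = parent
  roads_el.each_element do |el|
    if el.name == "road"
      current = el.text.to_s.strip
      graph[parent] << current
    elsif el.name == "roads"
      collect_roads(el, current, graph)
    end
  end
  graph
end

# Read the XML once and return a Hash: road name => generated road names.
def load_graph(xml)
  doc = REXML::Document.new(xml)
  graph = Hash.new { |hash, key| hash[key] = [] }
  doc.root.each_element("roads") { |el| collect_roads(el, "root", graph) }
  graph
end

# Breadth-first walk over the Hash. A road that was already reached is
# treated as a duplicate and is not explored again.
def routes(graph, start = "root")
  found = {}
  queue = [[start]]
  until queue.empty?
    path = queue.shift
    graph.fetch(path.last, []).each do |road|
      next if road == start || found.key?(road)
      found[road] = path + [road]
      queue << found[road]
    end
  end
  found
end

xml = <<~XML
  <root>
    <roads>
      <road again="yes">road1</road>
      <roads>
        <road again="yes">road3</road>
      </roads>
      <road again="yes">road2</road>
      <roads>
        <road again="no">road3</road>
      </roads>
    </roads>
  </root>
XML

routes(load_graph(xml)).each { |road, path| puts "#{road}: #{path.join(' -> ')}" }
```

After `load_graph` has run, the XML document is no longer needed for the search; every later lookup is a Hash access.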

#12

#13
2008/3/31, Bu Mihai <mihai.bulhac@yahoo.com>:
> And what do you recommend?

? I gave my recommendations already. You surely do not expect me to code that up for you, do you?

Kind regards

robert

#14
Robert Klemme wrote:
> ? I gave my recommendations already. You surely do not expect me to
> code that up for you, do you?

No, of course not. I meant: which algorithm would you recommend, and what would be the best way to implement it (any Ruby gem?)... Thanks, I'll do a search to find out.

#15
I believe the core problem is that XML itself is pretty sub-optimal for almost everything.

Is anyone updating the REXML website, by the way? I believe it would be interesting to see exactly these kinds of speed issues addressed on the website, because if I am not mistaken, these questions and problems continually pop up with *XML.

#16
2008/3/31, Bu Mihai <mihai.bulhac@yahoo.com>:
> No, of course not. I meant: which algorithm would you recommend, and what
> would be the best way to implement it (any Ruby gem?)...

Ah, OK, I misunderstood you. Backtracking comes to mind. Before you change the algorithm you could start by creating a few classes based on the info you have in the XML file and use those. I would have to think longer about this to come up with more profound suggestions.

Cheers

robert
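
To make the "few classes plus backtracking" idea a bit more concrete, here is a minimal sketch. The `Road` class and the `path_to` method are hypothetical names invented for this example; the search is an ordinary depth-first search that backs out of dead ends and skips road names it has already visited (the duplicate roads from the earlier description).

```ruby
# Hypothetical class holding what the XML says about one road.
class Road
  attr_reader :name, :children

  def initialize(name)
    @name = name
    @children = []
  end

  # Attach a generated road and return it, so calls can be chained.
  def add(child)
    @children << child
    child
  end
end

# Depth-first search with backtracking: returns the list of road names from
# `road` down to `target`, or nil if the target cannot be reached.
def path_to(road, target, path = [], seen = {})
  return nil if seen[road.name]   # duplicate road, do not check it again
  seen[road.name] = true
  path.push(road.name)
  return path.dup if road.name == target

  road.children.each do |child|
    found = path_to(child, target, path, seen)
    return found if found
  end

  path.pop                        # dead end: backtrack
  nil
end

root  = Road.new("root")
road1 = root.add(Road.new("road1"))
road2 = root.add(Road.new("road2"))
road1.add(Road.new("road3"))
road2.add(Road.new("road3"))      # same name again: treated as a duplicate

puts path_to(root, "road3").join(" -> ")
```

The objects would be built once while reading the XML file; after that the search never touches the XML again.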

#17
2008/3/31, Marc Heiler <shevegen@linuxmail.org>:
> I believe the core problem is that XML itself is pretty sub-optimal for
> almost everything.

As always, there are problems for which this tool (XML) is well suited, less well suited, and not suited at all.

> Is anyone updating the REXML website, by the way? I believe it would be
> interesting to see exactly these kinds of speed issues addressed on the
> website, because if I am not mistaken, these questions and problems
> continually pop up with *XML.

Not sure whether I agree: IMHO in this case the problem is a misapplication of XML. XML is good for persisting structured data but not as an in-memory model for calculations.

Kind regards

robert

#18
Robert Klemme wrote:
> Not sure whether I agree: IMHO in this case the problem is a
> misapplication of XML. XML is good for persisting structured data but
> not as an in-memory model for calculations.

What is IMHO? I'm a newbie in Ruby, and at first I was searching for something similar to a C++ tree structure (because that is what I would use if it was C), but I want it in Ruby, and I was searching for some gem to help me, because I have to learn more about the Ruby language before I can build my own Ruby class to work with.

I tried XML because it was the best (!?) thing I found in Ruby for implementing a tree structure (not only a binary tree), but it is slow when I want to read a big structure.

#19
2008/3/31, Bu Mihai <mihai.bulhac@yahoo.com>:
> What is IMHO?

http://www.google.com/search?q=imho
http://dictionary.reference.com/search?q=imho

> I'm a newbie in Ruby, and at first I was searching for something similar
> to a C++ tree structure (because that is what I would use if it was C),
> but I want it in Ruby, and I was searching for some gem to help me,
> because I have to learn more about the Ruby language before I can build
> my own Ruby class to work with.

I believe it works better the other way round: understand a concept (such as "tree", which is not too difficult) and implement it in Ruby.

http://raa.ruby-lang.org/search.rhtml?search=tree

Apart from that, it's easy to roll your own:

    TreeNode = Struct.new :data, :parent, :children do
      def initialize(data = nil, parent = nil)
        self.data = data
        self.parent = parent
        self.children = []
      end
    end

> I tried XML because it was the best (!?) thing I found in Ruby for
> implementing a tree structure (not only a binary tree), but it is slow
> when I want to read a big structure.

900 lines of XML is far from a "big structure". And XML is a format for /persistently/ storing structured data, mostly in files. An XML DOM is nothing you want to do complex non-XML operations on, because of the overhead.

Kind regards

robert
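
A short usage sketch for the TreeNode struct above. The struct definition is repeated so the example runs on its own; the `add` and `path` methods are hypothetical additions for illustration and were not part of the original suggestion.

```ruby
TreeNode = Struct.new(:data, :parent, :children) do
  def initialize(data = nil, parent = nil)
    self.data = data
    self.parent = parent
    self.children = []
  end

  # Hypothetical convenience method: create a child node and return it.
  def add(child_data)
    child = TreeNode.new(child_data, self)
    children << child
    child
  end

  # Hypothetical convenience method: data of every node from the root down
  # to this one, found by following the parent links.
  def path
    parent ? parent.path + [data] : [data]
  end
end

root  = TreeNode.new("root")
road1 = root.add("road1")
road3 = road1.add("road3")
root.add("road2")

puts road3.path.join(" -> ")
puts root.children.map(&:data).inspect
```

Because every node keeps a reference to its parent, the route from the root to any road can be read off directly, without searching from the beginning each time.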

#20
Thanks a lot for helping me. I've also found this: http://rubytree.rubyforge.org/ which I think is what I wanted from the beginning; I can build my map with rubytree and then save it in an XML file. I will check your links too.

Thanks again, Robert.