[Q] difference between StringScanner#scan and Regexp#match

Posted 24/02/2008, 04:17   #1
makoto kuwata

Hi,

I'm planning to implement StringScanner in pure Ruby.
But I found that it is hard to implement StringScanner#scan()
in pure Ruby, because of the difference between Regexp#match()
and StringScanner#scan().

StringScanner#scan() matches only when pattern matches at the
beginning (or at the current position) of input string.

require 'strscan'
input = 'foo 123'
scanner = StringScanner.new(input)
p scanner.scan(/\d+/) #=> nil

But Regexp#match() matches whenever input string contains pattern.

input = 'foo 123'
m = /\d+/.match(input)
p m[0] if m #=> "123"


Is it possible to restrict Regexp#match() so that it matches only
when the pattern starts at the beginning of the input string?
My idea is to convert /regexp/ into /\A(?:regexp)/ every time,
but it is a little ugly.
Is there a good way to emulate StringScanner#scan in pure Ruby?
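
For reference, the rewriting idea could be sketched roughly like this. It is only a minimal sketch: `anchored` is a made-up helper name, not part of Ruby or strscan. It relies on the fact that interpolating a Regexp into a regexp literal embeds it as a group together with its own option flags.

```ruby
# Hypothetical helper (not part of Ruby or strscan): build a copy of
# +pattern+ that can only match at the very beginning of the string.
def anchored(pattern)
  # Interpolating a Regexp inserts it as a (?flags-flags:...) group,
  # so options such as /i are preserved.
  /\A#{pattern}/
end

input = 'foo 123'
p anchored(/\d+/).match(input)        #=> nil
p anchored(/[a-z]+/).match(input)[0]  #=> "foo"
```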

--
regards,
makoto kuwata

Posted 24/02/2008, 05:33   #2
Michael Fellinger

On Sun, Feb 24, 2008 at 1:20 PM, makoto kuwata <kwa@kuwata-lab.com> wrote:
> Is it possible to restrict Regexp#match() to match only when
> pattern starts at the beginning of input string?
> My idea is to convert /regexp/ into /\A(?:regexp)/ every time,
> but it is a little ugly.
> Is there any good idea to emulate StringScanner#scan in pure Ruby?


input = 'foo 123'
if (input =~ /\d+/) == 0
  p $& # doesn't happen
end

Posted 24/02/2008, 07:43   #3
makoto kuwata

Michael Fellinger <m.fellin...@gmail.com> wrote:
> input = 'foo 123'
> if (input =~ /\d+/) == 0
> p $& # doesn't happen
> end


Thank you, Michael, but I think that is slow and inefficient,
especially when the input string is long: =~ searches the whole
string for a match before the position can be compared with 0.
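
To illustrate the concern, here is a rough sketch. The input size is arbitrary and no particular timings are claimed; the point is only that the unanchored pattern has to walk the string to find its match, while the \A version should be able to give up at position 0.

```ruby
require 'benchmark'

# A long input whose only digits are at the very end.
input = 'x' * 200_000 + '123'

t_unanchored = Benchmark.realtime { input =~ /\d+/ }
t_anchored   = Benchmark.realtime { input =~ /\A\d+/ }

p(input =~ /\d+/)    #=> 200000 (a match, but far away from position 0)
p(input =~ /\A\d+/)  #=> nil    (no match at the beginning)
puts format('unanchored: %.6fs, anchored: %.6fs', t_unanchored, t_anchored)
```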

--
regards,
makoto kuwata
Posted 24/02/2008, 09:21   #4
Michael Fellinger

On Sun, Feb 24, 2008 at 4:44 PM, makoto kuwata <kwa@kuwata-lab.com> wrote:
> Michael Fellinger <m.fellin...@gmail.com> wrote:
> > input = 'foo 123'
> > if (input =~ /\d+/) == 0
> > p $& # doesn't happen
> > end

>
> Thank you, Michael, but I think that is slow and inefficient,
> especially when the input string is long.


You are right, of course, but I don't know any other way. =~ is
already about as fast as you can get without modifying the regular
expression.


Posted 25/02/2008, 22:12   #5
Caleb Clausen

On 2/23/08, makoto kuwata <kwa@kuwata-lab.com> wrote:
> Is it possible to restrict Regexp#match() to match only when
> pattern starts at the beginning of input string?
> My idea is to convert /regexp/ into /\A(?:regexp)/ every time,
> but it is a little ugly.
> Is there any good idea to emulate StringScanner#scan in pure Ruby?


I've done this kind of thing before, and rewriting the regex was the
best I could come up with. If Michael's suggestion is too slow for
you, then regex rewriting is the only game in town. (Actually, if you
can determine that your regexes always require some leading substring,
you might be able to optimize Michael's way a bit more...)
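
One possible shape for the rewriting approach, as a minimal sketch only. The class name `MiniScanner` is made up, and this is not how strscan or any existing library is implemented. It assumes a Ruby where Regexp#match accepts a start position (1.9 or later), and it uses the \G anchor rather than \A so that the match is tied to the current position instead of the start of the string.

```ruby
# Hypothetical, stripped-down scanner; only #scan and #pos are provided.
class MiniScanner
  attr_reader :pos

  def initialize(string)
    @string = string
    @pos = 0
  end

  # Returns the matched text and advances, or returns nil and stays put.
  def scan(pattern)
    # \G anchors the match at the position passed to Regexp#match.
    m = /\G#{pattern}/.match(@string, @pos)
    return nil unless m

    @pos = m.end(0)
    m[0]
  end
end

s = MiniScanner.new('foo 123')
p s.scan(/\d+/)  #=> nil
p s.scan(/\w+/)  #=> "foo"
p s.scan(/\s+/)  #=> " "
p s.scan(/\d+/)  #=> "123"
```

Note that this builds a new Regexp on every call, which is exactly the runtime cost to watch out for.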

If speed is an issue, why not just use the existing StringScanner?
Creating regexes at runtime can cost you quite a bit in performance as
well... some caching can help here, if the same regexes are likely to
be encountered again.
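
A minimal sketch of such caching, assuming the \A rewrite discussed in this thread; `cached_anchor` is a hypothetical helper name. Regexp objects with the same source and options are equal as hash keys, so each distinct pattern is rewritten only once.

```ruby
# Hypothetical memoizing helper: returns the anchored rewrite of
# +pattern+, building it only the first time the pattern is seen.
def cached_anchor(pattern)
  @anchor_cache ||= {}
  @anchor_cache[pattern] ||= /\A#{pattern}/
end

a = cached_anchor(/\d+/)
b = cached_anchor(/\d+/)
p a.equal?(b)            #=> true (the second call reuses the cached object)
p a.match('123 foo')[0]  #=> "123"
```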

You might want to take a look at String#index (and its 2nd parameter)
rather than String#match or Regexp#match, as it allows you to start
matching wherever you want in the string, rather than just at the
beginning. That doesn't help with your immediate question, but maybe
it'll give you some ideas of different ways to approach it.
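
A short sketch of what the 2nd parameter does; `match_at?` is a made-up helper name used only for illustration. The search starts at the given position but still moves forward from there, so the result has to be compared with the start position to know whether the pattern matched exactly at that spot.

```ruby
input = 'foo 123'

# The 2nd argument is the position at which the search starts.
p input.index(/\d+/, 4)  #=> 4
# The search still moves forward, so an earlier start finds the same match.
p input.index(/\d+/, 0)  #=> 4

# Hypothetical helper: true only if +pattern+ matches exactly at +pos+.
def match_at?(string, pattern, pos)
  string.index(pattern, pos) == pos
end

p match_at?(input, /\d+/, 4)  #=> true
p match_at?(input, /\d+/, 0)  #=> false
```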

Finally, a moment of self-promotion. My library 'sequence' implements
basically what you want (using regex rewriting, which in the full
elaboration gets rather involved). Maybe it could save you some
effort...

Posted 26/02/2008, 04:00   #6
Michael Fellinger

On Tue, Feb 26, 2008 at 7:12 AM, Caleb Clausen <vikkous@gmail.com> wrote:
>
>
> I've done this kind of thing before, and rewriting the regex was the
> best I could come up with. If Michael's suggestion is too slow for
> you, then regex rewriting is the only game in town. (Actually, if you
> can determine that your regexes always require some leading substring,
> you might be able to optimize Michael's way a bit more...)
>
> If speed is an issue, why not just use the existing StringScanner?
> Creating regexes at runtime can cost you quite a bit in performance as
> well... some caching can help here, if the same regexes are likely to
> be encountered again.


I, for one, started to work on a StringScanner replacement as well,
just for fun. But it could be useful for Rubinius to have a pure Ruby
implementation that can be augmented with C in some core areas by a
simple require.

> You might want to take a look at String#index (and its 2nd parameter)
> rather than String#match or Regexp#match, as it allows you to start
> matching wherever you want in the string, rather than just at the
> beginning. That doesn't help with your immediate question, but maybe
> it'll give you some ideas of different ways to approach it.
>
> Finally, a moment of self-promotion. My library 'sequence' implements
> basically what you want (using regex rewriting, which in the full
> elaboration gets rather involved). Maybe it could save you some
> effort...


Thanks for the hint: http://sequence.rubyforge.org for anyone who is
too lazy to type it

^ manveru
