#1
Hey all!

To say I suck at regex is an understatement, so I really need any help I can get on this. I have a page of text with different HTML tags in it, but each "block" of text has a <p> or a <p class="something"> tag. Does anybody have a regex that will catch each of these paragraphs and put them into an array?

Example:
array[0]="<p> first block </p>";
array[1]="<p class="blah"> block X</p>";

Thanks!
R

#2
On Mon, May 5, 2008 at 9:59 PM, Ryan S <genphp@yahoo.com> wrote:
> To say I suck at regex is an understatement, so I really need any help I can get on this. I have a page of text with different HTML tags in it, but each "block" of text has a <p> or a <p class="something"> tag. Does anybody have a regex that will catch each of these paragraphs and put them into an array?

If you're using PHP 5, you can use DOM's getElementsByTagName. If you still think you need some sort of regex, it is possible, but it will be buggy at best.
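For reference, the DOM approach suggested above might look roughly like the sketch below. It assumes the DOM extension is available; the sample markup and the variable names ($html, $doc, $paragraphs) are made up for illustration and are not from the original posts.

```php
<?php
// Hypothetical sample input; replace with the real page contents.
$html = '<div><p> first block </p><b>skip me</b><p class="blah"> block X</p></div>';

// Collect parser warnings internally instead of printing them,
// since real-world HTML is rarely perfectly valid.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$paragraphs = array();
foreach ($doc->getElementsByTagName('p') as $p) {
    // saveXML() with a node argument serializes just that element,
    // including its own tags and attributes.
    $paragraphs[] = $doc->saveXML($p);
}

print_r($paragraphs);
```

Each entry in $paragraphs is one whole <p> element as a string, which is the array shape asked for in the first post. Unlike a regex, this keeps working when a paragraph contains nested tags.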

#3
Ryan S wrote:
> I have a page of text with different HTML tags in it, but each "block" of text has a <p> or a <p class="something"> tag. Does anybody have a regex that will catch each of these paragraphs and put them into an array?
>
> Example:
> array[0]="<p> first block </p>";
> array[1]="<p class="blah"> block X</p>";

Hi,

Maybe the example is overkill, but here is a quick setup that can save you some time finding HTML tags with a certain attribute.

<?php

$html = <<<END_OF_HTML
<b>hello</b>
<b class="blah">hello</b>
<p>hello</p>
<p class="blah">hello</p>
<a>hello</a>
<a href="url">hello</a>
END_OF_HTML;

// Tag names to look for.
$tags = array();
$tags[] = 'p';
$tags[] = 'a';

$tags = implode('|', $tags);

// Matches the opening tag of any listed element, with or without attributes.
// \b keeps "p" from also matching longer tag names such as <pre>.
$pattern = '/<('.$tags.')\b[^>]*>/i';

echo $pattern."\n";

preg_match_all($pattern, $html, $matches);

var_dump($matches);

?>

I'm not a regular expression guru either, but I think it works OK. I had to find 'link', 'img', 'a' and other tags in HTML and used a more complex expression for it, which worked like a charm.

It's just an example. In your case, leave the 'a' tag out of the $tags array to get what you want.

Hope it helps!

--
Aschwin Wesselius
/'What you would like to be done to you, do that to the other....'/
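As a side note on reading the var_dump() output from a pattern like the one above: preg_match_all() in its default mode groups results by capture group. A small sketch, with a made-up input string, shows the layout:

```php
<?php
// Hypothetical input, shortened for illustration.
$html = '<p>hello</p> <a href="url">hello</a> <b>hello</b>';

preg_match_all('/<(p|a)\b[^>]*>/i', $html, $matches);

// $matches[0] holds every full match (the opening tags).
print_r($matches[0]);

// $matches[1] holds what the first parenthesised group captured (the tag names).
print_r($matches[1]);
```

Here $matches[0] contains the two opening tags and $matches[1] contains 'p' and 'a'; the <b> element is skipped because it is not in the alternation.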

#4
Aschwin Wesselius wrote:
> Maybe the example is overkill, but I give you a quick setup that can
> save you some time finding HTML tags with a certain attribute.

Hi,

I'm sorry, I didn't read your request properly. Below is a corrected solution:

<?php

$html = <<<END_OF_HTML
<b>hello</b>
<b class="blah">hello</b>
<p>hello</p>
<p class="blah">hello</p>
<a>hello</a>
<a href="url">this</a>
<a>hello</a>
<a href="regex yo">hello</a>
<a>hello</a>
<a id="2" href="regex yo">hello</a>
<p>that</p>
<p class="blah" title="whatever">hello</p>
END_OF_HTML;

// Tag names to look for.
$tags = array();
$tags[] = 'p';
$tags[] = 'a';

// Attribute names the opening tag must contain.
$attr = array();
$attr[] = 'class';
$attr[] = 'href';

// Attribute values the opening tag must contain.
$vals = array();
$vals[] = 'blah';
$vals[] = 'url';
$vals[] = 'yo';

// Words the element text must start with.
$text = array();
$text[] = 'hello';
$text[] = 'this';
$text[] = 'that';

$tags = implode('|', $tags);
$attr = implode('|', $attr);
$vals = implode('|', $vals);
$text = implode('|', $text);

// \1 forces the closing tag to repeat the tag name captured by the first group.
$pattern = '/<('.$tags.')[^>]*('.$attr.')[^>]*('.$vals.')[^>]*>('.$text.')[^<\/]*<\/\1>/i';

echo $pattern."\n";
echo "--------------------\n";

preg_match_all($pattern, $html, $matches);

var_dump($matches);

?>

#5
Aschwin Wesselius wrote:
> I'm sorry. I didn't read your request properly. Below you'll have a
> correct solution:

Hi,

It is obvious I haven't had my caffeine yet. This is my last try at getting the pattern straight:

<?php

$html = <<<END_OF_HTML
<b>hello</b>
<b class="blah">hello</b>
<p>those</p>
<p class="blah">hello</p>
<a>hello</a>
<a href="url">this</a>
<a>rose</a>
<a href="regex yo">hello</a>
<a>nose</a>
<a id="2" href="regex yo">hello</a>
<p>that</p>
<p class="blah" title="whatever">hello</p>
END_OF_HTML;

// Tag names to look for.
$tags = array();
$tags[] = 'p';
$tags[] = 'a';

// Attribute names (optional in this version).
$attr = array();
$attr[] = 'class';
$attr[] = 'href';

// Attribute values (optional in this version).
$vals = array();
$vals[] = 'blah';
$vals[] = 'url';
$vals[] = 'yo';

// Words the element text must start with.
$text = array();
$text[] = 'hello';
$text[] = 'this';
$text[] = 'that';

$tags = implode('|', $tags);
$attr = implode('|', $attr);
$vals = implode('|', $vals);
$text = implode('|', $text);

// The attribute and value groups are optional here, so tags without attributes match too.
// Because the [^>]* in front of them is greedy, those two groups usually capture nothing.
$pattern = '/<('.$tags.')[^>]*('.$attr.')?[^>]*('.$vals.')?[^>]*>('.$text.')[^<\/]*<\/\1>/i';

echo $pattern."\n";
echo "--------------------\n";

preg_match_all($pattern, $html, $matches);

var_dump($matches);

?>

--
Aschwin Wesselius
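One thing to watch when building a pattern from arrays with implode(), as in the posts above: if a value ever contains a regex metacharacter or the delimiter, it should probably be escaped first. A minimal sketch, assuming "/" as the delimiter; the helper name build_alternation() and the sample values are hypothetical:

```php
<?php
// Hypothetical helper: escape each item, then join with "|".
function build_alternation(array $items)
{
    $quoted = array();
    foreach ($items as $item) {
        // The second argument also escapes the pattern delimiter.
        $quoted[] = preg_quote($item, '/');
    }
    return implode('|', $quoted);
}

$vals = build_alternation(array('blah', 'a.b', 'x/y'));

echo $vals."\n"; // blah|a\.b|x\/y
```

With plain words such as 'blah' or 'url' the result is the same as a bare implode('|', ...), so this only matters once the lists come from less predictable input.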

#6
Or you could simplify and just do this:

<?php

$html = <<<END_OF_HTML
<b>hello</b>
<b class="blah">hello</b>
<p>those</p>
<p class="blah">hello</p>
<a>hello</a>
<a href="url">this</a>
<a>rose</a>
<a href="regex yo">hello</a>
<a>nose</a>
<a id="2" href="regex yo">hello</a>
<p>that</p>
<p class="blah" title="whatever">hello</p>
END_OF_HTML;

// This will give you any tag, from its opening "<" up to the first closing tag after it.
// (The redundant ">*?" from the first version of this pattern is dropped; the matches are the same.)
preg_match_all('/<[\s\S]*?<\/[\s\S]*?>/', $html, $matches);
print_r($matches);

// This will give you any p tag.
// \b keeps "<p" from also matching tags such as <pre>.
preg_match_all('/<p\b[\s\S]*?<\/p\s*>/', $html, $matches);
print_r($matches);
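To tie this back to the array asked for in the first post, here is a minimal regex-only sketch. The sample input is made up, and it assumes every paragraph is properly closed and contains no nested <p>; on messier HTML it will misbehave, which is the "buggy at best" caveat from earlier in the thread.

```php
<?php
// Hypothetical input modelled on the example in the first post.
$html = "<b>skip</b>\n<p> first block </p>\n<p class=\"blah\"> block X</p>";

// i = case-insensitive, s = let "." match newlines so multi-line paragraphs work.
preg_match_all('/<p\b[^>]*>.*?<\/p>/is', $html, $matches);

// Index 0 of the result holds the full matches, one per paragraph.
$array = $matches[0];

print_r($array);
```

After this, $array[0] is "<p> first block </p>" and $array[1] is the paragraph with class="blah".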