html to xml?

12 Sep 2007, 09:39   #1
Slith
html to xml?

i need to parse an html page for tabular data which i can then import
into mysql so i thought converting the html to xml might be a feasible
thing to do, however, other than using tidy from the command line i
can't find a way to do this from php.

does anyone know of any class (or other) that would allow me to do this?
or maybe i just need a different approach.
12 Sep 2007, 10:44   #2
Per Jessen
Re: [PHP] html to xml?

Slith wrote:

> i need to parse an html page for tabular data which i can then import
> into mysql so i thought converting the html to xml might be a feasible
> thing to do, however, other than using tidy from the command line i
> can't find a way to do this from php.
>
> does anyone know of any class (or other) that would allow me to do
> this? or maybe i just need a different approach.


Is this a one-off or will you be doing this often?

For a one-off I would just use sed/grep/awk/cut/tr etc. - HTML pages are
rarely syntactically correct, so trying to parse them or even turn them
into XML is tiresome at best.


/Per Jessen, Zürich
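
The same quick pattern-matching idea can also be done without leaving PHP. The following is only a rough sketch, assuming the table markup is regular enough for regular expressions to cope with; the function name `quickTableRows` and the sample markup are made up for illustration and do not come from the thread.

```php
<?php
// Quick-and-dirty extraction for a one-off job.
// Only reasonable when every row really is <tr>...</tr> with closed cells.
// quickTableRows() is a hypothetical helper, not a PHP built-in.
function quickTableRows(string $html): array
{
    $rows = [];

    // Capture everything between <tr ...> and </tr>,
    // case-insensitively (i) and across line breaks (s).
    preg_match_all('~<tr\b[^>]*>(.*?)</tr>~is', $html, $trMatches);

    foreach ($trMatches[1] as $rowHtml) {
        // Capture the contents of each <td> or <th> cell in this row.
        preg_match_all('~<t[dh]\b[^>]*>(.*?)</t[dh]>~is', $rowHtml, $cellMatches);

        $cells = [];
        foreach ($cellMatches[1] as $cellHtml) {
            // Drop nested tags, decode entities, trim surrounding whitespace.
            $text = html_entity_decode(strip_tags($cellHtml), ENT_QUOTES, 'UTF-8');
            $cells[] = trim($text);
        }

        if ($cells !== []) {
            $rows[] = $cells;
        }
    }

    return $rows;
}

// Illustrative input only.
$sample = '<table><tr><th>id</th><th>name</th></tr>'
        . '<tr><td>1</td><td><b>Alice</b></td></tr></table>';
print_r(quickTableRows($sample));
```

This breaks as soon as cells are left unclosed or tables are nested, which is exactly the kind of sloppiness the post warns about, so it is best treated as a throwaway.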
12 Sep 2007, 11:00   #3
Edward Kay
RE: [PHP] html to xml?


> Slith wrote:
>
> > i need to parse an html page for tabular data which i can then import
> > into mysql so i thought converting the html to xml might be a feasible
> > thing to do, however, other than using tidy from the command line i
> > can't find a way to do this from php.
> >
> > does anyone know of any class (or other) that would allow me to do
> > this? or maybe i just need a different approach.

>
> Is this a one-off or will you be doing this often?
>
> For a one-off I would just use sed/grep/awk/cut/tr etc. - HTML pages are
> rarely syntactically correct, so trying to parse them or even turn them
> into XML is tiresome at best.


For a one-off, I'd simply copy and paste the data from the browser into Excel or OpenOffice Calc and save it as CSV.

Edward
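
Once the data is saved as CSV, getting it into MySQL from PHP is fairly mechanical. Below is a minimal sketch, assuming the CSV has a header row whose names match the target columns. The helper names `csvToRows` and `insertSql`, the table name `people`, and the connection details in the commented-out part are all placeholders, not anything from the thread.

```php
<?php
// Parse CSV text into a list of associative rows keyed by the header line.
// csvToRows() is a hypothetical helper name.
function csvToRows(string $csv): array
{
    // Use an in-memory stream so fgetcsv() handles quoting for us.
    $handle = fopen('php://memory', 'r+');
    fwrite($handle, $csv);
    rewind($handle);

    $header = fgetcsv($handle, 0, ',', '"', '\\');
    if ($header === false) {
        fclose($handle);
        return [];
    }

    $rows = [];
    while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // Skip blank or short lines rather than guessing at missing values.
        if (count($fields) !== count($header)) {
            continue;
        }
        $rows[] = array_combine($header, $fields);
    }
    fclose($handle);

    return $rows;
}

// Build a parameterised INSERT statement for the given columns.
// insertSql() is a hypothetical helper name.
function insertSql(string $table, array $columns): string
{
    $quote = function (string $name): string {
        return '`' . str_replace('`', '``', $name) . '`';
    };
    $cols  = implode(', ', array_map($quote, $columns));
    $marks = implode(', ', array_fill(0, count($columns), '?'));

    return 'INSERT INTO ' . $quote($table) . " ($cols) VALUES ($marks)";
}

// Usage with PDO (placeholder DSN and credentials, left commented out):
// $rows = csvToRows(file_get_contents('export.csv'));
// $pdo  = new PDO('mysql:host=localhost;dbname=test', 'user', 'secret');
// $stmt = $pdo->prepare(insertSql('people', array_keys($rows[0])));
// foreach ($rows as $row) {
//     $stmt->execute(array_values($row));
// }
```

Prepared statements keep quoting problems in the data out of the SQL, which matters for hand-exported spreadsheets.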
12 Sep 2007, 19:52   #4
mike
Re: [PHP] html to xml?

On 9/12/07, Slith <slithone@gmail.com> wrote:
> i need to parse an html page for tabular data which i can then import
> into mysql so i thought converting the html to xml might be a feasible
> thing to do, however, other than using tidy from the command line i
> can't find a way to do this from php.
>
> does anyone know of any class (or other) that would allow me to do this?
> or maybe i just need a different approach.


use tidy.

i do it all the time. note that it does its best with broken markup, but it
will at least get you to xhtml, which is a well-formed xml document. there's
a pecl module for it, or you can just install the command-line tool and call
it through system() or similar. http://tidy.sf.net

(that's if this is a regular thing. if it's a one-time thing then yeah...
do what Edward suggested and just do it manually once. sometimes you can't
script things or it's too much effort... i've done way too many migrations
and there is nearly always some manual work.)
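
For the scripted route, something along these lines is one way the tidy extension could be used from PHP. It is a sketch under assumptions, not the poster's actual code: it assumes the dom extension is available, uses the tidy extension only if it is loaded, and otherwise falls back to libxml's lenient HTML parser (that fallback is an addition, not something suggested in the thread). The function name `htmlTableRows` is hypothetical.

```php
<?php
// Clean up HTML, load it as a DOM tree, and pull out the table rows.
// htmlTableRows() is a hypothetical helper name.
function htmlTableRows(string $html): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet

    $xhtml = false;
    if (function_exists('tidy_repair_string')) {
        // tidy extension present: convert to XHTML, which is well-formed XML.
        $xhtml = tidy_repair_string($html, [
            'output-xhtml'     => true,
            'numeric-entities' => true, // e.g. &nbsp; becomes &#160;
            'force-output'     => true, // emit output even for very broken input
            'wrap'             => 0,    // do not re-wrap long text lines
        ], 'utf8');
    }

    if (!is_string($xhtml) || $xhtml === '' || !$doc->loadXML($xhtml)) {
        // Fallback (an assumption, not from the thread): lenient HTML parser.
        $doc->loadHTML($html);
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->childNodes as $node) {
            if ($node->nodeName === 'td' || $node->nodeName === 'th') {
                // Collapse runs of whitespace left over from the markup.
                $cells[] = trim(preg_replace('/\s+/', ' ', $node->textContent));
            }
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }

    return $rows;
}

// Illustrative input: uppercase tags and an unquoted attribute, so not valid XML.
$sample = '<TABLE BORDER=1><TR><TH>id</TH><TH>name</TH></TR>'
        . '<TR><TD>1</TD><TD>Tom &amp; Co</TD></TR></TABLE>';
print_r(htmlTableRows($sample));
```

Each returned row is a plain array of cell strings, ready to be bound to a prepared INSERT statement.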