alt.internet.seo Internet search engines and related topics.

Froogle Website "scraper"

11/09/2006, 20:24   #1
Pet @ www.gymratz.co.uk ;¬)

I was wondering if anyone has come across, or written a script that
would spider a web-site to extract the majority of information from
.html pages in order to create a feed for Froogle?

or even if such a thing would be possible.

TIA

Pete
--
http://gymratz.co.uk - Best Gym Equipment & Bodybuilding Supplements UK.
http://trade-price-supplements.co.uk - TRADE PRICED SUPPLEMENTS for ALL!
http://fitness-equipment-uk.com - UK's No.1 Fitness Equipment Suppliers.
http://Water-Rower.co.uk - Worlds best prices on the Worlds best Rower.

11/09/2006, 20:31   #2
John Bokma

"Pet @ www.gymratz.co.uk ;¬)" <PeTe33@gymratz.co.uk> wrote:

> I was wondering if anyone has come across, or written a script that
> would spider a web-site to extract the majority of information from
> .html pages ...


yes, however, the small script I wrote is outdated:
http://www.google.com/search?q=froogle%20script
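
For illustration only, here is a minimal sketch of the extraction step such a script needs. It is not the script referred to above. It assumes each product page carries the product name in `<title>` and the price in an element with a hypothetical `class="price"`; the class name, the sample page and the function names are all made up for the example.

```python
from html.parser import HTMLParser


class ProductPageParser(HTMLParser):
    """Collects the <title> text and the text of the first class="price" element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.price = ""
        self._target = None  # which field the current text belongs to, if any

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._target = "title"
        elif dict(attrs).get("class") == "price" and not self.price:
            # Hypothetical markup: the price sits in an element with class="price"
            self._target = "price"

    def handle_endtag(self, tag):
        # Any closing tag ends the capture (assumes no nested markup in the field)
        self._target = None

    def handle_data(self, data):
        if self._target == "title":
            self.title += data.strip()
        elif self._target == "price":
            self.price += data.strip()


def extract_product(html_text):
    """Return the title and price found in one product page."""
    parser = ProductPageParser()
    parser.feed(html_text)
    parser.close()
    return {"title": parser.title, "price": parser.price}


# Made-up sample page standing in for a fetched .html file
sample = """<html><head><title>Olympic Barbell</title></head>
<body><h1>Olympic Barbell</h1><span class="price">99.95</span></body></html>"""
print(extract_product(sample))
```

A real spider would also have to fetch the pages and follow links, and the extraction rules would depend entirely on how the site's pages are laid out.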

> ... in order to create a feed for Froogle?
>
> or even if such a thing would be possible.


yes, I can be hired to do such things :-)

--
John Need with SEO? Get started with a SEO report of your site:

--> http://johnbokma.com/websitedesign/seo-expert-.html

11/09/2006, 21:29   #3
Pet @ www.gymratz.co.uk ;¬)

John Bokma wrote:

>> I was wondering if anyone has come across, or written a script that
>> would spider a web-site to extract the majority of information from
>> .html pages ...


> yes, I can be hired to do such things :-)


Can you give me an idea of price John?
This e-mail is a working one if you want to reply to it privately.

Cheers
Pete

--
http://gymratz.co.uk - Best Gym Equipment & Bodybuilding Supplements UK.
http://trade-price-supplements.co.uk - TRADE PRICED SUPPLEMENTS for ALL!
http://fitness-equipment-uk.com - UK's No.1 Fitness Equipment Suppliers.
http://Water-Rower.co.uk - Worlds best prices on the Worlds best Rower.

11/09/2006, 23:28   #4
John Bokma

"Pet @ www.gymratz.co.uk ;¬)" <PeTe33@gymratz.co.uk> wrote:

> John Bokma wrote:
>
>>> I was wondering if anyone has come across, or written a script that
>>> would spider a web-site to extract the majority of information from
>>> .html pages ...

>
>> yes, I can be hired to do such things :-)

>
> Can you give me an idea of price John?
> This e-mail is a working one if you want to reply to it privately.


Hi Pete,

You got mail :-)

--
John Need with SEO? Get started with a SEO report of your site:

--> http://johnbokma.com/websitedesign/seo-expert-.html

12/09/2006, 04:02   #5
John Bokma

John Bokma <john@castleamber.com> wrote:

> "Pet @ www.gymratz.co.uk ;¬)" <PeTe33@gymratz.co.uk> wrote:
>
>> John Bokma wrote:
>>
>>>> I was wondering if anyone has come across, or written a script that
>>>> would spider a web-site to extract the majority of information from
>>>> .html pages ...

>>
>>> yes, I can be hired to do such things :-)

>>
>> Can you give me an idea of price John?
>> This e-mail is a working one if you want to reply to it privately.

>
> Hi Pete,
>
> You got mail :-)


A shameless plug: most scraping scripts like the one described above
(not too complex, but not simple either) I can program in 4 to 8 hrs,
depending on the complexity, of course.

--
John Need with SEO? Get started with a SEO report of your site:

--> http://johnbokma.com/websitedesign/seo-expert-.html

12/09/2006, 14:32   #6
John A.

On Mon, 11 Sep 2006 19:24:39 GMT, "Pet @ www.gymratz.co.uk ;¬)"
<PeTe33@gymratz.co.uk> wrote:

>I was wondering if anyone has come across, or written a script that
>would spider a web-site to extract the majority of information from
>.html pages in order to create a feed for Froogle?
>
>or even if such a thing would be possible.


Presumably there would be some sort of database or table for the items
offered for sale, from which those pages are generated. That would be
a better source for generating the feed.
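
As a rough sketch of that approach, assuming the product records can be pulled out of the database as simple rows: the snippet below writes them as tab-delimited text, one product per line. The column names, the `example.com` URL pattern and the sample records are assumptions for illustration, so check them against the actual feed specification.

```python
import csv
import io

# Hypothetical product records, standing in for rows exported from the shop database
products = [
    {"id": "1001", "name": "Olympic Barbell", "description": "7 ft steel bar", "price": "99.95"},
    {"id": "1002", "name": "Water Rower", "description": "Ash wood rowing machine", "price": "749.00"},
]

# Assumed column names and URL pattern, not taken from any official specification
FIELDS = ["product_url", "name", "description", "price"]
BASE_URL = "http://example.com/product/"


def build_feed(rows):
    """Return the feed as tab-delimited text: a header line, then one product per line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(FIELDS)
    for row in rows:
        writer.writerow([BASE_URL + row["id"], row["name"], row["description"], row["price"]])
    return out.getvalue()


feed_text = build_feed(products)
print(feed_text)
```

Working from the records rather than the rendered pages avoids having to parse HTML at all, provided each product has a URL that can be derived from its record.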

12/09/2006, 14:50   #7
Pet @ www.gymratz.co.uk ;¬)

John A. wrote:

> Presumably there would be some sort of database or table for the items
> offered for sale, from which those pages are generated. That would be
> a better source for generating the feed.


I can export the various fields, but unfortunately all the HTML tags in
the product descriptions etc. are exported as well.
The real problem is that there is no fixed URL for each product: the URLs
can change as items are added or deleted, because the site creation
software then re-creates all the pages.

I suppose I could create a new site purely for Froogle that excluded
tags and stuff and only put the more popular stuff on, but that seems
a bit excessive.

:¬(


--
http://gymratz.co.uk - Best Gym Equipment & Bodybuilding Supplements UK.
http://trade-price-supplements.co.uk - TRADE PRICED SUPPLEMENTS for ALL!
http://fitness-equipment-uk.com - UK's No.1 Fitness Equipment Suppliers.
http://Water-Rower.co.uk - Worlds best prices on the Worlds best Rower.

12/09/2006, 19:11   #8
John A.

On Tue, 12 Sep 2006 13:50:35 GMT, "Pet @ www.gymratz.co.uk ;¬)"
<PeTe33@gymratz.co.uk> wrote:

>John A. wrote:
>
>> Presumably there would be some sort of database or table for the items
>> offered for sale, from which those pages are generated. That would be
>> a better source for generating the feed.

>
>I can export the various fields, unfortunately all the HTML tags on
>product descriptions etc are exported as well.
>The problem really lies with lack of URL of each specific product, this
>can change as items are added or deleted because the site creation
>software then re-creates all the pages.


Sounds like your first step should be to put together some sort of
permalink scheme. Feed URLs will be a snap then, and inbound links
more reliable.

Couldn't really tell you how, myself. We give each item its own page.
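
One possible shape for a permalink scheme, as a minimal sketch: build each URL from a product ID that never changes, plus a readable slug made from the name. The `example.com` base URL, the function names and the sample product are hypothetical.

```python
import re


def slugify(name):
    """Lowercase the name and replace runs of non-alphanumerics with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def permalink(product_id, name, base="http://example.com/products/"):
    # The ID keeps the URL unique and stable; the slug is only there for readability
    return f"{base}{product_id}-{slugify(name)}.html"


print(permalink(1001, "Olympic Barbell (7 ft)"))
# prints http://example.com/products/1001-olympic-barbell-7-ft.html
```

Because the URL depends only on the product's own ID and name, adding or deleting other items does not move it.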

>I suppose I could create a new site purely for Froogle that excluded
>tags and stuff and only put the more popular stuff on, but that seems
>a bit excessive.


You probably don't need to go quite that far.

12/09/2006, 19:43   #9
John Bokma

"Pet @ www.gymratz.co.uk ;¬)" <PeTe33@gymratz.co.uk> wrote:

> John A. wrote:
>
>> Presumably there would be some sort of database or table for the items
>> offered for sale, from which those pages are generated. That would be
>> a better source for generating the feed.

>
> I can export the various fields, unfortunately all the HTML tags on
> product descriptions etc are exported as well.


That can be cleaned up easily. I probably misunderstood your first
request: I thought you wanted a Froogle -> RSS feed for one reason or
another, not that you were looking for a way to add an RSS feed to your
own site containing the products on your own site.
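
A minimal sketch of such a clean-up, using only Python's standard `html.parser` module: it keeps the text of an exported description and drops the tags. The function name and the sample description are made up for illustration.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Keeps only the text content of the HTML fed to it."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html_text):
    """Return html_text without tags, with whitespace collapsed to single spaces."""
    stripper = TagStripper()
    stripper.feed(html_text)
    stripper.close()
    return " ".join("".join(stripper.parts).split())


print(strip_tags("<p>Solid <b>steel</b> bar &amp; collars</p>"))
# prints Solid steel bar & collars
```

Text on either side of a tag is joined directly, so words separated only by a tag such as `<br>` would run together; a real export might need a rule for that.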

> The problem really lies with lack of URL of each specific product, this
> can change as items are added or deleted because the site creation
> software then re-creates all the pages.


Sounds like your site creation software is severely broken.

--
John Need with SEO? Get started with a SEO report of your site:

--> http://johnbokma.com/websitedesign/seo-expert-.html

13/09/2006, 19:31   #10
Pet @ www.gymratz.co.uk ;¬)

John Bokma wrote:


> Sounds like your site creation software is severely broken.


It has served its purpose for many years, but is indeed in need of a
more up-to-date solution; I just haven't found the right one yet.

:¬)

--
http://gymratz.co.uk - Best Gym Equipment & Bodybuilding Supplements UK.
http://trade-price-supplements.co.uk - TRADE PRICED SUPPLEMENTS for ALL!
http://fitness-equipment-uk.com - UK's No.1 Fitness Equipment Suppliers.
http://Water-Rower.co.uk - Worlds best prices on the Worlds best Rower.

13/09/2006, 21:38   #11
John Bokma

"Pet @ www.gymratz.co.uk ;¬)" <PeTe33@gymratz.co.uk> wrote:

> John Bokma wrote:
>
>
>> Sounds like your site creation software is severely broken.

>
> It has served its purpose for many years, but is indeed in need of a
> more up-to-date solution; I just haven't found the right one yet.


If it generates static files from a database / textfile(s), I also can be
hired to do stuff like that ;-)

Examples are: http://johnbokma.com/ (generated from XML files)
http://castleamber.com/ (generated from a single txt file)

Depending on complexity 4-8 hrs would be a (very) rough estimate. My SEO
experience will be included, of course :-D.

--
John Need with SEO? Get started with a SEO report of your site:

--> http://johnbokma.com/websitedesign/seo-expert-.html

13/09/2006, 23:06   #12
Pet @ www.gymratz.co.uk ;¬)

John Bokma wrote:

> If it generates static files from a database / textfile(s), I also can be
> hired to do stuff like that ;-)


Strangely I wouldn't have thought otherwise.
:¬)

> Examples are: http://johnbokma.com/ (generated from XML files)
> http://castleamber.com/ (generated from a single txt file)


When it starts generating its pages, the first thing it says is "saving
XML data" (or files or something). It also has an option to export XML
data; whether that is of any use to me I have no idea.

:¬)

> Depending on complexity 4-8 hrs would be a (very) rough estimate. My SEO
> experience will be included, of course :-D.


4 to 8 hours for which part? I'm lost already.

Pete

--
http://gymratz.co.uk - Best Gym Equipment & Bodybuilding Supplements UK.
http://trade-price-supplements.co.uk - TRADE PRICED SUPPLEMENTS for ALL!
http://fitness-equipment-uk.com - UK's No.1 Fitness Equipment Suppliers.
http://Water-Rower.co.uk - Worlds best prices on the Worlds best Rower.

14/09/2006, 00:18   #13
John Bokma

"Pet @ www.gymratz.co.uk ;¬)" <PeTe33@gymratz.co.uk> wrote:

> John Bokma wrote:
>
>> If it generates static files from a database / textfile(s), I also
>> can be hired to do stuff like that ;-)

>
> Strangely I wouldn't have thought otherwise.
> :¬)
>
>> Examples are: http://johnbokma.com/ (generated from XML files)
>> http://castleamber.com/ (generated from a single txt
>> file)

>
> When it starts generating its pages, the first thing it says is
> "saving XML data" (or files or something). It also has an option to
> export XML data; whether that is of any use to me I have no idea.
>
> :¬)
>
>> Depending on complexity 4-8 hrs would be a (very) rough estimate. My
>> SEO experience will be included, of course :-D.

>
> 4 to 8 hours for which part? I'm lost already.


Generating a site from database/textfile(s) statically.
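
To give a rough idea of what that involves, here is a minimal sketch, assuming the product data is already available as a list of records. The template, field names and sample products are made up, and this is not how the example sites mentioned above are built. Each record is rendered into one HTML page whose filename comes from the product ID, so the URL stays the same when other items are added or removed.

```python
from html import escape
from string import Template

# Hypothetical page template; $name, $description and $price are filled in per product
PAGE = Template("""<html>
<head><title>$name</title></head>
<body><h1>$name</h1><p>$description</p><p>Price: $price</p></body>
</html>
""")

# Made-up records standing in for the database or text file contents
products = [
    {"id": "1001", "name": "Olympic Barbell", "description": "7 ft steel bar", "price": "99.95"},
    {"id": "1002", "name": "Water Rower", "description": "Ash wood rowing machine", "price": "749.00"},
]


def render_site(rows):
    """Map each output filename to its rendered HTML page."""
    pages = {}
    for row in rows:
        # Escape the values so characters like & and < cannot break the markup
        safe = {key: escape(str(value)) for key, value in row.items()}
        pages[f"{row['id']}.html"] = PAGE.substitute(safe)
    return pages


pages = render_site(products)
for filename in sorted(pages):
    print(filename)
```

Writing the pages to disk is then a loop over `pages` that opens each filename and writes its content.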

--
John Need with SEO? Get started with a SEO report of your site:

--> http://johnbokma.com/websitedesign/seo-expert-.html