#1
Hello.

I am looking for a way to read an HTML file and create a short summary (like the snippet shown in Google results, for example), which ought to be the first few lines of welcome text or so. Does anyone have any idea how to do this? (I searched a lot, but all I found was simply extracting meta tags.)

Thanks
#2
On Jan 17, 3:48 pm, solk <rikibl...@gmail.com> wrote:
> Hello.
>
> I am looking for a way to read html file and create
> a short summary (like that shows in google results for example)
> which ought to be the first few lines of welcome text or so.
>
> Does any got any idea on how to do this? (I searched allot,
> but all I found was simply extracting meta tags).
>
> Thanks

Well, the tricky part is that you'll need to decide what text to grab and show from the file, which is why there's a meta description tag for the purpose. I believe Google grabs the text surrounding a search term and displays that if there's no meta description tag to use, so if you're actually searching for a term you could do something like that.
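
The idea above (use the meta description when present, otherwise show the text around the search term) could be sketched roughly as follows. This is only an illustration in Python using the standard library; the names `summarize`, `_PageScanner`, `term`, and `width` are made up for this example, and the fallback is a guess at the general behaviour, not how Google actually builds its snippets.

```python
from html.parser import HTMLParser


class _PageScanner(HTMLParser):
    """Collects the meta description and the visible text of a page."""

    # Text inside these elements is never shown to a reader.
    SKIP = {"script", "style", "title"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.description = None
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip() or None
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def summarize(html_text, term=None, width=160):
    """Return the meta description, else text around `term`, else leading text."""
    scanner = _PageScanner()
    scanner.feed(html_text)
    scanner.close()
    if scanner.description:
        return scanner.description
    # Collapse all runs of whitespace so the snippet reads as one line.
    text = " ".join(" ".join(scanner.chunks).split())
    if term:
        pos = text.lower().find(term.lower())
        if pos != -1:
            start = max(0, pos - width // 2)
            return text[start:start + width].strip()
    return text[:width].strip()
```

Calling `summarize(page_source, term="hosting")` would return the description tag if the page has one, and otherwise a window of roughly `width` characters around the first match of the term. If the term is missing or not found, it falls back to the first `width` characters of visible text.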
#3
Hello,

I can recommend Snoopy (http://snoopy.sourceforge.net/). It can retrieve an entire web page, follow links, and so on. The result is the HTML source you would see if you did a "view source" in your web browser. From there you can strip the HTML tags and use substr() to jump to certain sections of the source (e.g. jump to right after the body tag, remove all HTML tags, and save the text output).

- Jensen
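
The steps described here (fetch the page, jump to just after the body tag, strip the tags, keep the first bit of text) might look something like the sketch below. It is written in Python with only the standard library, so `urlopen` stands in for Snoopy and regular expressions plus slicing stand in for tag stripping and substr(). The function names `fetch_html` and `first_lines` and the `max_chars` parameter are hypothetical. Regex-based tag stripping is a rough approximation and can trip over unusual markup.

```python
import html
import re
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download a page and decode it (stand-in for a fetcher such as Snoopy)."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def first_lines(source, max_chars=200):
    """Return the first visible text after <body>, cut at a word boundary."""
    # Jump to right after the opening body tag, if there is one.
    m = re.search(r"<body\b[^>]*>", source, flags=re.IGNORECASE)
    if m:
        source = source[m.end():]
    # Drop script/style blocks and comments, whose contents are not page text.
    source = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", source)
    source = re.sub(r"(?s)<!--.*?-->", " ", source)
    # Remove the remaining tags, decode entities, collapse whitespace.
    text = re.sub(r"(?s)<[^>]+>", " ", source)
    text = " ".join(html.unescape(text).split())
    if len(text) <= max_chars:
        return text
    # Avoid ending in the middle of a word.
    cut = text[:max_chars].rsplit(" ", 1)[0]
    return cut + "..."
```

For a live page this would be used as `first_lines(fetch_html(url))`; for a file already on disk, read it into a string and pass that to `first_lines` directly.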