alt.internet.seo Internet search engines and related topics.

How could this page be indexed?

06/02/2007, 00:27   #1
Alexandre Oberlin

Hi all,

I recently added a subdirectory with PHP files to my site. AFAIK no link
could possibly be pointing at it, so I didn't bother to disallow it in
robots.txt at first.
Yet Google, Yahoo and MSN retrieved those pages...
This even brought me some visitors.
I tried the search made by one visitor, picked from my logs, but it did
not lead to my site.
Also, I had "Options -Indexes" in the .htaccess file.

Does anyone have a clue?

Regards,

AO

06/02/2007, 01:24   #2
Nikita the Spider

In article <eq8i19$mm4$1@fata.cs.interbusiness.it>,
Alexandre Oberlin <alex@nospam.com.invalid> wrote:

> Hi all,
>
> I recently added a subdirectory with PHP files to my site. AFAIK no link
> could possibly be pointing at it, so I didn't bother to disallow it in
> robots.txt at first.
> Yet Google, Yahoo and MSN retrieved those pages...
> This even brought me some visitors.
> I tried the search made by one visitor, picked from my logs, but it did
> not lead to my site.
> Also, I had "Options -Indexes" in the .htaccess file.


Did your link make it into anyone's email? Google, Yahoo and MSN all
have mail services and I would not be surprised if they index links that
their mail services see in an effort to be ahead of the competition.

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more

06/02/2007, 01:28   #3
www.1-script.com


Alexandre Oberlin wrote:

> Hi all,
>
> I recently added a subdirectory with PHP files to my site. AFAIK no link
> could possibly be pointing at it, so I didn't bother to disallow it in
> robots.txt at first.
> Yet Google, Yahoo and MSN retrieved those pages...
> This even brought me some visitors.
> I tried the search made by one visitor, picked from my logs, but it did
> not lead to my site.
> Also, I had "Options -Indexes" in the .htaccess file.
>
> Does anyone have a clue?
>
> Regards,

This can happen in a great many different ways.

First of all, there is no way to know for sure that a page has no links
pointing to it. Google would certainly be the last place to look: their
algorithm is based on links, so their "link:" command is purposely
designed to hide most of the significant links, or to show no links at
all.

Also, to give you a clue: don't you run the Google Toolbar in your
browser? Didn't they warn you about possible privacy issues? Basically,
they know every page you visit if you have the PageRank checker enabled.

And there is always the possibility of someone deliberately linking to a
"bad" page on your site. A competitor would do that in a heartbeat.

--
Cheers,
Dmitri

06/02/2007, 01:58   #4
Alexandre Oberlin

Nikita the Spider wrote:

> Did your link make it into anyone's email? Google, Yahoo and MSN all
> have mail services and I would not be surprised if they index links that
> their mail services see in an effort to be ahead of the competition.
>

Yes, I sent one email containing this URL through Google's SMTP server,
but... that was 2 days *after* the first indexing by Google and Yahoo.
I find this hard to believe anyway!




06/02/2007, 02:09   #5
Alexandre Oberlin

1100000 wrote:
> If you build a "secret" page or directory and then follow a link from it
> to another site, your browser will send the referrer string. If the
> external site has some kind of tracking program that allows search
> engines to spider the stats, it will send search engines your way.

I'm not quite sure, but I might have tested 2 links from the page:
a Wikipedia article and a book/CD publisher.
So this could be a clue.

> If you use the Firefox Web Developer Toolbar, you can set it to not send
> referrers. More features here:
> http://tips.webdesign10.com/web-developer-toolbar.htm

Thank you for the tip.

> If you don't want to expose the location of your secret page in robots.txt,
> you can use the robots meta tag, but not every robot will obey that either.

Yes, it is definitely a better solution in this case, provided I don't
forget to remove it when the time comes to advertise the page!

> Best to put it in a password protected directory.

Too much of a hassle to do that with my ISP.


--
In the land of the one-eyed, the blind are kings.

Alexandre Oberlin
http://www.migo.info/

06/02/2007, 02:22   #6
Alexandre Oberlin

www.1-script.com wrote:
> This can happen in a great many different ways.
>
> First of all, there is no way to know for sure that a page has no links
> pointing to it. Google would certainly be the last place to look: their
> algorithm is based on links, so their "link:" command is purposely
> designed to hide most of the significant links, or to show no links at
> all.

I know. MSN did not return anything either.

> Also, to give you a clue: don't you run the Google Toolbar in your
> browser?

No.

> Didn't they warn you about possible privacy issues? Basically, they
> know every page you visit if you have the PageRank checker enabled.
>
> And there is always the possibility of someone deliberately linking to a
> "bad" page on your site. A competitor would do that in a heartbeat.

They would have to know the name of the directory.
So most probably no one could have done that before my email, which I
sent 2 days after the first indexing.
So the only remaining possibility would be the referrer?




06/02/2007, 06:27   #7
1100000

"Alexandre Oberlin" <alex@nospam.com.invalid> wrote:
> Hi all,
>
> I recently added a subdirectory with PHP files to my site. AFAIK no link
> could possibly be pointing at it, so I didn't bother to disallow it in
> robots.txt at first.
> Yet Google, Yahoo and MSN retrieved those pages...
> This even brought me some visitors.
> I tried the search made by one visitor, picked from my logs, but it did
> not lead to my site.
> Also, I had "Options -Indexes" in the .htaccess file.
>
> Does anyone have a clue?


If you build a "secret" page or directory and then follow a link from it
to another site, your browser will send the referrer string. If the
external site has some kind of tracking program that allows search
engines to spider the stats, it will send search engines your way.
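
To make the referrer leak concrete, here is a minimal Python sketch of what the external site's statistics might expose. It assumes the external server writes logs in the Apache "combined" format; the host example.org, the sample log lines and the function name leaked_urls are invented for illustration and do not come from this thread.

```python
import re
from urllib.parse import urlsplit

# Apache "combined" log format: the Referer is the second-to-last quoted field.
COMBINED = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$'
)


def leaked_urls(log_lines, host):
    """Return the distinct Referer URLs on `host` found in `log_lines`."""
    found = []
    for line in log_lines:
        match = COMBINED.match(line.strip())
        if not match:
            continue  # skip lines that are not in combined format
        referer = match.group("referer")
        if urlsplit(referer).hostname == host and referer not in found:
            found.append(referer)
    return found


# Hypothetical log lines, as the *external* site would record them.
SAMPLE_LOG = [
    '203.0.113.7 - - [04/Feb/2007:10:12:01 +0100] "GET /article HTTP/1.1" '
    '200 5120 "http://example.org/secret/test.php" "Mozilla/5.0"',
    '203.0.113.9 - - [04/Feb/2007:10:13:44 +0100] "GET / HTTP/1.1" '
    '200 1024 "-" "Mozilla/5.0"',
]

print(leaked_urls(SAMPLE_LOG, "example.org"))
```

If a stats page built from such a log is open to crawlers, every URL this function returns is one a spider could discover without any ordinary link to it existing.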

If you use the Firefox Web Developer Toolbar, you can set it to not send
referrers. More features here:
http://tips.webdesign10.com/web-developer-toolbar.htm

If you don't want to expose the location of your secret page in robots.txt,
you can use the robots meta tag, but not every robot will obey that either.
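
For reference, the robots meta tag is a line such as <meta name="robots" content="noindex"> in the page's head. The Python sketch below (standard library only) lists the directives a page declares, which is one way to check that the tag is in place, or to catch it before the page goes public. The class name, function name and sample page are assumptions made for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # The content attribute is a comma-separated list of directives.
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def robots_directives(html):
    """Return the set of robots directives declared in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical page carrying the tag.
PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>Work in progress</title>
</head><body>...</body></html>"""

print(sorted(robots_directives(PAGE)))
```

Unlike a Disallow line in robots.txt, the tag does not announce the directory name to anyone who reads that file, but a robot has to fetch the page to see it.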

Best to put it in a password protected directory.


06/02/2007, 07:38   #8
1100000

"Alexandre Oberlin" <alex@nospam.com.invalid> wrote:
> 1100000 wrote:
>> If you build a "secret" page or directory and then follow a link from
>> it to another site, your browser will send the referrer string. If the
>> external site has some kind of tracking program that allows search
>> engines to spider the stats, it will send search engines your way.

> I'm not quite sure, but I might have tested 2 links from the page:
> a Wikipedia article and a book/CD publisher.
> So this could be a clue.


You can read more about the technique here:
http://en.wikipedia.org/wiki/Referer_spam

Black-hat SEOs use it, but it can also cause spiders to find "hidden"
pages on your own site.


>> If you don't want to expose the location of your secret page in
>> robots.txt, you can use the robots meta tag, but not every robot will
>> obey that either.

> Yes, it is definitely a better solution in this case, provided I don't
> forget to remove it when the time comes to advertise the page!


Have you tried a project management tool called ActiveCollab? It's free...
You can make to-do lists so that you won't forget things like that. It's
modeled on Basecamp, which is not bad, but it's not free and you can't host
it on your own servers.

Highly recommended:
http://www.activecollab.com/

>> Best to put it in a password protected directory.

> Too much of a hassle to do that with my ISP.


No .htaccess support?

