02/09/2007, 21:52   #18
Ed Jay
Re: Outgoing links and Google ranking

Jerry Stuckle scribed:

>Ed Jay wrote:
>> Jerry Stuckle scribed:
>>
>>> Ed Jay wrote:
>>>> Jerry Stuckle scribed:
>>>>
>>>>> Ed Jay wrote:
>>>>>> Jerry Stuckle scribed:
>>>>>>
>>>>>>> Ed Jay wrote:
>>>>>>>> Brian Cryer scribed:
>>>>>>>>
>>>>>>>>> "Jerry Stuckle" <jstucklex@attglobal.net> wrote in message
>>>>>>>>> news:R72dnTtJFO7zckTbnZ2dnUVZ_r3inZ2d@comcast.com...
>>>>>>>>>> Brian Cryer wrote:
>>>>>>>>> <snip>
>>>>>>>>>>> For what it's worth, for link exchanges I don't exchange links with sites
>>>>>>>>>>> which use perl, nofollow or scripts for their links.
>>>>>>>>>> There is no such thing as a "perl link" on a web page. Perl may generate
>>>>>>>>>> the link - but it's straight html code, and no one can tell from the
>>>>>>>>>> client side whether the link was generated statically, with Perl, PHP, ASP
>>>>>>>>>> or one of the 1,000,000 parrots pecking on keyboards.
>>>>>>>>> Quite true.
>>>>>>>>>
>>>>>>>>> What I meant, and what I think the OP was referring to is that pages that
>>>>>>>>> are generated using perl typically seem to have a zero PR. Whether a 0 PR
>>>>>>>>> means that Google isn't following the link I simply don't know. For example
>>>>>>>>> while example.com (if generated using perl) might have a PR of say 5,
>>>>>>>>> example.com/foo.pl?i=3 typically has a PR of 0. (This may not be restricted
>>>>>>>>> to perl.) More than happy to be shown that I'm wrong on this - my feeling is
>>>>>>>>> that I should be wrong about it.
>>>>>>>>>
>>>>>>>>> I suppose in the context of the OP thread, a link generated using a perl
>>>>>>>>> script if it were simply generating html wouldn't in any way be
>>>>>>>>> distinguishable from a normal link. So, in the context of the thread you are
>>>>>>>>> 100% correct. Good point.
>>>>>>>> My specific issue pertains to a single page that contains all of my external
>>>>>>>> links, and having that page generated using Perl (or any other SS solution).
>>>>>>>> My observations indicate that none of the SE spiders follow links to Perl or
>>>>>>>> other SS scripts. If one is penalized for the number of outgoing links, then
>>>>>>>> the SE spider would never see those links.
>>>>>>> Another thought, Ed - have you tried validating the page? It could be
>>>>>>> the html is screwed up just enough to upset the se spider. Or are you
>>>>>>> sure this page is being spidered at all?
>>>>>> Two things, Jerry. I've written my robots.txt file to disallow /cgi-bin. My
>>>>>> pages validate 100% and it gets spidered.
>>>>> Hmmm, that's quite interesting then, Ed. Are the external scripts in
>>>>> /cgi-bin directories? (I doubt it, but had to ask rather than assume :-) ).
>>>> Yes, they are. Aren't your Perl scripts?
>>> Some are in cgi, some in cgi-bin - it depends on how the server was
>>> originally set up.
>>>
>>>>> Also, a maybe too-obvious question - how do you know the spider doesn't
>>>>> follow the external link?
>>>> I see the HTML page hits in my stats, but no page hits on the pages
>>>> generated by the Perl scripts.
>>> So you said you're disallowing /cgi-bin in your robots.txt file, but now
>>> the spiders aren't hitting those in the /cgi-bin directory?
>>>
>>> I must be missing something here, because I know that isn't the case.

>>
>> If I'm disallowing the spiders to roam cgi-bin, how can they spider the
>> scripts contained within it? I don't understand your confusion.

>
>I'm confused because earlier you said:
>
>"I've written my robots.txt file to disallow /cgi-bin."
>
>So files in those directories would not be spidered, but pages which link
>to those (but are not in /cgi-bin) will be spidered.
>
>Is this what you're seeing?
>
>Sorry, but it looks like I'm missing something obvious.


I shouldn't have confused the issue with my robots.txt file, as I didn't add
it until a couple of days ago. It's 98° here and I think it's affected my
thinking (more than usual). So, forget the robots.txt disallowance.
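
For anyone following along, here is a rough sketch of what a /cgi-bin disallow rule does for a well-behaved spider. It uses Python's standard urllib.robotparser. The robots.txt text and the two file names (/links.html, /cgi-bin/links.pl) are made up for illustration and are not my actual files.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with the rule discussed in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
"""


def is_allowed(path, user_agent="Googlebot"):
    """Return True if a robots.txt-compliant crawler may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths: a static page and a script under /cgi-bin.
    for path in ("/links.html", "/cgi-bin/links.pl"):
        print(path, is_allowed(path))
```

With that rule in place, a compliant spider may still fetch the static page that links to the script, but not the script itself.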

With access to /cgi-bin allowed, I'm seeing the hit stats showing the static
HTML pages getting hits from the spiders, but none of the .pl, .cgi or HTML
pages generated by these scripts are shown as getting hits.
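
A sketch of how that comparison could be made from raw logs rather than a stats package, assuming the server writes the common "combined" access log format. The sample log lines, the IP addresses, and the short list of spider names are all invented for the example, and the name list is nowhere near complete.

```python
import re
from collections import Counter

# Pulls the request path and user-agent out of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative substrings only; real spider detection needs a fuller list.
BOT_MARKERS = ("googlebot", "slurp", "msnbot")


def classify(path):
    """Label a request path as 'script' or 'static' (ignoring any query string)."""
    path = path.split("?", 1)[0]
    if path.startswith("/cgi-bin/") or path.endswith((".pl", ".cgi")):
        return "script"
    return "static"


def spider_hits(lines):
    """Count spider requests per page type; non-spider lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        agent = match.group("agent").lower()
        if any(marker in agent for marker in BOT_MARKERS):
            counts[classify(match.group("path"))] += 1
    return counts


# Hypothetical log excerpt: two spider hits on static pages, one browser hit on a script.
SAMPLE_LOG = [
    '192.0.2.10 - - [02/Sep/2007:21:00:00 -0700] "GET /links.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [02/Sep/2007:21:00:05 -0700] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.55 - - [02/Sep/2007:21:01:00 -0700] "GET /cgi-bin/links.pl?i=3 HTTP/1.1" 200 900 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

if __name__ == "__main__":
    print(spider_hits(SAMPLE_LOG))
```

A zero count for "script" alongside a nonzero count for "static" would match what the stats show, though it says nothing about why the spiders stayed away.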

There's another possibility...each of the scripts checks for a login
before it allows access to the script. IOW, the spiders can't access the
scripts, because they don't set a , so the HTML document is never
generated for the spider to roam.
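
To illustrate the idea, here is a minimal sketch of a login-gated script. It assumes, purely for the example, that the login state travels in a cookie named "session"; the cookie name, the 403 response, and the page content are hypothetical and not taken from my scripts.

```python
from http.cookies import SimpleCookie

# Hypothetical page the script would generate for a logged-in visitor.
LINKS_PAGE = '<html><body><a href="http://example.com/">Example</a></body></html>'


def gated_links_script(request_headers):
    """Return (status, body); the links page is only built when a session cookie is sent."""
    cookie = SimpleCookie()
    cookie.load(request_headers.get("Cookie", ""))
    if "session" not in cookie:
        # No login: the HTML is never generated, so there are no links to follow.
        return 403, ""
    return 200, LINKS_PAGE


if __name__ == "__main__":
    # A spider typically sends no session cookie.
    print(gated_links_script({}))
    # A logged-in browser sends one.
    print(gated_links_script({"Cookie": "session=abc123"}))
```

A spider that arrives without the login never receives the generated page, so any outgoing links on it stay invisible to that spider.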

I won't be answering again for a few hours. We're off to get some relief
from the heat. Thanks for your input and queries. You make me think.
--
Ed Jay (remove 'M' to respond by email)