02/09/2007, 21:59   #19
Jerry Stuckle
Re: Outgoing links and Google ranking

Ed Jay wrote:
> Jerry Stuckle scribed:
>
>> Ed Jay wrote:
>>> Jerry Stuckle scribed:
>>>
>>>> Ed Jay wrote:
>>>>> Jerry Stuckle scribed:
>>>>>
>>>>>> Ed Jay wrote:
>>>>>>> Jerry Stuckle scribed:
>>>>>>>
>>>>>>>> Ed Jay wrote:
>>>>>>>>> Brian Cryer scribed:
>>>>>>>>>
>>>>>>>>>> "Jerry Stuckle" <jstucklex@attglobal.net> wrote in message
>>>>>>>>>> news:R72dnTtJFO7zckTbnZ2dnUVZ_r3inZ2d@comcast.com...
>>>>>>>>>>> Brian Cryer wrote:
>>>>>>>>>> <snip>
>>>>>>>>>>>> For what it's worth, for link exchanges I don't exchange links with sites
>>>>>>>>>>>> which use perl, nofollow or scripts for their links.
>>>>>>>>>>> There is no such thing as a "perl link" on a web page. Perl may generate
>>>>>>>>>>> the link - but it's straight html code, and no one can tell from the
>>>>>>>>>>> client side whether the link was generated statically, with Perl, PHP, ASP
>>>>>>>>>>> or one of the 1,000,000 parrots pecking on keyboards.
>>>>>>>>>> Quite true.
>>>>>>>>>>
>>>>>>>>>> What I meant, and what I think the OP was referring to is that pages that
>>>>>>>>>> are generated using perl typically seem to have a zero PR. Whether a 0 PR
>>>>>>>>>> means that Google isn't following the link I simply don't know. For example
>>>>>>>>>> while example.com (if generated using perl) might have a PR of say 5,
>>>>>>>>>> example.com/foo.pl?i=3 typically has a PR of 0. (This may not be restricted
>>>>>>>>>> to perl.) More than happy to be shown that I'm wrong on this - my feeling is
>>>>>>>>>> that I should be wrong about it.
>>>>>>>>>>
>>>>>>>>>> I suppose in the context of the OP thread, a link generated using a perl
>>>>>>>>>> script if it were simply generating html wouldn't in any way be
>>>>>>>>>> distinguishable from a normal link. So, in the context of the thread you are
>>>>>>>>>> 100% correct. Good point.
>>>>>>>>> My specific issue pertains to a single page that contains all of my external
>>>>>>>>> links, and having that page generated using Perl (or any other SS solution).
>>>>>>>>> My observations indicate that none of the SE spiders follow links to Perl or
>>>>>>>>> other SS scripts. If one is penalized for the number of outgoing links, then
>>>>>>>>> the SE spider would never see those links.
>>>>>>>> Another thought, Ed - have you tried validating the page? It could be
>>>>>>>> the html is screwed up just enough to upset the se spider. Or are you
>>>>>>>> sure this page is being spidered at all?
>>>>>>> Two things, Jerry. I've written my robots.txt file to disallow /cgi-bin. My
>>>>>>> pages validate 100% and it gets spidered.
>>>>>> Hmmm, that's quite interesting then, Ed. Are the external scripts in
>>>>>> /cgi-bin directories? (I doubt it, but had to ask rather than assume :-) ).
>>>>> Yes, they are. Aren't your Perl scripts?
>>>> Some are in cgi, some in cgi-bin - it depends on how the server was
>>>> originally set up.
>>>>
>>>>>> Also, a maybe too-obvious question - how do you know the spider doesn't
>>>>>> follow the external link?
>>>>> I see the HTML page hits in my stats, but no page hits on the pages
>>>>> generated by the Perl scripts.
>>>> So you said you're disallowing /cgi-bin in your robots.txt file, but now
>>>> the spiders aren't hitting those in the /cgi-bin directory?
>>>>
>>>> I must be missing something here, because I know that isn't the case.
>>> If I'm disallowing the spiders to roam cgi-bin, how can they spider the
>>> scripts contained within it? I don't understand your confusion.

>> I'm confused because earlier you said:
>>
>> "I've written my robots.txt file to disallow /cgi-bin."
>>
>> So files in those directories would not be spidered, but pages which link
>> to those (but are not in /cgi-bin) will be spidered.
>>
>> Is this what you're seeing?
>>
>> Sorry, but it looks like I'm missing something obvious.
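
The robots.txt behavior described above can be checked with a small sketch using Python's standard urllib.robotparser. The robots.txt body, the crawler name "ExampleBot" and both paths are made up for illustration; they are not taken from Ed's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt mirroring the quoted rule.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
"""


def allowed_paths(robots_txt, paths, agent="ExampleBot"):
    """Return {path: bool}: may `agent` fetch each path under these rules?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}


if __name__ == "__main__":
    # Hypothetical paths: a static page and a script under /cgi-bin.
    checks = allowed_paths(ROBOTS_TXT, ["/links.html", "/cgi-bin/links.pl"])
    for path, ok in checks.items():
        print(path, "allowed" if ok else "blocked")
```

A well-behaved spider would still fetch the static page that links to the script; only the script URL itself is off limits.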

>
> I shouldn't have confused the issue with my robots.txt file, as I didn't add
> it until a couple of days ago. It's 98° here and I think it's affected my
> thinking (more than usual). So, forget the robots.txt disallowance.
>
> With access to /cgi-bin allowed, I'm seeing the hit stats showing the static
> HTML pages getting hits from the spiders, but none of the .pl, .cgi or HTML
> pages generated by these scripts are shown as getting hits.
>
> There's another possibility...each of the scripts checks for a login
> before it allows access to the script. IOW, the spiders can't access the
> scripts, because they don't set a cookie, so the HTML document is never
> generated for the spider to roam.


Ed, that is very true. Spiders don't track cookies, so they won't be
able to see the pages. However, you should still see a request for the
page in your server access log. But if you're doing any tracking in
your code that depends on the cookie, it won't be executed.
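
One way to check this is to look for spider requests in the raw access log instead of relying on stats gathered inside the scripts. Here is a rough sketch, assuming the log is in Apache Combined Log Format; the sample lines, the bot name list and the /cgi-bin/ prefix are illustrative assumptions, not data from Ed's server.

```python
import re

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Assumed user-agent substrings for crawlers; extend as needed.
BOT_MARKERS = ("googlebot", "slurp", "msnbot")


def spider_hits(lines, prefix="/cgi-bin/"):
    """Return (path, status) for crawler requests under `prefix`."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        is_bot = any(marker in agent for marker in BOT_MARKERS)
        if is_bot and match.group("path").startswith(prefix):
            hits.append((match.group("path"), int(match.group("status"))))
    return hits


# Made-up sample log lines for illustration only.
SAMPLE = [
    '192.0.2.10 - - [02/Sep/2007:21:40:01 -0400] "GET /links.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '192.0.2.10 - - [02/Sep/2007:21:40:05 -0400] "GET /cgi-bin/links.pl HTTP/1.1" '
    '302 210 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '198.51.100.7 - - [02/Sep/2007:21:41:00 -0400] "GET /cgi-bin/links.pl HTTP/1.1" '
    '200 8300 "http://example.com/" "Mozilla/5.0 (Windows; U)"',
]

if __name__ == "__main__":
    for path, status in spider_hits(SAMPLE):
        print(path, status)
```

If the spider is following the link, its request shows up here even when the script refuses to generate the page; if nothing shows up, the spider never asked for the script at all.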

>
> I won't be answering again for a few hours. We're off to get some relief
> from the heat. Thanks for your input and queries. You make me think.


OK, have a good one. I'm just sitting in a cook house watching
baseball. :-)


--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================