03-08-2006, 11:28 PM   #1
komodo

Name: komodo
Title: Administrator
Status: Offline
Join Date: Apr 2005
Location: Athens, GA
Rate My Car: 68 / 340
Your Ride: 1995 M3
Yahoo Slurp Crawler.... not playing by the rules?

Whoa... I just caught Slurp somewhere it shouldn't be...


Now, United Bimmer's robots.txt file has this in it first of all:
[code]Disallow: /forums/printhread.php[/code]

And second of all, the ONLY link to that page is from the actual thread itself, and that link has a rel="nofollow" attribute on it, so even if I didn't have the robots.txt entry, Slurp still shouldn't follow it.

Why/how is it crawling there?

03-08-2006, 11:47 PM   #2
witeshark

Name: witeshark
Title: Suspended License
Status: Offline
Join Date: Apr 2005
Location: Miami FL
Rate My Car: 84 / 340
Your Ride: 89 325i 5 speed
Christ - a link jump loop!

03-08-2006, 11:53 PM   #3
komodo

Link jump loop?

Edit: Ah, but after an iteration or two, the spider should figure it out and default out. Not good, but not the end of the world either.

03-08-2006, 11:56 PM   #4
witeshark

Okay

03-09-2006, 12:46 AM   #5
komodo

Wow, stupid mistake.

Notice how in the robots.txt file I restricted "printhread.php"? Wouldn't it make more sense if I restricted "printthread.php"? haha
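
For anyone who wants to see the effect, here's a rough sketch using Python's standard `urllib.robotparser`. It assumes the rule sits under a `User-agent: *` line (the thread only quotes the `Disallow` line), and `is_allowed` is just a helper name made up for this example. The misspelled rule never matches the real path, so a well-behaved crawler is still free to fetch it.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, user_agent, path):
    """Return True if robots_lines permit user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from a list of lines, no network access
    return parser.can_fetch(user_agent, path)


# Assumption: the Disallow line lives under a catch-all User-agent group.
typo_rules = ["User-agent: *", "Disallow: /forums/printhread.php"]
fixed_rules = ["User-agent: *", "Disallow: /forums/printthread.php"]

path = "/forums/printthread.php"

print(is_allowed(typo_rules, "Slurp", path))   # misspelled rule does not match
print(is_allowed(fixed_rules, "Slurp", path))  # corrected rule blocks the page
```

robots.txt rules are matched as path prefixes, so one missing letter is enough for the rule to match nothing on the site.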

03-09-2006, 09:23 AM   #6
jms

Name: jms
Title: ______
Status: Offline
Join Date: Jul 2005
Location: Pittsburgh,PA
Rate My Car: 105 / 340
Your Ride: 99 328I Convertible
yep, that would do it. some crawlers do ignore the robots.txt file, though: 99% obey it, but every once in a while there's one that doesn't. you might want to add a referrer check to that page to prevent a loop.
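
A referrer check along those lines might look something like the sketch below. It's only an illustration: `referrer_ok` and `site_host` are made-up names, and it assumes the thread page is vBulletin's usual `showthread.php`. Keep in mind the Referer header is optional and easy to fake, and many crawlers send none at all, so treat this as a speed bump rather than real protection.

```python
from urllib.parse import urlsplit


def referrer_ok(referer, site_host, allowed_path="/forums/showthread.php"):
    """Return True only if the request was referred by the thread page on our own host."""
    if not referer:
        # No Referer header at all: typical for crawlers and direct hits.
        return False
    parts = urlsplit(referer)
    # hostname is lowercased by urlsplit; the query string (thread id) is ignored.
    return parts.hostname == site_host and parts.path == allowed_path


# Hypothetical host, used only for the demonstration.
print(referrer_ok("http://example.com/forums/showthread.php?t=1", "example.com"))
print(referrer_ok(None, "example.com"))
```

The print page would then serve its content only when the check passes, and redirect or return an error otherwise.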

03-09-2006, 10:48 AM   #7
komodo

Yeah, I watch all the bots that crawl us very closely (mostly to ensure performance and efficiency). Right now there are only 12, so it's easy to keep tabs on them.

If I ever see one violating robots.txt, I'd probably ban it... but we get 65% of all our search engine traffic from Yahoo, so I'd rather keep it happy.
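
As a rough idea of how that kind of check could be automated, the sketch below runs a list of crawler hits against the robots.txt rules with Python's `urllib.robotparser`. The hits are invented sample data, `find_violations` is a hypothetical helper, and pulling (user agent, path) pairs out of the real access logs is left out entirely.

```python
from urllib.robotparser import RobotFileParser


def find_violations(robots_lines, hits):
    """Return the (user_agent, path) hits that the robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return [(agent, path) for agent, path in hits
            if not parser.can_fetch(agent, path)]


# Assumption: a catch-all group holding the corrected rule.
rules = ["User-agent: *", "Disallow: /forums/printthread.php"]

# Made-up sample hits, not taken from any real log.
hits = [
    ("ExampleBot", "/forums/showthread.php"),
    ("ExampleBot", "/forums/printthread.php"),
    ("OtherBot", "/forums/index.php"),
]

for agent, path in find_violations(rules, hits):
    print(agent, "fetched disallowed path", path)
```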

All times are GMT -5.