[Chugalug] would you trust yandex spidering your site?

Dave Brockman dave at brockmans.com
Wed Nov 14 22:11:32 UTC 2012


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/14/2012 4:33 PM, Rod-Lists wrote:
> they hitting mine.

If you have a website open to the public, you kinda have to trust
various entities to spider it, because they will whether you do or
not.  Yes, I know you can create that cute little robots.txt file and
instruct the spider what to index and what not to.  They do not have
to honor or even read that file.  And when I use curl to fetch a site,
or wget with robots=off to mirror one, they don't give a damn about
robots.txt either....
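
To make the "they do not have to honor it" point concrete, here is a
minimal sketch in Python of what a polite spider does before fetching
a page.  The robots.txt content, the example.com URLs and the
is_allowed() helper are all made up for illustration; the only real
API used is the standard library's urllib.robotparser.  The check
happens entirely on the client side, so a spider that skips it never
notices your rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: Yandex
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url.

    This is the voluntary check a well-behaved crawler runs.
    Nothing on the server enforces the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Yandex", "http://example.com/index.html"),
        ("SomeOtherBot", "http://example.com/index.html"),
        ("SomeOtherBot", "http://example.com/private/report.html"),
    ]:
        print(agent, url, is_allowed(agent, url))
```

A client that never calls anything like is_allowed() just requests the
URL, and the web server will happily answer unless you block it some
other way.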

There are places on the interwebz that will give you fairly current
lists of IP addresses and/or netblocks you don't want visiting your
network, or don't want your network to visit, if you want to play
eternal whack-a-mole.  Chugalug is not one of those places, btw....
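
If you do want to play, the matching side is simple enough.  Below is
a rough sketch, assuming you already have a list of CIDR blocks from
some source you trust.  The BLOCKLIST entries are documentation-only
ranges (RFC 5737) and the is_blocked() helper is hypothetical; the
ipaddress module is standard Python.

```python
import ipaddress

# Hypothetical blocklist built from documentation-only ranges.
# A real list would come from whatever source you choose to trust.
BLOCKLIST = ["192.0.2.0/24", "198.51.100.0/25"]


def is_blocked(ip, blocklist=BLOCKLIST):
    """Return True if ip falls inside any CIDR block in blocklist."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(block) for block in blocklist)


if __name__ == "__main__":
    for ip in ["192.0.2.55", "198.51.100.200", "203.0.113.9"]:
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

The whack-a-mole part is keeping BLOCKLIST current, not the lookup.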

Regards,

dtb


- -- 
"Some things in life can never be fully appreciated nor
understood unless experienced firsthand. Some things in
networking can never be fully understood by someone who neither
builds commercial networking equipment nor runs an operational
network."  RFC 1925
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iEYEARECAAYFAlCkFxQACgkQABP1RO+tr2SB1QCeNIcpKYUbLLLQmlE0bNG2pqPi
yuEAoKJ3EzcYk4Pnov9NljTa/GEl/5bp
=Y0ef
-----END PGP SIGNATURE-----
