A very interesting lawsuit was handed a very interesting judgment the other day in Colorado, in the case of a woman, Suzanne Shell, who filed suit against an internet search engine spider which “crawled” her site, indexing, as these spiders do, its contents. As discussed over at Information Week, the suit alleges everything from breaking and entering and theft to racketeering and breach of contract.
It all got thrown out of court, except the breach of contract part.
Huh? Well, she has a warning on her site, profanejustice.org, that entering it and clicking on links etc. constitutes acceptance of her terms of service, which include not indexing it or downloading the contents, etc. You get the idea.
She sounds like something of a s**t disturber, anyway, refusing at one point to surrender a .38 in her carry-on, etc. Don’t get me wrong, I am gaining growing respect for the disturbers out there in this strange world. But, you know, pick your battles.
But to the issue: *do* internet search sites have the right, no matter what, to index you and send readers your way? Or index you and use the information for something else? Is it a bad thing to respect someone’s declared intent for you to not do that?
I think the whole argument about whether computer programs or agents or spiders or whatever are sentient is stupid. They are not, but someone hit that return key somewhere, and they are the ones responsible.
There is an informal agreement that robots should obey the restrictions in a robots.txt file on a site, but it’s no more than that, an informal agreement. So that’s not a good argument against the suit.
What happens if she wins this one all the way? Then, any time a site wanted to avoid being indexed, they could simply declare this on the page. The vagaries of our language being what they are, it would be hard to program a robot to be sensitive to any such disclaimers anywhere on a page. But, supposing that can be overcome, what uses might this be put to?
I suppose those might include online stores that don’t want their prices advertised elsewhere, because they are so high! It also might make it easier to protect copyrighted material. It certainly would put something of a damper, in the end, on the open and free nature of the web. But perhaps our diligent readers can think of other evil to do with such a new restriction.
Since the robots.txt format has existed (relatively) for ages, if she didn’t have one then she shouldn’t have a leg to stand on here. If she did, and the program indexed it anyway, then she should win. Not to mention that, literally speaking, nobody clicked any links in her site, since that describes a more physical action than parsing and downloading.
I suppose it comes down to how tech savvy of a judge gets assigned the case…and whether she refuses to surrender that .38 when she goes to court.
Maybe a new standard needs to be incorporated, like an HTML/XML header that will tell bots to stop. It would be ridiculous to place the onus upon the spider coders to search for disclaimers, which would also be a huge waste of time for the program.
Such a standard already exists, but isn’t as well known, so isn’t as well followed:
http://www.robotstxt.org/wc/meta-user.html
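For anyone wondering what honoring that tag involves on the spider's side, here is a minimal sketch in Python using only the standard library's `html.parser`. The class name, the helper function, and the sample page are all made up for illustration; real crawlers surely do more than this.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from any <meta name="robots" content="..."> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots directives declared in the page, if any."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page, invented for this example.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>Example</title></head><body>Hello</body></html>"""

if __name__ == "__main__":
    found = robots_directives(SAMPLE_PAGE)
    print("may index:", "noindex" not in found)
    print("may follow links:", "nofollow" not in found)
```

A page with no such tag yields an empty set, which a spider would read as "no objection stated."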
You mean like the aforementioned robots.txt file?
I’m not sure how that’s going to hold up in court though. It would be like a criminal trying to blame the homeowner that he didn’t have a sign saying “don’t rob me.” There’s no governing body for the internet that says that if you put up a robots.txt file then sites have to obey it, and that otherwise you can be spidered within reason (i.e. no abusive pulling down of large files, etc).
I’m not a lawyer, or a judge - but I am a tech-guy/software developer/etc.
I think a defence could reasonably argue that those with sufficient skill in the industry would be aware of this de-facto standard.
I keep trying to draw analogies, but nothing quite works right - the closest I can come is this:
When you’re writing a legal document, some words have a different meaning to what your typical ‘man in the street’ might assume. So, whilst someone without any qualification or skills in writing legal documents might be able to write one - the interpretation of such a document might be different from the original intention.
For that matter - someone who is prepared to put a few minutes into doing a web search, would find out that there was a standard method for preventing web indexing from taking place.
I beg to differ. All the major search engines (which is to say, anyone worth suing) support it.
Google even introduced a further refinement of <meta name='robots' content='nofollow' />: the rel='nofollow' attribute, which can be applied to individual links on a page (as opposed to the <meta> element, which applies to all links on the page).
Attempts to make money by suing search engine companies (say, for copyright violation by maintaining an illegal “copy” of your site in their index) have a long pedigree.
That’s why all the big search engines have a page where you can submit a request to have your site removed from their index. Forestalls a lot of frivolous lawsuits…
From my observation, these web crawlers are not unlike a librarian making microfilms. The Times, the Post, and every other newspaper print with the intention of receiving some money for their paper. In libraries all around the world these papers end up on microfilm, so wouldn’t a similar standard or guideline apply here? The main difference I see is that what gets put onto a publicly viewable website is just that: it’s in the realm of the public domain.
Also, with this website, you can go to the various pages without tripping the copyright message agreement as long as you jump right to a particular section. What happens when one sends a link to a particular page that doesn’t trigger the message on the copyright? Would one be bound even if they did not scroll to the bottom and see said copyright? Considering the site uses in part information gleaned from government-supplied statistics, they seem to be missing how open information can have its benefits. I can understand a desire to control one’s web content, but if this is something that has not come up in years prior, why would this person think they are different?
Also, remember the Internet != “US of A” - what happens when a non-English speaking person loads her page? Would he be bound to the “agreement”, too?
er…one cannot download her site? I guess one cannot even view the web page. I guess I am just stupid, but does not one’s computer have to download the page to display it?
If one goes to the site, you get a dialog box asking you to click OK or cancel if you agree to terms and conditions. If you click cancel, you get another dialog box which tells you to close your browser, but the front page still loads!
You aren’t given the opportunity to read the “terms of use and purchases” before clicking OK to “agree.” I’d think that this by itself makes the “contract” unenforceable.
More deeply, if you happen to know the URL of a page that’s not the top-level page, and you enter that directly, the page will load without any warning or disclaimer or dialog box at all.
I think it’s like putting a sign in front of a tree that says “don’t photograph this tree.” Whether that sign has any significance depends on whether the potential photographer is standing on the sign-installer’s property.
If you want to ensure that nobody will take a photograph of a tree, it’s your responsibility to make sure it can’t be seen off your property. On the web, there are all sorts of ways to manage access, with session management and cookies and so forth, in addition to robots.txt that was mentioned above. This site doesn’t try to do any of these.
My picture of the internet is that it’s somewhat like a public space. I draw my idea of the rights of browsers from those of photographers in public spaces. It’s not a free-for-all, but you can’t really put any limits on normal http messages that servers and browsers send to one another.
I am greatly amused by her site. Somehow you can turn off copy and paste and print. If you view the source you get this
”
”
(spaces added so the html parser on this thing didn’t eat the lines)
followed by a few hundred carriage returns….and then all the source code.
she keeps claiming they sued her first, but never says why…
There is also the question of what it takes to enter into a contract. Can a store put in small print by their door “if you enter this store you must spend a minimum of $300 before you may leave” (with the right legal words), and then start suing all the low spenders for breach of contract?
She should also go after some ISPs that cache web pages and anyone who uses a web browser that caches.
Further, it is only the first page that has that strange question pop up. If you have a direct link into the inside you don’t get that. And you can hit no and still view the site….
This whole thing is a publicity stunt. I kind of feel bad giving her more traffic to look at her site.
And her greatest sin is that whatever she did to turn off copy and paste breaks Firefox mouse gestures.
grr, the html showed up in the preview
\
\
\
hopefully that will work better
ok, one more try then i give up
DOCTYPE HTML PUBLIC “-//W3C//DTD HTML 4.0 Transitional//EN”
hppage status=”protected”
Copyright 1997 - 2007 Suzanne Shell and licensors. All Rights Reserved.
Source code not available
Amusing.
I went to the site, and tried to load the “robots.txt” file. 404 (file not found). (To see what a robots.txt file looks like, just access one…
I had no trouble viewing the page source — no mention of robots.
She seems to have no legs to stand on, having not used any of the standard methods to declare her site off-limits to spiders.
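To make the point concrete, here is a rough sketch of how a spider would read that situation, using Python's standard `urllib.robotparser`. The missing file is modelled as empty text, which is a simplifying assumption of this sketch, and the `ExampleBot` name, the example.com URL, and the two-line blocking file are invented for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, user_agent="ExampleBot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumption: a site with no robots.txt is treated like one with an empty file.
NO_RULES = ""

# Hypothetical two-line file that asks every spider to stay out entirely.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

URL = "http://example.com/some/page.html"

if __name__ == "__main__":
    print("no rules stated:", allowed(NO_RULES, URL))
    print("block-all file: ", allowed(BLOCK_ALL, URL))
```

With no rules stated the check passes, and with the two-line file in place it fails, so the standard way of saying "keep out" is about as cheap as it gets.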
Oh, and in response to BlogReader, who thinks that requiring a website owner to use a standard method of saying no to spiders is like trying to blame the homeowner that he didn’t have a sign saying “don’t rob me”: that’s ridiculous. She controls a website that sends webpages to anyone on the internet that makes properly formatted requests. A better (but still imperfect) analogy might be someone who is handing out candy to trick-or-treaters, but places a small note next to the door saying that no one is allowed to give the candy to someone else. Neither Google nor anyone else is “breaking in” to her site: it’s an open webserver on the internet.
So on a ‘parallel’ (parallel world?)
Should YouTube (Google) pay Viacom (or whoever) $1 billion,
or should Google charge Viacom for advertising their products online?
Should teenagers pay for wearing ‘Nike’ on their feet,
or should kids charge Nike for wearing their logo (like Jordan) - anyone out there making Nikes without the logo could be onto a winner for those who like the product but object to doing free advertising for a company.
You know like having windows Vista without the win logo
or an apple mac without the logo - I’m surprised there isn’t a worm coming out of that apple, yet
I find myself thinking that someone this stupid and belligerent should just stay the hell off the web.
..or at least limit themselves to posting comments on FreeRepublic.com, where they’ll have plenty of company.
It strikes me–a lawyer–that a contract needs to be agreed to by both sides. A unilateral declaration does not a contract make. That claim is pretty much off the wall, and I’m surprised it wasn’t dismissed.
It was clear that there was no “breaking and entering,” since one cannot break into that which is not closed. Theft? The contents were apparently open to the public. Racketeering? That’s off the wall.
If she wanted to restrict usage of the contents of her site, she could have done so by limiting access to the site. Apparently she didn’t, and she’s reaping what she sowed.
If she wanted to keep “unauthorized” people off her site, all she had to do was password-protect it, among other possibilities. She’s clearly a publicity hound or worse, and this judge is clearly clueless. Maybe we need a separate court system for IP/Net-related claims. With a provision to award substantial damages to defendants in a ridiculous lawsuit.
Arachnophobia | Cosmic Variance