RSS and Copyright
Posted January 16th, 2005 @ 12:14pm by Erik J. Barzeski
Robert Scoble has broached the topic of Copyrighting RSS in his usual bumbleheaded way. Essentially, a man named Martin Schwimmer does not like having his posts republished via Bloglines.
Martin summarizes his point succinctly:
This website is published under a Creative Commons license that allows for non-commercial use, provided there is attribution. Commercial use and derivative works are prohibited… in my view, Bloglines' reproduction of my site is a commercial derivative work. (Read for more.)
Scoble's usual bumbleheaded response? Here:
The real trick here is: if you don't want your full posts reprinted somewhere else, don't put them into RSS. That's one reason most commercial sites don't include full content in their feeds.
Derrr… no.
Imagine if Bloglines eventually does begin selling advertisements. Bloglines will immediately begin breaking copyright (and Creative Commons) law and license. This particular blog is licensed under an "Attribution-NonCommercial-ShareAlike" policy. Bloglines clearly advertises their copyright as "Copyright © 2003-2004 Trustic, Inc. All rights reserved" and does so even on pages displaying my content (example).
Uhhhhh… no.
I have not written to Bloglines to have them remove my feeds (it'd probably render the link above 404), but it's clear that they're violating the law. If they begin putting ads or charging for their service, they'll be violating even more laws.
Scoble asks "what's different about Bloglines than, say, NewsGator [or PulpFiction]?" Well, here's one: PulpFiction sure as hell doesn't put its own copyright notice on information generated elsewhere. Also, PulpFiction doesn't "republish" content - it merely consumes it. Pretty clear to me.
Update: A site named "MacBlogs.org" seems to be in even further violation of the law. It not only republishes (images too), but it does so without any copyright whatsoever, meaning it falls under generic copyright. As of right now, I don't have an issue with this: they're not advertising and I don't advertise on my blog. I can imagine that others may have a problem with this, though, and these sites' blatant disregard for copyright or CC licenses is only going to land them in hot water sooner or later.
Posted 16 Jan 2005 at 1:13pm #
I disagree with you guys here, I think. BlogLines should change their notice to say something like 'Page generated by BlogLines, content by NSLog();' or something.
What if PulpFiction started showing advertising next to blog items? Wouldn't that be just as bad as if BlogLines did it?
But then, PulpFiction is $25. So one could argue that you're already making a profit off of someone's content. Except, why would one publish a feed if not to have it read in aggregators?
If the only difference here is that one is a web based application and the other is a desktop application then I really think there is no difference at all.
Posted 16 Jan 2005 at 1:32pm #
No, they really couldn't.
There's a big difference between PF and BL. Bloglines (and MacBlogs) stamp everything with their own copyright and stand to profit from the re-publication of my writing. Quoting in full is not "fair use."
PulpFiction (et al) don't republish, they consume. They don't "quote." And they sure as hell don't put their own copyright or license on my content, which is already governed by a license/copyright of my choosing.
Posted 16 Jan 2005 at 1:39pm #
Aren't these sites just providing users with the Application (reading RSS feeds), just within a different display medium (X/HTML + CSS as opposed to a Cocoa Application)? I bet you'll find a "Copyright" somewhere in PulpFiction.
It's worth noting that my LJ friends page pulls in others' content including RSS feeds, combines them, orders them, and then (by way of a shared template) slaps Copyright 2002-2004 $ME at the bottom. Now, of course, just slapping that on there doesn't make it true: I have no right (and of course no desire) to claim Copyright over others' work.
Bloglines is just an aggregator. Happens to be a public one, but that's not really the point.
Posted 16 Jan 2005 at 2:13pm #
Patrick, go ahead and find the copyright in PulpFiction. Guess where it is? In the credits. It's presented there with other copyright information for some libraries, etc. Clearly we have a right to do so, and clearly we're not labelling someone else's content as being copyrighted to us.
Seemingly, you've managed to completely miss the point. Let's try again: Bloglines (et al) consume and re-publish. Desktop aggregators consume (without the publishing side). In re-publishing, they brand my content as being copyrighted, they violate my chosen CC license, and they stand to profit from my content by quoting in full. Bloglines is not just an aggregator.
It's really pretty simple.
Posted 16 Jan 2005 at 2:20pm #
Of course, the really simple answer is that BL or whoever needs to just put a little box on the page with the entries that contains the contents of the feed's copyright tag. That would solve everything, really.
And perhaps either modify robots.txt to ban Bloglines (and have it respect that) or have Yet Another RSS Tag that says it should not be republished by a web-aggregator (and have it respect that).
It's a conundrum, certainly. But in this day and age of ignoring copyrights in the digital medium, it was only a matter of time before it hit the small publishers as well.
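The "little box with the feed's copyright tag" idea can be sketched in a few lines of Python. This is only an illustration, not how Bloglines or any other aggregator actually works. It assumes an RSS 2.0 feed, where the channel may carry an optional copyright element; the sample feed, the example.com URLs, and the attribution_line function are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, invented for this sketch.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <copyright>Copyright 2005 Example Author. Some rights reserved.</copyright>
    <item><title>First post</title><link>http://example.com/1</link></item>
  </channel>
</rss>"""


def attribution_line(feed_xml: str) -> str:
    """Build the text a web aggregator could show next to republished entries."""
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        raise ValueError("not an RSS 2.0 document: no <channel> element")
    title = channel.findtext("title", default="(untitled feed)")
    # <copyright> is optional in RSS 2.0, so it may be missing entirely.
    notice = channel.findtext("copyright")
    if notice is None:
        return f"Content from {title}. No copyright notice in feed."
    return f"Content from {title}. {notice.strip()}"


if __name__ == "__main__":
    print(attribution_line(SAMPLE_FEED))
```

Showing the notice only addresses the labelling complaint. Whether republishing is permitted at all still depends on the license the author chose.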
Posted 16 Jan 2005 at 2:40pm #
Hey Erik,
I subscribe to your feed in bloglines. Leaving aside the 'slap a copyright on it' issue, would it be fair to say that you have no problem with the 'consumption' side of bloglines that I use? But rather the fact that they allow non-logged in users to view your content in full, including the search crawlers?
Posted 16 Jan 2005 at 2:53pm #
Michael brings up another point that I didn't have room to mention above: the trawling of bloglines by Google, Yahoo, etc. In no way should a search that turns up my content provide a bloglines URL. That should never happen. Hiding the actual content behind a login seems like an awfully simple way to solve that problem, yes.
I personally don't have (much of) a problem with Bloglines at this time. If they start selling ads I'll have a big problem with them. It's my understanding that they're breaking the law currently as well.
As the author of a "consumption" application, I have no problems with consumption at all. I do have a problem with "re-publication," particularly when re-publication is paired with broken laws, violations of copyright, and profit generated from the protected works of others.
Scoble says "if you don't want your full posts reprinted somewhere else, don't put them into RSS." That's like saying "If you don't want people to scan and OCR a book you've written, then republish under their name, don't write a book."
Get frickin' real, Robert! Perhaps working for Microsoft has dimmed this always-dim man's concept of "right and wrong" or "the law."
Incidentally, I feel that it's worth noting my RSS and Atom feeds don't include my Creative Commons license, meaning they fall under the far more general and restrictive Copyright laws.
Posted 16 Jan 2005 at 2:59pm #
Displaying the original copyright doesn't fulfill the requirements listed by the CC license. This is akin to photocopying a book in the library and selling it on the front steps.
Also, it is not the responsibility of the original publisher to provide a mechanism to prevent unauthorized use.
PulpFiction is an application that acts in the role of the client, much like a browser or email client. It doesn't republish anything; it fetches and displays data. In contrast, bloglines is a website which takes blog entries in the same way as PulpFiction, but publishes them on their own site.
Bloglines is required under the CC to give credit to the original authors, not use it for commercial purposes, and ensure that any additions or modifications are available under the same license.
From this I would gather that republishing it is allowed, provided it is not done for profit, that credit for the work is provided where credit is due, and it is made clear that you can republish any work added to the original under the CC.
Bloglines would therefore be required, IMHO, to display the original copyright, not use it commercially, and make their aggregator service available under the CC at least for all blogs under that license.
Posted 16 Jan 2005 at 3:11pm #
Apparently dim bulbs like to gather around other dim bulbs. Scoble has links to tens of other blogs who have commented on this, and many of the posts:
are unnecessarily condescending towards the man (Martin) who started it all simply for asking that his feed be removed from Bloglines
are completely in the dark about this thing called "the law."
simply don't get that re-publication is not the same as consumption.
(…and on and on and on)
I don't get it. There are a few lit bulbs in the list, but the majority reflect Scoble's complete ignorance and stupidity.
Posted 16 Jan 2005 at 3:56pm #
Republication is the exact same thing as consumption, it's just being done on the server side instead of on your own computer. From a user standpoint, it's all in how you view your application. If a server-side aggregator like bloglines used an XMLHttpRequest to grab an RSS feed dynamically and XSLT to style the appearance at runtime, the page would look exactly the same yet it'd be technically a lot different.
Bloglines skews usage stats, but if they fix their copyright to refer only to the parts of the page they've created, I don't see much of an issue. I think there's some conflation of media licensing with the technology used to retrieve content going on.
Posted 16 Jan 2005 at 4:22pm #
Mike, republication is not the exact same thing as consumption. If you have a web site that has a bunch of content on it and I copy all of the content for a book and sell it at Borders, you're not going to be happy with me. I'm willing to bet I'd be getting hit with a cease and desist plus a lawsuit. Just because the republishing is happening on the web in digital form does not make it ok.
Bloglines is also not just consuming the data. With the bloglines api they are redistributing the data to other 3rd parties. Any way you look at it, they are redistributing the data, not acting as a simple view of the data like PF.
Posted 16 Jan 2005 at 9:15pm #
Having read Scoble's post and this discussion here, there are some things that I don't quite understand here, Erik:
In what way is searching for something on your site with Google different from searching for something on your site with Bloglines? Moreover, in a comment above you say: "...and stand to profit from the re-publication of my writing", but neither do I see a business model behind their service nor do I see a business model behind your personal blog (apart from a little promotion of your own software now and then). So it's not as if Bloglines were taking away any source of income from you or other people with personal diaries (which is what a blog stands for IMHO because I feel that calling them "journalism" like Scoble and others do would be a little overrated). What am I missing here?
Posted 16 Jan 2005 at 9:37pm #
Google doesn't re-publish my feed content. They show, what, 10-12 words or so? And no images. Also, Google respects my robots.txt file - something Bloglines (et al) do not. Furthermore, Google scans my Web page, not my feed. A few other reasons exist, Ralph. They're quite different.
What are you missing? The fact that many bloggers do make money from their blogs and would utterly despise someone else making money from their work. Read the "book" example I gave above. Despite your definition of blogs as "personal diaries," there are several out there that do exist to make money.
Posted 16 Jan 2005 at 9:58pm #
[Ed: comment deleted. In court, the judge would have said "asked and answered, move on counselor."]
Posted 17 Jan 2005 at 2:04am #
It isn't an issue about profit. You can choose not to make a profit on your copyrighted work and still have valid reasons not to permit unauthorized publication. Just because you get it for free doesn't mean you have any rights to it.
Posted 17 Jan 2005 at 7:42am #
Eric: "...but it's clear that they're violating the law."
Microsoft takes in your HTML, reconfigures it, strips parts, modifies the display, and sends it on to be viewed by hundreds of thousands of paying MSNTV users. Just as Earthlink's proxy compresses the JPG of your company logo, modifying its appearance while allowing dial-up users to download it faster. Corporate firewalls examine your content, strip anything they don't like, and cache the results, doling them out to hundreds (or in some cases, thousands) of users.
The Web's entire infrastructure violates any self-servingly rigid interpretation of copyright, every second of every day. The caches, proxies, and firewalls of the world take in content, archive it, alter it, and make it available to others. That's the way this whole thing was designed, that's why it works. Anyone who doesn't like it is free to opt out of the Web. (Or begin panting breathlessly about suing Cisco, Microsoft, most of the Fortune 500, and so on.)
Ditto for syndication and RSS. The technology is designed to enable services like Bloglines to do precisely what services like Bloglines do. If you don't like that, RSS is not for you.
"Google doesn't re-publish my feed content."
Ummm... think again. Do a Google search for your site and click the "cached" link... Google does quite a bit more than simply republish some of your text. It republishes your layout, too.
"Also, Google respects my robots.txt file - something Bloglines (et al) do not."
A marginally useful point. I say "marginally" because robots.txt was not designed to restrict single page accesses by applications under human control... its purpose was to limit the automated link crawling of spiders. Bloglines isn't obligated to act as if they're Google, or even YahooFeedSeeker.
But it probably wouldn't hurt if aggregators (including desktop apps) supported robots.txt. It would give site owners another tool in controlling the RSS Bandwidth Monster... they could just disallow the most popular clients during peak traffic periods.
"In no way should a search that turns up my content provide a bloglines URL."
Another arguably valid point, and one easily addressed by "noindex,follow" meta tags. The content isn't indexed by Google, but the author gets the full benefit of the aggregator's Google juice. Unfortunately, Bloglines doesn't *appear* to be including such meta tags at this time... their bad.
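For anyone wondering what the "noindex,follow" suggestion would look like in practice, here is a minimal Python sketch of a page template that a web aggregator could use. It is an assumption-laden illustration, not Bloglines' actual markup: the aggregator_page function and its parameters are hypothetical, and the entry body is assumed to be HTML that has already been sanitized.

```python
from html import escape


def aggregator_page(entry_title: str, entry_html: str, source_url: str) -> str:
    """Render a republished entry that search engines should not index.

    entry_html is assumed to be already-sanitized HTML from the feed.
    """
    safe_url = escape(source_url, quote=True)
    return "\n".join([
        "<html>",
        "<head>",
        # noindex: keep this copy out of search results.
        # follow: still let crawlers follow the link back to the author.
        '<meta name="robots" content="noindex,follow">',
        f"<title>{escape(entry_title)}</title>",
        "</head>",
        "<body>",
        f'<p>Originally published at <a href="{safe_url}">{safe_url}</a></p>',
        entry_html,
        "</body>",
        "</html>",
    ])


if __name__ == "__main__":
    print(aggregator_page("First post", "<p>Hello.</p>", "http://example.com/1"))
```

The meta tag is only a request to crawlers. It works for search engines that choose to honor it, which the major ones do.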
Posted 17 Jan 2005 at 9:24am #
To pick up on one small point, Does PulpFiction support robots.txt?
Bloglines has no need to, according to robotstxt.org, which defines a robot this way: "A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced." There is no automatic traversal involved.
Rod.
Posted 17 Jan 2005 at 10:29am #
Rod, Roger (who still can't figure out how to spell my name), two words for each of you.
1. CONSUMPTION
2. RE-PUBLICATION
They're not the same.
PulpFiction doesn't respect robots.txt because PulpFiction doesn't re-publish. It merely consumes. Bloglines re-publishes (takes credit for, makes money from, etc.) and does so without the creator's permission.
Roger insists that I "think again" that Google doesn't re-publish my feed content. I'd like Roger to find the spot on Google where my feed content is re-published. Not my Web content - my feed content.
I'd also like Roger to note that if I wanted to block Google, I could. The power still rests in my hands to control how my content is handled. I still have the ability and power as creator of copyrighted material to say what Google can and cannot do. No such power exists with Bloglines - they automatically assume they're free to rape creative content for their own benefit, regardless of license or copyright.
Consumption. Re-publication. Big difference, guys.
Posted 17 Jan 2005 at 11:44am #
If you buy the book "Cult of Mac" chances are you have seen some of the stories somewhere before, perhaps surfing or reading a newsfeed aggregator.
In this situation, the content has been repackaged for profit.
If BlogLines starts selling ads, someone should buy an ad on BlogLines, selling their newsfeed reader, educating those reading BlogLines that they can be their own editor. Or buy an ad touting Creative Commons.
Engadget pumps ads into their feeds, making the content LESS usable, for keyword searching for example.
Erik is right, Bloglines is wrong.
I don't even want to use a service like Blogger to host a possible blog I might fancy, because it seems so tied in to a possibly for profit service (Google? Yahoo?) that I am afraid it will be caged off somehow.
Posted 17 Jan 2005 at 3:05pm #
Erik - read the link in my post. There is nothing in the robots.txt spec that specifies *anything* about how the data should be used.
If I wrote a spider that downloads every page on this site, ROT13s it, then pipes it to /dev/null, you'd (rightfully) be pissed off if I ignored robots.txt. I'd be wasting your bandwidth. That's what robots.txt is there to protect.
Robots.txt is not designed for the case where a user-agent (whether fat client on a PC, or a service on a remote server) requests one file on a scheduled basis.
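Whatever robots.txt was designed for, a fetcher that chose to honor it could do so cheaply. The sketch below uses Python's standard urllib.robotparser module. The robots.txt contents, the "ExampleAggregator" user-agent token, and the example.com URLs are invented for illustration, and nothing here describes what any real aggregator does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named aggregator is kept away from the feed,
# everyone else is allowed everywhere (an empty Disallow means "allow all").
ROBOTS_TXT = """\
User-agent: ExampleAggregator
Disallow: /feed/

User-agent: *
Disallow:
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    feed = "http://example.com/feed/rss.xml"
    print(may_fetch(ROBOTS_TXT, "ExampleAggregator", feed))
    print(may_fetch(ROBOTS_TXT, "SomeSearchBot", feed))
```

Note that this only answers "may this agent request this URL?" As the comment above says, the file has no vocabulary for what is done with the data afterwards, so it cannot express "read but do not republish."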
Posted 17 Jan 2005 at 3:26pm #
Umm, Rod, your point being… what? Cuz my point was that robots.txt allows me to stop Google from indexing my site if I wanted to. No such option exists with Bloglines. Additionally, search engines are inherently different than sites like Bloglines. So too are Web pages vs. XML feeds.
But what kind of argument can I expect from someone who emails me to say:
Mmmmm… I'll pass.
Posted 19 Jan 2005 at 12:05pm #
[Ed: author makes argument "if you don't want people to scan your books and resell them online, don't write books." Comment removed. This (stupid) point has been offered before.]
Posted 23 Jan 2005 at 8:32pm #
RSS copyright controversy still raging
On my way home I turned on the Tablet PC at Oakland International Airport. WiFi here costs $7 on Wayport. I see that the RSS copyright story is still raging. Here's a few I just saw come through my aggregator....
Posted 25 Sep 2006 at 10:43am #
Over the weekend, it came to my attention that a site (I won't link to them, or even mention their name) was re-publishing content from The Sand Trap's RSS feed, sans author and copyright information. The summary text, often with...