Follow-up to RSS and Copyright

Mikel.org has a post
that is right on — where Top10 Sources or anyone else makes a mistake
in republishing an RSS feed that is subject to a (cc) license, it
should fix that mistake fast.  (It’s possible, of course, for
someone to give license to do something beyond what the (cc) license
says, so there may be other facts in play here; but the core point
remains.)  The human-and-technical system may be fallible, but
things that slip through the cracks of the policy should be corrected promptly.  Michael also notes that Top10 Sources
itself should have an outbound (cc) license, especially to Share-Alike,
where Top10 Sources has the right to do so.  Again, that’s right,
and should (will!) be fixed. 

(Update: with thanks to Michael for
pointing it out, and to the Top10 Sources team for quick turnaround,
the changes have been made to the site.)

RSS and Copyright, circa 2006

There’s been a flurry of posts and comments on a topic I’ve long been
watching, which is the status of copyright and syndication
technologies.  It’s arisen this time in the context of a project
that I’m involved with outside of my Harvard work, called Top10 Sources
(see my disclosures
for more; it’s important to note that what you read below is
potentially colored by my obvious interests here, though I believe I’ve
been 100% consistent on the merits of this argument since I became
involved in the discussion).  

The issue, raised by a few respected members of the blogosphere, Mike Rundle, Om Malik and Adam Green among others, is whether Top10 Sources
is doing something that violates copyright or, separately, is doing
something that is outside of “the bounds of accepted aggregator
behavior” (perhaps related to the furor over splogs).  My view is
that the site is doing neither.  I believe also that this issue is
a very important one to vet fully, as a community, because this debate is
going to recur and recur until we sort it out.

What Top10 Sources does is to introduce readers who
ordinarily don’t spend all their time reading blogs to the
medium.  The idea is to offer a directory of reading lists,
available as web pages and as OPML files, as well as a quick synopsis
of what each of the chosen sites is saying.  Top10 Sources is
meant to be helpful to the RSS-offering community by directing readers
to great content, to get people to subscribe to your feeds, to get
people clicking through to blogs.  The Top10 Sources editorial
group also ends up learning about communities built around ideas. 
(Soon, Top10 Sources will enable anyone to create their own, competing
Top10 lists and upload them to the site, which will add another
dimension to the analysis below.)

On the copyright matter: what Top10 Sources does is instructive as to
whether it’s lawful. 
First, an editor, as part of an editorial team, chooses a topic, spends
a LOT of time in the community of people writing about this topic,
consults some technical metrics for the sources, and chooses 10 online
sources (defined simply as offering a feed syndicated using some flavor
of RSS) that cover that topic.  The editors periodically repeat
this process, taking one source off the list when a voice fades or
stops covering the same topic, and adding a new voice as it emerges as
important and topical.  The point is to create a human-edited
Reading List by topic, and to contribute those sources into a
human-created, limited search engine.  

As the editor compiles the site, the editor sends out an e-mail to the
person who appears to be responsible for the site, or, sometimes, posts
a comment to say that the site has been chosen.  The site renders
a list of those sites offering the feeds as direct links to the
page.  The site also subscribes to those feeds and renders them
all together on a single page.  It is this latter activity that I
take to be the concern.  

The issue raised here is whether it is a copyright violation to render
these syndicated feeds in this way.  As a matter of copyright law,
I contend that it is not.  The strong form of the pro-copyright
argument runs like this: the creator of the RSS feed retains,
automatically, all copyrights in the content in the feed and retains
all rights in its republication, use as a derivative work, and so
forth.  Given that those rights have been retained fully by the
creator of the site, the argument goes, it is unlawful for someone —
presumably in a commercial context — to republish that copyrighted
content without license to do so.  This is the Web 2.0 variant of
the argument that is litigated frequently in the context of web-based
content, with plaintiffs like the RIAA and the MPAA (in the p2p
context), the publishers (like McGraw-Hill, or Perfect 10) who are
suing Google, and the like.  

Though I don’t believe this to be the end of the story, to be fully
responsive to this argument, Top10 Sources offers to remove any feed
chosen as a top source immediately.  So far, out of the 1,500+
sources chosen to be included in one of the site’s recommended
“Reading Lists,” only two sites have asked to be removed, both owned by
the same copyright holder.  (Out of deference to them, I won’t
list them here, though the publisher is well-known for his stance on
this topic, which I respect.)  So, as an open invitation, for
anyone included in one of those lists who wishes to be taken out, just
write to terms of service, in the footer of every page, which includes a section on Copyright.  

Why this is not the end of the story is that there are several other
factors to consider.  One is a defense of fair use, which is a
four-factor test that excuses some activity that would otherwise be
unlawful.  Another is the concept of implied license: why, after
all, would someone in fact offer an RSS feed if they did not want to be
included in aggregators?  As an empirical matter, the fact that
far fewer than 1% of those that Top10 Sources has included in
aggregators in fact have complained about inclusion suggests a norm
around what people are expecting when they decide to syndicate their
content.  As a broader sample size, consider all of the
aggregators, whether public or private in the market, which now number
in the hundreds, and the fact that we have not yet had a train wreck
around the presentation of content in these web pages.  Another is
the fact that many people have written in, asking to be added to the
aggregators.

This is so because, fundamentally, RSS is ads.  As Dave Winer has written, “RSS itself is an advertising medium, if you use it correctly.”    Or, put another way
by Mitch Ratcliffe, “RSS is not content, it’s a
channel.”    The point of many public aggregators is a
place to run these ads, or a TV Guide to these new channels.  Some
people also embed ads in their feeds, presumably so that these ads will
run other places and be seen or clicked through.  Another way to put it: “People come back to places that send them away.”  (Recall what happened to the AOL walled-garden model.)

If a publisher of RSS feeds thinks of it differently, that publisher
has options.  First, the publisher can and should put a license in
the feed that says what they want people to do or not to do with their
feeds.  Creative Commons licenses, as I’ve argued on this blog,
are the way to
go — to embed them into the RSS feeds when they go out, with clear
instructions for your intent.  If you want people to run your feed
in private aggregators,
but not in public aggregators that are for-profit, to re-offer your
content just as you’ve offered it, and to attribute authorship to you,
why not add to your feed a BY-NC-SA
license?   Second,
the publisher of the source, as some have done, can make clear, on their
blogs or by
writing to those who aggregate or allow others to aggregate their
content, that they do not want it aggregated, pursuant, for instance, to the DMCA 512
procedures.  If an aggregator does not abide by those wishes, then the
publisher can seek to assert a copyright complaint via the courts or
otherwise.  But to switch the presumption, somehow, back to a
strong form of the copyright argument would do far more harm than good.
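
To make the licensing suggestion concrete, here is a minimal sketch in Python of stamping a feed with a pointer to a license.  It assumes the creativeCommons RSS module (a `creativeCommons:license` element in the `http://backend.userland.com/creativeCommonsRssModule` namespace) and a BY-NC-SA 2.5 license URL.  The sample feed and the `add_license` helper are hypothetical, made up for illustration; your publishing tool may well handle this differently.

```python
import xml.etree.ElementTree as ET

# Assumption: the creativeCommons RSS module namespace and a BY-NC-SA 2.5 URL.
CC_NS = "http://backend.userland.com/creativeCommonsRssModule"
BY_NC_SA = "http://creativecommons.org/licenses/by-nc-sa/2.5/"

# Keep a readable prefix in the serialized feed instead of ns0, ns1, ...
ET.register_namespace("creativeCommons", CC_NS)


def add_license(rss_xml: str, license_url: str = BY_NC_SA) -> str:
    """Return the RSS 2.0 document with a channel-level license element."""
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("not an RSS 2.0 document: no <channel> element")
    tag = "{%s}license" % CC_NS
    license_el = channel.find(tag)
    if license_el is None:
        # No license yet: add one. Otherwise the existing one is overwritten.
        license_el = ET.SubElement(channel, tag)
    license_el.text = license_url
    return ET.tostring(root, encoding="unicode")


# Hypothetical feed, for illustration only.
SAMPLE_FEED = (
    '<rss version="2.0"><channel>'
    "<title>Example blog</title>"
    "<link>http://example.com/</link>"
    "<description>Hypothetical feed</description>"
    "</channel></rss>"
)

licensed = add_license(SAMPLE_FEED)
print(licensed)
```

An aggregator that wants to honor the publisher's intent could read the same element back before deciding whether, and how, to republish the feed.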

I’ve been worried about this issue since early in 2003.   
Is history repeating itself?  Is the blogosphere arguing itself
right into a trainwreck of the sort that has played out over music and
movies?  Consider the world that A (prominent) VC envisions, here  and here,
wherein content is micro-chunked and syndicated.  This world
cannot emerge if every plausible copyright claim is asserted and
litigated.  Is it a “permission culture,” as Lawrence Lessig has
argued, that we want to head for, where every use of syndicated content
must be pre-approved?

OK, so maybe you don’t like the micro-chunked and syndicated version of
the future.  Even without that version of the future, the rights
in syndicated content should be clarified.  There’s no doubt that
common practice is to share the content that you are syndicating for a
wide variety of uses.  That’s the default that has emerged. 
Simple, clear, online licenses should demark those feeds that are not
meant to be consumed broadly in such a fashion, before the train-wreck
hits.

Back to Top10 Sources, I expect to take up this issue again with the
management team.  I don’t think there’s anything being
done wrong from the perspective of the law.  But we should take up
for discussion some of the
ethical issues that Mike Rundle and Om Malik raise and suggestions that Adam Green
makes about how much of a given feed the
site republishes — maybe a truncated version of the feeds is the right
thing to render.  The point is not to “steal” someone’s content,
but rather to direct readers to that person’s content after giving a
snippet of it.  Perhaps the right answer is to limit how much of a
feed’s offering is republished in the aggregator.
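
As a rough illustration of what "limiting how much is republished" could mean in practice, here is a small Python sketch that cuts an item's description down to a short plain-text snippet.  The `snippet` function and the 50-word default are my own assumptions for the sake of the example, not a description of how Top10 Sources works or will work.

```python
import re
from html import unescape


def snippet(description_html: str, max_words: int = 50) -> str:
    """Strip markup from a feed item's description and keep the first words."""
    # Crude tag removal; good enough for a sketch, not a full HTML parser.
    text = unescape(re.sub(r"<[^>]+>", " ", description_html))
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    # Mark the cut so readers know there is more at the source.
    return " ".join(words[:max_words]) + " ..."


# Hypothetical item body, for illustration only.
print(snippet("<p>A long post about <b>RSS</b> and copyright law</p>", max_words=4))
```

The aggregator would render the snippet next to a link back to the original post, which is the point of the exercise: send the reader to the source.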

The broader issue of RSS and copyright remains.  The community is
speaking, to a large extent, by creating a norm around syndication and
aggregation which is very important.  It would be a great shame if
the terrific changes being wrought by online publication, syndication,
and aggregation were to be brought down by an aggressive (and in my
view, wrong) reading of the world’s copyright laws.  As my friend
and colleague Jonathan Zittrain might say, the Internet and its
communities have a terrific way of “self-healing.”  This topic is
a great one for the Internet community to solve on its own before it
becomes a (self-)destructive fight.

(Addendum: Adam Green responds, with helpful annotation.  For the record, Adam, you were wildly overqualified for that Extension School class.  It was no doubt the right decision to have dropped.)

Two follow-ups to Berlind Tuesday at Berkman

For those who missed it, David Berlind submitted to an interview — how’s that for being a good sport when the tables are turned! — for the Berkman homepage blog.  Here’s a [snip]:

Question: You mentioned in today’s luncheon series that you had an idea for a transparent workflow for journalists, but that the software sucked. What would that look like?

David Berlind: Just make it easier to encode raw material and transmit it to people who might want to take a look.  The technologies exist.  They’re just not glued together in a way that takes the friction out. For example, a typical blogging system has all the RSS you’d ever need. But, if one source of your material as journalist is e-mail, just try moving your e-mails into the blogging system so “watchdogs” can get at that source material via RSS.  It’s doable.  But it’s so burdensome that you give up trying (especially when you think about how journalists have to work harder faster, etc.. going back to what we have to do to survive in the first question).  The last thing we need is something else that takes our precious time.  With the press of two or three buttons, you could record a phone interview and publish it into an RSS feed.  But someone has to design the software to make it that simple.

David also posted after-thoughts on one of his several blogs over at ZDNet.  Adam Green had this to say after lunch. 

A great Tuesday guest; thanks, David.  (This coming week: Dan Gillmor.)

David Berlind at the Berkman Center

We’ve got David Berlind, executive editor of ZDNet,
here as part of our Tuesday luncheon series at the Berkman Center
today.  We’re hoping to rope him into JZ’s cyberlaw class (to talk
ODF, SCO, and the like) and also for the fellows’ meeting this
afternoon.  His 75-minute tour-de-force covers a range of issues
from the software industry’s past and predictions about the future down
to specifics of certain XML formats, DRM, and open standards.  My
guess is the podcast of the lunch, though long, will be a good one.  (Audio webcast is here now, if you are tuning in at midday, Boston time, on Tuesday, January 10, 2006.)

Nart Villeneuve on Filtering in First Monday

Our truly wonderful colleague in the ONI, Nart Villeneuve,
director of technology at the Citizen Lab at the University of Toronto
(and, truth be told, a key element of the brains behind all filtering
research), has a timely new article in First Monday on filtering. 

His abstract: “Increasingly, states are adopting practices aimed at
regulating and controlling the Internet as it passes through their
borders. Seeking to assert information sovereignty over their
cyber–territory, governments are implementing Internet content
filtering technology at the national level. The implementation of
national filtering is most often conducted in secrecy and lacks
openness, transparency, and accountability. Policy–makers are seemingly
unaware of significant unintended consequences, such as the blocking of
content that was never intended to be blocked. Once a national
filtering system is in place, governments may be tempted to use it as a
tool of political censorship or as a technological “quick fix” to
problems that stem from larger social and political issues. As
non–transparent filtering practices meld into forms of censorship the
effect on democratic practices and the open character of the Internet
are discernible. States are increasingly using Internet filtering to
control the environment of political speech in fundamental opposition
to civil liberties, freedom of speech, and free expression. The
consequences of political filtering directly impact democratic
practices and can be considered a violation of human rights.”

A relevant finding for the swirling debate over China and the role of
US corporations: “Countries such as Iran, Saudi Arabia, United Arab
Emirates (UAE), Tunisia, Yemen and Sudan all use commercial filtering
products developed by U.S. corporations.”

RMacK on censorship in China is most-linked-to

Blogpulse says that Rebecca MacKinnon’s post about the Chinese blog censorship issue
is the most linked-to in the blogosphere.  With good reason. 
I was ashamed not to be one of those links earlier!  On a related
topic, here’s an article in Ethical Corporation.

Big props to Scoble for his take on it. 

Read also what MSN Spaces Product manager Michael Connolly has to say
about it: “In China, there is a unique issue for our entire industry:
there are certain aspects of speech in China that are regulated by the
government.  We’ve made a choice to run a service in China, and to
do that, we need to adhere to local regulations and laws.  This is
not unique to MSN Spaces; this is something that every company has to
do if they operate in China.  So, if a Chinese blog on MSN Spaces
is reported to us by the community, or the Chinese government, as
offensive, we have to ask ourselves: is this blog adhering to our code
of Conduct?  In many cases, the answer is ‘yes, this site is
fine’.  But, in some cases, the answer is ‘no’.  And when an
offense is found that actually breaks a national law, we have no choice
but to take down the site.” 

An extraordinarily hard ethical problem, worthy of close study.

Getting OPML

So, true confessions: it took me a while, even with Dave and other true believers as fellows at the Berkman Center and us hosting the RSS 2.0 spec, to “get” RSS and why it was going to be (and now, already is) so powerful.

I’ve been going through the same slightly slow process to “get” OPML.  The promise seems obvious enough, but I haven’t yet had an epiphany around it.  Dave’s back in Cambridge this week, and I’ve had the pleasure of hanging out with him a fair amount (including lunch with 6 Harvard College students earlier today, the usual fun romp through a series of topics), and I figured maybe I’d work on getting OPML this time. 

Five minutes ago, I had my a-ha moment on OPML.  The best expression I’ve ever seen of its power is the rendering of the OPML file of all the Top10 Sources reading lists that we’ve been compiling (on a non-Harvard project).  Dave’s got it linked now from Scripting News here.  It’s such a dynamic and striking way to organize knowledge.  There’s a critical mass issue to work out, but even with the 150 or so Top10 sites, plus the 1500 or so sources, and down to the most recent posts from those sources, it’s an amazing way to organize citizen-generated (or MSM-generated) information. 
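
For anyone else still working toward the a-ha moment, a minimal Python sketch may help show why the format is so easy to build on: an OPML file is just nested `outline` elements, so a directory of reading lists is a topic outline with one child outline per feed.  The sample file, the topic and source names, and the `reading_lists` helper below are all hypothetical; I am assuming the common convention that a feed entry carries its address in an `xmlUrl` attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML file: two topics, each with a short list of feeds.
SAMPLE_OPML = """<opml version="1.1">
  <head><title>Hypothetical reading lists</title></head>
  <body>
    <outline text="Topic A">
      <outline text="Source one" type="rss" xmlUrl="http://example.com/one.xml"/>
      <outline text="Source two" type="rss" xmlUrl="http://example.com/two.xml"/>
    </outline>
    <outline text="Topic B">
      <outline text="Source three" type="rss" xmlUrl="http://example.org/three.xml"/>
    </outline>
  </body>
</opml>"""


def reading_lists(opml_xml: str) -> dict:
    """Map each top-level topic to a list of (source name, feed URL) pairs."""
    root = ET.fromstring(opml_xml)
    body = root.find("body")
    if body is None:
        raise ValueError("not an OPML document: no <body> element")
    lists = {}
    for topic in body.findall("outline"):
        # Walk every nested outline; only those with an xmlUrl are feeds.
        lists[topic.get("text")] = [
            (node.get("text"), node.get("xmlUrl"))
            for node in topic.iter("outline")
            if node.get("xmlUrl")
        ]
    return lists


for topic, feeds in reading_lists(SAMPLE_OPML).items():
    print(topic)
    for name, url in feeds:
        print("   ", name, url)
```

Because each feed is itself a list of recent posts, the same outline can keep expanding downward, from topics to sources to individual posts.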

Wow.  I have more to learn, but this was a good day for getting into this technology.  Now imagine if we got all our course syllabi, course PowerPoints, lists of sources reached by journalists, etc. into this format…  Imagine connecting it all up, either in an open-to-the-world kind of way or even a within-the-corporate-firewall kind of way…

(Disclosure: Top10 Sources and various experiments underway with RSS Labs, part of Newsilike Media Group, in which I hold an equity interest, are working on OPML-related developments.)

Nesson and Zittrain teach Internet Law at HLS

(Oh, yeah, and Evidence.)  Now that’s a class.

Charlie says: “Come January 3, 2006, Zittrain and i open our internet
school. We start with the idea that our three week winter semester is
like camp and we are camp directors. Welcome to cyber school. We would
like our students to engage with us in an intense and absorbing
learning and teaching experience during the three weeks that is ours.

“I start each day with light calesthenics, breathing and stretching and
such, at 8:30 a.m., all welcome to come, Evidence class begins sharp at
nine and runs til noon. Z teaches Internet Law starting at 2 p.m. I
hope he will allow me to attend his class, blogging all the way.

“Cyberspace is a rhetorical place, virtual, made of message. Internet is
the wiring of cyberspace. Internet Law has so far been conceived as the
law that affects the routing of message in the space. But focus first
on the space itself and the concept of message. What new capacities
does this meta space bring? How will we learn and teach and create in
it? What new forms of legal and social organization with this new
environment permit to evolve and thrive? What will grow, what will die.”

Getting your facts straight: the UMass Patriot Act hoax

There’s a fascinating story to unravel related to the made-up account of
a UMass student who claimed to have gotten a visit from the Department
of Homeland Security as a result of having borrowed Mao’s Little Red
Book via interlibrary loan.  After a fair amount of ink on the
matter, the student admitted recently that it was a hoax.  There
are no doubt some enduring lessons to be learned for many people
involved.  The record ought to be set straight.  Hopefully,
this false narrative did not lead anyone to make a decision they
otherwise would not have made.

So, fair enough: it’s embarrassing for anyone who used what turned out
to be a phony example as an instance of the over-reach of US law
enforcement via the Patriot Act.  Those of us who think there
ought to be a carefully struck balance between civil liberties and the
empowerment of law enforcement officials to learn things about us ought
to be accountable when we make mistakes.  It’s imperative, where
life and death is literally at stake every day, that we only make
decisions based on facts.

But for those who would gloat in the aftermath of this unfortunate
event, I’d urge just the same accountability and restraint. 
There’s a rant at redstate.com on this topic in which I am implicated. 

Part of what’s said in this rant is true: in an article
in the Harvard Crimson, I said I was skeptical that the DHS would visit
a student just based on the interlibrary loan information alone, and
that, if the student had indeed received such a visit, there must
have been other factors involved. 

The author of the blog-post
cites my and another person’s skepticism of the original reports, but
goes on from there to say:

“Anything jump out at you there?  How about two Harvard law and
history professors 1) think this was ‘extremely unlikely’ and 2) have
never heard of any related incidents?  Now of course Professors
Palfrey and Gordon have no doubts, either, about the story; Palfrey
assumes there was some other reason for surveillance…”

It’s pretty remarkable that someone would write a long post about how
civil libertarians jump to conclusions based on partial evidence, and
then goes on to jump to the conclusion that I and someone else have “no
doubts … about the story.”  After stating on the record that it
was “extremely unlikely” that DHS would have done such a thing, I’d
think that my “doubts” about the story were perfectly clear.  As
one might imagine, the reporter went on to ask, “How do you think such
a thing could have happened?”  I stated that, if such a visit had
taken place (I had no knowledge one way or the other on this matter),
the officials must have had other information that would lead them to
undertake such an investigation.

There’s no room in the debate over the extension of the Patriot Act
provisions, and the crucially important balance between privacy and
investigative powers, for hyperbole.  The only way to come to a
reasoned decision is to look closely at hard facts.  When someone
gets the facts wrong, we should correct the record and reconsider
whether we made the wrong decision based on the facts.  But those
who swing the pendulum back so far as to distort the narrative in yet
another direction are just fanning the flames and doing us all a
disservice.

Meeting Dave

It was several years ago this time, over the winter holiday break, that I first met Dave Winer.  He came well-introduced
and with an extraordinary record of accomplishment behind him (and, as it
turned out, ahead of him).  We met in the unheated Berkman
conference room in a dark building, with all the sane academics
burrowed away somewhere warm and outside of abandoned Cambridge. 
Dave’s fellowship gave rise to some of the most substantial projects
the Berkman Center has ever undertaken, including an ever-growing focus
on the rise of citizens’ media (blogging, podcasting, RSS, OPML, and
the like) and the impact of the Internet on democracy, which led me to Dave
in the first place.  Fittingly, on this Monday holiday, I’m seeing
Dave again here in Cambridge.  Always a watershed.  Or, more
likely, a blizzard (like last year at WebCred, when I last had lunch
with him here).