CCL: Science code manifesto



 Sent to CCL by: Andrew Dalke [dalke^^^dalkescientific.com]
 On Oct 18, 2011, at 4:17 PM, Adrià Cereto Massagué
 adrian.cereto,+,gmail.com wrote:
 > I don't think the manifesto is at odds with FSF. GPL'd software can be sold
 > at any price, but its source code must be available for those who own the
 > software at no further cost. And someone who has bought some GPL software is
 > allowed to redistribute it for free, so researchers using it for a paper would
 > be able to provide the software to reviewers and readers of the paper at no
 > cost.

 Abstract: How much can the paper authors ask for access to the
 source code? How much can the curators charge? What should the
 curator do if the curated software contains a license violation?
 If I write a paper which depends on software for its analysis,
 and others should have access to the software as part of effective
 peer review, then how much can I charge others to get access to
 the software? US $1 billion?
 The FSF says I can charge as much as I want, and that freedom
 is one of the core freedoms of free software.
 The philosophy that others need access to my source code to
 provide good peer review has the implicit assumption that I
 will provide the software at a non-prohibitive cost.
 There is clearly a tension between these two viewpoints. This
 manifesto says nothing of what that cost might be, nor even
 that it might be an issue.
 What should be the cost to get access to source code from the
 author, or from the curator? Does the curator get no-cost access
 to it as a condition of publication? Doesn't any limit on cost
 curtail what the FSF says is my freedom to charge as much as
 I want?
 Remember, the FSF encourages software freedom. I argue that
 scientific communication has overlapping but different goals.
 Scientific communication needs to have a low cost so that many
 people can get access to it. The FSF is only concerned about
 what happens *after* someone gets access to software.
 This is of course similar to (most) scientific papers. There
 the author gives the curator the right to redistribute the paper
 without paying royalties, and the curator can charge effectively
 any price for it. Most paper publishers want to maximize revenue,
 and therefore set high but not prohibitive prices. The software
 author may have other concerns.
 Interestingly, the software curator takes on a more difficult challenge
 than a paper curator. The authors of a paper are, with a few exceptions
 that are usually well covered by fair use, the only copyright
 holders of that paper. Often, though, the accompanying software has
 many more copyright holders. That can lead to problems.
 Consider the CDK chemistry toolkit. The package has many
 copyright holders, including those from third-party libraries
 which it incorporates. A few years ago the CDK was in minor
 violation of the LGPL requirements of some of those libraries.
 (It omitted the credit required by those licenses.) This was
 quickly fixed once pointed out. I can easily imagine cases
 where it can't be easily fixed.
 The curator takes on the risk that someone else, who is a
 copyright holder to the software in question but not a paper
 author, may challenge the right of the curator to distribute
 the software. How does the curator resolve the violation,
 especially if the original author doesn't want to be involved?
 Does the curator remove the software in question?
 If so, and if you insist that the software must be available in
 order to do correct peer review, then should the corresponding paper
 also be withdrawn?
 As I said before, these problems are solvable. I bring them up because
 I encourage people to distribute their source code along with
 the paper, and I want them to be aware that it's not a simple, clear issue.
 				Andrew
 				dalke[*]dalkescientific.com