CCL:G: Re: Science code manifesto



 Sent to CCL by: "Georg Lefkidis" [lefkidis=physik.uni-kl.de]
 Hello everyone,
 I don't know if this was brought to the list's attention already, but there
 is an additional component as well (much as most of us would not like to
 admit it). Writing source code is often the most time-consuming part of
 obtaining one's results, even if the mathematical analysis per se might be
 done relatively quickly. This means that once a code is there, the author(s)
 would like to use it over and over again (perhaps for different but similar
 systems) and of course *publish* more. The algorithmic implementation of a
 mathematical formula is part of the process. So I believe that most
 scientist-programmers would not feel very comfortable sharing their
 codes with *anonymous* referees, who in the end might even reject the
 paper, and then seeing that work appear elsewhere for other systems. Let's
 not forget that it is not only the systems, the analysis and the results,
 but also the programming itself that is worth a publication. Everyone who
 has ever programmed a data-mining algorithm for Gaussian or GAMESS output
 knows that only too well.
 Perhaps this is not of concern to great professors with huge groups and
 by now bug-free codes that have been around for decades, but to a common
 mortal (like myself) it is (since I've seen that happen, although luckily
 not to me yet).
 Another issue is the quality of the different third-party programs used. I
 read in a couple of posts below that the group's reputation is (or might
 be) decisive. In fact there is more to it than that: being able to
 evaluate the quality of the results (by comparing to experiment or assessing
 the quality of derived properties, selection rules, symmetries etc.) also
 plays a very big role. For me, a paper interpreting the importance of
 scaling a parameter, done with a less-than-best code, is at least as
 important as the best uncommented results produced by the code alone. So I
 see a potential danger there if the code itself becomes more and more
 important. Besides, good results will always get reproduced by other groups,
 even with other codes or methods (for instance theory vs. experiment etc.).
 I am not saying I am for or against those two arguments; I just want to
 mention them as possible issues that need some thought.
 Best regards
 Georg
 -----Original Message-----
 From: owner-chemistry+lefkidis==physik.uni-kl.de|,|ccl.net
 [mailto:owner-chemistry+lefkidis==physik.uni-kl.de|,|ccl.net] On Behalf Of
 Andrew Dalke dalke a dalkescientific.com
 Sent: Wednesday, October 19, 2011 11:39
 To: Lefkidis, Georg
 Subject: CCL: Science code manifesto
 Sent to CCL by: Andrew Dalke [dalke^^^dalkescientific.com]
 On Oct 18, 2011, at 4:17 PM, Adrià Cereto Massagué
 adrian.cereto,+,gmail.com wrote:
 > I don't think the manifesto is at odds with FSF. GPL'd software can be
 > sold at any price, but its source code must be available for those who own
 > the software at no further cost. And someone who has bought some GPL
 > software is allowed to redistribute it for free, so researchers using it for
 > a paper would be able to provide the software to reviewers and readers of
 > the paper at no cost.
 Abstract: How much can the paper authors ask for access to the source code?
 How much can the curators charge? What should the curator do if the curated
 software contains a license violation?
 If I write a paper which depends on software for its analysis, and others
 should have access to the software as part of effective peer review, then
 how much can I charge others to get access to the software? US $1 billion?
 The FSF says I can charge as much as I want, and that freedom is one of
 the core freedoms of free software.
 The philosophy that others need access to my source code to provide good
 peer review has the implicit assumption that I will provide the software at
 a non-prohibitive cost.
 There is clearly a tension between these two viewpoints. This manifesto says
 nothing of what that cost might be, nor even that it might be an issue.
 What should be the cost to get access to source code from the author, or
 from the curator? Does the curator get no-cost access to it as a condition
 of publication? Doesn't any limit on cost curtail what the FSF says is my
 freedom to charge as much as I want?
 Remember, the FSF encourages software freedom. I argue that scientific
 communication has overlapping but different goals.
 Scientific communication needs to have a low cost so that many people can get
 access to it. The FSF is only concerned about what happens *after* someone
 gets access to software.
 This is of course similar to (most) scientific papers. There the author
 gives the curator the right to redistribute the paper without paying
 royalties, and the curator can charge effectively any price for it. Most
 paper publishers want to maximize revenue, and therefore set high but not
 prohibitive prices. The software author may have other concerns.
 Interestingly, the software curator takes on a more difficult challenge than
 a paper curator. The authors of a paper are (with a few exceptions, usually
 well covered by fair use) the only copyright holders of that paper. The
 accompanying software, though, often has many more copyright holders. That
 can lead to problems.
 Consider the CDK chemistry toolkit. The package has many copyright
 holders, including those of the third-party libraries which it incorporates.
 A few years ago the CDK was in minor violation of the LGPL requirements of
 some of those libraries. (It omitted the credit required by those licenses.)
 This was quickly fixed once pointed out. I can easily imagine cases where
 such a violation can't be easily fixed.
 The curator takes on the risk that someone else, who is a copyright holder
 to the software in question but not a paper author, may challenge the right
 of the curator to distribute the software. How does the curator resolve the
 violation, especially if the original author doesn't want to be involved?
 Does the curator remove the software in question?
 If so, and if you insist that the software must be available in order to do
 correct peer review, then should the corresponding paper also be withdrawn?
 As I said before, these problems are solvable. I bring them up because I encourage
 people to distribute their source code along with the paper, and to be aware
 that it's not a simple, clear issue.
 				Andrew
 				dalke:+:dalkescientific.com