From owner-chemistry #at# ccl.net Thu Oct 20 09:26:00 2011
From: "Georg Lefkidis lefkidis * physik.uni-kl.de"
To: CCL
Subject: CCL:G: AW: Science code manifesto
Message-Id: <-45711-111020052617-956-pYhTgfGoItD5b5LZiOwvhQ/a\server.ccl.net>
Date: Thu, 20 Oct 2011 11:25:57 +0200

Sent to CCL by: "Georg Lefkidis" [lefkidis=physik.uni-kl.de]

Hello everyone,

I don't know if this was brought to the list's attention already, but there is an additional component as well (much as most of us would not like to admit it). Writing source code is often the most time-consuming part of producing one's results, even if the mathematical analysis per se might be done relatively quickly. This means that once a code exists, the author(s) would like to use it over and over again (perhaps for different but similar systems) and of course *publish* more. The algorithmic implementation of a mathematical formula is part of the process.

So I believe that most scientist-programmers would not feel very comfortable sharing their codes with *anonymous* referees, who in the end might even reject the paper, only for the authors to see that work appear elsewhere for other systems. Let's not forget that it is not only the systems, the analysis and the results, but also the programming itself that is worth a publication. Everyone who has ever programmed a data-mining algorithm for Gaussian or GAMESS output knows that only too well. Perhaps this is of no concern to great professors with huge groups and codes that have been around for decades and are by now bug-free, but to a common mortal (like myself) it is (I have seen it happen, although luckily not to me yet).

Another issue is the quality of the different third-party programs used. I read in a couple of posts below that the group's reputation is (or might be) decisive.
In fact there is more to it: being able to evaluate the quality of the results (by comparing to experiment or assessing the quality of derived properties, selection rules, symmetries etc.) also plays a very big role. For me, a paper that interprets the importance of scaling a parameter, even if done with a less-than-ideal code, is at least as important as the best results produced by a code and presented without commentary. So I see a potential danger if the code itself becomes more and more important. Besides, good results will always get reproduced by other groups, even with other codes or methods (for instance theory vs. experiment etc.).

I am not saying I am for or against these two arguments; I just want to mention them as possible issues that need thought.

Best regards
Georg

-----Original Message-----
From: owner-chemistry+lefkidis==physik.uni-kl.de|,|ccl.net [mailto:owner-chemistry+lefkidis==physik.uni-kl.de|,|ccl.net] On Behalf Of Andrew Dalke dalke a dalkescientific.com
Sent: Wednesday, 19 October 2011 11:39
To: Lefkidis, Georg
Subject: CCL: Science code manifesto

Sent to CCL by: Andrew Dalke [dalke^^^dalkescientific.com]

On Oct 18, 2011, at 4:17 PM, Adrià Cereto Massagué adrian.cereto,+,gmail.com wrote:
> I don't think the manifesto is at odds with FSF. GPL'd software can be sold at any price, but its source code must be available for those who own the software at no further cost. And someone who has bought some GPL software is allowed to redistribute it for free, so researchers using it for a paper would be able to provide the software to reviewers and readers of the paper at no cost.

Abstract: How much can the paper authors ask for access to the source code? How much can the curators charge? What should the curator do if the curated software contains a license violation?
If I write a paper which depends on software for its analysis, and others should have access to the software as part of effective peer review, then how much can I charge others to get access to the software? US $1 billion? The FSF says I can charge as much as I want, and that freedom is one of the core freedoms of free software.

The philosophy that others need access to my source code to provide good peer review carries the implicit assumption that I will provide the software at a non-prohibitive cost. There is clearly a tension between these two viewpoints. This manifesto says nothing of what that cost might be, nor even that it might be an issue.

What should be the cost to get access to source code from the author, or from the curator? Does the curator get no-cost access to it as a condition of publication? Doesn't any limit on cost curtail what the FSF says is my freedom to charge as much as I want?

Remember, the FSF encourages software freedom. I argue that scientific communication has overlapping but different goals. Scientific communication needs to have a low cost so that many people can get access to it. The FSF is only concerned with what happens *after* someone gets access to software.

This is of course similar to (most) scientific papers. There the author gives the curator the right to redistribute the paper without paying royalties, and the curator can charge effectively any price for it. Most paper publishers want to maximize revenue, and therefore set high but not prohibitive prices. The software author may have other concerns.

Interestingly, the software curator takes on a more difficult challenge than a paper curator. The authors of a paper are (with a few exceptions, usually well covered by fair use) the only copyright holders of that paper. The accompanying software, though, more often has many more copyright holders. That can lead to problems.

Consider the CDK chemistry toolkit.
The package has many copyright holders, including those of the third-party libraries which it incorporates. A few years ago the CDK was in minor violation of the LGPL requirements of some of those libraries. (It omitted the credit required by those licenses.) This was quickly fixed once pointed out.

I can easily imagine cases where a violation can't be easily fixed. The curator takes on the risk that someone else, who is a copyright holder of the software in question but not a paper author, may challenge the right of the curator to distribute the software.

How does the curator resolve the violation, especially if the original author doesn't want to be involved? Does the curator remove the software in question? If so, and if you insist that the software must be available in order to do correct peer review, then should the corresponding paper also be withdrawn?

As I said before, these problems are solvable. I bring them up because I encourage people to distribute their source code along with the paper, and to be aware that it's not a simple, clear issue.

Andrew
dalke:+:dalkescientific.com