Dr. Alejandro Pisanty, Secretary of the Advisory Council on Computing, UNAM,
Universidad Nacional Autonoma de Mexico (UNAM)
Ciudad Universitaria, 04510 Mexico City DF MEXICO
apisan@servidor.unam.mx
========================================
Survey of Computational Chemistry List (summer 1994).
(See also: http://www.elsevier.nl:80/section/chemical/trac/emltrac.htm)
How the survey was conducted.
The survey on the use and impact of the Computational Chemistry List was
conducted through the List itself. A special "Userid" was established at UNAM
for this sole purpose; it was exempted from the quota system for file space and
permanence. The questions asked are grouped, in order to characterize the users
in several ways, and find out both how they use CCL and how it has affected
their work. Some questions are also directed to rating the list, especially in
comparison to other services, and to find out what some possible improvements
might be. A part of the survey addresses the users' expectations for the uses
of computer networks, the evolution of publication, etc.
Some questions are at least partially redundant. Others may seem so on
first reading, but inspection of the replies shows the soundness of their
inclusion; a specific instance is the question on the computer resources used
for reading the list, followed by one on the computers used for work. For many
users, the computer used for reading the list is more modest than those usually
employed for work.
A pair of questions also attempt to round out the survey. These ask for
possible questions that were omitted in the survey, and for general thoughts on
the list. The first one was particularly important during the pilot, and
valuable suggestions were taken from it. The second one provides useful
insights, complementary to the other questions which contribute an evaluation
of the list. Both will be discussed later on in more detail.
Our general impression, especially after analyzing the responses to the
survey, is that we sent out a verbose but otherwise useful survey. Some users
commented that the survey was too long, and in fact many did not answer all of
the questions. Even so, close to 15% of the estimated list size responded to
the survey, in most instances with informative thoughts which reflect a long
time dedicated to answering.
A pilot test was conducted with an early version of the survey. This was
broadcast throughout the list; the first 50 responses were studied in order to
glean from the respondents' comments what to adjust before putting out the
final form. Also, the prior experience with user surveys at the Ohio
Supercomputing Center was consulted.
We tried to cover the main issues observed in several years of use (APB)
and administration (JKL) of the CCL. These concern mostly if and how the CCL
has affected the way research is conducted, with further consideration given to
its impact on teaching. Sections were added to characterize the users, both
out of the need to know them better and to provide some statistical checks on
how representative a sample of the CCL the respondents provide.
Two forms of posting the survey were used. One was a direct broadcast to
the CCL of the survey questions in ASCII format. Care was taken to avoid
overflowing line lengths of some types of computers where the survey could be
answered. The format was such that it allowed the users to type in their
answers right after the questions and to send back a filled-in form by
electronic mail with minimal trouble. This was done even at the price of
making the analysis of the responses a labor-intensive task. We believe that
in this way we kept to the best CCL tradition, departing from it only slightly
in that the survey is a large document. All mail was directed to one of the
authors (APB) in
Mexico City. Replies will only be known to JKL once this writing is finished
(although the analysis of the responses will be continued), and then only in
anonymous form. For a large part of the responses, the only identification
available from the start is the "userid" on the computer from which the
respondent answered the survey.
The second form of posting the survey was through a World-Wide-Web (WWW)
page which was designed at the Ohio Supercomputing Center. For the responses
that came in through this route, it made the task of analysis much easier,
since it provided one of the authors (APB) with only a one-line summary of each
question and the corresponding reply in full text.
The survey was posted to the list in June, 1994, close to the beginning of
the Summer vacation of many academic institutions in the United States and in
Europe. Three further prompts for answers were posted in a period of five
weeks. The whole survey questionnaire was reposted once. The analysis made here
was performed on those responses received from the date of posting until the
end of August 1994.
The list has over 2,000 subscribers. The actual number of readers of the
list is hard to assess, as many of these subscribers are in turn listservers or
other kinds of exploders. Close to 300 people answered the survey (289, with
some providing response to only a few of the questions). This size allows for
statistical significance of many of the responses; to further assess the
reliability of the survey, some measures of match between the respondents and
the total population of subscribers were made.
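
The "close to 15%" response rate quoted earlier can be checked from these
figures. The following is a minimal Python sketch of that arithmetic. It
treats the stated "over 2,000" as exactly 2,000 subscribers, which is an
assumption; since the true list size is larger, the result is an upper bound
on the rate.

```python
# Response-rate check using the figures stated in the text.
RESPONDENTS = 289                 # replies received (from the text)
SUBSCRIBERS_LOWER_BOUND = 2000    # assumption: "over 2,000" taken as 2,000

# Dividing by a lower bound on the list size gives an upper bound on the rate.
response_rate_upper_bound = RESPONDENTS / SUBSCRIBERS_LOWER_BOUND

print(f"Response rate is at most {response_rate_upper_bound:.1%}")
```

The true number of readers is unknown because some subscribers are themselves
exploders, so this figure describes subscribers, not readers.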
Country where the list is read by respondent.
The users of the list are located in at least 43 countries (as of March
1994). The geographical distribution of respondents closely matches that of
the users. This is the most reliable indicator we have that the sample is
representative. We have no way of knowing whether the structure of the
respondents by age, position, occupation, etc. matches the total population as
closely. However, some assessments will be made and, where possible, qualified
on specific points.
By far the largest share of respondents and subscribers is in the US, which
accounts for roughly half the list. It is followed by Germany, the UK and
Canada, each comprising roughly 10%. The rest is widely scattered, mostly in
Europe. In asking for the country where the respondent reads the list, we
expected to see significant signs of people working in transnational firms
reading the list from abroad. There is only scant evidence of this in the
survey (fewer than 5 out of the 300).
Type of organization the respondent belongs to.
Most of the respondents work in academic organizations (197). Of these,
126 did not give further specifications; out of their electronic addresses one
can observe that most are universities. Further, 63 of these 197 respondents
specified their belonging to universities, and the rest mentioned
supercomputing centers (2), four-year colleges (2), research institutes, NCSA,
secondary school and medical school (one each).
The next-strongest group belongs to for-profit organizations (55). The
vast majority (40) went unspecified. The rest are aerospace, R&D,
pharmaceutical, agrochemical research, biotechnology, modeling software
developer, systems consultant, and publisher, as well as industrial research
laboratory and unspecified industry.
This group is followed by those in government (18), of which 13 were
unspecified, 4 research, and 1 a military lab. One subscriber wrote from a
military institution in the US stating that he had made several attempts to
submit his responses to the survey and found himself unable to do so because
of the security arrangements (a "firewall") of his organization's system.
The group reporting itself as belonging to non-profit organizations (9)
also gave no further specification, except for academic, institute, and
chemical information provider.
Six respondents are self-employed consultants.
It should be noted that some responses point to more than one type of
organization and have been counted for all types stated explicitly.
Time since respondents first subscribed.
The period of time during which people have subscribed to the list up to
the date of the survey was distributed as follows: 75 people have belonged to
the list for a few months, or less than a year, while 173 have belonged to the
list for more than one year. Out of these, 64 have subscribed for more than
two years and in fact a subgroup has been "on" for three years or more. It
must be pointed out that not all responses are specific on this point, many
because of only a vague recollection of when the respondent first subscribed,
and so the responses had to be sorted carefully. One should be able to check
these data against the subscriber list, but its records have been kept in such
a way as to make this check only roughly approximate at best.
Type of position held by respondent.
The positions people hold show a great variety. The largest group is made
up of 77 graduate students (also calling themselves PhD students, doctoral
students, postgraduate students, etc.).
The second largest group was made up of 71 Research Scientists and similar
positions (like research professor, principal researcher). Further, at this
level, 24 respondents characterize themselves as professors, full professors,
lecturers, etc.; altogether, these 95 people may be characterized as senior
academic staff with either full research employment or major research
responsibility.
They are followed by 42 "postdocs". After these, there are 24 assistant
professors, researchers and lecturers, and 17 associate professors and
researchers, i.e., 41 junior academic staff active in research (83 counting the
postdoctorals). A large but varied group is made up of 21 computer scientists,
systems engineers and managers, etc., the group we would characterize as
computer professionals. 7 are in management, marketing, and sales, 4 are
undergraduate students, and up to 5 belong in other groups. Here, a larger
number than in the previous paragraphs described themselves within more than
one category.
Field of activity of respondent.
The question regarding fields of activity produced an almost bewildering
variety of self-descriptions, as well as a large number of respondents under
more than one characterization. Those appearing most frequently are quantum
chemistry (82), computational chemistry (52), molecular modeling (26), physical
chemistry (including several sub-fields) (19), drug design, QSAR, and pesticide
formulation (taken together) (16), organic chemistry (15), computation,
computational science, and software (taken together) (14), statistical
mechanics and molecular dynamics (grouped together) (14).
A regrouping of the above with those less represented shows 23 respondents
declaring a field of activity in the biosciences. Four respondents state that
they work in fields related to materials. A large number of other designations
received fewer than four responses.
Type of work: experiment, theory, computation, software development.
Recipients of the questionnaire were asked to describe their work as
theoretical, computational or experimental, and whether they develop software,
in a single question. This formulation led to a slightly infelicitous result
as will be seen below.
142 respondents described their work as computational, 56 as theoretical
and computational, 32 as theoretical, 20 as experimental and computational, 18
as experimental, 6 as belonging to all three categories, and 5 as theoretical
and experimental. Twelve respondents gave no usable response.
One answer pointed to a pitfall in this question: "computational,
although I do not understand your distinction between theoretical and
computational". Indeed, there may be some ambiguity in what people understand
by theoretical and by computational before they respond. Inspection of
related questions shows that all those calling their work computational mean
that computing is their main task and their main approach in research.
"Theoretical" in the context of present-day chemistry leans heavily toward
"computational", though it implies more emphasis on model-building and on
analysis than "computational" does.
Even if this ambiguity went unresolved, an important fact which needs to be
stressed is that altogether 49 respondents included "experimental" as at least
a partial characterization of their work. This means that 17% of the
respondents
and thus presumably around a sixth of CCL users are directly performing
experimental work. We believe this has profound implications for understanding
the way chemical research is evolving, namely, that computational and modeling
tools are progressively becoming commonplace.
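
The figure of 49 respondents and the resulting 17% can be reproduced from the
category counts given above. The sketch below is only an illustration of that
arithmetic; it assumes the base for the percentage is the 289 respondents
reported for the survey as a whole, and the dictionary layout is our own.

```python
# Work-type counts as stated in the text, keyed by the labels each
# respondent chose. The data structure is illustrative, not the authors'.
work_type_counts = {
    ("computational",): 142,
    ("theoretical", "computational"): 56,
    ("theoretical",): 32,
    ("experimental", "computational"): 20,
    ("experimental",): 18,
    ("theoretical", "computational", "experimental"): 6,
    ("theoretical", "experimental"): 5,
}

TOTAL_RESPONDENTS = 289  # assumption: percentage base is all respondents

# Count every respondent whose self-description includes "experimental".
experimental = sum(
    count for labels, count in work_type_counts.items()
    if "experimental" in labels
)
experimental_share = experimental / TOTAL_RESPONDENTS

print(experimental, f"{experimental_share:.0%}")
```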
Also, 218 people altogether consider themselves engaged in computational
work, alone or combined, i.e., 75% of the List is made up of at least
"partial" computationalists, and fully 25% could be called spectators. We
hypothesize that the results and applications of today's computational
chemistry are of enough interest and impact to attract this readership.
As for software development, 128 respondents state that they do develop
software, 58 that they do not, 53 gave no explicit answer to this point, and 42
state that they develop little software, or do so occasionally, or only develop
small programs, or do so only for their own or in-house use, or only port or
modify or adapt existing software.
We submit that of the 53 who do not answer explicitly on software
development, up to 30 may be estimated to do at least some of this work, and
did not answer the question explicitly because we placed both questions as
parts of a single one. Many people also take it for granted that, since they
perform computational work, they develop software. The stated upper bound of
30 for this subcategory is derived from our observation that a large number of
those responding that they do not develop software also state that they
perform computational work.
The fact that many people can simultaneously state that they perform
computational work and do not develop software is a sign of a paradigm change
in the way chemical research is undertaken today. Not many years ago almost
anybody working with computers in chemistry would be naturally assumed to write
programs, even those working with the most powerful programs which were widely
distributed. Nowadays, maybe up to a third of researchers who legitimately
call themselves computational chemists work without developing software, using
products programmed by others.
The delegation of responsibility, what we could call a covenant of trust,
implied in the "black-box" use of chemical software, is enormous. One can only
hope that the programmers in the software firms are aware of this burden!
The previous paragraphs may be summarized as follows regarding the
demographics of the list: nearly half of the participants consider their work
strictly computational; half do not. At least 25% of participants do not
consider their work computational at all. A large fraction of computational
chemists perform no software development. The List is dominated, but in no way
overwhelmed, by hard-core programmers; it is clearly a users' environment.
Computers and software.
Questions I.9 and I.10 on the survey characterize the respondents as to the
type of computer equipment and networking conditions through which they access
the list, and the type of computer and software they use in their work.
In interpreting the replies, we have made the following considerations:
1. We focus only on whether the computers are equipped with a RISC
processor and the Unix operating system. A detailed breakdown of these figures
according to make and, in some cases, model of computer is also made but not
presented here in order to limit liability from commercial interests.
2. For the same reason we will only generally characterize the type of
connection; most replies state "Ethernet" or "Internet". The overwhelming
majority is "Ethernet", and we understand the "Internet" reply as equivalent,
either because of confusion of terminology or because the standard for
full-service Internet connections is Ethernet in most institutions. There are
only two replies referring to token-ring connections; there is also a single
one referring to "Bitnet", which would merit clarification.
3. For the level of services available, we classify as "full" those with
access to the World Wide Web (WWW), as this implies a fairly rich
telecommunications environment, a relatively high transmission speed, and a
reliable network connection. Some replies include WWW and explicitly exclude
Usenet or other services; for the reason already stated, we decided to classify
them as having "full access". One more qualification on this point is that
some respondents are endowed with powerful computer and networking resources
but cannot make full use of them due to security considerations; they are
represented mainly by the replies stating that the respondent is behind a
"firewall", one of the more usual security systems.
4. Only the software types in most frequent use will be mentioned here.
There is a large variety of commercial and public-domain packages mentioned in
the survey and it would be unsound to list them all here.
The number of replies on the equipment used for access to the CCL is 283;
that is, nearly all respondents answered at least one point of this question.
The types of computers used for access to the CCL are quite varied. The
largest category is represented by "Unix boxes", i.e., computers with RISC
processors and Unix operating systems; 192 of them are mentioned, alone or
together with personal computers connected to them. Moreover, some further
machines may belong to this category but are insufficiently specified as to CPU
type or operating system. Further detail on this point is only possible by a
breakdown by commercial brands and will not be published here.
The representation of the Unix operating system and RISC processors is even
higher under the label "hardware used at work". Only 16 respondents mention
personal computers alone (whether "IBM compatible" or of the Macintosh type),
and 5 refer to minicomputers of older standards, as their only types of
equipment (there is large overlap between these two groups: the total is only
18). All
other respondents use the aforementioned "Unix boxes" or supercomputers of some
type.
As for type of connection, the majority of replies (228 out of 283) refer
to Ethernet. Non-Ethernet replies are as follows: 23 respondents have access
through modems (with speeds ranging from 2,400 baud to 19,200 BPS; the two
"speed" units are reported as given, since both are in common use in
networking), plus 7 who have Ethernet access and in addition use modems from
home or from remote locations; 6 have access through Local Area Networks which
lead to an Internet connection; 2 have access through 56 kBPS lines; 24 leave
the question unanswered, or give answers which are insufficiently specific (in
15 of these one is led to infer a full Ethernet connection).
In software there is an even larger dispersion of products used. 82
respondents left this point unanswered. Of those answering, only 13 use
exclusively their own or in-house software, apart from systems software,
programming language compilers and, as stated or presumed, tools of general
applicability. The remaining 188 use some computational-chemistry tools of
either commercial or public-domain origin, with the largest part being
commercial. A more precise count is not given because it is not possible to
distinguish in some cases between a commercial and a public-domain or internal
version of the same package. A breakdown by brand names is not presented here.
The inspection of the replies to these points gives the impression that
"power users" are overrepresented in our sample of the list, and that it would
be unwise to take the distribution of quality of connections and computing
power as fully representative. Planning efforts for Internet services must not
ignore this caveat.
However, these respondents are inferred, from other aspects of the present
survey, to be determining the major trends in computational chemistry. In that
sense, the survey shows that the participants of the list on computational
chemistry have, in general, gained access to mid-range and higher processing
power, and are endowed with equally powerful modeling software. The many
mentions of being in the process of upgrading hardware and software, made as an
aside in some replies, show also that modeling and other aspects of
computational chemistry are still in a state of rapid evolution.
Patterns of usage.
Two related questions shed light on the ways the list is used: question
IV.1 concerns the number of inquiries the respondent has posted to the list,
and the answers the respondent has given to others' inquiries, both through
postings to the list and through direct answers to the person posting, whereas
IV.2 asks how many replies the respondent has obtained to his/her queries
through direct postings to him/her, and also whether some of these have been
confidential ("for your eyes only" in the questionnaire).
Once we tabulated the responses to these questions, some patterns emerged.
We will not dwell on the detailed statistics, and shall proceed to the analysis
of these patterns directly.
We could easily classify the respondents into fairly distinct groups with
very few borderline cases. The groupings were as follows:
o "normal" users (N), i.e. those with 5 or fewer questions posted, and not
more than twice that number of answers given to others' postings; there were 97
of these.
o "responders" (R), users who have posted no questions and have
volunteered responses to more than 5 questions; there were 52 of these.
o "lurkers" (L), users who have posted neither questions nor answers. The
name "lurkers" is explicitly given by two of them, and is not meant to be
derogatory; neither, for that matter, are the other group names in this
section.
o "talkatives" (T), users with an intense participation in all fields;
there were 36 of these, some of whom have posted 20 questions, given 30
answers, etc., or even more. One of them reports having posted more than 50
questions and more than 100 replies, as well as having sent 100 replies direct
to posters.
o "shy" (S), users who have posted no questions and have sent up to 5
responses direct to those who posted inquiries; there were 16 of these. In some
cases the respondents in this group, as well as those in the following category
and in the L group, explain their light participation: their professions are
somewhat distant from computational chemistry, or they have begun graduate
study and therefore involvement in the field only recently.
o "extremely light users" (V), users who have posted possibly only one
question and up to 5 total responses; there were 14 of these. See the preceding
paragraph for explanation.
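
The grouping rules above can be sketched as a small decision function. The
Python sketch below is only one possible reading of them, not the procedure
actually used for the survey. Its assumptions: list and direct answers are
added together, "talkative" is taken to mean more than 5 questions posted, a
user with exactly one question and at most 5 answers is placed in V before N
is tried, and anything left over is marked "?" as a borderline case. The
function name and its parameters are hypothetical.

```python
def classify(questions: int, list_answers: int, direct_answers: int) -> str:
    """Return a one-letter group code, or "?" for a borderline case."""
    # Assumption: answers posted to the list and sent directly are pooled.
    answers = list_answers + direct_answers

    if questions == 0:
        if answers == 0:
            return "L"                        # lurker: nothing posted at all
        return "S" if answers <= 5 else "R"   # shy vs. responder

    # Assumption: V takes precedence over N for one-question users.
    if questions == 1 and answers <= 5:
        return "V"                            # extremely light user
    if questions <= 5 and answers <= 2 * questions:
        return "N"                            # normal user
    if questions > 5:
        return "T"                            # assumed threshold for talkative
    return "?"                                # borderline, sort by hand


# Example: the heaviest user described in the text.
print(classify(questions=50, list_answers=100, direct_answers=100))
```

Because the published definitions overlap slightly (for instance N and V), a
different order of tests would move a few respondents between groups.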
A total of 13 respondents have received private replies of a confidential
character to their postings. A few expounded: one had a response from a
private company, with contents not intended for public distribution; another
was told that his/her correspondent preferred that private electronic mail not
be included in a summary posting.
Two more respondents were explicit in that, although confidentiality had
not been required of them, they will not post private mail publicly unless they
have their correspondents' agreement. Also, one explained why he/she had asked
for confidentiality, namely, that the structural data sent originated as part
of a thesis and had not yet been submitted for publication.
The statistics in this section may be somewhat less indicative of the
situation which prevails over the whole population of CCL users, since it may
be expected that "lurkers" and "shy" users be underrepresented in the survey,
while the more overt and active users will be overrepresented; this is a common
phenomenon in surveys of this kind.
Time spent using the CCL.
Question IV.8 of the survey was concerned with the time and frequency of
use of the CCL. Only 23 of the respondents did not answer this question or
stated that they read the CCL less than once a day. This means that almost all
the subscribers read the CCL once a day or more; from the responses, it is
immediately apparent that CCL is mostly read together with all other electronic
mail.
Many respondents from the appropriate time zones (typically Europe) state
explicitly that they read all one day's mail in the morning; it has accumulated
during that time zone's night, which is daytime in the United States.
More than half of the respondents state that they read the list more than
once a day. These responses take many forms and we have chosen to group them
together; the more frequent patterns are reading the list in the morning and
once more during the day (typically in the afternoon), and scanning the mail
continuously.
Since half the list's readership is in the United States, time-zone
considerations may be of importance for the planning of similar services
elsewhere. Also for the planning of services based on electronic
communications, the model of the user who is constantly scanning the mail is
important but it should be validated for each service according to the
population of users it is expected to serve. As can be seen elsewhere in this
paper, the users of the CCL are heavily computer-oriented, and endowed with
powerful hardware and networking, which makes it both easy for them to use
electronic services and natural for them to integrate the services with their
already heavy usage of computers.
This validates the picture of a computer user who keeps a mail window open
all the time, or is informed by a bell of the arrival of new messages, or
repeatedly scans the incoming mail as a break or between the use of other
computer applications. However, one should not conclude that electronic mail
and lists will reach everybody immediately, since many users also point out
reasons why they may not read their mail absolutely every day.
Impact on research.
The question on how the CCL or its archives have affected the research
conducted by the respondent received 258 answers. For the authors, this was
surprisingly high, given that the question had a verbose formulation and
invited an essay response.
Nine answers described the impact as "none" or "not much"; five of these in
turn were explained by the respondents as due to the recent start of usage of
the list. This result tells us little, though, since it can be presumed that
most people who find any use for the list will not answer this question in the
negative.
A factor common to more than half of the responses (139) is that the impact is
described as positive in general terms, mostly for the awareness of issues
relevant for the computational chemistry community, for general knowledge in
the field, for following interesting discussions on topics like the use of
symmetry in chemical computations, semiempirical methods and their strong and
weak points, etc. (see also under the heading "Memorable moments"), for
providing "rules of thumb" and "conventional wisdom", and so on.
Some of these responses provided descriptions or metaphors for the CCL,
like "a class in computational chemistry where each person gets a turn at
teaching", "a world-wide coffee room", "the commons of the computer-savvy
chemical crowd", "sort of computational chemistry newspaper". The CCL is
perceived by this large fraction of its users as a useful tool to keep in touch
with the computational chemistry community at large. Some respondents
specifically mention that they are unable to attend meetings, or at least not
often, and that the CCL subscription compensates for this.
A number of responses in this field, as well as in the "Memorable moments"
one, stress positively the promptness and relevance of responses to queries,
and that these responses are frequently contributed by well-recognized experts.
In its most colloquial form, this was described by one respondent as "information
from the horse's mouth".
A few specific classifications were made of the responses, based on the
formulation of this question. 113 respondents have been directed to, found, or
acquired software or data through the CCL. The data fall into a few classes, like
the Protein Data Bank, basis sets of functions for quantum-chemical
calculations, etc. 77 respondents state that they have been directed to
literature through the list; some of them stress the relevance of this
literature, and one states that the list has saved him "hundreds of hours" of
library work.
Replies to questions posted by respondents were relevant for 34 of them;
four additional respondents received responses which were not relevant, or
were even useless. 27 respondents have used information gleaned from the list
for hardware decisions; most of the specific instances quoted used benchmark
timings for the execution of software on different platforms, and one
respondent stated specifically that they excluded a type of hardware after
observing that there were many questions about whether different software
packages had been ported to that platform.
Close to a tenth of the respondents, 25, have made valuable contacts
through the CCL. To us, this shows that in many ways electronic communication
is still fostering communications between persons, and may actually be
instrumental in bringing about contacts which are otherwise very difficult to
achieve. The obtaining of tables of data in electronic form pertaining to
recently published papers, the start of collaborations between distant
laboratories, etc., has been documented in this survey.
Further, 25 respondents recalled tips on the use of software or on software
"bugs" (besides the general mention of this in the "general value" class), 20
have found out about other lists and networked information resources through the
CCL, 17 specifically refer to the answers they have given to postings by
others, and 12 have used the CCL to post or announce their own software to
other users.
Some specific responses stand out as an indication of the value of fast
communications in an environment where academic, institutional or other
hierarchies are not dominant. One user found, listening to discussions in the
CCL, that a project which was being planned was out of reach at the time with
the tools at hand, and it was postponed; two years later, the tools for it had
become available and it now can be performed. A couple of responses come from
people who are in some sense isolated: some live in countries far from the
mainstream of academic activity, others are the sole computational chemists in
large research groups (in fields like pharmaceuticals, DNA research, etc.), and
they all are kept bound to the academic community of computational chemists
through the CCL, which in turn allows them to be more effective in the fields
where they apply computational chemistry.
Finally, we should mention two insights of relevance for the commercial use
and development of the Internet which appeared in responses to this question.
One says the CCL provides a way to know what is thought of the competitors; the
other says the CCL provides a unique way of knowing what the users of a given
product (or set of products) need, and to do so interactively. Up to now, this
"eavesdropping" has been well within the realm of conducting CCL business to
the best interest of computational chemists. Doubtless, though, in some time
this kind of "live" interaction with potential customers will be seen to belong
to a sphere somewhat disjoint from the "commons of the computer-savvy chemical
crowd".
Starting collaborations through the CCL.
A different approach to the effect of the list on the way research is
conducted was taken in the question of whether respondents have started
collaborations through the use of the CCL. Deliberately, we did not further
specify the types of collaborations, since these may be very diverse, and since
the kind with the most formal results, published papers, may still be too early
in the making. We checked this against the duration of subscriptions, which
allows for few collaborations to have been brought to this point.
In this case, we obtained 27 responses stating "yes" or "once", 4 "twice",
1 "three times", 1 "many", and 18 affirmative responses which specifically
illustrate a looser link: useful exchanges, "almost", "sent programs", "not
formalized", "few short collaborations", or collaborations discussed but not
yet started, or under consideration.
One more response said one collaboration had been initiated, leading to a
new, paying client. This illustrates again the way that, within approved use,
the Internet may be opening up new ways of doing business.
Summing up, collaborations started had altogether 34 fully positive
responses (27 + 4 + 1 + 1, plus the one leading to a paying client) and a total
of 52 including the 18 which point to looser links. It
must be further noted that several of the negative responses were "not yet" or
"no, but definitely possible". This underlines the expectation that networked
media like the CCL will develop into social spaces where the opening of new
forms of cooperation is possible.
Memorable moments.
The question on memorable moments of the list was answered by 96 of the
respondents. Some of them referred only to the most interesting/useful/pleasant
moments remembered, some only to unpleasant/obnoxious ones, and 42 made more
than one comment, mostly of more than one kind. We classified the first group as
"positive" comments and the second as "negative" as to the matters they
referred to. These matters are extremely diverse as will be seen.
When counting the statements made altogether about memorable moments, 92
are found to be "positive", 74 "negative" and 3 neutral (like "none, really" or
"nothing major"). We did not classify two more statements because it was
impossible to do so in the way they were stated and by the subject they
referred to. We could venture to state that altogether the CCL is remembered
more for its interesting, useful and pleasant events than for its unpleasant or
even obnoxious ones. We must also note that possibly many respondents did not
address this point because of the lengthy nature of the survey and the verbose
response this question required.
Let us address the "negative" points first. Foremost among these are what
respondents call overdrawn and/or overheated discussions on points they
consider of insufficient interest. In most instances cited, these discussions
were also plagued by personal unpleasantness. In that sense, they qualify as
"flame wars". Those most frequently recalled were on:
List rules: after an increase in the number of address inquiries,
explicitly discouraged in the rules of the CCL, a heated discussion ensued on
how the list should be run. The protests of many subscribers against using so
much of the list for the discussion itself finally led the list coordinator to
close it down for a few days and to hold an electronic vote on whether it
should be run further on with the same rules or if a major change was desired
(a moderated or censored list). The episode is also remembered by respondents
favorably, both for the coordinator's conduct and for the overwhelming vote in
his favor.
Semiempirical methods in quantum chemistry: two debates were recalled on
this subject. One of them was on the relative merits of different
parametrization schemes, and the other was on the relative capacities of
quantum chemical electronic structure calculations and molecular mechanics.
What respondents disliked was the personal character these debates took on at
moments, with strong ad hominem arguments and with serious
misunderstandings of other subscribers' statements.
The relative merits of different programming languages (Fortran and C) and
approaches (object-oriented vs. others): these debates are known as
"religious wars" on the Internet, because of the passion perceived in some of
the participants, and take place in several forums. More than one instance of
them has been seen in the CCL, with a particularly long and detailed, as well
as heated, discussion in 1993. Apparently most of the responses against these
events were specific in considering the CCL an inappropriate forum for them.
Foreign language: a discussion, already somewhat alien to the CCL, on
whether computer languages could substitute for the foreign language
requirement in graduate schools, mostly in the U.S., led to a discussion in
general on the foreign language requirement itself.
Other discussions received mention, like the perceived attack and defense
of commercially available software.
One response is particularly explicit in mentioning several of these
discussions and calling them "good" while deploring their tone. Some other
"negative" responses refer to this also, specifically to the personal character
of some postings: "backbiting and one-upmanship", "pompous and arrogant
posting...", "painful to see so much distaste", "obnoxious discussions [based]
on partial misunderstandings", "nasty fight", "hot headed scientists getting
emotional on issues", "personal abuse", "unethical notes even if addressed to
others".
Other events qualified as unpleasant or obnoxious are:
o Address inquiries (the largest number of unfavorable responses).
o Inappropriate postings: questions which have easy answers outside the
CCL, or which have been recently answered, or which are seen to reflect
laziness of the sender; commercial advertisements; a variety of postings
considered banal.
o Nagging messages to coordinator.
o Personal replies to the listserver.
o Harassment, described as "testosterone-impregnated sales note from a
professor..."
o Multiple copies of messages or "bounces" (usually due to failures in
software, at the CCL site or elsewhere).
The "positive" side is made up mostly of the following:
Appreciation for help received. In some instances, explicit mention is
made that an issue which the subscriber thought would have a well-known
resolution is still the matter of discussion among experts, like symmetry and
symmetry-breaking in molecular geometry optimization procedures.
Contributions to research and the way it is conducted. A vast majority of
the "positive" responses refer to learning ways of doing things like using
Unix, software packages, etc., and/or to learning of the availability of
solutions to problems the subscriber had. In specific cases the duration of the
event is cited: a response to a question regarding literature saved days of
library work, receiving a program from Estonia and having it operate correctly
took only a day, reading a journal paper led to obtaining and using structural
data related to it in twelve hours. This points to a major strength of the CCL
and maybe many other Internet media: access to relevant information and tools
can be gained with pinpoint precision in a short time.
Another major citation goes to relevant discussions of scientific,
methodological and computational principles. Here too many respondents
acknowledge explicitly the ready availability of renowned experts, both for
these general discussions and for aid on running their programs. One
respondent is explicit, on answering this question, that the science
discussions on the list are the best in his/her view, and hold an important
potential contribution to science from computational/theoretical improvements.
The discussions noted above as "negatively" qualified also received applause,
approximately in equal numbers.
Some people actually like these discussions, notwithstanding their
"negative" aspects; responses in this direction are "enjoy a good discussion,
learn something", and expressions of enjoying even the heated debates.
Personal links established or reestablished. One instance is quoted of
answering the posting of a foreign student, which led to a permanent
correspondence on a variety of subjects. Another is quoted of advisorship of a
student 3,000 miles away as a result of contact through the list. Further,
positive feedback on programs and notes posted is also noted.
Response to a call of distress. The quick response of CCL members to a call
of distress, related to a chemical disaster in Pakistan in the Fall of 1993, is
frequently evoked.
A number of responses point to a more general aspect: helpful "netters",
"feeling of community" derived from the responses, calling the CCL a "very nice
place on the Internet", "generally people have been intelligent, polite and
helpful", etc.
Questions missing.
As mentioned above, a pair of questions was included at the end of the
survey, on questions that should have been included and on general thoughts about
the CCL. A large number left these unanswered; it is usually not possible to
distinguish those who skipped the question from those who had no further
thought to contribute after answering the by then lengthy questionnaire.
The most important missing questions pointed out (by 4 respondents) would
have established the demographics of the CCL users: age, gender, nationality,
education, degrees, languages, etc. We deliberately chose not to ask for gender
in view of our perception of discomfort about this issue on the list, and its
possible consequence of denied replies. It might be possible to conduct a
gender-related survey in the near future, with a focus on such issues and the
collaboration of specialists in this particular field. As for nationality, we
chose to concentrate only on two related points: country of reception of the
list, and the questions related to the international impact of the list
described elsewhere.
The issue of education is important to characterize the users of the list.
Although we did not specifically ask a question on this, we obtain it
indirectly but quite faithfully by asking about the position held at the
time of responding. As has been described, we are able to spot and count
undergraduate and graduate students, postdoctoral staff, and several categories
of professors who can be safely assumed to hold doctoral degrees in a large
majority of cases. So this issue is dealt with in the survey in a way relevant
to its ends, and beyond.