CCL: hydrogen bond
- From: "Mezei, Mihaly"
<mihaly.mezei]_[mssm.edu>
- Subject: CCL: hydrogen bond
- Date: Fri, 5 May 2023 18:20:40 +0000
Sent to CCL by: "Mezei, Mihaly" [mihaly.mezei^^mssm.edu]
Greetings,
I think that the most problematic aspect of using an AI bot to answer a
scientific query is not necessarily the accuracy of the answer (although it IS
important, of course) since AI is expected to improve in time and the accuracy
could thus improve. What I find most problematic is the lack of reference to the
source of the information. Even if the bot may be able to point to the web page
a particular sentence is based on, it is less likely to be a peer-reviewed
article (not that peer reviewing guarantees accuracy but still ...).
I am also wondering about another issue related to AI bots: as time goes on,
the internet will contain (or, can we say, be contaminated with?) a lot of pages
generated by such bots, so training on internet pages runs into the danger of
confirming inaccuracies.
Mihaly Mezei
Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai
Voice: (212) 659-5475 Fax: (212) 849-2456
WWW (MSSM home): http://icahn.mssm.edu/profiles/mihaly-mezei
WWW (Lab home - software, publications): https://mezeim01.dmz.hpc.mssm.edu
WWW (Department): http://www.mssm.edu/departments-and-institutes/pharmacology-and-systems-therapeutics
________________________________________
From: owner-chemistry+mihaly.mezei==mssm.edu(_)ccl.net on behalf of Laurence
Cuffe cuffe[*]mac.com <owner-chemistry(_)ccl.net>
Sent: Thursday, May 4, 2023 4:50 PM
To: Mezei, Mihaly
Subject: CCL: hydrogen bond
There are two aspects to this. One is to look at the tool in its
present form. It's been trained on a huge corpus of text from the internet, and
the result is a little like grading a random high school chemistry text. The
answers you get are unlikely to be either nuanced or particularly deep.
This comes into the class of “Lies told to children”, an
educational term for the simplifications which educators make to convey models
at a level matching the student's knowledge and abilities.
The prospect of training such a system on a more specific corpus of texts such
as scientific papers is interesting, though it also presents problems. A lot of
such text is behind paywalls, and we have a second issue of how to incorporate
the idea of scientific progress, at a trivial level in terms of changes in
nomenclature, and at a less trivial level in terms of the advance of science and
the development of theory.
The second aspect is more of a housekeeping one. While ChatGPT will respond to a
very large number of queries, getting an effective response which maximises the
value of the model in answering your query is more complex, and has just opened
up as a field called prompt engineering. OpenAI has a good
starting point on this here:
https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api
I offer this as a former computational chemist, who now teaches AI.
Best
Laurence Cuffe
On 4 May 2023, at 01:28, Alan Shusterman alan[-]reed.edu
<owner-chemistry,,ccl.net> wrote:
ChatGPT's answer contains some correct information, but it is not
"fine". It is incomplete and it can lead many readers to incorrect
conclusions.
First, ChatGPT provides an incomplete description of methylamine, "one
nitrogen atom and one methyl group". That phrase adds up to N-CH3. 2 H
atoms are missing.
Second, most hydrogen bonds connect a hydrogen bond donor and a hydrogen bond
acceptor (see Wikipedia, https://en.wikipedia.org/wiki/Hydrogen_bond).
ChatGPT fails to describe either of these participants properly. It only
mentions one property of the nitrogen (lone pair), and it does not describe any
required properties of hydrogen (see below).
Nitrogen properties. ChatGPT mentions lone pairs. Unfortunately, lone pairs are
not a sufficient basis for an answer. F in HF has 3 lone pairs but is a weak
hydrogen bond acceptor, whereas F in F anion (4 lone pairs) is a much stronger
hydrogen bond acceptor. But don't be misled because Ne (also 4 lone pairs) is
not a hydrogen bond acceptor. We cannot simply rely on lone pairs.
Hydrogen properties. ChatGPT mentions "hydrogen". That is correct.
Hydrogens are always found in hydrogen bond donors, but not every hydrogen atom
forms hydrogen bonds. Usually the hydrogen must carry a partial positive charge,
and this is usually achieved by H being bonded to a more electronegative atom.
ChatGPT does not even specify which hydrogen participates in the hydrogen bond.
While it is tempting to assume that it's implied because the hydrogen bond
occurs between methylamine and pyrazine, that doesn't do the job because
ChatGPT's description of methylamine is incomplete. It states only that
"the hydrogen atom in methylamine can form a hydrogen bond", but there
are 5 H in methylamine. ChatGPT seems to think there are only 3H because it says
methylamine contains "one nitrogen atom and one methyl group".
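The hydrogen count above can be checked with a few lines of Python. This is only a minimal sketch: `count_atoms` is a hypothetical helper written for this illustration, and it assumes a simple condensed formula (element symbols followed by optional integer counts, with no parentheses or charges).

```python
import re
from collections import Counter

def count_atoms(formula: str) -> Counter:
    """Count atoms in a simple condensed formula such as 'CH3NH2'.

    Assumption: only element symbols with optional integer counts;
    parentheses, charges, and isotopes are not handled.
    """
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts

methylamine = count_atoms("CH3NH2")  # the full molecule
fragment = count_atoms("NCH3")       # "one nitrogen atom and one methyl group"

print(methylamine["H"])                  # 5
print(fragment["H"])                     # 3
print(methylamine["H"] - fragment["H"])  # 2 H atoms missing from the description
```

The difference of 2 corresponds to the two N-H hydrogens, which are the ones left out of the "one nitrogen atom and one methyl group" description.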
Here's another ChatGPT example that reached me a few weeks ago that has no
bearing on this thread's conversation around H bonding. A chemistry teacher and
friend (who will remain anonymous) asked ChatGPT to "explain the NMR
spectrum of ethyl acetate" and sent me ChatGPT's answer:
Ethyl acetate is a common organic compound with the chemical formula
CH3COOCH2CH3. Its nuclear magnetic resonance (NMR) spectrum displays
characteristic peaks corresponding to the different types of protons present in
the molecule.
In the proton NMR spectrum of ethyl acetate, there are four types of protons:
the two methyl group protons (CH3), the methylene group proton adjacent to the
carbonyl (C=O) group (CH2), and the carbonyl group proton (C=O). The chemical
shifts and splitting patterns of these protons are as follows:
• The two methyl group protons appear as a singlet peak at around
1.25 ppm (parts per million), which is typical of alkyl methyl protons.
• The methylene group proton adjacent to the carbonyl group appears
as a triplet peak at around 4.1 ppm, with a coupling constant of about 7 Hz.
This triplet splitting arises from the coupling of the methylene proton with the
two equivalent methyl group protons.
• The carbonyl group proton appears as a singlet peak at around 2.05
ppm, which is typical of carbonyl protons.
Overall, the NMR spectrum of ethyl acetate displays four peaks, each
corresponding to a different type of proton in the molecule. By analyzing the
chemical shifts and splitting patterns of these peaks, it is possible to
identify the different types of protons in ethyl acetate and obtain information
about the molecular structure and bonding.
The so-called explanation (much like the previous one regarding methylamine and
pyrazine) is a mixed bag. There is good and there is bad. And even when there is
good, ChatGPT uses it incorrectly.
Example of good info: a correct formula for ethyl acetate.
Some examples of bad inferences:
- "carbonyl group proton" does not exist in this molecule
- there are six, not "two" methyl group protons in the formula
- the formula shows two methylene group protonS, not a "methylene group
proton"
- the methyl groups are inequivalent and produce signals at different chemical
shifts whereas ChatGPT says there is one signal that is due to "two methyl
group protons" and describes them both as "alkyl methyl protons"
(actually, one is an alkyl methyl, the other is an acyl methyl)
- many errors in the coupling patterns and explanations of coupling
And a final conclusion that is highly misleading:
- "the NMR spectrum of ethyl acetate displays four peaks"
No. There are actually 3 types of protons, and they produce these signal
patterns: a singlet (1 peak), a triplet (3 peaks) and a quartet (4 peaks) for a
total of 8 peaks.
ChatGPT's conclusion (4 peaks) doesn't even agree with its own analysis. It
lists 3 types of protons, and identifies them as producing two singlets + one
triplet -> 3 signals or 5 peaks.
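The peak counts above can be reproduced with a short Python sketch. It assumes the first-order n+1 rule, with neighbour counts read off the structure CH3COOCH2CH3; the labels and variable names are illustrative and are not part of any library.

```python
# Proton environments in ethyl acetate, CH3COOCH2CH3.
# Each entry: (label, number of protons, number of equivalent neighbouring protons).
# Neighbour counts are an assumption read off the structure (first-order coupling only).
environments = [
    ("acyl CH3 (CH3-C=O)", 3, 0),  # no protons on the adjacent carbonyl carbon
    ("O-CH2", 2, 3),               # coupled to the ethyl CH3
    ("ethyl CH3", 3, 2),           # coupled to the O-CH2
]

NAMES = {1: "singlet", 2: "doublet", 3: "triplet", 4: "quartet"}

def multiplicity(n_neighbours: int) -> int:
    """First-order n+1 rule: n equivalent neighbours give n + 1 peaks."""
    return n_neighbours + 1

total_peaks = 0
for label, n_h, n_nb in environments:
    peaks = multiplicity(n_nb)
    total_peaks += peaks
    print(f"{label}: {n_h} H, {NAMES[peaks]} ({peaks} peaks)")

print(total_peaks)  # 8

# ChatGPT's own assignment: two singlets and one triplet.
chatgpt_peaks = 1 + 1 + 3
print(chatgpt_peaks)  # 5, not the 4 it states in its conclusion
```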
Obviously, this is a much worse example of ChatGPT's abilities than the previous
one, but I think they have much in common. Tread carefully.
Alan
On Wed, May 3, 2023 at 1:12 PM Kshatresh Dutta Dubey
kshatresh_+_gmail.com <owner-chemistry**ccl.net> wrote:
I am pasting the answer from ChatGPT, which seems fine to me:
"Yes, it is possible for a hydrogen bond to form between pyrazine and
methylamine.
Pyrazine is a six-membered aromatic heterocycle containing two nitrogen atoms in
its ring structure, while methylamine is a simple amine molecule with one
nitrogen atom and one methyl group.
In the case of hydrogen bonding, the hydrogen atom in methylamine can form a
hydrogen bond with one of the nitrogen atoms in pyrazine. This can occur because
nitrogen has a lone pair of electrons, which can form a hydrogen bond with
hydrogen."
Hope it helps.
KDD
--
Dr. Partha Sarathi Sengupta
Associate Professor
Vivekananda Mahavidyalaya, Burdwan
--
Alan Shusterman
Professor Emeritus
Chemistry Department
Reed College
3203 SE Woodstock Blvd
Portland, OR 97202-8199
http://blogs.reed.edu/alan/
"Patience, persistence, and a sense of humor." Dave Barrett
(1956-2017, Reed College '79)