From owner-chemistry@ccl.net Sat Apr 13 23:27:00 2024
From: "Felipe S. S. Schneider schneider.felipe.5^_^gmail.com"
To: CCL
Subject: CCL: Lossless encoding of a chemical formula as an integer
Message-Id: <-55121-240413190058-17930-6qZsd9sHgCGOn54eKV+z1Q.@.server.ccl.net>
X-Original-From: "Felipe S. S. Schneider"
Date: Sat, 13 Apr 2024 20:00:17 -0300

Sent to CCL by: "Felipe S. S. Schneider" [schneider.felipe.5|a|gmail.com]

François,

What you're asking seems to fit an XY problem
(https://en.m.wikipedia.org/wiki/XY_problem).

Felipe

On Fri, 12 Apr 2024 at 15:00, Michel Petitjean
petitjean.chiral%a%gmail.com wrote:
>
> Sent to CCL by: Michel Petitjean [petitjean.chiral^^^gmail.com]
> Dear François,
>
> At first glance this is a software problem rather than a hardware problem.
> Most compilers deal with integer*4 numbers
> (-2147483648, 2147483647).
> Many F77 compilers can also deal with integer*8 numbers
> (-9223372036854775808, 9223372036854775807).
> There are other ways to handle big integers, such as using BCD arithmetic.
> You can also manage long integers of arbitrary size yourself: e.g. see
> pnkcnp.f in libfor.t with its called routines in libuti.t
> (these free libraries can be found at
> https://hal.archives-ouvertes.fr/hal-03798049)
>
> But more seriously, why bother saving space when storing chemical
> formulas?
> Once a formula is coded, you can build a program to decode it.
> It seems more important to define which chemical information you want
> to store from what you call a "chemical formula".
> RN (Registry Numbers) were created by CAS to face this difficult
> problem (even the case of water is complicated).
> Once you have a clear vision of what you want to store, saving space
> or computing time is by far a much easier problem.
> An increasing number of chemical compounds are discovered each year,
> and the world of chemical formulas is still full of surprises.
> Good luck!
>
> Best regards,
> Michel.
>
> Michel Petitjean, retired chemist
> http://petitjeanmichel.free.fr/itoweb.petitjean.graphs.html
>
> On Fri, 12 Apr 2024 at 17:32, Francois Berenger
> berenger++edu.k.u-tokyo.ac.jp wrote:
> >
> > Dear list and experts,
> >
> > Is there an accepted/known/published way to encode a chemical formula
> > as a rather small integer?
> >
> > I am looking for a lossless method; i.e. the integer code should be
> > decodable to the original chemical formula.
> >
> > I can come up with a scheme using prime numbers, but I wonder if there
> > is already something out there.
> >
> > When I say "small integer", this is because I don't want the obtained
> > integer value to overflow the machine-representable integers.
> >
> > A hash is not acceptable since those are lossy (i.e. irreversible).
> >
> > Thanks a lot,
> > Francois.
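
For readers following the thread, the prime-number scheme mentioned in the original question can be sketched as below. This is only an illustration under stated assumptions, not a method proposed by anyone in the thread: the element table is truncated to the first ten elements, the names `parse_formula`, `encode`, `decode` and `fits_int64` are hypothetical, and only flat formulas such as C2H6O are handled (no parentheses, charges or isotopes). Each element is assigned a prime and the code is the product of primes raised to the atom counts, so it preserves composition but not the order in which the symbols were written.

```python
import re

# Assumed mapping: first ten elements by atomic number -> first ten primes.
ELEMENTS = ["H", "He", "Li", "Be", "B", "C", "N", "O", "F", "Ne"]
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
PRIME_OF = dict(zip(ELEMENTS, PRIMES))


def parse_formula(formula):
    """Parse a flat formula such as 'C2H6O' into {element: count}."""
    counts = {}
    for symbol, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
    return counts


def encode(formula):
    """Multiply prime(element) ** count over all elements in the formula."""
    code = 1
    for symbol, count in parse_formula(formula).items():
        code *= PRIME_OF[symbol] ** count
    return code


def decode(code):
    """Recover {element: count} by repeated division by each prime."""
    counts = {}
    for symbol, prime in PRIME_OF.items():
        while code % prime == 0:
            code //= prime
            counts[symbol] = counts.get(symbol, 0) + 1
    return counts


def fits_int64(code):
    """True if the code fits in a signed 64-bit integer (integer*8)."""
    return code <= 2**63 - 1


print(encode("H2O"))                   # 2**2 * 19
print(decode(encode("C2H6O")))         # composition of ethanol
print(fits_int64(encode("C6H12O6")))   # glucose
print(fits_int64(encode("C20H42")))    # icosane
```

Python integers have arbitrary size, so the round trip still works for large molecules, but the code grows exponentially with the atom counts: in this sketch icosane (C20H42) already exceeds the integer*8 range quoted above, which is the overflow concern raised in the question.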