From owner-chemistry@ccl.net Fri Apr 12 09:49:01 2024
From: "Francois Berenger berenger++edu.k.u-tokyo.ac.jp"
To: CCL
Subject: CCL: Lossless encoding of a chemical formula as an integer
Message-Id: <-55118-240411213614-7291-TEM+QCqOohK9HgpMIymT9Q+/-server.ccl.net>
X-Original-From: Francois Berenger
Content-Type: multipart/alternative; boundary="0000000000007037450615dc4aac"
Date: Fri, 12 Apr 2024 10:35:35 +0900
MIME-Version: 1.0

Sent to CCL by: Francois Berenger [berenger---edu.k.u-tokyo.ac.jp]
--0000000000007037450615dc4aac
Content-Type: text/plain; charset="UTF-8"

Dear list and experts,

Is there an accepted/known/published way to encode a chemical
formula as a rather small integer?

I am looking for a lossless method; i.e. the integer code should
be decodable to the original chemical formula.

I can come up w/ a scheme using prime numbers, but I wonder
if there is already something out there.

When I say "small integer", this is because I don't want the obtained
integer value to overflow the machine-representable integers.

A hash is not acceptable since those are lossy (i.e. irreversible).

Thanks a lot,
Francois.

--0000000000007037450615dc4aac
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--0000000000007037450615dc4aac--

From owner-chemistry@ccl.net Fri Apr 12 10:24:01 2024
From: "Jacek Korchowiec korchow[#]chemia.uj.edu.pl"
To: CCL
Subject: CCL: CTTC IX, 01-05.09 2024, Krakow, Poland
Message-Id: <-55119-240412071659-26123-UsndabhzE2Af+YKXUYTjzw]*[server.ccl.net>
X-Original-From: "Jacek Korchowiec"
Date: Fri, 12 Apr 2024 07:16:49 -0400

Sent to CCL by: "Jacek Korchowiec" [korchow-x-chemia.uj.edu.pl]

Dear Colleagues,

We would like to announce that the conference CURRENT TRENDS IN THEORETICAL CHEMISTRY IX (CTTC IX) is open for registration. The conference will be held in Krakow from September 1 to 5, 2024 and the organizer is the Faculty of Chemistry at Jagiellonian University. More details can be found on the website: https://cttc9.confer.uj.edu.pl

You can contact the organizers for additional information (cttc9^uj.edu.pl). If you are unfamiliar with the CTTC series of conferences, feel free to visit the websites of previous CTTC conferences: www2.chemia.uj.edu.pl/cttc5/, , www2.chemia.uj.edu.pl/cttc8/.

We hope that we will meet your scientific needs. We leave the rest to the city of Krakow, which, as the former capital of Poland, will make your stay all the more appealing.

CTTC IX Organizers

From owner-chemistry@ccl.net Fri Apr 12 13:13:00 2024
From: "Michel Petitjean petitjean.chiral%a%gmail.com"
To: CCL
Subject: CCL: Lossless encoding of a chemical formula as an integer
Message-Id: <-55120-240412131018-16502-BWuJmKf+6VIFjmFmzwNXQw|,|server.ccl.net>
X-Original-From: Michel Petitjean
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="UTF-8"
Date: Fri, 12 Apr 2024 19:09:53 +0200
MIME-Version: 1.0

Sent to CCL by: Michel Petitjean [petitjean.chiral^^^gmail.com]

Dear François,

At first glance this is a software problem rather than a hardware problem. Most compilers deal with integer*4 numbers (-2147483648, 2147483647). F77 can deal with integer*8 numbers (-9223372036854775808, 9223372036854775807).
There are other ways to handle big integers, such as BCD arithmetic. You can also manage long integers of arbitrary size yourself: e.g. see pnkcnp.f in libfor.t with its called routines in libuti.t (these free libraries can be found at https://hal.archives-ouvertes.fr/hal-03798049)

But more seriously, why bother saving space when storing chemical formulas? Once a formula is coded, you can build a program to decode it. It seems more important to define which chemical information in what you call a "chemical formula" you want to store. RN (Registry Numbers) were created by CAS to face this difficult problem (even the case of water is complicated). Once you have a clear vision of what you want to store, saving space or computing time is by far the easier problem.

An increasing number of chemical compounds are discovered each year, and the world of chemical formulas is still full of surprises.

Good luck!
Best regards,
Michel.

Michel Petitjean, retired chemist
http://petitjeanmichel.free.fr/itoweb.petitjean.graphs.html

On Fri, 12 Apr 2024 at 17:32, Francois Berenger berenger++edu.k.u-tokyo.ac.jp wrote:
>
> Dear list and experts,
>
> Is there an accepted/known/published way to encode a chemical formula as a rather small integer?
>
> I am looking for a lossless method; i.e. the integer code should be decodable to the original chemical formula.
>
> I can come up w/ a scheme using prime numbers, but I wonder if there is already something out there.
>
> When I say "small integer", this is because I don't want the obtained integer value to overflow the machine-representable integers.
>
> A hash is not acceptable since those are lossy (i.e. irreversible).
>
> Thanks a lot,
> Francois.
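
The "scheme using prime numbers" mentioned in the question, and the integer*8 limit quoted in the reply, can be illustrated with a minimal Python sketch. It is only one possible reading of that idea, not a published or accepted encoding: each element is assigned a prime and the code is the product of prime**count. The element subset, the element-to-prime table (PRIMES) and the function names are assumptions made for this illustration. Only the elemental composition is encoded (no structure, charge or isotopes).

```python
# Illustrative sketch: prime-power code for an elemental composition.
# ASSUMPTION: this small element table and its prime assignment are
# hypothetical; a real table would need to cover every element of interest.
PRIMES = {"H": 2, "C": 3, "N": 5, "O": 7, "S": 11, "P": 13, "F": 17, "Cl": 19}

INT64_MAX = 2**63 - 1  # upper bound of a signed 64-bit integer (integer*8)


def encode(counts):
    """Map e.g. {"C": 6, "H": 12, "O": 6} to the product of prime**count."""
    code = 1
    for element, n in counts.items():
        if n < 1:
            raise ValueError("atom counts must be positive")
        code *= PRIMES[element] ** n  # KeyError if the element is not in the table
    return code


def decode(code):
    """Recover the element counts by dividing out each prime in the table."""
    if code < 1:
        raise ValueError("code must be a positive integer")
    counts = {}
    for element, p in PRIMES.items():
        n = 0
        while code % p == 0:
            code //= p
            n += 1
        if n:
            counts[element] = n
    if code != 1:
        raise ValueError("code contains a prime factor outside the table")
    return counts


def fits_int64(code):
    """True if the code can be stored in a signed 64-bit machine integer."""
    return code <= INT64_MAX


glucose = {"C": 6, "H": 12, "O": 6}
print(decode(encode(glucose)) == glucose)
print(fits_int64(encode(glucose)), fits_int64(encode({"C": 20, "H": 42})))
```

Python integers have arbitrary size, so encode never overflows by itself; fits_int64 only reports whether the result would still fit in a 64-bit machine integer. With this table the code grows exponentially with the atom counts, so larger formulas (C20H42 in the example) already exceed that range.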