Contemporary Computer-Assisted Approaches to Molecular Structure Elucidation
eBook - ePub

Contemporary Computer-Assisted Approaches to Molecular Structure Elucidation

  1. 504 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Contemporary Computer-Assisted Approaches to Molecular Structure Elucidation

Book details
Book preview
Table of contents
Citations

About This Book

Computer-Assisted Structure Elucidation (CASE) systems are a combination of software algorithms and tools to support and enable chemists and spectroscopists engaged in the process of molecular structure elucidation via the analysis of spectroscopic data. These expert systems dramatically reduce the time associated with structure elucidation and improve the reliability of the results. Contemporary Computer-Assisted Approaches to Molecular Structure Elucidation describes the principles on which these expert systems for spectroscopic structure elucidation are based and concisely explains the algorithmic concepts behind the programs. The authors use their own personal experiences in the development of the Structure Elucidator (StrucEluc) CASE software system to discuss the present state-of-the-art in computer-assisted structure elucidation. Scientists that are presently using CASE systems will be interested in the algorithms and modern approaches and for organizations that are currently using the StrucEluc platform the book is designed to help researchers understand the strategies behind CASE as well as details regarding the StrucEluc platform. For scientists that have never used CASE systems they will now have access to all necessary information to understand CASE systems for mastering this new and very effective approach to structure elucidation. The authors overall goal is writing this book is to produce the 'must read' definitive text that will represent the results of decades of work to develop computer-assisted structure elucidation software systems. CASE systems are now powerful software tools commonly outperforming and correcting human interpretations of data. This book will also provide an historical perspective of the work of the founding fathers of the technique and identify the challenges that have been overcome to produce modern CASE systems.

Frequently asked questions

Simply head over to the account section in settings and click on “Cancel Subscription” - it’s as simple as that. After you cancel, your membership will stay active for the remainder of the time you’ve paid for. Learn more here.
At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.
Both plans give you full access to the library and all of Perlego’s features. The only differences are the price and subscription period: With the annual plan you’ll save around 30% compared to 12 months on the monthly plan.
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Yes, you can access Contemporary Computer-Assisted Approaches to Molecular Structure Elucidation by Mikhail E Elyashberg, Antony Williams, Kirill Blinov in PDF and/or ePUB format, as well as other popular books in Ciencias físicas & Química analítica. We have over one million books available in our catalogue for you to explore.

Information

Year
2015
ISBN
9781782625766
Part I
COMPUTER-ASSISTED STRUCTURE ELUCIDATION: FUNDAMENTALS
CHAPTER 1
General Principles of CASE Systems

1.1 Statement of the Problem of Structure Elucidation

The first reports devoted to computer-assisted structure elucidation (CASE) were published by four independent groups of researchers14 in the late 1960s. Prior to describing CASE methods, we will consider the complex nature of the structure elucidation problem.
It is necessary to distinguish two different analytical problems that are associated with the structure elucidation of molecules. The first relates to the identification of a molecule that is assumed to be known already and whose physicochemical characteristics, specifically the associated spectral data, are included into collections of reference data. In this case the solution is found by searching through the reference data and comparing the data measured for the unknown with the available reference data. Most frequently the search is performed using one or more of the mass spectrometry (MS), nuclear magnetic resonance (NMR) or infrared (IR) spectra of the unknown. The second problem is much more challenging and supposes that the unknown is a novel compound synthesized or isolated for the first time. The methods of solving these two problems are quite different. Therefore the first step is the assignment of a given unknown to one of the two mentioned categories. As we will see later in Section 1.6 specific procedures are used for this goal. This book will focus primarily on the structure elucidation of the new organic compounds.
The problem of elucidating the structure of a new compound can be divided into the two following sub-problems: (a) establishing the molecular formula, i.e. determine the type and number of each of the chemical elements making up the molecule, and (b) determining how the atoms are connected by chemical bonds of different multiplicities in the structure. Modern approaches for the determination of molecular formula are well established and based on high-resolution MS (HRMS). In general, they allow for the unambiguous determination of the molecular formula of an unknown.5
Molecular structure determination is such a challenge primarily due to the phenomenon of isomerism. In the 18th century Alexander Humboldt was probably the first who conjectured that there might be chemical substances which are composed of the same set of atoms but have different properties. This was later proven experimentally by Gay-Lussac, Liebig and Wohler6 at the beginning of the 19th century, and the new term “isomers” was introduced into chemistry by Berzelius7 in 1830. The following questions then became of interest to chemists: how many isomers can theoretically exist for a given molecule and how can they be exhaustively enumerated? The mathematical challenge of enumerating all isomers corresponding to a specific molecular formula was later realized when the notion of atom valence was defined and the first research into this area was initiated by Cayley8 in the second half of the 19th century. The computation of the number of isomers from the molecular formula became possible only in the 1960s when computers arrived on the scene and computational chemistry, as a specific area of scientific investigation, was born. Mathematical algorithms and programs were then developed911 that provided a possibility not only to calculate the number of isomers corresponding to a given molecular formula but also to generate the structural formulas for all isomers, which is clearly an important capability. As a result, chemists then had the ability to estimate the magnitude of the number of isomers that could be related to well-known substances and the values were rather unexpected. For example, it turned out that benzene was one of 217 conceivable isomers with the composition of C6H6 and articles were published in which all of the structural isomers were enumerated and depicted for the first time.12 The number of isomers produced by the calculations showed that the number is unexpectedly large, even for small molecular formulae.
Figure 1.1 displays the structures associated with a series of modest-sized chemical compounds and the number of potential calculated structural isomers.13,14
image
Figure 1.1 The structures of a series of small molecules and the theoretical numbers of isomers (N) associated with the related molecular formulae.
Figure 1.1 shows that even the simplest of structures can have hundreds of millions, up to trillions, of isomers. For the simple structure with the associated molecular formula of C10H17Br2ClO2, the number of isomers, N, exceeds 50 million and rudimentary inspection suggests that more than 40 million of these could likely exist. It should be noted that the CAS registry15 contains “only” 56 million known chemical compounds while 45 million are commercially available.
The number of isomers associated with the structures of medium-sized complex organic molecules can be estimated as approximately 1020–1030 isomers (in the order of Avogadro's number). At the same time, the following very important conclusion can be drawn: although the number of possible isomers is huge, those corresponding to a given molecular formula do make up a countable(at least in principle) and finite set. With this in mind, we can immediately formulate the following general CASE strategy: to eliminate “superfluous” isomers from the full isomer set by imposing different structural constraints. Figuratively, the general CASE strategy is similar to that of a sculptor who removes superfluous material to produce a masterpiece (Figure 1.2).
image
Figure 1.2 The analogy between the CASE strategy and the technique of a sculptor.
The more constraints that are imposed, and the more severe they are, then the larger the number of “superfluous” isomers that will be rejected. Since structure identification boils down to the selection of a unique structural formula assigned to an unknown, then a successful result depends on the screening and rejection of N–1 structural formulae that do not comply with the experimental data and constraints applied. We will call the set of n non-identical isomers (1≤n<N) selected as a result of imposing a series of constraints the solution of the structure elucidation problem. The solution is called valid if it contains the correct (genuine) structure and otherwise it is invalid. The solution is called unambiguous if the response file contains only one structure. It should be noted that an unambiguous solution can be either valid or invalid. If the response file contains no structures (n=0), then the imposed constraints are contradictory and the problem has no solution under the chosen conditions. The conceivable constraints leading either to unique structure or at least to a manageable set of plausible structures are outlined below.

1.2 A Molecule as a “Machine” for Coding Structural Information

Molecular structure elucidation is based on the same general cognitive principles that are common to the properties of particles belonging to the atomic and subatomic world. In order to obtain information regarding a particular property of a particle it is necessary to stimulate the particle using electromagnetic radiation (or combination of electromagnetic radiation and magnetic field) or stimulate using a stream of particles and then analyze the resulting response signal. In those cases where we want to extract information about the structure of a molecule we excite the system with electromagnetic radiation over a wide frequency range using electri...

Table of contents

  1. Cover
  2. Title
  3. Copyright
  4. Foreword
  5. Preface
  6. Contents
  7. Acknowledgments
  8. Part I Computer-Assisted Structure Elucidation: Fundamentals
  9. Part II Examples of Case Expert Systems
  10. Part III Expert System: Structure Elucidator
  11. Subject Index