Foreword By Alexandru T. Balaban
The twelve chapters of this book edited by Dr. Mahmud T. H. Khan have been written by 38 authors in a truly international collaborative effort. Addresses of the authors indicate the following 11 countries (in alphabetical order): Argentina, Cuba, Denmark, Egypt, India, Italy, Japan, Portugal, Romania, Spain, and USA.
Quantitative structure-activity relationships (QSARs) allow medicinal chemists to explore the infinite “chemical and stereochemical space” and to select structures that offer hope of finding new “lead compounds”, which become the basis for developing medicinal drugs after further investigation of their ADME-Tox properties (absorption, distribution, metabolism, excretion, and toxicity). At present, a patent offers the inventors protection for 20 years, during which the pharmaceutical company is expected to recover an investment that approaches one billion US dollars for a new drug on the market. However, the cell culture tests followed by animal and clinical tests last for at least 10 years, halving the real protection time for brand medicines before generic drugs appear. Because of these huge costs, all major pharmaceutical companies have large teams of specialists in computer-aided methods who screen possible structures so that unproductive avenues can be abandoned early. The present thesaurus of about 50 million chemical compounds contained in the Chem. Abstr. database is enriched each year with about one million new structures through the combined effort of researchers from universities and from chemical, materials science, and especially pharmaceutical companies. Many more structures per year are currently obtained by combinatorial chemical syntheses, but the Chem. Abstr. database records only the substances that are selected, isolated, and characterized.
The development of quantitative structure-activity or structure-property relationships (QSARs and QSPRs, respectively) has to be based on molecular descriptors, which provide a metric for chemical structures. This can be done in several ways, for instance by the fragment-based approach, or by associating biological activities with other (measurable or computable) properties such as the 1-octanol-water partition coefficient, i.e. the lipophilicity (logP), the number of hydrogen-bond donors or acceptors, the molecular weight, etc.
In drug development, there is a necessary sequence of events: hit identification → lead generation → lead optimization → candidate drug nomination. At present, the biochemical tests have gained a new, much more economical first step: in silico → in vitro → in vivo tests. For details, one may consult a recently published book edited by Oprea, T., Chemoinformatics in Drug Discovery, Wiley-VCH, Weinheim, 2005.
The close correspondence between chemical constitution and chemical graphs has led to one class of molecular descriptors that are simple to compute and offer a large field of applications: topological indices. Several books on such descriptors have been published during the last 15 years: (i) Kier, L. B.; Hall, L. H., Molecular Structure Description: The Electrotopological State, Academic Press, 1999; (ii) Devillers, J.; Balaban, A. T. (Editors), Topological Indices and Related Descriptors in QSAR and QSPR, Gordon and Breach, The Netherlands, 1999; (iii) Karelson, M., Molecular Descriptors in QSAR/QSPR, Wiley-Interscience, New York, 2000; (iv) Todeschini, R.; Consonni, V., Molecular Descriptors for Chemoinformatics, Wiley-VCH, New York, 2009; (v) Gonzalez-Diaz, H.; Munteanu, C. R. (Editors), Topological Indices for Medicinal Chemistry, Biology, Parasitology, Neurological and Social Networks, Transworld Research Network, Kerala, 2010.
A real-world example can serve to illustrate the usefulness of combining several types of molecular descriptors. For optimizing a lead decapeptide with immunosuppressive properties, a virtual library of 64 billion structures was generated by assigning 35 natural and non-natural amino acids to seven of the ten positions in the decapeptide. Thirteen molecular descriptors belonging to four classes (one of which consisted of four topological Kier-Hall and Balaban indices) provided “windows” of favorable or unfavorable numerical values for gradually filtering the different decapeptides using computed values. As a final result, five decapeptides were predicted to have the desired immunosuppressive properties; they were synthesized, and one of them was indeed found to improve 100-fold on the activity of the lead decapeptide (Lahana, R. and coworkers, Nature Biotechnol. 1998, 16, 748-752).
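The arithmetic and the filtering idea in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the descriptor names, window limits, and peptide values below are hypothetical and are not taken from the cited study; only the figures 35 and 7 come from the text.

```python
# Illustrative sketch only: descriptor names, windows, and values are hypothetical.

def library_size(n_building_blocks: int, n_variable_positions: int) -> int:
    """Number of sequences when each variable position can take any building block."""
    return n_building_blocks ** n_variable_positions


def passes_windows(descriptors: dict, windows: dict) -> bool:
    """True if every descriptor value lies inside its (low, high) window."""
    return all(low <= descriptors[name] <= high
               for name, (low, high) in windows.items())


def filter_candidates(candidates: dict, windows: dict) -> list:
    """Keep the names of the candidates whose descriptors fall in all windows."""
    return [name for name, desc in candidates.items()
            if passes_windows(desc, windows)]


# 35 amino acids at 7 variable positions, as stated in the text.
size = library_size(35, 7)
print(f"{size:,}")  # 64,339,296,875, i.e. about 64 billion

# Hypothetical favorable windows and hypothetical computed descriptor values.
windows = {"topological_index": (2.0, 3.5), "logP": (-1.0, 2.0)}
candidates = {
    "peptide_A": {"topological_index": 2.8, "logP": 0.5},
    "peptide_B": {"topological_index": 4.1, "logP": 0.2},
    "peptide_C": {"topological_index": 3.0, "logP": 2.6},
}
kept = filter_candidates(candidates, windows)
print(kept)  # ['peptide_A']
```

In practice a library of this size cannot be enumerated in memory as a dictionary; the sketch only shows why successive windows shrink the candidate set.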
Many QSAR studies benefit at present from the availability of computer programs that calculate hundreds of molecular descriptors from constitutional formulas (three-dimensional features are seldom taken into account). Two such computer programs are Katritzky’s CODESSA and Todeschini’s DRAGON. A statistically driven selection of a few descriptors then follows, together with a validation of the results that yields confidence in predictions within the range of the tested structures.
The present developments in understanding the mechanism of action for many natural antibiotics have often been based on observing directly the complex interactions at the molecular level from X-ray diffraction studies. Of course, natural evolution works under strict limitations imposed by the availability of starting materials, by thermodynamically allowed processes using ATP-based energy transfer, and by receptors manufactured from the 20 natural amino acids. Drug design is free from such constraints and is limited only by the imagination of medicinal chemists. In this context, one may recall that chemistry, like mathematics, is a science that combines rigor with imagination and intuition, adding art-like qualities. Accordingly, drug design was designated by German scientists as the “art of the four G’s”: Geduld (patience), Glück (luck), Geschick (skill), and Geld (money).
Foreword By Roberto Todeschini
QSARs are based on the assumption that the structure of a molecule must contain the features responsible for its physical, chemical and biological properties and on the ability to capture these features into one or several numerical descriptors. With QSAR models, the biological activity (or property, reactivity, etc.) of a newly designed or untested chemical can be inferred from the molecular structure of similar compounds, whose activities (properties, reactivities, etc.) have already been assessed.
For QSAR, as for all research-related human activities, knowledge should not be considered as something given once and for all, based on some final basic theories, but as a network of models in progress. This network consists primarily of knots, i.e. objects, facts, theories, statements, and models, and of links between the knots, i.e. relationships, comparisons, differences, and analogies. Such a network is more than a collection of facts: it is a powerful engine for analogical reasoning. This analogical reasoning, which can now draw on around 50 years of experience, should further strengthen the field of QSAR and broaden its applications.
Indeed, QSAR modeling has been practiced for nearly five decades in agrochemistry, drug design, toxicology, and industrial and environmental chemistry. In the coming years, the growing use of QSARs can be attributed mainly to the rapid and extensive development of methodologies and computational techniques, which have made it possible to delineate and refine the many variables and approaches used to model molecular properties. Furthermore, the popularity of QSARs is growing day by day, since their applications are no longer confined to academic research but are widespread in several public and private sectors within medicine, pharmacology, and toxicology and, in general, wherever human health is involved. For instance, the usefulness of QSAR for generating data on chemicals in a time- and cost-effective way, and its contribution to modern drug discovery, are among several advancements that have considerably broadened the perspectives of QSAR.
However, it should be recognized that QSAR analysis on its own often cannot give useful answers to complex problems. In such cases, the analysis has to be accompanied by one or several tools that can bridge this gap, making it feasible for QSAR to deal with highly complex problems. The increasing complexity of QSAR is reflected in its highly ambitious objectives: from the classical simple models evaluated on a few congeneric compounds, interest has shifted to modeling several thousands of diverse compounds provided by huge databases. This step was accompanied by major developments in chemoinformatics which, together with several other tools, allowed easy handling of huge and complex data sets.
However, the rising complexity of the studied systems has not always been matched by a corresponding increase in the quality of the modeling tools. Indeed, to deal with this increased complexity, we need to capture not only the linear relationships but also the nonlinear ones. This point is particularly important because neglecting the nonlinear aspects of the problem often leads to incomplete analysis and practically useless models.
The complexity of the problem is closely related to the current availability of several thousands of molecular descriptors. Indeed, molecular descriptors are of crucial importance in QSAR research, where they constitute the independent chemical information used to predict the biological activities of interest. However, the use of irrelevant descriptors not only increases the model complexity unnecessarily but usually also lowers the predictive capability of the model. This leads to the need for variable subset selection approaches, which further adds to the complexity.
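As a minimal sketch of what variable subset selection can mean in its simplest form, the following Python example ranks descriptors by the absolute Pearson correlation with the activity and keeps the top k. This correlation-ranking filter is only one of many possible approaches and is not claimed to be the method used in any chapter; the descriptor names and data are hypothetical.

```python
from math import sqrt

# Illustrative sketch only: a simple correlation-ranking filter on hypothetical data.

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_descriptors(table, activity, k):
    """Return the k descriptor names with the highest |correlation| to the activity."""
    scores = {name: abs(pearson(values, activity))
              for name, values in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical activities of five compounds and three hypothetical descriptors.
activity = [1.0, 2.0, 3.0, 4.0, 5.0]
table = {
    "d_relevant": [1.1, 1.9, 3.2, 3.9, 5.1],   # tracks the activity closely
    "d_constant": [7.0, 7.0, 7.0, 7.0, 7.0],   # carries no information
    "d_noise": [0.3, -0.2, 0.4, -0.1, 0.1],    # essentially unrelated
}
selected = select_descriptors(table, activity, k=1)
print(selected)  # ['d_relevant']
```

A univariate filter like this ignores correlations between descriptors and any nonlinear dependence, which is one reason more elaborate selection strategies are used in practice.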
As an additional element of complexity, it should also be remembered that, from a theoretical point of view and unlike other systems, molecules constitute an intrinsically discrete space, i.e. the space between two molecules does not exist in principle.
Considering all these aspects, the prospect of deriving a unique generalized model, valid for all chemical categories and able to predict some of their responses, not only appears very ambitious but almost seems like a dream. An obvious and feasible alternative is to explore the possibility of building local models, which make predictions based on the information available for the compounds most similar to the target compound, as suggested, for example, by the read-across strategy.
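The idea of a local, similarity-based prediction can be sketched as follows. This is a hedged illustration, not a definitive read-across procedure: it assumes that each compound is represented by a set of structural features, uses the Tanimoto coefficient as the similarity measure, and predicts the target activity as a similarity-weighted average over the k most similar compounds. The feature labels and activity values are hypothetical.

```python
# Illustrative sketch only: features, activities, and the weighting scheme are assumptions.

def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def read_across(target: set, training: list, k: int = 2):
    """Similarity-weighted mean activity of the k nearest training compounds.

    `training` is a list of (feature_set, activity) pairs. Returns None when
    none of the selected neighbours shares any feature with the target.
    """
    ranked = sorted(training,
                    key=lambda item: tanimoto(target, item[0]),
                    reverse=True)[:k]
    weights = [tanimoto(target, features) for features, _ in ranked]
    total = sum(weights)
    if total == 0:
        return None
    return sum(w * act for w, (_, act) in zip(weights, ranked)) / total


# Hypothetical compounds with known activities.
training = [
    ({"OH", "ring", "Cl"}, 5.0),
    ({"OH", "ring"}, 4.0),
    ({"NH2", "chain"}, 1.0),
]
target = {"OH", "ring", "Cl", "CH3"}

prediction = read_across(target, training, k=2)
print(round(prediction, 2))  # 4.6
```

Returning None when no neighbour is similar is a deliberate choice here: a local model should decline to predict for a compound that lies outside the region covered by the available data.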
But it should not be forgotten that dreams sometimes come true…!
In this book, another prominent step towards the improvement of QSAR methodologies is proposed, together with effective solutions and applications illustrated by several interesting case studies, with the aim of highlighting significant new contributions to the field of QSAR.