Artificial intelligence (AI) is essentially the ability of computer software to learn, rather than merely carrying out instructions it has been pre-programmed to perform.
AI comes in many forms, but machine learning and natural language processing are two which are helping drug discovery scientists find new drug candidates. Machine learning is a form of AI in which systems learn from data to determine patterns and make decisions based on these patterns, while natural language processing refers to computer systems that can ‘read’ and use written information.
There has been great progress in machine learning, and it gives real hope for speeding up drug discovery. At the moment, computational modeling of drugs and their interaction sites uses AI, but there is also the possibility of controlling experimental labs and running large-scale experiments automatically.
AI has truly moved from concept to reality in the pharmaceutical industry. Many companies are already using it to pore over mountains of scientific data in an effort to speed up and improve the drug discovery process. The technology is also starting to find new applications in areas as diverse as regulatory/compliance, clinical trials and manufacturing.
Concepts in AI
Big Data Analysis
Big data has long been a buzzword in drug discovery, but as analysis methods become more sophisticated, its potential is beginning to be realized. We look at some of the latest advances in big data analysis for drug discovery.
Fuzzy Logic
Conventional logic demands that a proposition is either true or false. This maps onto a conventional set theory, so that a hypothesis lies either in the “true” set, or lies wholly outside it. That is, the membership function in the “true” set is either 1 (the hypothesis is true) or 0 (the hypothesis lies outside the “true” set, and is false). In real life, though, these black-and-white concepts may be of little utility. For fuzzy sets, membership functions are not restricted to be 0 or 1, but can take any continuous value between these limits. Fuzzy logic can be especially useful in describing target properties for optimizations.
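As a minimal sketch of the difference, the Python below contrasts a crisp membership function (0 or 1) with a fuzzy one that takes any value between those limits. The target property (tablet disintegration time), the 5 and 15 minute limits and the function names are illustrative assumptions, not values taken from any study.

```python
def crisp_membership(disintegration_min: float, limit: float = 10.0) -> float:
    """Conventional set: a tablet is either 'fast' (1.0) or not (0.0)."""
    return 1.0 if disintegration_min <= limit else 0.0


def fuzzy_membership(disintegration_min: float,
                     fully_fast: float = 5.0,
                     not_fast: float = 15.0) -> float:
    """Fuzzy set: membership falls linearly from 1 to 0 between two limits."""
    if disintegration_min <= fully_fast:
        return 1.0
    if disintegration_min >= not_fast:
        return 0.0
    return (not_fast - disintegration_min) / (not_fast - fully_fast)


# Assumed example values: times on either side of the crisp 10-minute limit.
for minutes in (4.0, 9.0, 11.0, 16.0):
    print(minutes, crisp_membership(minutes), fuzzy_membership(minutes))
```

A tablet that disintegrates in 9 minutes and one that takes 11 minutes fall on opposite sides of the crisp limit, yet their fuzzy memberships differ only slightly, which is closer to how a formulator would describe them.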
Artificial neural network (ANN):
Neural networks are mathematical constructs that are capable of “learning” relationships within data, with no prior knowledge required from the user. The neural network makes no assumptions about the functional form of the relationships; it simply generates and assesses a range of models to determine one that will best fit the experimental data provided to it. The models generated by neural networks allow “what if” possibilities to be investigated easily. However, their capabilities are enhanced substantially by combining them with other technologies. For example, using genetic algorithms for optimization, together with neural network models, has proved to be exceptionally powerful. Fuzzy logic, which allows objectives to be expressed in simple terms, also complements neural network modeling; the combination is called neurofuzzy logic. Neurofuzzy logic combines the ability of neural networks to learn from data with fuzzy logic’s ability to express complex concepts intuitively.
Like humans, neural networks learn directly from input data. The learning algorithms take two main forms. Unsupervised learning, where the network is presented with input data and learns to recognize patterns in the data, is useful for organizing large amounts of data into a smaller number of clusters. For supervised learning, which is analogous to “teaching” the network, the network is presented with a series of matching input and output examples, and it learns the relationships connecting the inputs to the outputs. Supervised learning has proved most useful for formulation, where the goal is to determine cause-and-effect links between inputs (ingredients and processing conditions) and outputs (measured properties). The basic component of the neural network is the neuron, a simple mathematical processing unit that takes one or more inputs and produces an output.
ANN programs are useful for understanding cause-and-effect relationships between inputs (such as formulation parameters) and outputs (such as product properties).
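The neuron and the supervised learning described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the weighted sum with a sigmoid output is one common form of neuron, and the training data (scaled binder fraction and compression force mapped to “hard tablet or not”) are invented for the example.

```python
import math


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs passed through a sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def train(examples, epochs=5000, rate=0.5):
    """Supervised learning: nudge the weights to shrink the squared output error."""
    n_inputs = len(examples[0][0])
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            out = neuron(inputs, weights, bias)
            # Gradient of the squared error through the sigmoid.
            delta = (out - target) * out * (1.0 - out)
            weights = [w - rate * delta * x for w, x in zip(weights, inputs)]
            bias -= rate * delta
    return weights, bias


# Hypothetical scaled data: (binder fraction, compression force) -> hard tablet (1) or not (0).
examples = [((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.8, 0.9), 1), ((0.9, 0.7), 1)]
weights, bias = train(examples)
predictions = [round(neuron(x, weights, bias)) for x, _ in examples]
print(predictions)
```

A real formulation model would use many neurons arranged in layers and would be checked on data held back from training; a single neuron is shown only to make the input-to-output mapping concrete.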
Evolutionary Computing and Genetic Algorithms
Evolutionary computing is a general term that describes computational processes in which solutions evolve, using rules of inheritance, recombination (or cross-over), mutation and selection. Evolutionary algorithms have found application in formulation research.
The genetic algorithm, one form of evolutionary computing, provides a search technique which is particularly suited to optimization. During this process, an initial population of solutions is generated, and the fitness of each member of the population is assessed. The fittest solutions then become the “parents” of the next generation. Allowing some recombination and mutation introduces a further degree of novelty into the population so that the genetic algorithm is more likely to find a global optimum solution. It is this ability to find the global optimum in a complex design space which renders genetic algorithms so useful. One requirement for genetic algorithms is that a criterion of “fitness” can be defined. This can vary from problem to problem. For multi-dimensional optimization, it has proved useful to define an objective function which is a weighted sum of the desirability of each of the properties. The use of weights in the sum allows some properties to assume more importance than others, and the fittest solutions are those that best meet the overall objectives.
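The loop of selection, recombination and mutation, with a weighted-sum objective function, might look like the sketch below. Everything specific in it is an assumption made for illustration: the two formulation variables, the toy desirability functions, the weights, and the choice to carry the parents over unchanged into the next generation.

```python
import random

WEIGHTS = {"hardness": 0.7, "disintegration": 0.3}  # assumed relative importance


def desirabilities(x):
    """Toy property models (assumptions): each returns a desirability in [0, 1]."""
    binder, lubricant = x
    return {
        "hardness": 1.0 - abs(binder - 0.6),           # best near binder = 0.6
        "disintegration": 1.0 - abs(lubricant - 0.2),  # best near lubricant = 0.2
    }


def fitness(x):
    """Objective function: weighted sum of the desirability of each property."""
    d = desirabilities(x)
    return sum(WEIGHTS[name] * d[name] for name in WEIGHTS)


def evolve(pop_size=30, generations=40, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the fittest half become the parents of the next generation.
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Recombination (cross-over): take one variable from each parent.
            child = [a[0], b[1]]
            # Mutation: occasionally perturb a variable, clipped to [0, 1].
            for i in range(2):
                if rng.random() < mutation_rate:
                    child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.1)))
            children.append(tuple(child))
        population = parents + children  # parents are kept unchanged
    return max(population, key=fitness)


best = evolve()
print(best, round(fitness(best), 3))
```

Changing the entries in `WEIGHTS` shifts which property dominates the search, which is how one property can be made to assume more importance than another.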
Applications of AI in pharmaceuticals
Computational power has long been used to aid drug discovery, but before AI this mainly took the form of virtual screening, molecular modeling and predicting how likely a drug candidate is to have the right properties to be non-toxic in the body. This process essentially skims the pool of drug candidates for those most likely to be successful, meaning that the later, more expensive and time-consuming steps in pre-clinical tests are not carried out on molecules unlikely to work.
There are three specific areas of interest in drug discovery where AI and machine learning can be applied: target identification, lead optimization and screening drug candidates. Target identification involves further validation and/or refinement of targetable areas for lead identification and/or compound screening. Lead optimization involves algorithms to assist in simulations, such as simulated structure, toxicity, binding and drug availability. Screening drug candidates uses algorithms for image or pattern recognition in very high content or high throughput screens to assist scientists in identifying rare or non-obvious patterns in very large data sets.
Within drug discovery, this could mean looking at libraries of potential drug compounds and determining, from candidates that were previously successful in other diseases, which molecule will work for a new problem. The difference with AI is that, after being trained on libraries of compounds with known properties, it can learn to make associations for itself and tell us which molecules are likely to be successful for the new problem.
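One very simple way to learn from a library of compounds with known properties is a nearest-neighbour vote on structural similarity. The sketch below uses that technique purely as an illustration; it is not the method of any company named in this article, and the feature names, library entries and outcomes are all hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical training library: feature sets with a known outcome (True = active).
LIBRARY = [
    (frozenset({"amide", "phenyl", "hydroxyl"}), True),
    (frozenset({"amide", "phenyl", "fluoro"}), True),
    (frozenset({"nitro", "alkyl"}), False),
    (frozenset({"nitro", "ester", "alkyl"}), False),
]


def predict_active(candidate: frozenset, k: int = 3) -> bool:
    """Vote among the k library compounds most similar to the candidate."""
    ranked = sorted(LIBRARY, key=lambda item: tanimoto(candidate, item[0]), reverse=True)
    votes = [active for _, active in ranked[:k]]
    return sum(votes) > k / 2


print(predict_active(frozenset({"amide", "phenyl", "chloro"})))
print(predict_active(frozenset({"nitro", "alkyl", "chloro"})))
```

Machine learning models used in practice learn far richer associations than this, but the principle is the same: known examples go in, and a prediction for an unseen molecule comes out.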
The rate at which new drugs are discovered is in decline. The advent of rational computer-aided drug design was heralded as a new dawn for drug discovery: with the ability to model potential drug targets, drugs and even the systems they act in, it seemed only a matter of time before new drugs moved swiftly through the pipeline to the clinic.
Tech giant Google’s DeepMind Health is making advances into healthcare through AI by working in partnership with Moorfields Eye Hospital NHS Foundation Trust in London, developing technology to address macular degeneration in aging eyes. The results, recently published in the journal Nature Medicine, detail the process in which AI technology analyzed thousands of historic retinal scans to identify signs of eye diseases such as glaucoma, macular degeneration and diabetic retinopathy.
Boston-based biopharma company Berg is using AI to research and develop diagnostics and therapeutic treatments in multiple areas, including oncology, by applying an algorithm and probability-based artificial intelligence to analyze large numbers of patients’ genotype, phenotype and other characteristics.
Berg’s team identified the importance of certain naturally occurring molecules in alterations in cellular metabolism pathways. This led the group to discover how its own AI-assisted cancer drug, BPM31510, works and indicated some possible therapeutic uses. The company is also using this AI system to look for drug targets and therapies for other conditions, including diabetes and Parkinson’s disease.
London-based start-up firm Benevolent AI developed its own intelligence platform, fed with data from research papers, patents, clinical trials and patient records.
This forms a representation of more than one billion known and inferred relationships between biological entities such as genes, symptoms, diseases, proteins, tissues, species and candidate drugs.
Benevolent Bio have used their natural language processing software to find a drug candidate which reduced symptoms of the neurodegenerative disease amyotrophic lateral sclerosis (ALS) in mice. The platform contains information from patient records, clinical trials, the research literature and patents and can then infer relationships between potential drug candidates, diseases, genes and more to predict which drug-like molecules will work for which diseases.
In an experiment, researchers asked the system to suggest new ways to treat amyotrophic lateral sclerosis. It returned around 100 existing compounds as potential candidates.
Scientists selected five to undergo tests using patient-derived cells. The research found that four of these compounds delivered promising results.
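To make the idea of inferring relationships from known ones concrete, the sketch below chains two kinds of known link (drug inhibits gene, gene implicated in disease) into an inferred drug-to-disease link. It is a toy illustration only and makes no claim about how the platform described above works; the entity names, relation names and the inference rule are all assumptions.

```python
from collections import defaultdict

# Hypothetical known relationships (subject, relation, object); all names are placeholders.
KNOWN = [
    ("drug_A", "inhibits", "gene_X"),
    ("drug_B", "inhibits", "gene_Y"),
    ("gene_X", "implicated_in", "disease_1"),
    ("gene_Y", "implicated_in", "disease_2"),
    ("gene_X", "implicated_in", "disease_2"),
]


def infer_candidates(triples):
    """Infer drug -> disease links: a drug that inhibits a gene implicated in a disease."""
    targets = defaultdict(set)   # drug -> genes it inhibits
    diseases = defaultdict(set)  # gene -> diseases it is implicated in
    for subject, relation, obj in triples:
        if relation == "inhibits":
            targets[subject].add(obj)
        elif relation == "implicated_in":
            diseases[subject].add(obj)
    inferred = defaultdict(set)
    for drug, genes in targets.items():
        for gene in genes:
            inferred[drug] |= diseases[gene]
    return {drug: sorted(found) for drug, found in inferred.items()}


print(infer_candidates(KNOWN))
```

Inferred links of this kind are only hypotheses; as in the experiment above, they still have to be tested in the laboratory.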
GNS Healthcare offers Reverse Engineering & Forward Simulation (REFS), machine learning software that automates work that previously involved trial and error to match drug interventions with individual patients. It is claimed that REFS-generated machine learning models are capable of predicting a patient’s response to possible drug treatments by inferring possible relationships among factors that might be affecting the results, such as the body’s ability to absorb the compounds, the distribution of those compounds around the body, and a person’s metabolism.
In a project to develop a treatment for Ebola virus infections, Atomwise provided the AI technology to perform the drug research, while the University of Toronto contributed biological insights about the virus. Atomwise defined a region to investigate for potential small molecules. This region was then screened for molecules that bind to the glycoprotein. The compounds already had safety data for use in patients and could be rapidly brought forward for clinical trials.
Nuritas claims to have developed a machine learning application that finds and unlocks naturally occurring bioactive peptides from food sources for the management of chronic metabolic diseases.
Deep learning has been used for in vitro prediction of the properties of pharmaceutical formulations, such as drug release and disintegration time. In one project, a range of artificial intelligence-based software was used to discover hidden knowledge associated with the manufacture of ramipril tablets, with the intention of establishing a multi-dimensional design space that ensures consistent product quality.
Neurofuzzy logic has been used to generate a detailed understanding of the extrusion/spheronization process, which comprises several key stages; the aspect ratio of the produced spheroids was regarded as one of the key quality attributes of the finished product.
Neurofuzzy logic software was used to generate predictive models from 56 experimental data records, from which a set of “if then” rules was extracted. These rules illustrated the cause-and-effect relationships between key process variables and the quality of pellets.
Literature reports describe the use of ANNs for the direct compression tablet formulation of hydrochlorothiazide, in order to maximize tablet strength and select the best lubricant, and to relate both formulation variables (diluent type and concentration, binder concentration) and processing variables (type of granulator, method of addition of binder) to granule and tablet properties (friability, hardness and disintegration time). Neural networks have also been applied to modeling immediate release capsule formulations, rapidly disintegrating or dissolving tablets, and a novel oral microemulsion formulation of rifamycin and isoniazid for the treatment of children during the continuation phase of tuberculosis.
An ANN model has been used to optimize diclofenac sodium sustained release matrix tablets. Formulation variables, including the concentrations of cetyl alcohol, polyvinylpyrrolidone K 30 and magnesium stearate, together with sampling time, were chosen as inputs.
A neural network has also been used to model the formulation of salbutamol sulfate osmotic pump tablets; the authors predicted the release parameters for 1000 formulations, from which they selected an optimum with the desired release pattern.
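The step of predicting many formulations and picking the one closest to a desired release pattern can be sketched as below. The prediction function is a hand-written stand-in for a trained neural network, and the two formulation variables, the 8-hour release target of 0.8 and all names are assumptions made for the example, not details of the salbutamol study.

```python
import random


def predicted_release(formulation):
    """Stand-in for a trained neural network (an assumption, not a real model).

    Returns a predicted fraction of drug released at 8 h, between 0 and 1.
    """
    osmotic_agent, coating = formulation
    return min(1.0, max(0.0, 0.3 + 0.8 * osmotic_agent - 0.5 * coating))


def select_optimum(n_candidates=1000, target=0.8, seed=1):
    rng = random.Random(seed)
    # Candidate formulations: two scaled variables, each between 0 and 1.
    candidates = [(rng.random(), rng.random()) for _ in range(n_candidates)]
    # Keep the candidate whose predicted release is closest to the target.
    return min(candidates, key=lambda f: abs(predicted_release(f) - target))


best = select_optimum()
print(best, round(predicted_release(best), 3))
```

Because each prediction costs almost nothing, a thousand candidates can be screened on a computer before a single tablet is made; the selected optimum would still need to be confirmed experimentally.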
Various studies have concluded that neural networks were the more reliable predictor of data in the design of their systems.
A big drain in the pharmaceutical pipeline is the discovery that some drugs only work for a certain group of people, sometimes because of their genes, sometimes even their microbiome. GNS Healthcare partnered with Genentech, a member of the Roche group, to use machine learning to develop personalized medicine approaches. GNS hope AI will predict which subgroups will respond to treatments, to save money and enable drugs to be used only in those who will benefit from them.
Pharmaceutical drug manufacturing, from formulation development to finished product, is very complex. This process includes multivariate interactions between raw materials and process conditions. These interactions are very important for the processability and quality of the finished product. Hence, these interactions should be taken into account early on, such that later loss of time and money is not incurred.
The genetic algorithm is an effective and useful tool to predict the results that arise from changes in the input parameters, such as the formulation. Using this approach with neural networks can be productive because it provides “what if” predictions and optimization (8, 9).
If AI technology can be developed, the options are virtually limitless. A true AI technology could think like a human brain but with a much greater capacity, holding the entire system of human and disease biology in its ‘mind’ at one time, alongside all the information in the scientific literature. It could mine both these resources to find the best therapy option for any given disease and individual. Wang concludes, “Continuing advances in computer power will allow scientists to continue to build and augment existing models, further improving the adoption of synthesis planning software on a larger scale. The AI and machine learning landscape within pharma has seen massive growth within the last 5+ years, the expectation is that this growth will only continue as capabilities expand.”
Dr. M. R. Bhalekar
Prof. Pharmaceutics Dept.