In silico target prediction, or inverse virtual screening, of a known bioactive small molecule has several important applications in medicinal chemistry research: identifying off-targets to prevent side effects, repurposing known drugs, predicting polypharmacology effects, and identifying the actual target in phenotypic screening approaches [1]. Docking-based target prediction has the advantage that it is independent of the underlying molecular structure. Unfortunately, ranking protein targets by scoring function results is problematic and fails in most cases because of “inter-protein scoring noise”: the retrieved scores depend strongly on the physicochemical properties of the protein binding pockets used for docking and are therefore not comparable between different protein targets. Target-specific scoring functions are an alternative that can improve target ranking. Developing scoring functions for many targets requires considerable effort, but for a selected number of interesting targets the effort should be worthwhile. This is especially true for polypharmacology prediction, where only two or three targets need to be addressed to find a ligand that binds to all of them simultaneously.

We developed a novel protein-ligand interaction fingerprint, PADIF (Protein per Atom Score Contributions Derived Interaction Fingerprint) [2], which uses the per-protein-atom score contributions of GOLD's ChemPLP scoring function to post-process generated docking poses. The fingerprint consists of eight different interaction terms that are available for each protein atom and contribute to the final score. Unlike many other methods, this approach incorporates the strength of an interaction, as given by the scoring function contributions. This also covers unfavourable interactions, since the underlying atom contributions can describe terms that have a negative effect on the final score. In an exhaustive validation study, PADIF showed superior results in docking-based virtual screening in comparison to other protein-ligand interaction fingerprints.
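As a rough illustration of the fingerprint layout described above, the following Python sketch concatenates eight per-atom term values for each binding-pocket atom into one fixed-length vector. It is not the published implementation: the function name `build_fingerprint`, the atom identifiers, the zero-filling of atoms without contributions, and all numeric values are assumptions made for this example, and reading the contributions from GOLD output is not shown.

```python
from typing import Dict, List, Sequence

# The abstract states that eight interaction terms are available per protein atom.
N_TERMS = 8


def build_fingerprint(
    per_atom_terms: Dict[int, Sequence[float]],
    pocket_atoms: Sequence[int],
) -> List[float]:
    """Concatenate the eight term values of every pocket atom in a fixed order.

    Atoms without any contribution for this pose are filled with zeros
    (an assumption of this sketch), so that all poses docked into the same
    pocket yield vectors of identical length.
    """
    fingerprint: List[float] = []
    for atom_id in pocket_atoms:
        terms = per_atom_terms.get(atom_id, [0.0] * N_TERMS)
        if len(terms) != N_TERMS:
            raise ValueError(f"atom {atom_id}: expected {N_TERMS} terms, got {len(terms)}")
        fingerprint.extend(float(value) for value in terms)
    return fingerprint


# Hypothetical pocket with three protein atoms and made-up contributions.
# Signs and magnitudes are illustrative only.
pocket_atoms = [101, 102, 103]
pose_contributions = {
    101: [1.2, 0.0, 0.0, 0.4, 0.0, 0.3, 0.9, -0.1],
    103: [0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.5, -0.7],
}

fp = build_fingerprint(pose_contributions, pocket_atoms)
print(len(fp))  # 3 atoms x 8 terms
```

Keeping the atom order fixed per binding pocket means that every position in the vector refers to the same protein atom and interaction term across all docked ligands, which is what allows the vectors to be compared or used as input for a model.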

In subsequent studies, PADIF proved useful in docking-based target prediction [3]. Using the calculated PADIFs, artificial neural network (ANN) and support vector machine (SVM) classification models were created as target-specific scoring functions that classify a molecule as active or inactive for a given target based on calculated probability scores. The probability scores generated by each model were used for ranking and inter-target comparison across a diverse set of protein structures. Selecting the top-ranked targets and comparing them with the known target activity profile of a molecule showed that the approach led to reasonable target predictions in most cases. Reasonable prediction capabilities were also achieved for compounds with bioactivity data for more than one of the 20 targets.
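The ranking step can be sketched as follows, under strong simplifying assumptions. Each target has its own model that maps the fingerprint of the molecule docked into that target to a probability of being active, and targets are then sorted by that probability. The logistic stand-in below replaces the ANN and SVM models of the original work purely to keep the example self-contained; the target names, weights, and fingerprint values are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Model = Callable[[Sequence[float]], float]


def logistic_model(weights: Sequence[float], bias: float) -> Model:
    """Return a toy stand-in for a trained target-specific classifier."""
    def predict_probability(fingerprint: Sequence[float]) -> float:
        if len(fingerprint) != len(weights):
            raise ValueError("fingerprint length does not match the model")
        z = bias + sum(w * x for w, x in zip(weights, fingerprint))
        return 1.0 / (1.0 + math.exp(-z))
    return predict_probability


def rank_targets(
    fingerprints: Dict[str, Sequence[float]],
    models: Dict[str, Model],
) -> List[Tuple[str, float]]:
    """Score one molecule against every target and sort by probability.

    `fingerprints` maps each target to the fingerprint obtained by docking
    the molecule into that target, so vector lengths may differ per target.
    """
    scored = [(target, models[target](fp)) for target, fp in fingerprints.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical models and fingerprints for three made-up targets.
models = {
    "target_A": logistic_model([1.0, -0.5], bias=0.0),
    "target_B": logistic_model([0.2, 0.2, 0.2], bias=-1.0),
    "target_C": logistic_model([-1.0], bias=0.5),
}
fingerprints = {
    "target_A": [2.0, 1.0],
    "target_B": [1.0, 1.0, 1.0],
    "target_C": [2.0],
}

ranking = rank_targets(fingerprints, models)
for target, probability in ranking:
    print(f"{target}: {probability:.3f}")
```

Because every model outputs a probability on the same 0 to 1 scale, the values can be compared across targets, unlike raw docking scores that depend on the properties of each binding pocket.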

In future work, this approach should be extended to a larger number of protein targets and to polypharmacology prediction.

References

  1. Sydow, D., Burggraaff, L., Szengel, A., van Vlijmen, H.W.T., IJzerman, A.P., van Westen, G.J.P., Volkamer, A. Advances and challenges in computational target prediction. J. Chem. Inf. Model. 2019, 59, 1728-1742.
  2. Jasper, J., Humbeck, L., Brinkjost, T., Koch, O.* A novel interaction fingerprint derived from per atom score contributions: Exhaustive evaluation of interaction fingerprint performance in docking based virtual screening. J. Cheminf. 2018, 10.
    https://dx.doi.org/10.1186/s13321-018-0264-0
  3. Nogueira, M.S., Koch, O.* The development of target-specific machine learning models as scoring functions for docking-based target prediction. J. Chem. Inf. Model. 2019, 59, 1238-1252. Part of the special issue “machine learning in drug discovery”.
    https://doi.org/10.1021/acs.jcim.8b00773