Hacker News

Why computational predictive toxicology is hard

by abhishaike on 10/20/2024, 10:09:46 PM with 3 comments

by youoy on 10/23/2024, 11:18:43 AM
> Some drug side effects, like mild nausea or headache, might be considered acceptable trade-offs for therapeutic benefit. But others, like liver failure or birth defects, would be considered unacceptable at any dose. This is particularly true when it comes to environmental chemicals, where the effects may be subtler and the exposure levels more variable. Is a chemical that causes a small decrease in IQ scores toxic? What about one that slightly increases the risk of cancer over a lifetime (20+ years)?
This is an interesting question. We know, for example, that exposure to traffic pollution reduces fertility and life expectancy, yet that seems to be accepted. City centers are usually the most polluted areas, but they can also have some of the most expensive square meters to live in. That said, most people may simply not be aware of the risk.
by tananan on 10/23/2024, 12:05:16 PM
Biology is messy, quite hard to fit into neat lock-and-key paradigms.
Furthermore, the data we do collect on relationships between drugs and X (whether fine- or coarse-grained toxicity, activity, preference for a target, etc.) is disorganized and biased, both intentionally and unintentionally.
I was working on affinity models using ML for a while (predicting whether a drug sticks to X). I spent quite some time imagining cool architectures to handle the task, and at one point even thought I might have SOTA on a common benchmark.
It took me a bit to realize that not only was I not SOTA with a proper "hard" split, but that this whole zoo of models coming out - claiming to have an inch over the previous best model - all perform more-or-less the same. The "better-performing" ones often added a lot of bells & whistles and smart theory only to result in no pragmatic edge.
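To make the "hard" split concrete, here is a minimal sketch of one common reading of it: a "cold drug" split, where no drug in the test set ever appears in training. This is an assumption about what is meant; stricter variants group by scaffold or similarity cluster instead of exact identity. The function name `cold_drug_split` and the toy records are hypothetical.

```python
import random

def cold_drug_split(pairs, test_fraction=0.2, seed=0):
    """Split (drug, target, label) records so no drug is shared between sets.

    Hypothetical helper: entire drugs are held out, not individual pairs.
    """
    drugs = sorted({drug for drug, _, _ in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)
    # Hold out at least one whole drug for testing.
    n_test = max(1, int(len(drugs) * test_fraction))
    test_drugs = set(drugs[:n_test])
    train = [p for p in pairs if p[0] not in test_drugs]
    test = [p for p in pairs if p[0] in test_drugs]
    return train, test

# Toy records, invented purely for illustration.
pairs = [
    ("drugA", "T1", 1), ("drugA", "T2", 0),
    ("drugB", "T1", 1), ("drugB", "T3", 1),
    ("drugC", "T2", 0), ("drugC", "T3", 0),
    ("drugD", "T1", 1), ("drugE", "T2", 0),
]
train, test = cold_drug_split(pairs, test_fraction=0.4)
```

With a random split over pairs, the same drug typically lands on both sides, so a model can score well by memorizing drugs rather than learning interactions. Holding out whole drugs removes that shortcut.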
The cherry on top was a study which found that a common benchmark was so biased that models perform eerily well even when you remove the drug or the target from the datapoint. Yup, your model could predict relatively well whether Mike and Alice like each other if you only show it Mike (or Alice) at training and inference. The exact reference evades me, sadly.
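A rough sketch of that kind of sanity check, assuming records of the form (drug, target, label): fit a baseline that sees only one side of each pair and predicts the mean label observed for that entity. The helper names and toy data are hypothetical, and this is not the method of the study mentioned above.

```python
from collections import defaultdict

def fit_one_sided_baseline(train, side=0):
    """Learn the mean label per entity on one side (0 = drug, 1 = target).

    Hypothetical helper: the other half of each pair is ignored entirely.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    total = 0.0
    for record in train:
        key, label = record[side], record[2]
        sums[key] += label
        counts[key] += 1
        total += label
    global_mean = total / len(train)
    means = {key: sums[key] / counts[key] for key in sums}

    def predict(record):
        # Fall back to the global mean for entities never seen in training.
        return means.get(record[side], global_mean)

    return predict

def accuracy(predict, records, threshold=0.5):
    """Fraction of records whose thresholded prediction matches the label."""
    hits = sum((predict(r) >= threshold) == bool(r[2]) for r in records)
    return hits / len(records)

# Toy data, invented for illustration: labels depend only on the drug.
train_records = [
    ("drugA", "T1", 1), ("drugA", "T2", 1),
    ("drugB", "T1", 0), ("drugB", "T2", 0),
]
test_records = [("drugA", "T3", 1), ("drugB", "T3", 0)]

drug_only = fit_one_sided_baseline(train_records, side=0)
target_only = fit_one_sided_baseline(train_records, side=1)
drug_only_acc = accuracy(drug_only, test_records)
target_only_acc = accuracy(target_only, test_records)
```

If a one-sided baseline like this gets close to a full model on a benchmark, the benchmark is rewarding knowledge of which drugs (or targets) tend to be active, not knowledge of the interaction itself.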
This is all to say: the space of interactions between drugs and targets/cells/tissues/organisms is so sparsely explored that "foundation models" of this sort still seem to me a thing of science fiction. They're that far out.
If we're to make groundbreaking discoveries, my best guess is it would be in very restricted problems, as opposed to applying a general solution to a particular case.
by rob74 on 10/23/2024, 10:18:21 AM
As Paracelsus already wrote, "All things are poison, and nothing is without poison; the dosage alone makes it so a thing is not a poison." (https://en.wikipedia.org/wiki/The_dose_makes_the_poison) That might be one of the reasons why predictive toxicology is hard?