I'm a little disappointed that their linked preprint doesn't appear to include any molecular biology; i.e. they don't actually try to synthesize any of their predicted sequences and test function. It wouldn't be an outrageous synthesis task to make some of the CRISPR-Cas sequences they generated.
Also interesting that AlphaMissense is omitted from Figure 2B; it substantially outperforms the ESM-based ESM1b in our hands. But I guess the idea is that this is a general-purpose DNA language model whereas AlphaMissense is domain-specific for variant effect prediction?
Would be interesting to see what comes of it.
As you progress along the following chain: genomics -> proteomics -> interactomics -> metabolomics, our understanding becomes blurrier and the challenges get harder.
Just gonna leave this here.
https://www.biorxiv.org/content/10.1101/2024.02.29.582810v1
Tl;dr: DNA is NOT all you need.
DNA is all you need? In the future generative AI will generate You!
I built the wrapper/playground [0] linked in the article. Feel free to give feedback here or by the email in my bio.
[0] https://evo.nitro.bio/