This was posted here already a few weeks ago.
Whenever I try to read and understand this paper, I feel extremely dumb. I have my degree in CS, but this is just too complex for me to understand.
Yeah, I always end up lost in papers like this too, even with my CS degree. The research keeps leveling up nonstop.
Wow.
I can't wait to see ideas from the diffusion image generation world (like ControlNet) work their way into language models.