Hacker News

Ask HN: Why do we need complex text-to-image networks if this simple one works?

by freakynit on 6/28/2024, 5:36:25 AM with 4 comments

by bjourne on 6/28/2024, 1:09:14 PM
But the images your network generates look nothing like MNIST digits. There is also no variance. For example, all 9s are identical.
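A minimal sketch of the variance point, assuming the network maps only a digit label to pixels. The original post's architecture is not shown here, so the weights, sizes, and function names below are hypothetical stand-ins, not the poster's model. A label-only mapping is a pure function of the digit, so every 9 comes out the same; feeding in a random latent vector as well is one common way to get different samples per call.

```python
import random

NUM_CLASSES = 10
NOISE_DIM = 4
NUM_PIXELS = 16  # toy 4x4 "image", not the real 28x28 MNIST size

# Hypothetical fixed weights standing in for a trained network.
weight_rng = random.Random(0)
W_LABEL = [[weight_rng.gauss(0, 1) for _ in range(NUM_PIXELS)]
           for _ in range(NUM_CLASSES)]
W_NOISE = [[weight_rng.gauss(0, 1) for _ in range(NUM_PIXELS)]
           for _ in range(NOISE_DIM)]


def generate_from_label(digit):
    """Label-only generator: the output depends on nothing but the digit."""
    return list(W_LABEL[digit])


def generate_with_noise(digit, noise_rng):
    """Label-plus-noise generator: draws a fresh latent vector per sample."""
    z = [noise_rng.gauss(0, 1) for _ in range(NOISE_DIM)]
    return [
        W_LABEL[digit][p] + sum(z[k] * W_NOISE[k][p] for k in range(NOISE_DIM))
        for p in range(NUM_PIXELS)
    ]


sample_rng = random.Random(1)
nines_label_only = [generate_from_label(9) for _ in range(3)]
nines_with_noise = [generate_with_noise(9, sample_rng) for _ in range(3)]

# Compare repeated samples of the same digit under each scheme.
all_identical = all(img == nines_label_only[0] for img in nines_label_only)
any_different = any(img != nines_with_noise[0] for img in nines_with_noise[1:])
print("label-only samples identical:", all_identical)
print("noise-conditioned samples differ:", any_different)
```

This only illustrates why a deterministic label-to-image mapping cannot produce varied digits; it says nothing about image quality.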
by p1esk on 6/28/2024, 6:03:23 PM
For MNIST your model is sufficient. But it’s not structurally complex enough to generate more complex images, even if you scale it up.
by alexliu518 on 6/28/2024, 6:17:12 AM
[flagged]
by Am4TIfIsER0ppos on 6/28/2024, 9:40:34 AM
Have you employed a dozen ethicists to go through the input and output to make sure it can't say any slurs? [EDIT] That's why small ones don't exist