Hacker News

AudioGen: Textually Guided Audio Generation

by pierre on 9/30/2022, 7:02:18 PM with 11 comments

by solardev on 9/30/2022, 11:29:38 PM
The last thing you'll hear before the AI eats you: https://felixkreuk.github.io/text2audio_arxiv_samples/large_...
by iamthemonster on 10/1/2022, 8:19:48 AM
It would be very interesting to have an ebook reader paired with Bluetooth earphones that feeds the words on the page into this as you read to make an ambient soundtrack, perhaps also choosing music appropriate to the word choice on the page.
by nudpiedo on 9/30/2022, 9:11:40 PM
That could be another missing piece for generative art in video games: sound effects, and soon soundtracks.
by kevmo314 on 9/30/2022, 9:55:26 PM
The speech samples are really funny. Very Sims-esque.
by karmasimida on 9/30/2022, 9:20:36 PM
It would be more useful if it could narrate text along with those background effects.
by youssefabdelm on 10/1/2022, 11:18:59 AM
-__- I wish researchers would train a stereo 44.1kHz version...why always 16kHz? I know I know 16kHz saves more compute but come ooooon you're Meta
by fragmede on 10/3/2022, 4:54:06 AM
Text2audio is impressive, but I wanna see dance2audio. Just need a million dollars in funding to pay for cameras and dancers.
by fuzzythinker on 9/30/2022, 8:39:52 PM
[code] redirects to the same page
by uwagar on 10/1/2022, 3:20:29 AM
s/textually/sexually
i giggled :)