AudioGen: Textually Guided Audio Generation

  • The last thing you'll hear before the AI eats you: https://felixkreuk.github.io/text2audio_arxiv_samples/large_...

  • It would be very interesting indeed to have an ebook reader, paired with Bluetooth earphones, that feeds the words into this as you read to make an ambient soundtrack, perhaps also choosing music appropriate to the word choice on the page.

  • That could be another missing piece of generative art for video games: sound effects now, and soon soundtracks.

  • The speech samples are really funny. Very Sims-esque.

  • It would be more useful if it could narrate text along with those background effects.

  • -__- I wish researchers would train a stereo 44.1kHz version...why always 16kHz? I know I know 16kHz saves more compute but come ooooon you're Meta

  • Text2audio is impressive, but I wanna see dance2audio. Just need a million dollars in funding to pay for cameras and dancers.

  • [code] redirects to the same page

  • s/textually/sexually

    i giggled :)
