Unless I'm misunderstanding the description, what you're describing is the idea behind an autoencoder, and yes, we already use that. It compresses the data into a smaller representation, and similar inputs end up in nearby regions of the latent space. It can also be guided somewhat if you care about some concepts more than others.
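To make the compression idea concrete, here's a rough sketch of about the smallest autoencoder I can think of: a linear one that squeezes 3-D points through a 1-D bottleneck. The toy data, the names (`encode`, `decode`, `w`, `v`), the learning rate and the step count are all made up for illustration; real autoencoders are nonlinear networks trained with a proper framework.

```python
import random

random.seed(0)

# Toy data (an assumption for illustration): two clusters of 3-D points,
# one near (1, 1, 1) and one near (-1, -1, -1).
def make_cluster(center, n=20, noise=0.1):
    return [[c + random.gauss(0, noise) for c in center] for _ in range(n)]

data = make_cluster([1.0, 1.0, 1.0]) + make_cluster([-1.0, -1.0, -1.0])

# Linear autoencoder with a 1-D bottleneck: 3 numbers in, 1 number in the middle.
w = [random.uniform(-0.1, 0.1) for _ in range(3)]  # encoder weights
v = [random.uniform(-0.1, 0.1) for _ in range(3)]  # decoder weights

def encode(x):
    # Compress a 3-D point to a single latent value.
    return sum(wi * xi for wi, xi in zip(w, x))

def decode(z):
    # Expand the latent value back to a 3-D reconstruction.
    return [vi * z for vi in v]

def mse():
    # Mean squared reconstruction error over the dataset.
    total = 0.0
    for x in data:
        x_hat = decode(encode(x))
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)

initial_loss = mse()

# Plain full-batch gradient descent on the reconstruction error.
lr = 0.02
for _ in range(500):
    grad_w = [0.0] * 3
    grad_v = [0.0] * 3
    for x in data:
        z = encode(x)
        err = [vi * z - xi for vi, xi in zip(v, x)]
        err_dot_v = sum(e * vi for e, vi in zip(err, v))
        for i in range(3):
            grad_v[i] += 2 * err[i] * z / len(data)
            grad_w[i] += 2 * err_dot_v * x[i] / len(data)
    w = [wi - lr * g for wi, g in zip(w, grad_w)]
    v = [vi - lr * g for vi, g in zip(v, grad_v)]

final_loss = mse()

# Similar inputs should get nearby latent values, dissimilar ones far apart.
close = abs(encode([1.0, 1.0, 1.0]) - encode([0.9, 1.1, 1.0]))
far = abs(encode([1.0, 1.0, 1.0]) - encode([-1.0, -1.0, -1.0]))

print("loss before/after:", initial_loss, final_loss)
print("latent distance, similar vs. dissimilar:", close, far)
```

The "guiding" part would usually mean adding an extra term to the loss that rewards the latent value for tracking whatever concept you care about; that isn't shown here.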
Word2vec does what you described for words: it learns a vector for each word from the contexts that word appears in.
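If it helps, the intuition can be sketched without any ML library. To be clear, this isn't word2vec itself (which trains a small neural network to predict context words); it's a count-based stand-in for the same "words with similar contexts get similar vectors" idea. The tiny corpus, the window size and the helper names are all invented for the example.

```python
import math
from collections import Counter, defaultdict

# Made-up toy corpus for illustration only.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "a car needs fuel",
    "a truck needs fuel",
]

WINDOW = 2  # how many neighbours on each side count as "context" (an arbitrary choice)

# Build one sparse context-count vector per word.
context_counts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        lo = max(0, i - WINDOW)
        hi = min(len(tokens), i + WINDOW + 1)
        for j in range(lo, hi):
            if j != i:
                context_counts[word][tokens[j]] += 1

def cosine(a, b):
    # Cosine similarity between the context vectors of words a and b.
    va, vb = context_counts[a], context_counts[b]
    dot = sum(va[k] * vb[k] for k in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b)

sim_cat_dog = cosine("cat", "dog")  # share "the" and "drinks" as context
sim_cat_car = cosine("cat", "car")  # share no context words

print("cat vs dog:", sim_cat_dog)
print("cat vs car:", sim_cat_car)
```

Word2vec gets to a similar place with dense, low-dimensional vectors instead of raw counts, which is what makes it feel like the autoencoder-style compression above.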