Similarity Learning lacks a framework. So we built one

  • Not a full-featured framework, but pytorch-metric-learning has data loaders, losses, etc. to facilitate similarity learning: https://github.com/KevinMusgrave/pytorch-metric-learning

    Disclaimer: I've made some contributions to it.

  • I found the Wikipedia article more useful for describing what Similarity Learning and Metric Learning are: https://en.wikipedia.org/wiki/Similarity_learning

  • Great article. I've been working in and around this space since 2014, and I think similarity learning, vector search, and embedding management will be a core part of future applications that leverage ML.

    I recently built a similarity search application that recommends new Pinterest users channels to follow based on liked images using Milvus (https://github.com/milvus-io/milvus) as a backend. Similarity learning is a huge part of it, and I'm glad more and more tools like Quaterion are being released to help make this kind of tech ubiquitous.

  • Is it somehow connected to the Qdrant similarity search engine? Is there a default integration for it?

  • I’m familiar with metric learning within the Mahalanobis family for kNN-oriented applications. I’m not clear on what use cases this framework targets. Is it custom image search type stuff that may benefit from fine-tuning?

    What is a realistic minimum viable dataset for an approach like this? When is it not advisable? How does it compare to other more basic approaches?

  • Very cool. Can you comment on how this compares with TensorFlow Similarity? https://blog.tensorflow.org/2021/09/introducing-tensorflow-s...

  • I realise this is an overly broad question, but is there any insight into what the state of the art is in Similarity Learning for article-type text?

    More specifically, I'm interested in deriving distances between writing styles, arguing styles, etc.

  • There is also https://github.com/jina-ai/finetuner, which is pretty well designed and, according to its docs, gives SOTA performance.

  • The title is written in a clickbait format, so I had to point it out.