Word Mover's Distance (WMD) is a powerful technique for measuring the semantic similarity between two text documents, taking into account the underlying geometry of word embeddings.
WMD has been widely studied and improved upon in recent years. One such improvement is the Syntax-aware Word Mover's Distance (SynWMD), which incorporates word importance and syntactic parsing structure to enhance sentence similarity evaluation. Another approach, the Fused Gromov-Wasserstein distance, leverages BERT's self-attention matrix to better capture sentence structure. Researchers have also proposed ways to speed up WMD and its fast lower-bound variant, the Relaxed Word Mover's Distance (RWMD), by exploiting properties of distances between embeddings.
Recent research has explored extensions of WMD, such as incorporating word frequency and the geometry of word vector space. These extensions have shown promising results in document classification tasks. Additionally, the WMDecompose framework has been introduced to decompose document-level distances into word-level distances, enabling more interpretable sociocultural analysis.
Practical applications of WMD include text classification, semantic textual similarity, and paraphrase identification. Companies can use WMD to analyze customer feedback, detect plagiarism, or recommend similar content. One case study involves using WMD to explore the relationship between conspiracy theories and conservative American discourses in a longitudinal social media corpus.
In conclusion, WMD and its variants offer valuable insights into text similarity and have broad applications in natural language processing. As research continues to advance, we can expect further improvements in performance, efficiency, and interpretability.

Word Mover's Distance (WMD)
Word Mover's Distance (WMD) Further Reading
1. Re-evaluating Word Mover's Distance. Ryoma Sato, Makoto Yamada, Hisashi Kashima. http://arxiv.org/abs/2105.14403v3
2. Moving Other Way: Exploring Word Mover Distance Extensions. Ilya Smirnov, Ivan P. Yamshchikov. http://arxiv.org/abs/2202.03119v2
3. SynWMD: Syntax-aware Word Mover's Distance for Sentence Similarity Evaluation. Chengwei Wei, Bin Wang, C.-C. Jay Kuo. http://arxiv.org/abs/2206.10029v1
4. Improving word mover's distance by leveraging self-attention matrix. Hiroaki Yamagiwa, Sho Yokoi, Hidetoshi Shimodaira. http://arxiv.org/abs/2211.06229v1
5. Speeding up Word Mover's Distance and its variants via properties of distances between embeddings. Matheus Werner, Eduardo Laber. http://arxiv.org/abs/1912.00509v2
6. WMDecompose: A Framework for Leveraging the Interpretable Properties of Word Mover's Distance in Sociocultural Analysis. Mikael Brunila, Jack LaViolette. http://arxiv.org/abs/2110.07330v1
7. Text classification with word embedding regularization and soft similarity measure. Vít Novotný, Eniafe Festus Ayetiran, Michal Štefánik, Petr Sojka. http://arxiv.org/abs/2003.05019v1
8. An Efficient Shared-memory Parallel Sinkhorn-Knopp Algorithm to Compute the Word Mover's Distance. Jesmin Jahan Tithi, Fabrizio Petrini. http://arxiv.org/abs/2005.06727v3
9. Wasserstein-Fisher-Rao Document Distance. Zihao Wang, Datong Zhou, Yong Zhang, Hao Wu, Chenglong Bao. http://arxiv.org/abs/1904.10294v2
10. A New Parallel Algorithm for Sinkhorn Word-Movers Distance and Its Performance on PIUMA and Xeon CPU. Jesmin Jahan Tithi, Fabrizio Petrini. http://arxiv.org/abs/2107.06433v3

Word Mover's Distance (WMD) Frequently Asked Questions
What is Word Mover's Distance (WMD)?
Word Mover's Distance (WMD) is a technique used to measure the semantic similarity between two text documents. It takes into account the underlying geometry of word embeddings, which are vector representations of words that capture their meanings. By comparing the distances between word embeddings in two documents, WMD can determine how similar the documents are in terms of their semantic content.
How does WMD work?
WMD works by leveraging pre-trained word embeddings, such as Word2Vec or GloVe, to represent words as vectors in a high-dimensional space. It then calculates the minimum "transportation cost" required to transform one document's word embeddings into another document's word embeddings. This transportation cost is based on the Earth Mover's Distance (EMD), a measure used in optimal transport theory. The lower the cost, the more similar the two documents are in terms of their semantic content.
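The transportation problem described above can be sketched directly as a small linear program. The tiny two-dimensional "embeddings" and the `wmd` helper below are illustrative assumptions, not part of any particular library; a real system would plug in pre-trained Word2Vec or GloVe vectors instead.

```python
import numpy as np
from scipy.optimize import linprog

# Toy 2-D "embeddings" for illustration (real systems use Word2Vec/GloVe).
embeddings = {
    "obama":     np.array([1.0, 0.0]),
    "president": np.array([0.9, 0.2]),
    "speaks":    np.array([0.0, 1.0]),
    "talks":     np.array([0.1, 0.9]),
}

def wmd(doc_a, doc_b):
    """Word Mover's Distance as an optimal-transport linear program."""
    vocab_a, vocab_b = sorted(set(doc_a)), sorted(set(doc_b))
    # Normalized bag-of-words weights: each word's frequency / document length.
    d_a = np.array([doc_a.count(w) for w in vocab_a], float) / len(doc_a)
    d_b = np.array([doc_b.count(w) for w in vocab_b], float) / len(doc_b)
    # Cost matrix: Euclidean distance between word embeddings.
    C = np.array([[np.linalg.norm(embeddings[u] - embeddings[v])
                   for v in vocab_b] for u in vocab_a])
    n, m = C.shape
    # Transport plan T (n*m variables): minimize <T, C> subject to
    # row sums of T equal d_a and column sums equal d_b.
    A_eq = []
    for i in range(n):                      # row-sum constraints
        row = np.zeros(n * m); row[i*m:(i+1)*m] = 1; A_eq.append(row)
    for j in range(m):                      # column-sum constraints
        col = np.zeros(n * m); col[j::m] = 1; A_eq.append(col)
    res = linprog(C.ravel(), A_eq=np.array(A_eq),
                  b_eq=np.concatenate([d_a, d_b]), bounds=(0, None))
    return res.fun

print(wmd(["obama", "speaks"], ["president", "talks"]))  # small distance
print(wmd(["obama", "speaks"], ["obama", "speaks"]))     # essentially 0
```

Identical documents have zero cost, and documents whose words are close in embedding space get a small distance even when they share no words, which is exactly the property that makes WMD useful.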
What are some improvements and variants of WMD?
There have been several improvements and variants of WMD proposed in recent years. Some notable examples include:
1. Syntax-aware Word Mover's Distance (SynWMD): incorporates word importance and syntactic parsing structure to enhance sentence similarity evaluation.
2. Fused Gromov-Wasserstein distance: leverages BERT's self-attention matrix to better capture sentence structure.
3. Relaxed Word Mover's Distance (RWMD): a fast lower-bound approximation of WMD obtained by dropping one of the two transport constraints; later work speeds up WMD and RWMD further by exploiting properties of distances between embeddings.
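To illustrate the RWMD relaxation specifically, the sketch below (using a made-up cost matrix and a hypothetical `rwmd` helper, not any library API) drops one marginal constraint at a time, so each word simply ships all of its mass to its cheapest match in the other document; the maximum of the two relaxed solutions is a lower bound on the exact WMD.

```python
import numpy as np

def rwmd(d_a, d_b, C):
    """Relaxed WMD: drop one marginal constraint at a time. With only one
    constraint left, the optimum sends each word's full mass to its nearest
    neighbor in the other document; max of both relaxations bounds WMD."""
    one = (d_a * C.min(axis=1)).sum()   # keep row sums, relax column sums
    two = (d_b * C.min(axis=0)).sum()   # keep column sums, relax row sums
    return max(one, two)

# Toy cost matrix (rows: doc A words, cols: doc B words) and nBOW weights.
C = np.array([[0.2, 1.3],
              [1.2, 0.1]])
print(rwmd(np.array([0.5, 0.5]), np.array([0.5, 0.5]), C))  # prints 0.15
```

Because each relaxation is solved by a nearest-neighbor lookup rather than a linear program, RWMD can be used to cheaply prune candidates before computing the exact WMD on the survivors.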
What are some practical applications of WMD?
WMD has various practical applications in natural language processing, including:
1. Text classification: classifying documents into categories based on their semantic content.
2. Semantic textual similarity: measuring the similarity between two sentences or documents, which is useful for tasks like paraphrase identification or document clustering.
3. Customer feedback analysis: analyzing customer reviews and feedback to identify common themes and sentiments.
4. Plagiarism detection: detecting instances of plagiarism by comparing the semantic similarity between documents.
5. Content recommendation: recommending similar content to users based on their interests and preferences.
What is the relationship between WMD and Earth Mover's Distance (EMD)?
Earth Mover's Distance (EMD) is a measure used in optimal transport theory to calculate the minimum "transportation cost" required to transform one distribution into another. WMD is an adaptation of EMD for natural language processing tasks, specifically for measuring the semantic similarity between text documents. WMD leverages the underlying geometry of word embeddings and uses EMD to compute the transportation cost between the word embeddings of two documents.
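In symbols, with normalized bag-of-words weight vectors d and d' for the two documents and cost c(i, j) given by the embedding distance between word i of one document and word j of the other, WMD is the optimal-transport program:

```latex
\mathrm{WMD}(d, d') \;=\; \min_{T \ge 0} \sum_{i,j} T_{ij}\, c(i, j)
\quad \text{s.t.} \quad \sum_j T_{ij} = d_i, \qquad \sum_i T_{ij} = d'_j
```

The entry T_ij is the amount of word i's weight that "travels" to word j, and the constraints force every word's weight to be fully moved, just as in EMD.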
How does recent research extend WMD?
Recent research has explored extensions of WMD by incorporating additional information, such as word frequency and the geometry of word vector space. These extensions have shown promising results in document classification tasks. Additionally, the WMDecompose framework has been introduced to decompose document-level distances into word-level distances, enabling more interpretable sociocultural analysis. As research continues to advance, we can expect further improvements in performance, efficiency, and interpretability of WMD and its variants.