A Beginner in Word2vec

An introduction to one technique for NLP

I am writing this article to learn more about Word2vec. This is a short description based on the Wikipedia article and will not contain any extensive technical descriptions of applications.

  1. Once trained, it can detect synonymous words or suggest additional words for a partial sentence.
  1. “Semantic similarity is a metric defined over a set of documents or terms, where the idea of distance between items is based on the likeness of their meaning or semantic content as opposed to lexicographical similarity.”
  1. The vector space typically has several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. (The dimension of a mathematical space or object is informally defined as the minimum number of coordinates needed to specify any point within it.)
  2. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located close to one another in the space.
  • Continuous skip-gram.
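
To make the points above a little more concrete, here is a minimal sketch in plain Python. It is not the actual Word2vec implementation: the function names `cosine_similarity` and `skipgram_pairs`, the window size, and the three-dimensional `toy_vectors` are all my own assumptions for illustration (real models learn vectors with several hundred dimensions from a large corpus). The first function shows one common way to measure how close two word vectors are, and the second shows how a skip-gram style setup might pair each word with its neighbouring context words.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (closer to 1.0 means more similar)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def skipgram_pairs(tokens, window=2):
    """Pair every word with the words at most `window` positions away from it."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical hand-picked vectors, NOT output from a trained model.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.1, 0.2, 0.9],
}

if __name__ == "__main__":
    print(cosine_similarity(toy_vectors["king"], toy_vectors["queen"]))
    print(cosine_similarity(toy_vectors["king"], toy_vectors["apple"]))
    print(skipgram_pairs(["the", "cat", "sat"], window=1))
```

In this toy setup "king" and "queen" point in nearly the same direction, so their cosine similarity is much higher than that of "king" and "apple". A trained model would produce this kind of closeness on its own for words that share contexts in the corpus.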

AI Policy and Ethics at Student at University of Copenhagen MSc in Social Data Science. All views are my own.

