Contextual Information – Lecture Notes
\(\texttt{Obama} = (0,\,0,\,0,\,1,\,0,\,\ldots,\,0)\)
\(\texttt{President} = (0,\,1,\,0,\,0,\,0,\,\ldots,\,0)\)
\(\texttt{Obama} \cdot \texttt{President} = 0\cdot 0 + 0\cdot 1 + 0\cdot 0 + 1\cdot 0 + 0\cdot 0 + \ldots + 0\cdot 0 = 0\)
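The calculation above can be sketched in a few lines of Python. The vocabulary and word positions are made up for illustration; only the mechanism (one-hot encoding and the dot product) matches the notes.

```python
# Toy vocabulary; the words and their index positions are illustrative assumptions.
vocab = ["the", "President", "speaks", "Obama", "greets", "media", "Chicago"]

def one_hot(word):
    """One-hot vector: 1 at the word's vocabulary index, 0 everywhere else."""
    return [1 if w == word else 0 for w in vocab]

def dot(u, v):
    """Dot product: sum of component-wise products."""
    return sum(a * b for a, b in zip(u, v))

print(dot(one_hot("Obama"), one_hot("President")))  # -> 0
print(dot(one_hot("Obama"), one_hot("Obama")))      # -> 1
```

Because two distinct words never share a nonzero component, the dot product of their one-hot vectors is always 0; only a word compared with itself yields 1.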
Because of this, the two documents are completely dissimilar (apart from shared stopwords): \(\operatorname {sim}(\{ \texttt{Obama}, \texttt{speaks}, \ldots \},\; \{ \texttt{President}, \texttt{greets}, \texttt{media}, \texttt{Chicago}\} ) = 0\)
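The same effect shows up at the document level when similarity is computed as the cosine of binary bag-of-words vectors. The first document's word set is truncated in the source, so the set used below is a hypothetical stand-in; the point is only that disjoint word sets give similarity 0.

```python
import math

def bow_vector(doc_words, vocab):
    """Binary bag-of-words vector over a fixed vocabulary."""
    return [1 if w in doc_words else 0 for w in vocab]

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Stopwords already removed; doc1's words are an illustrative assumption.
doc1 = {"Obama", "speaks", "Illinois"}
doc2 = {"President", "greets", "media", "Chicago"}
vocab = sorted(doc1 | doc2)

print(cosine(bow_vector(doc1, vocab), bow_vector(doc2, vocab)))  # -> 0.0
```

Since no content word occurs in both documents, every component-wise product is 0 and the similarity collapses to 0, even though the two sentences clearly describe the same event.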
➜ We want a word representation where \(0 \ll \operatorname {sim}(\texttt{Obama},\texttt{President}) < 1\) .
Preferably …