Natural Language Processing
(From the perspective of an expert in information science)
-
Two types of “meaning”: word-level meaning / structural meaning
-
Word-level meaning
-
Structural meaning
- Predicate-argument structure / predicate logic
- Compositionality
-
Building thesauri and ontologies by hand is extremely difficult, and they need regular updates
- It would be great if we could build them automatically from vast amounts of text data (= a corpus), e.g. Twitter
- Using context clues (blu3mo)
- Based on the idea that words with similar meanings appear in similar contexts (i.e. they are surrounded by similar words)
- word2vec uses machine learning to pull the vector of the center word closer to the vectors of the words surrounding it
- (Closer = larger dot product)
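
To make "closer = larger dot product" concrete, here is a minimal sketch of the core update, not the actual word2vec implementation. It assumes a skip-gram-style setup with two vector tables and a toy four-word vocabulary. The names `train_pair`, `center_vecs`, and `context_vecs` are made up for illustration. Real word2vec also pushes randomly sampled unrelated words apart (negative sampling) or uses a hierarchical softmax, which this sketch omits.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_pair(center_vecs, context_vecs, center, context, lr=0.1):
    """One gradient step on -log(sigmoid(v . u)) for an observed (center, context) pair."""
    v = center_vecs[center].copy()    # vector of the center word
    u = context_vecs[context].copy()  # vector of the surrounding word
    g = 1.0 - sigmoid(v @ u)          # step scale: shrinks as the dot product grows
    center_vecs[center] += lr * g * u   # move the center vector toward the context vector
    context_vecs[context] += lr * g * v # and vice versa

# Toy vocabulary and small random initial vectors (illustrative assumptions).
rng = np.random.default_rng(0)
vocab = ["cat", "dog", "meows", "barks"]
idx = {w: i for i, w in enumerate(vocab)}
dim = 8
center_vecs = rng.normal(scale=0.1, size=(len(vocab), dim))
context_vecs = rng.normal(scale=0.1, size=(len(vocab), dim))

# Pretend "meows" was observed next to "cat" many times in the corpus.
before = center_vecs[idx["cat"]] @ context_vecs[idx["meows"]]
for _ in range(100):
    train_pair(center_vecs, context_vecs, idx["cat"], idx["meows"])
after = center_vecs[idx["cat"]] @ context_vecs[idx["meows"]]

print(f"dot product before: {before:.4f}, after: {after:.4f}")
```

Every time the pair co-occurs, the two vectors are nudged toward each other, so their dot product grows. Words that share many context words end up with similar center vectors as a side effect, which is how similarity between words emerges from co-occurrence alone.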