Introduction to Text Analysis in R
Workshop (PhD / postdoctoral level), Lucerne University, 2019
General introduction to natural language processing (semantic and bag-of-words approaches, building a corpus, text preprocessing, the document-feature matrix). Basic forms of textual data visualization (lexical dispersion and frequency plots). Text modeling (dictionaries, text scaling, statistical topic modeling, structural topic modeling). Advanced topics: elements of web scraping (HTML, regex, working with APIs) and a brief overview of approaches beyond bag-of-words, namely POS tagging and word embeddings (word2vec).