Topic Modeling: NMF
Topic modeling uses unsupervised machine learning to discover latent, or “hidden,” topical patterns across a collection of text.
This tool begins with a short review of topic modeling and moves on to an overview of a technique for topic modeling: non-negative matrix factorization (NMF). The slide deck provides an intuitive narrative of how NMF works. After reviewing the slide deck and completing the assignment, you should have enough understanding of NMF to be able to use it in practice, interpret results, and appreciate some of the challenges that can occur with topic modeling.
Using the application, you will generate word distributions from a corpus of documents and judge whether they form coherent topics. This is the same exercise found in Introduction to Topic Modeling, except that tool uses latent Dirichlet allocation (LDA) rather than NMF as the topic modeling technique. Although each tool stands on its own, the two have been designed so that you can compare topic modeling results from the two methods.
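As a rough illustration of the kind of computation involved, the sketch below factorizes a small document-term count matrix V into two non-negative matrices, V ≈ WH, where each row of H gives word weights for one topic and each row of W gives topic weights for one document. The toy corpus, the helper names (`matmul`, `transpose`, `frobenius_error`, `nmf`), and the choice of multiplicative update rules are assumptions made for this example; it is not the application's implementation.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Frobenius norm of V - WH (the reconstruction error)."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factorize V (docs x terms) into W (docs x topics) and H (topics x terms)
    using multiplicative updates, which keep every entry non-negative."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + eps for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h

# Hypothetical toy corpus: three documents about pets, three about finance.
docs = [
    "cat dog pet vet",
    "dog pet leash walk",
    "cat pet vet food",
    "stock market trade price",
    "market price trade bank",
    "bank stock price loan",
]
vocab = sorted({word for doc in docs for word in doc.split()})
# Document-term matrix: v[i][j] counts how often vocab[j] appears in docs[i].
v = [[doc.split().count(word) for word in vocab] for doc in docs]

w, h = nmf(v, n_topics=2)

# List the highest-weighted words in each topic's word distribution.
for k, row in enumerate(h):
    top = sorted(zip(row, vocab), reverse=True)[:4]
    print(f"topic {k}:", [word for _, word in top])
```

Whether the resulting word lists read as coherent topics depends on the corpus, the number of topics chosen, and the random initialization, which is the kind of judgment the exercise asks you to make.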
Select the Slide Deck for the contextual information. Then use the Link to Platform to complete the exercise. Your instructor may have additional guidance regarding the use of this Teaching Tool.