The dataset consists of 2,700 TED talks given between 2004 and 2017.

Twenty-five "pure" (mono-thematic) clusters were extracted by applying optimized NMF (non-negative matrix factorization) to the speeches.
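
To make the extraction step concrete, here is a minimal sketch of NMF-based topic clustering on a toy document-term matrix. It uses the plain multiplicative-update rule as a stand-in, since the details of the "optimized" NMF are not given here. The names nmf, dominant_topic, and top_words, the four-word vocabulary, and the counts are all hypothetical and chosen only for illustration.

```python
import random


def nmf(V, k, iters=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (docs x terms) as W @ H.

    W is docs x k (topic weights per document), H is k x terms
    (word weights per topic). Uses plain multiplicative updates.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        WtV = [[sum(W[i][a] * V[i][j] for i in range(n)) for j in range(m)]
               for a in range(k)]
        WtW = [[sum(W[i][a] * W[i][b] for i in range(n)) for b in range(k)]
               for a in range(k)]
        WtWH = [[sum(WtW[a][b] * H[b][j] for b in range(k)) for j in range(m)]
                for a in range(k)]
        H = [[H[a][j] * WtV[a][j] / (WtWH[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        VHt = [[sum(V[i][j] * H[a][j] for j in range(m)) for a in range(k)]
               for i in range(n)]
        HHt = [[sum(H[a][j] * H[b][j] for j in range(m)) for b in range(k)]
               for a in range(k)]
        WHHt = [[sum(W[i][b] * HHt[b][a] for b in range(k)) for a in range(k)]
                for i in range(n)]
        W = [[W[i][a] * VHt[i][a] / (WHHt[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H


def dominant_topic(W):
    """Assign each document to the topic with the largest weight in W."""
    return [max(range(len(row)), key=row.__getitem__) for row in W]


def top_words(H, vocab, topic, n=10):
    """Return the n highest-weighted words of one topic (one row of H)."""
    order = sorted(range(len(vocab)), key=lambda j: H[topic][j], reverse=True)
    return [vocab[j] for j in order[:n]]


# Toy data (hypothetical): word counts for four short "talks".
vocab = ["brain", "neuron", "climate", "carbon"]
V = [
    [4, 2, 0, 0],  # talk about the brain
    [2, 1, 0, 0],  # talk about the brain
    [0, 0, 3, 6],  # talk about climate
    [0, 0, 1, 2],  # talk about climate
]

W, H = nmf(V, k=2)
labels = dominant_topic(W)
for t in range(2):
    print("topic", t, "top words:", top_words(H, vocab, t, n=2))
print("talk -> topic:", labels)
```

In this sketch each talk is assigned to its highest-weighted topic, and the rows of H give the per-topic word rankings from which a "top 10 words" list can be read. On the real corpus one would more likely use a library implementation on TF-IDF features with 25 components.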

Each topic has been analyzed in terms of content, style, and trend over time using a web-based tool, introduced in the next section.

Word-similarity graph (zooming in and out is enabled):

Number of speeches in the topic:

Top 10 NMF words:

