Fig 1.
Illustrating the distribution of words across articles.
Fig 2.
Illustrating the coherence score for different numbers of topics for the bag-of-words and TF-IDF models applied to article text and summaries.
Fig 3.
Various linkage metrics for hierarchical clustering.
The dashed lines show the linkages between clusters, and the highlighted edge shows the optimal linkage for clustering.
Fig 4.
Illustrating the variation in sentiment scores for articles.
The positive, negative, and neutral sentiment scores for each article reveal the ratio of polarities within that article.
Fig 5.
Illustrating the violin plot of the distribution of sentiment scores across the experimental dataset.
The width of the plot at each polarity score shows the estimated density of articles with that score. In addition to density and distribution, the violin plot also shows the interquartile summary of the sentiment scores.
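As a rough illustration of how to read such a plot, the sketch below computes the two quantities a violin plot encodes: a kernel density estimate (the width) and the quartiles (the inner summary). The scores are synthetic stand-ins, not the study's data, and the use of NumPy and SciPy is an assumption; the source does not say which tools produced the figure.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic polarity scores in [-1, 1]; hypothetical stand-ins for per-article scores.
rng = np.random.default_rng(0)
scores = np.clip(rng.normal(loc=0.1, scale=0.3, size=500), -1.0, 1.0)

# The violin's width at a given score is proportional to a kernel density estimate.
kde = gaussian_kde(scores)
grid = np.linspace(-1.0, 1.0, 201)
density = kde(grid)

# The inner summary of a violin plot marks the quartiles of the same data.
q1, median, q3 = np.percentile(scores, [25, 50, 75])

# The score at which the violin would be widest.
widest_at = grid[np.argmax(density)]

print(f"Q1={q1:.3f}, median={median:.3f}, Q3={q3:.3f}")
print(f"violin is widest near a score of {widest_at:.2f}")
```

Plotting libraries typically perform the same density estimation internally, so this only makes the underlying quantities explicit.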
Table 1.
Illustrating the words describing the topics identified using the BoW model for both article text and summaries.
Table 2.
Illustrating the words describing the topics identified using the TF-IDF model for both article text and summaries.
Fig 6.
Illustrating the top 30 salient terms present in the first topic identified using the TF-IDF model applied to articles.
The bubble chart visualises the overlap among the topics plotted in a two-dimensional (component) space.
Fig 7.
Illustrating the top 30 salient terms present in the first topic identified using the TF-IDF model applied to articles’ summaries.
The bubble chart visualises the overlap among the topics plotted in a two-dimensional (component) space.
Fig 8.
Illustrating the full dendrogram produced by agglomerative hierarchical clustering applied to news headlines.
Fig 9.
Illustrating the partial dendrogram obtained from agglomerative hierarchical clustering, representing the different clusters up to 5 levels from the root.
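The difference between a full and a partial dendrogram can be sketched as follows. This is a minimal example under stated assumptions: it uses SciPy's hierarchical clustering routines, Ward linkage, and random feature vectors as hypothetical stand-ins for vectorised headlines. The source does not state which library, linkage method, or features were used.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Random vectors stand in for headline features (assumption, not the study's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))

# Agglomerative clustering; "ward" is one of several available linkage methods.
Z = linkage(X, method="ward")

# Full dendrogram: one leaf per observation. no_plot=True returns the layout only.
full = dendrogram(Z, no_plot=True)

# Partial dendrogram: the tree is cut off a few levels below the root (p=5),
# and each deeper subtree is collapsed into a single leaf.
partial = dendrogram(Z, truncate_mode="level", p=5, no_plot=True)

print("leaves in full dendrogram:   ", len(full["ivl"]))
print("leaves in partial dendrogram:", len(partial["ivl"]))
```

Dropping `no_plot=True` draws the trees with Matplotlib, where collapsed subtrees are labelled with the number of observations they contain.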