AoT Topic Modelling results

To understand the different credibility levels of sources discussing Covid within the AoT collection

Plain text from working URLs in AoT that mention Covid-related topics

Ran a link checker (in code)
Downloaded all the articles into a pandas data frame: text, link, keywords, summaries (Newspaper3k package)
Ran a text search with the 10 most frequent Covid-related terms, then manually filtered out the irrelevant results in OpenRefine
After the literature review, decided to use the "information credibility" framework, which divides sources into three categories: credible, questionable, and non-credible
Manually categorised the articles following the framework guidelines
Ran word frequency analysis and topic modelling on all categories (Python: nltk, spaCy, gensim)
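
A minimal sketch of the text search and the per-category word frequency steps, using only the Python standard library so it runs on its own. It is not the code used for these notes: the search terms, the stop-word list, and the example articles are all made-up placeholders, and the real workflow used pandas, OpenRefine, and nltk/spaCy/gensim instead.

```python
import re
from collections import Counter

# Hypothetical search terms; the notes do not list the actual ten terms used.
COVID_TERMS = ["covid", "coronavirus", "pandemic"]

# Tiny illustrative stop-word list; a real run would use a fuller one (e.g. from nltk).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "about"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mentions_covid(text, terms=COVID_TERMS):
    """Return True if any search term occurs as a substring of the lowercased text."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def word_frequencies_by_category(articles):
    """Count non-stop-word tokens separately for each credibility category."""
    counts = {}
    for article in articles:
        tokens = [t for t in tokenize(article["text"]) if t not in STOP_WORDS]
        counts.setdefault(article["category"], Counter()).update(tokens)
    return counts


# Made-up example rows standing in for the manually categorised data frame.
articles = [
    {"text": "Covid vaccine trial results published", "category": "credible"},
    {"text": "The pandemic is a hoax, a hoax!", "category": "non-credible"},
    {"text": "Local football results", "category": "credible"},
]

# Step 1: keep only articles that match at least one search term.
relevant = [a for a in articles if mentions_covid(a["text"])]

# Step 2: word frequencies per credibility category.
frequencies = word_frequencies_by_category(relevant)
print(frequencies["non-credible"].most_common(1))  # [('hoax', 2)]
```

Substring matching is deliberately loose here, so it over-matches and still needs the manual filtering pass described above. The per-category counters could then feed a topic model, which this sketch leaves out.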

Results

104 rows

33 rows
