Monitoring reports on Russian disinformation topics

  1. The Servants are becoming BPP, the bacillus of nationalism, and “Sorosites” have surrounded Zelensky (October 14-27, 2019)
  2. "Neo-Nazis are preparing the Third Maidan", "Zelenskyi has no real power" (September 30 - October 13, 2019)
  3. Militant Ukrainian authorities, “a bunch of no-names” in the Cabinet of Ministers and the return of Yanukovych (September 23-29, 2019)
  4. Zelensky is not worthy of a meeting with Putin. Poroshenko will set his house on fire like Gontareva (September 16-22, 2019)
  5. Forcing to peace and discrediting the land market (September 9-15, 2019)
  6. Release of hostages, “Zelenskyi corrects Poroshenko's mistakes” (September 2-8, 2019)
  7. "Too liberal" Cabinet of Ministers and the threat of putsch (August 26 - September 1, 2019)
  8. "Bandera Meeting" by Netanyahu, justification of Buzhansky (August 19-25, 2019)
  9. The ROC procession in Ukraine, and Medvedchuk again (July 22-28, 2019)
  10. Medvedchuk and Shariy (July 8-21, 2019)
  11. More Medvedchuk (June 24-30, 2019)
  12. Poroshenko and Election Circus Are to Blame (June 17-23, 2019)
  13. All attention to the elections (June 10-16, 2019)
  14. Zelensky can become "bloody" (May 27 - June 2, 2019)
  15. Criticism of Zelensky, May 9, the "Odesa tragedy" (May 1-15, 2019)


Data collection and preprocessing

We collected news from a set of Ukrainian websites, most of them clickbait ("junk news") sites. Data was downloaded from the sites' RSS feeds or from links on their Facebook pages. For each news item, we saved the publication date, link, headline, and full text. Next, we detected the language of each text and kept only those in Russian; we chose this language because we observed that most disinformation is written in Russian. Each text was then prepared for analysis: tokenized (split into language units, that is, words and punctuation marks) and lemmatized for topic modeling (words converted to their dictionary form, such as the infinitive for verbs). Finally, a vocabulary of the words in the news corpus was compiled.
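The preprocessing steps can be illustrated with a minimal sketch. It is not the pipeline used for the monitoring: the letter-based language check is a crude stand-in for a real language detector, lemmatization is omitted because it needs a morphological dictionary, and the function names and the two sample sentences are made up for illustration.

```python
import re
from collections import Counter

# Letters that occur in only one of the two alphabets. This is a crude
# heuristic assumed for illustration, not the detector used in the monitoring.
UKRAINIAN_ONLY = set("іїєґ")
RUSSIAN_ONLY = set("ыэъё")

# A token is either a run of word characters or a single punctuation mark.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def looks_russian(text: str) -> bool:
    """Guess whether a Cyrillic text is Russian rather than Ukrainian."""
    lowered = text.lower()
    ru = sum(ch in RUSSIAN_ONLY for ch in lowered)
    uk = sum(ch in UKRAINIAN_ONLY for ch in lowered)
    return ru > uk


def tokenize(text: str) -> list[str]:
    """Split a text into lowercase word and punctuation tokens."""
    return TOKEN_RE.findall(text.lower())


def build_vocabulary(texts: list[str]) -> Counter:
    """Count word tokens (punctuation excluded) across all texts."""
    vocab = Counter()
    for text in texts:
        vocab.update(tok for tok in tokenize(text) if tok[0].isalnum())
    return vocab


# Made-up sample sentences: the first is Russian, the second Ukrainian.
news = [
    "Это был очень тяжёлый день, сказал мэр.",
    "Це був дуже важкий день, сказав мер міста.",
]
russian_news = [text for text in news if looks_russian(text)]
vocabulary = build_vocabulary(russian_news)
```

In a real pipeline each token would also be passed through a lemmatizer before counting, so that inflected forms of the same word share one vocabulary entry.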

All news was then filtered to discard items unrelated to social and political life in Ukraine: texts about sports, weather, celebrity life, and unrelated international news. Such materials made up about half of the collection.

Detecting manipulative news

Each news item was evaluated by an improved version of the manipulative news classifier: the algorithm was further trained on new data to improve its accuracy over the previous version. It estimates the likelihood that a news item contains emotional manipulation and/or false argumentation. On news from clickbait sites, sites from the occupied territories, and publications with an anti-Ukrainian position, the classifier finds 62% of the materials that contain at least one type of manipulation, while incorrectly marking 6% of the materials as manipulative. That is, the algorithm is more likely to miss a manipulation than to flag one falsely.
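The two figures can be read as a recall and a false positive rate. The sketch below shows how such values are computed from a confusion matrix. The counts are hypothetical and chosen only to reproduce 62% and 6%, and treating the 6% as a share of non-manipulative materials is an assumption, since the text does not state the denominator.

```python
def recall(true_pos: int, false_neg: int) -> float:
    """Share of truly manipulative items that the classifier flags."""
    return true_pos / (true_pos + false_neg)


def false_positive_rate(false_pos: int, true_neg: int) -> float:
    """Share of non-manipulative items that the classifier flags by mistake."""
    return false_pos / (false_pos + true_neg)


# Hypothetical evaluation counts (not taken from the monitoring data).
tp, fn = 62, 38   # manipulative items: flagged / missed
fp, tn = 6, 94    # non-manipulative items: flagged / correctly passed

classifier_recall = recall(tp, fn)
classifier_fpr = false_positive_rate(fp, tn)
miss_rate = 1 - classifier_recall  # share of manipulative items missed
```

With these counts the miss rate (38%) is far higher than the false positive rate (6%), which is the sense in which the classifier misses manipulation more often than it flags it falsely.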

For topic modeling, we selected only news from websites where the manipulative news classifier found manipulations in more than 10% of all news (excluding entertainment, sports, and international news).
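The site selection rule can be sketched as follows. The record layout (site, category, classifier flag), the category labels, and the two example domains are assumptions made for illustration only.

```python
from collections import defaultdict

THRESHOLD = 0.10  # a site qualifies when more than 10% of its news is flagged
EXCLUDED = {"entertainment", "sports", "international"}


def select_sites(items, threshold=THRESHOLD):
    """Return sites whose share of flagged news exceeds the threshold.

    `items` is an iterable of (site, category, is_manipulative) tuples;
    items in excluded categories are ignored entirely.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for site, category, is_manipulative in items:
        if category in EXCLUDED:
            continue
        totals[site] += 1
        flagged[site] += bool(is_manipulative)
    return {site for site, total in totals.items()
            if flagged[site] / total > threshold}


# Hypothetical data: site-a has 2 of 10 items flagged, site-b has 1 of 10
# (its flagged sports items do not count).
items = (
    [("site-a.example", "politics", True)] * 2
    + [("site-a.example", "politics", False)] * 8
    + [("site-b.example", "politics", True)] * 1
    + [("site-b.example", "politics", False)] * 9
    + [("site-b.example", "sports", True)] * 5
)
selected = select_sites(items)
```

The comparison is strict, so a site sitting at exactly 10% is not selected, matching the "more than 10%" wording.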

The following sites were monitored: ..., bbc-ccnn, ..., укроп.org, ...

Detection of topics

The selected manipulative news, about 3,000 items per week on average, was broken down into topics of the week by automatic topic modeling (NMF, non-negative matrix factorization). We then edited the resulting news clusters manually: similar topics were combined, irrelevant or overly general clusters were discarded, and topics that did not reflect Russian disinformation were removed. Because topics were identified automatically, a small share of the news in a topic may be unrelated to it.
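To show what NMF does, here is a small pure-Python sketch that factorizes a document-term count matrix V into non-negative matrices W (documents by topics) and H (topics by terms) using the classic multiplicative update rules. The toy matrix, the choice of two topics, and all function names are assumptions; the monitoring itself may have used different weighting, settings, or a library implementation such as scikit-learn's `NMF`.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH, i.e. the reconstruction error."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factorize V into W and H with multiplicative updates."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps)
              for j in range(n_terms)] for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps)
              for k in range(n_topics)] for i in range(n_docs)]
    return w, h


# Toy document-term counts; rows are documents, columns are made-up terms.
terms = ["maidan", "putsch", "land", "market"]
docs = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
]
w, h = nmf(docs, n_topics=2)
# Assign each document to the topic with the largest weight in its row of W.
doc_topics = [max(range(2), key=lambda k: row[k]) for row in w]
```

Each row of H weights the terms that characterize a topic, and each row of W shows how strongly a document belongs to each topic. Assigning every document to its strongest topic yields the kind of news clusters that are then reviewed manually.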

Each subtopic is illustrated by a selection of headlines that the classifier identified as manipulative with high confidence.

For monitoring purposes, we merged the weekly topics into subtopics that remained relevant throughout the monitoring period. The subtopics are grouped into meta-topics for generalization.