The study of Russian nationalists and data mining: why social researchers should pay attention to computer science
The latest meeting of the Research and Study Group (RSG) was held on April 3. The main speaker was the head of the RSG, Alexey Rotmistrov, who devoted his talk to the particularities of the Data Mining paradigm in sociology. The talk also touched on the topic of Russian nationalists.
Russian nationalists are one of Alexey Rotmistrov's main research interests. Drawing on information from the Internet, the researcher has put considerable effort into compiling and regularly updating a database of nationalist organizations. However, following the conflicts in the near abroad and the subsequent crackdown on extremist organizations, it became difficult for him to update the database and to communicate with informants. This prompted Alexey to look for new forms of data collection, and he settled on the methods of computer science.
This led Alexey to recognize a general problem in the social sciences: a lack of attention to the Internet, which holds great potential as a data source that can be tapped with computer science methods such as Data Mining. A distinctive feature of the Data Mining paradigm in sociology is the active use of web scraping, which the head of the RSG suggested the participants apply in their own projects. Using his own database on Russian nationalists as an example, Alexey Rotmistrov explained that this method makes it possible to find predefined data on the Internet and collect it in a database by machine rather than by hand. Web scraping is also used, among other things, in an application with face recognition technology that compares a photo taken or saved by a user with millions of photos on social networks.
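To make the idea of collecting predefined data "by machine rather than by hand" concrete, here is a minimal sketch in Python. It is not the tool used by the RSG: the page fragment, the `org` class name, the table layout, and the helper names `OrgLinkParser` and `scrape_to_db` are all assumptions made for illustration. The HTML is an inline string so that the example runs without network access.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page fragment (an assumption for illustration). In practice the
# HTML would be downloaded first, for example with urllib.request.urlopen.
SAMPLE_HTML = """
<ul>
  <li class="org"><a href="/org/1">Example Organization A</a></li>
  <li class="org"><a href="/org/2">Example Organization B</a></li>
  <li class="ad"><a href="/promo">Unrelated link</a></li>
</ul>
"""


class OrgLinkParser(HTMLParser):
    """Collects (name, href) pairs for links inside <li class="org"> items."""

    def __init__(self):
        super().__init__()
        self.in_org_item = False   # True while inside a matching <li>
        self.current_href = None   # href of the link currently being read
        self.records = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "org":
            self.in_org_item = True
        elif tag == "a" and self.in_org_item:
            self.current_href = attrs.get("href")

    def handle_data(self, data):
        # Keep only the text that sits inside a link we decided to follow.
        if self.current_href is not None and data.strip():
            self.records.append((data.strip(), self.current_href))
            self.current_href = None

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_org_item = False
        elif tag == "a":
            self.current_href = None


def scrape_to_db(html, conn):
    """Parse the page and store the records; returns how many were found."""
    parser = OrgLinkParser()
    parser.feed(html)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS organizations (name TEXT, url TEXT UNIQUE)"
    )
    # INSERT OR IGNORE skips rows whose url is already stored, so the
    # scraper can be re-run on a schedule to update the database.
    conn.executemany(
        "INSERT OR IGNORE INTO organizations VALUES (?, ?)", parser.records
    )
    conn.commit()
    return len(parser.records)


conn = sqlite3.connect(":memory:")
scrape_to_db(SAMPLE_HTML, conn)
rows = conn.execute("SELECT name, url FROM organizations ORDER BY url").fetchall()
for name, url in rows:
    print(name, url)
```

The "pre-agreed data" here is the rule "links inside list items marked `org`"; everything else on the page is ignored. A real project would also need to respect the terms of use of the sites being scraped.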
The head of the RSG also mentioned other characteristics that distinguish the application of Data Mining methods in computer science from their application in the social sciences: the relationship between theory and empirical data, the importance of quantifying the available data, and the need to apply analysis methods comprehensively.
More general differences between computer science and the social sciences were also discussed during the seminar. One of them is the relationship between the sample and the general population in the available data. Because of the specifics of their object and methods, data in the social sciences is usually just a sample. In computer science, however, the researcher often deals with data covering the entire population. As a result, computer scientists do not need to check a model built on such data for stability, whereas sociologists should do so regularly.
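One common way to check how stable a sample-based estimate is, though not necessarily the one discussed at the seminar, is bootstrap resampling: the sample is redrawn with replacement many times and the estimate is recomputed each time. The sketch below uses only the Python standard library; the survey answers and the function names `bootstrap_means` and `percentile_interval` are hypothetical.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Recompute the mean on resamples drawn with replacement from `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


def percentile_interval(values, lower=0.025, upper=0.975):
    """Simple percentile interval taken from the sorted bootstrap estimates."""
    ordered = sorted(values)
    lo_idx = int(lower * (len(ordered) - 1))
    hi_idx = int(upper * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]


# Hypothetical survey answers on a 1-5 scale (illustrative data only).
sample = [3, 4, 2, 5, 4, 3, 3, 1, 4, 5, 2, 4, 3, 4, 5, 2, 3, 4, 4, 3]

means = bootstrap_means(sample)
low, high = percentile_interval(means)
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"95% bootstrap interval = [{low:.2f}, {high:.2f}]")
```

A wide interval signals that the estimate depends heavily on which respondents happened to be sampled. When the data covers the whole population, there is no sampling variability of this kind to measure, which is the contrast drawn above.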
The seminar was also marked by the arrival of the RSG co-leader, Nikolay Zakharov, which gave the group's participants the opportunity to meet their colleague from Sweden in person. At the end of the seminar, the members shared and discussed the results of their personal projects, including those that fall outside the RSG's sphere of interest.