In line with its “data for social good” principle, the research company Inspira Group made it possible for Anna Farkas, a BA student in sociology, to add questions to its online omnibus survey free of charge. Her thesis, supervised by Renáta Németh, is a case study that investigates gender bias in Google Translate’s translations of occupations from Hungarian (a gender-neutral language) to English (a language with gendered pronouns) (the thesis can be accessed on this link). Using quantitative methods, the study aims to measure the extent of gender bias in machine translation. It examines which pronoun appears in the English translation of sentences such as “ő egy orvos” (“he/she is a doctor”). To measure the bias in the algorithm, the study compares Google Translate’s translations with the proportion of men and women in each occupation and with society’s perception of those occupations. A survey was used to assess whether people find those occupations feminine or masculine. Inspira assisted in this part of the research: as part of its online omnibus survey, it asked a representative sample about their perceptions of occupations, using questions provided by Anna Farkas. The study found that Google Translate mirrors people’s perception of occupations to a greater extent than the actual proportion of men and women in those occupations.
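For readers curious how such a comparison might be set up, here is a minimal sketch in Python. It is not the method used in the thesis, which is not described here; it only assumes that each occupation has three numbers (whether the translation used “he”, the share of men in the occupation, and the share of survey respondents who see the occupation as masculine) and that agreement is measured with a Pearson correlation. All names and values below are hypothetical toy data, not results from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL toy values, invented for illustration only.
# translated_he:  1 if the English translation used "he", 0 if "she"
# share_men:      proportion of men working in the occupation
# perceived_male: proportion of respondents who see the occupation as masculine
TOY_DATA = {
    "doctor":   {"translated_he": 1, "share_men": 0.45, "perceived_male": 0.70},
    "nurse":    {"translated_he": 0, "share_men": 0.10, "perceived_male": 0.05},
    "engineer": {"translated_he": 1, "share_men": 0.80, "perceived_male": 0.90},
    "teacher":  {"translated_he": 0, "share_men": 0.20, "perceived_male": 0.25},
    "cook":     {"translated_he": 1, "share_men": 0.60, "perceived_male": 0.60},
}


def compare_benchmarks(data):
    """Correlate the translated pronoun with each of the two benchmarks."""
    translated = [row["translated_he"] for row in data.values()]
    share_men = [row["share_men"] for row in data.values()]
    perceived = [row["perceived_male"] for row in data.values()]
    return {
        "vs_employment_share": pearson(translated, share_men),
        "vs_perception": pearson(translated, perceived),
    }


if __name__ == "__main__":
    for benchmark, r in compare_benchmarks(TOY_DATA).items():
        print(f"{benchmark}: r = {r:.2f}")
```

Under these assumptions, a higher correlation with the perception column than with the employment column would point in the same direction as the finding reported above, but the toy numbers themselves say nothing about the real data.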
Anna Brecsok’s thesis (Survey Statistics and Data Analytics MSc), in which she conducted a survey experiment to investigate a problem she encountered at her workplace, was published in the Hungarian Statistical Review. Anna’s supervisor and co-author of the paper was Renáta Németh, co-leader of our research group.
Organizers are now accepting papers for the Sociology at the Dawn of a Successful Century conference, which will take place from 11 to 12 June 2020 at the Hungarian Academy of Sciences, Centre for Social Sciences. At the conference, the sociological applications of NLP will be presented in a separate section. The section was facilitated by the two co-leaders of our research group, Ildikó Barna and Renáta Németh, along with Bence Ságvári, head of the CSS-Recens research group. Registration for the conference is open until March 31st.
Update: The coronavirus situation has made it uncertain whether the conference can be held in June. Regardless, we encourage everyone to submit an abstract by the extended deadline of 15 April, as, at worst, the conference will be held at a later date.
Ildikó Barna, co-leader of our research group, gave a presentation on contemporary Hungarian antisemitism at the Institute of Formal and Applied Linguistics of Charles University, Prague. The presentation was based on the Online Antisemitism project conducted with Árpád Knap. In addition to presenting the results of the research so far, she also discussed why sociological and domain knowledge is indispensable for interpreting the output of natural language processing.
Our research group is represented in the NTP-HHTDK grant won by the ELTE TÁTK TDK Workshop. The grant was established to support the Hungarian Scientific Students’ Associations and their events. As part of this, we will launch a Python-based text analytics course for students of the faculty during the spring semester, after which they will be able to join our research team as interns.
The two leaders of our research group, Ildikó Barna and Renáta Németh, have successfully initiated an NLP-related section (Natural Language Processing: a New Tool in the Methodological Tool-Box of Sociology) at the International Sociological Association’s RC33 (Logic and Methodology in Sociology) conference, to be held from 8 to 11 September 2020. Abstract submissions are welcome until 30 January on the conference website.
Flóra Bolonyai, a second-year Survey Statistics and Data Analytics MSc student, took first place with her presentation entitled “Text Analytics Models for Profiling Authors’ gender” at the faculty’s TDK (Scientific Students’ Associations) conference on December 13th. Flóra’s supervisor was Eszter Katona, a member of our research group.
In 2019, our research group’s student program received the support of Ariosz Ltd. The donation aims to contribute to the transformation in the field of quantitative social research.
Full house and great success at the workshop organized by our research group on the new ways of sociological research
The first section, chaired by Sára Simon, dealt with the social scientific application of automated text analysis. In the first lecture, Renáta Németh talked about the aims of the research group and the place of the new methods in the social sciences. Ildikó Barna and Árpád Knap showed how they detected various types of antisemitic narratives in antisemitic articles and comments, some of which cannot be measured by surveys. Zoltán Kmetty and Julia Koltai focused on word embedding NLP models; in addition to presenting the methodology, they used concrete examples to demonstrate its potential for social scientists. Domonkos Sik and Fanni Máté talked about the results of their research on the framing of depression in online forums, including the benefits and difficulties of human annotation. In the last lecture of the section, Eszter Katona and Fanni Máté detailed the methodology of the supervised learning models used in the same research.
In the second section, Gábor Palkó talked about the analytical possibilities of HECEdata, a semantic prosopographic (collective biography) data network built by his team, which can be examined together with the data elements of Wikidata. Then Balázs Indig argued for the scientific necessity of using web crawlers and introduced the process step by step for applications in the humanities and social sciences.
The third section was chaired by Kinga Szálkai and opened by Nikosz Fokasz. In his presentation, the professor spoke about Hungarian and Greek auto- and heterostereotypes, and networks, based on identities and newspaper articles. The second lecture of the section was given by Pál Susánszky and Márton Gerő on the analysis of the police database on protests in Hungary, which they supplemented. The last presentation of the section, and of the whole workshop, was given by Erzsébet Takács and Flóra Takács. They put the previous presentation into a larger context and outlined the traditions and problems of domestic movement research and the instruments of demonstration.
The presentations (in Hungarian) can be found here:
The RC2S2 (Research Center for Computational Social Science, founded by the Institute of Empirical Studies of the ELTE Faculty of Social Sciences) takes part in the international research cooperation titled “European Memory Politics – Populism, Nationalism and the Challenges to a European Memory Culture” (EuMePo), led by the University of Victoria (UVic), Canada. The other partners are the Université de Strasbourg and the Adam Mickiewicz University (Poznań).