Marta Kołczyńska

Sociology • Social Science Data & Methods

Trends in educational attainment in Europe

Note: This part of the data processing was used to construct the poststratification tables used to create country-year estimates of political trust in Europe. The full paper, titled “Modeling public opinion over time and space: Trust in state institutions in Europe, 1989-2019”, is available on SocArXiv. This research was supported by the Bekker Programme of the Polish National Agency for Academic Mobility under award number PPN/BEK/2019/1/00133. Eurostat provides a host of useful data, including socio-demographic statistics on educational attainment, which make it possible to track changes in the educational composition of European societies over the last several years.

(In)Consistency between international corruption indicators

Contents: Overview; Scatter plots; Correlations; Trends in corruption indicators in Europe, 1990-2019. Note: Results from this post are presented more systematically in the paper “Marketplace of indicators: Inconsistencies between country trends of measures of governance”, co-authored with Paul Bürkner and available on SocArXiv. Overview: Measuring corruption is hard, especially if one is interested in having corruption indicators that are comparable across countries and over time. Arguably the most famous corruption ranking is the Corruption Perceptions Index published annually by Transparency International, but it can’t be used for over-time comparisons (cf.

Cleaning Freedom House indicators

How do you clean a very untidy data set of Freedom House country ratings, saved in an Excel sheet that violates many of the principles of data organization in spreadsheets described in this paper by Karl Broman and Kara Woo, but that is otherwise an invaluable source of data on freedom in the world? Data source: The full code used in this post is available here. I would do this: Read in the file,

Environmental attitudes in Europe

The climate protests in March 2019 mobilized over a million people around the globe. A team of social scientists from universities across Europe organized a survey of the #FridaysForFuture strike events on March 15 in 13 cities in nine countries. The report can be found here. A new wave of climate protests (and surveys) is planned for the end of September. Naturally, most participants at these protests are acutely aware of the environmental threats and motivated to take action.

Toxicity of comments to votes in Request for Adminship on English Wikipedia

This post was written during a research visit at the Department of Computer Science at Aalto University, Finland, supported by the Helsinki Institute for Information Technology. Perspective is an API that uses machine learning models to predict the impact of a comment on the conversation. One of the models predicts the extent to which the comment might be perceived as toxic. A toxic comment is defined as “a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion.”

Codebook from ISSP waves 1985-2017

The International Social Survey Programme offers a wealth of data, with thematic modules repeated roughly every 10 years, and a solid and relatively stable block of socio-demographics. The data can be downloaded from the GESIS data archive either as separate files per year or bundled by topic (e.g., the Social Inequality dataset contains data from the 1987, 1992, 1999, and 2009 rounds). There is no integrated codebook indicating the availability of variables in different rounds, so someone interested in longitudinal analyses would need to download all the files, open them, and look for the variables of interest.

Do-It-Yourself Harmonization: Exploring trust items in three European survey projects

Contents: Introduction; Illustration: Trust in institutions; Step 1: Preparation and coding of technical variables; Step 2: Selection of source variables for harmonization; Step 3: Mapping source values to target values; Step 4: Harmonization; Results: Availability of trust items; Comparability of sample aggregates; Appendices: Code examples; Appendix 1: Data preparation; Appendix 2: Codebook from labelled data in R; Appendix 3: Values crosswalk; Appendix 4: Harmonization. Introduction: Ex-post (or retrospective) data harmonization refers to procedures applied to already collected data to improve the comparability and inferential equivalence of measures collected by different studies.
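As a rough illustration of what mapping source values to target values can look like, here is a minimal Python sketch of a values crosswalk. It is not the code from the post (which uses R): the project names, response codes, the 0-1 target range, and the helpers `linear_crosswalk` and `harmonize` are all hypothetical, and linear rescaling is only one of several possible harmonization rules.

```python
def linear_crosswalk(codes, reverse=False):
    """Spread ordered source codes evenly over a 0-1 target range.

    Assumption: the source scale is ordinal and is treated as equally spaced.
    Set reverse=True when the lowest code means the most trust.
    """
    ordered = sorted(codes, reverse=reverse)
    step = 1 / (len(ordered) - 1)
    return {code: round(i * step, 3) for i, code in enumerate(ordered)}


def harmonize(values, crosswalk):
    """Map source codes to target values; codes not in the crosswalk become None."""
    return [crosswalk.get(v) for v in values]


# Hypothetical source scales from two made-up survey projects.
# Project A: 4-point scale where 1 = "a great deal" and 4 = "none at all".
crosswalk_a = linear_crosswalk([1, 2, 3, 4], reverse=True)
# Project B: 11-point scale where 0 = "no trust" and 10 = "complete trust".
crosswalk_b = linear_crosswalk(range(11))

# Codes 8 and 77 stand in for missing-value codes that have no target value.
harmonized_a = harmonize([1, 4, 8], crosswalk_a)
harmonized_b = harmonize([0, 5, 10, 77], crosswalk_b)
```

Keeping the crosswalk as an explicit dictionary, separate from the function that applies it, makes every mapping decision easy to inspect and document.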

Searchable codebook from labelled data in R

Working with categorical data, such as from surveys, requires a codebook. After spending some time unsuccessfully looking for a function that would create a nice, searchable codebook from labelled data in R, I decided to write my own. What I want to achieve is a simple table with variable names, labels, and frequencies of labelled values like the one below, to search for specific keywords in the value labels and to see distributions of various variables.
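The table described above can be sketched in a few lines. The following Python example is an illustration only, not the R function from the post: the variable names, labels, toy data, and the helpers `build_codebook` and `search_codebook` are all hypothetical.

```python
from collections import Counter


def build_codebook(data, var_labels, value_labels):
    """Return one row per (variable, value) with its labels and frequency."""
    rows = []
    for var, values in data.items():
        counts = Counter(values)
        for code in sorted(counts):
            rows.append({
                "variable": var,
                "variable_label": var_labels.get(var, ""),
                "value": code,
                "value_label": value_labels.get(var, {}).get(code, ""),
                "n": counts[code],
            })
    return rows


def search_codebook(codebook, keyword):
    """Case-insensitive keyword search in variable labels and value labels."""
    kw = keyword.lower()
    return [row for row in codebook
            if kw in row["variable_label"].lower()
            or kw in row["value_label"].lower()]


# Hypothetical toy data: two labelled survey variables.
data = {"trust_parl": [0, 5, 5, 10, 77], "gender": [1, 2, 2, 1, 2]}
var_labels = {"trust_parl": "Trust in parliament", "gender": "Gender"}
value_labels = {
    "trust_parl": {0: "No trust at all", 10: "Complete trust", 77: "Refusal"},
    "gender": {1: "Male", 2: "Female"},
}

codebook = build_codebook(data, var_labels, value_labels)
hits = search_codebook(codebook, "refus")
```

Each row of `codebook` holds a variable name, its label, a value, the value's label, and the value's frequency, so filtering by keyword and checking distributions both work off the same flat table.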

So you want to harmonize data?

“So You Want to Write a Fugue?” (Glenn Gould): So you want to write a fugue? / You’ve got the urge to write a fugue / You’ve got the nerve to write a fugue / So go ahead and write a fugue that we can sing / Pay no heed to what we’ve told you / Give no mind to what we’ve told you / Just forget all that we’ve told you / And the theory that you’ve read

Political trust among electoral winners and losers in Europe

Contents: Winner-loser trust gap across countries; Winner-loser trust gap in Poland; Trust differences across parties in Poland. Voting for a party that ends up losing the election is known to be associated with lower satisfaction with democracy and trust in the parliament (cf. Martini and Quaranta 2019). How does Poland compare to other European countries? How has the winner-loser trust gap changed in Poland over time, and how have trust levels among supporters of current and former ruling parties changed in periods when they were not in government?