Introduction Illustration: Trust in institutions Step 1: Preparation and coding of technical variables Step 2: Selection of source variables for harmonization Step 3: Mapping source values to target values Step 4: Harmonization Results: Availability of trust items Comparability of sample aggregates Appendices: Code examples Appendix 1: Data preparation Appendix 2: Codebook from labelled data in R Appendix 3: Values crosswalk Appendix 4: Harmonization Introduction Ex-post (or retrospective) data harmonization refers to procedures applied to already collected data to improve the comparability and inferential equivalence of measures collected by different studies.
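The crosswalk and harmonization steps listed above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the post's own code: the variable name `trust_4pt`, the missing-data codes 8 and 9, the reversed direction of the source scale, and the 0-10 target scale are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Crosswalk = Dict[Tuple[str, int], Optional[float]]


def build_crosswalk(var: str, valid_values: List[int], reverse: bool,
                    missing_codes: List[int],
                    target_min: float = 0.0,
                    target_max: float = 10.0) -> Crosswalk:
    """Map each source value of `var` to a value on the target scale.

    Valid values are rescaled linearly (an assumption; other mappings
    are possible). Missing-data codes are mapped to None.
    """
    lo, hi = min(valid_values), max(valid_values)
    table: Crosswalk = {}
    for v in valid_values:
        share = (v - lo) / (hi - lo)      # position within the source scale
        if reverse:                       # source scale runs high -> low
            share = 1 - share
        table[(var, v)] = target_min + share * (target_max - target_min)
    for m in missing_codes:
        table[(var, m)] = None
    return table


def harmonize(var: str, values: List[int],
              crosswalk: Crosswalk) -> List[Optional[float]]:
    """Apply the crosswalk; fail loudly on values it does not cover."""
    out = []
    for v in values:
        if (var, v) not in crosswalk:
            raise KeyError(f"no crosswalk entry for {var}={v}")
        out.append(crosswalk[(var, v)])
    return out


# Hypothetical 4-point item where 1 = most trust and 4 = least trust.
cw = build_crosswalk("trust_4pt", [1, 2, 3, 4], reverse=True,
                     missing_codes=[8, 9])
harmonized = harmonize("trust_4pt", [1, 4, 8], cw)
```

Keeping the crosswalk as an explicit table, rather than burying the recoding in a formula, means every source-to-target decision can be inspected and revised before any data are transformed.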
Working with categorical data, such as from surveys, requires a codebook. After spending some time unsuccessfully looking for a function that would create a nice, searchable codebook from labelled data in R, I decided to write my own. What I want to achieve is a simple table with variable names, labels, and frequencies of labelled values like the one below, to search for specific keywords in the value labels and to see distributions of various variables.
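The shape of such a codebook table can be sketched as follows. The post itself concerns R; this Python sketch only illustrates the idea, and the function names, the input layout (plain dictionaries of values and labels), and the example variable are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def make_codebook(data: Dict[str, List[int]],
                  var_labels: Dict[str, str],
                  value_labels: Dict[str, Dict[int, str]]) -> List[dict]:
    """Build one row per (variable, labelled value) with its frequency."""
    rows = []
    for var, values in data.items():
        counts = Counter(values)
        for code, label in sorted(value_labels.get(var, {}).items()):
            rows.append({
                "variable": var,
                "label": var_labels.get(var, ""),
                "value": code,
                "value_label": label,
                "n": counts.get(code, 0),   # 0 if the value never occurs
            })
    return rows


def search_codebook(codebook: List[dict], keyword: str) -> List[dict]:
    """Return rows whose value label contains the keyword (case-insensitive)."""
    kw = keyword.lower()
    return [row for row in codebook if kw in row["value_label"].lower()]


# Hypothetical labelled survey variable.
data = {"trust_parl": [1, 1, 2, 3, 9]}
var_labels = {"trust_parl": "Trust in parliament"}
value_labels = {"trust_parl": {1: "A great deal", 2: "Some",
                               3: "None at all", 9: "Don't know"}}

codebook = make_codebook(data, var_labels, value_labels)
hits = search_codebook(codebook, "don't know")
```

Each row carries the variable name and label alongside the value label, so a keyword search returns enough context to identify the variable without a second lookup.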
So You Want to Write a Fugue? Glenn Gould So you want to write a fugue? You’ve got the urge to write a fugue You’ve got the nerve to write a fugue So go ahead and write a fugue that we can sing Pay no heed to what we’ve told you Give no mind to what we’ve told you Just forget all that we’ve told you And the theory that you’ve read
Winner-loser trust gap across countries Winner-loser trust gap in Poland Trust differences across parties in Poland Voting for a party that ends up losing the election is known to be associated with lower satisfaction with democracy and trust in the parliament (cf. Martini and Quaranta 2019). How does Poland compare to other European countries? How has the winner-loser trust gap changed in Poland over time, and how have trust levels among supporters of current and former ruling parties changed in periods when they were not in government?
Data Packages Varieties of Democracy (V-Dem): Dedicated package Polyarchy: Semicolon delimited CSV file -> rio Freedom House: Excel file with by-year sheets Polity IV: SPSS file -> rio Democracy Barometer: Excel file with header in top rows -> rio The Standardized World Income Inequality Database (SWIID): Plain CSV file -> rio World Bank’s World Development Indicators: Dedicated package Merging all datasets Writing to file Shortly after writing this post on importing datasets in different formats (CSV, XLS, XLSX, SAV) to R, I got the following comment:
Data Packages Varieties of Democracy (V-Dem): Dedicated package Polyarchy: Semicolon delimited CSV file Freedom House: Excel file with by-year sheets Polity IV: SPSS file Democracy Barometer: Excel file with header in top rows The Standardized World Income Inequality Database (SWIID): Plain CSV file World Bank’s World Development Indicators: Dedicated package Merging all datasets Country graphs Variable graphs Writing to file with Viktoriia Muliavka Social and political scientists often need to put together datasets of country-level political, economic, and demographic variables with data from different sources.
How 2015 voters voted in 2007 and 2011 How 2007 voters voted in 2011 and 2015 About POLPAN Where did the current governing party get their votes from? Did supporters of the previous ruling party switch preferences or did they abstain from voting altogether? Cross-sectional datasets, such as one-off election polls, do not typically provide data to answer these questions. Panel studies, such as the Polish Panel Survey (POLPAN), do.
Determining meritocratic allocation Calculating the distance to meritocracy Distance to meritocracy by country Meritocracy is a principle according to which rewards are based on merit, as well as an ideal situation resulting from the operation of this principle. In their 1985 Social Forces paper titled “How Far to Meritocracy? Empirical Tests of a Controversial Thesis”, Tadeusz Krauze and Kazimierz M. Słomczyński proposed an algorithm to construct a theoretical joint distribution of education and income, given their marginal distributions, that would satisfy the conditions of meritocratic allocation.
What comes first? Wikipedia, Google, News Interest in technology Cross-correlations News coverage versus Wikipedia page views with Maria Khachatryan, Filip Kowalski, Jakub Siwiec, and Paweł Zawadzki The Hackathon Next Generation Internet Data Sprint was organized by the Digital Economy Lab of the University of Warsaw on November 9 and 10, 2018. The goal of the hackathon was to explore datasets on Wikipedia page views and edits, Reddit posts, media mentions, and others, to generate insights about the use of the internet and new technologies.
BigSurv18 and the Green City Hackathon Team number 5 Data Bike use Altitude of Bicing stations Location of mechanical and electric bike stations Empty stations by station altitude Next steps with Saleha Habibullah, Sakinat Folorunso, and Vera Paul BigSurv18 and the Green City Hackathon One of the accompanying events of the BigSurv18: Big Data Meets Survey Science conference in Barcelona last week was the Green City Hackathon.