The International Social Survey Programme offers a wealth of data, with thematic modules repeated roughly every 10 years and a solid, relatively stable block of socio-demographic variables. The data can be downloaded from the GESIS data archive either as separate files per year or bundled by topic (e.g., the Social Inequality dataset contains data from rounds 1987, 1992, 1999, and 2009). There is no integrated codebook indicating which variables are available in which rounds, so anyone interested in longitudinal analyses needs to download all the files, open each one, and look for the variables of interest.
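One way to avoid opening every file by hand is to tabulate which variable names occur in which round. The sketch below is only an illustration in base R: the three small data frames and their variable names (`sex`, `age`, `educ`) are made up and stand in for the real per-round files, which would first have to be read into R.

```r
# Toy stand-ins for the per-round data files (hypothetical variable names).
rounds <- list(
  "1987" = data.frame(sex = c(1, 2), age = c(30, 41)),
  "1992" = data.frame(sex = c(2, 1), age = c(25, 52), educ = c(3, 1)),
  "1999" = data.frame(sex = c(1, 1), educ = c(2, 4))
)

# Union of all variable names found in any round.
all_vars <- sort(unique(unlist(lapply(rounds, names))))

# Logical matrix: one row per variable, one column per round.
availability <- sapply(rounds, function(d) all_vars %in% names(d))
rownames(availability) <- all_vars

print(availability)
```

Each `TRUE` cell marks a variable that is present in that round, so gaps in a time series are visible at a glance. Matching on names alone assumes that a variable keeps the same name across rounds, which would need to be checked against the documentation.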
Contents:
Introduction
Illustration: Trust in institutions
Step 1: Preparation and coding of technical variables
Step 2: Selection of source variables for harmonization
Step 3: Mapping source values to target values
Step 4: Harmonization
Results: Availability of trust items
Comparability of sample aggregates
Appendices: Code examples
Appendix 1: Data preparation
Appendix 2: Codebook from labelled data in R
Appendix 3: Values crosswalk
Appendix 4: Harmonization

Introduction

Ex-post (or retrospective) data harmonization refers to procedures applied to already collected data to improve the comparability and inferential equivalence of measures collected by different studies.
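To make the idea of mapping source values to target values concrete, here is a minimal sketch in base R. The source codes, the reversed 1 to 5 target scale, and the treatment of code 8 as missing are all assumptions invented for the illustration, not the coding of any actual survey item.

```r
# Hypothetical source variable: 1 = most trust ... 5 = least trust,
# 8 = "can't choose"; NA = already missing in the source.
source_values <- c(1, 2, 5, 3, NA, 4, 8)

# Hypothetical crosswalk: reverse the scale so that higher means more trust,
# and map the non-substantive code 8 to NA.
crosswalk <- data.frame(
  source = c(1, 2, 3, 4, 5, 8),
  target = c(5, 4, 3, 2, 1, NA)
)

# Look up every source value in the crosswalk.
# Values without a match in the crosswalk also become NA.
target_values <- crosswalk$target[match(source_values, crosswalk$source)]

print(target_values)
```

Keeping the mapping in a table rather than in nested `ifelse()` calls means the same lookup can be reused for every source variable, and the table itself documents the recoding decisions.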
Working with categorical data, such as survey data, requires a codebook. After spending some time unsuccessfully looking for a function that would create a nice, searchable codebook from labelled data in R, I decided to write my own. What I want to achieve is a simple table with variable names, labels, and frequencies of labelled values like the one below, so that I can search for specific keywords in the value labels and see the distributions of various variables.
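A rough sketch of such a table in base R follows. It is not the function described in the post: `make_labelled()` and `make_codebook()` are hypothetical helpers, the toy data are invented, and the code simply assumes that the variable label is stored in a `"label"` attribute and the value labels in a `"labels"` attribute, a convention similar to the one used for labelled data imported with the haven package.

```r
# Hypothetical helper: attach a variable label and value labels as attributes.
make_labelled <- function(x, label, labels) {
  attr(x, "label") <- label
  attr(x, "labels") <- labels
  x
}

# Toy data: a named list of vectors standing in for a survey data set.
survey <- list(
  id    = 1:5,  # unlabelled, so it is skipped in the codebook
  sex   = make_labelled(c(1, 2, 2, 1, 2), "Sex of respondent",
                        c(Male = 1, Female = 2)),
  trust = make_labelled(c(1, 3, 3, 2, 3), "Trust in parliament",
                        c(Low = 1, Medium = 2, High = 3))
)

# Hypothetical helper: one row per labelled value, with its frequency.
make_codebook <- function(data) {
  rows <- lapply(names(data), function(v) {
    x <- data[[v]]
    labs <- attr(x, "labels")
    if (is.null(labs)) return(NULL)
    var_label <- attr(x, "label")
    if (is.null(var_label)) var_label <- NA_character_
    data.frame(
      variable    = v,
      var_label   = var_label,
      value       = unname(labs),
      value_label = names(labs),
      # Count only labelled values, in the order of the value labels.
      n           = as.integer(table(factor(x, levels = unname(labs)))),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, Filter(Negate(is.null), rows))
}

codebook <- make_codebook(survey)
print(codebook)

# Keyword search in the value labels.
hits <- codebook[grepl("high", codebook$value_label, ignore.case = TRUE), ]
print(hits)
```

Because the result is an ordinary data frame, it can be filtered with `grepl()` as above or browsed in a viewer. Values that occur in the data but have no label are not counted in this sketch.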