Aim and approach

This course provides a strong methodological foundation for applying data science methods in the social sciences. By the end of the course, students know how to (a) apply data science approaches conceptually in their work and (b) code machine learning methods in R or Python.

This course focuses on

conceptual understanding of how data science approaches are applied, built by discussing papers that have used a particular approach
hands-on skills in conducting these analyses with data sets

By the end of the course, students can

conduct data pre-processing with both quantitative and qualitative datasets
apply unsupervised machine learning methods to quantitative and qualitative data and draw conclusions relevant to social science from the results
apply supervised machine learning methods to quantitative and qualitative data and draw conclusions relevant to social science from the results
discuss the benefits and challenges of using data science methods in social science research, and apply these methods in their own research domain

If you have questions about the course, please contact us at grp-dcm-teaching@helsinki.fi.

Prerequisite

Before the first class, students should master the basics of Python or R and know how to

work with variables, for-loops and if-structures
open and write files
call functions or methods
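As a rough self-check, the prerequisite skills listed above can be sketched in a few lines of Python. This is only an illustration and not part of the official course material: the function names, the sample sentence and the file name `sample.txt` are all hypothetical. If every line reads as familiar, you are probably well prepared.

```python
import tempfile
from pathlib import Path


def count_long_words(words, min_length=5):
    """Count the words that have at least `min_length` characters."""
    count = 0                          # working with a variable
    for word in words:                 # a for-loop
        if len(word) >= min_length:    # an if-structure
            count += 1
    return count


def write_then_read(path, text):
    """Write `text` to a file, then open the file again and return its content."""
    with open(path, "w", encoding="utf-8") as handle:   # writing a file
        handle.write(text)
    with open(path, "r", encoding="utf-8") as handle:   # opening a file
        return handle.read()


# Calling a method (str.split) and a function (count_long_words).
sample = "data science methods for social sciences"   # hypothetical sample text
long_words = count_long_words(sample.split())

# The file is written to a temporary folder that is deleted afterwards.
with tempfile.TemporaryDirectory() as folder:
    content = write_then_read(Path(folder) / "sample.txt", sample)

print(long_words)
print(content)
```

The same ideas carry over directly to R if you prefer that language.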

Course materials

Read the chapter "Algorithmic Data Analysis" from Coding Social Science. Understanding and Doing Computational Social Science, and prepare by going through the exercises.
In-class coding activities

Course evaluation

The course is evaluated as pass/fail. To pass, you need 125 points. Each student chooses which learning activities to engage in to support their learning, and may combine different approaches and modules.

Attending class discussion: 5 points per class
Writing a response to a data science article of your choice, discussing its methodological choices: 5 points per response
Doing the class activity and writing a reflection diary based on it: 10 points per activity
Taking a paper of your choice that does not use data science methods and explaining how you would use data science approaches to redo it: 10 points per paper
Taking a paper of your choice that does not use data science methods and writing a replication study that uses data science methods: 25 points per paper
Writing an empirical article (with introduction, theory, methods etc.) that utilises two methods discussed in class: 80 points per article
Writing an empirical article (with introduction, theory, methods etc.) that utilises one method discussed in class: 60 points per article
Writing a brief analysis of a research problem of your choice with these methods: 15 points per article
Propose your own activity here
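To see how activities combine towards the 125-point threshold, the arithmetic can be sketched as below. The point values and the threshold come from the list above; the dictionary keys, the function name and the example plan are hypothetical and only one of many possible combinations.

```python
# Points per activity, taken from the course evaluation list.
# The key names are made up for this sketch.
POINTS = {
    "class_discussion": 5,
    "article_response": 5,
    "class_activity_reflection": 10,
    "redo_proposal": 10,
    "replication_study": 25,
    "empirical_article_two_methods": 80,
    "empirical_article_one_method": 60,
    "brief_analysis": 15,
}
PASS_THRESHOLD = 125


def total_points(plan):
    """Sum the points for a plan given as {activity name: number of times done}."""
    return sum(POINTS[activity] * count for activity, count in plan.items())


# Hypothetical plan: attend all seven classes, do three class activities
# with reflection diaries, and write one empirical article using one method.
example_plan = {
    "class_discussion": 7,
    "class_activity_reflection": 3,
    "empirical_article_one_method": 1,
}

total = total_points(example_plan)   # 7*5 + 3*10 + 1*60 = 125
passed = total >= PASS_THRESHOLD
print(total, passed)
```

An activity you propose yourself would need its point value agreed with the teachers before it could be added to such a plan.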

Data science Spring 2020

Syllabus

17.1. Introduction and Social science research questions and data science

23.1. Manual rules as data science and Working with Textual Data

31.1. K-means and cluster analysis

7.2. Topic models

14.2. Support vector machines and Naive Bayes

21.2. Decision trees and random forests

28.2. Association rules and Future Outlook