Videos

Week 2 - https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=21662ebd-0862-4163-91d0-044ec32aa617
Week 3, Exercise 1 - https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=34fcb707-a3a8-494c-9361-ec3a97502685
Week 3, Exercise 4 - https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=03f8410b-81bb-46e5-bf61-ff38d1f56d36
Week 4, Exercise 1 - https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=48fda02f-b301-469f-82d0-12ba0d34fdc1
Week 4, Exercise 2 - https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=abecce45-8040-4a7f-8e50-15fd2e088669

Examples

Working with variables

Number variables

In Python, = is used to give a variable a value. For example, the variable my_birth_year can be given the value 1988 with the following command:

my_birth_year = 1988

Now the value is stored in my_birth_year, and we can use it in the following lines of code to print it and to use it in a calculation (and to store the result of the calculation in another variable).

print my_birth_year
print 2017 - my_birth_year
my_age = 2017 - my_birth_year

In Finland, each person is given a Personal Identity Code, or PIC for short. The PIC encodes the person's date of birth and gender, as explained in Wikipedia.

Text variables

There can also be text variables in Python, as shown below:

pic = '131052-308T'

Part of a text can be accessed by selecting from a start index up to, but not including, an end index. For example, the birth year (52 in this case, meaning 1952) is at indexes 4 and 5 (the first character is at index 0) and can be printed with print pic[4:6]. However, using it as a number, for example changing the value from 52 to 1952 by adding 1900, does not work directly:

print pic[4:6] + 1900
TypeError: cannot concatenate 'str' and 'int' objects

Searching for the error message online shows that we need to change the type of the text variable to a number variable before we can do the calculation:

print int(pic[4:6]) + 1900
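
The same slicing and conversion can be applied to the whole date part of the PIC. The following is a minimal sketch, not part of the original exercise: the helper name birth_date_from_pic is made up for illustration, and it assumes, like the example above, that the person was born in the 1900s. The print call uses parentheses around a single argument so that the sketch runs under both Python 2 and Python 3.

```python
# Hypothetical helper, shown only to illustrate slicing and int().
def birth_date_from_pic(pic):
    """Return (day, month, year) as numbers, assuming a birth year in the 1900s."""
    day = int(pic[0:2])           # characters 0-1: day of the month
    month = int(pic[2:4])         # characters 2-3: month
    year = int(pic[4:6]) + 1900   # characters 4-5: two-digit year
    return day, month, year

pic = '131052-308T'
day, month, year = birth_date_from_pic(pic)
print('%d.%d.%d' % (day, month, year))  # prints 13.10.1952
```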

Comments

It is possible to add notes to Python code that are not executed. Start the note with # and the rest of the line is not treated as code.
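
As a small illustration, reusing the variables from the number example, comments can take up a whole line, follow code on the same line, or temporarily switch off a line of code. This is only a sketch; the parenthesised print is used so that it runs under both Python 2 and Python 3.

```python
# This whole line is a comment and is skipped by Python.
my_birth_year = 1988  # a comment can also follow code on the same line

# The next line is "commented out", so it is not executed:
# my_birth_year = 2000

my_age = 2017 - my_birth_year
print(my_age)  # prints 29
```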

Control structures

In the PIC, the sex is encoded in the three digits just before the final character. If that number is even, the person is female, and if it is odd, the person is male.

In Python we use the if {condition}: structure to check this:

sex = int( pic[7:10] )
if sex % 2 == 0: # check if sex is even
   print 'female'
if sex % 2 == 1: # check if sex is odd
   print 'male'
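
Because a number is either even or odd, the second check can also be written with else. The sketch below wraps the test in a function; the name sex_from_pic is hypothetical and chosen only for this illustration. The print call is parenthesised so that it works in both Python 2 and Python 3.

```python
# Hypothetical helper illustrating if / else on the PIC.
def sex_from_pic(pic):
    number = int(pic[7:10])   # the three digits before the final character
    if number % 2 == 0:       # even number
        return 'female'
    else:                     # every other case, i.e. an odd number
        return 'male'

print(sex_from_pic('131052-308T'))  # prints female
```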

Suppose we want to compute the total sum of all costs of the City of Helsinki, which are stored in a huge file with one cost per row. We need to go over each line of the cost data and add the costs up. We shall separate this into two subproblems:

(1) To go over a file and handle each line of text separately, we use the for {variable name} in {data source}: structure:

for line in open('costs.csv'):
   line = line.strip() ## remove extra spaces
   line = line.split(',') ## make the line as separate elements split by the ,-character

   cost = float( line[11] ) ## we know from the file structure that the cost is on the 12th 'column'
   print cost

(2) To compute the sum, we create a new variable that stores the sum so far and increase it by the cost every time we see a new cost:

total_cost = 0 ## start by zero
for line in open('costs.csv'):
   line = line.strip() ## remove extra spaces
   line = line.split(',') ## make the line as separate elements split by the ,-character

   cost = float( line[11] ) ## we know from the file structure that the cost is on the 12th 'column'
   total_cost = total_cost + cost

print total_cost
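
To try the same accumulation without the real file, the loop can be run over a list of text lines. This is only a sketch under stated assumptions: the function name total_cost_from_lines and the sample rows are made up, and the rows simply have the cost in the 12th column as in the example above.

```python
# Hypothetical helper: sum the value in one column over many lines.
def total_cost_from_lines(lines, column=11):
    total = 0.0                               # start from zero
    for line in lines:
        fields = line.strip().split(',')      # split the row on commas
        total = total + float(fields[column]) # add this row's cost
    return total

# Made-up rows with 12 columns each; only the last column is used.
sample_lines = [
    'a,b,c,d,e,f,g,h,i,j,k,100.50\n',
    'a,b,c,d,e,f,g,h,i,j,k,20.25\n',
    'a,b,c,d,e,f,g,h,i,j,k,4.25\n',
]

print(total_cost_from_lines(sample_lines))  # prints 125.0
```

With a real file the call would be total_cost_from_lines(open('costs.csv')). If the file starts with a header row, float() would fail on it, so that row would have to be skipped first.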

Data structure

Lists

Lists are variables where you can store several values in a sequence. An empty list is created with

stop_39 = []

and data can be added to that list by using the syntax {variable_name}.append( {variable} ). In the example below, data is assumed to already hold the lines of a file, read with readlines():

for stop in data[1:]:

    stop = stop.strip()
    stop = stop.split(',')

    if stop[1] == '1039':
        print stop[0]
        stop_39.append( stop[0] )

To check whether a value is in a list, one can use the in operator:

data = open('passengers.csv').readlines()
number_of_passanger = 0
for stop in data[1:]:
    stop = stop.strip()
    stop = stop.split(',')    
    ## print stop[2], stop[8]

    if stop[2] in stop_39:

        print 'Yes'

        number_of_passanger = number_of_passanger + int(stop[8])

print number_of_passanger
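
The two steps, collecting values with append and testing membership with in, can be tried together on in-memory data. The rows below are made up and only mimic the column positions used above (stop id and line number in the first file; stop id in column 2 and passenger count in column 8 in the second). The real files may look different.

```python
# Made-up stand-ins for the two files; the first row of each is a header.
stop_lines = [
    'stop_id,line\n',
    'S1,1039\n',
    'S2,1040\n',
    'S3,1039\n',
]
passenger_lines = [
    'c0,c1,stop_id,c3,c4,c5,c6,c7,passengers\n',
    'x,x,S1,x,x,x,x,x,12\n',
    'x,x,S2,x,x,x,x,x,7\n',
    'x,x,S3,x,x,x,x,x,5\n',
]

# Step 1: collect the stops of line 1039 into a list.
stop_39 = []
for stop in stop_lines[1:]:            # [1:] skips the header row
    fields = stop.strip().split(',')
    if fields[1] == '1039':
        stop_39.append(fields[0])

# Step 2: add up passengers only for stops that are in the list.
number_of_passengers = 0
for row in passenger_lines[1:]:
    fields = row.strip().split(',')
    if fields[2] in stop_39:
        number_of_passengers = number_of_passengers + int(fields[8])

print(stop_39)               # prints ['S1', 'S3']
print(number_of_passengers)  # prints 17
```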

You can also access certain elements of a list using the {variable}[index] notation: stop_39[0] is the first stop of the bus line, stop_39[-1] is the last stop, and stop_39[1:3] is a list containing the second and third stops.
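
A short sketch of the index notation, using a made-up list of stop names in place of the real data:

```python
# Hypothetical stop names, used only to show indexing and slicing.
stops = ['Stop A', 'Stop B', 'Stop C', 'Stop D']

first = stops[0]      # index 0 is the first element
last = stops[-1]      # negative indexes count from the end
middle = stops[1:3]   # from index 1 up to, but not including, index 3

print(first)   # prints Stop A
print(last)    # prints Stop D
print(middle)  # prints ['Stop B', 'Stop C']
```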

JSON

JSON is not actually a data structure, but rather a text-based file storage format. It allows storing various data structures in a machine-readable form. The great thing is that the format is flexible and makes it easy to store many kinds of objects (such as lists) in a file.

To utilize JSON, import the json library

import json

Now you can load (read from a file) and dump (save to a file) data:

data = json.load( open('yle.json') )
json.dump( data, open('yle.json', 'w') )
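
To see that a structure survives being saved and read back, the round trip can be tried on a small made-up object. In this sketch the file name example.json and the contents of data are assumptions chosen for illustration. The file is written to a temporary folder, and with is used so that the file is closed before it is read again.

```python
import json
import os
import tempfile

# Made-up data mixing text, a number and a list.
data = {'title': 'News', 'tags': ['politics', 'economy'], 'views': 120}

# Hypothetical file name inside a fresh temporary folder.
path = os.path.join(tempfile.mkdtemp(), 'example.json')

with open(path, 'w') as f:   # save the data
    json.dump(data, f)

with open(path) as f:        # read it back
    restored = json.load(f)

print(restored == data)      # prints True
print(restored['tags'][1])   # prints economy
```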

Dictionaries

Dictionaries allow storing data in a key-value format: there is only one variable, but it holds several keys, each with its own value.

person = { 'first_name' : 'John', 'last_name' : 'Smith', 'birth_year' : 2000 }

print person ## show the whole dictionary
print person.keys() ## show only keys as a list
print person.values() ## show only values as a list

print person['first_name'] ## print only the value stored under the key first_name

Using a dictionary is similar to using any other variable: a value can be changed through assignment and evaluated in the usual manner:

person['last_name'] = 'Smith-Smith'

A new key can be added just by assigning a value to it:

person['married'] = True

if person['married'] == True:
   print person['first_name'], 'is married'

The existence of a key in the dictionary can be checked with the in operator:

if 'birth_year' in person:
    print 'We know that', person['first_name'], 'was born in', person['birth_year']

if 'death_year' in person:
    print 'We know that', person['first_name'], 'died in', person['death_year']
else:
    print 'We do not know when', person['first_name'], 'died'
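
The checks above can be gathered into one function that builds a description from whatever keys happen to be present. This is only a sketch: the function name describe and the year 2080 are made up for illustration. The print calls take a single parenthesised argument so that the code runs under both Python 2 and Python 3.

```python
# Hypothetical helper: describe a person using only the keys that exist.
def describe(person):
    text = person['first_name'] + ' ' + person['last_name']
    if 'birth_year' in person:
        text = text + ', born in ' + str(person['birth_year'])
    if 'death_year' in person:
        text = text + ', died in ' + str(person['death_year'])
    else:
        text = text + ', year of death unknown'
    return text

person = {'first_name': 'John', 'last_name': 'Smith', 'birth_year': 2000}
print(describe(person))  # prints John Smith, born in 2000, year of death unknown

person['death_year'] = 2080  # adding a new key (made-up value)
print(describe(person))  # prints John Smith, born in 2000, died in 2080
```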

Programming for Social Science Tutorials