In Python, = is used to give a variable a value.
For example, the variable my_birth_year can be given the value 1988 with the following command:
my_birth_year = 1988
Now the value is stored in my_birth_year, and the following lines of code can print it, use it in a calculation, and store the result of the calculation in another variable.
print my_birth_year
print 2017 - my_birth_year
my_age = 2017 - my_birth_year
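As a small sketch of how assignment behaves, the lines below reuse the same variables. The name current_year is hypothetical and only added for illustration, and print is written with parentheses so the sketch runs under both Python 2 and Python 3.

```python
# Hypothetical values for illustration only.
current_year = 2017
my_birth_year = 1988

# The right-hand side is evaluated first, then stored under the name on the left.
my_age = current_year - my_birth_year
print(my_age)

# Changing my_birth_year later does not update my_age automatically;
# the calculation has to be run again.
my_birth_year = 1990
print(my_age)
my_age = current_year - my_birth_year
print(my_age)
```

The second print still shows the old age, because a variable keeps the value it was given until a new assignment replaces it.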
In Finland, each person is given a Personal Identity Code, or PIC for short. The PIC encodes the person's date of birth and gender, as explained in Wikipedia.
Python variables can also hold text, as shown below:
pic = '131052-308T'
Part of the text can be accessed by selecting a range from a start index up to, but not including, an end index. For example, the two-digit birth year (52 in this case, meaning 1952) is at indexes 4 and 5 (the first character is at index 0) and can be accessed with print pic[4:6].
However, combining it with a number, for example turning 52 into 1952 by adding 1900, does not work directly:
print pic[4:6] + 1900
TypeError: cannot concatenate 'str' and 'int' objects
Searching for the error message shows that the text must first be converted to a number with int() before it can be used in the calculation:
print int(pic[4:6]) + 1900
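The same slicing and conversion can be applied to the other parts of the birth date. The following is a minimal sketch, assuming the day is at indexes 0 to 1 and the month at indexes 2 to 3, and that the person was born in the 1900s; the variable names day, month and year are introduced only for this example.

```python
pic = '131052-308T'  # the example PIC used in the text

day = int(pic[0:2])           # characters at indexes 0 and 1
month = int(pic[2:4])         # characters at indexes 2 and 3
year = int(pic[4:6]) + 1900   # assumes a birth year in the 1900s

# str() converts in the other direction, from a number to text,
# so that the parts can be joined with the + operator.
print(str(day) + '.' + str(month) + '.' + str(year))
```

Joining text with + only works when every part is text, which is why each number is passed through str() first.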
It is possible to add comments to Python code that are not executed.
Just start with #
and the rest of the line is not treated as code.
In the PIC, the sex is encoded in the three digits before the final character. If that number is even, the person is female, and if it is odd, the person is male.
In Python we use the if {condition}: structure to check this:
sex = int( pic[7:10] )
if sex % 2 == 0: # check if sex is even
    print 'female'
if sex % 2 == 1: # check if sex is odd
    print 'male'
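Because a number is always either even or odd, the two separate checks can also be written as one if/else. The sketch below wraps this in a function; the function name sex_from_pic and the second PIC are hypothetical, and the final check character is not validated here.

```python
def sex_from_pic(pic):
    """Return 'female' or 'male' based on the three digits before the last character."""
    individual_number = int(pic[7:10])
    if individual_number % 2 == 0:
        return 'female'
    else:
        # Reached only when the number is odd.
        return 'male'

print(sex_from_pic('131052-308T'))  # the example PIC from the text
print(sex_from_pic('131052-307X'))  # hypothetical PIC with an odd number
```

With else, the second condition does not need to be spelled out, so the two branches cannot accidentally overlap or leave a case uncovered.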
Suppose we want to compute the total sum of all costs of the City of Helsinki, which are stored in a huge file with one cost per row. We need to go over each line of the cost data and sum the costs up. We shall split this into two subproblems:
(1) To go over a file and handle each line of text separately, we use the for {variable name} in {data source} structure:
for line in open('costs.csv'):
    line = line.strip() ## remove surrounding whitespace, including the line break
    line = line.split(',') ## split the line into separate elements at each ,-character
    cost = float( line[11] ) ## we know from the file structure that the cost is in the 12th 'column'
    print cost
(2) To compute the sum, we create a new variable that stores the sum so far and add the cost to it every time we see a new cost:
total_cost = 0 ## start from zero
for line in open('costs.csv'):
    line = line.strip() ## remove surrounding whitespace, including the line break
    line = line.split(',') ## split the line into separate elements at each ,-character
    cost = float( line[11] ) ## we know from the file structure that the cost is in the 12th 'column'
    total_cost = total_cost + cost
print total_cost
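Splitting on every comma breaks down if a field itself contains a comma inside quotes. As a hedged alternative, the sketch below uses the csv module from the standard library. The sample rows are invented stand-ins for the cost file; only the position of the cost (12th column) is taken from the text.

```python
import csv

# Hypothetical sample rows standing in for the cost file.
# The second row has a quoted first field that contains a comma.
sample_lines = [
    'a,b,c,d,e,f,g,h,i,j,k,10.50',
    '"x, y",b,c,d,e,f,g,h,i,j,k,4.25',
    'a,b,c,d,e,f,g,h,i,j,k,100.00',
]

total_cost = 0.0
for row in csv.reader(sample_lines):
    # row is already a list of fields; index 11 is the 12th column
    total_cost = total_cost + float(row[11])

print(total_cost)
```

csv.reader accepts any sequence of text lines, so with a real file the list can be replaced by an opened file object.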
Lists are variables where you can store several values in a sequence. An empty list is created with []:
stop_39 = []
and data can be added to the list using the syntax {variable_name}.append( {variable} ):
## data is assumed to already hold the lines of the stop file, header line first
for stop in data[1:]:
    stop = stop.strip()
    stop = stop.split(',')
    if stop[1] == '1039':
        print stop[0]
        stop_39.append( stop[0] )
To check whether a value is in a list, use the in operator:
data = open('passengers.csv').readlines()
number_of_passengers = 0
for stop in data[1:]:
    stop = stop.strip()
    stop = stop.split(',')
    ## print stop[2], stop[8]
    if stop[2] in stop_39:
        print 'Yes'
        number_of_passengers = number_of_passengers + int(stop[8])
print number_of_passengers
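The two steps, collecting values with append and then testing membership with in, can be tried without any files. This is a minimal sketch with invented data: the stop identifiers, column layout and passenger counts are hypothetical and much simpler than the real files.

```python
# Hypothetical rows in the form "stop_id,line_id"; the first row is a header.
stop_rows = [
    'stop_id,line_id',
    'H1001,1039',
    'H1002,1040',
    'H1003,1039',
]

stop_39 = []
for row in stop_rows[1:]:          # [1:] skips the header row
    fields = row.strip().split(',')
    if fields[1] == '1039':
        stop_39.append(fields[0])

# Hypothetical rows in the form "stop_id,passengers".
passenger_rows = ['H1001,12', 'H1002,7', 'H1003,30']

number_of_passengers = 0
for row in passenger_rows:
    fields = row.split(',')
    if fields[0] in stop_39:       # True only for stops collected above
        number_of_passengers = number_of_passengers + int(fields[1])

print(stop_39)
print(number_of_passengers)
```

Only the rows whose stop identifier was appended to stop_39 contribute to the total.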
You can also access individual elements of a list using the {variable}[index] notation: stop_39[0] gives the first stop of the bus line, stop_39[-1] gives the last stop, and stop_39[1:3] gives a list with the second and third stops.
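The indexing rules can be checked on a short list. The stop names below are hypothetical placeholders.

```python
stop_39 = ['A', 'B', 'C', 'D']  # hypothetical stop names

first = stop_39[0]      # index 0 is the first element
last = stop_39[-1]      # negative indexes count from the end
middle = stop_39[1:3]   # a slice returns a new list; index 3 is not included

print(first)
print(last)
print(middle)
print(len(stop_39))     # number of elements in the list
```

A single index returns one element, while a slice always returns a list, even when it contains only one element.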
JSON is a text-based format for storing and exchanging data. It can hold several kinds of data structures in a machine-readable form. The great thing is that the format is flexible and makes it easy to store many objects (such as lists) in a file.
To use JSON, import the json library:
import json
Now you can load (read from a file) and dump (save to a file):
data = json.load( open('yle.json') )
json.dump( data, open('yle.json', 'w') )
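To see that data survives a round trip, the sketch below saves a small structure and reads it back. The contents of data and the file name example.json are invented for illustration, and the file is written to a temporary directory so that no existing file is overwritten. The with statement is used so that each file is closed as soon as the block ends.

```python
import json
import os
import tempfile

# Hypothetical data: a dictionary that contains text, a list and a number.
data = {'title': 'Example', 'tags': ['news', 'sport'], 'views': 10}

# Write to a fresh temporary directory to avoid touching real files.
path = os.path.join(tempfile.mkdtemp(), 'example.json')

with open(path, 'w') as f:
    json.dump(data, f)      # save the structure as JSON text

with open(path) as f:
    loaded = json.load(f)   # read it back into Python objects

print(loaded == data)
```

Closing the file after writing matters here, because otherwise the saved text may not be fully on disk when the file is read again.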
Dictionaries allow storing data in a key-value format: there is only one variable, but it holds several keys, each with its own value.
person = { 'first_name' : 'John', 'last_name' : 'Smith', 'birth_year' : 2000 }
print person ## show the whole dictionary
print person.keys() ## show only keys as a list
print person.values() ## show only values as a list
print person['first_name'] ## print only the value stored under the key first_name
Using a dictionary is similar to using any other variable: its values can be changed through assignment and evaluated in the usual manner:
person['last_name'] = 'Smith-Smith'
A new key can be added just by assigning a value to it:
person['married'] = True
if person['married'] == True:
    print person['first_name'], 'is married'
The existence of a key in the dictionary can be checked with the in operator:
if 'birth_year' in person:
    print 'We know that', person['first_name'], 'was born in', person['birth_year']
if 'death_year' in person:
    print 'We know that', person['first_name'], 'died in', person['death_year']
else:
    print 'We do not know when', person['first_name'], 'died'
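Checking for a key before reading it can also be done in one step with the dictionary's get() method, which returns a fallback value when the key is missing. The sketch below also loops over all keys; the fallback text 'unknown' is an arbitrary choice for this example.

```python
person = {'first_name': 'John', 'last_name': 'Smith', 'birth_year': 2000}

# get() returns the second argument when the key is missing,
# instead of stopping the program with a KeyError.
death_year = person.get('death_year', 'unknown')
print(death_year)

# Loop over the keys in alphabetical order and look up each value.
for key in sorted(person):
    print(key + ': ' + str(person[key]))
```

Reading a missing key directly, as in person['death_year'], raises a KeyError, so either the in check or get() is needed when a key may be absent.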