> Data science course by AMA & ICAML - Practical Part

Introduction - Python

We start with a brief introduction to Python. Additional Python topics will be introduced in later parts as soon as we need them. Nevertheless, this is not a Python tutorial, and we skip many details. If you are interested, a more detailed and complete Python introduction is available here.

Hello World

As always, we start with a Hello World example. Using Python within a Jupyter notebook makes this extra simple. The function print() is the easiest way to get feedback from Python code. It takes one or more arguments inside the parentheses, which are then printed (usually to the interactive console).

Jupyter notebooks organize code in cells. If the code in a cell produces output, it is shown below the cell once the cell (and the code within) has been executed. Execute the Hello World example by clicking on the following cell (it should then be surrounded by a blue or green box) and then clicking the “> Run” button at the top of the page or pressing [ctrl+enter]. A full list of shortcuts is available at “Help” -> “Keyboard Shortcuts”.

Note: In Jupyter notebooks we can also display a value by just typing the variable’s name on the last line of a cell.

# This is a code cell, which contains executable Python code

# In Python blank lines as well as comments (text after #) are ignored

print("Hello World")           # Here we call the function print() with one textual parameter
print("Python", "is", "cool")  # You can pass multiple parameters separated by commas
Hello World
Python is cool

Printing values is usually important in two scenarios:

  1. For generating an (interactive) console program to give instructions or to show results
  2. During development to check the value of a variable (e.g. for debugging)

The second point is especially important in Jupyter notebooks, since we cannot set breakpoints. Alternatively, the code can be split into multiple cells, which makes it easy to check intermediate results.

Run the next cell to check what happens if we add two strings in Python.

h = "Hello" # here we create two variables and store text in them
w = "World"

hw = h + w  # do you know what the result of adding two texts is?

print(hw)   # let's find out by printing the variable's value
HelloWorld

Variables and Operations

In Python we don’t need to declare a variable before we assign a value to it. Neither do we need to specify the type of the variable. Instead, Python infers the type implicitly during assignment. This makes Python syntax compact and easy to learn, but it can sometimes cause unexpected behavior.

We start by doing some simple arithmetic with variables. The standard operators +, -, * and / can be used quite intuitively for numbers.

a = 1         # Assign values to variables
b = 2
c = 3

d = a - b + c # Do some math and assign the result to 'd'
print(d)

d += 1        # Augmented assignment (same as d = d + 1)
print(d)
2
3
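
Besides the four basic operators, a few related ones are worth knowing. The following cell is only a small sketch (the variable names are chosen here for illustration). Note in particular that / always returns a float in Python 3, even if both operands are integers.

```python
x = 7
y = 2

quotient = x / y          # true division: always returns a float
floor_quotient = x // y   # floor division: rounds down to a whole number
remainder = x % y         # modulo: remainder of the division
power = x ** y            # exponentiation

print(quotient, floor_quotient, remainder, power)
print(type(x / 1))        # float, although both operands are integers
```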

Variables defined within a cell are available globally in the notebook unless they were defined within a certain scope (e.g. within a function). The state of any variable persists until it is overwritten, the kernel of the notebook is stopped or the variable is explicitly deleted.

Execute the following cell a few times and observe the output. You can restart the kernel by clicking on “Kernel” -> “Restart”.

a = a * 2  # Global scope within the notebook
print(a)
2
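
Deleting a variable explicitly is done with the del statement. The next cell is a minimal sketch of this; the variable counter is hypothetical and only used for this illustration. After the deletion, accessing the name raises a NameError, which we catch here so that the cell still runs to the end.

```python
counter = 10            # hypothetical variable, used only for this illustration
counter = counter * 2
print(counter)

del counter             # explicitly delete the variable

try:
    print(counter)      # the name no longer exists, so this raises NameError
    still_defined = True
except NameError as err:
    still_defined = False
    print("NameError:", err)
```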

Because data types are not explicitly declared in Python, it is possible to overwrite a variable without any warning, and perhaps without noticing. This can be dangerous. In the next example the variable a is overwritten by a string. This implicitly changes the data type of the variable, which also affects the behavior of operations.

a = "abc"   # Overwriting without warning
a = a * 2   # Operation depends on data type
print(a)
abcabc
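
To make the dependence on the data type more explicit, the following sketch applies the same operators to an integer and to a string (the names n and t are chosen here for illustration). Mixing both types in one addition is not allowed and raises a TypeError, which we catch so that the cell runs to the end.

```python
n = 2      # an integer
t = "2"    # a string that merely looks like a number

print(n * 2)   # numeric multiplication
print(t * 2)   # string repetition
print(n + n)   # numeric addition
print(t + t)   # string concatenation

try:
    mixed = n + t          # int + str is not defined
except TypeError as err:
    mixed = None
    print("TypeError:", err)
```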

Data types and casting

Next, we will introduce the so-called primitive data types. They are listed in the following table.

Type     Keyword   Values                          Examples
---------------------------------------------------------------------------------
Integer  int       integral numbers (no decimals)  -135 / -15 / 0 / 1 / 9 / 999
Float    float     real numbers (with decimals)    -15.3 / 0.0 / 1.5 / 9.142
Complex  complex   complex numbers                 5+1j / -2+0j / 0-15j
Boolean  bool      binary / logical values         True / False
String   str       text / letters                  'Hello' / "it's a text" / 'C'
None     NoneType  single value for “nothing”      None

In Python we can use the built-in function type() to check the type of a variable.
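
As a quick sketch, type() can be applied to one sample value of each type from the table. The built-in function isinstance() is a common way to test for a specific type; the variable names below are chosen here for illustration.

```python
print(type(-15))
print(type(9.142))
print(type(5 + 1j))
print(type(True))
print(type("Hello"))
print(type(None))

number = 9.142
is_float = isinstance(number, float)   # True if 'number' is a float
print(is_float)
```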


The process of converting a value from one data type to another is called casting. This is done by using the keyword of the target data type like a function and passing the variable that should be cast as a parameter. The function returns a representation of the value in the desired data type without modifying the original variable.

In the next cells, some examples for checking data type and casting are given.

s = '5'
print(s, type(s))
5 <class 'str'>
i = int(s)
print(i, type(i))
5 <class 'int'>
sf = '5.1'
print(sf, type(sf))
5.1 <class 'str'>
f = float(sf)
print(f, type(f))
5.1 <class 'float'>
i = int(f)
print(i, type(i))
5 <class 'int'>
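
Casting has a few pitfalls that are easy to overlook. The following sketch collects some of them (the names bad and good are chosen here for illustration): int() cuts off decimals instead of rounding, bool() treats every non-empty string as True, and a string with decimals cannot be cast to int directly.

```python
print(int(5.9))            # int() cuts off the decimals, it does not round
print(int(-5.9))           # truncation is toward zero
print(str(5) + str(1))     # the numbers are joined as text
print(bool(0), bool(""))   # zero and the empty string count as False
print(bool("False"))       # any non-empty string counts as True

try:
    bad = int("5.1")       # a string with decimals cannot be cast to int directly
except ValueError as err:
    bad = None
    print("ValueError:", err)

good = int(float("5.1"))   # casting via float first works
print(good)
```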

Magic commands

In Jupyter notebooks we can use so-called magic commands. One of them is %whos, which lists all variables that currently exist together with their data type and value. Another useful magic command is %reset -f, which deletes all variables in the kernel without asking for confirmation. We can also append ? to the name of any function to open its documentation.

%whos
Variable   Type     Data/Info
-----------------------------
a          str      abcabc
b          int      2
c          int      3
d          int      3
f          float    5.1
h          str      Hello
hw         str      HelloWorld
i          int      5
s          str      5
sf         str      5.1
w          str      World
%reset -f -s
print?
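
Magic commands and the ? suffix are features of IPython/Jupyter and are not valid in plain Python scripts. As a rough plain-Python counterpart to print?, the documentation text of a function can be read from its __doc__ attribute, as sketched below. The built-in help() function shows similar information. The variable name doc is chosen here for illustration.

```python
doc = print.__doc__   # documentation string of the built-in print()
print(doc)
```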

Functions

Functions are defined using the keyword def, as shown in the following cell. The arguments are listed in parentheses, separated by commas. As with variables, there is no need to declare the type of the arguments or of the return value. The body of the function is defined by whitespace indentation. By convention, one indentation level consists of 4 spaces. The Python conventions can be found here.

Once a function has been defined, we can call it by its name and pass parameters, again using parentheses. The value returned by the function can be assigned directly to a variable.

# Definition of a function

def add(arg1, arg2):
    result = arg1 + arg2
    return result
# Usage of a function

a = 1
b = add(a, 3)
print(b)
4
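
Two consequences of this design can be sketched with a slightly modified copy of add(). The name local_sum is chosen here for illustration. First, since argument types are not declared, the same function also works for strings. Second, variables created inside a function are local: they are not available in the rest of the notebook.

```python
def add(arg1, arg2):
    local_sum = arg1 + arg2   # 'local_sum' exists only while the function runs
    return local_sum

total = add(2, 5)
text = add("Hello", "World")  # the same function also accepts strings
print(total, text)

try:
    print(local_sum)          # the local name is unknown outside the function
    leaked = True
except NameError as err:
    leaked = False
    print("NameError:", err)
```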

At this point we briefly talk about execution order. In most programming languages (including Python), statements and function calls can be nested to form compact code. Although this is often a bad idea because it decreases readability, we will see two examples in the next cells. The second one is, of course, an example of bad code. Still, it is important to be able to decompose such complex statements.

Can you guess the result? Including the data type?

a = 2*3+4*(3-1) # for arithmetic the order follows the mathematical rules, and parentheses can be used
print(a, type(a))
14 <class 'int'>
a = add(4,6)/add(int(3/2),1)
print(a, type(a))
5.0 <class 'float'>
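
One way to decompose the nested statement is to store every intermediate result in its own variable. The following sketch does this; the intermediate names are chosen here for illustration, and add() is repeated so that the cell runs on its own.

```python
def add(arg1, arg2):               # same behavior as the function defined earlier
    return arg1 + arg2

numerator = add(4, 6)              # left operand is evaluated first
ratio = 3 / 2                      # true division gives a float
truncated = int(ratio)             # int() cuts off the decimals
denominator = add(truncated, 1)    # sum of two integers
a = numerator / denominator        # division of two integers gives a float again
print(numerator, ratio, truncated, denominator)
print(a, type(a))
```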
Author: Dennis Wittich, Artem Leichter
Last modified: 15.10.2019