Python: Primer

Seong-Hwan Jun

Python on the cloud

  • We will primarily use Google Colab for classroom demos.

  • Basic features of Colab

  • We may also run Python on JupyterLab via the Center for Integrated Research Computing (CIRC).

  • To request an account, speak to your advisor if they have an account with CIRC; otherwise, contact Prof. McCall (DBCB students). This is optional.

Python locally

  • Working locally is recommended for homework and projects.
  • Visual Studio Code is an integrated development environment (like RStudio) that is available free of charge.
  • More details to come on setting up the local environment.

Language basics

  • Like R, Python is an interpreted language rather than a compiled one.
  • Important distinction from R: indexing begins from 0 not 1!
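
A minimal sketch of zero-based indexing, using a hypothetical string `word` that is not part of the original slides. In R, the first element sits at index 1; in Python it sits at index 0, and the last valid index is the length minus one.

```python
# Hypothetical example string, used only to illustrate zero-based indexing.
word = "Python"

first = word[0]              # the first character is at index 0
last = word[len(word) - 1]   # the last valid index is len(word) - 1

print(first)  # P
print(last)   # n
```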

Language basics

a = 3
b = 5
print(a+b)
8

Primitive types

Data types like boolean, integer, float, and strings (characters in R) are natively supported.
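
The built-in `type()` function reports the type of a value, which can help when mapping Python types to their R counterparts. The variable names below are hypothetical and chosen only for illustration.

```python
# Hypothetical values, one per primitive type.
flag = True        # bool  (logical in R)
count = 42         # int   (integer in R)
ratio = 0.5        # float (numeric in R)
label = "hello"    # str   (character in R)

print(type(flag))   # <class 'bool'>
print(type(count))  # <class 'int'>
print(type(ratio))  # <class 'float'>
print(type(label))  # <class 'str'>
```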

Print

print("hello")
age = 5
print("hello I am " + str(age) + " years old.")
hello
hello I am 5 years old.

String formatting

age = 5
print(f"hello I am {age} years old.")
pi = 3.141
print(f"hello pi is {pi:.2f} to 2 decimal places.")
hello I am 5 years old.
hello pi is 3.14 to 2 decimal places.

Data structures: list

Ordered, mutable collection of items (similar to R’s list).

a = []
a.append(3)
a.append(5)
print(a)
[3, 5]
a[0] = 1.8
print(a)
[1.8, 5]

Data structures: list

b = [5, "hello"]
a.append(b)
print(a)
[1.8, 5, [5, 'hello']]

Data structures: list

c = ["ABC", 3.7]
d = b + c # concatenates b and c and assigns the result to d
print(d)
[5, 'hello', 'ABC', 3.7]
b.extend(c) # modifies b in place by appending the items of c
print(b)
[5, 'hello', 'ABC', 3.7]
print(len(b)) # len() is a function that retrieves the length of the list.
4

Data structures: list

Indexing and slicing:

print(b)
[5, 'hello', 'ABC', 3.7]
print(b[0:2])
[5, 'hello']
print(b[-2]) # retrieve the second element from the end
ABC
print(b[-3:]) # third last element to the end.
['hello', 'ABC', 3.7]
print(b[:3]) # first three elements.
[5, 'hello', 'ABC']

Data structures: list

aa = [1.5, 0.3, -5, 9]
aa.sort()
print(aa)
[-5, 0.3, 1.5, 9]
bb = ["hello", "ah!!!", "who?", "123", "?!@#!"]
bb.sort()
print(bb)
['123', '?!@#!', 'ah!!!', 'hello', 'who?']
bb.reverse()
print(bb)
['who?', 'hello', 'ah!!!', '?!@#!', '123']

Data structures: list

aa = [1.5, 0.3, -5, 9]
aa.remove(1.5)
print(aa)
[0.3, -5, 9]
aa.insert(1, 5) # Insert 5 at index 1.
print(aa)
[0.3, 5, -5, 9]
aa.append(10) # Insert at the end.
print(aa)
[0.3, 5, -5, 9, 10]
last_element = aa.pop() # Remove and return the last element.
print(aa)
print(last_element)
[0.3, 5, -5, 9]
10
element=aa.pop(1) # Remove and return the element at index 1.
print(element)
print(aa)
5
[0.3, -5, 9]

Data structures: list

print(aa)
aa.append(-5)
print(aa.count(-5))
[0.3, -5, 9]
2
print(aa.index(-5)) # returns the index corresponding to the first occurrence of -5.
1
aa.clear() # Remove all items.
print(aa)
[]

Data structures: tuples

Ordered, immutable collection of items.

  • Faster and more memory-efficient than a list (e.g., for access and unpacking), since no overhead is needed to handle dynamic size changes.
  • Often used by functions that return multiple objects.
  • Hashable, provided every item it contains is hashable. (What is this?)
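
On the question of what hashable means: roughly, an object is hashable if the built-in `hash()` can compute an integer for it that does not change during the object's lifetime, which is what allows it to serve as a dictionary key or set member. The sketch below uses hypothetical names `point` and `visits`, and borrows dictionaries and `try`/`except` only to make the contrast with lists visible.

```python
point = (5.2, 1.8)                     # a tuple of floats is hashable
print(isinstance(hash(point), int))    # True

visits = {point: 3}                    # so it can be used as a dictionary key
print(visits[(5.2, 1.8)])              # 3

try:
    visits[[5.2, 1.8]] = 1             # a list is mutable, hence not hashable
except TypeError as err:
    error_message = str(err)
    print(error_message)               # message mentions an unhashable type
```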

Data structures: tuples

coord = (5.2, 1.8, -1.4)
print(coord)
(5.2, 1.8, -1.4)
coord[0] = 10.0 # Immutable so we cannot modify in place.
print(coord)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[27], line 1
----> 1 coord[0] = 10.0 # Immutable so we cannot modify in place.
      2 print(coord)

TypeError: 'tuple' object does not support item assignment

Data structures: dictionary

  • Stores key-value pairs.
  • Keys need to be hashable: most primitive types are hashable.
  • Average-case runtime is \(O(1)\) for access, insertion, removal, and key lookup.
  • Cannot have duplicate keys.
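
A short sketch of the properties above, using a hypothetical dictionary `scores`. It shows that a repeated key keeps only the last value, and illustrates lookup with a default and removal by key.

```python
# Hypothetical dictionary used only for illustration.
scores = {"alice": 1, "bob": 2, "alice": 10}  # a repeated key keeps the last value
print(scores)                   # {'alice': 10, 'bob': 2}

missing = scores.get("carol", 0)  # get() returns a default instead of raising KeyError
print(missing)                  # 0

removed = scores.pop("bob")     # remove a key and return its value
print(removed)                  # 2
print(scores)                   # {'alice': 10}
```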

Data structures: dictionary

a = {}
a["key1"] = 3.5
a["key2"] = 9.5
print(a)
{'key1': 3.5, 'key2': 9.5}
b = {"k1": "value1", "k2": "v2"}
b["k1"] = "x1" # Overwrites the value at k1
print(b)
print("k1" in b) # Check if key is in dict.
{'k1': 'x1', 'k2': 'v2'}
True

Data structures: set

  • Similar to a set in mathematics: no duplicates.
  • Average-case \(O(1)\) for insertion, removal, and search.
  • \(O(n)\) for set operations: union, intersection, difference.

a = {1, 2, 3}
b = set()
b.add(1)
b.add(5)
print(1 in a) # Check if item in set
True
print(a.intersection(b))
{1}
print(a.union(b))
{1, 2, 3, 5}
print(a.difference(b))
{2, 3}
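
The same set operations are also available as operators, and a set offers a quick way to drop duplicates from a list. A small sketch, where `values` and `unique_sorted` are hypothetical names:

```python
a = {1, 2, 3}
b = {1, 5}

# Operator forms of the method calls shown above.
print(a & b == a.intersection(b))  # True
print(a | b == a.union(b))         # True
print(a - b == a.difference(b))    # True

# Removing duplicates from a list; sorted() returns a list in ascending order.
values = [3, 1, 3, 2, 1]
unique_sorted = sorted(set(values))
print(unique_sorted)               # [1, 2, 3]
```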

Control flows: if, elif, else

x = 3
if x > 5:
    print("x is greater than 5")
elif x == 5:
    print("x is equal to 5")
else:
    print("x is less than 5")
x is less than 5

Python also supports a ternary (conditional) expression:

ss = "x is greater than 5" if x > 5 else "x is less than or equal to 5"
print(ss)
x is less than or equal to 5

Control flows: for loop

for i in range(5): # exclusive so 5 is not included
    print(i)
0
1
2
3
4
for i in range(2, 5): # can specify the start value
    print(i)
2
3
4
a = ["item1", "item2", 3, 6]
for i in a:
    print(i)
item1
item2
3
6

Control flows: for loop

d = {1: "item1", 2: "item2", 3: "item3"}
for k, v in d.items():
    print("Key: " + str(k) + ", Value: " + v)
    if k == 2:
        break
Key: 1, Value: item1
Key: 2, Value: item2
a = {1, 2, 3}
for item in a:
    print(item)
1
2
3

zip

Combine two or more iterables (such as lists) into an iterator of tuples.

names = ["Alice", "Bob"]
ages = [25, 30]
for name, age in zip(names, ages):
    print(f"{name} is {age} years old.")
Alice is 25 years old.
Bob is 30 years old.
names = ["Alice", "Bob"]
ages = [25, 30]
net_assets = [1000000, 1000, 400000]
for name, age, net_asset in zip(names, ages, net_assets): # zip truncates to the length of the shortest input
    print(f"{name} is {age} years old, and has ${net_asset}.")
Alice is 25 years old, and has $1000000.
Bob is 30 years old, and has $1000.
dictionary = dict(zip(names, ages)) # forms key:value pairs.
print(dictionary) 
{'Alice': 25, 'Bob': 30}

enumerate

Iterate over the index and the item.

fruits = ["apple", "banana", "cherry"]
for index, fruit in enumerate(fruits): 
    print(f"{index}: {fruit}")
0: apple
1: banana
2: cherry
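
For readers coming from R, it may help that `enumerate` accepts a `start` argument, so the counter can begin at 1 without changing how the list itself is indexed. A minimal sketch, where `labels` is a hypothetical name:

```python
fruits = ["apple", "banana", "cherry"]

labels = []
for position, fruit in enumerate(fruits, start=1):  # the counter begins at 1
    labels.append(f"{position}: {fruit}")

print(labels)  # ['1: apple', '2: banana', '3: cherry']
```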

Comprehensions

Generate a new collection by iterating over an existing one, optionally applying conditions and operations.

squares = [x**2 for x in range(5) if x % 2 == 0]
print(squares)
[0, 4, 16]
squares = {x: x**2 for x in range(5)}
print(squares)  # Output: {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
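
To make the mechanics concrete, the sketch below writes the list comprehension above as an explicit loop and checks that both give the same result. It also shows a set comprehension; the names `squares_loop`, `squares_comp`, and `remainders` are hypothetical.

```python
# Explicit loop equivalent to [x**2 for x in range(5) if x % 2 == 0].
squares_loop = []
for x in range(5):
    if x % 2 == 0:
        squares_loop.append(x**2)

squares_comp = [x**2 for x in range(5) if x % 2 == 0]
print(squares_loop == squares_comp)  # True

# Curly braces without key: value pairs build a set instead of a dict.
remainders = {x % 3 for x in range(10)}
print(remainders == {0, 1, 2})       # True
```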

Functions

def foo(a, b):
  return a + b, a - b, a * b, a / b # Returning a tuple.

add, sub, mul, div = foo(3, 5)
print(add)
print(sub)
print(mul)
print(div)
8
-2
15
0.6

Functions

def foo(a, b) -> float: # declare return type for clarity
  return(a + b)

print(foo(5, 2)) 
7
def bar(a, b) -> float: # in larger projects, this adds clarity
  return("hah!")

print(bar(5, 2))  # but the interpreter ignores the return type
hah!
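
Parameters can be annotated in the same way as the return value. The sketch below uses a hypothetical function `scale` to show that annotations are stored on the function but not enforced at run time; an external static checker such as mypy is one way to catch mismatches before running the code.

```python
def scale(x: float, factor: float = 2.0) -> float:  # hypothetical function
    return x * factor

print(scale(1.5))             # 3.0
print(scale.__annotations__)  # the hints are stored, not checked

# Nothing stops a caller from passing a string and an int:
result = scale("ab", 2)       # string repetition, not arithmetic
print(result)                 # abab
```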

Lambda

Functions without names, often used where short, disposable functionality is needed. Similar to the anonymous functions passed to R’s apply family.

numbers = [1, 2, 3, 4]
squared = list(map(lambda x: x**2, numbers))
print(squared) 
[1, 4, 9, 16]

Lambda

numbers = [1, 2, 3, 4, 5]
odd_numbers = list(filter(lambda x: x % 2 != 0, numbers))
print(odd_numbers)
[1, 3, 5]
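
For comparison, the `map` and `filter` calls above can also be written as list comprehensions, which many Python programmers find easier to read. A minimal sketch of the equivalence:

```python
numbers = [1, 2, 3, 4, 5]

# Comprehension equivalents of map(...) and filter(...).
squared = [x**2 for x in numbers]
odd_numbers = [x for x in numbers if x % 2 != 0]

print(squared == list(map(lambda x: x**2, numbers)))              # True
print(odd_numbers == list(filter(lambda x: x % 2 != 0, numbers))) # True
```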

Lambda

from functools import reduce
numbers = [1, 2, 3, 4]
initial_value = -5
total = reduce(lambda x, y: x + y, numbers, initial_value)
print(total)
5

Lambda

pairs = [(2, 'b'), (1, 'a'), (3, 'c')]
pairs.sort(key=lambda pair: pair[0])
print(pairs)
[(1, 'a'), (2, 'b'), (3, 'c')]

Lambda

def foo(x):
  bar = lambda x: x**2 # can assign it to a variable for re-use.
  y = bar(x)
  z = 2*bar(x)
  return(y, z)

print(foo(3))
(9, 18)
bar = lambda x: x**2 # can assign it to a variable if desired.
def foo(fn, z):
  return(fn(z))

print(foo(bar, 3))
9

Context managers

A with statement opens a resource, such as a file, and guarantees that it is released when the block ends.

with open("example.txt", "w") as f:
    f.write("Hello, Python!") 

# File is closed and any resources released automatically without explicitly calling f.close()

f.write("Bye Python!") # Error.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[56], line 6
      2     f.write("Hello, Python!") 
      4 # File is closed and any resources released automatically without explicitly calling f.close()
----> 6 f.write("Bye Python!")

ValueError: I/O operation on closed file.

Context managers

with open("example.txt", "r") as f:
    print(f.readline())

print(f.readline())
Hello, Python!
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[57], line 4
      1 with open("example.txt", "r") as f:
      2     print(f.readline())
----> 4 print(f.readline())

ValueError: I/O operation on closed file.
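
To see what the with statement saves us from writing, the sketch below compares it with a roughly equivalent `try`/`finally` block. It writes to a hypothetical scratch file in a temporary directory (an assumption made here so that example.txt is left untouched).

```python
import os
import tempfile

# Hypothetical scratch file, so the sketch does not overwrite example.txt.
path = os.path.join(tempfile.mkdtemp(), "sketch.txt")

# Manual version: close() must be called explicitly, even on error.
f = open(path, "w")
try:
    f.write("Hello, Python!")
finally:
    f.close()      # runs whether or not write() raised

print(f.closed)    # True

# Same effect with a context manager.
with open(path, "w") as g:
    g.write("Hello, Python!")

print(g.closed)    # True
```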

Class

A class allows you to create your own data type made up of attributes (variables) and operations (functions).

  • Bundle attributes and operations together (encapsulation).
  • Improved reusability: a class can be extended, so that the derived class implements its own attributes and methods while inheriting the behavior of the parent class.
  • Functions defined for a class are referred to as methods.

Class

  • Instance: A specific object created from a class.
  • self: Refers to the instance on which the method is called.
  • __init__: A special method that initializes a new object, commonly called the constructor.
  • Attributes: Variables that store the state or data of an object.
  • Methods: Functions that define the behavior of an object. Every instance method must take self as its first argument.
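
A minimal sketch of the role of self, using a hypothetical `Counter` class. Calling a method on an instance passes that instance as the first argument automatically, so the two calls at the bottom do the same thing.

```python
class Counter:  # hypothetical class, for illustration only
    def __init__(self, start):
        self.count = start       # attribute stored on the instance

    def increment(self):
        self.count += 1          # self is the instance the method was called on
        return self.count

c = Counter(10)
print(c.increment())         # 11: Python passes c as self
print(Counter.increment(c))  # 12: the same call with self passed explicitly
```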

Class

class Animal:
  def __init__(self, name):
    print("Animal constructor.")
    self.name = name

  def speak(self):
    print(f"{self.name}'s default noise.")

  def print_name(self):
    print(self.name)

animal = Animal("John") # avoid naming a variable object, which shadows a built-in
animal.print_name()
animal.speak()
Animal constructor.
John
John's default noise.

Class

class Cat(Animal):
  def __init__(self, name, sound):
    print("Cat constructor.")
    super().__init__(name)
    self.sound = sound

  def speak(self): # overriding parent's method
    print(f"{self.name} {self.sound}")

  def print_name(self):
    print(self.name)

doug = Cat("Doug", "barks")
doug.print_name()
doug.speak()
Cat constructor.
Animal constructor.
Doug
Doug barks

Debugging

class TestClass:
  def __init__(self):
    print("Class constructor.")
    self.a = 5
    self.dict = dict()

  def add_to_dict(self, k, v):
    # Divide v by self.a, then assign the resulting value to self.dict[k]
    self.dict[k] = k / self.a
    print("Added to dictionary.")

  def get_item(self, k):
    return self.dict[k]

obj = TestClass()
obj.add_to_dict(1, 5)
expected_value = 5 / obj.a
realized_value = obj.get_item(1) 
print(f"Expected value: {expected_value}")
print(f"Realized value: {realized_value}") # Not what we expected.
Class constructor.
Added to dictionary.
Expected value: 1.0
Realized value: 0.2
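
One way to track down the discrepancy is to compare the code against its own comment: the comment says to divide v, but the code divides k. The sketch below shows one plausible fix in a class renamed `FixedTestClass` (a hypothetical name), with an `assert` as a lightweight check that the realized value matches the expected one.

```python
class FixedTestClass:  # hypothetical corrected version of TestClass
    def __init__(self):
        self.a = 5
        self.dict = dict()

    def add_to_dict(self, k, v):
        # Divide the value v (not the key k) by self.a before storing it.
        self.dict[k] = v / self.a

    def get_item(self, k):
        return self.dict[k]

obj = FixedTestClass()
obj.add_to_dict(1, 5)
expected_value = 5 / obj.a
realized_value = obj.get_item(1)

# Fails loudly with AssertionError if the two values ever diverge.
assert realized_value == expected_value
print(f"Realized value: {realized_value}")  # Realized value: 1.0
```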