Data Type
string: immutable, slicing
list: mutable, slicing, nested, multidimentional, ordered
dict: mutable, no ordered, no dupilicated key
tuples: immutable, sequence, better performance than list
set: no ordered, immutable
int: not iterable
Class
inheritance: keep all methods and functions
super(), make child class inherit all the methods and properties from its parent
class Student(Person):
def __init__(self, fname, lname):
super().__init__(fname, lname)
staticmethod: can be called without an object for that class, passed no extra object
class Calculator:
def addNumbers(x, y):
return x + y
# create addNumbers static method
Calculator.addNumbers = staticmethod(Calculator.addNumbers)
print('Product:', Calculator.addNumbers(15, 110))
class attributes: shared by all instances at the class level
class Person:
num_instances = 0
def __init__(self):
Person.num_instances = Person.num_instances + 1
class method: process class data instead of instance data, passed a class object
class Person:
num_instances = 0
def __init__(self):
Person.num_instances = Person.num_instances + 1
@classmethod
def print_num_instances(cls):
print(cls.num_instances)
Instance methods: passed a self instance object by default
Iterator
An iterator must have two methods: iter and next
When it reaches the end, will give an error
You can convert iterators to lists or tuples
why? More efficient for handling larger datasets.
L = [1, 2, 3]
it = iter(L)
it.__next__()
next(it)
Generator
create your own iterators with generators in Python.
def fib2(limit):
a, b = 0, 1
while a < limit:
yield a # call fib2() output a, next() output next a
a = b
b = a + b
- Using generators save memory and time because they only provide you and compute what you need for now.
- Using generators can also save the current state for you if you need it for future computations.
Recursion
a function within which it calls itself.
def mysum(l):
print(l)
if not l:
return 0
else:
return l[0] + mysum(l[1:])
Recursion error can also happen due to stack overflow
Regular Expression
specify the rules for the set of possible strings that you want to match
[a-c]: length 1, if a-c
[^a]: length 1, if not a
ap*: length unlimited, if a or a with any num of p
ap+: length >= 2, if ap or a with any num of p, num!=0
re.findall() returns a list containing all matches
re.search() returns the first match as a Match object
.start().end().span().string.group(), return none if no match
re.sub(pattern, repl, string, count=0, flags=0)
File
os.walk: Iterate a directory
import os
rootdir = './' # or you can change to any of your directory that you want to iterate over
for subdir, dirs, files in os.walk(rootdir):
for file in files:
print(subdir)
print(os.path.join(subdir, file))
# ./
# ./Lec8-HandOut.ipynb
import os
rootdir = './'
for subdir, dirs, files in os.walk(rootdir):
for file in files:
if file.endswith('.txt'):
print(os.path.join(subdir, file))
Argparse
args = sys.argv[1:]
parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('--mode', type=str, required=True)
args = parser.parse_args()
Stack
last in first out, first in last out
Linked List Format - 每一个node 的next是前面一个node
push等于加一个新node并把上一个node放在next,之后把当前node变成上一个node
pop等于把上一个node变成当前node的next,因为没有了reference,自动就删除了
# A linked list implementation
class Node:
def __init__(self, value):
self.value = value
self.next = None
class Stack:
def __init__(self):
self.head = Node("Head")
self.size = 0
# Override the pring function to print more user-friendly stack
def __str__(self):
cur = self.head.next
out = ""
while cur:
out += str(cur.value) + "->"
cur = cur.next
return out[:-3]
# Get the current size of the stack
def getSize(self):
return self.size
# Check if the stack is empty
def isEmpty(self):
return self.size == 0
# Get the top item of the stack
def peek(self):
# Sanitary check to see if we
# are peeking an empty stack.
if self.isEmpty():
raise Exception("Peeking from an empty stack")
return self.head.next.value
# Push a value into the stack.
def push(self, value):
node = Node(value)
node.next = self.head.next
self.head.next = node
self.size += 1
# Remove a value from the stack and return.
def pop(self):
if self.isEmpty():
raise Exception("Popping from an empty stack")
remove = self.head.next
self.head.next = self.head.next.next
self.size -= 1
return remove.value
# Driver Code
if __name__ == "__main__":
stack = Stack()
for i in range(1, 11):
stack.push(i)
print(f"Stack: {stack}")
for _ in range(1, 6):
remove = stack.pop()
print(f"Pop: {remove}")
print(f"Stack: {stack}")
Queue
first in first out, last in last out
from collections import deque
q = deque()
q.append('a')
q.append('b')
q.append('c')
print(q) # deque(['a', 'b', 'c'])
q.appendleft('e')
print(q) # deque(['e', 'a', 'b', 'c'])
print(q.popleft()) # e
print(q) # deque(['a', 'b', 'c'])
print(q.pop()) # c
print(q) # deque(['a', 'b'])
q.count('a') # 1
linked list - head为表头,next代表下一个node
insert - 创建node,如果没有head则定为head,如果有head则loop到最后一位并设到next
delete - 创建node,loop到最后并给prev和cur设置reference,删除prev的next
Binary Search
列表中的数必须按照从小到大的顺序排列
查询过程 - 每次取两个index值,并取一个mid,如果mid不是对象,则根据对象大小改变first和last
原理,如果对象比查询要小,则仅查询mid之前,如果对象比查询要大,则仅查询mid之后
def binarySearch(alist, item):
first = 0
last = len(alist)-1
found = False
while first<=last and not found:
midpoint = (first + last)//2
print(first, last)
print(midpoint)
if alist[midpoint] == item:
found = True
return midpoint
else:
if item < alist[midpoint]:
last = midpoint-1
else:
first = midpoint+1
return midpoint
binarySearch([1,2,2,3,4,5],2)
Bubble sort
每一次loop往前摆一个位置,只在两个数中调换
def bubbleSort(array):
for i in range(len(array)):
for j in range(1, len(array) - i - 1):
# print(i, j)
if array[j] > array[j + 1]:
temp = array[j]
array[j] = array[j+1]
array[j+1] = temp
# print(array)
return array
data = [-1, 4, 0, 5, -3]
bubbleSort(data)
Insertion Sort
每次从前面的最近比较到最远,如果前一位更大,ref后把当前数换到前一位,继续比,如果前一位更小或者没有,把当前数换成ref
每一次loop把当前数摆到正确的位置
def insertion_sort(lst):
for i in range(1, len(lst)):
val = lst[i]
j = i - 1
while j >= 0 and val < lst[j]:
lst[j+1] = lst[j]
j = j - 1
lst[j+1] = val
return lst
insertion_sort([0,-1,2,1,-3])
Selection sort
从头开始loop,当前值是index,loop一遍后面的值找到最小的,和当前值替换。所以当到第二位时,前面的值一定等于或者小于前面所有的值。
def selection_sort(lst):
size = len(lst)
for cur in range(size):
min_pos = cur
# min_val = lst[cur]
for i in range(cur + 1, size):
if lst[i] < lst[cur]:
min_pos = i
tmp = lst[min_pos]
lst[min_pos] = lst[cur]
lst[cur] = tmp
return lst