Getting Started

Source

This is a summary of the book “A Whirlwind Tour of Python” by Jake VanderPlas.

import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Operators

Identity and Membership

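The identity operators, is and is not, check for object identity, which is different from equality: two distinct objects can hold equal values. The membership operators, in and not in, check whether a value is contained in a compound object.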

a = [1, 2, 3]
b = [1, 2, 3]
a == b
True
a is b
False
1 in a
True
4 in a
False
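
As a small sketch of the difference between identity and equality, the example below binds a second name to the same list. The names c, same_value, same_object, alias, has_four and missing are only illustrative and do not come from the book.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # a second name for the same object

same_value = a == b     # True: the contents are equal
same_object = a is b    # False: two distinct objects
alias = c is a          # True: both names refer to one object

c.append(4)             # a change made through c is visible through a
has_four = 4 in a       # True
missing = 5 not in a    # True

print(same_value, same_object, alias, has_four, missing)
# prints: True False True True True
```

Because c and a name the same object, appending through one name changes what the other sees, which is exactly what is (rather than ==) detects.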

Built-in Types

Python’s simple types:

Type      Example      Description
int       x = 1        Integers (i.e., whole numbers)
float     x = 1.0      Floating-point numbers (i.e., real numbers)
complex   x = 1 + 2j   Complex numbers (i.e., numbers with real and imaginary part)
bool      x = True     Boolean: True/False values
str       x = 'abc'    String: characters or text
NoneType  x = None     Special object indicating nulls

Complex Numbers

complex(1, 2)
(1+2j)

Alternatively, we can use the “j” suffix in expressions to indicate the imaginary part:

1 + 2j
(1+2j)
c = 3 + 4j
c.real
3.0
c.imag
4.0
c.conjugate()
(3-4j)
abs(c)
5.0

String Type

msg = "what do you like?" # double quotes
response = 'spam' # single quotes
# length
len(response)
4
# Upper/lower case
response.upper()
'SPAM'
# Capitalize, see also str.title()
msg.capitalize()
'What do you like?'
# concatenation with +
msg + response
'what do you like?spam'
# multiplication is multiple concatenation
5 * response
'spamspamspamspamspam'
# Access individual characters (zero-based indexing, as with lists)
msg[0]
'w'

None Type

None is most commonly used as the default return value of a function.

type(None)
NoneType
ret_val = print("abc")
abc
print(ret_val)
None

Likewise, any function in Python with no return value is, in reality, returning None.
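
A minimal sketch of this behaviour with a user-defined function; greet is a hypothetical function written only for this illustration.

```python
def greet(name):
    # No return statement, so the function implicitly returns None
    print("hello,", name)

result = greet("world")

# The idiomatic test for None uses `is`, since None is a single object
is_none = result is None
print(is_none)
# prints: True
```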

Boolean

Booleans can also be constructed using the bool() object constructor. Values of any other type can be converted to Boolean via predictable rules:

  • Any numeric type is False if equal to zero, and True otherwise

  • The Boolean conversion of None is always False

  • For strings, bool(s) is False for empty strings and True otherwise

  • For sequences, the Boolean representation is False for empty sequences and True for any other sequences

# numeric type
bool(0)
False
bool(1)
True
a = 0
if not a:
    print("a")
a
bool(None)
False
bool("")
False
bool("Hello World!")
True
bool([])
False
bool([1])
True
l_1 = [1, 2, 3]
l_2 = []

def is_empty(l):
    if l:
        print("not empty")
        return False
    else:
        print("empty")
        return True
is_empty([1, 2, 3])
not empty
False
is_empty([])
empty
True

Built-In Data Structures

Type Name  Example                  Description
list       [1, 2, 3]                Ordered collection
tuple      (1, 2, 3)                Immutable ordered collection
dict       {'a':1, 'b':2, 'c':3}    (key, value) mapping (insertion-ordered since Python 3.7)
set        {1, 2, 3}                Unordered collection of unique values
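
The following sketch touches each of the four structures once to show the property listed in its description. All variable names here are illustrative assumptions.

```python
numbers = [3, 1, 2]            # list: ordered and mutable
numbers.append(4)

point = (1, 2, 3)              # tuple: ordered, but cannot be modified
try:
    point[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False      # item assignment on a tuple raises TypeError

ages = {"a": 1, "b": 2}        # dict: values are looked up by key
ages["c"] = 3

unique = set([1, 2, 2, 3, 3])  # set: duplicate values collapse

print(numbers, tuple_changed, ages["c"], unique)
```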

Defining and Using Functions

*args and **kwargs

*args and **kwargs let us write a function when we don’t know in advance how many arguments the user will pass.

  • *args:

    • * before a variable means “expand this as a sequence”

    • args is short for “arguments”

  • **kwargs

    • ** before a variable means “expand this as a dictionary”

    • kwargs is short for “keyword arguments”

def catch_all(*args, **kwargs):
    print("args = ", args)
    print("kwargs = ", kwargs)
catch_all(1, 2, 3, a=4, b=5)
args =  (1, 2, 3)
kwargs =  {'a': 4, 'b': 5}
inputs = (1, 2, 3)
keywords = {"one": 1, "two": 2}

catch_all(*inputs, **keywords)
args =  (1, 2, 3)
kwargs =  {'one': 1, 'two': 2}
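
One common use of this pattern is forwarding arguments unchanged to another function. The sketch below assumes two hypothetical helpers, logged and power, that are not part of the book.

```python
def logged(func, *args, **kwargs):
    """Call func with whatever arguments were given and report them."""
    print("calling", func.__name__, "with", args, kwargs)
    # Expand the collected arguments again when forwarding them
    return func(*args, **kwargs)

def power(base, exponent=2):
    return base ** exponent

squared = logged(power, 3)             # args is (3,), kwargs is {}
cubed = logged(power, 3, exponent=3)   # kwargs is {'exponent': 3}
print(squared, cubed)
# last line printed: 9 27
```

The wrapper never needs to know the signature of the function it calls, which is why the same syntax appears on both the collecting and the expanding side.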

Iterators

enumerate

enumerate is the “Pythonic” way to iterate over the indices and values of a list.

l = [2, 4, 6, 8, 10]
for i, val in enumerate(l):
    print("index: {}, value: {}".format(i, val))
index: 0, value: 2
index: 1, value: 4
index: 2, value: 6
index: 3, value: 8
index: 4, value: 10
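
For comparison, here is a sketch of the manual index bookkeeping that enumerate replaces, together with its optional start argument. The names manual, pairs and numbered are illustrative.

```python
l = [2, 4, 6, 8, 10]

# Manual version: loop over indices and look each value up
manual = []
for i in range(len(l)):
    manual.append((i, l[i]))

# enumerate yields the same (index, value) pairs directly
pairs = list(enumerate(l))

# start changes the first counter value, not the position in the list
numbered = list(enumerate(l, start=1))
print(numbered[0], numbered[-1])
# prints: (1, 2) (5, 10)
```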

zip

zip iterates over multiple lists simultaneously.

L = [1, 3, 5, 7, 9]
R = [2, 4, 6, 8, 10]

for l_val, r_val in zip(L, R):
    print("L: {}, R: {}".format(l_val, r_val))
L: 1, R: 2
L: 3, R: 4
L: 5, R: 6
L: 7, R: 8
L: 9, R: 10
for i, val in enumerate(zip(L, R)):
    print("Index: {}, L: {}, R: {}".format(i, val[0], val[1]))
Index: 0, L: 1, R: 2
Index: 1, L: 3, R: 4
Index: 2, L: 5, R: 6
Index: 3, L: 7, R: 8
Index: 4, L: 9, R: 10
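
Two details are worth a quick sketch: zip stops at the shortest input, and the pair produced by zip can be unpacked directly in the loop header instead of indexing into it. The shortened list R and the names short and rows are assumptions for this example.

```python
L = [1, 3, 5, 7, 9]
R = [2, 4]   # deliberately shorter than L

# zip stops as soon as the shortest input is exhausted
short = list(zip(L, R))
print(short)
# prints: [(1, 2), (3, 4)]

# Nested unpacking replaces val[0] and val[1] with named variables
rows = []
for i, (l_val, r_val) in enumerate(zip(L, R)):
    rows.append("Index: {}, L: {}, R: {}".format(i, l_val, r_val))
print(rows)
```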

map and filter

map: takes a function and applies it to the values in an iterator

func = lambda x: x + 1
l = [1, 2, 3, 4, 5]
list(map(func, l))
[2, 3, 4, 5, 6]

filter: passes through only the values for which the filter function evaluates to True

is_even = lambda x: x % 2 == 0
list(filter(is_even, l))
[2, 4]
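
map and filter can be chained, and the same computation can usually be written as a list comprehension. A small sketch, with chained, result and comprehension as illustrative names:

```python
l = [1, 2, 3, 4, 5]
is_even = lambda x: x % 2 == 0

# filter first, then map; both are lazy, so nothing is computed
# until list() consumes the chain
chained = map(lambda x: x + 1, filter(is_even, l))
result = list(chained)

# The same computation written as a list comprehension
comprehension = [x + 1 for x in l if x % 2 == 0]

print(result, comprehension)
# prints: [3, 5] [3, 5]
```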

Iterators as function arguments

It turns out that the *args syntax works not just with sequences, but with any iterator:

print(*range(5))
0 1 2 3 4
list(range(3))
[0, 1, 2]
print(*map(lambda x: x + 1, range(3)))
1 2 3
L1 = [1, 2, 3, 4]
L2 = ["a", "b", "c", "d"]

z = zip(L1, L2)
print(*z)
(1, 'a') (2, 'b') (3, 'c') (4, 'd')
z = zip(L1, L2)
new_L1, new_L2 = zip(*z)
new_L1
(1, 2, 3, 4)
new_L2
('a', 'b', 'c', 'd')

Specialized Iterators: itertools

from itertools import permutations

p = permutations(range(3))
print(*p)
(0, 1, 2) (0, 2, 1) (1, 0, 2) (1, 2, 0) (2, 0, 1) (2, 1, 0)
p
<itertools.permutations at 0x10fb32710>
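
The itertools module offers other iterators in the same style. As a brief sketch, combinations and product (both standard library functions) can be used like this; the variable names are illustrative.

```python
from itertools import combinations, product

# All unordered pairs drawn from 0..3
c = list(combinations(range(4), 2))
print(c)
# prints: [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

# Every pairing of an item from the first iterable with one from the second
p = list(product("ab", range(2)))
print(p)
# prints: [('a', 0), ('a', 1), ('b', 0), ('b', 1)]
```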

List Comprehensions

l = [1, 2, 3, 4, 5]
[2 * el for el in l if el > 3]
[8, 10]

This is equivalent to the following loop, but the list comprehension is much easier to write and to understand:

L = []
for el in l:
    if el > 3:
        L.append(2 * el)
        
L
[8, 10]
[(i, j) for i in range(2) for j in range(3)]
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
print(*range(10))

# Leave out multiples of 3, and negate all multiples of 2
[val if val % 2 else -val for val in range(10) if val % 3]
0 1 2 3 4 5 6 7 8 9
[1, -2, -4, 5, 7, -8]
L = []

for val in range(10):
    if val % 3 != 0: # conditional on iterator 
        # conditional on value
        if val % 2 != 0:
            L.append(val)
        else:
            L.append(-val)

L
[1, -2, -4, 5, 7, -8]
{n * 2 for n in range(5)}
{0, 2, 4, 6, 8}
{a % 3 for a in range(100)}
{0, 1, 2}

Generators

Difference between list comprehensions and generator expressions:

List comprehensions use square brackets, while generator expressions use parentheses

# list comprehension:
[n * 2 for n in range(5)]
[0, 2, 4, 6, 8]
# generator
g = (n * 2 for n in range(5))
list(g)
[0, 2, 4, 6, 8]

A list is a collection of values, while a generator is a recipe for producing values

When you create a list, you are actually building a collection of values, and there is some memory cost associated with that.

When you create a generator, you are not building a collection of values, but a recipe for producing those values.

Both expose the same iterator interface.

l = [n * 2 for n in range(5)]
for val in l:
    print(val, end=" ")
0 2 4 6 8
g = (n * 2 for n in range(5))
for val in g:
    print(val, end=" ")
0 2 4 6 8

The difference is that a generator expression does not actually compute the values until they are needed. This not only leads to memory efficiency, but to computational efficiency as well! This also means that while the size of a list is limited by available memory, the size of a generator expression is unlimited!
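
To make the point about unbounded size concrete, here is a sketch of a generator expression over an infinite counter. It relies on itertools.count and itertools.islice from the standard library; the names squares, first_five and next_one are illustrative.

```python
from itertools import count, islice

# count() never stops, so a list of these values could not be built,
# but the generator expression is created instantly
squares = (n ** 2 for n in count())

# Values are computed only when requested
first_five = list(islice(squares, 5))
next_one = next(squares)

print(first_five, next_one)
# prints: [0, 1, 4, 9, 16] 25
```

Only six squares are ever computed here, even though the generator describes an infinite sequence.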

A list can be iterated multiple times; a generator expression is single-use

l = [n * 2 for n in range(5)]

for val in l:
    print(val, end=" ")

print("\n")

for val in l:
    print(val, end=" ")
0 2 4 6 8 

0 2 4 6 8
g = (n * 2 for n in range(5))

list(g)
[0, 2, 4, 6, 8]
list(g)
[]

This can be very useful because it means iteration can be stopped and started:

g = (n ** 2 for n in range(12))

for n in g:
    print(n, end=" ") 
    if n > 30:
        break

print("\nDoing something in between...")

for n in g:
    print(n, end=" ")
0 1 4 9 16 25 36 
Doing something in between...
49 64 81 100 121

This is useful when working with collections of data files on disk; it means that you can quite easily analyze them in batches, letting the generator keep track of which ones you have yet to see.
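
A minimal sketch of the batching idea, assuming a set of hypothetical file names and a hypothetical helper take_batch. No files are actually opened; the point is only that the generator remembers its position between batches.

```python
from itertools import islice

# Hypothetical file names; only the names are generated, nothing is read
files = ("data_{:02d}.csv".format(i) for i in range(7))

def take_batch(iterator, size):
    """Return up to `size` items, advancing the shared iterator."""
    return list(islice(iterator, size))

first = take_batch(files, 3)    # the first three names
second = take_batch(files, 3)   # continues where the first batch stopped
rest = take_batch(files, 3)     # only one name is left

print(first, second, rest)
```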

Generator Functions: Using yield

# list comprehension

L1 = [n * 2 for n in range(5)]

L2 = []
for n in range(5):
    L2.append(n * 2)

print("L1:", L1)
print("L2:", L2)
L1: [0, 2, 4, 6, 8]
L2: [0, 2, 4, 6, 8]
# generator
G1 = (n * 2 for n in range(5))

# generator function
def gen():
    for n in range(5):
        yield n * 2

G2 = gen()

print(*G1)
print(*G2)
0 2 4 6 8
0 2 4 6 8
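
Generator functions become more useful when the values depend on state carried between iterations, which is awkward to express as a single expression. The Fibonacci sketch below is an illustration chosen for this summary, not an example taken from the book.

```python
def fibonacci(limit):
    """Yield the Fibonacci numbers that are smaller than limit."""
    a, b = 0, 1
    while a < limit:
        yield a            # execution pauses here until the next value is requested
        a, b = b, a + b    # local state survives between yields

fib = fibonacci(50)
first = next(fib)    # runs the function up to the first yield
rest = list(fib)     # resumes and collects everything that remains

print(first, rest)
# prints: 0 [1, 1, 2, 3, 5, 8, 13, 21, 34]
```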

Example: Prime Number Generator

# Generate a list of candidates
L = [n for n in range(2, 40)]
print(L)
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39]
# Remove all multiples of the first value
L = [n for n in L if n == L[0] or n % L[0] > 0]
print(L)
[2, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39]
# Remove all multiples of the second value
L = [n for n in L if n == L[1] or n % L[1] > 0]
print(L)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37]

If we repeat this procedure enough times on a large enough list, we can generate as many primes as we wish.

Encapsulate this logic in a generator function:

def gen_primes(N):
    """
    Generate primes less than N
    """
    primes = set()
    for n in range(2, N):
        # print("n = ", n, ":", *(n % p > 0 for p in primes))
        if all(n % p > 0 for p in primes):
            primes.add(n)
            yield n


print(*gen_primes(100))
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
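
One possible refinement, sketched here as a variant rather than as the book's method, is to stop testing divisors once p * p exceeds n. This requires keeping the primes in increasing order, so the sketch uses a list instead of a set. The function name gen_primes_sqrt is hypothetical.

```python
def gen_primes_sqrt(N):
    """Generate primes less than N, testing only primes p with p * p <= n."""
    primes = []   # kept in increasing order
    for n in range(2, N):
        is_prime = True
        for p in primes:
            if p * p > n:
                # A composite n always has a prime factor p with p * p <= n,
                # so no larger prime needs to be tested
                break
            if n % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(n)
            yield n

print(*gen_primes_sqrt(30))
# prints: 2 3 5 7 11 13 17 19 23 29
```

The output matches the simpler generator above; the difference is only in how many divisions are performed for each candidate.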