2 Collections

Data Types for Collections (built-ins)

  • Beyond scalars, Python provides container types to hold multiple values.
  • Core built-in collections you’ll use daily:
    • Sequences
      • list (mutable) — ordered, indexable: x = [1, 2, 3]
      • tuple (immutable) — ordered, indexable: t = (1, 2, 3)
      • range (immutable sequence, computed on demand): r = range(1, 11)
    • dict (mutable) — mapping of keys → values: d = {"name": "Rasel", "dept": "ISRT"}
    • set (mutable, unique items) — membership & set algebra: s = {1, 2, 3}
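Mutability & order
  • list, dict, set are mutable.
  • tuple, range, str are immutable (the container cannot change size/content in place).
  • list, tuple, range, and str keep their elements in order; dict preserves insertion order; set is unordered.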
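
A minimal sketch of what the mutable/immutable split means in practice: item assignment works on a list but raises TypeError on a tuple or a str. The variable names are illustrative only.

```python
# In-place item assignment works on a mutable list...
nums = [1, 2, 3]
nums[0] = 10

# ...but raises TypeError on immutable sequences such as tuple and str.
rejected = []
for frozen in ((1, 2, 3), "abc"):
    try:
        frozen[0] = 10
    except TypeError:
        rejected.append(type(frozen).__name__)

print(nums)      # [10, 2, 3]
print(rejected)  # ['tuple', 'str']
```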

1. List

A list stores an ordered sequence of items. Create with square brackets:

li1 = [47, 52, 71, 90]
li1, type(li1), len(li1)
([47, 52, 71, 90], list, 4)
  • Lists can contain mixed types (useful but use judiciously):
li2 = [1, "a", [3, 4]]
li2, len(li2)
([1, 'a', [3, 4]], 3)
  • Membership test:
47 in li1, 99 in li1
(True, False)

Methods

  • Most of the functions we use with lists are methods.
  • The distinction between a method and other kinds of functions is subtle; we’ll cover it in detail later.
  • For now, note:
    1. A method is a function attached to a specific object/type (e.g., int, float, str, list).
    2. You call a method on an object with dot notation: object.method(...).
  • List methods often mutate the object (change it in place) and return None; methods of immutable types such as str return a new object instead.
  • Example methods you’ll meet right away: append, extend, insert, pop, sort, clear, copy.
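
As a small illustration of the points above, the sketch below contrasts a list method that changes the object in place with a str method that returns a new object. The names scores and word are made up for the example.

```python
scores = [3, 1, 2]
result = scores.sort()   # sorts in place; the return value is None

word = "isrt"
upper = word.upper()     # str is immutable, so a new string comes back

print(scores, result)    # [1, 2, 3] None
print(word, upper)       # isrt ISRT
```

A common slip is writing scores = scores.sort(), which rebinds scores to None.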

Adding & removing elements

li1.append(24)       # add a single item to the end
li1
[47, 52, 71, 90, 24]
li1.insert(2, 100)    # insert at index 2
li1
[47, 52, 100, 71, 90, 24]
li1.pop(1)            # remove-and-return item at index 1
li1
[47, 100, 71, 90, 24]
  • append(x) adds one element; extend(iterable) adds many elements:
x = [4, None, "foo"]
x.extend([7, 8, (2, 3)])   # appends each element of the iterable
x
[4, None, 'foo', 7, 8, (2, 3)]
  • Concatenation creates a new list:
[4, None, "foo"] + [7, 8, (2, 3)]
[4, None, 'foo', 7, 8, (2, 3)]
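
The difference between append, extend, and + is easiest to see side by side. This is a sketch with made-up values; the only assumption is that each variant starts from a fresh copy of the same list.

```python
base = [1, 2]

appended = base.copy()
appended.append([3, 4])   # the whole list becomes ONE new element

extended = base.copy()
extended.extend([3, 4])   # each element is added separately

combined = base + [3, 4]  # builds a new list; base is left untouched

print(appended)        # [1, 2, [3, 4]]
print(extended)        # [1, 2, 3, 4]
print(combined, base)  # [1, 2, 3, 4] [1, 2]
```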

Sorting

  • In-place sort (stable) with optional reverse= and key=:
a = [1, 2, 3, 31, 32, 33, 11, 12, 13]
a.sort()
a
[1, 2, 3, 11, 12, 13, 31, 32, 33]
a.sort(reverse=True)
a
[33, 32, 31, 13, 12, 11, 3, 2, 1]
names = ["carol", "Bob", "alice"]
names.sort(key=str.lower)  # case-insensitive sort
names
['alice', 'Bob', 'carol']
  • To get a sorted copy without modifying the original:
b = [3, 1, 2]
sorted_b = sorted(b, reverse=True)
b, sorted_b
([3, 1, 2], [3, 2, 1])
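
"Stable" means that items comparing equal keep their original relative order. The sketch below uses hypothetical (name, score) records to show this, plus one way to sort on two criteria with a tuple key.

```python
records = [("carol", 71), ("alice", 90), ("bob", 71), ("dave", 90)]

# Sort by score only: ties keep their input order (carol before bob).
by_score = sorted(records, key=lambda rec: rec[1])
print(by_score)
# [('carol', 71), ('bob', 71), ('alice', 90), ('dave', 90)]

# Sort by score descending, then by name ascending, using a tuple key.
ranked = sorted(records, key=lambda rec: (-rec[1], rec[0]))
print(ranked)
# [('alice', 90), ('dave', 90), ('bob', 71), ('carol', 71)]
```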

Slicing

  • Basic slice: seq[start:stop] returns a new list; start inclusive, stop exclusive.
seq = [1, 2, 3, 41, 42, 43, 11, 12, 13]
seq[3:4], seq[3:5]
([41], [41, 42])
  • Omit start or stop to use defaults:
seq[:5], seq[5:]
([1, 2, 3, 41, 42], [43, 11, 12, 13])
  • Negative indices count from the end:
seq[-4:], seq[-6:-2]
([43, 11, 12, 13], [41, 42, 43, 11])
  • Slice assignment replaces a region in place (the replacement may have a different length):
seq[3:5] = [6, 3]
seq
[1, 2, 3, 6, 3, 43, 11, 12, 13]
  • Step (stride) with seq[start:stop:step] (can be negative):
seq[::2], seq[::-1]   # every 2nd; reversed copy
([1, 3, 3, 11, 13], [13, 12, 11, 43, 3, 6, 3, 2, 1])
  • Slicing semantics take a bit of getting used to, especially for users coming from R, where indexing starts at 1 and both endpoints are included; Python starts at 0 and excludes stop.
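
A few slicing behaviours that tend to surprise newcomers, sketched on a throwaway list: a slice assignment can shrink or grow the list, an empty slice inserts without removing anything, and an out-of-range slice is clipped rather than raising IndexError.

```python
letters = ["a", "b", "c", "d", "e"]

letters[1:3] = ["X"]         # replace two items with one
step1 = letters.copy()       # ['a', 'X', 'd', 'e']

letters[2:2] = ["Y", "Z"]    # empty slice: pure insertion at index 2
step2 = letters.copy()       # ['a', 'X', 'Y', 'Z', 'd', 'e']

del letters[-2:]             # delete the last two items
tail = letters[2:100]        # stop beyond the end is clipped, no error

print(step1, step2)
print(letters, tail)         # ['a', 'X', 'Y', 'Z'] ['Y', 'Z']
```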

Names, objects & mutability

  • Names point to objects. Rebinding a name does not change other names.
a = 1947
b = a     # both refer to the same int object (1947)
a = 1971  # a now refers to a new int object; b still 1947
print(a, b)
1971 1947
  • Mutable objects (like lists) can change in place:
orig = [1, 2, 3]
alias = orig          # same object
orig.append(99)
alias, orig
([1, 2, 3, 99], [1, 2, 3, 99])
Key Idea
  • Scalars (int, float, bool, str) are immutable; when you “change” them, you are actually re-binding the name to a new object.
  • Containers such as list, dict, and set are mutable; mutation changes the object itself without re-binding the name.
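
The is operator checks whether two names refer to the same object, which makes the rebinding-versus-mutation distinction visible. A minimal sketch with illustrative names:

```python
first = [1, 2, 3]
second = first                 # a second name for the same object
same_before = first is second  # True

first = first + [4]            # + builds a new list and rebinds first
same_after = first is second   # False; second still sees [1, 2, 3]
snapshot = second.copy()

third = second
third += [99]                  # += on a list mutates it in place
print(same_before, same_after) # True False
print(snapshot, second)        # [1, 2, 3] [1, 2, 3, 99]
```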

Copying lists

  • Sometimes you need a separate list object so later mutations don’t affect the original.
orig = [1, 2, 3]
alias = orig          # same object
shallow = orig[:]     # new list (shallow copy) — or: orig.copy(), list(orig)
orig.append(99)
alias, shallow
([1, 2, 3, 99], [1, 2, 3])
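
A shallow copy duplicates only the outer list; nested objects are still shared. The sketch below shows the difference, using copy.deepcopy from the standard library for a fully independent copy. The nested list grid is a made-up example.

```python
import copy

grid = [[1, 2], [3, 4]]
shallow = grid.copy()         # new outer list, same inner lists
deep = copy.deepcopy(grid)    # new outer list AND new inner lists

grid[0].append(99)            # mutate a shared inner list
grid.append([5])              # mutate the outer list only

print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```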

2. Tuple

A tuple is an immutable, ordered sequence.

tup1 = (4, 5, 6)
tup2 = 4, 5, 6          # parentheses optional in many contexts
tup1, tup2
((4, 5, 6), (4, 5, 6))
  • Nested tuples are fine:
nested_tup = (4, 5, 6), (7, 8)
nested_tup
((4, 5, 6), (7, 8))
  • Convert any iterable to a tuple:
tuple([4, 0, 2]), tuple("string")
((4, 0, 2), ('s', 't', 'r', 'i', 'n', 'g'))
  • Index like lists:
tup = ("pre", "post", 3)
tup[0], tup[-1]
('pre', 3)
  • Concatenate / repeat:
(4, None, "foo") + (6, 0) + ("bar",), ("pre", "post") * 3
((4, None, 'foo', 6, 0, 'bar'), ('pre', 'post', 'pre', 'post', 'pre', 'post'))
Immutability nuance

Tuples are immutable, but they can contain mutable objects:

t = (1, [2, 3])
t[1].append(4)   # allowed: we mutate the list inside the tuple
t
(1, [2, 3, 4])

What’s immutable is the container (which objects are stored), not necessarily the contents.
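
Two consequences of this nuance, sketched below: a slot of the tuple cannot be rebound to a different object, and a tuple that contains a mutable object is not hashable, so it cannot serve as a dict key or set element. The names are illustrative.

```python
t = (1, [2, 3])

try:
    t[1] = [9, 9]             # rebinding a slot is not allowed
    slot_rebound = True
except TypeError:
    slot_rebound = False

try:
    hash(t)                   # fails because the inner list is unhashable
    t_hashable = True
except TypeError:
    t_hashable = False

grid_labels = {(0, 0): "origin"}   # a tuple of ints works fine as a key
print(slot_rebound, t_hashable)    # False False
print(grid_labels[(0, 0)])         # origin
```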


Tuple unpacking (including starred)

x, y = (10, 20)
x, y
(10, 20)
a, *b, c = (1, 2, 3, 4, 5)
a, b, c
(1, [2, 3, 4], 5)
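
Unpacking shows up in a few everyday idioms. The sketch below uses made-up data to show swapping two names, unpacking pairs inside a for loop, and using a starred name to split a sequence into head and rest.

```python
a, b = 1, 2
a, b = b, a                      # swap without a temporary variable

pairs = [("x", 1), ("y", 2)]
labels = []
for name, value in pairs:        # each pair is unpacked per iteration
    labels.append(f"{name}={value}")

head, *rest = [10, 20, 30]       # the starred name always gets a list

print(a, b)        # 2 1
print(labels)      # ['x=1', 'y=2']
print(head, rest)  # 10 [20, 30]
```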

3. Range

  • range is an immutable sequence of evenly spaced integers.
  • General form: range(start, stop, step)
    • start → first value (default = 0)
    • stop → sequence goes up to but not including this value
    • step → increment (default = 1)
  • Internally, it only stores start, stop, and step, and computes elements on demand.
  • Supports efficient operations: len(), indexing, slicing, and membership testing.
r = range(10)            # 0...9
len(r), r[0], r[-1]
(10, 0, 9)
list(range(0, 20, 2))
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
list(range(10, 0, -3))
[10, 7, 4, 1]
  • Helpful properties:
5 in range(10), 25 in range(10)   # fast membership check
(True, False)
range(0, 10, 2)[2:5]              # slicing a range yields another range
range(4, 10, 2)
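
Because a range stores only start, stop, and step, its memory footprint does not grow with its length. The sketch below compares sizes with sys.getsizeof; exact byte counts depend on the interpreter, so only the comparison is shown (checked here under the assumption of CPython).

```python
import sys

big = range(1_000_000)           # one million values, nothing materialized
small_list = list(range(1_000))  # one thousand values, all stored

range_is_smaller = sys.getsizeof(big) < sys.getsizeof(small_list)
print(range_is_smaller)                        # True on CPython
print(len(big), big[-1], 999_999 in big)       # 1000000 999999 True

# A positive step with start >= stop yields an empty range, not an error.
empty = list(range(5, 0))
print(empty)                                   # []
```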

4. Dictionary

dict is likely the most important built-in Python data structure. In other languages the same structure is called a hash map or associative array. It is a flexibly sized collection of key-value pairs, where keys must be hashable (e.g., str or int) and values can be any Python object.

  • One approach for creating a dict is to use curly braces {} and colons to separate keys and values:
empty_dict = {}
d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4]}
d1
{'a': 'some value', 'b': [1, 2, 3, 4]}
  • Access:
d1["b"]
[1, 2, 3, 4]
  • Insert / update:
d1["dept"] = "ISRT"
d1["b"] = list(range(3))
d1
{'a': 'some value', 'b': [0, 1, 2], 'dept': 'ISRT'}
  • Membership checks keys:
"a" in d1, "z" in d1
(True, False)
  • d1["missing"] raises a KeyError if the key "missing" does not exist. d1.get("missing") avoids the error and returns a fallback instead (None by default, or the value you provide).
d1.get("b")               # [0, 1, 2] (key exists)
d1.get("age")             # None (key missing)
d1.get("age", "N/A")      # 'N/A' (key missing, fallback given)
'N/A'
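
A typical use of get with a fallback is counting occurrences without first checking whether a key exists. A minimal sketch with a made-up word list:

```python
words = ["a", "b", "a", "c", "a", "b"]

counts = {}
for w in words:
    # Missing keys start from 0 instead of raising KeyError.
    counts[w] = counts.get(w, 0) + 1

print(counts)  # {'a': 3, 'b': 2, 'c': 1}
```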
  • Delete operations:
d1["dummy"] = "temp"
val = d1.pop("dummy")     # returns value and removes key
val, d1
('temp', {'a': 'some value', 'b': [0, 1, 2], 'dept': 'ISRT'})
d1["x"] = 1
del d1["x"]               # delete by key (raises KeyError if absent)
d1
{'a': 'some value', 'b': [0, 1, 2], 'dept': 'ISRT'}
  • Views (dynamic, reflect changes): keys(), values(), items():
list(d1.keys()), list(d1.values()), list(d1.items())
(['a', 'b', 'dept'],
 ['some value', [0, 1, 2], 'ISRT'],
 [('a', 'some value'), ('b', [0, 1, 2]), ('dept', 'ISRT')])
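
"Dynamic" means a view keeps tracking the dict after it is created, while a list made from the view is a frozen snapshot. The sketch below uses a hypothetical inventory dict to show both, along with the usual way to loop over items().

```python
inventory = {"pen": 3}
keys_view = inventory.keys()        # a live view, not a copy
snapshot = list(inventory.keys())   # a list copy, frozen at this moment

inventory["book"] = 1               # change the dict afterwards

print(list(keys_view))  # ['pen', 'book']
print(snapshot)         # ['pen']

lines = [f"{item}: {qty}" for item, qty in inventory.items()]
print(lines)            # ['pen: 3', 'book: 1']
```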
  • Bulk update / merging:
d2 = {"b": 999, "new": 1}
d1.update(d2)   # overwrites existing keys
d1
{'a': 'some value', 'b': 999, 'dept': 'ISRT', 'new': 1}
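
update changes the dict in place. When the originals should stay untouched, one option is to build a new dict by unpacking both; on Python 3.9 and later, defaults | overrides gives the same result. The dict names and contents below are illustrative.

```python
defaults = {"dept": "ISRT", "level": 1}
overrides = {"level": 2, "new": True}

# Later entries win when a key appears in both dicts.
merged = {**defaults, **overrides}

print(merged)    # {'dept': 'ISRT', 'level': 2, 'new': True}
print(defaults)  # {'dept': 'ISRT', 'level': 1}
```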

5. Set

A set is an unordered collection of unique, hashable elements, well suited to membership tests and set algebra.

set([2, 2, 2, 1, 3, 3]), {2, 2, 2, 1, 3, 3}   # literal deduplicates too
({1, 2, 3}, {1, 2, 3})
  • Core operations:
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}
a | b, a & b, a - b, a ^ b   # union, intersection, difference, symmetric diff
({1, 2, 3, 4, 5, 6, 7, 8}, {3, 4, 5}, {1, 2}, {1, 2, 6, 7, 8})
  • Mutating counterparts:
s = {1, 2, 3}
s.add(4)           # add single element
s.update([3, 5])   # add many
s
{1, 2, 3, 4, 5}
s.remove(3)        # remove; raises KeyError if absent
s.discard(3)       # remove; does nothing if absent
s
{1, 2, 4, 5}
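
Two practical notes, sketched with made-up data: converting to a set removes duplicates but loses the original order (dict.fromkeys is one way to deduplicate while keeping first-seen order), and remove raises KeyError on a missing element where discard stays silent.

```python
visits = ["a", "b", "a", "c", "b"]

unique = set(visits)                          # order not guaranteed
ordered_unique = list(dict.fromkeys(visits))  # keeps first-seen order
print(ordered_unique)                         # ['a', 'b', 'c']

s = {1, 2}
try:
    s.remove(99)
    remove_failed = False
except KeyError:
    remove_failed = True

s.discard(99)                                 # no error, no change
print(remove_failed, s)                       # True {1, 2}
```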

Python set operations (summary)

Function                          Operator  Description
a.add(x)                          N/A       Add element x to a
a.clear()                         N/A       Remove all elements
a.remove(x)                       N/A       Remove x (error if absent)
a.discard(x)                      N/A       Remove x (no error if absent)
a.pop()                           N/A       Remove & return an arbitrary element
a.union(b)                        a | b     All elements in a or b
a.update(b)                       a |= b    In-place union
a.intersection(b)                 a & b     Elements in both
a.intersection_update(b)          a &= b    In-place intersection
a.difference(b)                   a - b     Elements in a not in b
a.difference_update(b)            a -= b    In-place difference
a.symmetric_difference(b)         a ^ b     In a or b but not both
a.symmetric_difference_update(b)  a ^= b    In-place symmetric diff
a.issubset(b)                     a <= b    All elems of a in b
a.issuperset(b)                   a >= b    All elems of b in a
a.isdisjoint(b)                   N/A       No elements in common
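
A short sketch exercising a few rows of the table. One detail worth knowing: the method forms accept any iterable, whereas the operator forms require both operands to be sets. The values are illustrative.

```python
small = {1, 2, 3}
large = {1, 2, 3, 4, 5}
is_subset = small <= large             # same as small.issubset(large)
no_overlap = small.isdisjoint({9})     # True: nothing in common

c = {1, 2, 3, 4}
c &= {3, 4, 5}                         # in-place intersection -> {3, 4}
c |= {7}                               # in-place union        -> {3, 4, 7}
c ^= {7, 8}                            # in-place symmetric diff -> {3, 4, 8}

from_list = {1, 2}.union([2, 3])       # method accepts a list
try:
    {1, 2} | [2, 3]                    # operator does not
    operator_ok = True
except TypeError:
    operator_ok = False

print(is_subset, no_overlap, c)        # True True {3, 4, 8}
print(from_list, operator_ok)          # {1, 2, 3} False
```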

Summary

  • Names reference objects; immutables are never modified in place.
  • Choose list for ordered mutable sequences; tuple for fixed records; range for integer sequences without materializing elements.
  • Use dict for labeled data (keys → values) with insertion order preserved.
  • Use set for uniqueness and fast membership & algebra.
  • Understand mutability, slicing semantics, copying, and the differences between append/extend, remove/discard.

Help

In Python, different data types (list, tuple, range, set, dict) support different sets of methods. To find out which ones are available and how to use them:

  1. dir(type) → lists all attributes (including methods) of that type.
dir(list)     # attributes and methods of list
dir(tuple)    # attributes and methods of tuple
  • This prints a long list of names.
  • Focus on the ones without double underscores (e.g., append, remove, add, keys).
  • The double-underscore methods (like __len__, __add__) are special methods that Python calls on your behalf (for example, len(x) uses __len__); you rarely call them directly.

  2. help(type.method) → shows detailed documentation of a specific method.
help(list.append)     # details about append()
help(dict.update)     # details about update()
  • Use this to understand the purpose and parameters of a method.
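
To skip the double-underscore names automatically, the output of dir can be filtered. This is a small sketch; public_list_methods and first_line are names invented for the example, and the docstring text itself may vary between Python versions.

```python
# Keep only the "public" names, i.e. those not starting with an underscore.
public_list_methods = [name for name in dir(list) if not name.startswith("_")]
print(public_list_methods)

# help() prints a method's docstring; the same text is available as __doc__.
first_line = list.append.__doc__.splitlines()[0]
print(first_line)
```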