2 Collections

Data Types for Collections (built-ins)

Beyond scalars, Python provides container types to hold multiple values.
Core built-in collections you’ll use daily:
- Sequences
  - list (mutable) — ordered, indexable: x = [1, 2, 3]
  - tuple (immutable) — ordered, indexable: t = (1, 2, 3)
  - range (immutable sequence, computed on demand): r = range(1, 11)
- dict (mutable) — mapping of keys → values: d = {"name": "Rasel", "dept": "ISRT"}
- set (mutable, unique items) — membership & set algebra: s = {1, 2, 3}

Mutability & order

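list, tuple, range, and str are ordered sequences; dict preserves insertion order (Python 3.7+); set is unordered.
list, dict, and set are mutable.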
tuple, range, str are immutable (the container cannot change size/content in place).
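A minimal sketch of what "immutable" means in practice. The variable names are illustrative only; the point is that item assignment works on a list but raises TypeError on a tuple or a str.

```python
# Illustrative names; any list / tuple / str behaves the same way.
nums_list = [1, 2, 3]
nums_list[0] = 99                 # lists allow in-place item assignment

rejected = []                     # types that refused the assignment
for immutable in ((1, 2, 3), "abc"):
    try:
        immutable[0] = 99         # tuples and strings do not allow it
    except TypeError:
        rejected.append(type(immutable).__name__)

print(nums_list)   # [99, 2, 3]
print(rejected)    # ['tuple', 'str']
```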

1. List

A list stores an ordered sequence of items. Create with square brackets:

li1 = [47, 52, 71, 90]
li1, type(li1), len(li1)

([47, 52, 71, 90], list, 4)

Lists can contain mixed types (useful but use judiciously):

li2 = [1, "a", [3, 4]]
li2, len(li2)

([1, 'a', [3, 4]], 3)

Membership test:

47 in li1, 99 in li1

(True, False)

Methods

Most of the functions we use with lists are methods.
The distinction between a method and other kinds of functions is subtle; we’ll cover it in detail later.
For now, note:
1. A method is a function attached to a specific object/type (e.g., int, float, str, list).
2. You call a method on an object with dot notation: object.method(...).
Many list methods mutate the object (change it in place) and return None rather than a new list.
Example methods you’ll meet right away: append, extend, insert, pop, sort, clear, copy.
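A common beginner slip is to assign the result of an in-place method back to the name. The sketch below (with a made-up list called scores) shows why that loses the data, and that pop is one of the methods that both mutates and returns something.

```python
scores = [3, 1, 2]

result = scores.append(5)   # append changes scores in place...
print(result)               # None  (...and the call itself returns None)

result = scores.sort()      # same for sort
print(result)               # None
print(scores)               # [1, 2, 3, 5]

last = scores.pop()         # pop mutates AND returns the removed item
print(last, scores)         # 5 [1, 2, 3]
```

So write scores.sort() on its own line, not scores = scores.sort().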

Adding & removing elements

li1.append(24)       # add a single item to the end
li1

[47, 52, 71, 90, 24]

li1.insert(2, 100)    # insert at index 2
li1

[47, 52, 100, 71, 90, 24]

li1.pop(1)            # remove-and-return item at index 1
li1

[47, 100, 71, 90, 24]

append(x) adds one element; extend(iterable) adds many elements:

x = [4, None, "foo"]
x.extend([7, 8, (2, 3)])   # concatenates elements from the iterable
x

[4, None, 'foo', 7, 8, (2, 3)]

Concatenation creates a new list:

[4, None, "foo"] + [7, 8, (2, 3)]

[4, None, 'foo', 7, 8, (2, 3)]
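To see the difference between the two approaches, one option is to compare object identities before and after. This is only a sketch with illustrative names: + builds a new list object and leaves the operands alone, while extend keeps the same object and grows it.

```python
base = [4, None, "foo"]

combined = base + [7, 8]          # + builds a brand-new list
print(combined is base)           # False
print(base)                       # [4, None, 'foo']  (unchanged)

identity_before = id(base)
base.extend([7, 8])               # extend grows the existing list
print(id(base) == identity_before)  # True: still the same object
print(base)                       # [4, None, 'foo', 7, 8]
```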

Sorting

In-place sort (stable) with optional reverse= and key=:

a = [1, 2, 3, 31, 32, 33, 11, 12, 13]
a.sort()
a

[1, 2, 3, 11, 12, 13, 31, 32, 33]

a.sort(reverse=True)
a

[33, 32, 31, 13, 12, 11, 3, 2, 1]

names = ["carol", "alice", "Bob"]
names.sort(key=str.lower)  # case-insensitive sort
names

['alice', 'Bob', 'carol']

To get a sorted copy without modifying the original:

b = [3, 1, 2]
sorted_b = sorted(b, reverse=True)
b, sorted_b

([3, 1, 2], [3, 2, 1])
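"Stable" means that items comparing equal under the key keep their original relative order. A small sketch with hypothetical (name, group) records makes this visible; the same holds for list.sort.

```python
# Hypothetical records: (name, group)
records = [("carol", 2), ("alice", 1), ("bob", 2), ("dave", 1)]

# Sort by group only; names are not part of the key.
by_group = sorted(records, key=lambda rec: rec[1])
print(by_group)
# [('alice', 1), ('dave', 1), ('carol', 2), ('bob', 2)]
# Within each group the original order survives:
# alice was before dave, carol was before bob.
```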

Slicing

Basic slice: seq[start:stop] returns a new list; start inclusive, stop exclusive.

seq = [1, 2, 3, 41, 42, 43, 11, 12, 13]
seq[3:4], seq[3:5]

([41], [41, 42])

Omit start or stop to use defaults:

seq[:5], seq[5:]

([1, 2, 3, 41, 42], [43, 11, 12, 13])

Negative indices count from the end:

seq[-4:], seq[-6:-2]

([43, 11, 12, 13], [41, 42, 43, 11])

Assigning to a slice replaces that region (the replacement may have a different length):

seq[3:5] = [6, 3]
seq

[1, 2, 3, 6, 3, 43, 11, 12, 13]
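The replacement in slice assignment does not have to match the length of the region. A short sketch with an illustrative list of letters shows the list shrinking, growing, and losing a region entirely:

```python
letters = ["a", "b", "c", "d", "e"]

letters[1:3] = ["X"]              # replace 2 items with 1: list shrinks
print(letters)                    # ['a', 'X', 'd', 'e']

letters[1:2] = ["p", "q", "r"]    # replace 1 item with 3: list grows
print(letters)                    # ['a', 'p', 'q', 'r', 'd', 'e']

letters[1:4] = []                 # an empty replacement deletes the region
print(letters)                    # ['a', 'd', 'e']
```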

Step (stride) with seq[start:stop:step] (can be negative):

seq[::2], seq[::-1]   # every 2nd; reversed copy

([1, 3, 3, 11, 13], [13, 12, 11, 43, 3, 6, 3, 2, 1])

Slicing semantics take a bit of getting used to, especially for users coming from R.

Names, objects & mutability

Names point to objects. Rebinding a name does not change other names.

a = 1947
b = a     # both refer to the same int object (1947)
a = 1971  # a now refers to a new int object; b still 1947
print(a, b)

1971 1947

Mutable objects (like lists) can change in place:

orig = [1, 2, 3]
alias = orig          # same object
orig.append(99)
alias, orig

([1, 2, 3, 99], [1, 2, 3, 99])

Key Idea

Scalars (int, float, bool, str) are immutable; when you “change” them, you are actually re-binding the name to a new object.
Containers like list are often mutable; mutation changes the object itself without re-binding the name.

Copying lists

Sometimes you need a separate list object so later mutations don’t affect the original.

orig = [1, 2, 3]
alias = orig          # same object
shallow = orig[:]     # new list (shallow copy) — or: orig.copy(), list(orig)
orig.append(99)
alias, shallow

([1, 2, 3, 99], [1, 2, 3])
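"Shallow" matters once the list contains other mutable objects: the copy is a new outer list, but its elements are the same inner objects. The sketch below uses the standard-library copy.deepcopy for a fully independent copy; the nested list grid is made up for illustration.

```python
import copy

grid = [[1, 2], [3, 4]]
shallow = grid.copy()          # new outer list, shared inner lists
deep = copy.deepcopy(grid)     # new outer list AND new inner lists

grid[0].append(99)             # mutate an inner list
grid.append([5, 6])            # mutate the outer list

print(shallow)   # [[1, 2, 99], [3, 4]]  inner change is visible, outer is not
print(deep)      # [[1, 2], [3, 4]]      fully independent
```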

2. Tuple

A tuple is an immutable, ordered sequence.

tup1 = (4, 5, 6)
tup2 = 4, 5, 6          # parentheses optional in many contexts
tup1, tup2

((4, 5, 6), (4, 5, 6))

Nested tuples are fine:

nested_tup = (4, 5, 6), (7, 8)
nested_tup

((4, 5, 6), (7, 8))

Convert any iterable to a tuple:

tuple([4, 0, 2]), tuple("string")

((4, 0, 2), ('s', 't', 'r', 'i', 'n', 'g'))

Index like lists:

tup = ("pre", "post", 3)
tup[0], tup[-1]

('pre', 3)

Concatenate / repeat:

(4, None, "foo") + (6, 0) + ("bar",), ("pre", "post") * 3

((4, None, 'foo', 6, 0, 'bar'), ('pre', 'post', 'pre', 'post', 'pre', 'post'))

Immutability nuance

Tuples are immutable, but they can contain mutable objects:

t = (1, [2, 3])
t[1].append(4)   # allowed: we mutate the list inside the tuple
t

(1, [2, 3, 4])

What’s immutable is the container (which objects are stored), not necessarily the contents.
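Put differently, a tuple fixes which objects sit in its slots. A small sketch of both sides: rebinding a slot fails, while mutating the object in the slot leaves the tuple pointing at the very same list.

```python
t = (1, [2, 3])
inner_before = t[1]

try:
    t[1] = [9, 9]               # rebinding a slot is what immutability forbids
    rebinding_allowed = True
except TypeError:
    rebinding_allowed = False

t[1].append(4)                  # mutating the object in the slot is fine

print(rebinding_allowed)        # False
print(t[1] is inner_before)     # True: still the same list object
print(t)                        # (1, [2, 3, 4])
```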

Tuple unpacking (including starred)

x, y = (10, 20)
x, y

(10, 20)

a, *b, c = (1, 2, 3, 4, 5)
a, b, c

(1, [2, 3, 4], 5)
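Unpacking shows up in everyday idioms, not just assignments from literal tuples. Two typical uses, sketched with made-up data: swapping two names without a temporary, and unpacking each pair while looping.

```python
# Swap without a temporary variable
low, high = 10, 3
low, high = high, low
print(low, high)                 # 3 10

# Unpack each (name, value) pair inside a for loop
pairs = [("x", 1), ("y", 2)]
labels = []
for name, value in pairs:
    labels.append(f"{name}={value}")
print(labels)                    # ['x=1', 'y=2']

# Starred unpacking works on any iterable, not only tuples
first, *rest = [10, 20, 30]
print(first, rest)               # 10 [20, 30]
```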

3. Range

range is an immutable sequence of evenly spaced integers.
General form: range(start, stop, step)
- start → first value (default = 0)
- stop → sequence goes up to but not including this value
- step → increment (default = 1)
Internally, it only stores start, stop, and step, and computes elements on demand.
Supports efficient operations: len(), indexing, slicing, and membership testing.

r = range(10)            # 0...9
len(r), r[0], r[-1]

(10, 0, 9)

list(range(0, 20, 2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

list(range(10, 0, -3))

[10, 7, 4, 1]

Helpful properties:

5 in range(10), 25 in range(10)   # fast membership check

(True, False)

range(0, 10, 2)[2:5]              # slicing a range yields another range

range(4, 10, 2)
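Because a range keeps only start, stop, and step, even a very long range is cheap to create and query. The sketch below inspects those three attributes and runs a few operations on a range with hundreds of millions of elements without building a list. The sys.getsizeof comparison reflects CPython and should be read as an implementation detail, not a language guarantee.

```python
import sys

small = range(10)
huge = range(0, 10**9, 3)               # never materialized as a list

print(huge.start, huge.stop, huge.step)  # 0 1000000000 3
print(len(huge))                         # 333333334
print(huge[-1])                          # 999999999
print(999_999_999 in huge)               # True, computed arithmetically

# In CPython both objects occupy the same small amount of memory.
print(sys.getsizeof(small), sys.getsizeof(huge))
```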

4. Dictionary

dict is likely the most important built-in Python data structure. In other programming languages it is often called a hash map or associative array. It is a flexibly sized collection of key-value pairs, where keys and values are Python objects (keys must be hashable, e.g., str or int).

One approach for creating a dict is to use curly braces {} and colons to separate keys and values:

empty_dict = {}
d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4]}
d1

{'a': 'some value', 'b': [1, 2, 3, 4]}

Access:

d1["b"]

[1, 2, 3, 4]

Insert / update:

d1["dept"] = "ISRT"
d1["b"] = list(range(3))
d1

{'a': 'some value', 'b': [0, 1, 2], 'dept': 'ISRT'}

Membership checks keys:

"a" in d1, "z" in d1

(True, False)

d1["missing"] raises a KeyError if the key "missing" doesn’t exist. d1.get("missing") avoids the error and returns a fallback instead (None, or the default you provide).

d1.get("b")               # [0, 1, 2] (key exists)
d1.get("age")             # None (key missing)
d1.get("age", "N/A")      # N/A (key missing, fallback given)

'N/A'
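A typical use of get with a default is counting: the first time a key is seen, the default 0 stands in for the missing value. A minimal sketch with an illustrative word list:

```python
words = ["red", "blue", "red", "green", "blue", "red"]

counts = {}
for word in words:
    # get returns 0 for a word not seen yet, so no KeyError is raised
    counts[word] = counts.get(word, 0) + 1

print(counts)   # {'red': 3, 'blue': 2, 'green': 1}
```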

Delete operations:

d1["dummy"] = "temp"
val = d1.pop("dummy")     # returns value and removes key
val, d1

('temp', {'a': 'some value', 'b': [0, 1, 2], 'dept': 'ISRT'})

d1["x"] = 1
del d1["x"]               # delete by key (raises KeyError if absent)
d1

{'a': 'some value', 'b': [0, 1, 2], 'dept': 'ISRT'}

keys(), values(), and items() return dynamic views that reflect later changes to the dict:

list(d1.keys()), list(d1.values()), list(d1.items())

(['a', 'b', 'dept'],
 ['some value', [0, 1, 2], 'ISRT'],
 [('a', 'some value'), ('b', [0, 1, 2]), ('dept', 'ISRT')])
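"Dynamic" means a view is not a snapshot. In this sketch (with a made-up inventory dict), the view picks up a key added after it was created, while a list made from the view earlier does not.

```python
inventory = {"apples": 3}

keys_view = inventory.keys()     # live view onto the dict
snapshot = list(keys_view)       # independent copy taken right now

inventory["pears"] = 5           # change the dict afterwards

print(list(keys_view))   # ['apples', 'pears']  view reflects the change
print(snapshot)          # ['apples']           list does not
```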

Bulk update / merging:

d2 = {"b": 999, "new": 1}
d1.update(d2)   # overwrites existing keys
d1

{'a': 'some value', 'b': 999, 'dept': 'ISRT', 'new': 1}
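update changes the dict in place. When both inputs should stay untouched, one option is to build a new dict by unpacking both; later keys win, just as with update. The dict names here are illustrative.

```python
defaults = {"color": "blue", "size": "M"}
overrides = {"size": "L", "price": 10}

# Build a new dict; keys from overrides replace those from defaults.
merged = {**defaults, **overrides}

print(merged)     # {'color': 'blue', 'size': 'L', 'price': 10}
print(defaults)   # {'color': 'blue', 'size': 'M'}  (unchanged)

# On Python 3.9+, defaults | overrides produces the same merged dict.
```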

5. Set

A set stores unique elements (unordered). Great for membership tests and set algebra.

set([2, 2, 2, 1, 3, 3]), {2, 2, 2, 1, 3, 3}   # literal deduplicates too

({1, 2, 3}, {1, 2, 3})
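Deduplicating through a set discards the original order. If order matters, one common alternative (not a set feature, but a dict one) is dict.fromkeys, which relies on dicts preserving insertion order. A sketch with a made-up list of city names:

```python
visits = ["dhaka", "khulna", "dhaka", "sylhet", "khulna"]

unique_unordered = set(visits)                # unique, but order is not guaranteed
unique_ordered = list(dict.fromkeys(visits))  # unique, first-seen order kept

print(len(unique_unordered))   # 3
print(unique_ordered)          # ['dhaka', 'khulna', 'sylhet']
```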

Core operations:

a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}
a | b, a & b, a - b, a ^ b   # union, intersection, difference, symmetric diff

({1, 2, 3, 4, 5, 6, 7, 8}, {3, 4, 5}, {1, 2}, {1, 2, 6, 7, 8})

Mutating counterparts:

s = {1, 2, 3}
s.add(4)           # add single element
s.update([3, 5])   # add many
s

{1, 2, 3, 4, 5}

s.remove(3)        # remove; raises KeyError if absent
s.discard(3)       # remove; does nothing if absent
s

{1, 2, 4, 5}

Python set operations (summary)

Function	Operator	Description
`a.add(x)`	—	Add element `x` to `a`
`a.clear()`	—	Remove all elements
`a.remove(x)`	—	Remove `x` (error if absent)
`a.discard(x)`	—	Remove `x` (no error if absent)
`a.pop()`	—	Remove & return an arbitrary element
`a.union(b)`	`a | b`	All elements in `a` or `b`
`a.update(b)`	`a |= b`	In-place union
`a.intersection(b)`	`a & b`	Elements in both
`a.intersection_update(b)`	`a &= b`	In-place intersection
`a.difference(b)`	`a - b`	Elements in `a` not in `b`
`a.difference_update(b)`	`a -= b`	In-place difference
`a.symmetric_difference(b)`	`a ^ b`	In `a` or `b` but not both
`a.symmetric_difference_update(b)`	`a ^= b`	In-place symmetric diff
`a.issubset(b)`	`a <= b`	All elems of `a` in `b`
`a.issuperset(b)`	`a >= b`	All elems of `b` in `a`
`a.isdisjoint(b)`	—	No elements in common
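A few rows of the table in action, using hypothetical column names. One practical difference worth knowing: the named methods accept any iterable, while the operators require both operands to be sets.

```python
required = {"id", "name"}
columns = {"id", "name", "dept"}

print(required <= columns)                  # True: every required column is present
print(columns >= required)                  # True: same test from the other side
print(required.isdisjoint({"age"}))         # True: nothing in common

# Methods accept any iterable (here a list)...
print(required.union(["dept"]) == columns)  # True

# ...but the operator form insists on a set.
try:
    required | ["dept"]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)                # False
```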

Summary

Names reference objects; immutables are never modified in place.
Choose list for ordered mutable sequences; tuple for fixed records; range for integer sequences without materializing elements.
Use dict for labeled data (keys → values); insertion order is preserved (Python 3.7+).
Use set for uniqueness and fast membership & algebra.
Understand mutability, slicing semantics, copying, and the differences between append/extend, remove/discard.

Help

In Python, different data types (list, tuple, range, set, dict) support different sets of methods. To find out which ones are available and how to use them:

dir(type) → lists all attributes (including methods) of that type.

dir(list)     # attributes and methods of list
dir(tuple)    # attributes and methods of tuple

This prints a long list of names.
Focus on the ones without double underscores (e.g., append, remove, add, keys).
The double-underscore methods (like __len__, __add__) are special methods, usually ignored in day-to-day use.
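To skip the special methods automatically, one option is to filter the output of dir by the leading underscore. This is just a convenience sketch; the variable name is illustrative.

```python
# Keep only the names that do not start with an underscore.
public_list_methods = [name for name in dir(list) if not name.startswith("_")]

print(public_list_methods)
# ['append', 'clear', 'copy', 'count', 'extend', 'index',
#  'insert', 'pop', 'remove', 'reverse', 'sort']
```

The same filter works for dir(dict), dir(set), and so on.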

help(type.method) → shows detailed documentation of a specific method.

help(list.append)     # details about append()
help(dict.update)     # details about update()

Use this to read a method's documentation: its purpose, its parameters, and what it returns.