8 NumPy 1D-Arrays

(CSE331) Python for Data Science

Author: Md Rasel Biswas

Affiliation: ISRT, University of Dhaka

NumPy (short for Numerical Python) is an essential package for performing scientific and numerical computing in Python. It provides powerful tools for working with arrays and matrices, enabling fast mathematical operations on large datasets.

Nearly every major Python library used in data science, such as pandas, scikit-learn, and TensorFlow, is either built on top of NumPy or designed to work seamlessly with it.

NumPy is not part of the Python Standard Library, but it comes pre-installed with the Anaconda distribution.

By convention, NumPy is imported under the alias np:

import numpy as np

The main tool provided by NumPy is the n-dimensional array (or ndarray).
An array is a collection of elements (usually numbers) arranged in an ordered, structured way.
Arrays can be one-dimensional (like a list), two-dimensional (like a matrix), or higher-dimensional.

1 One-Dimensional Arrays

Let’s start with simple 1D arrays. A NumPy array is similar to a standard Python list, but for numerical data it is typically faster and more memory-efficient, and it supports vectorized operations.

A NumPy array can be created from a Python list using the np.array() function:

my_list = [4, 1, 7, 3, 5]
my_array = np.array(my_list)

We can check their data types:

print('Type of my_list :', type(my_list))
print('Type of my_array:', type(my_array))

Type of my_list : <class 'list'>
Type of my_array: <class 'numpy.ndarray'>

The type of my_array is numpy.ndarray, which stands for n-dimensional array.

Let’s print both objects:

print(my_list)
print(my_array)

[4, 1, 7, 3, 5]
[4 1 7 3 5]

The output looks similar, except that NumPy arrays are displayed without commas between elements.
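Besides its type, an array carries a few attributes that describe its layout. The sketch below inspects them. The explicit `dtype=np.int64` is an assumption added here so that the byte counts are the same on every platform, since the default integer type can differ between systems.

```python
import numpy as np

# dtype is fixed explicitly (an assumption for this sketch) so the sizes are predictable
my_array = np.array([4, 1, 7, 3, 5], dtype=np.int64)

print(my_array.ndim)      # number of dimensions: 1
print(my_array.shape)     # length along each dimension: (5,)
print(my_array.size)      # total number of elements: 5
print(my_array.dtype)     # element type: int64
print(my_array.itemsize)  # bytes per element: 8
print(my_array.nbytes)    # bytes used by the data: 40
```

Because every element has the same fixed size, the data occupies exactly `size * itemsize` bytes, which is one reason arrays tend to be more compact than lists of Python objects.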

2 Array Indexing and Slicing

Array elements are accessed by index, just as with Python lists. Indexing starts at 0, so index 2 refers to the third element.

print(my_list[2])
print(my_array[2])

7
7

Arrays also support slicing:

print(my_list[:3])
print(my_array[:3])

[4, 1, 7]
[4 1 7]
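The other list-style index forms should work on arrays as well. The following is a small sketch using the same values as `my_array`, covering negative indices and slices with a step:

```python
import numpy as np

my_array = np.array([4, 1, 7, 3, 5])

print(my_array[-1])    # last element: 5
print(my_array[1:4])   # indices 1, 2, 3: [1 7 3]
print(my_array[::2])   # every second element: [4 7 5]
print(my_array[::-1])  # elements in reverse order: [5 3 7 1 4]
```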

Most functions that accept lists also work with arrays. For example:

print(len(my_list))
print(len(my_array))

5
5

We can also use the built-in sum() function (though NumPy provides its own optimized version, which we’ll see later):

print(sum(my_list))
print(sum(my_array))

20
20

3 Array Operations

So far, arrays and lists seem quite similar. The key difference lies in how they handle arithmetic operations.

NumPy arrays allow vectorized or elementwise operations, meaning operations are applied automatically to each element without writing loops.

For example, to multiply each element of a list by 5:

# Using a for-loop
new_list = []
for item in my_list:
    new_list.append(5 * item)
print(new_list)

[20, 5, 35, 15, 25]

This can be shortened using a list comprehension:

new_list = [5 * x for x in my_list]
print(new_list)

[20, 5, 35, 15, 25]

However, with NumPy arrays, this is even simpler:

new_array = 5 * my_array
print(new_array)

[20  5 35 15 25]

This operation is elementwise, meaning each element in my_array is multiplied by 5.
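To get a rough feel for the speed difference, the sketch below times both approaches on a larger input. The array size and the number of repetitions are arbitrary choices made for this illustration, and the measured times will vary from machine to machine, so no specific timings are shown.

```python
import timeit
import numpy as np

n = 100_000  # arbitrary size chosen for this sketch
big_list = list(range(n))
big_array = np.arange(n)

# The same computation written two ways
loop_result = [5 * x for x in big_list]
vector_result = 5 * big_array

# Both approaches produce the same values
print(np.array_equal(loop_result, vector_result))  # True

# Time 10 repetitions of each approach
loop_time = timeit.timeit(lambda: [5 * x for x in big_list], number=10)
vector_time = timeit.timeit(lambda: 5 * big_array, number=10)
print(f'list comprehension: {loop_time:.4f} s')
print(f'NumPy array:        {vector_time:.4f} s')
```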

Note

If you multiply a list by an integer, it repeats the list rather than multiplying its elements.

print(5 * my_list)

[4, 1, 7, 3, 5, 4, 1, 7, 3, 5, 4, 1, 7, 3, 5, 4, 1, 7, 3, 5, 4, 1, 7, 3, 5]

We can perform many other arithmetic operations directly on arrays:

print(my_array ** 2)    # square each element
print(my_array + 100)   # add 100 to each element

[16  1 49  9 25]
[104 101 107 103 105]

Operations Involving Two Arrays

Arrays of the same shape can be added, subtracted, multiplied, or divided elementwise:

array1 = np.array([1, 4, 3])
array2 = np.array([5, 8, 2])

print('Sum:       ', array1 + array2)
print('Difference:', array1 - array2)
print('Product:   ', array1 * array2)
print('Quotient:  ', array1 / array2)

Sum:        [ 6 12  5]
Difference: [-4 -4  1]
Product:    [ 5 32  6]
Quotient:   [0.2 0.5 1.5]

If two 1D arrays have different lengths (and neither has length 1), NumPy cannot match their elements and raises a ValueError:

array1 = np.array([2, 1, 4])
array2 = np.array([3, 9, 2, 7])

print(array1 + array2)  # Error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[15], line 4
      1 array1 = np.array([2, 1, 4])
      2 array2 = np.array([3, 9, 2, 7])
----> 4 print(array1 + array2)

ValueError: operands could not be broadcast together with shapes (3,) (4,)
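In a script, it may be preferable to detect the mismatch rather than let the program stop. The sketch below shows two ways to do that, followed by one possible workaround. Trimming both arrays to their common length is only an assumption about what might be wanted; whether it is appropriate depends on the data.

```python
import numpy as np

array1 = np.array([2, 1, 4])
array2 = np.array([3, 9, 2, 7])

# Option 1: compare shapes before attempting the addition
if array1.shape == array2.shape:
    total = array1 + array2
else:
    total = None
    print('Shapes differ:', array1.shape, 'vs', array2.shape)

# Option 2: attempt the addition and catch the error NumPy raises
try:
    array1 + array2
except ValueError as err:
    print('ValueError:', err)

# One possible workaround (an assumption): add only the first n elements of each
n = min(len(array1), len(array2))
trimmed_sum = array1[:n] + array2[:n]
print(trimmed_sum)  # [ 5 10  6]
```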

4 Data Types of Array Elements

Unlike Python lists, all elements in a NumPy array must be of the same data type.

int_array = np.array([8, 4, 5, 2, 4, 6, 3])
print(int_array)

[8 4 5 2 4 6 3]

If we insert a value of a different type, NumPy tries to coerce it into the array’s existing type:

int_array[2] = 7.9
print(int_array)

[8 4 7 2 4 6 3]

Here, 7.9 is converted (truncated) to 7.

If the conversion is not possible (e.g., inserting a non-numeric string such as 'abc' into an integer array), NumPy raises an error.
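The failing case can be sketched as follows. The example also shows what happens when mixed numeric types are supplied at creation time: instead of raising an error, NumPy chooses a single type that can hold every element.

```python
import numpy as np

int_array = np.array([8, 4, 5])

# A non-numeric string cannot be converted to an integer
try:
    int_array[1] = 'abc'
except ValueError as err:
    print('Conversion failed:', err)

print(int_array)  # the array is unchanged: [8 4 5]

# At creation time, NumPy picks one type that fits all elements
mixed = np.array([1, 2.5, 3])
print(mixed)        # [1.  2.5 3. ]
print(mixed.dtype)  # float64
```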

To explicitly change the data type of an entire array, use the .astype() method:

float_array = int_array.astype('float')
print(float_array)

float_array[2] = 7.9
print(float_array)

[8. 4. 7. 2. 4. 6. 3.]
[8.  4.  7.9 2.  4.  6.  3. ]
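Converting in the other direction, from float to integer, discards the fractional part, just like the assignment of 7.9 earlier. The sketch below contrasts plain casting with rounding before the cast; the sample values are chosen here only for illustration.

```python
import numpy as np

float_array = np.array([7.9, -7.9, 2.5, 3.5])

# Casting to int drops the fractional part (truncation toward zero)
truncated = float_array.astype(int)
print(truncated)  # [ 7 -7  2  3]

# Rounding first gives the nearest integer; exact halves go to the even neighbour
rounded = np.round(float_array).astype(int)
print(rounded)    # [ 8 -8  2  4]
```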

5 Creating Special Arrays

NumPy provides several convenient functions for creating arrays with a regular pattern:

np.zeros() creates an array of zeros.

array0 = np.zeros(10)
print(array0)

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

np.ones() creates an array of ones.

array1 = np.ones(10)
print(array1)

[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

np.arange() creates a sequence of evenly spaced values. This is similar to Python’s range(), but it returns an array and also accepts non-integer step sizes.

array2 = np.arange(start=2, stop=4, step=0.25)
print(array2)

[2.   2.25 2.5  2.75 3.   3.25 3.5  3.75]

Note: The stop value is not included, just like in range().

np.linspace() creates an evenly spaced sequence based on the number of elements instead of step size.

array3 = np.linspace(start=2, stop=4, num=11)
print(array3)

[2.  2.2 2.4 2.6 2.8 3.  3.2 3.4 3.6 3.8 4. ]

Unlike np.arange(), np.linspace() includes the stop value by default.
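A few variations on these functions are sketched below. The `endpoint` and `dtype` parameters are not used elsewhere in these notes; they are shown here as optional extras.

```python
import numpy as np

# With a single argument, np.arange() counts from 0, like range()
print(np.arange(5))  # [0 1 2 3 4]

# endpoint=False makes np.linspace() exclude the stop value,
# so 8 points on [2, 4) match a step of 0.25
a = np.arange(start=2, stop=4, step=0.25)
b = np.linspace(start=2, stop=4, num=8, endpoint=False)
print(np.allclose(a, b))  # True

# np.zeros() and np.ones() return floats unless a dtype is given
int_zeros = np.zeros(3, dtype=int)
print(int_zeros)  # [0 0 0]
```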

6 Array Functions

NumPy provides many built-in functions that operate on arrays to quickly compute descriptive statistics or perform elementwise transformations. These functions are optimized for speed and can process large arrays efficiently without explicit loops.

6.1 Common Aggregate Functions

The following functions return summary statistics from arrays:

Function	Description
`np.sum()`	Returns the sum of all elements in an array.
`np.prod()`	Returns the product of all elements in an array.
`np.max()`	Returns the largest element in an array.
`np.min()`	Returns the smallest element in an array.
`np.argmax()`	Returns the index of the largest element (the first one if there are ties).
`np.argmin()`	Returns the index of the smallest element (the first one if there are ties).
`np.mean()`	Returns the mean (average) of the elements.
`np.std()`	Returns the standard deviation of the elements (population formula by default).
`np.unique()`	Returns a sorted array of the distinct (unique) elements.

Let’s see an example:

test_array = np.array([3.2, 4.8, 8.7, 8.7, 6.4, 5.3, 5.3, 
                       1.8, 4.8, 1.8, 5.4, 3.1, 3.2, 1.8])

print('Sum:               ', np.sum(test_array))
print('Product:           ', np.prod(test_array))
print('Max:               ', np.max(test_array))
print('Min:               ', np.min(test_array))
print('ArgMax:            ', np.argmax(test_array))
print('ArgMin:            ', np.argmin(test_array))
print('Mean:              ', np.mean(test_array))
print('Standard Deviation:', np.std(test_array))
print('Distinct Elements: ', np.unique(test_array))

Sum:                64.3
Product:            313419215.1817097
Max:                8.7
Min:                1.8
ArgMax:             2
ArgMin:             7
Mean:               4.5928571428571425
Standard Deviation: 2.2079286627044445
Distinct Elements:  [1.8 3.1 3.2 4.8 5.3 5.4 6.4 8.7]
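Two optional parameters are often useful with these functions. The sketch below uses `ddof=1` to obtain the sample standard deviation and `return_counts=True` to count how often each distinct value occurs; neither parameter appears in the example above.

```python
import numpy as np

test_array = np.array([3.2, 4.8, 8.7, 8.7, 6.4, 5.3, 5.3,
                       1.8, 4.8, 1.8, 5.4, 3.1, 3.2, 1.8])

# np.std() divides by n by default; ddof=1 divides by n - 1 (sample formula)
pop_std = np.std(test_array)
sample_std = np.std(test_array, ddof=1)
print('Population std:', pop_std)
print('Sample std:    ', sample_std)

# return_counts=True also reports how often each distinct value occurs
values, counts = np.unique(test_array, return_counts=True)
print(values)  # [1.8 3.1 3.2 4.8 5.3 5.4 6.4 8.7]
print(counts)  # [3 1 2 2 2 1 1 2]

# Many aggregates are also available as array methods
print(test_array.max() == np.max(test_array))  # True
```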

6.2 Elementwise Functions

NumPy also includes functions that apply an operation to each element individually, returning a new array of the same shape.

Function	Description
`np.exp()`	Raises e to the power of each element in the array.
`np.log()`	Computes the natural logarithm of each element.
`np.round()`	Rounds each element to a specified number of decimal places.

Example:

float_array = np.array([3.451, 1.234, 6.576, 2.475, 7.506])

print('Example of np.exp():  ', np.exp(float_array))
print('Example of np.log():  ', np.log(float_array))
print('Example of np.round():', np.round(float_array, 2))
print('Example of np.round():', np.round(float_array, 0))

Example of np.exp():   [  31.53190846    3.43494186  717.66292857   11.88170711 1818.92327889]
Example of np.log():   [1.23866404 0.21026093 1.88342666 0.9062404  2.0157027 ]
Example of np.round(): [3.45 1.23 6.58 2.48 7.51]
Example of np.round(): [3. 1. 7. 2. 8.]
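As a quick check of how these functions relate, the sketch below confirms that `np.log()` undoes `np.exp()`. It also shows `np.sqrt()` and `np.abs()`, two further elementwise functions that are not listed in the table above but follow the same pattern.

```python
import numpy as np

float_array = np.array([3.451, 1.234, 6.576, 2.475, 7.506])

# np.log() undoes np.exp(), up to floating-point rounding
round_trip = np.log(np.exp(float_array))
print(np.allclose(round_trip, float_array))  # True

# np.sqrt() and np.abs() are applied element by element in the same way
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))
magnitudes = np.abs(np.array([-2, 0, 3]))
print(roots)       # [1. 2. 3.]
print(magnitudes)  # [2 0 3]
```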

7 Array Comparisons

NumPy allows elementwise comparisons between arrays or between an array and a single value. This means that each element of the array is compared individually, and the result is a Boolean array containing True or False for each comparison.

Let’s start with a numeric example:

someArray = np.array([4, 7, 6, 3, 9, 8])

print(someArray < 5)

[ True False False  True False False]

Here, each element of someArray is compared to 5. The result is an array of Boolean values indicating which elements are less than 5.

We can also perform comparisons on arrays containing strings. Below, we count how many times each category (‘A’, ‘B’, or ‘C’) appears in the array.

cat = np.array(['A', 'C', 'A', 'B', 'B', 'C', 'A', 'A',
                'C', 'B', 'C', 'C', 'A', 'B', 'A', 'A'])

print('Count of A:', np.sum(cat == 'A'))
print('Count of B:', np.sum(cat == 'B'))
print('Count of C:', np.sum(cat == 'C'))

Count of A: 7
Count of B: 4
Count of C: 5

Note

np.sum() can be used on Boolean arrays — since True is treated as 1 and False as 0, summing the array effectively counts the number of True values.
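The same idea extends to other summaries of a Boolean array. In the sketch below, `np.count_nonzero()` gives the count directly, `np.mean()` gives the proportion of True values, and `np.any()` checks whether any element matches. The category 'D' is a made-up value used only to show a case with no matches.

```python
import numpy as np

cat = np.array(['A', 'C', 'A', 'B', 'B', 'C', 'A', 'A',
                'C', 'B', 'C', 'C', 'A', 'B', 'A', 'A'])

is_a = cat == 'A'  # Boolean array, True where the category is 'A'

print(np.count_nonzero(is_a))  # number of True values: 7
print(np.mean(is_a))           # proportion of True values: 0.4375
print(np.any(cat == 'D'))      # no element equals 'D': False
```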

The same approach works for numeric conditions. Each count below sums the Boolean array produced by a single comparison:

val = np.array([8, 1, 3, 6, 10, 6, 12, 4,
                6, 1, 4, 8, 5, 4, 12, 4])

print('Number of elements > 6: ', np.sum(val > 6))
print('Number of elements <= 6:', np.sum(val <= 6))
print('Number of even elements:', np.sum(val % 2 == 0))
print('Number of odd elements: ', np.sum(val % 2 != 0))

Number of elements > 6:  5
Number of elements <= 6: 11
Number of even elements: 12
Number of odd elements:  4

Logical Operators for Arrays

NumPy supports bitwise logical operators that perform elementwise Boolean logic:

Operator	Meaning	Example	Description
`&`	Logical AND	`(A > 0) & (A < 10)`	True only if both conditions are True
`|`	Logical OR	`(A < 0) | (A > 10)`	True if either condition is True
`~`	Logical NOT	`~(A > 5)`	Negates the condition

Note

When using these operators, always enclose each condition in parentheses, because &, |, and ~ have higher precedence than comparison operators like < or ==.

Count the even elements greater than 5:

print(np.sum((val % 2 == 0) & (val > 5)))

8

Count the elements that are even, divisible by 3, and greater than 7:

print(np.sum((val % 2 == 0) & (val % 3 == 0) & (val > 7)))

2

Count the elements where the category is 'A' and the corresponding value is greater than 5:
```
print(np.sum((cat == 'A') & (val > 5)))
```
```
3
```
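The remaining operators and the parenthesis rule can be sketched in the same way. `np.logical_and()` is shown as a function-style alternative to `&`; it is not used elsewhere in these notes. The last part deliberately omits the parentheses to show the resulting error.

```python
import numpy as np

val = np.array([8, 1, 3, 6, 10, 6, 12, 4,
                6, 1, 4, 8, 5, 4, 12, 4])

# OR: elements below 4 or above 10
print(np.sum((val < 4) | (val > 10)))  # 5

# NOT: elements that are not greater than 6
print(np.sum(~(val > 6)))  # 11

# np.logical_and() is a function form of &
print(np.sum(np.logical_and(val % 2 == 0, val > 5)))  # 8

# Without parentheses, & is evaluated before == and >, and Python ends up
# asking for the truth value of a whole array, which NumPy refuses
raised = False
try:
    val % 2 == 0 & val > 5
except ValueError as err:
    raised = True
    print('ValueError:', err)
```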

8 Boolean Masking

Boolean masking is a powerful technique for filtering or subsetting NumPy arrays. It allows you to extract elements from an array that satisfy specific conditions, using a Boolean array (of True/False values) as a mask.

In simple terms, Boolean masking selects elements where the corresponding value in a Boolean array is True.

8.1 Basic Example

boolArray = np.array([True, True, False, True, False])
myArray = np.array([1, 2, 3, 4, 5])

subArray = myArray[boolArray]
print(subArray)

[1 2 4]

Here, only the elements of myArray corresponding to True in boolArray are selected — resulting in [1, 2, 4].

8.2 Conditional Masking

You can also create Boolean masks directly by applying logical conditions to numeric or text arrays. Let’s create two arrays — one for categories (cat) and another for numerical values (val):

cat = np.array(['A', 'C', 'A', 'B', 'B', 'C', 'A', 'A',
                'C', 'B', 'C', 'C', 'A', 'B', 'A', 'A'])

val = np.array([8, 1, 3, 6, 10, 6, 12, 4,
                6, 1, 4, 8, 5, 4, 12, 4])

We can now use Boolean conditions to extract subsets of val.

print(val[val > 6])   # Elements greater than 6
print(val[val <= 6])  # Elements less than or equal to 6

[ 8 10 12  8 12]
[1 3 6 6 4 6 1 4 5 4 4]

Similarly, we can filter even and odd elements:

print(val[val % 2 == 0])  # Even numbers
print(val[val % 2 != 0])  # Odd numbers

[ 8  6 10  6 12  4  6  4  8  4 12  4]
[1 3 1 5]

And we can use categorical masks to select val values corresponding to each group in cat:

print(val[cat == 'A'])
print(val[cat == 'B'])
print(val[cat == 'C'])

[ 8  3 12  4  5 12  4]
[ 6 10  1  4]
[1 6 6 4 8]

Tip: When you apply a condition like val > 6, NumPy automatically creates a Boolean array where each element of val is tested against the condition. You can then use that Boolean array to filter val.
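Masks combine naturally with the aggregate functions from earlier. As an illustration, the sketch below computes the number of values and the mean of `val` within each category. Looping over `np.unique(cat)` and storing the results in a dictionary named `group_means` are choices made for this example, not the only way to do it.

```python
import numpy as np

cat = np.array(['A', 'C', 'A', 'B', 'B', 'C', 'A', 'A',
                'C', 'B', 'C', 'C', 'A', 'B', 'A', 'A'])

val = np.array([8, 1, 3, 6, 10, 6, 12, 4,
                6, 1, 4, 8, 5, 4, 12, 4])

group_means = {}
for group in np.unique(cat):
    group_vals = val[cat == group]  # values belonging to this category
    group_means[str(group)] = float(np.mean(group_vals))
    print(f'{group}: n = {len(group_vals)}, mean = {np.mean(group_vals):.2f}')

# A: n = 7, mean = 6.86
# B: n = 4, mean = 5.25
# C: n = 5, mean = 5.00
```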

8.3 Fancy Indexing

Beyond Boolean masks, NumPy also supports fancy indexing, which allows you to select elements by providing an array (or list) of indices. Unlike slicing, it can select non-contiguous elements in any order.

my_array = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
print(my_array[[6, 3, 8]])

[70 40 90]

This extracts elements at indices 6, 3, and 8, resulting in [70, 40, 90].

Difference between Boolean Masking and Fancy Indexing:

Feature	Boolean Masking	Fancy Indexing
Input	Boolean array (`True`/`False`)	List or array of integer indices
Purpose	Filter elements based on condition(s)	Select elements by position
Example	`val[val > 6]`	`val[[2, 5, 7]]`
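The two techniques are closely related: a Boolean mask can be converted into the integer positions of its True values. The sketch below uses `np.flatnonzero()` for that conversion, a function not covered above, and also shows that fancy indexing may repeat an index.

```python
import numpy as np

val = np.array([8, 1, 3, 6, 10, 6, 12, 4,
                6, 1, 4, 8, 5, 4, 12, 4])

mask = val > 6             # Boolean mask
idx = np.flatnonzero(mask)  # positions where the mask is True
print(idx)  # [ 0  4  6 11 14]

# Indexing with the positions selects the same elements as the mask
print(np.array_equal(val[idx], val[mask]))  # True

# Fancy indexing can reorder elements and repeat an index
picked = val[[6, 0, 0]]
print(picked)  # [12  8  8]
```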

Additional Resources

The following resources contain additional information about NumPy.