Numpy Basics
Array Declaration
Create a rank-1 vector
x = np.array([1, 2, 3])
Check its shape
x.shape
Declare a zero vector/matrix, which all elements are 0.
x = np.zeros((2, 2))
# [0, 0]
# [0, 0]
Declare a one vector/matrix, which all elements are 1.
x = np.ones((2, 2))
# [1, 1]
# [1, 1]
Create an identity matrix
I = np.eyes(3)
# [1 0 0]
# [0 1 0]
# [0 0 1]
Create a random matrix, values are ranging from 0 to 1
r = np.random.rand(3, 3)
Create a random matrix that is normally distributed
r = np.random.randn(3, 3)
Array Indexing
Declare an array and then slice the first two rows and columns 1 and 2
x = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[1, 2, 3, 4],
[5, 6, 7, 8]])
x_slice = x[:2, 1:3]
Slice of an array is a view into the same underlying data structure, thus modifying it will also modify the original.
x[0, 1] # => 2
x_slice[0, 0] = 10
x[0, 1] # => 10
You can also mix integer indexing with slice indexing. However, doing so will yield an array of lower rank than the original array.
x = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[1, 2, 3, 4],
[5, 6, 7, 8]])
# Rank 1 view of the second row of x
row_rank_1 = x[1, :]
# Rank 2 view of the second row of x
row_rank_2 = x[1:2, :]
Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric data types that you can use to construct arrays.
x = np.array([1, 2])
x.dtype # => dtype('int64')
x = np.array([1.0, 2.0])
x.dtype # => dtype('float64')
# We can force a particular datatype
x = np.array([1.2, 2.3], dtype=np.int64)
x.dtype # => dtype('int64')
Array Math
Declaring your numpy array, as float64
x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)
Element-wise sum
x + y
Element-wise difference
x - y
Element-wise product
x * y
Element-wise division
x / y
Element-wise square root
np.sqrt(x)
Inner dot product of two vectors
v = np.array([9, 10])
w = np.array([11, 12])
np.dot(v, w)
Matrix product of two 2D vectors, which are basically matrices
v = np.array([[1, 2], [3, 4]])
w = np.array([[1, 0], [0, 1]])
np.dot(v, w)
Sum an array/vector along all axis
np.sum(w)
Sum an array/vector along an axis
np.sum(w, axis=0)
np.sum(w, axis=1)
Perform transpose of a matrix
w.T
Array Broadcasting
Suppose that we want to add a constant vector to each row of a matrix
# We begin by doing it with the inefficient way...
x = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
v = np.array([1, 1, 1])
# Create an empty matrix that has x.shape, and then iterate throgh every row and perform addition
out = np.empty_like(x)
for i in range(3):
y[i, :] = x[i, :] + v
# We can also consider stacking v together and perform matrix element-wise addition
vv = np.tile(v, (3, 1))
y = x + vv
However, there is an even better way in numpy! We can perform the stacking method without actually creating multiple copies of v
.
y = x + v
import numpy as np
x = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
v = np.array([1, 1, 1])
# This is known as array broadcasting, which is essentially what we did with stacking.
x + v
array([[2, 2, 2],
[2, 2, 2],
[2, 2, 2]])
Image Operations
We need to shift gear a little bit and introduce scipy
. Scientifc Python library provides some basic functions to work with images. For example, it has functions to read images from disk into numpy arrays, to write numpy arrays to disk as images, and to resize images.
from scipy.misc import imread, imsave, imresize
img = imread('assets/cat.jpg')
img_tinted = img * [1.0, 0.5, 0.9] # Through array broadcasting
img_tinted = imresize(img_tinted, (300, 300))
imsave('assets/cat_tinted.jpg', img_tinted)
Plots
Plots are essential in machine learning, it helps us with understanding our data and monitoring the training progress of a model. Thus, matplotlib
comes in handy! The most important function in matplotlib
is plot
which allows you to plot 2D data, but of course there are other types of plot functions.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)
plt.ylim(-1, 1)
plt.plot(x, y)
plt.show()

With just a little bit of extra work, we can easily plot multiple lines at once. We can also add title, legend, and axis labels.
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)
plt.plot(x, y_sin)
plt.plot(x, y_cos)
plt.ylim(-1, 1)
plt.xlabel('x-axis label')
plt.ylabel('y-axis label')
plt.title('Sine and Cosine')
plt.legend(['Sine', 'Cosine'])
plt.show()

You can plot different things in the same figure using the subplot
function.
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)
# Set up a subplot grid which has 2 rows and 1 column.
fig, axs = plt.subplots(2, 1, figsize=(8, 8))
axs[0].plot(x, y_sin)
axs[0].set_title('Sine')
axs[1].plot(x, y_cos)
axs[1].set_title('Cosine')
plt.show()

We can also display images in numpy
. A slight gotcha with imshow is that it only accepts uint8
data type. We need to explicitly cast the image to uint8 before displaying it.
from matplotlib.image import imread
img = imread('assets/cat.jpg')
img_tinted = img * [1, 0.9, 0.9]
plt.subplot(1, 2, 1)
plt.imshow(img)
plt.subplot(1, 2, 2)
plt.imshow(np.uint8(img_tinted))
plt.show()

Display multiple images on a grid.
fig = plt.figure(figsize=(10, 10))
# 25 images, display them on a 5 by 5 grid
for i in range(1, 26):
img = np.random.randint(10, size=(10,10))
fig.add_subplot(5, 5, i)
plt.imshow(img)
plt.show()

Last updated