How to randomly select elements of an array with NumPy in Python?

Daidalos November 26, 2019


Examples of how to randomly select elements of an array with numpy in python:

Randomly select elements of a 1D array using choice()

Let's create a simple 1D array with 10 elements:

>>> import numpy as np
>>> data = np.arange(10)
>>> data
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

\begin{equation}
A = \left( \begin{array}{cccccccccc}
0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9
\end{array}\right)
\end{equation}

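To randomly select n elements, one solution is to use choice(). By default the sampling is done with replacement, so the same element can appear more than once in the result. For example, to randomly select 4 elements from the array data: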

>>> np.random.choice(data,4)
array([9, 6, 2, 9])

This call returns, for example:

\begin{equation}
A = \left( \begin{array}{cccc}
9 & 6 & 2 & 9
\end{array}\right)
\end{equation}

Another example, with n = 5, repeated 10 times (note that some samples contain the same element more than once):

>>> for i in range(10):
...     np.random.choice(data,5)
... 
array([3, 4, 0, 8, 4])
array([3, 0, 0, 3, 6])
array([5, 1, 2, 0, 9])
array([5, 8, 6, 0, 1])
array([4, 0, 9, 4, 2])
array([9, 6, 3, 9, 9])
array([9, 5, 1, 2, 7])
array([9, 7, 6, 4, 5])
array([6, 8, 5, 5, 9])
array([8, 9, 5, 5, 6])
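
The samples above change on every run. If reproducible results are needed, one option is to draw from a seeded generator. The sketch below is not part of the original example: it assumes NumPy 1.17 or later, where np.random.default_rng() is available, and the seed value 42 is arbitrary.

```python
import numpy as np

data = np.arange(10)

# Two generators created with the same (arbitrary) seed produce the same draws.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

sample_a = rng_a.choice(data, 5)
sample_b = rng_b.choice(data, 5)

print(sample_a)
print(np.array_equal(sample_a, sample_b))  # True
```

The generator's choice() method takes the same replace and p arguments as np.random.choice().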

Random sampling without replacement

To do random sampling without replacement, just add the option replace=False:

>>> for i in range(10):
...     np.random.choice(data,5,replace=False)
... 
array([9, 7, 4, 0, 6])
array([0, 9, 2, 4, 6])
array([2, 6, 5, 0, 9])
array([0, 3, 5, 7, 9])
array([0, 5, 9, 6, 7])
array([5, 0, 9, 6, 3])
array([7, 2, 6, 9, 1])
array([7, 6, 5, 8, 4])
array([6, 8, 5, 7, 4])
array([0, 1, 2, 3, 5])

One can see that, within a sample, an element cannot be selected more than once.
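
This property can also be checked programmatically. The following is a minimal sketch that compares the number of unique values with the sample size, and shows what happens when more elements are requested than the array contains (the variable names are chosen for illustration):

```python
import numpy as np

data = np.arange(10)

sample = np.random.choice(data, 5, replace=False)
# Without replacement, every drawn value is distinct.
all_unique = len(np.unique(sample)) == len(sample)
print(sample, all_unique)

# Asking for more elements than the array holds is impossible without
# replacement, so NumPy raises a ValueError.
try:
    np.random.choice(data, 11, replace=False)
    oversample_failed = False
except ValueError as err:
    oversample_failed = True
    print("ValueError:", err)
```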

Weighted random sampling

To do weighted random sampling, define for each element its probability of being selected:

>>> p = [0.05, 0.05, 0.1, 0.125, 0.175, 0.175, 0.125, 0.1, 0.05, 0.05]

Note: the probabilities must sum to 1:

>>> sum(p)
1.0
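
When the weights are available as raw scores rather than probabilities, they can be normalized by dividing by their sum. A minimal sketch, assuming hypothetical raw weights (these values are not from the original example, they are simply chosen so that the result matches p above):

```python
import numpy as np

data = np.arange(10)

# Hypothetical raw weights; any non-negative values with a positive sum work.
weights = np.array([2, 2, 4, 5, 7, 7, 5, 4, 2, 2], dtype=float)

# Dividing by the total turns the weights into probabilities that sum to 1.
p = weights / weights.sum()

print(p)
print(np.isclose(p.sum(), 1.0))

sample = np.random.choice(data, 5, p=p)
print(sample)
```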

Here, for example, the elements 0, 1, 8 and 9 have a lower probability of being selected:

>>> for idx, prob in enumerate(p):
...     print(prob, data[idx])
... 
0.05 0
0.05 1
0.1 2
0.125 3
0.175 4
0.175 5
0.125 6
0.1 7
0.05 8
0.05 9

Let's check:

>>> for i in range(10):
...     np.random.choice(data,5,replace=False,p=p)
... 
array([7, 5, 0, 2, 3])
array([9, 2, 3, 5, 7])
array([2, 5, 3, 7, 4])
array([7, 2, 9, 4, 5])
array([1, 4, 6, 3, 2])
array([4, 5, 3, 7, 1])
array([2, 7, 4, 6, 3])
array([6, 5, 0, 1, 8])
array([4, 0, 5, 9, 6])
array([8, 9, 3, 4, 6])
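
Ten small samples are not enough to see the weights clearly. One way to check them, sketched below, is to draw a large number of elements with replacement and compare the observed frequencies with p. The number of draws (100,000) is an arbitrary choice.

```python
import numpy as np

data = np.arange(10)
p = [0.05, 0.05, 0.1, 0.125, 0.175, 0.175, 0.125, 0.1, 0.05, 0.05]

# Draw many elements with replacement so the counts can be compared with p.
n_draws = 100_000
sample = np.random.choice(data, n_draws, p=p)

# The values in data are 0..9, so bincount gives the count of each element.
freq = np.bincount(sample, minlength=len(data)) / n_draws

for value, prob, f in zip(data, p, freq):
    print(value, prob, round(f, 3))
```

The observed frequencies should be close to the requested probabilities, with small differences from one run to the next.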

Random sampling for a 2D array

Let's consider the following 2D array:

>>> data = np.arange(80).reshape((8, 10))
>>> data
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])

\begin{equation}
data = \left( \begin{array}{cccccccccc}
0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
10 & 11 & 12 & 13 & 14 & 15 & 16 & 17 & 18 & 19 \\
20 & 21 & 22 & 23 & 24 & 25 & 26 & 27 & 28 & 29 \\
30 & 31 & 32 & 33 & 34 & 35 & 36 & 37 & 38 & 39 \\
40 & 41 & 42 & 43 & 44 & 45 & 46 & 47 & 48 & 49 \\
50 & 51 & 52 & 53 & 54 & 55 & 56 & 57 & 58 & 59 \\
60 & 61 & 62 & 63 & 64 & 65 & 66 & 67 & 68 & 69 \\
70 & 71 & 72 & 73 & 74 & 75 & 76 & 77 & 78 & 79
\end{array}\right)
\end{equation}

The function np.random.choice() only accepts a 1D array as input. A solution is to use ravel() to flatten the 2D array into a 1D array, for example:

>>> np.random.choice(data.ravel(), 10, replace=False)
array([64, 35, 53, 14, 48, 29, 74, 21, 62, 41])
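
Flattening the array discards the position of each selected element. If the row and column are needed as well, one possible approach is to sample flat indices instead of values and convert them back with np.unravel_index(). The sketch below also samples whole rows by drawing row indices; the sample sizes (10 and 3) are arbitrary.

```python
import numpy as np

data = np.arange(80).reshape((8, 10))

# Draw 10 distinct flat indices, then convert them to (row, column) pairs.
flat_idx = np.random.choice(data.size, 10, replace=False)
rows, cols = np.unravel_index(flat_idx, data.shape)

values = data[rows, cols]
print(values)

# Whole rows can be sampled the same way, by drawing row indices.
row_idx = np.random.choice(data.shape[0], 3, replace=False)
print(data[row_idx])
```

Passing an integer n to choice() is equivalent to sampling from np.arange(n).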

