How to edit a pandas dataframe column values where a condition is verified in python ?

How to edit a pandas dataframe column values where a condition is verified in python ?

Daidalos May 24, 2020


Examples of how to edit a pandas dataframe column values where a condition is verified in python:

1 -- Create a simple dataframe with pandas

Lets' start by creating a simple dataframe with 5 columns and 20 rows:

>>> import pandas as pd
>>> import numpy as np
>>> data = np.arange(1,101)
>>> data = data.reshape(20,5)
>>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e'])

returns

>>> df
     a   b   c   d    e
0    1   2   3   4    5
1    6   7   8   9   10
2   11  12  13  14   15
3   16  17  18  19   20
4   21  22  23  24   25
5   26  27  28  29   30
6   31  32  33  34   35
7   36  37  38  39   40
8   41  42  43  44   45
9   46  47  48  49   50
10  51  52  53  54   55
11  56  57  58  59   60
12  61  62  63  64   65
13  66  67  68  69   70
14  71  72  73  74   75
15  76  77  78  79   80
16  81  82  83  84   85
17  86  87  88  89   90
18  91  92  93  94   95
19  96  97  98  99  100

2 -- Select a column

To select a given column, a solution is to do:

>>> df['b']
0      2
1      7
2     12
3     17
4     22
5     27
6     32
7     37
8     42
9     47
10    52
11    57
12    62
13    67
14    72
15    77
16    82
17    87
18    92
19    97
Name: b, dtype: int64

or

>>> df.loc[:,'b']
0      2
1      7
2     12
3     17
4     22
5     27
6     32
7     37
8     42
9     47
10    52
11    57
12    62
13    67
14    72
15    77
16    82
17    87
18    92
19    97
Name: b, dtype: int64

Note: if the column does not have a name, there is the function iloc():

 >>> df.iloc[:,1]
0      2
1      7
2     12
3     17
4     22
5     27
6     32
7     37
8     42
9     47
10    52
11    57
12    62
13    67
14    72
15    77
16    82
17    87
18    92
19    97

To edit (for example here by multiplying by 10) all elements of the column b, a solution. is to do:

>>> df['b'] = df['b'] * 10
>>> df
     a    b   c   d    e
0    1   20   3   4    5
1    6   70   8   9   10
2   11  120  13  14   15
3   16  170  18  19   20
4   21  220  23  24   25
5   26  270  28  29   30
6   31  320  33  34   35
7   36  370  38  39   40
8   41  420  43  44   45
9   46  470  48  49   50
10  51  520  53  54   55
11  56  570  58  59   60
12  61  620  63  64   65
13  66  670  68  69   70
14  71  720  73  74   75
15  76  770  78  79   80
16  81  820  83  84   85
17  86  870  88  89   90
18  91  920  93  94   95
19  96  970  98  99  100

3 -- Select only elements of the column where a condition is verified

For example to edit only the values that are greater than 500:

>>> df['b'][ df['b'] > 500 ] = df['b'][ df['b'] > 500 ] / 10
>>> df
     a    b   c   d    e
0    1   20   3   4    5
1    6   70   8   9   10
2   11  120  13  14   15
3   16  170  18  19   20
4   21  220  23  24   25
5   26  270  28  29   30
6   31  320  33  34   35
7   36  370  38  39   40
8   41  420  43  44   45
9   46  470  48  49   50
10  51   52  53  54   55
11  56   57  58  59   60
12  61   62  63  64   65
13  66   67  68  69   70
14  71   72  73  74   75
15  76   77  78  79   80
16  81   82  83  84   85
17  86   87  88  89   90
18  91   92  93  94   95
19  96   97  98  99  100

Note: if the name of the column is not specify all columns will be modified, illustration:

>>> df[ df['b'] > 500 ] = df[ df['b'] > 500 ] / 10
>>> df
       a      b     c     d     e
0    1.0   20.0   3.0   4.0   5.0
1    6.0   70.0   8.0   9.0  10.0
2   11.0  120.0  13.0  14.0  15.0
3   16.0  170.0  18.0  19.0  20.0
4   21.0  220.0  23.0  24.0  25.0
5   26.0  270.0  28.0  29.0  30.0
6   31.0  320.0  33.0  34.0  35.0
7   36.0  370.0  38.0  39.0  40.0
8   41.0  420.0  43.0  44.0  45.0
9   46.0  470.0  48.0  49.0  50.0
10   5.1   52.0   5.3   5.4   5.5
11   5.6   57.0   5.8   5.9   6.0
12   6.1   62.0   6.3   6.4   6.5
13   6.6   67.0   6.8   6.9   7.0
14   7.1   72.0   7.3   7.4   7.5
15   7.6   77.0   7.8   7.9   8.0
16   8.1   82.0   8.3   8.4   8.5
17   8.6   87.0   8.8   8.9   9.0
18   9.1   92.0   9.3   9.4   9.5
19   9.6   97.0   9.8   9.9  10.0

4 -- Select only elements of the column where multiple conditions are verified

Example lets select the elements greater than 12 and lower than 40 in the column b:

>>> df['b'][ (df['b'] > 12) & (df['b'] < 40) ]
3    17
4    22
5    27
6    32
7    37

Edit those elements by adding 5

>>> df['b'][ (df['b'] > 12) & (df['b'] < 40) ] = df['b'][ (df['b'] > 12) & (df['b'] < 40) ] + 5
>>> df
     a   b   c   d    e
0    1   2   3   4    5
1    6   7   8   9   10
2   11  12  13  14   15
3   16  22  18  19   20
4   21  27  23  24   25
5   26  32  28  29   30
6   31  37  33  34   35
7   36  42  38  39   40
8   41  42  43  44   45
9   46  47  48  49   50
10  51  52  53  54   55
11  56  57  58  59   60
12  61  62  63  64   65
13  66  67  68  69   70
14  71  72  73  74   75
15  76  77  78  79   80
16  81  82  83  84   85
17  86  87  88  89   90
18  91  92  93  94   95
19  96  97  98  99  100

5 -- References