# Pandas where() – Select Rows based on Values in a Dataframe Column

**Pandas where()**: In this new tutorial we will see how to use the where() function on a column of a dataframe of the ** pandas** module.

## Introduction

A** pandas dataframe** is a two-dimensional tabular data structure that can be modified in size with labeled axes that are commonly referred to as **row **and **column **labels, with different **arithmetic operations **aligned with the row and column labels.

The Pandas library, available on python, allows to import data and to make quick analysis on loaded data.

The** where()** **function **allows you to replace the values for which your condition is *False*. The main purpose of this function is to replace values that do not satisfy one or more criteria. By default, The rows not satisfying the condition are filled with NaN value.

This is a function we don’t use often, but in some cases, it can be very useful instead of using the ** .apply() **function.

In this tutorial we will see the following topics:

- Pandas Where with
single condition*DataFrame* - Where() with
multiple condition*DataFrame* - Pandas Where with
condition*Series* - Pandas Where with
condition*Callable*

To illustrate these different points, we will use the following dataframe:

```
import pandas as pd
pokedex = [(10, 40, 40, 20),
(20, 50, 10, 30),
(20, 50, 70, 50),
(30, 20, 30, 70),
(40, 10, 50, 70)
]
df = pd.DataFrame(pokedex,
columns=['Power1', 'Power2', 'Power3', 'Power4'],
index=['Bulbasaur', 'Charmander', 'Squirtle', 'Blastoise','Caterpie'])
print(df)
```

**Output **:

```
Power1 Power2 Power3 Power4
Bulbasaur 10 40 40 20
Charmander 20 50 10 30
Squirtle 20 50 70 50
Blastoise 30 20 30 70
Caterpie 40 10 50 70
```

## Pandas where()

### Pandas where() Syntax

The syntax of the where() function is as follows:

```
DataFrame.where(cond,
other= NaN,
inplace= False,
axis= None,
level= None,
errors= 'raise',
try_cast= False)
```

This function can take several parameters. Here is a detailed description of each parameter:

Name | Description | Type | Default Value | Required |
---|---|---|---|---|

| The cond parameter is the place where the condition to be checked will be satisfied. | boolean, `Series` or `DataFrame` , an array-like structure or a callable | – | Yes |

| All values that do not satisfy the previously given condition will be treated accordingly by this other parameter. | scalar, `Series` /`DataFrame` , or a callable | NaN | No |

| Whether to perform the operation in place on the data. | Boolean | False | No |

axis | Alignment axis if needed. | Integer | None | No |

level | Alignment level if needed. | Integer | None | No |

errors | String | raise | No | |

try_cast | Try to cast the result back to the input type (if possible) | Boolean | False | No |

### Pandas where() with Single Condition

To start off, we will filter in our dataframe all values that are greater than 60. To know all the values greater than 60 in our dataframe we can use the following condition:

```
print(df<60)
Power1 Power2 Power3 Power4
Bulbasaur True True True True
Charmander True True True True
Squirtle True True False True
Blastoise True True True False
Caterpie True True True False
```

In this example, we see that 3 values are equal to False. To replace all the values that are greater than 60 by the value ‘High’, we can use the following row:

```
df = df.where(df<60, "High")
print(df)
Power1 Power2 Power3 Power4
Bulbasaur 10 40 40 20
Charmander 20 50 10 30
Squirtle 20 50 High 50
Blastoise 30 20 30 High
Caterpie 40 10 50 High
```

### Pandas where() with Multiple Condition

It is also possible to work with several conditions.

```
# Multiple Condition
filter1 = df['Power1']<40
filter2 = df['Power2']<50
df = df.where(filter1 & filter2, "OK")
print(df)
```

Output :

```
Power1 Power2 Power3 Power4
Bulbasaur 10 40 40 20
Charmander OK OK OK OK
Squirtle OK OK OK OK
Blastoise 30 20 30 70
Caterpie OK OK OK OK
```

### Pandas where() with *Series* condition

*Series*

If you want to fill an entire row based on a Pandas Series, it is possible to pass the Series in the condition.

```
filter1 = df['Power3']<50
df = df.where(filter1, "Strong")
print(df)
```

Output :

```
Power1 Power2 Power3 Power4
Bulbasaur 10 40 40 20
Charmander 20 50 10 30
Squirtle Strong Strong Strong Strong
Blastoise 30 20 30 70
Caterpie Strong Strong Strong Strong
```

### Pandas where() with *Callable* condition

*Callable*

You can also specify a callable condition for your where cond parameter. This means that you can pass a function to evaluate your value. To test this, we will use a lambda function (this also works with a normal function):

```
# Callable condition
df = df.where(lambda x: x<50, "High")
print(df)
```

Output:

```
Power1 Power2 Power3 Power4
Bulbasaur 10 40 40 20
Charmander 20 High 10 30
Squirtle 20 High High High
Blastoise 30 20 30 High
Caterpie 40 10 High High
```

## Conclusion

In this tutorial, we learned how to use the **where**() function present in the Pandas package. This package is very useful when you want to clean or process data in a dataframe.

If you have any questions about the use of this function, don’t hesitate to tell me in the comments, I will be happy to answer them.

Comments

Leave a comment