**Pandas sum()**: We will see in this tutorial how to use the sum() function for a column or row in a Pandas dataframe.

**Introduction**

A **pandas dataframe** is a two-dimensional, size-mutable tabular data structure with labeled axes, commonly referred to as **row** and **column** labels. **Arithmetic operations** align on both the row and column labels.

The pandas library, available in Python, lets you import data and quickly analyze it once loaded.

In this tutorial, we will see how to use the sum() function from the **pandas library**. This function returns the sum of the values along the requested axis. We will cover the following points:

- Use the **sum**() function to sum the values on the index axis (the rows)
- Use the **sum**() function to sum the values on the columns axis
- Sum the values with a *multi-level index*
- Sum the values on a **Series** type

To illustrate these different points, we will use the following pandas dataframe:

```
import pandas as pd
data = (['January', 'Monday', 10000, 30000],
['January', 'Friday', 5000, 20000],
['February', 'Monday', 1000000, 2000000],
['February', 'Friday', 2000000, 5000000],
['February', 'Sunday', 5000000, 10000000],
['March', 'Tuesday', 4000000, 8000000])
df = pd.DataFrame(data, columns=['Month', 'Day_Week', 'Income_A', 'Income_B'])
df = df.set_index(['Month', 'Day_Week'])
print(df)
```

```
                   Income_A  Income_B
Month    Day_Week
January  Monday       10000     30000
         Friday        5000     20000
February Monday     1000000   2000000
         Friday     2000000   5000000
         Sunday     5000000  10000000
March    Tuesday    4000000   8000000
```

This dataframe contains the different incomes generated per month and per day.

## Pandas Dataframe sum() function

### Pandas sum() Syntax

The sum() function is used to sum the values on a given axis. Its syntax is the following:

```
# sum() signature as of pandas 1.x (the level argument was removed in pandas 2.0)
DataFrame.sum(axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)
```

The function can take 6 parameters:

Name | Description | Type | Default Value | Required |
---|---|---|---|---|
axis | The axis along which to apply the function (0 = index, 1 = columns) | {index (0), columns (1)} | 0 (index) | No |
skipna | Exclude NA / NULL values | Boolean | True | No |
level | If the axis is a MultiIndex (hierarchical), sum along a particular level, reducing to a Series. Deprecated in pandas 1.3 and removed in 2.0; use groupby(level=...).sum() instead. | int or level name | None | No |
numeric_only | Include only float, int, boolean columns. If None, will try to use everything, then use only numeric data. Not implemented for Series. | Boolean | None (False since pandas 2.0) | No |
min_count | The required number of valid values to perform the operation. If fewer than min_count non-NA values are present, the result will be NA. | int | 0 | No |
**kwargs | Additional arguments to be passed to the function. | – | – | No |
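
To make the `skipna` and `min_count` parameters more concrete, here is a minimal sketch. The `demo` dataframe and its column names are hypothetical and exist only for this illustration; they are not part of the income example used in the rest of the tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with missing values, used only for illustration
demo = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, np.nan],
})

# Default behaviour: NaN values are skipped, so an all-NaN column sums to 0
default_sum = demo.sum()

# skipna=False: a single NaN in a column makes that column's sum NaN
strict_sum = demo.sum(skipna=False)

# min_count=1: require at least one non-NaN value, otherwise return NaN
min_count_sum = demo.sum(min_count=1)

print(default_sum)
print(strict_sum)
print(min_count_sum)
```

With the default settings, column `b` sums to 0 even though it holds no valid value; `min_count=1` is one way to get NaN there instead.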

### Sum each Column in Pandas DataFrame

In order to sum each column of the DataFrame, you can use the **axis** parameter in this way:

```
# Sum each column
df.sum(axis=0)
```

You can apply this code to our previously created dataframe:

```
import pandas as pd
data = (['January', 'Monday', 10000, 30000],
['January', 'Friday', 5000, 20000],
['February', 'Monday', 1000000, 2000000],
['February', 'Friday', 2000000, 5000000],
['February', 'Sunday', 5000000, 10000000],
['March', 'Tuesday', 4000000, 8000000])
df = pd.DataFrame(data, columns=['Month', 'Day_Week', 'Income_A', 'Income_B'])
df = df.set_index(['Month', 'Day_Week'])
print(df.sum(axis=0))
```

```
Result:
Income_A    12015000
Income_B    25050000
dtype: int64
```

We obtain the total of income A and the total of income B over the three months in the data.
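
If the dataframe also holds a text column (for example when the month is kept as a regular column rather than in the index), `numeric_only=True` limits the sum to the numeric columns. The sketch below uses a hypothetical `mixed` dataframe built from the monthly totals of our example.

```python
import pandas as pd

# Hypothetical frame mixing a text column with numeric columns
mixed = pd.DataFrame({
    "Month": ["January", "February", "March"],
    "Income_A": [15000, 8000000, 4000000],
    "Income_B": [50000, 17000000, 8000000],
})

# numeric_only=True restricts the sum to the numeric columns,
# so the text column "Month" is left out of the result
numeric_totals = mixed.sum(numeric_only=True)
print(numeric_totals)
```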

### Sum each Row in Pandas DataFrame

In order to sum each row of the DataFrame, you can pass axis=1 as follows:

```
# Sum each row
df.sum(axis=1)
```

You can apply this code to our previously created dataframe:

```
import pandas as pd
data = (['January', 'Monday', 10000, 30000],
['January', 'Friday', 5000, 20000],
['February', 'Monday', 1000000, 2000000],
['February', 'Friday', 2000000, 5000000],
['February', 'Sunday', 5000000, 10000000],
['March', 'Tuesday', 4000000, 8000000])
df = pd.DataFrame(data, columns=['Month', 'Day_Week', 'Income_A', 'Income_B'])
df = df.set_index(['Month', 'Day_Week'])
print(df.sum(axis=1))
```

```
Result:
Month     Day_Week
January   Monday         40000
          Friday         25000
February  Monday       3000000
          Friday       7000000
          Sunday      15000000
March     Tuesday     12000000
dtype: int64
```

In our example, this allows us to sum the income A and B for each row.
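
A common follow-up is to keep these row totals next to the original data. Here is a minimal sketch, assuming we want them in a new column; the column name `Total` is our own choice for this illustration.

```python
import pandas as pd

data = [['January', 'Monday', 10000, 30000],
        ['January', 'Friday', 5000, 20000],
        ['February', 'Monday', 1000000, 2000000],
        ['February', 'Friday', 2000000, 5000000],
        ['February', 'Sunday', 5000000, 10000000],
        ['March', 'Tuesday', 4000000, 8000000]]
df = pd.DataFrame(data, columns=['Month', 'Day_Week', 'Income_A', 'Income_B'])
df = df.set_index(['Month', 'Day_Week'])

# axis="columns" is an alias for axis=1
row_totals = df.sum(axis="columns")

# assign() returns a new dataframe and leaves df untouched
with_total = df.assign(Total=row_totals)
print(with_total)
```

Using `assign()` rather than `df['Total'] = ...` avoids modifying `df`, so a later `df.sum(axis=1)` does not accidentally include the total column.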

### Multi Level Index Sum

If your dataframe has a multi-level index, you can tell pandas which index level you want to sum across.

Our example dataframe contains 2 levels. To sum according to the first level, you can use this:

```
import pandas as pd
data = (['January', 'Monday', 10000, 30000],
['January', 'Friday', 5000, 20000],
['February', 'Monday', 1000000, 2000000],
['February', 'Friday', 2000000, 5000000],
['February', 'Sunday', 5000000, 10000000],
['March', 'Tuesday', 4000000, 8000000])
df = pd.DataFrame(data, columns=['Month', 'Day_Week', 'Income_A', 'Income_B'])
df = df.set_index(['Month', 'Day_Week'])
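# df.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() gives the same result.
# sort=False keeps the months in their order of appearance.
print(df.groupby(level=0, sort=False).sum())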
```

```
Result:
          Income_A  Income_B
Month
January      15000     50000
February   8000000  17000000
March      4000000   8000000
```

To sum from the second level, you can do this:

```
# Multi Level Index Sum (second level); groupby replaces df.sum(level=1), removed in pandas 2.0
df.groupby(level=1, sort=False).sum()
```

```
Result:
          Income_A  Income_B
Day_Week
Monday     1010000   2030000
Friday     2005000   5020000
Sunday     5000000  10000000
Tuesday    4000000   8000000
```
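
The levels can also be referenced by name rather than by position. The sketch below writes the same per-level totals with `groupby`, which does not rely on the `level` argument of `sum()` (deprecated in pandas 1.3 and removed in 2.0). The variable names `by_month` and `by_day` are ours.

```python
import pandas as pd

data = [['January', 'Monday', 10000, 30000],
        ['January', 'Friday', 5000, 20000],
        ['February', 'Monday', 1000000, 2000000],
        ['February', 'Friday', 2000000, 5000000],
        ['February', 'Sunday', 5000000, 10000000],
        ['March', 'Tuesday', 4000000, 8000000]]
df = pd.DataFrame(data, columns=['Month', 'Day_Week', 'Income_A', 'Income_B'])
df = df.set_index(['Month', 'Day_Week'])

# Group on a named index level, then sum within each group.
# sort=False keeps the groups in their order of appearance.
by_month = df.groupby(level='Month', sort=False).sum()
by_day = df.groupby(level='Day_Week', sort=False).sum()

print(by_month)
print(by_day)
```

Referring to levels by name tends to be more readable and keeps working if the order of the index levels changes.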

### Summing a Series

You can also use the pandas sum() function on a Series:

```
#Summing a Series
df['Income_A'].sum()
```

```
Result:
12015000
```
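
Because `df.sum()` itself returns a Series (one total per column), calling `sum()` a second time collapses it to a single number. Here is a minimal sketch that computes a grand total over both income columns this way; the variable names are ours.

```python
import pandas as pd

data = [['January', 'Monday', 10000, 30000],
        ['January', 'Friday', 5000, 20000],
        ['February', 'Monday', 1000000, 2000000],
        ['February', 'Friday', 2000000, 5000000],
        ['February', 'Sunday', 5000000, 10000000],
        ['March', 'Tuesday', 4000000, 8000000]]
df = pd.DataFrame(data, columns=['Month', 'Day_Week', 'Income_A', 'Income_B'])
df = df.set_index(['Month', 'Day_Week'])

# First sum: a Series with one total per column
column_totals = df.sum()

# Second sum: the Series is reduced to a single scalar
grand_total = column_totals.sum()
print(grand_total)
```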

## Conclusion

In this tutorial, we have seen how to use the sum() function of the pandas library. This function is very useful for quickly analyzing data and making quick calculations on the columns or rows of a dataframe.

If you have any questions about its use, don’t hesitate to ask me in the comments. I’ll be happy to answer them.

See you soon for new tutorials.
