Pandas count(): In this article, we will see how to count the number of observations along a given axis of a pandas DataFrame.
A pandas DataFrame is a two-dimensional, size-mutable tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both the row and column labels.
The pandas library for Python lets you import data and quickly analyze it once loaded.
When working with data that you are not familiar with, it can be interesting to count the number of values in each column of your Pandas dataframe.
The count() function in the pandas library counts the non-null values in each column or row.
The same count is also reported in the output of DataFrame.describe().
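As a quick illustration of that last point, here is a minimal sketch on a small made-up frame (the `scores` data is hypothetical, not the dataset used in the rest of this article). It compares the `count` row of `describe()` with the result of `count()`; note that `describe()` reports the counts as floats.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing Physics score.
scores = pd.DataFrame({"Math": [40, 50, 50], "Physics": [40.0, np.nan, 70.0]})

summary = scores.describe()

# The "count" row of describe() holds the non-null count per numeric column.
describe_counts = summary.loc["count"]
direct_counts = scores.count()

print(describe_counts)
print(direct_counts)
```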
In this tutorial we will see how to:
- Count the number of values in each column.
- Count the number of values in each row.
- Count the number of null values
- Count the distinct values in our dataframe
First, we create a pandas DataFrame to illustrate each point:
import pandas as pd
import numpy as np

NaN = np.nan
school = [("Sam", 40, 40, 20),
          ("John", 50, NaN, NaN),
          ("Clark", 50, 70, 50),
          ("Hayley", 20, NaN, 70),
          ("Michelle", 10, 50, 70)]
df = pd.DataFrame(school, columns=['Student', 'Math', 'Physics', 'Chemistry'])
print(df)

    Student  Math  Physics  Chemistry
0       Sam    40     40.0       20.0
1      John    50      NaN        NaN
2     Clark    50     70.0       50.0
3    Hayley    20      NaN       70.0
4  Michelle    10     50.0       70.0
Pandas count() Syntax
The syntax of the count() function is as follows:
Syntax: DataFrame.count(axis=0, level=None, numeric_only=False)
The function takes three parameters, all optional:
| Parameter | Description | Type | Default | Required |
|---|---|---|---|---|
| axis | Counts are generated for each column if axis=0 or axis='index', and for each row if axis=1 or axis='columns' | int or str | 0 | No |
| level | If the axis is a MultiIndex, count along a particular level, collapsing into a DataFrame. A str specifies the level name. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0. | int or str | None | No |
| numeric_only | Include only float, int, or boolean data | bool | False | No |
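The sketch below shows two of these parameters in action. It assumes a shortened version of the article's data. The second part uses a made-up MultiIndex frame (the `Class` level and the `grades` name are hypothetical) to show `groupby(level=...).count()`, one way to count per index level on pandas versions that no longer accept `level`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [("Sam", 40, 40, 20), ("John", 50, np.nan, np.nan), ("Clark", 50, 70, 50)],
    columns=["Student", "Math", "Physics", "Chemistry"],
)

# numeric_only=True drops the non-numeric "Student" column from the result.
numeric_counts = df.count(numeric_only=True)
print(numeric_counts)

# Hypothetical MultiIndex frame: count non-null values per "Class" level.
index = pd.MultiIndex.from_tuples(
    [("A", "Sam"), ("A", "John"), ("B", "Clark")], names=["Class", "Student"]
)
grades = pd.DataFrame({"Math": [40, 50, 50], "Physics": [40, np.nan, 70]}, index=index)
per_class = grades.groupby(level="Class").count()
print(per_class)
```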
Pandas Count Values in Each Column
By default, count() works along axis=0 and returns the number of values in each column, so no argument is needed:
# axis = 0 (default)
print(df.count())

Student      5
Math         5
Physics      3
Chemistry    4
dtype: int64
The example above shows that the function does not count null values: Physics and Chemistry have fewer than 5 values because their NaN entries are skipped.
Pandas Count Values in Each Row
To count the number of values in each row of our dataframe, we pass axis=1 or axis='columns':
# Count along the row axis
print(df.count(axis=1))
# Or, equivalently
print(df.count(axis='columns'))

0    4
1    2
2    4
3    3
4    4
dtype: int64
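One possible use of these row counts, offered as a sketch rather than a recommended recipe, is to keep only the rows with no missing values. A row is complete when its count equals the number of columns. The names `row_counts` and `complete_rows` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

school = [
    ("Sam", 40, 40, 20),
    ("John", 50, np.nan, np.nan),
    ("Clark", 50, 70, 50),
    ("Hayley", 20, np.nan, 70),
    ("Michelle", 10, 50, 70),
]
df = pd.DataFrame(school, columns=["Student", "Math", "Physics", "Chemistry"])

# Non-null values per row, compared against the total number of columns.
row_counts = df.count(axis=1)
complete_rows = df[row_counts == df.shape[1]]

print(complete_rows["Student"].tolist())  # ['Sam', 'Clark', 'Michelle']
```

For this particular filter, `df.dropna()` selects the same rows; the count-based version is handy when you want a different threshold, such as "at least 3 values".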
Count Null Values
We have seen that the count() function ignores null values. If you want to count the number of null values you can use the isnull() function combined with the sum() function:
# Count null values
print(df.isnull().sum())

Student      0
Math         0
Physics      2
Chemistry    1
dtype: int64
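The null counts and the count() results are two views of the same information: for each column, the nulls are the total number of rows minus the non-null count. The following sketch checks that relationship on the article's data and also counts nulls per row and in total. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

school = [
    ("Sam", 40, 40, 20),
    ("John", 50, np.nan, np.nan),
    ("Clark", 50, 70, 50),
    ("Hayley", 20, np.nan, 70),
    ("Michelle", 10, 50, 70),
]
df = pd.DataFrame(school, columns=["Student", "Math", "Physics", "Chemistry"])

# Nulls per column, computed two ways.
nulls_per_column = df.isnull().sum()
nulls_from_count = len(df) - df.count()

# Nulls per row, and across the whole frame.
nulls_per_row = df.isnull().sum(axis=1)
total_nulls = int(df.isnull().sum().sum())

print(nulls_per_row)
print(total_nulls)
```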
Pandas Count Unique Values
The nunique() function counts the distinct values in each column (axis=0, the default) or each row (axis=1) of our dataframe. Like count(), it ignores null values by default:
# Distinct values
print(df.nunique())

Student      5
Math         4
Physics      3
Chemistry    3
dtype: int64
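Two related variations may be useful here, shown as a short sketch on the same data: `nunique(dropna=False)` treats NaN as a distinct value of its own, and `Series.value_counts()` reports how often each value occurs in a single column.

```python
import numpy as np
import pandas as pd

school = [
    ("Sam", 40, 40, 20),
    ("John", 50, np.nan, np.nan),
    ("Clark", 50, 70, 50),
    ("Hayley", 20, np.nan, 70),
    ("Michelle", 10, 50, 70),
]
df = pd.DataFrame(school, columns=["Student", "Math", "Physics", "Chemistry"])

# Count NaN as one more distinct value in the columns that contain it.
distinct_with_nan = df.nunique(dropna=False)
print(distinct_with_nan)

# Frequency of each Math score; 50 appears twice.
math_frequencies = df["Math"].value_counts()
print(math_frequencies)
```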
In this article we have seen how to count the number of values in a pandas DataFrame. I hope this function holds no more secrets for you :). It is a very useful function for exploring and understanding your data.
If you have any questions on this topic please let me know in the comments and I will be happy to answer them.
See you soon for new tutorials!