PySpark Rename Column on PySpark Dataframe (Single or Multiple Column)

PySpark Rename Column: In this tutorial we will see how to rename one or more columns in a PySpark dataframe, and the different ways to do it.
Introduction
It is often necessary to rename a column of a PySpark dataframe, for example when the headers of a file you have read do not match the names you want, or when you need to export a file in a given format.
In PySpark, there are several ways to rename columns:
- Using the withColumnRenamed() function, which renames one column per call and can be chained to rename several.
- Using the selectExpr() function
- Using the select() and alias() functions
- Using the toDF() function
We will see in this tutorial how to use these different functions, with several examples based on a small PySpark dataframe of Pokémon that has a “Name” column, a nested “Type” struct and an “Index” column.

Here is the code to create the pyspark dataframe :
from pyspark.sql import SparkSession
from pyspark.sql import functions as f
from pyspark.sql.types import StructType, StructField, StringType,IntegerType
spark = SparkSession.builder.appName('pyspark - example rename column').getOrCreate()
sc = spark.sparkContext
# Index values are written as strings to match the StringType declared in the schema below
pokedex = [
("Bulbasaur",("Grass","Poison"),"1"),
("Ivysaur",("Grass","Poison"),"2"),
("Venusaur",("Grass","Poison"),"3"),
("Charmeleon",("Fire","Fire"),"5"),
("Charizard",("Fire","Flying"),"6"),
("Wartortle",("Water","Water"),"8"),
("Blastoise",("Water","Water"),"9")
]
schema = StructType([
StructField('Name', StringType(), True),
StructField('Type', StructType([
StructField('Primary', StringType(), True),
StructField('Secondary', StringType(), True)
])),
StructField('Index', StringType(), True)
])
df = spark.createDataFrame(data=pokedex, schema = schema)
df.printSchema()
df.show(truncate=False)
PySpark withColumnRenamed
PySpark withColumnRenamed – To rename a single column
One of the simplest approaches to renaming a column is to use the withColumnRenamed() function. It takes two parameters:
existing: The current name of the column you want to rename.
new: The new column name.
Using our example dataframe, we will change the name of the “Name” column to “Pokemon_Name” :
# Rename single column using withColumnRenamed
df1 = df.withColumnRenamed("Name","Pokemon_Name")
df1.printSchema()
This gives us :
root
 |-- Pokemon_Name: string (nullable = true)
 |-- Type: struct (nullable = true)
 |    |-- Primary: string (nullable = true)
 |    |-- Secondary: string (nullable = true)
 |-- Index: string (nullable = true)
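One detail worth knowing: withColumnRenamed() does nothing when the existing column name is not found in the schema, so a typo in the old name passes silently. Here is a minimal sketch of a stricter wrapper. The name rename_strict is a hypothetical helper, not part of PySpark, and it only assumes that df is a PySpark dataframe.

```python
def rename_strict(df, existing, new):
    """Rename one column, failing loudly if it is missing.

    Hypothetical helper: `df` is assumed to be a PySpark DataFrame.
    """
    if existing not in df.columns:
        raise ValueError(
            f"Column {existing!r} not found; available columns: {df.columns}"
        )
    return df.withColumnRenamed(existing, new)
```

With the example dataframe, rename_strict(df, "Name", "Pokemon_Name") behaves like the call above, while rename_strict(df, "Nmae", "Pokemon_Name") raises a ValueError instead of returning the dataframe unchanged.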
PySpark withColumnRenamed – To rename multiple columns
We can also chain several withColumnRenamed() calls to rename several columns:
# Rename multiple columns using withColumnRenamed
df1 = df.withColumnRenamed("Name","Pokemon_Name").withColumnRenamed("Index","Number_id")
df1.printSchema()
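When there are many columns to rename, writing the chain by hand becomes tedious. The sketch below applies withColumnRenamed() once per entry of a dictionary. The name rename_many is a hypothetical helper, and df is assumed to be a PySpark dataframe. To my knowledge, PySpark 3.4 and later also offer a built-in withColumnsRenamed() method that accepts such a dictionary directly.

```python
from functools import reduce


def rename_many(df, mapping):
    """Apply withColumnRenamed once per (old, new) pair in `mapping`.

    Hypothetical helper: `df` is assumed to be a PySpark DataFrame and
    `mapping` a dict of {existing_name: new_name}.
    """
    return reduce(
        lambda acc, pair: acc.withColumnRenamed(pair[0], pair[1]),
        mapping.items(),
        df,
    )
```

For the example dataframe, rename_many(df, {"Name": "Pokemon_Name", "Index": "Number_id"}) should give the same schema as the chained version.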
PySpark withColumnRenamed – To rename nested columns
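withColumnRenamed() only acts on top-level columns, so it cannot rename a field nested inside a struct such as “Type”. A workaround is to copy each nested field into a new top-level column with withColumn() and then drop the original struct. This flattens the struct into one column per field, which can be useful in some situations.
Here’s how to do it:
# Rename nested columns using withColumn() and drop()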
df1 = df.withColumn("Primary_Type",f.col("Type.Primary")) \
.withColumn("Secondary_Type",f.col("Type.Secondary")) \
.drop("Type")
df1.printSchema()
df1.show()
root
 |-- Name: string (nullable = true)
 |-- Index: string (nullable = true)
 |-- Primary_Type: string (nullable = true)
 |-- Secondary_Type: string (nullable = true)
+----------+-----+------------+--------------+
|      Name|Index|Primary_Type|Secondary_Type|
+----------+-----+------------+--------------+
| Bulbasaur|    1|       Grass|        Poison|
|   Ivysaur|    2|       Grass|        Poison|
|  Venusaur|    3|       Grass|        Poison|
|Charmeleon|    5|        Fire|          Fire|
| Charizard|    6|        Fire|        Flying|
| Wartortle|    8|       Water|         Water|
| Blastoise|    9|       Water|         Water|
+----------+-----+------------+--------------+
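If you would rather keep the struct and only rename the fields inside it, one commonly used alternative is to cast the struct column to a struct type that has the same field types but new field names. The sketch below is written under that assumption. The names struct_type_ddl and rename_struct_fields are hypothetical helpers, and df is assumed to be a PySpark dataframe.

```python
def struct_type_ddl(fields):
    """Build a DDL string such as 'struct<a:string,b:string>' from (name, type) pairs."""
    return "struct<" + ",".join(f"{name}:{dtype}" for name, dtype in fields) + ">"


def rename_struct_fields(df, struct_col, fields):
    """Rename the fields of `struct_col` by casting it to a struct with new names.

    Hypothetical helper. The cast matches fields by position, so `fields`
    must list every nested field, in its original order, as (new_name, type).
    """
    return df.withColumn(struct_col, df[struct_col].cast(struct_type_ddl(fields)))
```

For the example dataframe, this could be called as rename_struct_fields(df, "Type", [("Primary_Type", "string"), ("Secondary_Type", "string")]), which keeps “Type” as a single struct column.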
Pyspark Rename Column Using selectExpr() function
Using the selectExpr() function in PySpark, we can also rename one or more columns of our dataframe. We will use this function to rename the “Name” and “Index” columns to “Pokemon_Name” and “Number_id” respectively:
# Rename single or multiple columns using selectExpr()
df1 = df.selectExpr("Name as Pokemon_Name", "Index as Number_id","Type")
df1.printSchema()
df1.show()
We use the SQL “AS” keyword to give each column its new name. Note that selectExpr() keeps only the columns you list, which is why “Type” is passed explicitly.
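Because selectExpr() keeps only the listed columns, it can be convenient to generate one expression per existing column and rename only those found in a mapping. A minimal sketch follows. The name rename_exprs is a hypothetical helper, and the backticks quote column names in the generated SQL expressions.

```python
def rename_exprs(columns, mapping):
    """Return one 'old as new' SQL expression per column.

    Hypothetical helper: columns missing from `mapping` keep their name,
    so no column is lost when the result is passed to selectExpr().
    """
    return [f"`{c}` as `{mapping.get(c, c)}`" for c in columns]
```

It could then be used as df.selectExpr(*rename_exprs(df.columns, {"Name": "Pokemon_Name", "Index": "Number_id"})).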
Pyspark Rename Column Using alias() function
The alias() function renames a column inside a select(), and can be applied to one or several columns. As with selectExpr(), only the selected columns are kept, so “Type” is listed as well.
# Rename column using alias() function
df1 = df.select(f.col("Name").alias("Pokemon_Name"), f.col("Index").alias("Number_id"),"Type")
df1.printSchema()
root
 |-- Pokemon_Name: string (nullable = true)
 |-- Number_id: string (nullable = true)
 |-- Type: struct (nullable = true)
 |    |-- Primary: string (nullable = true)
 |    |-- Secondary: string (nullable = true)
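The same idea can be generalised so that every column is selected and only the ones in a mapping receive an alias, which preserves the original column order. This is a sketch under the assumption that df is a PySpark dataframe; select_renamed is a hypothetical helper name.

```python
def select_renamed(df, mapping):
    """Select every column of `df`, aliasing those present in `mapping`.

    Hypothetical helper: `mapping` is a dict of {existing_name: new_name};
    columns not in the mapping are aliased to their own name.
    """
    return df.select([df[c].alias(mapping.get(c, c)) for c in df.columns])
```

For example, select_renamed(df, {"Name": "Pokemon_Name", "Index": "Number_id"}) renames both columns and leaves “Type” in its original position.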
Pyspark Rename Column Using toDF() function
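The toDF() function returns a new dataframe whose columns carry the names you pass to it. The names are assigned by position, so you must supply one name for every column, in the existing column order. We can therefore use this function to rename the columns of our PySpark dataframe: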
# Rename column using toDF() function
df1 = df.toDF("Pokemon_Name","Type","Number_id")
df1.printSchema()
df1.show()
root
 |-- Pokemon_Name: string (nullable = true)
 |-- Type: struct (nullable = true)
 |    |-- Primary: string (nullable = true)
 |    |-- Secondary: string (nullable = true)
 |-- Number_id: string (nullable = true)
+------------+---------------+---------+
|Pokemon_Name|           Type|Number_id|
+------------+---------------+---------+
|   Bulbasaur|[Grass, Poison]|        1|
|     Ivysaur|[Grass, Poison]|        2|
|    Venusaur|[Grass, Poison]|        3|
|  Charmeleon|   [Fire, Fire]|        5|
|   Charizard| [Fire, Flying]|        6|
|   Wartortle| [Water, Water]|        8|
|   Blastoise| [Water, Water]|        9|
+------------+---------------+---------+
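Since toDF() expects a full list of names, it lends itself to renaming every column with one rule, for instance lowercasing all headers. A minimal sketch follows, assuming df is a PySpark dataframe; rename_all is a hypothetical helper name.

```python
def rename_all(df, transform):
    """Rename every column by applying `transform` to its current name.

    Hypothetical helper: `transform` is any function from str to str,
    for example str.lower or a function that replaces spaces.
    """
    return df.toDF(*[transform(c) for c in df.columns])
```

For the example dataframe, rename_all(df, str.lower) would produce the columns name, type and index.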
Conclusion
In this article we learned the different ways to rename one or more columns in a PySpark dataframe. I hope it helps you use these functions. Feel free to send me comments, I would be happy to read them 🙂