pyspark.sql.Column.when

Column.when(condition, value)

Evaluates a list of conditions in order and returns the result expression of the first condition that matches. If Column.otherwise() is not invoked, None is returned for rows that match no condition.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
condition : Column

a boolean Column expression.

value

a literal value, or a Column expression.

Returns
Column

Column holding, for each row, the value paired with the first matching condition, or None when no condition matches and Column.otherwise() is not invoked.
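
The evaluation order can be illustrated with a small pure-Python model. This is only a sketch of the CASE WHEN semantics and is not how Spark implements them: the helper `case_when`, the dict-based rows, and the extra row "Carol" are hypothetical names introduced here for illustration.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Row = Dict[str, Any]
# Each branch pairs a condition (row -> bool or None) with a result value.
Branch = Tuple[Callable[[Row], Optional[bool]], Any]


def case_when(row: Row, branches: List[Branch], otherwise: Any = None) -> Any:
    """Hypothetical helper: return the value of the first branch that matches."""
    for condition, value in branches:
        # Only a condition that is exactly True matches; False and None
        # (modeling SQL NULL) fall through to the next branch.
        if condition(row) is True:
            return value
    # No branch matched: fall back to the default, which is None
    # when no "otherwise" value was supplied.
    return otherwise


rows = [
    {"age": 2, "name": "Alice"},
    {"age": 5, "name": "Bob"},
    {"age": 3, "name": "Carol"},  # illustrative row that matches no branch
]
branches = [
    (lambda r: r["age"] > 4, 1),
    (lambda r: r["age"] < 3, -1),
]

with_default = [case_when(r, branches, otherwise=0) for r in rows]
without_default = [case_when(r, branches) for r in rows]

print(with_default)     # [-1, 1, 0]
print(without_default)  # [-1, 1, None]
```

The second list mirrors what happens when Column.otherwise() is omitted: the unmatched row yields None rather than raising an error.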

Examples

Example 1: Using when() with conditions and values to create a new Column

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(2, "Alice"), (5, "Bob")], ["age", "name"])
>>> result = df.select(df.name, sf.when(df.age > 4, 1).when(df.age < 3, -1).otherwise(0))
>>> result.show()
+-----+------------------------------------------------------------+
| name|CASE WHEN (age > 4) THEN 1 WHEN (age < 3) THEN -1 ELSE 0 END|
+-----+------------------------------------------------------------+
|Alice|                                                          -1|
|  Bob|                                                           1|
+-----+------------------------------------------------------------+

Example 2: Chaining multiple when() conditions

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(1, "Alice"), (4, "Bob"), (6, "Charlie")], ["age", "name"])
>>> result = df.select(
...     df.name,
...     sf.when(df.age < 3, "Young").when(df.age < 5, "Middle-aged").otherwise("Old")
... )
>>> result.show()
+-------+---------------------------------------------------------------------------+
|   name|CASE WHEN (age < 3) THEN Young WHEN (age < 5) THEN Middle-aged ELSE Old END|
+-------+---------------------------------------------------------------------------+
|  Alice|                                                                      Young|
|    Bob|                                                                Middle-aged|
|Charlie|                                                                        Old|
+-------+---------------------------------------------------------------------------+

Example 3: Using literal values as conditions

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(2, "Alice"), (5, "Bob")], ["age", "name"])
>>> result = df.select(
...     df.name, sf.when(sf.lit(True), 1).otherwise(
...         sf.raise_error("unreachable")).alias("when"))
>>> result.show()
+-----+----+
| name|when|
+-----+----+
|Alice|   1|
|  Bob|   1|
+-----+----+