pyspark.sql.functions.time_diff

pyspark.sql.functions.time_diff(unit, start, end)

Returns the difference between two times, measured in the specified unit.

New in version 4.1.0.

Parameters
unit : Column or column name

The unit in which the difference is measured. Supported units are: “HOUR”, “MINUTE”, “SECOND”, “MILLISECOND”, and “MICROSECOND”. The unit is case-insensitive.

start : Column or column name

A starting time.

end : Column or column name

An ending time.

Returns
Column

The difference between the start and end times, in the specified unit.

Examples

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame(
...     [("HOUR", "13:08:15", "21:30:28")], ["unit", "start", "end"])
>>> df = df.withColumn("start", sf.col("start").cast("time"))
>>> df = df.withColumn("end", sf.col("end").cast("time"))
>>> df.select('*', sf.time_diff('unit', 'start', 'end')).show()
+----+--------+--------+---------------------------+
|unit|   start|     end|time_diff(unit, start, end)|
+----+--------+--------+---------------------------+
|HOUR|13:08:15|21:30:28|                          8|
+----+--------+--------+---------------------------+
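
In the example above, the two times are 8 hours, 22 minutes, and 13 seconds apart, yet the result for “HOUR” is 8, which suggests that partial units are dropped. The following is a minimal pure-Python sketch of that reading, not Spark's implementation. The name time_diff_model and its helpers are hypothetical, and rounding toward zero for negative differences is an assumption. Only the “HOUR” result is confirmed by the output above.

```python
from datetime import time

# Microseconds per supported unit (unit names taken from the parameter list).
_MICROS_PER_UNIT = {
    "HOUR": 3_600_000_000,
    "MINUTE": 60_000_000,
    "SECOND": 1_000_000,
    "MILLISECOND": 1_000,
    "MICROSECOND": 1,
}


def _to_micros(t: time) -> int:
    """Convert a time of day to microseconds since midnight."""
    return ((t.hour * 60 + t.minute) * 60 + t.second) * 1_000_000 + t.microsecond


def time_diff_model(unit: str, start: time, end: time) -> int:
    """Hypothetical model of time_diff; not part of the PySpark API."""
    micros_per_unit = _MICROS_PER_UNIT[unit.upper()]  # unit is case-insensitive
    delta = _to_micros(end) - _to_micros(start)
    # Assumption: partial units are dropped, rounding toward zero.
    sign = -1 if delta < 0 else 1
    return sign * (abs(delta) // micros_per_unit)


start, end = time(13, 8, 15), time(21, 30, 28)
print(time_diff_model("HOUR", start, end))    # 8, matches the output above
print(time_diff_model("minute", start, end))  # 502 under this model
print(time_diff_model("SECOND", start, end))  # 30133 under this model
```

Under this model, choosing a finer unit for the same pair of times yields a larger count, since fewer partial units are discarded.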