pythondatetimepysparkpython-datetime

Current Timestamp in Azure Databricks Notebook in EST


I need the current timestamp in EST but the current_timestamp() is returning PST.

Tried the following code but it's not working and showing 6 hours before EST time:

# Import the current_timestamp function
from pyspark.sql.functions import from_utc_timestamp
    
# Set the timezone to EST
spark.conf.set("spark.sql.session.timeZone", "EST")
    
# Get the current timestamp in EST
current_timestamp_est = spark.sql("SELECT from_utc_timestamp(current_timestamp(), 'EST') as current_timestamp_est")
    
# Show the current timestamp in EST
current_timestamp_est.show(truncate = False)

Also tried the below code. But the time is coming as 2024-03-11 23:16:04.589275-04:00. Why it's coming with -4:00? Is there a way to get rid of the -4:00?

from datetime import datetime
from pytz import timezone

est = timezone('US/Eastern')
now_est = datetime.now(est)
#now_est1 = now_est[:26]
print(now_est)

Any other way to get the current timestamp in EST in a variable?


Solution

  • To get rid of the -04:00 you can just use .strftime() in your second approach:

    from datetime import datetime
    from pytz import timezone
    
    est = timezone('US/Eastern')
    now_est = datetime.now(est).strftime('%Y-%m-%d %H:%M:%S')
    print(now_est)
    
    Output:
    2024-04-04 09:42:58
    

    Now, for the first approach instead of setting EST as the timezone - it's preferred to point out a location around the globe like America/New_York to get rid of the offset value -04:00 as follows:

    spark.conf.set("spark.sql.session.timeZone", "America/New_York")
    current_timestamp_est = spark.sql("SELECT current_timestamp() as current_timestamp_est")
    
    current_timestamp_est.show(truncate=False)
    
    Output:
    +--------------------------+
    |current_timestamp_est     |
    +--------------------------+
    |2024-04-04 09:42:12.082136|
    +--------------------------+