I have a dataframe containing a variety of phone numbers, and I want to extract the time zone for each one. I am using apply to loop over the rows of the dataframe as follows:
external_calls_cleaned_df['time_zone'] = external_calls_cleaned_df.apply(lambda x: timezone.time_zones_for_number(phonenumbers.parse(str(x.external_number), None)), axis=1)
This works fine as long as every value in x.external_number is a valid phone number; however, if even one invalid phone number is found in the entire series, it fails.
What I would like is for it to return 'Null' or None (anything, actually) whenever it hits an invalid number. I can filter those out after the fact, but I don't want the process to stop at that point.
I have tried wrapping the timezone call in a new function and executing it inside a try block:
def get_timezone(df):
    try:
        x = timezone.time_zones_for_number(phonenumbers.parse(str(df.external_number), None))
    except:
        None
    return x
and then using
external_calls_cleaned_df['time_zone'] = external_calls_cleaned_df.apply(lambda x:get_timezone(x), axis=1)
The process completes then, but it fills the 'time_zone' field with None for every value.
To accomplish this I am using the phonenumbers package, which is a Python port of Google's libphonenumber Java library.
I can't share the phone numbers in my database for obvious reasons, so I don't know how to turn this into a reproducible example, or I would provide it.
Can anyone help me?
Thanks, Brad
Try refactoring your code in order to use map with the target column "external_number" instead of apply with the whole dataframe, like this:
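import phonenumbers
from phonenumbers import timezone

def get_timezone(x):
    # Return from both branches so that every row gets a value.
    try:
        return timezone.time_zones_for_number(phonenumbers.parse(str(x), None))
    except Exception:
        # Parsing failed for this value: return None so it can be filtered out later.
        return None

external_calls_cleaned_df["time_zone"] = external_calls_cleaned_df[
    "external_number"
].map(get_timezone)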