I am using pandas (I have some experience with it, mostly as a CSV parser and "numpy extension") to read my measurement data, and I want to build a useful data structure with it. Unfortunately I am stuck on the following lines, where the condition is always False. There seems to be a (type?) problem when checking the Record Dates against plate.date. df...dt.date seems to be a "bound method Timestamp.date of Timestamp", and I cannot find understandable information about that.
Background: in plate I store which wells contain which sample. This changes from time to time, so plate has an attribute plate.date, which is a list of dates, and my function should return the sample type only for the dates given in plate.date. That way I can create several plate objects and classify my data according to them.
df has (among others) the columns "Record Date" (actually a timepoint, datetime64[ns]) and "Well Name", and should get the new columns "sample_type", "index" and "initial_measurement".
    if hasattr(plate, "date"):
        condition = df["Record Date"].dt.date.isin(plate.date)
    else:
        condition = df["Well Name"] != None  # True for available data
    df.loc[condition, ["sample_type", "index", "initial_measurement"]] = list((df.loc[condition, "Well Name"].astype(str).apply(get_sample_info)))
    # Change the data types of the new columns
    df = df.astype({"sample_type": str, "index": pd.Int64Dtype(), "initial_measurement": bool})
    # Helper function to get the sample type, index, and measurement for a given well.
    def get_sample_info(well):
        for sample_type, well_list in plate.__dict__.items():
            if well in well_list and sample_type.replace("_second", "") in plate.well_ranges:
                initial_measurement = True if "_second" not in sample_type else False
                sample_type = sample_type.replace("_second", "")
                index = well_list.index(well) + 1
                return sample_type, int(index), initial_measurement
        return None, np.nan, None
    class Plate:
        def __init__(self, ..., date=None):
            ...
            if date is not None:
                if isinstance(date, str):
                    # fine
                    self.date = list(parse(date).date)
                elif isinstance(date, list) or isinstance(date, tuple):
                    # fine, but check type of items
                    if all((isinstance(item, str) or isinstance(item, datetime)) for item in date):
                        self.date = [parse(item).date for item in date]
                    else:
                        raise TypeError("The data type of the elements in the date list/tuple must be datetime or strings.")
                elif isinstance(date, datetime):
                    # fine
                    self.date = date.date
                else:
                    raise TypeError("The data type of parameter date must be datetime.date, string (containing date) or list/tuple (of dates/strings).")
I tried to print the results and data types with code like:
    datetime.datetime.fromtimestamp(relevant_facs_measurements['Record Date'].iloc[0].date)
    TypeError: 'method' object cannot be interpreted as an integer

    parse("2023-12-01").date.isin(plate.date)
    AttributeError: 'builtin_function_or_method' object has no attribute 'isin'

    df['Record Date'].iloc[0].date == parse("2023-12-01")
    False

    type(df['Record Date'].iloc[0].date)
    <bound method Timestamp.date of Timestamp('2023-12-01 17:16:00')>

    type(parse("2023-12-01"))
    datetime.datetime

    df['Record Date'].dt.date.isin((parse("2023-12-01"),parse("2023-12-06")))
In another part of my code this works flawlessly:

    dates_to_keep = [datetime.date(2023,12,1), datetime.date(2023,12,6)]
    relevant_df=df[df["Record Date"].dt.date.isin(dates_to_keep)]
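The difference between the failing and the working case can be reproduced without pandas. The following is only a minimal sketch using the standard library; the variable names `stamp`, `referenced` and `called` are made up for illustration and do not appear in my real code.

```python
from datetime import date, datetime

stamp = datetime(2023, 12, 1, 17, 16)

# Without parentheses, `.date` is only a reference to the method itself.
referenced = stamp.date
print(callable(referenced))

# With parentheses, the method is called and returns a datetime.date object.
called = stamp.date()
print(type(called))

dates_to_keep = [date(2023, 12, 1), date(2023, 12, 6)]

# A datetime.date is found in a list of datetime.date objects.
print(called in dates_to_keep)

# A method object never compares equal to a datetime.date.
print(referenced in dates_to_keep)
```

A list filled with `something.date` (no parentheses) therefore holds method objects, and a membership test against real dates cannot match.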
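Solution was given by jlandercy:
Use date() (with parentheses) in the class definition. parse(...).date without parentheses is only a reference to the bound method, whereas parse(...).date() calls it and returns a datetime.date object, which is the type of the values in df["Record Date"].dt.date.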
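As a rough sketch of how the fix could look, the date handling can be moved into one small function that always returns a list of `datetime.date` objects. The helper name `normalize_dates`, the sample DataFrame and the assumption that `parse` comes from `dateutil.parser` are mine for illustration; this is not the original `Plate` class.

```python
from datetime import date, datetime

import pandas as pd
from dateutil.parser import parse  # assumption: the `parse` used in the question


def normalize_dates(value):
    """Hypothetical helper: return a list of datetime.date objects."""
    items = value if isinstance(value, (list, tuple)) else [value]
    result = []
    for item in items:
        if isinstance(item, str):
            # Call date() so that a date is stored, not the bound method.
            result.append(parse(item).date())
        elif isinstance(item, datetime):
            # datetime is checked before date because it is a subclass of date.
            result.append(item.date())
        elif isinstance(item, date):
            result.append(item)
        else:
            raise TypeError("date items must be str, datetime or datetime.date")
    return result


# Made-up sample data with the two column names from the question.
df = pd.DataFrame({
    "Record Date": pd.to_datetime([
        "2023-12-01 17:16:00",
        "2023-12-03 09:00:00",
        "2023-12-06 08:30:00",
    ]),
    "Well Name": ["A1", "A2", "A3"],
})

plate_dates = normalize_dates(["2023-12-01", datetime(2023, 12, 6)])
condition = df["Record Date"].dt.date.isin(plate_dates)
print(condition.tolist())
```

With the dates stored as `datetime.date`, the condition is True for the rows recorded on 2023-12-01 and 2023-12-06 and False for the row in between.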