I have a large database, and I want my Python code to read only the last week of data.
However, somebody made a typo in the database, so there is a date in the future that throws everything off.
    recvd_dttm
    6/5/2015 18:28:50 PM
    6/5/2015 14:25:43 PM
    9/10/2015 21:45:12 PM
    6/5/2015 14:30:43 PM
    6/5/2015 14:32:33 PM
    6/5/2015 14:33:45 PM
Code so far:
    import datetime
    import pandas as pd

    # Create a dataframe with the data we are interested in
    df1 = pd.read_csv('MYDATA.csv')

    # This section selects the last week of data
    # convert strings to datetimes
    df1['recvd_dttm'] = pd.to_datetime(df1['recvd_dttm'])

    # get first and last datetime for final week of data
    range_max = df1['recvd_dttm'].max()
    range_min = range_max - datetime.timedelta(days=7)

    # take slice with final week of data
    df2 = df1[(df1['recvd_dttm'] >= range_min) & (df1['recvd_dttm'] <= range_max)]
I want to ignore all dates in the future. I tried wrapping this in a try/except that catches IndexError, but that didn't work, because the IndexError is only raised later in the code.
I have also tried an if statement:
    if df1['recvd_dttm'].max() > datetime.datetime.now():
but these values aren't comparable, and I don't know how to select the second-largest date, since max() - 1 obviously doesn't work. Does anyone have any ideas? Thanks in advance!
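For reference, here is a minimal sketch of the behaviour I am after: drop any row dated after "now" first, then take the final week relative to the latest remaining timestamp. The function name `last_week_ignoring_future` and its `now` parameter are hypothetical and only there for illustration. The sample frame is made up from the values above with the trailing "PM" left off, on the assumption that the column parses cleanly. `errors="coerce"` is used so that anything unparseable becomes NaT rather than staying a string, which may be why the comparison failed for me.

```python
import pandas as pd

def last_week_ignoring_future(df, column="recvd_dttm", now=None):
    """Return rows from the 7 days up to the latest non-future timestamp."""
    # `now` is a hypothetical parameter so the cutoff can be fixed for testing
    now = pd.Timestamp.now() if now is None else pd.Timestamp(now)

    out = df.copy()
    # Unparseable strings become NaT instead of staying as text
    out[column] = pd.to_datetime(out[column], errors="coerce")

    # Comparisons with NaT are False, so NaT rows are dropped here too
    out = out[out[column] <= now]

    range_max = out[column].max()
    range_min = range_max - pd.Timedelta(days=7)
    return out[out[column] >= range_min]

# Made-up sample based on the column above (assumption: "PM" suffix removed)
sample = pd.DataFrame({"recvd_dttm": [
    "6/5/2015 18:28:50",
    "6/5/2015 14:25:43",
    "9/10/2015 21:45:12",
    "6/5/2015 14:30:43",
]})
result = last_week_ignoring_future(sample, now="2015-06-06")
```

With the cutoff fixed at 2015-06-06, the 9/10/2015 row is excluded before `max()` is taken, so the week is measured back from 6/5/2015 rather than from the typo.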