I'm trying to figure out the most robust caching scheme for my particular use case. I have a class that manages temporal data at a per-day granularity (there's no intraday data).
class DataManager:
    def write(self, for_date, data):
        # write data to the db
        # I want to cache this because loading from the db is expensive
        # (my profiling results confirm this empirically)
        ...

    def read(self, for_date):
        # read data from the db
        # this method is used sparingly but is exposed as a public API
        ...

    def truncate(self, cutoff_date, inclusive=True):
        # remove everything after cutoff_date; if inclusive=True, also
        # remove everything on cutoff_date
        ...
I want to use something like an LRU cache for read, but I can't simply add the functools.lru_cache decorator, because a sequence of operations like this is possible:
write(X, in_data)

# returns the data we wrote for X
read(X)

truncate(X, inclusive=True)

# returns nothing, since the data was removed by truncate. If we naively
# cached read, this call would return the data we wrote, even though that
# data no longer exists.
read(X)
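For concreteness, here is a minimal reproduction of that stale read, using functools.lru_cache and an in-memory dict as a stand-in for the database (NaiveDataManager and its internals are hypothetical names, not my real code):

import datetime
import functools

class NaiveDataManager:
    def __init__(self):
        self._db = {}  # in-memory stand-in for the real database

    def write(self, for_date, data):
        self._db[for_date] = data

    @functools.lru_cache(maxsize=128)  # BUG: never invalidated by truncate
    def read(self, for_date):
        return self._db.get(for_date)

    def truncate(self, cutoff_date, inclusive=True):
        for d in list(self._db):
            if d > cutoff_date or (inclusive and d == cutoff_date):
                del self._db[d]

mgr = NaiveDataManager()
day = datetime.date(2024, 1, 2)
mgr.write(day, "payload")
assert mgr.read(day) == "payload"  # populates the cache
mgr.truncate(day, inclusive=True)  # deletes the row...
assert mgr.read(day) == "payload"  # ...but the cache still serves it (stale!)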
The only idea I could come up with is to have every method that removes data also evict the removed dates from the cache, roughly like the sketch below.
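Here is a minimal sketch of that idea, under the same assumptions as above (an in-memory dict stands in for the db; _cache and the other internals are hypothetical names): write populates the cache, read fills it on a miss, and truncate evicts exactly the dates it deletes.

import datetime

class DataManager:
    def __init__(self):
        self._db = {}     # in-memory stand-in for the real database
        self._cache = {}  # per-date read cache

    def write(self, for_date, data):
        self._db[for_date] = data
        self._cache[for_date] = data  # keep the cache coherent on write

    def read(self, for_date):
        if for_date not in self._cache:
            # the expensive db load, done only on a cache miss
            self._cache[for_date] = self._db.get(for_date)
        return self._cache[for_date]

    def truncate(self, cutoff_date, inclusive=True):
        def removed(d):
            return d > cutoff_date or (inclusive and d == cutoff_date)
        for d in [d for d in self._db if removed(d)]:
            del self._db[d]
        for d in [d for d in self._cache if removed(d)]:
            del self._cache[d]  # evict exactly the removed dates

# the problematic sequence now behaves correctly:
mgr = DataManager()
day = datetime.date(2024, 1, 2)
mgr.write(day, "payload")
assert mgr.read(day) == "payload"
mgr.truncate(day, inclusive=True)
assert mgr.read(day) is None  # cache entry was evicted along with the row

One caveat with this sketch: the dict cache is unbounded. If memory is a concern, it could be swapped for a bounded LRU structure, and the evict-on-truncate logic would stay the same.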
What other approaches can I consider here?