I trained a recommender using the User-Personalization recipe in Amazon Personalize. The Flask app that serves the front end fetches the list of recommended products by calling get_recommendations(). In addition, user interactions, encoded as Segment events, are passed back to the Personalize event tracker using an AWS Lambda following the guidance here. I want to use the put_events() API to write impression data back to the event tracker so that users don't see the same items over and over again even when they don't interact with them. My problem is that I can't figure out how to bring together the output of get_recommendations() (called in the Flask app) and the information about user interactions with items (handled in the AWS Lambda, since these interactions are encoded as Segment events).
I can call put_events() from the Flask app right after calling get_recommendations(), and use the output of the latter to specify which products were seen, but it seems that the intended use of put_events() is to provide one itemId that the user interacted with, plus a list of itemIds that the user saw and did NOT interact with. However, the information about which item the user interacted with is captured in the Lambda, not the Flask app. Similarly, the AWS Lambda knows what action the user took, but does not know the full list of recommended items that were shown, and does not have access to the recommendationId.
I feel like this scenario isn't too exotic and I suspect I'm overlooking something fairly obvious. I'm wondering if other people have run into a similar situation and how they manage the flow of information through such a system.
Impression data for Personalize is used to inform exploration.
Impression data is not used to exclude items that were seen by a user in subsequent calls to GetRecommendations. Exploration will vary the cold items that are recommended to a user from call to call, but for users with long interaction histories, more relevant warm items will likely be more prevalent in recommendations.
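For reference, here is a minimal sketch of how an impression could be attached to an interaction event sent through put_events(). It assumes the shown item IDs and the recommendationId returned by get_recommendations() have somehow been made available to the code that sends the event (for example by having the front end carry them along as Segment event properties, which is an assumption about your setup, not something Personalize does for you). The helper names build_impression_event and send_event are hypothetical; the client is passed in so that no AWS call is made until you supply a real one.

```python
from datetime import datetime, timezone


def build_impression_event(event_type, item_id, shown_item_ids,
                           recommendation_id=None, sent_at=None):
    """Build one entry for the eventList argument of put_events().

    shown_item_ids is the list of items that were displayed to the user
    (assumed to come from the itemList of a get_recommendations() response).
    """
    event = {
        "eventType": event_type,
        "itemId": item_id,
        "sentAt": sent_at or datetime.now(timezone.utc),
        "impression": list(shown_item_ids),
    }
    # recommendationId is optional; include it only when it is known.
    if recommendation_id:
        event["recommendationId"] = recommendation_id
    return event


def send_event(events_client, tracking_id, user_id, session_id, event):
    """Send a single event to the Personalize event tracker.

    events_client is assumed to be boto3.client("personalize-events").
    """
    return events_client.put_events(
        trackingId=tracking_id,
        userId=user_id,
        sessionId=session_id,
        eventList=[event],
    )


# Hypothetical usage (the IDs below are placeholders):
# import boto3
# events_client = boto3.client("personalize-events")
# event = build_impression_event("click", "item-3",
#                                ["item-1", "item-2", "item-3"],
#                                recommendation_id="rec-id-from-response")
# send_event(events_client, "your-tracking-id", "user-1", "session-1", event)
```

Keeping the event construction separate from the call makes it possible to build the event wherever the pieces come together (Flask app or Lambda) and to check it without touching AWS.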
If you want to forcibly exclude items that a user has recently seen, you could use a Personalize filter to exclude those items by their item ID. To do this, you'd have to add a column to the items dataset (e.g., Items.ITEM_ID_FOR_FILTERING) and then use a dynamic filter to pass in the item IDs to exclude.
There is a maximum length of 1000 characters for the filter value ITEM_IDS, so you'll only be able to send as many item IDs as will fit in 1000 characters (including commas).
You should also exclude the Items.ITEM_ID_FOR_FILTERING column from training since it's only needed for filtering and doesn't have any value for training.
Excluding all items that a user has seen is not that common since you will be excluding the most relevant items for the user. Eventually, the recommendations will be dominated by the least relevant items to the user.
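As a rough sketch of the filtering approach, the code below assumes a dynamic filter has been created with an expression along the lines of EXCLUDE ItemID WHERE Items.ITEM_ID_FOR_FILTERING IN ($ITEM_IDS), where ITEM_IDS is the placeholder name. It packs as many recently seen item IDs as fit within the 1000-character limit mentioned above and passes them as the filter value. The helper names, the campaign and filter ARNs, and the decision to count the quote characters toward the limit (a conservative choice) are all assumptions for illustration.

```python
MAX_FILTER_VALUE_CHARS = 1000  # limit quoted in the answer above


def build_item_ids_value(item_ids, max_chars=MAX_FILTER_VALUE_CHARS):
    """Join item IDs as "id1","id2",... without exceeding max_chars.

    IDs are taken in order, so put the most recently seen items first.
    Quotes and commas are counted toward the limit (conservative assumption).
    """
    parts = []
    length = 0
    for item_id in item_ids:
        quoted = '"' + str(item_id) + '"'
        extra = len(quoted) + (1 if parts else 0)  # +1 for the comma
        if length + extra > max_chars:
            break
        parts.append(quoted)
        length += extra
    return ",".join(parts)


def get_recs_excluding(runtime_client, campaign_arn, filter_arn, user_id,
                       seen_item_ids, num_results=10):
    """Call get_recommendations(), excluding recently seen items if any.

    runtime_client is assumed to be boto3.client("personalize-runtime").
    """
    params = {
        "campaignArn": campaign_arn,
        "userId": user_id,
        "numResults": num_results,
    }
    value = build_item_ids_value(seen_item_ids)
    # Only apply the filter when there is something to exclude.
    if value:
        params["filterArn"] = filter_arn
        params["filterValues"] = {"ITEM_IDS": value}
    return runtime_client.get_recommendations(**params)


# Hypothetical usage (ARNs and IDs are placeholders):
# import boto3
# runtime_client = boto3.client("personalize-runtime")
# response = get_recs_excluding(runtime_client, "your-campaign-arn",
#                               "your-filter-arn", "user-1",
#                               ["item-1", "item-2"])
# item_ids = [item["itemId"] for item in response["itemList"]]
```

Because of the character limit, this only works for a short window of recently seen items, which fits the point above that excluding everything a user has seen is usually not what you want.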