"Cell \u001b[0;32mIn[13], line 18\u001b[0m\n\u001b[1;32m 16\u001b[0m sim_options_jaccard \u001b[38;5;241m=\u001b[39m {\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mname\u001b[39m\u001b[38;5;124m'\u001b[39m: \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mjaccard\u001b[39m\u001b[38;5;124m'\u001b[39m}\n\u001b[1;32m 17\u001b[0m user_based_jaccard \u001b[38;5;241m=\u001b[39m KNNBasic(sim_options\u001b[38;5;241m=\u001b[39msim_options_jaccard)\n\u001b[0;32m---> 18\u001b[0m \u001b[43muser_based_jaccard\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfit\u001b[49m\u001b[43m(\u001b[49m\u001b[43mtrainset\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 20\u001b[0m \u001b[38;5;66;03m# Make predictions with each model on the test set\u001b[39;00m\n\u001b[1;32m 21\u001b[0m predictions_msd \u001b[38;5;241m=\u001b[39m user_based_msd\u001b[38;5;241m.\u001b[39mtest(testset)\n",
"print(\"RMSE with MSD similarity:\", rmse_msd)\n",
"print(\"RMSE with Jacard similarity:\", rmse_jacard)\n"
"print(\"RMSE with Jaccard similarity:\", rmse_jaccard)\n"
]
}
],
...
...
%% Cell type:markdown id:f4a8f664 tags:
# Custom User-based Model
This notebook aims to create a `UserBased` class that inherits from the `AlgoBase` class (Surprise package) and can be customized with various similarity metrics, peer groups, and score aggregation functions.
%% Cell type:code id:00d1b249 tags:
``` python
# reloads modules automatically before entering the execution of code
%load_ext autoreload
%autoreload 2

# standard library imports
# -- add new imports here --

# third parties imports
import numpy as np
import pandas as pd
# -- add new imports here --

# local imports
from constants import Constant as C
from loaders import load_ratings, load_items  # check whether these are needed
```
Whatever the number of neighbours (1, 2, 3), the rating value does not change.
%% Cell type:markdown id:c8890e11 tags:
1) Predictions with min_k = 1: A single neighbor with positive similarity is enough to produce a prediction, so no real minimum is enforced. Predicted values therefore vary widely from one item to the next. For user 15, for instance, the predicted rating is 3.777 for item 942 but only 0.922 for item 64. Some of these predictions may rest on the rating of a single neighbor, which can make them erratic.
2) Predictions with min_k = 2: At least 2 neighbors are required to make a prediction. This acts as a mild form of regularization, since each prediction must rest on a slightly broader consensus. Most predictions stay close to those obtained with min_k = 1, but some change. For example, the rating for item 5054 changes from 3.010 to 2.694.
3) Predictions with min_k = 3: With a minimum of 3 neighbors, the model is more conservative still, and the predicted ratings become more uniform. For example, for item 6322 the prediction changes from 1.711 (min_k = 1) to 2.694 (min_k = 2) and stays at 2.694 (min_k = 3). The recurrence of 2.694 across items is consistent with a fallback value: when fewer than min_k neighbors are available, Surprise's KNNBasic cannot make a neighborhood-based prediction and returns the default prediction, which is the global mean of the training ratings.
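The effect of min_k described above can be illustrated with a minimal, self-contained sketch. It is a simplified imitation of a KNN-style estimate, not Surprise's actual code; the function name `predict_knn`, its arguments, and the sample values are hypothetical (the global mean of 2.694 is borrowed from the predictions above purely for illustration).

```python
def predict_knn(neighbors, k=3, min_k=1, global_mean=0.0):
    """Similarity-weighted average of neighbor ratings (illustrative sketch).

    neighbors: list of (similarity, rating) pairs for users who rated the item.
    Returns (estimate, actual_k), where actual_k is the number of neighbors used.
    """
    # Keep the k most similar neighbors
    top_k = sorted(neighbors, key=lambda pair: pair[0], reverse=True)[:k]
    # Only neighbors with strictly positive similarity contribute
    used = [(sim, rating) for sim, rating in top_k if sim > 0]
    # Too few usable neighbors: fall back to the global mean (assumed fallback)
    if len(used) < min_k:
        return global_mean, len(used)
    total_sim = sum(sim for sim, _ in used)
    estimate = sum(sim * rating for sim, rating in used) / total_sim
    return estimate, len(used)


# Hypothetical item rated by a single neighbor of the target user
single_neighbor = [(0.5, 1.0)]
est_min_k_1, k_1 = predict_knn(single_neighbor, min_k=1, global_mean=2.694)
est_min_k_2, k_2 = predict_knn(single_neighbor, min_k=2, global_mean=2.694)
print(est_min_k_1, k_1)  # neighborhood-based estimate from one neighbor
print(est_min_k_2, k_2)  # fallback to the global mean
```

With min_k = 1 the single neighbor's rating drives the estimate; with min_k = 2 the same item receives the global mean, which is why many items can end up with an identical prediction as min_k grows.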
%% Cell type:code id:cc806424 tags:
``` python
def analyse_min_support(knn_model, testset):
    # Reset min_k to 2
    knn_model.min_k = 2
    # Vary min_support from 1 to 3 and observe actual_k
    for min_support in range(1, 4):
        # Note: min_support is applied when the similarity matrix is computed,
        # so the model must be refit for the new value to take effect
        knn_model.sim_options['min_support'] = min_support
        predictions_min_support = knn_model.test(testset[:30])  # Take the first 30 predictions for display
        print(f"\nPredictions with min_support = {min_support}:")
```
File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/surprise/prediction_algorithms/algo_base.py:248, in AlgoBase.compute_similarities(self)
247 print(f"Computing the {name} similarity matrix...")
--> 248 sim = construction_func[name](*args)
249 if getattr(self, "verbose", False):
KeyError: 'jaccard'
During handling of the above exception, another exception occurred:
20 # Make predictions with each model on the test set
21 predictions_msd = user_based_msd.test(testset)
File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/surprise/prediction_algorithms/knns.py:98, in KNNBasic.fit(self, trainset)
95 def fit(self, trainset):
97 SymmetricAlgo.fit(self, trainset)
---> 98 self.sim = self.compute_similarities()
100 return self
File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/surprise/prediction_algorithms/algo_base.py:253, in AlgoBase.compute_similarities(self)
251 return sim
252 except KeyError:
--> 253 raise NameError(
254 "Wrong sim name "
255 + name
256 + ". Allowed values "
257 + "are "
258 + ", ".join(construction_func.keys())
259 + "."
260 )
NameError: Wrong sim name jaccard. Allowed values are cosine, msd, pearson, pearson_baseline.
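
The error message above lists the similarity names accepted by `sim_options`, and `jaccard` is not among them, so a Jaccard similarity has to be computed separately, for example inside a custom `AlgoBase` subclass. A minimal sketch of the computation itself is given below, in plain Python and independent of Surprise. The function name `jaccard_similarities`, the `min_support` argument, and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def jaccard_similarities(user_items, min_support=1):
    """Jaccard similarity between every pair of users (illustrative sketch).

    user_items: dict mapping a user id to the set of item ids that user rated.
    min_support: minimum number of common items; below it the similarity is 0.
    Returns a dict mapping (user_a, user_b) to the similarity, in both orders.
    """
    sims = {}
    for u, v in combinations(sorted(user_items), 2):
        common = user_items[u] & user_items[v]
        union = user_items[u] | user_items[v]
        if len(common) < min_support or not union:
            sim = 0.0
        else:
            # |intersection| / |union| of the two users' rated-item sets
            sim = len(common) / len(union)
        sims[(u, v)] = sims[(v, u)] = sim
    return sims


# Hypothetical toy data: which items each user has rated
user_items = {"alice": {1, 2, 3}, "bob": {2, 3, 4}, "carol": {5}}
sims = jaccard_similarities(user_items)
print(sims[("alice", "bob")])    # 2 common items out of 4 distinct items
print(sims[("alice", "carol")])  # no item in common
```

Note that this variant looks only at which items were rated and ignores the rating values. Whether that is the intended definition for the notebook is an open choice.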