diff --git a/evaluator.ipynb b/evaluator.ipynb
index 493a5b3792df3252e1e2df57a16e286525b14847..c6e61513a1ea1a1ff88286fa764f36f771de2175 100644
--- a/evaluator.ipynb
+++ b/evaluator.ipynb
@@ -313,31 +313,31 @@
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>baseline_1</th>\n",
-       "      <td>1.563822</td>\n",
-       "      <td>1.787365</td>\n",
-       "      <td>0.046729</td>\n",
+       "      <td>1.517749</td>\n",
+       "      <td>1.745787</td>\n",
+       "      <td>0.056075</td>\n",
        "      <td>99.405607</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>baseline_2</th>\n",
-       "      <td>1.535869</td>\n",
-       "      <td>1.866364</td>\n",
-       "      <td>0.018692</td>\n",
+       "      <td>1.472806</td>\n",
+       "      <td>1.805674</td>\n",
+       "      <td>0.000000</td>\n",
        "      <td>429.942991</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>baseline_3</th>\n",
-       "      <td>0.871233</td>\n",
-       "      <td>1.081468</td>\n",
-       "      <td>0.037383</td>\n",
+       "      <td>0.868666</td>\n",
+       "      <td>1.076227</td>\n",
+       "      <td>0.093458</td>\n",
        "      <td>99.405607</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>baseline_4</th>\n",
-       "      <td>0.729477</td>\n",
-       "      <td>0.926489</td>\n",
-       "      <td>0.158879</td>\n",
-       "      <td>60.583178</td>\n",
+       "      <td>0.713063</td>\n",
+       "      <td>0.912046</td>\n",
+       "      <td>0.074766</td>\n",
+       "      <td>60.349533</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
@@ -345,10 +345,10 @@
       ],
       "text/plain": [
        "                 mae      rmse  hit_rate     novelty\n",
-       "baseline_1  1.563822  1.787365  0.046729   99.405607\n",
-       "baseline_2  1.535869  1.866364  0.018692  429.942991\n",
-       "baseline_3  0.871233  1.081468  0.037383   99.405607\n",
-       "baseline_4  0.729477  0.926489  0.158879   60.583178"
+       "baseline_1  1.517749  1.745787  0.056075   99.405607\n",
+       "baseline_2  1.472806  1.805674  0.000000  429.942991\n",
+       "baseline_3  0.868666  1.076227  0.093458   99.405607\n",
+       "baseline_4  0.713063  0.912046  0.074766   60.349533"
       ]
      },
      "execution_count": 4,
@@ -375,6 +375,52 @@
     "evaluation_report = create_evaluation_report(EvalConfig, sp_ratings, precomputed_dict, AVAILABLE_METRICS)\n",
     "export_evaluation_report(evaluation_report)"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9fbf23fd",
+   "metadata": {},
+   "source": [
+    "Several observations can be made from the baseline results above.\n",
+    "\n",
+    "On Mean Absolute Error (MAE), baseline_4 is the most accurate at 0.713063, followed by baseline_3 at 0.868666.\n",
+    "\n",
+    "The Root Mean Squared Error (RMSE) ranking is the same: baseline_4 again performs best at 0.912046, with baseline_3 next at 1.076227.\n",
+    "\n",
+    "On hit rate, baseline_3 leads at 9.35%, ahead of baseline_4 at 7.48% and baseline_1 at 5.61%, while baseline_2 registers no hits at all.\n",
+    "\n",
+    "On novelty, baseline_4 scores lowest at 60.35, indicating the most conventional recommendations. baseline_1 and baseline_3 both score 99.41, and baseline_2 is by far the highest at 429.94, meaning it recommends the least mainstream items.\n",
+    "\n",
+    "In summary, baseline_4 leads on rating accuracy (MAE and RMSE) but produces the least novel recommendations; baseline_3 achieves the best hit rate; and baseline_2, despite trailing on accuracy and registering no hits, stands out as by far the most novel.\n",
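+    "\n",
+    "As a reference for how these four metrics are typically computed, a minimal sketch follows. The notebook's actual AVAILABLE_METRICS implementations are not shown in this diff, so the function names and signatures below are illustrative assumptions (novelty is assumed to be the mean popularity rank of recommended items, and hit rate assumes one held-out item per user):\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "\n",
+    "def mae(y_true, y_pred):\n",
+    "    # Mean absolute error over all rating predictions.\n",
+    "    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))\n",
+    "\n",
+    "def rmse(y_true, y_pred):\n",
+    "    # Root mean squared error; squaring penalizes large errors more than\n",
+    "    # MAE does, so RMSE >= MAE on the same predictions.\n",
+    "    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))\n",
+    "\n",
+    "def hit_rate(top_k_per_user, held_out_per_user):\n",
+    "    # Fraction of users whose single held-out item appears in their top-k list.\n",
+    "    hits = sum(1 for user, items in top_k_per_user.items()\n",
+    "               if held_out_per_user.get(user) in items)\n",
+    "    return hits / len(top_k_per_user)\n",
+    "\n",
+    "def novelty(top_k_per_user, item_popularity_rank):\n",
+    "    # Mean popularity rank of recommended items; higher values mean the\n",
+    "    # recommendations are drawn from less mainstream (more novel) items.\n",
+    "    ranks = [item_popularity_rank[item]\n",
+    "             for items in top_k_per_user.values() for item in items]\n",
+    "    return float(np.mean(ranks))\n",
+    "```"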
+   ]
   }
  ],
  "metadata": {