diff --git a/README.md b/README.md
index 76390b13c448810a58b60cba77d4ac2d91469dd0..b77ffff2ffd8cc8bab08f1a4816376a88b006b9a 100644
--- a/README.md
+++ b/README.md
@@ -63,6 +63,8 @@ The project is organized into the following key components:
 
 ### Features and Models
 
+***models.py*** and ***content_based.ipynb***
+
 1. ***Feature Extraction Methods***
 The system supports the following feature extraction methods:
  - `genre`: Extracts genres of the movies using TF-IDF vectorization.
@@ -106,6 +108,25 @@ Definition of a dictionary containing available metrics for different evaluation
 
 Loading data, generating evaluation reports for different models, and saving experimental outcomes.
 
+### Hackathon_make_predictions.ipynb
+
+It defines a function make_hackathon_prediction that takes feature_method and regressor_method as input.
+
+Inside this function:
+
+ - It loads the training data and converts it into the format suitable for Surprise.
+
+ - Trains a Content-Based model (ContentBased) on the training set using the specified feature and regressor methods.
+
+ - Makes predictions on the test set by loading the test data from a CSV file and converting it into records.
+
+Converts the predictions into a DataFrame and saves them as a CSV file.
+
+It then calls this function with specific parameters and prints the generated predictions.
+
+
+
+
 ### Dataset
 Organized under the data/test/ directory, which contains three subdirectories:
 - content: Contains movies.csv & tags.csv files.