Commit fa160784 authored by Olivier Leblanc's avatar Olivier Leblanc

up 22-23

parent a1de7063
%% Cell type:code id: tags:
``` python
# !pip install scikit-learn
```
%% Cell type:code id: tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
import soundfile as sf
from scipy import signal
import sounddevice as sd
"Machine learning tools"
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.decomposition import PCA
import pickle
"Self created functions"
from utils_ import getclass, getname, gen_allpath, plot_audio, plot_specgram, get_accuracy, show_confusion_matrix, plot_decision_boundaries
from AudioUtil_And_Dataset_student import AudioUtil, SoundDS
```
%% Cell type:markdown id: tags:
Useful functions to select, read and play the dataset sounds are provided in the ``utils_`` and ``AudioUtil_And_Dataset`` modules. <br>
As for H1, you will have to fill in some short pieces of code and answer some questions. We already created cells for you to answer the questions so that you don't forget ;). <br>
You will find the zones to fill marked with ``### TO COMPLETE`` in the cells below.
<font size=6 color=#009999> 3. Probability vector and memory [~30min-1h] </font> <br>
%% Cell type:code id: tags:
``` python
### TO COMPLETE
path2dataset = ... # Write your path to the dataset here!
allclassnames, allpath_mat = gen_allpath(path2dataset)
"Select only some classes for the classification"
sel_class = [12,14,40,41,49]
nonsel_class = np.delete(np.arange(allpath_mat.shape[0]), sel_class)
allpath_sel = np.array([allpath_mat[idx,:] for idx in sel_class])
classnames = np.array([allclassnames[idx] for idx in sel_class])
sel_class_ids = np.arange(len(sel_class))
all_sel_class_ids = np.repeat(sel_class_ids, 40)
data_path = allpath_sel.reshape(-1)
print('The selected classes are {}\n'.format(classnames))
allpath_nonsel = np.array([allpath_mat[idx,:] for idx in nonsel_class])
classnames_nonsel = np.array([allclassnames[idx] for idx in nonsel_class])
nonsel_class_ids = np.arange(len(nonsel_class))
all_nonsel_class_ids = np.repeat(nonsel_class_ids, 40)
data_path_nonsel = allpath_nonsel.reshape(-1)
```
%% Cell type:markdown id: tags:
<font size=5 color=#009999> 3.1. Probability vector </font> <br>
A clear drawback of the models considered in ``H3a_audio.ipynb`` is that they output only the most probable class, without any confidence estimate for this prediction. It is generally better to output a vector of probabilities over all the classes at each prediction, allowing the model to hesitate between different classes.
Recall that a probability vector is defined by
\begin{equation*}
\mathbb P \{ i \} \in [0,1], ~~\sum_i \mathbb P \{ i \} = 1.
\end{equation*}
There are several ways to obtain such probabilities:
- **Adapt the model**: e.g. for the ``KNN`` classifier, define the probability of class ``i`` as the ratio between the number of neighbours with label ``i`` and the total number of considered neighbours ``K``.
- **Use other models**: a ``CNN`` classifier is well suited to outputting a probability value for each class.
- **Compare with past predictions**: the probability of class ``i`` may simply be given as the ratio of its appearances among the (arbitrarily chosen) ``N`` last predictions.

The last bullet relies on past predictions to compute a probability estimate. This leads to the notion of memory in the predictions, which we discuss in the second part of this notebook.
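As a sketch of the first bullet (the data and variable names here are illustrative, not from the project code), the neighbour-counting rule can be written directly, and for a uniform-weight KNN it matches what scikit-learn's ``predict_proba`` returns:

``` python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy 2-class data: class 0 around (0, 0), class 1 around (3, 3)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

K = 5
knn = KNeighborsClassifier(n_neighbors=K).fit(X, y)

x_test = np.array([[1.5, 1.5]])  # ambiguous point between the two clusters

# Manual rule: P(i) = (number of neighbours with label i) / K
_, idx = knn.kneighbors(x_test)
counts = np.bincount(y[idx[0]], minlength=2)
proba_manual = counts / K

proba_sklearn = knn.predict_proba(x_test)[0]
print(proba_manual, proba_sklearn)  # identical for uniform weights
```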
%% Cell type:markdown id: tags:
Let us start by creating a dataset ``myds`` and loading the model trained in ``H3a_audio_student.ipynb``.
Don't forget to normalize your feature vectors and to reduce their dimensionality if you trained your model on such data.
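In the notebook these transforms are handled by the ``normalize`` and ``pca`` options of ``SoundDS``; as a generic illustration (all names below are illustrative), the key point is that the scaler and PCA fitted on the training features must be reapplied unchanged to every new feature vector:

``` python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 20))  # stand-in for training feature vectors

scaler = StandardScaler().fit(X_train)            # fit on training data only
pca = PCA(n_components=8).fit(scaler.transform(X_train))

def transform_fv(fv):
    """Apply the *same* scaler and PCA to any new feature vector."""
    return pca.transform(scaler.transform(fv.reshape(1, -1)))[0]

fv_new = rng.normal(size=20)
print(transform_fv(fv_new).shape)  # (8,)
```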
%% Cell type:code id: tags:
``` python
### TO COMPLETE - Uncomment the following line
# model_knn = pickle.load(open(..., 'rb')) # Write your path to the model here!
normalize = True
pca = None
"Creation of the dataset"
myds = SoundDS(all_sel_class_ids, Nft=512, nmel=20, duration=750, shift_pct=0.0, data_path=data_path, allpath_mat=allpath_mat, normalize=normalize, pca=pca)
```
%% Cell type:markdown id: tags:
Open-source code makes life easy! The ``KNeighborsClassifier`` from scikit-learn already provides a ``predict_proba`` method. Start building some intuition about this probability vector by playing with the chosen feature vector. <br>
Run the following cell several times, changing ``ind``. <br>
%% Cell type:code id: tags:
``` python
### TO RUN
ind = 0
myds.display(ind)
thisfv = myds[ind][0]
prediction_knn = model_knn.predict([thisfv])
print('Class predicted by the model:', classnames[prediction_knn][0])
proba_knn = model_knn.predict_proba([thisfv])
plt.figure()
plt.bar(classnames, proba_knn[0])
plt.title('Probability of each class')
plt.show()
```
%% Cell type:markdown id: tags:
**Question**:
When the classifier mispredicts, how is the probability mass distributed over the classes? Is that good news? How can we exploit this probability distribution in the prediction?
%% Cell type:code id: tags:
``` python
### TO COMPLETE
# Answer the question above
```
%% Cell type:markdown id: tags:
Play again with ``ind`` and observe how the probability vector of the KNN classifier is distributed for feature vectors coming from sounds that are not part of the training classes. <br>
Don't forget to normalize your feature vectors and to reduce their dimensionality if you trained your model on such data.
%% Cell type:code id: tags:
``` python
### TO RUN
myds_nonsel = SoundDS(all_nonsel_class_ids, Nft=512, nmel=20, duration=750, shift_pct=0.0, data_path=data_path_nonsel, allpath_mat=allpath_mat, normalize=normalize, pca=pca)
ind = 1 # 1 (default)
myds_nonsel.display(ind)
thisfv = myds_nonsel[ind][0]
prediction_knn = model_knn.predict([thisfv])
print('Class predicted by the model:', classnames[prediction_knn][0])
proba_knn = model_knn.predict_proba([thisfv])
plt.figure()
plt.bar(classnames, proba_knn[0])
plt.title('Probability of each class')
plt.show()
```
%% Cell type:markdown id: tags:
### Question:
Is the classification model confident in its predictions?
%% Cell type:code id: tags:
``` python
### TO COMPLETE
# Answer the questions above
```
%% Cell type:markdown id: tags:
<font size=6 color=#009999> 3.2. Memory </font> <br>
Whether the predictions are single classes or probability vectors, consecutive feature vectors are likely to belong to the same class as long as the sound type changes more slowly than the duration of one feature vector. It can therefore be helpful to link consecutive predictions and measure how similar they are, to either strengthen or weaken our confidence in the current guess. <br>
Here, we compare the predictions made on consecutive feature vectors belonging to the same 5 s long sound.
Run the following code with different values of ``class_id`` and ``num``.
%% Cell type:code id: tags:
``` python
### TO RUN
class_id = 14 # 12:fire, 14:bird (default), 40:helicopter, 41:chainsaw, 49:handsaw
num = 0 # 0 (default), 1, ..., 39
bird_sound = allpath_mat[class_id,num]
aud = AudioUtil.open(bird_sound)
aud = AudioUtil.resample(aud, 11025)
AudioUtil.play(aud)
"Bar charts for each window"
n_win = 5
probs = np.zeros((n_win, len(classnames)))
for window in range(n_win):
sub_aud = (aud[0][window*11025:], aud[1])
sub_aud = AudioUtil.pad_trunc(sub_aud, 750)
sgram = AudioUtil.melspectrogram(sub_aud, Nmel=20)
ncol = int(1000*11025 /(1e3*512))
sgram = sgram[:, :ncol]
fv = sgram.reshape(-1)
### TO COMPLETE - Eventually normalize and reduce feature vector dimensionality
probs[window,:] = model_knn.predict_proba([fv])[0]
"Mean bar chart"
plt.figure()
for window in range(n_win):
plt.bar(np.arange(len(classnames))*2*n_win+window, probs[window,:], alpha=0.9, label='Window {}'.format(window))
plt.legend()
plt.gca().set_xticks(np.arange(len(classnames))*2*n_win+2)
plt.gca().set_xticklabels(classnames)
plt.show()
plt.figure()
plt.bar(np.arange(len(classnames)), np.mean(probs, axis=0))
plt.gca().set_xticks(np.arange(len(classnames)))
plt.gca().set_xticklabels(classnames)
plt.show()
```
%% Cell type:markdown id: tags:
### Question:
Are the bar plots similar across the 5 windows of the same 5 s long sound? With the default sound, how often does the correct class win?
%% Cell type:code id: tags:
``` python
### TO COMPLETE
# Answer the question above
```
%% Cell type:markdown id: tags:
If it is relevant to combine $N$ consecutive feature vectors, there are many ways to derive a prediction from them:
- **Naive**: select the class reaching the single highest probability across all the considered feature vectors.
- **Majority voting**: select the class that most often has the maximum probability within a feature vector.
- **Average the feature representation**: compute the average of all feature vectors and classify this average.
- **Maximum Likelihood**: take a probabilistic approach and select class $i$ as
$$
\text{argmax}_i~ \log \big(\prod_{n=0}^{N-1} P(y[n]=i) \big)
= \text{argmax}_i~ \sum_{n=0}^{N-1} \log P(y[n]=i)
$$
with $y[n]$ the model prediction for feature vector $n$.
It will be part of your work to decide how you want to exploit the time information in your predictions.
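Three of the strategies above can be sketched directly on a matrix of probability vectors like the ``probs`` array built earlier (the numbers here are illustrative; feature averaging is omitted since it happens before classification):

``` python
import numpy as np

# probs[n, i] = P(y[n] = i): N = 4 consecutive windows, 3 classes
probs = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.7, 0.2, 0.1],
    [0.4, 0.4, 0.2],
])
eps = 1e-12  # avoid log(0) for the likelihood

# Naive: class of the single highest probability over all windows
naive = np.unravel_index(np.argmax(probs), probs.shape)[1]

# Majority voting: most frequent per-window winner
majority = np.bincount(np.argmax(probs, axis=1)).argmax()

# Maximum likelihood: argmax of the sum of log-probabilities
max_likelihood = np.argmax(np.sum(np.log(probs + eps), axis=0))

print(naive, majority, max_likelihood)  # prints: 0 0 0
```

With these made-up numbers all three rules agree on class 0, but they can disagree in general, e.g. when one window is very confident about a class the others never pick.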
%% Cell type:markdown id: tags:
Now you have all the necessary material to test a new classification model and to make an objective analysis of its performance. <br>
Follow the instructions on Moodle [written here](https://moodle.uclouvain.be/mod/assign/view.php?id=204607) to see what is expected in your ``sixth report (R6)``. <br>
Many other classification models are already implemented in scikit-learn; check the [SKlearn API](https://scikit-learn.org/stable/supervised_learning.html#supervised-learning). Don't hesitate to read opinions and discussions on forums, or even articles, to help you choose a model. The most motivated among you may even try more than one additional model; it is time smartly invested for the upcoming weeks of this project! We expect only one characterization in the ``R6``, though.
Also, don't hesitate to look online to learn how people usually deal with sound classification. We mention for example the idea of transfer learning, which could be interesting for you in the second semester: https://www.youtube.com/watch?v=uCGROOUO_wY&t=1s
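As a sketch of swapping in another model (the synthetic data and hyperparameters are illustrative only), most scikit-learn classifiers expose the same ``fit``/``predict_proba`` interface, so trying an alternative requires almost no code change:

``` python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for your feature vectors: 5 classes, 20-dimensional features
X, y = make_classification(n_samples=200, n_features=20, n_classes=5,
                           n_informative=10, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validated accuracy
print(scores.mean())
```

Cross-validation gives a more robust accuracy estimate than a single train/test split, which is useful when comparing several candidate models for the report.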
@@ -3,21 +3,21 @@
# Modified for documentation by Jaques Grobler
# License: BSD 3 clause

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import ListedColormap
from sklearn.datasets import make_circles, make_classification, make_moons
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
names = [
"Nearest Neighbors",
@@ -96,12 +96,14 @@ for ds_cnt, ds in enumerate(datasets):
        clf.fit(X_train, y_train)
        score = clf.score(X_test, y_test)

        # Plot the decision boundary.
        n = 80
        vec = np.linspace(-3, 3, n)
        Xtmp = np.meshgrid(vec, vec)
        Xtmp2 = np.array(Xtmp).reshape(2, n**2).T
        ax.contourf(
            Xtmp[0], Xtmp[1], clf.predict(Xtmp2).reshape(n, n), cmap=cm, alpha=0.8
        )

        # Plot the training points
        ax.scatter(
@@ -133,4 +135,4 @@ for ds_cnt, ds in enumerate(datasets):
        i += 1

plt.tight_layout()
plt.show()
import os

import matplotlib.pyplot as plt
import numpy as np

"For confusion matrix plot"
from seaborn import heatmap
from sklearn.metrics import confusion_matrix
@@ -15,40 +17,44 @@ Synthesis of the functions in :
- plot_specgram : Plot a spectrogram (2D matrix).
- get_accuracy : Compute the accuracy between prediction and ground truth.
- show_confusion_matrix : Plot confusion matrix.
- plot_decision_boundary : Plot decision boundary of a classifier.
"""
# ----------------------------------------------------------------------------------
def getclass(sound, format=".ogg"):
    """Get class name of a sound path directory.

    Note: this function is only compatible with ESC-50 dataset path organization.
    """
    L = len(format)
    folders = sound.split(os.path.sep)
    if folders[-1][-L:] == format:
        return folders[-2].split("-")[1]
    else:
        return folders[-1].split("-")[1]
def getname(sound):
    """
    Get name of a sound from its path directory.
    """
    return os.path.sep.join(sound.split(os.path.sep)[-2:])
def gen_allpath(folder=r"Dataset_ESC-50"):
    """
    Create a matrix with path names of height H=50 classes and W=40 sounds per class
    and an array with all class names.
    """
    classpath = [f.path for f in os.scandir(folder) if f.is_dir()]
    classpath = sorted(classpath)
    allpath = [None] * len(classpath)
    classnames = [None] * len(classpath)
    for ind, val in enumerate(classpath):
        classnames[ind] = getclass(val).strip()
        sublist = [None] * len([f.path for f in os.scandir(val)])
        for i, f in enumerate(os.scandir(val)):
            sublist[i] = f.path
        allpath[ind] = sublist
@@ -56,73 +62,89 @@ def gen_allpath(folder=r"Dataset_ESC-50"):
    allpath = np.array(allpath)
    return classnames, allpath
def plot_audio(audio, audio_down, fs, fs_down):
    """
    Plot the temporal and spectral representations of the original audio signal
    and its downsampled version.
    """
    M = fs // fs_down  # Downsampling factor
    L = len(audio)
    L_down = len(audio_down)
    frequencies = np.arange(-L // 2, L // 2, dtype=np.float64) * fs / L
    frequencies_down = (
        np.arange(-L_down // 2, L_down // 2, dtype=np.float64) * fs_down / L_down
    )
    spectrum = np.fft.fftshift(np.fft.fft(audio))
    spectrum_down = np.fft.fftshift(np.fft.fft(audio_down))

    fig = plt.figure(figsize=(12, 4))
    ax1 = fig.add_axes([0.0, 0.0, 0.42, 0.9])
    ax2 = fig.add_axes([0.54, 0.0, 0.42, 0.9])
    ax1.plot(np.arange(L) / fs, audio, "b", label="Original")
    ax1.plot(np.arange(L_down) / fs_down, audio_down, "r", label="Downsampled")
    ax1.legend()
    ax1.set_xlabel("Time [s]")
    ax1.set_ylabel("Amplitude [-]")
    ax1.set_title("Temporal signal")
    ax2.plot(frequencies, np.abs(spectrum), "b", label="Original")
    ax2.plot(
        frequencies_down, np.abs(spectrum_down) * M, "r", label="Downsampled", alpha=0.5
    )  # energy scaling by M
    ax2.legend()
    ax2.set_xlabel("Frequency [Hz]")
    ax2.set_ylabel("Amplitude [-]")
    ax2.set_title("Modulus of FFT")
    plt.show()

    plt.figure(figsize=(12, 4))
    plt.plot(np.arange(L) / fs, audio, "b", label="Original")
    plt.plot(np.arange(L_down) / fs_down, audio_down, "r", label="Downsampled")
    plt.legend()
    plt.xlabel("Time [s]")
    plt.ylabel("Amplitude [-]")
    plt.title("Zoom on Temporal signal")
    plt.xlim([0, 0.0025])
def plot_specgram(
    specgram,
    ax,
    is_mel=False,
    title=None,
    xlabel="Time [s]",
    ylabel="Frequency [Hz]",
    cmap="jet",
    cb=True,
    tf=None,
):
    """Plot a spectrogram (2D matrix) in a chosen axis of a figure.

    Inputs:
        - specgram = spectrogram (2D array)
        - ax = current axis in figure
        - title
        - xlabel
        - ylabel
        - cmap
        - cb = show colorbar if True
        - tf = final time in xaxis of specgram
    """
    if tf is None:
        tf = specgram.shape[1]
    if is_mel:
        ylabel = "Frequency [Mel]"
        im = ax.imshow(
            specgram,
            cmap=cmap,
            aspect="auto",
            extent=[0, tf, specgram.shape[0], 0],
            origin="lower",
        )
    else:
        im = ax.imshow(
            specgram,
            cmap=cmap,
            aspect="auto",
            extent=[0, tf, int(specgram.size / tf), 0],
            origin="lower",
        )
    fig = plt.gcf()
    if cb:
        cbar = fig.colorbar(im, ax=ax)
        # cbar.set_label('log scale', rotation=270)
    ax.set_xlabel(xlabel)
@@ -130,22 +152,56 @@ def plot_specgram(specgram, ax, is_mel=False, title=None, xlabel='Time [s]', yla
    ax.set_title(title)
    return None
def get_accuracy(prediction, target):
    """
    Compute the accuracy between prediction and ground truth.
    """
    return np.sum(prediction == target) / len(prediction)
def show_confusion_matrix(y_predict, y_true2, classnames, title=""):
    """
    From target and prediction arrays, plot confusion matrix.
    The arrays must contain ints.
    """
    confmat = confusion_matrix(
        y_true2, y_predict, labels=np.arange(np.max(y_true2) + 1)
    )
    heatmap(
        confmat.T,
        square=True,
        annot=True,
        fmt="d",
        cbar=False,
        xticklabels=classnames,
        yticklabels=classnames,
    )
    plt.xlabel("True label")
    plt.ylabel("Predicted label")
    plt.title(title)
    plt.show()
    return None
def plot_decision_boundaries(
    X, y, model, ax=None, legend="", title="", s=20, N=40, cm="brg", edgc="k"
):
    """
    Plot decision boundaries of a classifier in 2D, and display true labels.
    """
    if ax is None:
        fig = plt.figure(figsize=(4, 4))
        ax = fig.add_axes([0.0, 0.0, 0.9, 1.0])
    ax.set_aspect("equal", adjustable="box")

    # Plot the decision boundary.
    n = 80
    vec = np.linspace(np.min(X), np.max(X), n)
    Xtmp = np.meshgrid(vec, vec)
    Xtmp2 = np.array(Xtmp).reshape(2, n**2).T
    ax.contourf(Xtmp[0], Xtmp[1], model.predict(Xtmp2).reshape(n, n), cmap=cm, alpha=0.5)
    scatterd = ax.scatter(X[:, 0], X[:, 1], c=y, cmap=cm, edgecolors=edgc, s=s)
    ax.set_title(title)
    ax.set_xlabel("$x_1$")
    ax.set_ylabel("$x_2$")
    handles, labels = scatterd.legend_elements(prop="colors")
    ax.legend(handles, legend)