Time-frequency analysis#

In this section we will discuss the representation of signals in time and frequency domain and how it can be useful for analysis.

Fourier Series Expansion#

Let’s revisit the sine wave example. It can be represented by its amplitude, frequency, and phase.

A periodic signal can be decomposed into a sum of sine waves with different amplitudes, frequencies, and phases. For example, below is a decomposition of a square wave.

\[\textrm{square wave} = sin(x) + sin(3x)/3 + sin(5x)/5 + sin(7x)/7 ...\]

Image Source: Wikipedia

The representation can be obtained by calculating the Fourier Series for the function.

Fourier Transform#

The analogue for this representation for discrete signals can be obtained through the Discrete Fourier Transform (DFT), which converts a signal from its time domain representation to its frequency domain representation. Similar to the above, we can interpret this operation as decomposing the signal into many different frequencies.

In practice, we usually use the Fast Fourier Transform (FFT) algorithm to compute the DFT. Let’s look at a couple examples below.

import numpy as np
import scipy as sp
from scipy.fft import fft, fftfreq
import matplotlib.pyplot as plt
# Time base used for all signals in this notebook
fs = 1000
end_time = 1
time = np.arange(fs)/fs  # 1 sec

Sine wave#

For a simple sine wave at 50 Hz, we expect to see just a single peak in the spectrum. However, below we actually see 2 peaks.

What is happening? It turns out that the FFT is symmetric and the maximum positive frequency is half the sampling frequency. This is also the frequency range within which we can accurately reconstruct the signal, based on the Nyquist-Shannon Sampling Theorem.

# Sine wave
freq = 50
cw = np.sin(2*np.pi*freq*time)
fig, (ax1, ax2) = plt.subplots(2)
ax1.plot(time, cw)
ax1.set_xlabel("Time (sec)", fontsize=10)
ax1.set_ylabel("Amplitude", fontsize=10)
ax1.set_title("Time domain")
fig.tight_layout()

markerline, _, _ = ax2.stem(fftfreq(fs, 1/fs), np.abs(fft(cw)), basefmt='b-')
markerline.set_markersize(5)
ax2.set_xlabel("Frequency (Hz)", fontsize=10)
ax2.set_ylabel("|FFT|", fontsize=10)
ax2.set_title("Frequency domain")
ax2.grid()
ax2.set_xticks(np.arange(-400, 451, 100))
fig.tight_layout()
../_images/32eaf13147253a9e32be3f9fec1c3f4dbca055969b948a2d8e18d64f3af6e00a.png

Chirp#

For a chirp signal with frequency sweeping from 50-150 Hz, below we can see that the FFT gives the spectrum showing the frequency content corresponding to this range. Here, only the right half of the symmetric FFT spectrum (the positive frequencies) is plotted.

# Chirp
chirp = sp.signal.chirp(time, f0=50, t1=1, f1=150)
# Chirp FFT
fig, (ax1, ax2) = plt.subplots(2)
ax1.plot(time, chirp)
ax1.set_title("Time domain")
ax1.set_xlabel("Time (sec)", fontsize=10)
ax1.set_ylabel("Amplitude", fontsize=10)
fig.tight_layout()

ax2.plot(fftfreq(fs, 1/fs)[:np.round(fs//2)], np.abs(fft(chirp))[:(fs//2)])
ax2.set_xlabel("Frequency (Hz)", fontsize=10)
ax2.set_ylabel("|FFT|", fontsize=10)
ax2.set_title("Frequency domain")
ax2.grid()
fig.tight_layout()
../_images/9f53d7d9fbbf0131ec09d15fe6310a5b41d6e2058d3259e48fc0915c0fc1308c.png

Note

The oscillatory pattern you see in the spectrum here is due to the sharp change of frequency content, known as the Gibbs phenomenon.

Short-Time Fourier Transform and the spectrogram#

The frequency content of a chirp changes over time. This is hard to see in the pure frequency domain representation shown above. To capture the temporally varying frequency content of a signal, we can split the signal into short segments, compute the FFT for each segment, and plot the magnitude spectum against both time and frequency. This operation is referred to as Short-Term Fourier Transform (STFT) and the result can be visualized as the spectrogram. Note that these segments can be overlapping, which can help capture the transient nature of the signal.

Below we take a look at the spectrogram of a few basic signal types.

Sine wave#

For a simple sine wave, since the frequency is contant, in the spectrogram we can see a horizontal line at 100 Hz.

plt.figure(figsize=(5,3))
s, s_freq, s_t, im = plt.specgram(cw, NFFT=256, Fs=fs, xextent=[time[0],time[-1]])
plt.title("Spectrogram of a sine wave")
plt.xlabel("Time (sec)", fontsize=10)
plt.ylabel("Frequency (Hz)")
plt.show()
../_images/a385a048b300eb5daaad6efa99635532d6d19c97fcefc9abcde9bdaf5cd4ecc1.png

Chirp#

For a chirp, rather than only seeing the spectrum of the entire signal, with the spectrogram we can see the gradual rise of frequency across time.

plt.figure(figsize=(5,3))
s, s_freq, s_t, im = plt.specgram(chirp, NFFT=256, Fs=fs)
plt.title("Spectrogram of a chirp")
plt.xlabel("Time (sec)", fontsize=10)
plt.ylabel("Frequency (Hz)", fontsize=10)
plt.show()
../_images/958eeb1adaa285d3a14f67b355bb9c6e68c81f61031e217a2ec4d4b61edbd674.png

Caution

Looking at the linear increase of frequency over time, we can that the line looks smooth but slightly wiggly. This is a result of the automatic interpolation of the plotting function, which can be turned off to correctly interpret the discrete values.

plt.figure(figsize=(5,3))
s, s_freq, s_t, im = plt.specgram(chirp, NFFT=256, Fs=fs, interpolation='none', mode='magnitude')
plt.title("Spectrogram of a chirp, without interpolation")
plt.xlabel("Time (sec)", fontsize=10)
plt.ylabel("Frequency (Hz)", fontsize=10)
out = plt.xticks(s_t)
../_images/2714b3f7508257710eb971e9b813ba6ca0af15de44210a1dc79214941eeebb00.png

The plt.specgram function has a lot of parameters which are hidden under the hood in the above function call that uses the default values for these parameters. You can see them in the screenshot below from the matplotlib.pyplot.specgram documentation page.

../_images/specgram_matplotlib.png

The key parameters that affect the resolution (in both time and frequency) of the output image are the number of points used to compute the STFT (argument NFFT), and whether these intervals overlap (noverlap). These together determine the number of time and frequency bins of the resulting spectrogram.

For example, in the above plot, the xticks are located at the center of the segment from which the FFT is computed. The spacing between two consecutive xticks is 128/1000 which corresponds to NFFT-noverlap)/N.

Caution

It is crucial to read the function documentation, because the exact implementation and arguments can vary across libraries that have the same method. For example, in addition to matplotlib, scipy also has the sciy.signal.spectrogram function. As you can see below, the arguments are similar but different!

../_images/spectrogram_scipy.png

We can use the widget below to gain a better understanding of how a spectrogram is created.

Impulse#

What about the spectrogram for an impulse? It is very interesting and drives home the message of why we should always read the documentation!

import warnings

impulse = sp.signal.unit_impulse(len(time), round(len(time)/2))
plt.figure(figsize=(5,3))

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide divided by zero warning
    s, s_freq, s_t, im = plt.specgram(
        impulse, NFFT=128, noverlap=0, Fs=fs, interpolation='none'
    )
plt.title("Spectrogram of an Impulse")
plt.xlabel("Time")
plt.ylabel("Frequency (Hz)")
plt.xticks(s_t)

# Only print out every third of the x ticks to avoid overlapping
locs, labels = plt.xticks()
new_labels = [f"{loc:.4f}" if i % 2 == 0 else "" for i, loc in enumerate(locs)]
plt.xticks(locs, new_labels)

plt.show()
../_images/a4d235d6ef34a9c787b6c2397e0ad43aa8350b3eec53af7a8543e0764ca6d3f8.png

Caution

What is going on? By default plt.specgram plots on the log scale (in dB). When the value is zero the resulting plot contains NaNs, which are the empty space in the above plot.

To avoid this problem, we can plot on the linear scale by setting scale='linear'. We should also check the type of spectrum used. By default the function uses mode="psd", a statistical estimate of the spectrum. We can plot the magnitude of the FFT by using mode='magnitude".

impulse = sp.signal.unit_impulse(len(time), round(len(time)/2))
plt.figure(figsize=(5,3))
s, s_freq, s_t, im = plt.specgram(
    impulse, NFFT=128, noverlap=0, Fs=fs, interpolation='none',
    scale='linear', mode='magnitude'
) 
plt.title("Spectrogram of an Impulse")
plt.xlabel("Time")
plt.ylabel("Frequency (Hz)")
plt.xticks(s_t)

# Only print out every third of the x ticks to avoid overlapping
locs, labels = plt.xticks()
new_labels = [f"{loc:.4f}" if i % 2 == 0 else "" for i, loc in enumerate(locs)]
plt.xticks(locs, new_labels)

plt.show()
../_images/d4c8eecca573bbfb21dd4fc1deeb7b7bd398f9aab12fc1dee44272c8bdce841a.png

We see that the spectrogram is positive only at the timestamp of the impulse, and have a frequency content that goes across all frequencies.

An call from an orca#

Now that we have a better understanding of the spectrogram, let’s look at one real example: a call of an orca from the Watkins Marine Mammal Sound Database.

# Listening to the call
import IPython
import IPython.display
IPython.display.Audio("../resources/92301007.wav")
# reading a buzz call
from scipy.io import wavfile
sample_rate, signal = wavfile.read("../resources/watkins_orca_92301007.wav")
plt.figure(figsize=(5,3))
s, s_freq, s_t, im = plt.specgram(signal, NFFT=512, Fs=sample_rate, interpolation='none')
plt.title("Spectrogram of an orca call")
plt.xlabel("Time (sec)", fontsize=10)
plt.ylabel("Frequency (Hz)", fontsize=10)
plt.show()
../_images/c73264761cb1b46505c399f6df4b1e2c00acf10abb8134efd9cf345c980d7e71.png

We can use the same widget as before to get a better idea of the different temporally transient components of the call, by observing how the spectrum changes with time and how the NFFT parameter affects the resolution.