Issue
I'm trying to plot the frequencies that make up the first 1 second of a voice recording.
My approach was to:
- Read the .wav file as a numpy array containing time series data
- Slice the array from [0:sample_rate-1], given that the sample rate has units of [samples/1 second], which implies that sample_rate [samples/seconds] * 1 [seconds] = sample_rate [samples]
- Perform a fast Fourier transform (FFT) on the time series array in order to get the frequencies that make up that time-series sample.
- Plot the frequencies on the x-axis and the amplitude on the y-axis. The frequency domain would range from 0:(sample_rate/2), since the Nyquist Sampling Theorem tells us that the sample rate must be at least two times the maximum frequency captured, i.e. 2*max(frequency). I'll also slice the frequency output array in half, since the output frequency data is symmetrical.
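The steps above can be sketched on a synthetic signal so that the expected peak is known in advance. This is a minimal sketch, not the code from the question: the 440 Hz test tone, the 8000 Hz sample rate, and the helper name `one_second_spectrum` are assumptions made for illustration. It uses NumPy's `np.fft.rfft` and `np.fft.rfftfreq`, which return the non-negative half of the spectrum and its matching frequency axis directly.

```python
import numpy as np

def one_second_spectrum(samples, sample_rate):
    """Return (freqs_hz, magnitudes) for the first second of a 1-D signal."""
    samples = np.asarray(samples)
    if samples.ndim != 1:
        raise ValueError("expected a 1-D (single-channel) signal")
    # sample_rate samples correspond to one second of audio.
    first_second = samples[:sample_rate]
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    magnitudes = np.abs(np.fft.rfft(first_second))
    # Frequency of each bin in Hz, from 0 up to sample_rate / 2.
    freqs_hz = np.fft.rfftfreq(len(first_second), d=1 / sample_rate)
    return freqs_hz, magnitudes

# Hypothetical test signal: two seconds of a 440 Hz sine sampled at 8000 Hz.
sample_rate = 8000
t = np.arange(2 * sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

freqs_hz, magnitudes = one_second_spectrum(tone, sample_rate)
peak_hz = freqs_hz[np.argmax(magnitudes)]
print(peak_hz)
```

For this tone the largest magnitude falls in the 440 Hz bin, and the frequency axis ends at sample_rate / 2.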
Here is my implementation
import matplotlib.pyplot as plt
import numpy as np
from scipy.fftpack import fft
from scipy.io import wavfile

sample_rate, audio_time_series = wavfile.read(audio_path)
single_sample_data = audio_time_series[:sample_rate]

def fft_plot(audio, sample_rate):
    N = len(audio)    # Number of samples
    T = 1/sample_rate # Period
    y_freq = fft(audio)
    domain = len(y_freq) // 2
    x_freq = np.linspace(0, sample_rate//2, N//2)
    plt.plot(x_freq, abs(y_freq[:domain]))
    plt.xlabel("Frequency [Hz]")
    plt.ylabel("Frequency Amplitude |X(t)|")
    return plt.show()

fft_plot(single_sample_data, sample_rate)
This is the plot that it generated:
However, this is incorrect: my spectrogram tells me I should have frequency peaks below the 5 kHz range:
In fact, what this plot is actually showing is the first second of my time series data:
I was able to confirm this by removing the absolute value function from y_freq when plotting it, and passing the entire audio signal into my fft_plot function:
...
sample_rate, audio_time_series = wavfile.read(audio_path)
single_sample_data = audio_time_series[:sample_rate]

def fft_plot(audio, sample_rate):
    N = len(audio) # Number of samples
    y_freq = fft(audio)
    domain = len(y_freq) // 2
    x_freq = np.linspace(0, sample_rate//2, N//2)
    # Changed from abs(y_freq[:domain]) -> y_freq[:domain]
    plt.plot(x_freq, y_freq[:domain])
    plt.xlabel("Frequency [Hz]")
    plt.ylabel("Frequency Amplitude |X(t)|")
    return plt.show()

# Changed from single_sample_data -> audio_time_series
fft_plot(audio_time_series, sample_rate)
The code sample above produced this plot:
Therefore, I think one of two things is going on:
- The fft() function is not actually performing an fft on the time series data it is being given
- The .wav file does not contain time series data to begin with
What could be the issue? Has anyone else experienced this?
Solution
I have essentially replicated the code in the question, and I don't see the problem the OP has described.
In [172]: %reset -f
     ...: import matplotlib.pyplot as plt
     ...: import numpy as np
     ...: from scipy.fftpack import fft
     ...: from scipy.io import wavfile
     ...:
     ...: sr, data = wavfile.read('sample.wav')
     ...: print(data.shape, sr)
     ...: signal = data[:sr,0]
     ...: Signal = fft(signal)
     ...: fig, (axt, axf) = plt.subplots(2, 1,
     ...:                                constrained_layout=1,
     ...:                                figsize=(11.8,3))
     ...: axt.plot(signal, lw=0.15) ; axt.grid(1)
     ...: axf.plot(np.abs(Signal[:sr//2]), lw=0.15) ; axf.grid(1)
     ...: plt.show()
(268237, 2) 8000
Hence, I'm voting for closing the question because it is "Not reproducible or was caused by a typo".
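One difference between the two listings may be worth checking: the code above selects a single channel with `data[:sr,0]`, while the question passes the sliced array to `fft` unchanged. `scipy.fftpack.fft` and `np.fft.fft` transform along the last axis by default, so if the OP's recording were stereo with shape `(N, 2)` (an assumption; the question does not state the shape), each row would get a length-2 transform. That is just the sum and the difference of the two channels, which still looks like a time series. A minimal sketch with a made-up two-channel array, using NumPy's `np.fft.fft`:

```python
import numpy as np

# Hypothetical stereo data shaped (samples, channels), the layout that
# wavfile.read returns for a two-channel file. Values are made up.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
left = np.sin(2 * np.pi * 440 * t)
right = 0.5 * np.sin(2 * np.pi * 440 * t)
stereo = np.column_stack([left, right])  # shape (8000, 2)

# Default axis=-1: a length-2 FFT per sample, not a spectrum over time.
per_row = np.fft.fft(stereo)
sum_and_diff = np.column_stack([left + right, left - right])
print(np.allclose(per_row, sum_and_diff))

# Transforming a single channel gives the spectrum over time.
spectrum = np.fft.fft(stereo[:, 0])
peak_bin = int(np.argmax(np.abs(spectrum[: sample_rate // 2])))
print(peak_bin)
```

If this is what happened, indexing one channel before the transform, or passing `axis=0`, would be the fix to try.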
Answered By - gboffi