A sinusoidal wave with no DC offset has an instantaneous amplitude that spends half the time positive and half negative. So to capture the full extent of the variations in the wave, you will want to sample at least twice per period to see the minimum and maximum of the wave amplitude. Sampling less often than that risks missing those extrema, and in that case the sampled points won't show you the true period of the wave you are sampling.
For instance, let's say you have a wave with frequency $f=100$Hz. It will have a period $T=1/f=10$ms. To sample the wave and see all the highs and lows, let's say you sample with a sampling period $T_s=1ms$, which means you will take 10 samples per period of the wave.
You can increase the sampling period and still see the extrema, but if the sample period becomes more than half the wave period, you will miss an extremum. For instance, in this case the sample period is $\frac{3}{4}$ of the wave period:
To get an idea of what the aliasing frequency would be, let's first define $\delta f$ as the difference between the sampling frequency $f_s$ and the wave frequency $f$: $$\delta f \equiv f_s - f\label{df}$$ We have chosen a sampling frequency that is greater than the wave frequency, but not greater than $2f$ which would satisfy $\ref{nyquist}$. So $\delta f>0$.
The sampled points are at times $t_n = nt_s$ where $t_s$ is the time between samples, and $n$ is the sample index. So $$\begin{align} y[n] &= \sin(2\pi f\cdot t_n)\nonumber \\ &= \sin(2\pi f\cdot nt_s)\nonumber \\ &= \sin(2\pi f\cdot \frac{n}{f_s})\nonumber \\ &= \sin(2\pi n\cdot \frac{f}{f_s})\nonumber \end{align}\nonumber$$ We want to find the wave that goes through the points at $nt_s$, and if we find it, then the points $y[n]$ should be described by something like $\sin(2\pi f_a\cdot t_n)$ without being a function of $f$. So let's use $\ref{df}$, which can be rewritten as $f = f_s - \delta f$, and substitute in the above equation for $y[n]$ to get $$\begin{align} y[n] &= \sin(2\pi n\cdot \frac{f}{f_s})\nonumber \\ &= \sin(2\pi n\cdot \frac{f_s-\delta f}{f_s})\nonumber \\ &= \sin(2\pi n - 2\pi n\cdot \frac{\delta f}{f_s})\nonumber \\ &= -\sin(2\pi n\cdot \frac{\delta f}{f_s}) = -\sin(2\pi \delta f\cdot \frac{n}{f_s}) = -\sin(2\pi \delta f\cdot nt_s)\nonumber\\ &= -\sin(2\pi \delta f\cdot t_n)\nonumber \end{align}\nonumber$$ where we used $\sin(2\pi n - x) = -\sin(x)$ for integer $n$. We can see that when the sampling frequency violates the Nyquist condition, the sampled points follow a sine curve with frequency $\delta f\lt f_s$, so we identify $\delta f$ as the aliasing frequency $f_a = f_s - f$. The alias wave will also be $180\deg$ out of phase with the sampled wave. So in our case above, we have a wave with frequency $f=100$Hz, which would make the Nyquist rate (the minimum sampling frequency) $f_{Nyquist}=2f=200$Hz. Our sampling period is $T_s=\frac{3}{4}T$ so $f_s=\frac{4}{3}f$, violating the Nyquist condition $f_s>2f$. The alias frequency will be $$\begin{align} f_a &= f_s - f\nonumber \\ &= f_s - \frac{3}{4}f_s \nonumber \\ &= \frac{1}{4}f_s\nonumber \end{align}\nonumber$$
Waves that are sampled with sampling frequencies satisfying the Nyquist condition $\ref{nyquist}$ can be Fourier analyzed to show the peak at the wave frequency. If $\ref{nyquist}$ is violated, and for our example the violation is restricted to the region $\half T\lt T_s\lt T$, or equivalently $f\lt f_s \lt 2f$, the sampled wave when Fourier analyzed will show a wave that has a frequency $f_a = f_s - f$. In the plots above, we had a wave with $f=100$Hz and a sampling period $T_s=\frac{3}{4}T$, or $f_s = \frac{4}{3}f=133.3$Hz. So the alias frequency will be $f_a=f_s-f=133.3-100=33.3$Hz. To see this in the above plot, click to see the alias wave with $f_a=33.3$Hz superimposed, out of phase by $180\deg$ with the sampled wave. The aliased wave goes through all of the points as expected.
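The claim that the alias wave passes through every sampled point can be checked numerically. The following is a minimal sketch using the example values from the text ($f=100$Hz, $T_s=\frac{3}{4}T$); the helper name `sample` and the choice of 50 samples are illustrative assumptions, not part of the original page.

```python
import math

f = 100.0            # wave frequency (Hz), from the example above
f_s = 4.0 * f / 3.0  # sampling frequency (Hz), i.e. T_s = 3T/4
f_a = f_s - f        # predicted alias frequency (Hz)

def sample(freq, n, f_s, sign=1.0):
    """Value of sign*sin(2*pi*freq*t) at the n-th sample time t_n = n/f_s."""
    return sign * math.sin(2.0 * math.pi * freq * n / f_s)

# Largest difference between the sampled wave and the inverted alias wave
# (the inversion is the 180 degree phase shift found in the derivation).
max_diff = max(abs(sample(f, n, f_s) - sample(f_a, n, f_s, sign=-1.0))
               for n in range(50))

print(f"f_a = {f_a:.1f} Hz, max difference = {max_diff:.2e}")
```

The difference should be at the level of floating-point rounding, which is what "the aliased wave goes through all of the points" means in practice.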
In the simulation below, set the duration, wave frequency, and sampling frequency (presets will work to show aliasing) and hit "Plot". You will see the wave with sampling (red squares), and the aliased wave drawn through the points as a dashed line.
[Interactive simulation: inputs for duration (sec), wave frequency (Hz), and sampling frequency (Hz); plot traces: Wave, Sampling, Alias]
In the plot below, you can set the sampling frequency (preset to 100Hz) and the number of Nyquist zones (separated by dashed yellow lines, and labeled). Then hit the "Plot" button and it will show you the empty plot with the Nyquist zone boundaries in yellow and the sampling frequency in purple. Click anywhere and it will select a frequency $f$, and plot the Fourier series for a waveform at that frequency. If you click in the 1st Nyquist zone, the Fourier peak will coincide with where you clicked. If you click in any other Nyquist zone, the Fourier peak will be in the 1st Nyquist zone, and an arrow will tell you where you clicked, illustrating aliasing.
You can see some very interesting behavior in where the Fourier series peak comes out (always in the 1st Nyquist zone). This behavior is called "folding" and is explained below. Below the plot you will see the waveform plotted with the samples as markers. The range might need some zooming to see the aliasing effect (the plot library used is "plotly", which is very interactive, so you can change the range easily with the mouse).
Let $n$ be the Nyquist zone number, $n=1, 2, 3, \ldots$. Then the boundaries of Nyquist zone $n$ define the frequency interval $$\big[(n-1)f_N,nf_N\big)\nonumber$$ The zone $n$ can be calculated from the waveform frequency $f$ and the Nyquist frequency $f_N=f_s/2$ via: $$n = int(1+\frac{f}{f_N})\nonumber$$ where $int$ means round down to the nearest integer ("floor" in Python).
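The zone formula translates directly into a few lines of Python. This is a small sketch; the function name `nyquist_zone` and the test frequencies are illustrative assumptions, with $f_s=100$Hz taken from the plot preset.

```python
import math

def nyquist_zone(f, f_s):
    """Nyquist zone number (1, 2, 3, ...) of frequency f for sampling frequency f_s."""
    f_n = f_s / 2.0                 # Nyquist frequency f_N = f_s/2
    return math.floor(1 + f / f_n)  # n = int(1 + f/f_N)

# With f_s = 100 Hz the zone boundaries sit at 50, 100, 150, ... Hz
zones = [nyquist_zone(f, 100.0) for f in (10.0, 60.0, 120.0, 160.0)]
print(zones)  # [1, 2, 3, 4]
```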
For odd Nyquist zones (1, 3, 5...), the relationship between the alias frequency $f_a$ and the wave frequency $f$ at a fixed Nyquist frequency $f_N$ will be given by $$f_a = f - \frac{n-1}{2}f_s\nonumber$$ and for even zones (2, 4, 6...), the alias frequency $f_a$ is given by $$f_a = -f + \frac{n}{2}f_s\nonumber$$ A more elegant way to write this is to first define $f_d\equiv nf_N - f$ as the zone-dependent difference between the upper boundary of zone $n$, $nf_N$, and the waveform frequency $f$. Then we can write the alias frequencies for the even $f_{even}$ and odd $f_{odd}$ Nyquist zones as: $$\begin{align} f_{even} &= f_d\nonumber \\ f_{odd} &= f_N - f_d\nonumber \end{align}\nonumber$$
So for the 1st Nyquist zone, $f_d=f_N-f$ and $f_{1}=f_N-(f_N-f)=f$ as expected, no aliasing! And for the 2nd Nyquist zone, $f_a = 2f_N-f=f_s-f$ as before.
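The even/odd folding rule can be sketched as a single function. The name `alias_frequency` and the sample frequencies are illustrative assumptions; the logic follows the $f_d$ definition above.

```python
import math

def alias_frequency(f, f_s):
    """Fold a frequency f into the 1st Nyquist zone for sampling frequency f_s."""
    f_n = f_s / 2.0                  # Nyquist frequency
    n = math.floor(1 + f / f_n)      # Nyquist zone of f
    f_d = n * f_n - f                # distance below the upper edge of zone n
    # Even zones are mirrored (f_a = f_d); odd zones are shifted (f_a = f_N - f_d).
    return f_d if n % 2 == 0 else f_n - f_d

f_s = 100.0
folded = [alias_frequency(f, f_s) for f in (30.0, 70.0, 120.0, 180.0)]
print(folded)  # [30.0, 30.0, 20.0, 20.0]
```

Note how 30Hz and 70Hz land on the same peak, as do 120Hz and 180Hz: once sampled, frequencies in different zones become indistinguishable.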
One important thing to point out: imagine you have a 50MHz signal, and you sample it with an ADC sampling at 200M samples per second (200MSps). Your signal satisfies the Nyquist condition, so the Fourier analysis will show a peak at 50MHz. However, if your signal frequency was 150MHz and you sampled with a 200MSps ADC, you would see aliasing, which also results in a Fourier peak at 50MHz. And so you won't know if you were digitizing a 50MHz or a 150MHz (or even greater) signal. This is why you always want to filter your signal before digitizing, eliminating frequencies above the Nyquist limit. For instance, in audio, we know that sound (that humans can hear) is limited to 20kHz or less, which is why audio digitizers run at frequencies above 40kHz. In fact, most audio digitizers use 44.1kHz, a value that came from the technology limitations of the pre-CD era (before $\sim$1990), when signals were all stored on magnetic tape in either PAL or NTSC format (see https://en.wikipedia.org/wiki/44%2C100_Hz). What good audio digitizers do is first pass the analog signal through a filter that removes any incoming frequencies above 22.05kHz, the Nyquist frequency, so that if they are there they won't produce any alias signals that will pollute the real audio.
There's another thing you can do to actually take advantage of aliasing. Imagine that you have an incoming signal with $25\lt f\lt 50$MHz, and you want to digitize it. You will need an ADC that can run at $100$MHz or more to satisfy the Nyquist condition $\ref{nyquist}$. But what if you don't have access to an ADC that can run that fast with the dynamic range you need? If you used a $50$MHz ADC, then your incoming signal will be in the 2nd Nyquist zone, which means it will be aliased into the 1st Nyquist zone. What you would do in this case is pass the incoming signal through a filter that removes the low frequencies $0-25$MHz, then pass the result through the $50$MHz ADC, and Fourier analyze. The spectrum will show up, but the actual frequencies will be reversed within the 1st Nyquist zone. That is, $26$MHz will show up as $f_a=f_s-f=50-26=24$MHz, and $49$MHz will show up as $1$MHz. So you would have to reverse the Fourier components to recover the incoming signal. This is doable, but you will have to be careful. Better to just buy a better ADC in the first place!
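Undoing the reversal is just the inverse of the 2nd-zone folding rule, $f = f_s - f_a$. A minimal sketch, assuming the input really was band-limited to the 2nd Nyquist zone by the filter; the function name `unfold_second_zone` is hypothetical.

```python
def unfold_second_zone(f_a, f_s):
    """Map a peak observed at f_a (1st zone) back to its true 2nd-zone frequency."""
    f_n = f_s / 2.0
    if not 0.0 <= f_a <= f_n:
        raise ValueError("f_a must lie in the 1st Nyquist zone")
    return f_s - f_a  # inverse of f_a = f_s - f

f_s = 50.0               # ADC sampling frequency (MHz)
observed = [24.0, 1.0]   # peak positions seen in the spectrum (MHz)
recovered = [unfold_second_zone(f_a, f_s) for f_a in observed]
print(recovered)  # [26.0, 49.0]
```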
Digitization
In the above section on aliasing, we sampled the incoming wave with a sampling frequency $f_s$, taking the value of the wave (the voltage) at each time $t_n = nt_s=n/f_s$ where $n$ is the sample index. Then we Fourier analyzed the array of voltages. But in the real world, we would measure the voltage at each time $t_n$ with an analog-to-digital converter, or ADC. This device takes an analog voltage input and turns that voltage into a number. Let's assume that the ADC has a range between a minimum voltage $V_{min}$ (any input voltage less than $V_{min}$ is registered as $V_{min}$) and a maximum voltage $V_{max}$ (any input voltage that is greater than $V_{max}$ is registered as $V_{max}$). The difference between these voltages is the full-scale of the ADC, $V_{FS}=V_{max}-V_{min}$. For a bipolar ADC, $V_{min}=-V_{max}$, so the ADC can see a periodic sine wave with an amplitude $A\le V_{max}$. The number that the ADC produces will have a finite number of bits $N$ (determined by the manufacturer).
For instance if the ADC is a 3-bit ADC ($N=3$), then that means the ADC will deliver 8 possible values, between 0 and 7. Let's say that the ADC range is between $\pm 1$V, so $V_{min}=-1$V and $V_{max}=+1$V, which gives $V_{FS}=2$V (peak to peak). For an ADC with 3 bits, there are 8 different levels, and a peak-to-peak full scale range of $2$V, so each value of the ADC (each level) will mean a different voltage, with a resolution given by $\Delta=V_{FS}/2^N = 2/8=0.25$V. Any ADC always returns an integer between $0$ and $2^N-1$, so our 3-bit ADC will return an (unsigned) integer between $0$ and $7$.
The diagram below shows the boundaries for each of the 8 steps for a 3-bit ADC, and for each step the return code (in binary) and the voltage at the center of the step, which has a width in voltage of $\Delta = 0.25$V. There are $8$ codes, $0$ to $7$ ($000$ to $111$ in binary), each code pointing to one of the $8$ steps.
As you can see, an input voltage of $0.0$V will result in either a code of 3, meaning that the voltage you use will be $-0.125\pm 0.072$V, or a code of 4, meaning $+0.125\pm 0.072$V (the $\pm 0.072$V is the RMS quantization uncertainty $\Delta/\sqrt{12}$ discussed below). This means that an input of $0.0$V will not result in the ADC returning $0.0$V, but instead either a negative or positive result depending on whether the bin edges are defined as $-0.25\lt V\le 0.0$ or $0.0\le V\lt 0.25$. In other words, it depends on whether it's the left or right bin edge that is inclusive. The disadvantage of this is that a very small sinusoid centered at zero won't map to zero most of the time, but will flip between $\pm\Delta/2$, which can create spurious tones that have an impact in audio or baseband communications where one would use a bipolar ADC.
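The coding described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the choice that the left bin edge is inclusive (so $0.0$V maps to code 4) is an assumption, since a real ADC could go either way.

```python
import math

def midrise_code(v, v_min=-1.0, v_max=1.0, n_bits=3):
    """Unsigned ADC code for input voltage v (left bin edge inclusive, assumed)."""
    levels = 2 ** n_bits
    delta = (v_max - v_min) / levels          # resolution, 0.25 V for 3 bits
    code = math.floor((v - v_min) / delta)
    return min(max(code, 0), levels - 1)      # saturate outside the ADC range

def midrise_voltage(code, v_min=-1.0, v_max=1.0, n_bits=3):
    """Voltage at the center of the bin that the code points to."""
    delta = (v_max - v_min) / 2 ** n_bits
    return v_min + (code + 0.5) * delta

for v in (-0.01, 0.0, 0.3, 1.5):
    code = midrise_code(v)
    print(f"{v:+.2f} V -> code {code} ({code:03b}) -> {midrise_voltage(code):+.3f} V")
```

An input just below zero returns code 3 ($-0.125$V) and an input at zero returns code 4 ($+0.125$V), so no input ever reads back as exactly $0.0$V.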
There is an alternative version of the coding called "mid-tread" that is constructed such that an input voltage of $0.0$V will yield a code that points to an interval where the bin center is exactly $0.0$V. This is conceptually easy to do: just shift the above diagram over by half a bin width!
A common thing for mid-tread schemes is to have a return code that is a 2's complement integer. Since 2's complement is also asymmetric about 0, this is a good match. Some mid-tread schemes use unsigned integers, and some use 2's complement. Just depends on what the manufacturer built.
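A mid-tread coder with a 2's complement output might look like the sketch below. The function names are hypothetical, and rounding ties upward is an assumed convention; the point is only that code 0 maps to exactly $0.0$V and that the code range is asymmetric ($-4$ to $+3$ for 3 bits).

```python
import math

def midtread_code(v, v_fs=2.0, n_bits=3):
    """Signed (2's complement range) code for input v, with a bin centered on 0 V."""
    delta = v_fs / 2 ** n_bits
    code = math.floor(v / delta + 0.5)        # round to nearest bin center
    lo = -(2 ** (n_bits - 1))                 # most negative code, -4 for 3 bits
    hi = 2 ** (n_bits - 1) - 1                # most positive code, +3 for 3 bits
    return min(max(code, lo), hi)             # saturate at the ends

def midtread_voltage(code, v_fs=2.0, n_bits=3):
    """Voltage at the center of the bin that the code points to."""
    return code * v_fs / 2 ** n_bits

for v in (-1.0, -0.1, 0.0, 0.1, 1.0):
    code = midtread_code(v)
    print(f"{v:+.2f} V -> code {code:+d} -> {midtread_voltage(code):+.2f} V")
```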
The dynamic range of an ADC is measured in decibels (dB), and is defined as the log of the ratio of the largest representable signal to the smallest representable signal. In terms of voltages, with $V_{max}$ and $V_{min}$ here denoting the largest and smallest representable signal voltages, $$DR = 20\log_{10}\frac{V_{max}}{V_{min}}\label{drange}$$ The $DR$ has a factor of 20 instead of 10 because it is written as the log of a ratio of voltages, while the decibel is defined through a ratio of powers, and the power $P\sim V^2$. So we can also write equation $\ref{drange}$ as $$DR = 10\log_{10}\frac{P_{max}}{P_{min}}\label{prange}$$ This is a little bit obscure, so most people use equation $\ref{drange}$.
The number returned has a finite number $N$ of bits, and the more bits there are, the more accurate the measurement value within the peak-to-peak range of the ADC. The ADC therefore has a finite dynamic range and resolution. To quantify each, imagine we have an ADC that returns the measured voltage as an integer that contains 8 bits. And imagine that the ADC can only measure voltages between 0 and 1 volt. The resolution of the ADC tells you the amount of voltage associated with a single step of the output code (the least significant bit). If the ADC has $N$ bits, then its $2^N$ codes have to span the range from $0$ to $V_{max}$, so we can define the resolution $\Delta$ as $$\Delta = \frac{V_{max}}{2^N}\label{res}$$ When an ADC returns any value, that value is limited by the resolution of the ADC. For instance, if the maximum voltage is 8V, and you have an $N=3$ bit ADC, then each step is worth 1V. If your ADC samples 6.5V, then it will return either 6 or 7, given the resolution. This uncertainty in the actual voltage measured, determined by the ADC resolution, is called the "quantization error", but it isn't really an "error", it's just an uncertainty. And given that this uncertainty is a function of the ADC resolution, we can say that the quantization uncertainty $\epsilon$ will be somewhere between $-\frac{\Delta}{2}$ and $+\frac{\Delta}{2}$. So in our case here, if the ADC returns 7, we would say that the measured voltage is $7 \pm \epsilon = 7\pm \half$V.
For an uncertainty in that interval, the RMS of the voltage uncertainty is just given by $$\sigma_\epsilon = \frac{\Delta}{\sqrt{12}}\label{rms}$$ using the fact that the RMS deviation of a value uniformly distributed over any interval of length $L$ is $L/\sqrt{12}$. This means that the noise power due to the quantization uncertainty will be given by $$P_{noise}=\frac{\sigma_\epsilon^2}{R}=\frac{\Delta^2}{12R}\label{pnoise}$$ where $R$ is the load resistance.
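The $\Delta/\sqrt{12}$ result is easy to confirm numerically by averaging $\epsilon^2$ over evenly spaced points in $[-\Delta/2, +\Delta/2]$. A small sketch, assuming the error is uniformly distributed across the bin; the function name and the number of points are arbitrary choices.

```python
import math

def rms_uniform_error(delta, n_points=100_000):
    """RMS of an error spread uniformly over [-delta/2, +delta/2]."""
    total = 0.0
    for i in range(n_points):
        # midpoint of the i-th of n_points equal sub-intervals
        eps = -delta / 2.0 + (i + 0.5) * delta / n_points
        total += eps * eps
    return math.sqrt(total / n_points)

delta = 0.25  # resolution of the 3-bit, +/-1 V ADC used earlier
numeric = rms_uniform_error(delta)
analytic = delta / math.sqrt(12.0)
print(f"numeric = {numeric:.6f} V, analytic = {analytic:.6f} V")
```

Both values come out near $0.072$V for $\Delta=0.25$V.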
To calculate the signal to noise ratio (SNR), we compare the power of a full scale sine wave, which oscillates between $\pm A=\pm V_{max}/2$, to the noise power. The RMS of that sine wave is $$V_{RMS}=\frac{A}{\sqrt{2}}=\frac{V_{max}}{2\sqrt{2}}\nonumber$$ The signal power is just $V_{RMS}^2/R$, so we have $$P_{signal} = \frac{V_{RMS}^2}{R} = \frac{V_{max}^2}{8R}\label{power}$$ The SNR is then given by the ratio of $P_{signal}$ to $P_{noise}$: $$\begin{align} SNR &= \frac{P_{signal}}{P_{noise}}\nonumber \\ &=\frac{V_{max}^2/8R}{\Delta^2/12R}\nonumber \\ &=\frac{3V_{max}^2}{2\Delta^2}\nonumber \\ &= \frac{3}{2}\frac{V_{max}^2}{(V_{max}/2^N)^2}\nonumber \\ &= \frac{3}{2}2^{2N}\nonumber \\ \end{align}\label{SNR}$$ Now we can form the SNR in dB - SNR(dB) - as $$\begin{align} SNR(dB) &= 10\log_{10}(SNR)\nonumber \\ &=10\log_{10}(\frac{3}{2}\cdot 2^{2N})\nonumber \\ &= 10\log_{10}(1.5) + 20N\log_{10}(2)\nonumber \\ &\approx 1.76 + 6.02N\nonumber \\ \end{align}\label{snrdb}$$ For an 8 bit ADC, the $SNR(dB)$ due to the quantization uncertainty will be around $50$dB, which tells you what the full scale power will be relative to the quantization noise power.
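The $1.76 + 6.02N$ rule can be compared against a direct simulation: quantize one period of a full scale sine and measure the ratio of signal power to error power. This is a sketch under stated assumptions (a mid-rise quantizer that returns bin centers, evenly spaced phases over one period, and a hypothetical function name); since the error in a real sine is not perfectly uniform, the simulated value only approximates the formula, more closely as $N$ grows.

```python
import math

def quantization_snr_db(n_bits, n_points=100_000, v_max=2.0):
    """Simulated SNR (dB) of a full scale sine quantized by an n_bits mid-rise ADC."""
    levels = 2 ** n_bits
    delta = v_max / levels                    # resolution
    amp = v_max / 2.0                         # full scale amplitude A = V_max/2
    p_signal = 0.0
    p_noise = 0.0
    for i in range(n_points):
        v = amp * math.sin(2.0 * math.pi * (i + 0.5) / n_points)
        code = min(max(math.floor((v + amp) / delta), 0), levels - 1)
        v_q = -amp + (code + 0.5) * delta     # center of the selected bin
        p_signal += v * v
        p_noise += (v_q - v) ** 2             # quantization error power
    return 10.0 * math.log10(p_signal / p_noise)

for n_bits in (4, 8, 12):
    simulated = quantization_snr_db(n_bits)
    formula = 1.76 + 6.02 * n_bits
    print(f"N = {n_bits:2d}: simulated {simulated:.2f} dB, formula {formula:.2f} dB")
```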
If the sine wave has a DC offset and amplitude $A$, then the wave can be written as $$V(t) = A(1+\sin\omega t)\nonumber$$ and we still have $V_{max}=2A$ as before for the full scale voltage. To find the RMS we now have to do the integral $$\begin{align} V_{rms}^2 &= \frac{1}{T}\int_0^T\big(A(1+\sin\omega t)\big)^2dt\nonumber \\ &=\frac{3}{2}A^2\nonumber \\ &= \frac{3}{8}V_{max}^2\nonumber \end{align}\nonumber$$ which gives $$P_{signal} = \frac{V_{rms}^2}{R} = 3\frac{V_{max}^2}{8R}\nonumber$$ Plugging that into equation $\ref{SNR}$ for $SNR$ gives $$SNR = \frac{9}{2}2^{2N}\nonumber$$ and then into equation $\ref{snrdb}$ to get $$\begin{align} SNR(dB) &= 10\log_{10}(SNR)\nonumber\\ &= 10\log_{10}(\frac{9}{2}2^{2N})\nonumber\\ &= 10\log_{10}(4.5) + 20N\log_{10}(2)\nonumber \\ &\approx 6.53 + 6.02N\nonumber \\ \end{align}\nonumber$$ So the SNR due to quantization for an 8 bit ADC that is bipolar ($-V_{max}/2\le V\le V_{max}/2$) will be $\sim 50$dB, and the SNR for an ADC that looks at a unipolar signal ($0\le V\le V_{max}$) will be $6.53 + 6.02\times 8=54.7$dB, about $4.8$dB better, due to the fact that the DC offset boosts the RMS signal power.
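The two closed-form results can be evaluated side by side. A short sketch with hypothetical function names; the difference between the two is $10\log_{10}(3)$ regardless of $N$, because the offset triples the signal power.

```python
import math

def snr_db_bipolar(n_bits):
    """SNR (dB) for a full scale sine with no DC offset: SNR = (3/2) * 2^(2N)."""
    return 10.0 * math.log10(1.5 * 2 ** (2 * n_bits))

def snr_db_unipolar(n_bits):
    """SNR (dB) for a full scale sine with DC offset A: SNR = (9/2) * 2^(2N)."""
    return 10.0 * math.log10(4.5 * 2 ** (2 * n_bits))

n_bits = 8
bipolar = snr_db_bipolar(n_bits)
unipolar = snr_db_unipolar(n_bits)
print(f"bipolar: {bipolar:.2f} dB, unipolar: {unipolar:.2f} dB, "
      f"difference: {unipolar - bipolar:.2f} dB")
```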
If you have an incoming wave at a fixed frequency with no noise and digitize it with an $N$-bit ADC, the quantization uncertainty means that the Fourier transform using a DFT or FFT will have quantization noise, with the SNR given by one of the above formulas depending on whether the input signal is bipolar or unipolar.
The ADC returns an integer $n$ that tells you that the value it saw was above some threshold.
In the simulation below, we have an incoming bipolar wave with a frequency set to 40Hz and an amplitude of 1V, digitized by an ADC with some number of bits (default=4) at a sampling frequency of 110Sps, satisfying the Nyquist condition. You can change all 3 parameters. The ADC has $N$ bits, which means there will be $2^N$ thresholds evenly spaced between the minimum voltage allowed ($-1$V) and the maximum ($+1$V). And we are using the mid-rise scheme, where the ADC is calibrated to return the center of the bin that the ADC value points to. The wave is drawn as a line, the sampled values of the wave are white circles, and the ADC values are the yellow circles. As you increase the number of bits, you can see the quantization uncertainty shrink!
[Interactive simulation: inputs for ADC bits, $f_{sampling}$ (Hz), and $f_{wave}$ (Hz); toggles: Show Samples, Show ADC]