The basic upsampling operation is upsampling by 2. Upsampling may refer to two similar operations: upsampling without interpolation or upsampling with interpolation (sometimes just called interpolation).
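Mathematically, upsampling a signal $x[n]$ with DTFT $X(\omega)$ by 2 without interpolation is defined by $$X_u(\omega) = X(2 \omega) \; .$$ Notice that this compresses the frequency axis of the DTFT by a factor of 2: the spectrum that occupied $[-\pi, \pi]$ now fits in $[-\pi/2, \pi/2]$, and a replica of it appears centered at $\omega = \pi$. In the time domain, this is equivalent to inserting a zero after every sample, so that $x_u[n] = x[n/2]$ for even $n$ and $x_u[n] = 0$ for odd $n$.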
We can extend the concepts above to upsample by any integer factor $M$. When upsampling a signal $x[n]$ by $M$, we insert $M-1$ zeros between consecutive samples. In frequency, this is defined by $$X_u(\omega) = X(M \omega) \; ,$$ where we now shrink the frequency axis by a factor of $M$.
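As a minimal sketch of upsampling without interpolation, the NumPy snippet below inserts $M-1$ zeros after each sample and checks the frequency-domain relation on the DFT, where shrinking by $M$ shows up as the original spectrum repeated $M$ times. The function name `upsample_zeros` and the test signal are illustrative choices, not part of the text above.

```python
import numpy as np

def upsample_zeros(x, M):
    """Upsample x by integer factor M by inserting M-1 zeros after each sample."""
    x = np.asarray(x)
    x_u = np.zeros(len(x) * M, dtype=x.dtype)
    x_u[::M] = x  # every M-th output sample is an original sample
    return x_u

M = 3
x = np.array([1.0, 2.0, 3.0, 4.0])
x_u = upsample_zeros(x, M)
print(x_u)

# The length-(N*M) DFT of x_u samples X(M*omega): the length-N DFT of x,
# tiled M times across the frequency axis.
X = np.fft.fft(x)
X_u = np.fft.fft(x_u)
print(np.allclose(X_u, np.tile(X, M)))
```

The tiled copies are the high-frequency replicas that the interpolation filter is meant to remove.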
Interpolation is equivalent to applying a low-pass filter to remove the high-frequency replicas created by shrinking in frequency. In time, this replaces the inserted zeros with interpolated values and smooths the signal. To remove the high-frequency components, an ideal reconstruction/interpolation filter has cut-off frequency $\omega_c = \pi/M$ and gain $M$, where $M$ is the upsampling factor; the gain of $M$ compensates for the amplitude lost by inserting zeros.
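The sketch below illustrates upsampling with interpolation under some assumptions. The ideal low-pass filter with cut-off $\pi/M$ and gain $M$ has impulse response $h[n] = \mathrm{sinc}(n/M)$, which is infinitely long, so here it is truncated and tapered with a Hamming window. That makes it an approximation, not the ideal filter. The function name `interpolate`, the `half_width` parameter, and the test signal are hypothetical.

```python
import numpy as np

def interpolate(x, M, half_width=8):
    """Upsample x by M, then low-pass filter with a windowed-sinc approximation
    of the ideal filter (cut-off pi/M, gain M).

    Assumes len(x) * M is at least the filter length 2 * half_width * M + 1.
    """
    x = np.asarray(x, dtype=float)
    # Step 1: insert M-1 zeros after each sample.
    x_u = np.zeros(len(x) * M)
    x_u[::M] = x
    # Step 2: ideal response M * (1/M) * sinc(n/M) = sinc(n/M), truncated to
    # |n| <= half_width * M and tapered by a Hamming window.
    n = np.arange(-half_width * M, half_width * M + 1)
    h = np.sinc(n / M) * np.hamming(len(n))
    # Step 3: filter; mode="same" keeps the output aligned with x_u.
    return np.convolve(x_u, h, mode="same")

M = 4
x = np.cos(2 * np.pi * 0.05 * np.arange(32))  # slowly varying test signal
y = interpolate(x, M)

# h is 1 at n = 0 and 0 at other multiples of M, so the original samples
# pass through unchanged and only the inserted zeros are filled in.
print(np.allclose(y[::M], x))
```

A longer filter (larger `half_width`) approximates the ideal response more closely, at the cost of more computation and a longer transient at the signal edges.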