SVD-Based Image Compression and Denoising

Author

Lucas Clouser

Published

December 8, 2025

Abstract

Digital images can be viewed as large matrices whose entries record pixel intensities. Although these matrices often have full algebraic rank, their meaningful content usually lives in a much lower dimensional subspace. This project uses the singular value decomposition (SVD) to exploit that low effective rank for two tasks: compressing an image and removing additive Gaussian noise. We work with the standard camera image, treat it as a \(512\times512\) grayscale matrix, and compute its full SVD. Truncating the SVD yields optimal low-rank approximations in the sense of the Eckart–Young–Mirsky theorem and gives a simple compression scheme that stores only a small number of singular values and singular vectors. Guided by the spectrum of singular values, we examine how approximation rank controls compression ratio, relative Frobenius error, and two perceptual metrics: peak signal-to-noise ratio (PSNR) and a high-frequency signal-to-noise ratio (HF-SNR) based on Laplacian energy. For denoising, we interpret the tail of the singular value spectrum through the Marcenko–Pastur law for noise-only matrices and use its upper edge as a parameter-free threshold for discarding noise-dominated components. Across a range of noise levels, this cutoff consistently improves HF-SNR and yields visually cleaner reconstructions that preserve edges at moderate noise while gracefully degrading under heavy corruption. The experiments demonstrate how the abstract SVD machinery from numerical linear algebra translates into concrete, quantitative control over image quality.

1 Introduction

A grayscale digital image is a finite array of intensity measurements. Once we flatten the camera’s output into numbers, the image becomes a real matrix \(A \in \mathbb{R}^{m\times n}\). In principle this matrix may have rank \(\min(m,n)\), yet natural images usually display strong spatial correlations: neighbouring pixels vary together along smooth surfaces, and sharp transitions are confined to edges and textures. In linear algebra terms, the data are highly structured. The directions of variation that matter most are few, and the remaining degrees of freedom are largely redundant or dominated by noise.

The singular value decomposition is the standard tool for uncovering this structure. For any real matrix \(A\) there exist orthogonal matrices \(U \in \mathbb{R}^{m\times m}\) and \(V \in \mathbb{R}^{n\times n}\), and a diagonal matrix \(\Sigma\) with nonnegative entries, such that \[ A = U \Sigma V^{\mathsf T} = \sum_{i=1}^r \sigma_i u_i v_i^{\mathsf T}, \] where \(r = \operatorname{rank}(A)\), the singular values satisfy \(\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0\), and \(u_i, v_i\) are orthonormal singular vectors. Each term \(\sigma_i u_i v_i^{\mathsf T}\) is a rank-one “mode” encoding a particular pattern of variation in the image. The leading modes typically capture smooth shading and large-scale geometry, while later modes pick up finer textures and, eventually, noise.

The best low-rank approximation theorem, due to Eckart and Young (and, in its general form, to Mirsky) and presented in Ascher and Greif, says that if we truncate this expansion after \(k < r\) terms, the resulting matrix \[ A_k = \sum_{i=1}^k \sigma_i u_i v_i^{\mathsf T} \] minimizes the approximation error over all matrices of rank at most \(k\) in both the spectral and Frobenius norms, and it is the unique minimizer whenever \(\sigma_k > \sigma_{k+1}\). This guarantees that any low-rank image we build by discarding small singular values is, in a precise sense, as close as possible to the original image for that chosen rank. The same theorem underlies the use of truncated SVD for solving ill-conditioned linear systems and for compressing large matrices.

In this project we explore how these ideas behave on a concrete image. Our goals are threefold. First, we characterize the singular value spectrum of the camera image and assess how quickly its energy concentrates in a small number of modes. Second, we quantify the trade-off between compression ratio and image quality as we vary the truncation rank, using relative Frobenius error, PSNR, and HF-SNR as complementary metrics. Third, we examine SVD-based denoising under additive Gaussian noise and use the Marcenko–Pastur law from random matrix theory to design a principled threshold for separating signal-dominated singular values from those consistent with pure noise.

The rest of the essay walks from the mathematical foundations to the specific numerical results. We begin by describing the dataset, the SVD framework, and the error metrics. We then present compression experiments, followed by denoising experiments driven by the singular spectrum and the Marcenko–Pastur edge. We conclude with an assessment of when SVD-based low-rank approximations work well in imaging and how they might be extended.

2 Methods and Computations

2.1 Setup and helper functions

Code
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm
from mpl_toolkits.axes_grid1.inset_locator import inset_axes
from matplotlib.lines import Line2D
from skimage import data, color, img_as_float
from skimage.metrics import peak_signal_noise_ratio
from skimage.filters import laplace
import pandas as pd
from scipy.linalg import svd
from matplotlib.ticker import MaxNLocator
import warnings

warnings.filterwarnings('ignore')

plt.rcParams.update({
    "figure.dpi": 120,
    "axes.titlesize": 11,
    "axes.labelsize": 10,
    "xtick.labelsize": 9,
    "ytick.labelsize": 9,
    "legend.frameon": False,
    "savefig.bbox": "tight"
})

def add_detail_inset(ax, img, zoom=80, center=(160, 210), title="Detail"):
    """Attach a consistent detail inset with dark title background."""
    if center is None:
        center = (img.shape[0] // 2, img.shape[1] // 2)
    r0 = max(center[0] - zoom // 2, 0)
    c0 = max(center[1] - zoom // 2, 0)
    inset = inset_axes(ax, width="40%", height="40%", loc="lower right")
    inset.imshow(img[r0:r0+zoom, c0:c0+zoom], cmap="gray")
    inset.set_xticks([])
    inset.set_yticks([])
    inset.set_title(title, fontsize=7, color="white",
                    bbox=dict(boxstyle="round", facecolor="black", alpha=0.75))

def show_image(ax, img, title):
    ax.imshow(img, cmap="gray")
    ax.set_title(title)
    ax.axis("off")
    add_detail_inset(ax, img)

def reconstruct_rank_k(U, S, Vt, k):
    """Construct rank-k approximation of original matrix."""
    return (U[:, :k] * S[:k]) @ Vt[:k, :]

def compression_performance(m, n, k):
    """Calculate compression performance metrics."""
    original_storage = m * n
    compressed_storage = k * (m + n + 1)
    compression_ratio = original_storage / compressed_storage
    storage_percentage = (compressed_storage / original_storage) * 100
    return compression_ratio, storage_percentage

def hf_snr(clean, test):
    """High-frequency SNR via Laplacian; larger is better."""
    clean_hf = laplace(clean)
    test_hf = laplace(test)
    noise = clean_hf - test_hf
    return 10 * np.log10(np.sum(clean_hf**2) / (np.sum(noise**2) + 1e-12))

def mp_edges(m, n, sigma):
    """Marčenko-Pastur support for eigenvalues of (1/n) X^T X."""
    beta = m / n
    lambda_minus = sigma**2 * (1 - np.sqrt(beta))**2
    lambda_plus = sigma**2 * (1 + np.sqrt(beta))**2
    return beta, lambda_minus, lambda_plus

def mp_singular_edge(m, n, sigma):
    """Upper edge for singular values of Gaussian noise."""
    return sigma * (np.sqrt(m) + np.sqrt(n))

def mp_hard_threshold(U, S, Vt, m, n, sigma):
    """Hard-threshold singular values using the MP upper edge."""
    edge = mp_singular_edge(m, n, sigma)
    keep = S > edge
    if not np.any(keep):
        return np.zeros((U.shape[0], Vt.shape[1]))
    return (U[:, keep] * S[keep]) @ Vt[keep, :]

2.2 Dataset Description

We work with the standard “camera” photograph from the skimage.data library. After converting to grayscale and rescaling intensities to the interval \([0, 1]\), we obtain a \(512\times512\) matrix \(A\) with \(262{,}144\) pixels. The image has a mix of smooth regions (sky, background field) and sharp features (edges of the tripod, facial features and hair), which makes it a good stress test for low-rank approximations: coarse structures can be captured efficiently, but fine edges and textures are easier to destroy.

For the denoising experiments we generate noisy images of the form \[ A_{\text{noisy}} = A_{\text{clean}} + \eta, \] with \(\eta_{ij} \sim \mathcal{N}(0, \sigma^2)\) i.i.d. for a range of noise levels \(\sigma\). The noise model reflects a common assumption in imaging devices where sensor readout and photon statistics introduce approximately Gaussian fluctuations.

Code
# Load and preprocess image
image = data.camera()
image = img_as_float(image)
m, n = image.shape

# Visualize original image
fig, ax = plt.subplots(figsize=(6, 6))
show_image(ax, image, "Original Grayscale Image")
plt.tight_layout()
plt.show()

print(f"Image dimensions: {m} x {n} = {m*n:,} pixels")
Figure 1: Original grayscale camera image with detail inset
Image dimensions: 512 x 512 = 262,144 pixels

2.3 Singular Value Decomposition and Mathematical Framework

2.3.1 Singular value decomposition and low-rank approximations

We compute the full SVD of the clean camera image, \[ A = U \Sigma V^{\mathsf T}, \] and record the singular values \(\{\sigma_i\}_{i=1}^{512}\). Figure 2 shows the spectrum on both logarithmic and linear scales, along with the cumulative energy fraction. The cumulative fraction is defined as \[ E(k) = \frac{\sum_{i=1}^k \sigma_i^2}{\sum_{i=1}^r \sigma_i^2}, \] which equals one minus the squared relative Frobenius error of the best rank-\(k\) approximation. The table under that figure reports that the first singular value already carries about \(87\%\) of the total energy (squared Frobenius norm), the first three carry roughly \(95\%\), and the first 21 carry about \(99\%\). This rapid decay tells us that the image is extremely well approximated by a very low-rank matrix, even though its algebraic rank is 512.

Truncating the SVD after \(k\) singular values gives the low-rank approximation \(A_k\). As discussed in the text, this approximation is optimal in both the spectral norm and the Frobenius norm, meaning that no other rank-\(k\) matrix can achieve a smaller error in those norms. The same SVD also reveals that the condition number of \(A\) is on the order of \(10^7\), which emphasizes how tiny the smallest singular values are in comparison to the dominant ones and signals strong separation between signal and numerical noise.
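
As a quick numerical illustration (a minimal sketch on a small random matrix, separate from the image pipeline), the truncated SVD's spectral-norm error equals \(\sigma_{k+1}\) and its Frobenius-norm error equals the square root of the discarded tail energy:

Code
# Sketch: check the Eckart-Young error identities on a small random matrix.
# Independent of the camera-image pipeline; uses only numpy.
rng_check = np.random.default_rng(0)
B = rng_check.standard_normal((60, 40))
Ub, sb, Vbt = np.linalg.svd(B, full_matrices=False)

k_check = 10
Bk = (Ub[:, :k_check] * sb[:k_check]) @ Vbt[:k_check, :]   # best rank-k approximation

print("spectral error :", np.linalg.norm(B - Bk, 2), "vs sigma_{k+1}:", sb[k_check])
print("Frobenius error:", np.linalg.norm(B - Bk, 'fro'),
      "vs tail energy :", np.sqrt(np.sum(sb[k_check:]**2)))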

For compression we compare the storage cost of the full image with that of a rank-\(k\) representation. Storing all pixels requires \(mn\) scalars. Storing a rank-\(k\) SVD requires the \(k\) singular values plus the first \(k\) columns of \(U\) and \(V\), for a total of \(k(m + n + 1)\) parameters. The corresponding compression ratio is \[ \text{CR}(k) = \frac{mn}{k(m + n + 1)}. \] This formula is discussed in the text and echoes the treatment in Ascher and Greif, who emphasize how the best-rank approximation theorem naturally leads to SVD-based compression schemes.

2.3.2 The Marcenko–Pastur edge for noise-only matrices

When we add Gaussian noise to an image, the SVD of the noisy matrix contains a mixture of signal-dominated modes and noise-dominated modes. Random matrix theory provides a guide for what the singular value spectrum of a pure noise matrix should look like. If \(N\) is an \(m\times n\) matrix with i.i.d. entries of mean zero and variance \(\sigma^2\), then in the high-dimensional limit the empirical distribution of eigenvalues of \(\frac{1}{n} N N^{\mathsf T}\) converges to the Marcenko–Pastur distribution supported on \[ \lambda_\pm = \sigma^2 (1 \pm \sqrt{\beta})^2, \quad \beta = \frac{m}{n}. \] Equivalently, the bulk of the singular values of \(N\) lies between \(\sigma(1 - \sqrt{\beta})\sqrt{n} = \sigma\lvert\sqrt{n} - \sqrt{m}\rvert\) and \(\sigma(1 + \sqrt{\beta})\sqrt{n} = \sigma(\sqrt{m} + \sqrt{n})\). In our square \(512\times512\) case, \(\beta = 1\), so the bulk extends from zero up to the edge \(2\sigma\sqrt{n} = \sigma(\sqrt{m} + \sqrt{n}) \approx 45.3\,\sigma\). The key modeling step is to treat any singular value of the noisy image that lies below this “noise edge” as statistically compatible with pure noise. Singular values above the edge are unlikely to be explained by noise alone and therefore are assumed to carry signal. This suggests a simple hard-thresholding rule: keep only singular values larger than the Marcenko–Pastur upper edge and set the rest to zero.
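
Before applying this to the image, a small Monte Carlo sketch (an illustrative check under the i.i.d. Gaussian assumption, not part of the denoising pipeline itself) confirms that the singular values of a pure-noise matrix of our size indeed sit at or below this edge:

Code
# Sketch: singular values of a pure Gaussian noise matrix versus the MP upper edge.
rng_noise = np.random.default_rng(1)
m_chk, n_chk, sigma_chk = 512, 512, 0.1
N_chk = sigma_chk * rng_noise.standard_normal((m_chk, n_chk))
s_chk = np.linalg.svd(N_chk, compute_uv=False)

edge_chk = sigma_chk * (np.sqrt(m_chk) + np.sqrt(n_chk))   # same formula as mp_singular_edge
print(f"largest noise singular value: {s_chk[0]:.3f}")
print(f"MP upper edge:                {edge_chk:.3f}")
print(f"fraction below the edge:      {np.mean(s_chk < edge_chk):.3f}")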

Code
# Compute SVD
U, S, Vt = svd(image, full_matrices=False)
Code
fig, axes = plt.subplots(2, 2, figsize=(14, 6), gridspec_kw={"height_ratios": [2, 1]})

# Singular values on full log scale with a linear inset for the head
idx_full = np.arange(1, len(S) + 1)
axes[0, 0].semilogy(idx_full, S, color="black")
axes[0, 0].set_xlabel("Index i")
axes[0, 0].set_ylabel("Singular value (log)")
axes[0, 0].set_title("Singular Value Spectrum (full log scale)")

inset = inset_axes(axes[0, 0], width="45%", height="45%", loc="upper right")
inset.plot(np.arange(1, 101), S[:100], color="black")
inset.set_xlim(1, 100)
inset.set_title("Head (linear)", fontsize=8)
inset.tick_params(labelsize=7)

# Cumulative energy
cumulative_energy = np.cumsum(S**2) / np.sum(S**2)
axes[0, 1].plot(cumulative_energy, color="black", linewidth=1.4)
axes[0, 1].set_xlabel("Number of singular values k")
axes[0, 1].set_ylabel("Cumulative energy fraction")
axes[0, 1].set_title("Cumulative Energy Distribution")
axes[0, 1].set_ylim(0, 1.02)

thresholds = [0.8, 0.9, 0.95, 0.99]
for thresh in thresholds:
    k = np.argmax(cumulative_energy >= thresh) + 1
    axes[0, 1].axhline(thresh, color="gray", linestyle=":", alpha=0.6, linewidth=0.9)
    axes[0, 1].axvline(k, color="gray", linestyle=":", alpha=0.6, linewidth=0.9)

# Energy table on its own row
summary_data = []
for thresh in thresholds:
    k_val = np.argmax(cumulative_energy >= thresh) + 1
    summary_data.append({
        'Threshold': f'{thresh:.0%}',
        'Rank k': k_val,
        '% of Components': f'{k_val/len(S)*100:.1f}%',
        'Energy': f'{cumulative_energy[k_val-1]:.4f}'
    })
df_table = pd.DataFrame(summary_data)

axes[1, 0].axis("off")
axes[1, 1].axis("off")
table = axes[1, 0].table(cellText=df_table.values, colLabels=df_table.columns, loc="center", cellLoc="center")
table.scale(1.1, 1.2)
table.auto_set_font_size(False)
table.set_fontsize(9)

plt.tight_layout()
plt.show()

# Print condition number
cond_number = S[0] / S[-1]
print(f"Condition number kappa(A) = sigma1/sigman = {cond_number:.2e}")
print(f"Numerical rank (sigma_k/sigma1 > 1e-10): {np.sum(S/S[0] > 1e-10)}")
Figure 2: Singular values and energy capture
Condition number kappa(A) = sigma1/sigman = 1.18e+07
Numerical rank (sigma_k/sigma1 > 1e-10): 512

2.4 Implementation of Low-Rank Approximation

Using the helper routines defined in the setup block, we can quickly inspect compression trade-offs at a representative rank.

Code
# Test helpers with a small rank reconstruction
test_rank = 50
test_recon = reconstruct_rank_k(U, S, Vt, test_rank)
test_cr, test_storage_pct = compression_performance(m, n, test_rank)

print(f"Test reconstruction with k={test_rank}:")
print(f"  Compression ratio: {test_cr:.1f}:1")
print(f"  Storage required: {test_storage_pct:.1f}% of original")
Test reconstruction with k=50:
  Compression ratio: 5.1:1
  Storage required: 19.6% of original

2.5 Metrics and rationale

To quantify how well \(A_k\) approximates \(A\), we use three metrics.

The first is the relative Frobenius error \[ \varepsilon_F(k) = \frac{\|A - A_k\|_F}{\|A\|_F} = \sqrt{\frac{\sum_{i=k+1}^r \sigma_i^2}{\sum_{i=1}^r \sigma_i^2}}. \] This measures how much of the matrix’s energy lives in the discarded tail of the spectrum. It connects directly to the cumulative energy plot in Figure 2 and the reconstruction error curve in Figure 4.
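
The identity can be checked directly from the singular values computed earlier (a short sketch assuming image, U, S, Vt, and reconstruct_rank_k from the setup and SVD blocks are in scope): the squared relative error at rank \(k\) is exactly one minus the cumulative energy fraction \(E(k)\).

Code
# Sketch: relative Frobenius error via the singular-value tail vs. the direct norm.
# Assumes image, U, S, Vt, and reconstruct_rank_k from earlier blocks are in scope.
energy_frac = np.cumsum(S**2) / np.sum(S**2)     # E(k) for k = 1..512
k_demo = 50
eps_tail = np.sqrt(1.0 - energy_frac[k_demo - 1])
eps_direct = (np.linalg.norm(image - reconstruct_rank_k(U, S, Vt, k_demo), 'fro')
              / np.linalg.norm(image, 'fro'))
print(f"eps_F({k_demo}) from tail energy: {eps_tail:.6f}")
print(f"eps_F({k_demo}) from direct norm: {eps_direct:.6f}")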

The second is peak signal-to-noise ratio. Given an original image \(A\) and approximation \(\hat A\), we first compute the mean squared error \[ \text{MSE} = \frac{1}{mn} \sum_{i,j}(A_{ij} - \hat A_{ij})^2 \] and then define \[ \text{PSNR} = 10 \log_{10}\left(\frac{\text{MAX}^2}{\text{MSE}}\right), \] where MAX is the maximum representable intensity (here effectively \(1\) after normalization). PSNR is standard in image processing and is widely used in textbooks such as Gonzalez and Woods’ Digital Image Processing. High PSNR values correspond to small average pixel-wise errors, although the metric does not always align perfectly with human perception.
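
As a sanity check (a sketch assuming the rank-50 test_recon from Section 2.4 is in scope), PSNR computed directly from this definition matches skimage's peak_signal_noise_ratio:

Code
# Sketch: PSNR from its definition, compared with skimage's implementation.
# Assumes image and the rank-50 test_recon from Section 2.4 are in scope.
mse = np.mean((image - test_recon)**2)
psnr_manual = 10 * np.log10(1.0 / mse)           # MAX = 1 after intensity normalization
psnr_library = peak_signal_noise_ratio(image, test_recon, data_range=1.0)
print(f"PSNR (definition): {psnr_manual:.3f} dB")
print(f"PSNR (skimage)   : {psnr_library:.3f} dB")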

The third metric is a high-frequency SNR tailored to edges. We apply a discrete Laplacian operator \(L\) to both the clean and reconstructed images; the Laplacian emphasizes rapid intensity changes and suppresses smooth regions. The HF-SNR is defined as \[ \text{HF-SNR} = 10 \log_{10} \left( \frac{\|L A_{\text{clean}}\|_2^2}{\|L(A_{\text{clean}} - \hat A)\|_2^2} \right), \] with larger values indicating that high-frequency structure in the reference image is reconstructed more faithfully. This design is inspired by Laplacian-pyramid-based quality measures that model early stages of human visual processing. In our plots, HF-SNR responds strongly to edge blurring and noise in textured regions.

3 Experimental Results: Compression Performance

3.1 Structure of the singular value spectrum

The spectrum of the clean camera image is extremely steep. On the log scale, singular values decay by several orders of magnitude across the first few dozen indices. On the linear “head” plot in Figure 2, essentially all interesting variation is concentrated in a handful of modes, and the long tail is barely visible. Quantitatively, the first singular value alone accounts for about \(87\%\) of the Frobenius energy, the first three account for about \(95\%\), and the first twenty-one reach \(99\%\). The image is therefore very close to rank twenty, even though its true rank is full.

This concentration explains why low-rank approximations work so well. When we form \(A_k\) by retaining only the first \(k\) singular values and vectors, the relative Frobenius error decreases monotonically with \(k\) and drops sharply as soon as we pass the point where the cumulative energy is near one.

3.2 Visual quality versus rank

Code
# Selected ranks for visualization
ranks = np.logspace(1, np.log10(250), 5, dtype=int).tolist()

fig, axes = plt.subplots(2, 3, figsize=(12, 7))
axes = axes.flatten()

# Original image
show_image(axes[0], image, f'Original\nRank: {len(S)}')

# Low-rank approximations
for i, k in enumerate(ranks, 1):
    recon = reconstruct_rank_k(U, S, Vt, k)
    cr, savings = compression_performance(m, n, k)
    fro_err = np.linalg.norm(image - recon, 'fro') / np.linalg.norm(image, 'fro')
    show_image(axes[i], recon, f'k={k}\nCR: {cr:.1f}:1\nError: {fro_err:.3f}')

plt.tight_layout()
plt.show()
Figure 3: Visual comparison of low-rank approximations

Figure 3 presents a sequence of reconstructions at ranks \(k = 10, 22, 49, 111,\) and \(249\), alongside the original image. At \(k=10\), we achieve an aggressive compression ratio of roughly \(25.6{:}1\) and a relative Frobenius error of about \(0.135\). The global layout of the scene is clear: the camera, tripod, and background skyline are recognizable, and large patches of sky and field appear smooth. However, the fine structure is heavily simplified. Edges are softened, especially in the facial detail, hair, and tripod joints, and textured regions such as the grass lose their richness.

At \(k=22\) the compression ratio drops to about \(11.6{:}1\), and the error falls near \(0.097\). The face, coat, and camera pick up sharper edges, and the inset detail begins to resemble the original image more closely, although some subtle shading and texture are still missing. By \(k=49\), with a compression ratio around \(5.2{:}1\) and error roughly \(0.064\), the reconstructed image is visually very close to the original outside of high-zoom inspection. Fine features such as individual strands of hair and subtle gradients in the sky are largely recovered.

The ranks \(k=111\) and \(k=249\) correspond to compression ratios of about \(2.3{:}1\) and \(1.0{:}1\), with relative errors near \(0.036\) and \(0.011\) respectively. At this point the reconstructions become almost indistinguishable from the original across the full image, and only careful side-by-side comparisons reveal small differences. These examples illustrate a regime of diminishing returns: increasing \(k\) beyond about fifty improves the relative error and PSNR, but the gains in perceived quality are modest compared to those obtained when moving from very low rank to moderate rank.

3.3 Quantitative metrics versus compression ratio

Code
k_values = np.unique(np.concatenate([
    np.arange(1, 51),
    np.arange(55, 121, 5),
    np.arange(130, 301, 10),
    np.arange(310, 401, 10),
    np.arange(410, 481, 5),
    np.arange(486, 513, 2)
]))

fro_errors = []
for k in k_values:
    recon = reconstruct_rank_k(U, S, Vt, k)
    error = np.linalg.norm(image - recon, 'fro') / np.linalg.norm(image, 'fro')
    fro_errors.append(error)

plt.figure(figsize=(6.5, 4))
plt.semilogy(k_values, fro_errors, color="black", linewidth=1.4)
plt.xlabel("Rank k")
plt.ylabel("Relative Frobenius Error")
plt.title("Reconstruction Error vs Rank")
plt.grid(axis="y", linestyle=":", alpha=0.6)
plt.tight_layout()
plt.show()
Figure 4: Relative Frobenius error vs rank
Code
metrics_k = k_values  # reuse dense grid

compression_ratios = []
psnr_values = []
hf_snr_values = []

for k in metrics_k:
    recon = reconstruct_rank_k(U, S, Vt, k)
    cr, _ = compression_performance(m, n, k)
    psnr_val = peak_signal_noise_ratio(image, recon, data_range=1.0)
    hf_val = hf_snr(image, recon)
    
    compression_ratios.append(cr)
    psnr_values.append(psnr_val)
    hf_snr_values.append(hf_val)

fig, axes = plt.subplots(1, 2, figsize=(12, 4.5))
axes[0].semilogx(compression_ratios, psnr_values, color="black", linewidth=1.4)
axes[0].set_xlabel("Compression Ratio")
axes[0].set_ylabel("PSNR (dB)")
axes[0].set_title("PSNR vs Compression Ratio")

axes[1].semilogx(compression_ratios, hf_snr_values, color="crimson", linewidth=1.4)
axes[1].set_xlabel("Compression Ratio")
axes[1].set_ylabel("HF-SNR (dB)")
axes[1].set_title("HF-SNR vs Compression Ratio")

plt.tight_layout()
plt.show()
Figure 5: Perceptual metrics vs compression ratio

The error curves in Figure 4 and Figure 5 summarize these trade-offs. The relative Frobenius error decreases rapidly as \(k\) increases, consistent with the cumulative energy plot. PSNR starts at relatively low values for extremely high compression (large compression ratio) and then increases as we retain more singular values. Meanwhile, compression ratio itself decreases from around \(25{:}1\) toward \(1{:}1\). By accepting a modest increase in storage from roughly \(4\%\) to \(20\%\) of the original size, we obtain reconstructions with very small matrix error and high PSNR.

HF-SNR reveals a slightly different story. At extremely aggressive compression, high-frequency content is suppressed alongside noise, so HF-SNR can be artificially high even when textures and fine edges are visibly over-smoothed. As we increase \(k\), the metric initially improves because we recover genuine high-frequency structure that existed in the clean image. Beyond a certain rank, HF-SNR begins to plateau and then eventually declines once we start reintroducing small-scale numerical artifacts associated with very small singular values. In our experiments, the sweet spot where compression, PSNR, HF-SNR, and visual quality all align lies around \(k \approx 30\)–\(60\), where the compression ratio is still comfortably above \(4{:}1\).
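
One simple way to inspect this regime programmatically (a sketch reusing metrics_k, compression_ratios, psnr_values, and hf_snr_values from the block above) is to restrict attention to the ranks whose compression ratio stays at or above \(4{:}1\) and read off the metrics there:

Code
# Sketch: tabulate metrics for ranks whose compression ratio stays at or above 4:1,
# reusing metrics_k, compression_ratios, psnr_values, hf_snr_values from above.
cr_arr = np.array(compression_ratios)
keep_cr = cr_arr >= 4.0
sweet_spot = pd.DataFrame({
    "k": np.asarray(metrics_k)[keep_cr],
    "CR": cr_arr[keep_cr].round(1),
    "PSNR (dB)": np.array(psnr_values)[keep_cr].round(2),
    "HF-SNR (dB)": np.array(hf_snr_values)[keep_cr].round(2),
})
print(sweet_spot.tail(10))   # ranks closest to the 4:1 boundary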

4 Denoising via the Marcenko–Pastur Edge

4.1 Noise-induced lifting of the singular value tail

When we corrupt the camera image with Gaussian noise, the most striking change in the spectrum is in the tail. The leading singular values remain relatively stable as long as the noise level is moderate, since they represent large-scale structures with strong signal energy. The small singular values, however, are dramatically inflated. Figure 6 overlays the clean-image spectrum, the noisy spectra for a range of noise levels, and the Marcenko–Pastur edge that a pure-noise matrix of the same size and \(\sigma\) would produce. As the noise standard deviation \(\sigma\) increases from \(0.01\) to \(0.5\), the noisy spectrum’s tail lifts toward the corresponding noise edge, and an ever larger share of the spectrum becomes noise-dominated.

The Marcenko–Pastur theory predicts that, for a noise-only matrix, almost all singular values should lie below the edge \(\sigma(\sqrt{m} + \sqrt{n})\). We draw this edge on the spectrum and observe where the noisy image’s singular values cross it. Modes whose singular values lie below the edge are statistically indistinguishable from those of pure noise, while those above the edge are likely to carry a genuine signal component. This observation motivates a rank selection rule that is independent of the particular image content and depends only on the noise level and matrix dimensions.

Code
np.random.seed(7)
noise_levels_view = np.logspace(-2, np.log10(0.5), 40)  # 40 noise levels from 0.01 to 0.5 (log-spaced)
cmap = plt.cm.copper
norm = plt.Normalize(min(noise_levels_view), max(noise_levels_view))

fig, ax = plt.subplots(figsize=(8.5, 5))
clean_line, = ax.semilogy(S, label="Clean", color="black", linewidth=1.5)

for sigma in noise_levels_view:
    color = cmap(norm(sigma))
    noisy = np.clip(image + sigma * np.random.randn(m, n), 0, 1)
    _, s_noisy, _ = svd(noisy, full_matrices=False)
    edge = mp_singular_edge(m, n, sigma)

    cutoff_candidates = np.where(s_noisy < edge)[0]
    cutoff_idx = cutoff_candidates[0] if len(cutoff_candidates) else len(s_noisy) - 1
    cutoff_val = s_noisy[cutoff_idx]

    ax.semilogy(s_noisy, color=color, linewidth=1.1)
    ax.axhline(edge, color=color, linewidth=0.5, linestyle="--", alpha=0.5)
    ax.scatter(cutoff_idx, cutoff_val, color='red', s=4, marker="o", zorder=5)

ax.set_xlabel("Index i")
ax.set_ylabel("Singular value (log)")
ax.set_title("Noise lifts the tail")

mp_handle = Line2D([0], [0], color="gray", linestyle="--", label="MP edge")
cut_handle = Line2D([0], [0], marker="o", color="red", linestyle="None", label="Intersection")
ax.legend(handles=[clean_line, mp_handle, cut_handle], loc="upper right")
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
sm.set_array([])
plt.colorbar(sm, ax=ax, label="Noise level σ")

plt.tight_layout()
plt.show()
Figure 6: Singular value spectra of the clean and noisy images across noise levels; dashed lines show the MP edges and red markers the cutoff indices

4.2 Hard thresholding at the MP edge

Our denoising algorithm proceeds as follows. For a noisy image at a given \(\sigma\), we compute its SVD and evaluate the upper Marcenko–Pastur edge \(\sigma(\sqrt{m} + \sqrt{n})\) for the same \(\sigma\) and dimensions. We keep only the singular values larger than this edge, zeroing out the rest, and the resulting truncated SVD defines the denoised image. This is effectively a hard-thresholding operation in the singular value domain, akin to wavelet-based hard thresholding but in a basis adapted to the image itself.

Code
np.random.seed(10)
sigma_demo = 0.12
noise = sigma_demo * np.random.randn(m, n)
noisy_img = np.clip(image + noise, 0, 1)
U_n, S_n, Vt_n = svd(noisy_img, full_matrices=False)
mp_recon = mp_hard_threshold(U_n, S_n, Vt_n, m, n, sigma_demo)

# Visualize spectrum trimming
fig, axes = plt.subplots(1, 3, figsize=(14, 4.5))

axes[0].imshow(noisy_img, cmap="gray")
axes[0].axis("off")
axes[0].set_title(f"Noisy (σ={sigma_demo})")
add_detail_inset(axes[0], noisy_img, title="Detail")

axes[1].semilogy(S_n, color="black", linewidth=1.2)
edge_val = mp_singular_edge(m, n, sigma_demo)
axes[1].axhline(edge_val, color="crimson", linestyle="--", label="MP edge")
keep_mask = S_n > edge_val
axes[1].scatter(np.where(keep_mask)[0], S_n[keep_mask], color="green", s=12, label="Kept")
axes[1].scatter(np.where(~keep_mask)[0], S_n[~keep_mask], color="gray", s=12, alpha=0.6, label="Trimmed")
axes[1].set_title("Spectrum with MP cutoff")
axes[1].set_xlabel("Index i")
axes[1].set_ylabel("Singular value (log)")
axes[1].legend()

axes[2].imshow(mp_recon, cmap="gray")
axes[2].axis("off")
axes[2].set_title("MP denoised")
add_detail_inset(axes[2], mp_recon, title="Detail")

plt.tight_layout()
plt.show()

noisy_hf = hf_snr(image, noisy_img)
mp_hf = hf_snr(image, mp_recon)
print(f"HF-SNR noisy vs clean (dB): {noisy_hf:.3f}")
print(f"HF-SNR after MP denoising (dB): {mp_hf:.3f}")
print(f"Improvement: {mp_hf - noisy_hf:+.3f} dB")
Figure 7: MP hard-thresholding removes tail noise while keeping head singular values
HF-SNR noisy vs clean (dB): -11.589
HF-SNR after MP denoising (dB): -2.811
Improvement: +8.778 dB

Figure 7 shows that this procedure discards a long tail of noise-dominated singular values while leaving the head largely intact. The HF-SNR of a representative noisy image improves from approximately \(-11.6\) dB to about \(-2.8\) dB after denoising, a gain of roughly \(8.8\) dB.

4.3 Visual assessment across noise levels

Code
np.random.seed(11)
noise_eval = np.geomspace(0.01, 0.5, 20)  # denser grid for plateau detection
hf_noisy, hf_mp, ranks_kept = [], [], []

for sigma in noise_eval:
    noisy = np.clip(image + sigma * np.random.randn(m, n), 0, 1)
    hf_noisy.append(hf_snr(image, noisy))
    
    U_n, S_n, Vt_n = svd(noisy, full_matrices=False)
    recon_mp = mp_hard_threshold(U_n, S_n, Vt_n, m, n, sigma)
    hf_mp.append(hf_snr(image, recon_mp))
    ranks_kept.append(np.sum(S_n > mp_singular_edge(m, n, sigma)))

gain = np.array(hf_mp) - np.array(hf_noisy)
# Noise level at which the HF-SNR gain curve begins to plateau (read off Figure 8a)
plateau_sigma = 0.1

fig, axes = plt.subplots(1, 2, figsize=(11, 4.2))
axes[0].plot(noise_eval, hf_noisy, marker="o", label="Noisy", color="gray")
axes[0].plot(noise_eval, hf_mp, marker="o", label="MP denoised", color="black")
axes[0].set_xscale("log")
axes[0].set_xlabel("Noise level σ")
axes[0].set_ylabel("HF-SNR (dB)")
axes[0].set_title("Perceptual quality after MP denoising (HF-SNR)")
axes[0].legend()
axes[0].axvline(plateau_sigma, color="crimson", linestyle="--", alpha=0.7)

axes[1].plot(noise_eval, gain, marker="o", color="crimson")
axes[1].set_xscale("log")
axes[1].axhline(0, color="black", linewidth=0.9, linestyle=":")
axes[1].set_xlabel("Noise level σ")
axes[1].set_ylabel("HF-SNR gain (dB)")
axes[1].set_title("Denoising lift from MP cutoff")
axes[1].axvline(plateau_sigma, color="black", linestyle="--", alpha=0.7, label="Plateau start")
axes[1].legend()

for ax in axes:
    ax.grid(True, linestyle=":", alpha=0.5)

plt.tight_layout()
plt.show()

summary = pd.DataFrame({
    "sigma": noise_eval,
    "kept_rank": ranks_kept,
    "gain": gain
})

sigma_low = summary["sigma"].min()
sigma_high = summary["sigma"].max()
sigma_mid = plateau_sigma
triplet = [sigma_low, sigma_mid, sigma_high]

fig, axes = plt.subplots(3, 2, figsize=(8, 10))

for row, sigma in enumerate(triplet):
    np.random.seed(30 + row)
    noisy = np.clip(image + sigma * np.random.randn(m, n), 0, 1)
    U_n, S_n, Vt_n = svd(noisy, full_matrices=False)
    recon_mp = mp_hard_threshold(U_n, S_n, Vt_n, m, n, sigma)
    noisy_hf = hf_snr(image, noisy)
    recon_hf = hf_snr(image, recon_mp)
    gain_val = recon_hf - noisy_hf

    axes[row, 0].imshow(noisy, cmap="gray")
    axes[row, 0].axis("off")
    axes[row, 0].set_title(f"Noisy σ={sigma:.3f}\nHF-SNR={noisy_hf:.2f} dB")
    add_detail_inset(axes[row, 0], noisy, title="Detail")

    axes[row, 1].imshow(recon_mp, cmap="gray")
    axes[row, 1].axis("off")
    axes[row, 1].set_title(f"MP denoised\nHF-SNR={recon_hf:.2f} dB (gain {gain_val:+.2f})")
    add_detail_inset(axes[row, 1], recon_mp, title="Detail")

plt.tight_layout()
plt.show()
(a) HF-SNR lift from MP denoising across Gaussian noise levels
(b) Noisy and MP-denoised images at σ = 0.01, 0.10, and 0.50
Figure 8: MP denoising performance across noise levels

The images in Figure 8 help interpret these metrics. At a low noise level (\(\sigma = 0.01\)), the noisy image already looks very similar to the clean reference. The MP cutoff is conservative and removes a few small singular values that now carry a mix of signal and noise. As a result, the denoised image loses a bit of subtle texture, and HF-SNR actually decreases slightly (from about \(9.4\) dB to \(8.9\) dB). In this regime the best strategy is effectively to do nothing; the noise is too small to justify aggressive smoothing.

At a moderate noise level (\(\sigma = 0.10\)), the situation reverses. The noisy image contains visible grain, especially in flat regions such as the sky and background field, and the HF-SNR is around \(-10\) dB. After MP thresholding, HF-SNR jumps to roughly \(-2.6\) dB, with an improvement of more than \(7\) dB, and the visual noise is greatly reduced. Edges of the face, hair, and camera remain sharp, while homogeneous regions regain a smoother appearance.

Under heavy noise (\(\sigma = 0.50\)), the raw image is dominated by speckle, and HF-SNR drops near \(-21\) dB. The MP cutoff keeps only a very small head of the spectrum, producing a reconstruction that looks like a low-rank caricature of the scene. Fine details are gone, yet the gross structure, including the camera’s silhouette and major boundaries, re-emerges from the noise. HF-SNR improves by almost \(19\) dB to about \(-2.2\) dB, which is a large numerical gain that matches the qualitative improvement from “almost unrecognizable” to “noisy but clearly the same subject.”

5 Discussion

Working through the experiments, I saw both the promise and the limitations of SVD-based image processing, with practical trade-offs that are easy to miss in purely theoretical treatments.

First, while the energy concentration in leading singular values mathematically justifies compression, the subjective quality of low-rank approximations disappointed me. Retaining just 5–10% of the singular values preserves the “global structure” but produces noticeably blurred reconstructions that sacrifice texture and detail. The camera image’s smooth regions survive well, but facial features and fine patterns degrade unpleasantly. This exposes a limitation: SVD prioritizes mean-square error reduction over perceptual fidelity.

Second, I found that the metrics themselves disagree on what constitutes “good” reconstruction. Relative Frobenius error shows rapid improvement with rank, but this mathematical convenience masks how errors cluster in perceptually critical areas like edges. PSNR’s logarithmic scale offers better interpretability but still overestimates quality when smoothing occurs, which is precisely what truncated SVD does. HF-SNR’s focus on high frequencies reveals the method’s weakness: low-rank approximations systematically sacrifice edges and textures for global structure, scoring poorly even when PSNR appears acceptable. Even so, HF-SNR is not a perfect proxy for perceived quality.

Third, the Marcenko–Pastur denoising rule revealed deeper issues. While theoretically elegant, its “parameter-free” nature is misleading because it requires knowing the noise level σ, which is rarely available in practice. More concerning is the over-regularization visible even at moderate noise: the MP threshold often removes legitimate detail alongside noise, producing unnaturally smooth results. At σ=0.1, denoised images lose subtle texture while still retaining noise artifacts, a compromise that looks neither clean nor natural.
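
In practice \(\sigma\) would have to be estimated from the noisy data itself. A hedged sketch of one common workaround, using scikit-image's wavelet-based estimate_sigma (this estimator is not part of the pipeline above), is shown below:

Code
# Sketch: estimate the noise level from the noisy image itself, then reuse it
# for the MP threshold. Uses scikit-image's wavelet-based estimator.
from skimage.restoration import estimate_sigma

sigma_true = 0.1
noisy_est = np.clip(image + sigma_true * np.random.randn(m, n), 0, 1)
sigma_hat = estimate_sigma(noisy_est)            # robust estimate from wavelet detail coefficients
print(f"true sigma: {sigma_true:.3f}, estimated sigma: {sigma_hat:.3f}")

U_e, S_e, Vt_e = svd(noisy_est, full_matrices=False)
denoised_est = mp_hard_threshold(U_e, S_e, Vt_e, m, n, sigma_hat)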

The method’s failure at low noise levels (worsening HF-SNR) is particularly damning because it suggests the rule is too aggressive, discarding meaningful signal when noise is minimal. This is not just theoretical; it means the method could degrade high-quality images in real applications.

Finally, while the connection to ill-conditioned systems is mathematically neat, it reveals SVD’s fundamental conservatism: it treats small singular values as uniformly untrustworthy. But in images, “small” does not always mean “noise”; fine textures and subtle gradients also produce modest singular values. The method’s inability to distinguish these leads to oversmoothing, a problem familiar in linear regularization that more sophisticated approaches (like wavelet thresholding) address better.

6 Conclusions

After running these experiments, I am convinced that truncated SVD provides a mathematically elegant framework for compression and denoising, but practical implementation reveals significant shortcomings.

For compression, SVD achieves impressive theoretical ratios but produces visibly degraded images even at moderate compression levels. The method’s bias toward preserving low-frequency content results in “blurry” reconstructions that sacrifice precisely the details humans find most important. While 5–10% of singular values may preserve “structure,” they fail to preserve quality.

For denoising, the MP-based approach offers theoretical neatness but limited practical value. Its requirement of known noise levels, tendency toward over-smoothing, and degradation of clean images make it inferior to modern alternatives. The visual results, while better than raw noise according to the metrics, lack the crispness and natural appearance of state-of-the-art methods; in several cases I would rather look at the noisy image than at the denoised one.

The broader lesson is that mathematical optimality in one metric (Frobenius error) does not translate to perceptual quality. SVD’s linear, global approach fails to capture the spatially varying, nonlinear nature of both image structure and human perception. While the framework provides valuable conceptual insight into low-rank approximation and noise separation, its practical utility for image processing feels limited; it serves more as a demonstration of mathematical principles than a competitive solution.

7 References

  1. Ascher, U. M., & Greif, C. (2011). A First Course in Numerical Methods. SIAM.
  2. Eckart, C., & Young, G. (1936). “The approximation of one matrix by another of lower rank.” Psychometrika, 1(3), 211–218. DOI: 10.1007/BF02288367
  3. Gonzalez, R. C., & Woods, R. E. (2018). Digital Image Processing (4th ed.). Pearson.
  4. Marčenko, V. A., & Pastur, L. A. (1967). “Distribution of eigenvalues for some sets of random matrices.” Math. USSR-Sbornik, 1(4), 457–483.
  5. Andrews, H. C., & Patterson, C. L. (1976). “Singular value decompositions and digital image processing.” IEEE Trans. on Acoustics, Speech, and Signal Processing, 24(1), 26–53. DOI: 10.1109/TASSP.1976.1162766