Quantum-enhanced Computer Vision

Qubit

The qubit is the fundamental unit of quantum information, represented by a normalized two-dimensional complex vector \(|\psi\rangle \in \mathbb{C}^2, \| |\psi\rangle \|_2 = 1\).

Mathematically, we can define two orthonormal basis vectors called computational states:
\(|0\rangle = \begin{bmatrix}1\\0\end{bmatrix}, \quad |1\rangle = \begin{bmatrix}0\\1\end{bmatrix}\),
and express a qubit as a linear combination of the basis vectors:
\(|\psi\rangle = \alpha |0\rangle + \beta |1\rangle, \quad \alpha, \beta \in \mathbb{C}, \quad |\alpha|^2 + |\beta|^2 = 1\).

In column vector form, this corresponds to:
\(|\psi\rangle = \begin{bmatrix}\alpha\\\beta\end{bmatrix} = \begin{bmatrix}a+ib\\c+id\end{bmatrix}, \quad a,b,c,d \in \mathbb{R}, \quad a^2+b^2+c^2+d^2=1\).
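The definitions above can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper name `make_qubit` is illustrative and does not come from the text.

```python
import numpy as np

def make_qubit(alpha: complex, beta: complex) -> np.ndarray:
    """Return the normalized state vector proportional to [alpha, beta]."""
    psi = np.array([alpha, beta], dtype=complex)
    norm = np.linalg.norm(psi)
    if norm == 0:
        raise ValueError("the zero vector is not a valid qubit state")
    return psi / norm

# Computational basis states |0> and |1>
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# Unnormalized amplitudes (1+i, 1-i) are rescaled to unit norm
psi = make_qubit(1 + 1j, 1 - 1j)

# Real decomposition alpha = a + ib, beta = c + id
a, b = psi[0].real, psi[0].imag
c, d = psi[1].real, psi[1].imag
total = a**2 + b**2 + c**2 + d**2  # equals 1 up to rounding
```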

Geometric Interpretation: Bloch Sphere

Up to a global phase, a qubit state can be visualized on the Bloch sphere:
\(|\psi\rangle = \cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle\).
The polar angle \(\theta \in [0, \pi]\) and the azimuthal angle \(\phi \in [0, 2\pi)\) specify the qubit's position on the sphere.
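As an illustration, the sketch below maps the two angles to a state vector and back to Cartesian coordinates on the unit sphere. It assumes NumPy; the function names `bloch_state` and `bloch_vector` are hypothetical helpers, and the Cartesian formulas are the standard expectation values of the Pauli operators.

```python
import numpy as np

def bloch_state(theta: float, phi: float) -> np.ndarray:
    """|psi> = cos(theta/2)|0> + exp(i*phi) sin(theta/2)|1>."""
    return np.array([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)], dtype=complex)

def bloch_vector(psi: np.ndarray) -> np.ndarray:
    """Cartesian coordinates (x, y, z) of a normalized qubit state."""
    alpha, beta = psi
    x = 2 * (np.conj(alpha) * beta).real   # sin(theta) cos(phi)
    y = 2 * (np.conj(alpha) * beta).imag   # sin(theta) sin(phi)
    z = abs(alpha) ** 2 - abs(beta) ** 2   # cos(theta)
    return np.array([x, y, z])

north = bloch_vector(bloch_state(0.0, 0.0))              # |0>
plus_x = bloch_vector(bloch_state(np.pi / 2, 0.0))       # (|0> + |1>)/sqrt(2)
plus_y = bloch_vector(bloch_state(np.pi / 2, np.pi / 2)) # (|0> + i|1>)/sqrt(2)
```

A global phase multiplies both amplitudes by the same unit complex number and therefore leaves `bloch_vector` unchanged.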

Bloch sphere representation of a qubit

Qubit Measurement

When a qubit \(|\psi\rangle = \alpha |0\rangle + \beta |1\rangle \in \mathbb{C}^2\) is measured in the computational basis \(\{|0\rangle, |1\rangle\}\), the state collapses to one of the basis vectors with probabilities determined by the amplitudes:

\(|0\rangle \text{ with probability } |\alpha|^2 = |\langle 0|\psi\rangle|^2,\)
\(|1\rangle \text{ with probability } |\beta|^2 = |\langle 1|\psi\rangle|^2\).

In other words, a qubit exists in a superposition of classical states before measurement. Upon measuring, it collapses probabilistically: \(|\psi\rangle = \alpha |0\rangle + \beta |1\rangle \to |0\rangle \text{ or } |1\rangle\) with probabilities \(|\alpha|^2\) and \(|\beta|^2\), respectively. This process is also called the collapse of the wave function.
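The measurement rule can be simulated classically by sampling outcomes from the distribution given by the squared amplitudes. This is a minimal sketch, assuming NumPy; `measure` is an illustrative helper, and the example amplitudes are chosen arbitrarily.

```python
import numpy as np

def measure(psi: np.ndarray, shots: int, rng: np.random.Generator) -> np.ndarray:
    """Sample computational-basis outcomes (indices) with probability |amplitude|^2."""
    probs = np.abs(psi) ** 2
    probs = probs / probs.sum()  # guard against rounding errors
    return rng.choice(len(psi), size=shots, p=probs)

rng = np.random.default_rng(0)
psi = np.array([np.sqrt(0.2), 1j * np.sqrt(0.8)])  # |alpha|^2 = 0.2, |beta|^2 = 0.8

outcomes = measure(psi, shots=10_000, rng=rng)
freq1 = outcomes.mean()  # empirical frequency of outcome 1, close to 0.8
```

Each sample corresponds to one collapse of a freshly prepared copy of the state; the relative frequencies approach \(|\alpha|^2\) and \(|\beta|^2\) as the number of shots grows.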

Multi-Qubit Systems

For a system of \(n\) qubits, the state is a unit vector in \(\mathbb{C}^{2^n}\). Multi-qubit gates are unitary operators acting on \(\mathbb{C}^{2^n}\) and can generate entanglement by coupling qubits. Measurement generalizes directly: the probability of observing a specific computational basis state is given by the squared magnitude of its corresponding coefficient, regardless of whether the state is separable or entangled.

Visualizing on the Bloch Sphere

Measurement can be seen as projecting the qubit state \(|\psi\rangle\) onto the chosen axis (here the \(z\)-axis): the state collapses to \(|0\rangle\) or \(|1\rangle\) with probability \(|\alpha|^2\) or \(|\beta|^2\), respectively. By repeating the measurement on many identically prepared qubits, we can estimate the measurement expectation value, which is determined by the squared magnitudes of the projections of \(|\psi\rangle\) onto the basis states of the measurement axis; for the \(z\)-axis, \(\langle Z \rangle = |\alpha|^2 - |\beta|^2 = \cos\theta\).

Qubit measurement projections on the Bloch sphere

Multi-Qubit Systems, Entanglement

When multiple independently prepared qubits \(|\psi_1\rangle, \dots, |\psi_n\rangle\) are considered jointly, their combined state is represented by the tensor (Kronecker) product of the individual states:
\(|\psi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots \otimes |\psi_n\rangle \in \mathbb{C}^{2^n}\).
Such a collection of qubits forms a quantum register. A convenient shorthand notation is \(|\psi_1\psi_2\cdots\psi_n\rangle\).

Two-Qubit Example

For two qubits \( |\psi_1\rangle = \alpha|0\rangle + \beta|1\rangle \) and \( |\psi_2\rangle = \gamma|0\rangle + \delta|1\rangle \), the joint state is
\(|\psi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle = \begin{bmatrix} \alpha\gamma \\ \alpha\delta \\ \beta\gamma \\ \beta\delta \end{bmatrix}\).
Product states of this form are called separable; they correspond to rank-1 tensors and can be decomposed into individual qubit states.
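The two-qubit product can be reproduced with NumPy's `np.kron`. The amplitudes below are arbitrary example values, not taken from the text.

```python
import numpy as np

# |psi_1> = alpha|0> + beta|1>, |psi_2> = gamma|0> + delta|1>
alpha, beta = 0.6, 0.8j
gamma, delta = 1 / np.sqrt(2), 1 / np.sqrt(2)

psi1 = np.array([alpha, beta], dtype=complex)
psi2 = np.array([gamma, delta], dtype=complex)

# Kronecker product: entries ordered as |00>, |01>, |10>, |11>
joint = np.kron(psi1, psi2)
expected = np.array([alpha * gamma, alpha * delta, beta * gamma, beta * delta])

# Reshaped into a 2x2 coefficient matrix, a product state is an outer product (rank 1)
coeff_rank = np.linalg.matrix_rank(joint.reshape(2, 2))
```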

Entanglement

Multi-qubit systems are not restricted to separable states. A state that cannot be written as a tensor product of single-qubit states is called entangled. A canonical example is the two-qubit Bell (EPR) state:
\( |\psi\rangle = \frac{1}{\sqrt{2}} (|01\rangle + |10\rangle) = \frac{1}{\sqrt{2}} \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}\).
No coefficients \(\alpha,\beta,\gamma,\delta\) exist such that this state can be factorized into \( |\psi_1\rangle \otimes |\psi_2\rangle \); hence it is genuinely entangled.
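One way to see this: a product state would require \(\alpha\gamma = 0\) and \(\beta\delta = 0\) while \(\alpha\delta = \beta\gamma = 1/\sqrt{2}\), which is contradictory. Numerically, a two-qubit pure state is separable exactly when its \(2 \times 2\) coefficient matrix has rank 1. The sketch below, assuming NumPy and using the illustrative helper name `schmidt_rank`, counts the nonzero singular values of that matrix.

```python
import numpy as np

def schmidt_rank(state, tol: float = 1e-12) -> int:
    """Number of nonzero singular values of the 2x2 coefficient matrix.

    Rank 1 means the two-qubit pure state is a product state; rank 2 means entangled.
    """
    coeffs = np.asarray(state, dtype=complex).reshape(2, 2)
    singular_values = np.linalg.svd(coeffs, compute_uv=False)
    return int(np.sum(singular_values > tol))

bell = np.array([0, 1, 1, 0]) / np.sqrt(2)                   # (|01> + |10>)/sqrt(2)
product = np.kron([1, 0], np.array([1, 1]) / np.sqrt(2))     # |0> tensor |+>

rank_bell = schmidt_rank(bell)        # 2: entangled
rank_product = schmidt_rank(product)  # 1: separable
```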

Quantum State Evolution

The state of an \(n\)-qubit quantum system can be actively manipulated over time using controlled external interactions. Let \(|\psi(0)\rangle\) denote the initial state of the system. Its time evolution is governed by the system Hamiltonian \(H(t) \in \mathbb{C}^{2^n \times 2^n}\), which encodes the energies and couplings induced by the experimental setup.

Schrödinger Equation

The time evolution of the quantum state \(|\psi(t)\rangle\) is described by the Schrödinger equation:

\(i\hbar \frac{d}{dt} |\psi(t)\rangle = H(t)\,|\psi(t)\rangle\).

Here, \(i\) denotes the imaginary unit, \(\hbar\) is the reduced Planck constant, and the Hamiltonian \(H(t)\) is a Hermitian operator acting on the \(2^n\)-dimensional Hilbert space.
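For the special case of a time-independent Hamiltonian, the Schrödinger equation has the closed-form solution \(|\psi(t)\rangle = e^{-iHt/\hbar}|\psi(0)\rangle\). The sketch below evaluates this with an eigendecomposition; it assumes NumPy, units with \(\hbar = 1\) by default, and an illustrative single-qubit Hamiltonian \(H = \sigma^x\) that is not taken from the text.

```python
import numpy as np

def evolve(psi0: np.ndarray, H: np.ndarray, t: float, hbar: float = 1.0) -> np.ndarray:
    """Apply U = exp(-i H t / hbar) to psi0 for a constant Hermitian H."""
    evals, evecs = np.linalg.eigh(H)               # H = V diag(evals) V^dagger
    phases = np.exp(-1j * evals * t / hbar)
    U = (evecs * phases) @ evecs.conj().T          # V diag(phases) V^dagger
    return U @ psi0

H = np.array([[0, 1], [1, 0]], dtype=complex)      # example Hamiltonian (Pauli X)
psi0 = np.array([1, 0], dtype=complex)             # start in |0>

psi_t = evolve(psi0, H, t=np.pi / 2)               # exp(-i X pi/2) = -i X, so psi_t = -i|1>
norm_t = np.linalg.norm(psi_t)                     # unitary evolution preserves the norm
```

Because \(H\) is Hermitian, the resulting operator is unitary, which is why the norm of the state is conserved.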

Hamiltonian

Put simply, and in analogy to classical computing, a Hamiltonian can be viewed as an energy function—a mathematical expression describing how energy is distributed across a quantum system. A time-dependent Hamiltonian defines an evolving energy landscape that governs the system’s dynamics through the Schrödinger equation.
The specific structure of the Hamiltonian, together with how this evolution is realized or approximated in time, fundamentally determines the quantum computing paradigm. In gate-based quantum computing, evolution is discretized into sequences of unitary operations (quantum gates), whereas in adiabatic quantum computing (AQC) the system is steered continuously by slowly varying the Hamiltonian toward the ground state encoding the solution.

Gate-Based Quantum Computing (GQC)

Following the Schrödinger evolution of a quantum system, one of the primary ways to manipulate qubits is via quantum gates. These gates are unitary operators applied to one or multiple qubits, defining the discrete-time evolution of the system. Quantum circuits are sequences of such gates that perform computations analogous to classical logic circuits, but in a reversible and linear-algebraic manner.

Single-Qubit Gates

Single-qubit gates act on a single qubit \(|\psi\rangle\). Common examples include the Pauli gates:
\(X = \begin{pmatrix}0 & 1\\1 & 0\end{pmatrix},\; Y = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix},\; Z = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}\).

For a qubit state \(|\psi\rangle = \alpha |0\rangle + \beta |1\rangle = \begin{pmatrix}\alpha \\ \beta \end{pmatrix}\), applying X yields:
\(X|\psi\rangle = \beta |0\rangle + \alpha |1\rangle\).
This swaps the amplitudes of the basis states.

The Hadamard gate \(H = \frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}\) creates superposition:
\(H|0\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}, \quad H|1\rangle = \frac{|0\rangle - |1\rangle}{\sqrt{2}}\).
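These single-qubit gates are small enough to verify by direct matrix multiplication. A minimal sketch, assuming NumPy; the example amplitudes 0.6 and 0.8 are arbitrary.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

psi = np.array([0.6, 0.8], dtype=complex)   # alpha = 0.6, beta = 0.8
flipped = X @ psi                           # amplitudes swapped: [0.8, 0.6]

plus = H @ ket0                             # (|0> + |1>)/sqrt(2)
minus = H @ ket1                            # (|0> - |1>)/sqrt(2)

# Every gate U must satisfy U^dagger U = I
all_unitary = all(np.allclose(U.conj().T @ U, np.eye(2)) for U in (X, Y, Z, H))
```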

Multi-Qubit Gates

For n qubits, single-qubit gates are tensored to act on the full system. For example, the 2-qubit Hadamard gate is:
\(H^{\otimes 2} = H \otimes H = \frac{1}{2} \begin{pmatrix}1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1\\ 1 & 1 & -1 & -1\\ 1 & -1 & -1 & 1\end{pmatrix}\).
Applying \(H^{\otimes 2}\) twice in sequence yields the identity operation \(I_4\).

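Controlled gates apply an operation to a target qubit conditioned on the state of a control qubit, and they can generate entanglement when the control qubit is in superposition. Example: the controlled-NOT (CNOT) gate, which flips the second qubit if the first qubit is \(|1\rangle\):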
\(\text{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad \text{CNOT}|10\rangle = |11\rangle\).
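The following sketch, assuming NumPy, checks the statements above and additionally shows the standard textbook construction in which a Hadamard on the first qubit followed by CNOT turns \(|00\rangle\) into an entangled Bell state. That circuit is an added illustration, not an example from the text.

```python
import numpy as np

I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

H2 = np.kron(H, H)                      # two-qubit Hadamard
h2_squared = H2 @ H2                    # should equal the 4x4 identity

ket00 = np.array([1, 0, 0, 0], dtype=float)
ket10 = np.array([0, 0, 1, 0], dtype=float)

cnot_on_10 = CNOT @ ket10               # |10> -> |11>

# Hadamard on qubit 1 only, then CNOT: (|00> + |11>)/sqrt(2)
bell = CNOT @ np.kron(H, I2) @ ket00
```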

Parameterized Gates and Variational Circuits

Parameterized gates are unitary rotations that depend on a real-valued parameter \(\theta\):

\(U(\theta) = e^{i \theta G} = \cos(\theta) I + i \sin(\theta) G\),

where \(G\) is Hermitian and satisfies \(G^2 = I\) (e.g., a Pauli operator), so that it generates rotations around a Bloch sphere axis. Parameterized Quantum Circuits (PQC) use such gates to prepare states \(|\psi(\theta)\rangle = U(\theta)|0\rangle\), optimizing \(\theta\) to minimize a cost function \(\langle \psi(\theta)|M|\psi(\theta)\rangle\) for a Hermitian observable \(M\).
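A toy variational loop makes this concrete. The sketch below assumes NumPy, picks \(G = X\) and \(M = Z\) purely for illustration, and uses plain gradient descent with a central finite-difference gradient; the step size and iteration count are arbitrary choices. For this choice the cost is \(\cos(2\theta)\), so the minimum value is \(-1\).

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def rotation(theta: float, G: np.ndarray) -> np.ndarray:
    """U(theta) = cos(theta) I + i sin(theta) G, valid when G @ G equals I."""
    return np.cos(theta) * I2 + 1j * np.sin(theta) * G

def cost(theta: float, G: np.ndarray = X, M: np.ndarray = Z) -> float:
    """<psi(theta)| M |psi(theta)> with |psi(theta)> = U(theta)|0>."""
    psi = rotation(theta, G) @ np.array([1, 0], dtype=complex)
    return float(np.real(np.vdot(psi, M @ psi)))

def minimize(theta: float, lr: float = 0.2, steps: int = 200, eps: float = 1e-6) -> float:
    """Gradient descent using a central finite-difference gradient."""
    for _ in range(steps):
        grad = (cost(theta + eps) - cost(theta - eps)) / (2 * eps)
        theta -= lr * grad
    return theta

# Check the closed form against the matrix exponential exp(i * theta * X)
evals, evecs = np.linalg.eigh(X)
U_exp = (evecs * np.exp(1j * 0.7 * evals)) @ evecs.conj().T
matches_exponential = np.allclose(U_exp, rotation(0.7, X))

theta_opt = minimize(0.3)       # converges towards pi/2, where the state is i|1>
final_cost = cost(theta_opt)
```

On hardware the gradient would be estimated from repeated measurements rather than from exact state vectors; the finite-difference rule here is only a stand-in.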

This approach underpins variational quantum algorithms and Quantum Machine Learning.

Adiabatic Quantum Computing (AQC) & Quantum Annealing

Unlike gate-based QC, AQC evolves a quantum system continuously under a time-dependent Hamiltonian \(H(t)\). The system is prepared in the ground state of an initial Hamiltonian \(H_I\) and evolves slowly towards a problem Hamiltonian \(H_P\), whose ground state encodes the solution to an optimization problem.

Problem Hamiltonian

The typical problem Hamiltonian is an Ising-type Hamiltonian:
\(H_P = \sum_{i,j} J_{i,j} \sigma^z_i \sigma^z_j + \sum_i b_i \sigma^z_i\)
with \(\sigma^z_i\) the Pauli-Z operator on qubit \(i\) embedded in the \(n\)-qubit space:
\(\sigma^z_i = I \otimes \dots \otimes I \otimes \sigma^z \otimes I \otimes \dots \otimes I,\quad \sigma^z = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix},\; I = \begin{pmatrix}1 & 0\\0 & 1\end{pmatrix}\).
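For a handful of qubits, \(H_P\) can be assembled explicitly from Kronecker products. The sketch below assumes NumPy, indexes qubits from 0 with qubit 0 as the leftmost tensor factor, and uses a small example instance (`J`, `b`) invented for illustration. Since every term is a product of \(\sigma^z\) operators, the resulting matrix is diagonal, and each diagonal entry is the Ising energy of the corresponding spin configuration (bit 0 maps to spin \(+1\), bit 1 to spin \(-1\)).

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])

def embed(op: np.ndarray, i: int, n: int) -> np.ndarray:
    """Place `op` on qubit i of an n-qubit register, identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(n)])

def ising_hamiltonian(J: np.ndarray, b: np.ndarray) -> np.ndarray:
    """H_P = sum_ij J_ij sz_i sz_j + sum_i b_i sz_i as a dense 2^n x 2^n matrix."""
    n = len(b)
    Zs = [embed(SZ, i, n) for i in range(n)]
    H = np.zeros((2 ** n, 2 ** n))
    for i in range(n):
        H += b[i] * Zs[i]
        for j in range(n):
            if J[i, j] != 0:
                H += J[i, j] * Zs[i] @ Zs[j]
    return H

def spins_of_index(k: int, n: int) -> np.ndarray:
    """Spin configuration of basis state k (most significant bit = qubit 0)."""
    bits = [(k >> (n - 1 - i)) & 1 for i in range(n)]
    return np.array([1 - 2 * bit for bit in bits], dtype=float)

# Hypothetical 3-spin instance: couplings s0*s1 + s1*s2 and fields 0.5*s0 + 0.5*s2
J = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])
b = np.array([0.5, 0.0, 0.5])

H_P = ising_hamiltonian(J, b)
ground_index = int(np.argmin(np.diag(H_P)))      # basis state |101>
ground_spins = spins_of_index(ground_index, 3)   # spins (-1, +1, -1)
```

The dense representation grows as \(4^n\) entries, so this construction is only practical for very small \(n\).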

The ground state of \(H_P\) corresponds to the solution of the Ising problem:
\(\min_{s \in \{-1,+1\}^n} s^\top J s + s^\top b\)
or, equivalently up to a constant offset via the substitution \(s = 2x - \mathbf{1}\), the QUBO formulation:
\(\min_{x \in \{0,1\}^n} x^\top Q x + x^\top c\).
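Substituting \(s = 2x - \mathbf{1}\) into the Ising objective gives \(Q = 4J\), \(c = 2b - 2(J + J^\top)\mathbf{1}\), and a constant offset \(\mathbf{1}^\top J \mathbf{1} - \mathbf{1}^\top b\) that does not affect the minimizer. The sketch below, assuming NumPy and a small invented instance, verifies this conversion by brute force over all binary vectors; the function names are illustrative.

```python
import itertools
import numpy as np

def ising_energy(s: np.ndarray, J: np.ndarray, b: np.ndarray) -> float:
    return float(s @ J @ s + s @ b)

def ising_to_qubo(J: np.ndarray, b: np.ndarray):
    """Return (Q, c, offset) such that E_ising(2x - 1) = x^T Q x + x^T c + offset."""
    ones = np.ones(len(b))
    Q = 4 * J
    c = 2 * b - 2 * (J + J.T) @ ones
    offset = float(ones @ J @ ones - ones @ b)
    return Q, c, offset

def brute_force_qubo(Q: np.ndarray, c: np.ndarray):
    """Exhaustively minimize x^T Q x + x^T c over x in {0,1}^n (small n only)."""
    best_x, best_val = None, np.inf
    for bits in itertools.product([0, 1], repeat=len(c)):
        x = np.array(bits, dtype=float)
        val = float(x @ Q @ x + x @ c)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Hypothetical 3-variable instance
J = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])
b = np.array([0.5, 0.0, 0.5])

Q, c, offset = ising_to_qubo(J, b)
x_opt, qubo_min = brute_force_qubo(Q, c)
s_opt = 2 * x_opt - 1                      # map the binary solution back to spins
ising_min = ising_energy(s_opt, J, b)      # equals qubo_min + offset
```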

Adiabatic Evolution

The system evolves under a time-dependent Hamiltonian:

\(H(t) = (1-f(t)) H_I + f(t) H_P,\quad f: [0,T] \to [0,1],\quad f(0) = 0,\; f(T) = 1\).

Starting in the ground state of \(H_I\), a sufficiently slow evolution keeps the system in the instantaneous ground state of \(H(t)\) until \(H(T) = H_P\) is reached, according to the Quantum Adiabatic Theorem.

The state can be expanded in the eigenbasis of \(H(t)\):
\(|\psi(t)\rangle = \sum_k c_k(t) |k(t)\rangle, \quad H(t)|k(t)\rangle = \lambda_k(t)|k(t)\rangle\).
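For a few qubits, the interpolated evolution can be simulated directly on a classical computer. The sketch below assumes NumPy, units with \(\hbar = 1\), a linear schedule \(f(t) = t/T\), a transverse-field initial Hamiltonian \(H_I = -\sum_i \sigma^x_i\), and a small invented 3-spin problem. Time is split into short steps, and in each step the Hamiltonian is treated as constant at its midpoint value. All of these are illustrative choices, not the procedure of any particular device.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])

def embed(op, i, n):
    """Place `op` on qubit i of an n-qubit register, identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(n)])

def hamiltonians(J, b, kappa=1.0):
    """Return (H_I, H_P) as dense matrices for an Ising instance (J, b)."""
    n = len(b)
    Zs = [embed(SZ, i, n) for i in range(n)]
    H_P = sum(b[i] * Zs[i] for i in range(n))
    H_P = H_P + sum(J[i, j] * Zs[i] @ Zs[j] for i in range(n) for j in range(n))
    H_I = -kappa * sum(embed(SX, i, n) for i in range(n))
    return H_I, H_P

def anneal(H_I, H_P, T, steps):
    """Integrate the Schroedinger equation for H(t) = (1 - t/T) H_I + (t/T) H_P."""
    dim = H_I.shape[0]
    psi = np.ones(dim, dtype=complex) / np.sqrt(dim)  # ground state of H_I
    dt = T / steps
    for k in range(steps):
        f = (k + 0.5) / steps                         # schedule value at the step midpoint
        evals, evecs = np.linalg.eigh((1 - f) * H_I + f * H_P)
        # Apply exp(-i H dt) in the instantaneous eigenbasis
        psi = evecs @ (np.exp(-1j * evals * dt) * (evecs.conj().T @ psi))
    return psi

J = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])
b = np.array([0.5, 0.0, 0.5])
H_I, H_P = hamiltonians(J, b)
target = int(np.argmin(np.diag(H_P)))                 # index of the H_P ground state

p_slow = abs(anneal(H_I, H_P, T=100.0, steps=2000)[target]) ** 2  # slow sweep
p_fast = abs(anneal(H_I, H_P, T=0.01, steps=10)[target]) ** 2     # near-instant sweep
```

Comparing `p_slow` with `p_fast` illustrates the role of the runtime: a slow sweep ends with most of the probability on the ground state of \(H_P\), whereas a near-instant sweep leaves the state close to the initial uniform superposition.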

Initial Hamiltonian & Spectral Gap

A common choice for the initial Hamiltonian is:
\(H_I = - \sum_i \kappa \sigma^x_i, \quad \sigma^x = \begin{pmatrix}0 & 1\\1 & 0\end{pmatrix}\).
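The minimum energy gap between the ground state and the first excited state during the evolution, known as the spectral gap \(\Delta_{\min} = \min_t \big(\lambda_1(t) - \lambda_0(t)\big)\) with \(\lambda_0(t) \le \lambda_1(t)\) the two lowest eigenvalues of \(H(t)\), governs the required adiabatic runtime \(T\): the smaller the gap, the slower the evolution must be. A commonly cited sufficient condition scales as \(T \propto 1/\Delta_{\min}^2\).

If the instantaneous ground state remains non-degenerate throughout the evolution, the gap stays positive and the adiabatic theorem applies.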
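For small instances the gap can be computed by diagonalizing the interpolated Hamiltonian on a grid of schedule values. The sketch below assumes NumPy, \(\kappa = 1\), and an invented 3-spin problem; `spectral_gaps` is an illustrative helper name.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])

def embed(op, i, n):
    """Place `op` on qubit i of an n-qubit register, identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(n)])

def spectral_gaps(H_I, H_P, num=201):
    """Gap between the two lowest eigenvalues of (1 - f) H_I + f H_P for f in [0, 1]."""
    fs = np.linspace(0.0, 1.0, num)
    gaps = []
    for f in fs:
        lam = np.linalg.eigvalsh((1 - f) * H_I + f * H_P)  # sorted ascending
        gaps.append(lam[1] - lam[0])
    return fs, np.array(gaps)

n = 3
Zs = [embed(SZ, i, n) for i in range(n)]
# Hypothetical problem: s0*s1 + s1*s2 + 0.5*s0 + 0.5*s2
H_P = Zs[0] @ Zs[1] + Zs[1] @ Zs[2] + 0.5 * Zs[0] + 0.5 * Zs[2]
H_I = -sum(embed(SX, i, n) for i in range(n))

fs, gaps = spectral_gaps(H_I, H_P)
min_gap = gaps.min()
f_at_min = fs[int(np.argmin(gaps))]   # schedule value where the gap is smallest
```

Exact diagonalization scales exponentially in the number of qubits, so this check is only feasible for toy problems; for realistic sizes the gap is generally unknown in advance.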

Quantum Annealing & Simulated Analogues

Quantum annealers implement AQC physically, possibly non-adiabatically, returning the ground state with some probability. Repeated runs improve the likelihood of finding the optimal solution.

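Classical analogues include simulated annealing, which at iteration \(i\) accepts a candidate state \(x^{(i)}_\text{candidate}\) with the Metropolis probability
\(P = \min \Big(1, \exp\Big(\frac{E(x^{(i-1)}) - E(x^{(i)}_\text{candidate})}{T_i}\Big)\Big)\),
where \(E\) is the energy (cost) function and \(T_i\) is the temperature at iteration \(i\) (not to be confused with the adiabatic runtime \(T\)), which is gradually lowered.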
Simulated quantum annealing extends this by sampling multiple states, simulating tunneling, superposition, and entanglement.
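The classical acceptance rule can be sketched in a few lines for an Ising objective. The example below assumes NumPy, single-spin-flip proposals, and a geometric cooling schedule; the instance, temperatures, and iteration count are arbitrary illustrative choices. It implements plain simulated annealing only, not simulated quantum annealing.

```python
import numpy as np

def ising_energy(s, J, b):
    return float(s @ J @ s + s @ b)

def simulated_annealing(J, b, n_iter=2000, t_start=2.0, t_end=0.01, seed=0):
    """Minimize s^T J s + s^T b over s in {-1,+1}^n with Metropolis updates."""
    rng = np.random.default_rng(seed)
    n = len(b)
    s = rng.choice([-1.0, 1.0], size=n)            # random initial configuration
    e = ising_energy(s, J, b)
    best_s, best_e = s.copy(), e
    for i in range(n_iter):
        # Geometric cooling from t_start down to t_end
        temp = t_start * (t_end / t_start) ** (i / (n_iter - 1))
        cand = s.copy()
        cand[rng.integers(n)] *= -1                # propose flipping one random spin
        e_cand = ising_energy(cand, J, b)
        # Metropolis rule: accept with probability min(1, exp((E_old - E_cand) / T_i))
        if e_cand <= e or rng.random() < np.exp((e - e_cand) / temp):
            s, e = cand, e_cand
            if e < best_e:
                best_s, best_e = s.copy(), e
    return best_s, best_e

# Hypothetical 3-spin instance with a unique minimum at s = (-1, +1, -1)
J = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])
b = np.array([0.5, 0.0, 0.5])

best_s, best_e = simulated_annealing(J, b)
```

At high temperature almost every move is accepted, which lets the search escape local minima; as the temperature drops, uphill moves become increasingly unlikely and the search settles into a low-energy configuration.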

Method	Paradigm	Problem	Input type	Problem size	# Qubits
QA, CVPR’20	AQC	Transformation estimation, Point set alignment	Point clouds	≤ 5k points	~140
IQT, CVPR’22	AQC	Transformation estimation	Point clouds	≤ 1.5k points	~15
QuAnt, ICLR’23	AQC	Transformation estimation, Point set alignment, Mesh alignment	Point clouds, meshes	≤ 2k points, 5 mesh vertices	~15
QuCOOP, CVPR’25	AQC	Point set alignment, Mesh alignment	Point clouds, meshes	≤ 50 points, 502 mesh vertices	~36
QGM, 3DV’20	AQC	Graph matching	Graphs	≤ 4 graph nodes	~50
Q-Match, ICCV’21	AQC	Mesh alignment	Meshes	≤ 502 mesh vertices	~250
CcuantuMM, CVPR’23	AQC	Mesh alignment	Meshes	≤ 1k mesh vertices	~40
QSync, CVPR’21	AQC	Permutation synchronisation, Graph matching	Permutation matrices	≤ 3×3 permutation matrices, 8 views	~72
QSQS, ECCV’20	AQC	Object detection	Bounding boxes	≤ 45 bounding boxes	~45
QMOT, CVPR’22	AQC	Object tracking	Tracks, detections	≤ 3 tracks, 5 frames	~100
Doan et al., CVPR’22	AQC	Robust fitting	Data points	≤ 100 points	~100
DeQUMF, CVPR’23	AQC	Multi-model fitting	Data points	≤ 1k models, 250 points	~100
Pandey et al., CVPRW’25	AQC	Robust multi-model fitting	Data points	≤ 2k models, 10k points	~120
Bauckhage et al., LWDA’18	AQC	k-means clustering	Data points	≤ 16 points	~16
Arthur and Date, QIP’21	AQC	Balanced k-means clustering	Data points	≤ 21 points, k = 3	~64
Nguyen et al., ArXiv’23	AQC	k-means clustering	Feature vectors	-	-
Zaech et al., CVPR’24	AQC	Balanced k-means clustering	Data points	≤ 45 points, k = 3	~45
Choong et al., CVPR’23	AQC	Single image super-resolution	Images, dictionary	15×20 LR → 45×60 HR	~100
QMSVM, IEEE’23	AQC	Multi-class support vector machines	Feature vectors	≤ 60 vectors, 3 classes	~360
Zardini et al., ArXiv’24	AQC	Multi-class support vector machines	Feature vectors	≤ 24 vectors, 3 classes	~144
Q-Seg, IEEE’24	AQC	Unsupervised image segmentation	Image patches	≤ 32×32 images	-
QuMoSeg, ECCV’22	AQC	Motion segmentation	Landmark points	≤ 16 landmarks, 2 motions	~128
Santos et al., MDPI’18	AQC	Stereo matching	Stereo images	≤ 15×15 image patches	-
Heidari et al., IVCNZ’21	AQC	Stereo matching	Stereo images	≤ 383×434 image patches	-
Braunstein et al., 3DV’24	AQC	Stereo matching	Stereo images	-	-
Hur et al., QMI’22	GQC	Binary image classification	Fashion-MNIST	≤ length-32 feature vectors	8
QDCNN, IOP’20	GQC	Multi-class image classification	MNIST, GTSRB	≤ 32×32 images	-
sQCNN-3D, Elsevier’23	GQC	Multi-class point cloud classification	ModelNet, ShapeNet	≤ 32×32×32 voxel grids	4
HQNN-Parallel, IOP’24	GQC	Multi-class image classification	Medical MNIST, CIFAR	≤ {64, 28, 32}² images	5
ATP, CVPR’25	GQC	Multi-class image classification	Fashion-MNIST, CIFAR	≤ {64, 28, 32}² images	-
Chin et al., ACCV’20	GQC	Robust fitting	Data points	-	-
Yang et al., ECCV’24	GQC	Robust fitting	Data points	≤ 4 points	19
3D-QAE, BMVC’23	GQC	Point cloud auto-encoding	Point clouds	≤ 16 points	6
MosaiQ, ICCV’23	GQC	Image generation	Fashion-MNIST	≤ length-10 noise vectors	5
Huang et al., APS’23	GQC	Image generation	Handwritten digits	≤ length-32 latent vectors	6
Kolle et al., ArXiv’24	GQC	Denoising diffusion model	Fashion-MNIST, CIFAR	≤ 32×32 images	7
Piatkowski, ArXiv’22	GQC	Bundle adjustment	Sets of images	≤ 32×32 image patches	-
QIREN, ICML’24	GQC	Neural fields, implicit neural representations	Coordinates	≤ 64×64 images	6
QVF, ArXiv’25	GQC	Neural fields, implicit neural representations	Coordinates	Images, 3D shapes	6