Quantum-enhanced Computer Vision (QeCV) is a research
field at the intersection of quantum computing and computer vision, focused on developing innovative
algorithms and techniques that leverage quantum paradigms to surpass classical methods in speed, efficiency,
and accuracy.
This webpage accompanies our survey paper, providing a visual overview of QeCV pipelines, detailed concept
explanations, and an interactive overview of the (non-exhaustive) published QeCV literature, complementing the
theoretical discussion.
The qubit is the fundamental unit of quantum information, represented by a normalized two-dimensional complex vector \(|\psi\rangle \in \mathbb{C}^2, \| |\psi\rangle \|_2 = 1\).
Mathematically, we can define two orthonormal basis vectors called computational states:
\(|0\rangle = \begin{bmatrix}1\\0\end{bmatrix}, \quad |1\rangle = \begin{bmatrix}0\\1\end{bmatrix}\),
and express a qubit as a linear combination of the basis vectors:
\(|\psi\rangle = \alpha |0\rangle + \beta |1\rangle, \quad \alpha, \beta \in \mathbb{C}, \quad |\alpha|^2 +
|\beta|^2 = 1\).
In column vector form, this corresponds to:
\(|\psi\rangle = \begin{bmatrix}\alpha\\\beta\end{bmatrix} = \begin{bmatrix}a+ib\\c+id\end{bmatrix}, \quad
a,b,c,d \in \mathbb{R}, \quad a^2+b^2+c^2+d^2=1\).
Up to a global phase, a qubit state can be visualized on the Bloch sphere:
\(|\psi\rangle = \cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle\).
Angles \(\theta\) and \(\phi\) specify the qubit's position on the sphere.
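The Bloch-sphere parametrization can be sketched numerically. This is an illustrative example only: NumPy and the helper names `bloch_state` and `bloch_vector` are assumptions made here, not part of the survey.

```python
import numpy as np

def bloch_state(theta: float, phi: float) -> np.ndarray:
    """Return cos(theta/2)|0> + e^{i phi} sin(theta/2)|1> as a complex vector."""
    return np.array(
        [np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)], dtype=complex
    )

def bloch_vector(psi: np.ndarray) -> np.ndarray:
    """Cartesian (x, y, z) coordinates of a normalized qubit on the Bloch sphere."""
    alpha, beta = psi
    overlap = np.conj(alpha) * beta
    x = 2 * overlap.real                    # expectation of Pauli X
    y = 2 * overlap.imag                    # expectation of Pauli Y
    z = abs(alpha) ** 2 - abs(beta) ** 2    # expectation of Pauli Z
    return np.array([x, y, z])

# theta = pi/2, phi = 0 lies on the equator of the sphere.
psi = bloch_state(np.pi / 2, 0.0)
point = bloch_vector(psi)
```

For these angles the point lies on the positive \(x\)-axis, and in general the coordinates equal \((\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)\).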
When a qubit \(|\psi\rangle = \alpha |0\rangle + \beta |1\rangle \in \mathbb{C}^2\) is measured in the computational basis \(\{|0\rangle, |1\rangle\}\), the state collapses to one of the basis vectors with probabilities determined by the amplitudes:
\(|0\rangle \text{ with probability } |\alpha|^2 = |\langle 0|\psi\rangle|^2,\)
\(|1\rangle \text{ with probability } |\beta|^2 = |\langle 1|\psi\rangle|^2\).
In other words, a qubit exists in a superposition of classical states before measurement. Upon measuring, it collapses probabilistically: \(|\psi\rangle = \alpha |0\rangle + \beta |1\rangle \to |0\rangle \text{ or } |1\rangle\) with probabilities \(|\alpha|^2\) and \(|\beta|^2\), respectively. This process is also called the collapse of the wave function.
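A measurement in the computational basis can be simulated classically by sampling from the distribution \((|\alpha|^2, |\beta|^2)\). The snippet below is a minimal sketch under that assumption; the function names and the example amplitudes are hypothetical.

```python
import numpy as np

def measurement_probabilities(psi):
    """Outcome probabilities: squared magnitudes of the amplitudes."""
    psi = np.asarray(psi, dtype=complex)
    return np.abs(psi) ** 2

def sample_measurements(psi, shots, seed=0):
    """Simulate `shots` independent measurements of freshly prepared copies of psi."""
    rng = np.random.default_rng(seed)
    probs = measurement_probabilities(psi)
    return rng.choice(len(probs), size=shots, p=probs)

# Example amplitudes (an assumption): |alpha|^2 = 0.2, |beta|^2 = 0.8.
psi = np.array([np.sqrt(0.2), 1j * np.sqrt(0.8)])
probs = measurement_probabilities(psi)
outcomes = sample_measurements(psi, shots=10_000)
freq_one = outcomes.mean()  # empirical frequency of outcome 1
```

Each simulated shot returns 0 or 1; only the relative frequencies over many shots approach \(|\alpha|^2\) and \(|\beta|^2\). The phase of \(\beta\) has no effect on these statistics.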
Multi-qubit gates are unitary operators acting on \(\mathbb{C}^{2^n}\) and can generate entanglement by coupling qubits. Measurement generalizes directly: the probability of observing a specific computational basis state is given by the squared magnitude of its corresponding coefficient, regardless of whether the state is separable or entangled.
Measurement can be seen as projecting the qubit state \(|\psi\rangle\) onto the chosen axis (here the \(z\)-axis), after which the state collapses to \(|0\rangle\) or \(|1\rangle\) with probability \(|\alpha|^2\) or \(|\beta|^2\), respectively. By repeating the preparation and measurement many times, we can estimate the measurement expectation value, which is determined by the squared magnitudes of the projections of \(|\psi\rangle\) onto the basis states along the measurement axis.
When multiple qubits \(|\psi_1\rangle, \dots, |\psi_n\rangle\) are considered jointly, their combined state
is represented by the
tensor (Kronecker) product of the individual states:
\(|\psi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots \otimes |\psi_n\rangle \in
\mathbb{C}^{2^n}\).
Such a collection of qubits forms a quantum register. A convenient shorthand notation is
\(|\psi_1\psi_2\cdots\psi_n\rangle\).
For two qubits \( |\psi_1\rangle = \alpha|0\rangle + \beta|1\rangle \) and \( |\psi_2\rangle =
\gamma|0\rangle + \delta|1\rangle \), the joint state is
\(|\psi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle = \begin{bmatrix} \alpha\gamma \\ \alpha\delta \\
\beta\gamma \\ \beta\delta \end{bmatrix}\).
Product states of this form are called separable; they correspond to rank-1 tensors and can
be decomposed into individual qubit states.
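The joint state of two qubits can be computed with a Kronecker product. The following is a small sketch using NumPy's `np.kron`; the particular amplitudes are chosen only for illustration.

```python
import numpy as np

# Two normalized single-qubit states (example values, not from the survey).
psi1 = np.array([0.6, 0.8], dtype=complex)            # alpha, beta
psi2 = np.array([1, 1j], dtype=complex) / np.sqrt(2)  # gamma, delta

# Joint state: [alpha*gamma, alpha*delta, beta*gamma, beta*delta].
joint = np.kron(psi1, psi2)

# A separable state reshaped to a 2x2 coefficient matrix has rank 1.
rank = np.linalg.matrix_rank(joint.reshape(2, 2))
```

The reshaped matrix is the outer product of the two amplitude vectors, which is why separable states correspond to rank-1 tensors.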
Multi-qubit systems are not restricted to separable states. A state that cannot be written as a tensor
product of single-qubit states is called entangled. A canonical example is the two-qubit
Bell (EPR) state:
\( |\psi\rangle = \frac{1}{\sqrt{2}} (|01\rangle + |10\rangle) = \frac{1}{\sqrt{2}} \begin{bmatrix} 0 \\ 1
\\ 1 \\ 0 \end{bmatrix}\).
No coefficients \(\alpha,\beta,\gamma,\delta\) exist such that this state can be factorized into \(
|\psi_1\rangle \otimes |\psi_2\rangle \); hence it is genuinely entangled.
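One way to check this numerically for two qubits is to count the non-zero singular values of the \(2 \times 2\) coefficient matrix (the Schmidt rank): a value of 1 means separable, a value of 2 means entangled. The sketch below assumes NumPy; `schmidt_rank` is a hypothetical helper name.

```python
import numpy as np

def schmidt_rank(state, tol=1e-10):
    """Count non-zero singular values of the 2x2 coefficient matrix of a two-qubit state."""
    coeffs = np.asarray(state, dtype=complex).reshape(2, 2)
    singular_values = np.linalg.svd(coeffs, compute_uv=False)
    return int(np.sum(singular_values > tol))

# Bell state (|01> + |10>) / sqrt(2) from the text.
bell = np.array([0, 1, 1, 0], dtype=complex) / np.sqrt(2)

# A separable comparison state |0> (x) (|0> + |1>) / sqrt(2).
product = np.kron([1, 0], [1 / np.sqrt(2), 1 / np.sqrt(2)])

bell_rank = schmidt_rank(bell)
product_rank = schmidt_rank(product)
```

This test applies to pure two-qubit states; larger registers require choosing a bipartition before reshaping.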
The state of an \(n\)-qubit quantum system can be actively manipulated over time using controlled external interactions. Let \(|\psi(0)\rangle\) denote the initial state of the system. Its time evolution is governed by the system Hamiltonian \(H(t) \in \mathbb{C}^{2^n \times 2^n}\), which encodes the energies and couplings induced by the experimental setup.
The time evolution of the quantum state \(|\psi(t)\rangle\) is described by the Schrödinger equation:
\(i\hbar \frac{d}{dt} |\psi(t)\rangle = H(t)\,|\psi(t)\rangle\).
Here, \(i\) denotes the imaginary unit, \(\hbar\) is the reduced Planck constant, and the Hamiltonian \(H(t)\) is a Hermitian operator acting on the \(2^n\)-dimensional Hilbert space.
Put simply, and in analogy to classical computing, a Hamiltonian can be viewed as an
energy function: a mathematical expression describing how energy is distributed across a
quantum system. A time-dependent Hamiltonian defines an evolving energy landscape that
governs the system's dynamics through the Schrödinger equation.
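For the special case of a time-independent Hamiltonian, the Schrödinger equation has the closed-form solution \(|\psi(t)\rangle = e^{-iHt/\hbar}|\psi(0)\rangle\). The sketch below evaluates it through an eigendecomposition; setting \(\hbar = 1\) and using the Pauli-X matrix as a toy Hamiltonian are assumptions made for illustration.

```python
import numpy as np

def evolve(psi0, H, t, hbar=1.0):
    """Solve i*hbar d|psi>/dt = H|psi> for a constant Hermitian H."""
    energies, vectors = np.linalg.eigh(H)           # H = V diag(E) V^dagger
    phases = np.exp(-1j * energies * t / hbar)      # e^{-i E t / hbar}
    U = vectors @ np.diag(phases) @ vectors.conj().T
    return U @ psi0

# Toy single-qubit Hamiltonian (assumption): H = X.
H = np.array([[0, 1], [1, 0]], dtype=complex)
psi0 = np.array([1, 0], dtype=complex)              # start in |0>
psi_t = evolve(psi0, H, t=np.pi / 2)                # population moves to |1>
```

Because \(H\) is Hermitian, the resulting evolution operator is unitary and the norm of the state is preserved. A genuinely time-dependent \(H(t)\) would need a time-ordered product of many short steps instead.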
The specific structure of the Hamiltonian, together with how this evolution is realized or approximated in
time, fundamentally determines the quantum computing paradigm. In gate-based quantum
computing, evolution is discretized into sequences of unitary operations (quantum gates), whereas in
adiabatic quantum computing (AQC) the system is steered continuously by slowly varying the Hamiltonian
toward the ground state encoding the solution.
Following the Schrödinger evolution of a quantum system, one of the primary ways to manipulate qubits is via quantum gates. These gates are unitary operators applied to one or multiple qubits, defining the discrete-time evolution of the system. Quantum circuits are sequences of such gates that perform computations analogous to classical logical circuits, but in a reversible and linear-algebraic manner.
Single-qubit gates act on a single qubit \(|\psi\rangle\). Common examples include the Pauli gates:
\(X = \begin{pmatrix}0 & 1\\1 & 0\end{pmatrix},\; Y = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix},\; Z =
\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}\).
For a qubit state \(|\psi\rangle = \alpha |0\rangle + \beta |1\rangle = \begin{pmatrix}\alpha \\ \beta
\end{pmatrix}\), applying X yields:
\(X|\psi\rangle = \beta |0\rangle + \alpha |1\rangle\).
This swaps the amplitudes of the basis states.
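As a quick numerical illustration (a sketch with NumPy; the example amplitudes are arbitrary), the Pauli matrices can be applied as ordinary matrix-vector products:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Example state alpha|0> + beta|1> with alpha = 0.6, beta = 0.8i.
psi = np.array([0.6, 0.8j], dtype=complex)

flipped = X @ psi   # amplitudes swapped: beta|0> + alpha|1>
phased = Z @ psi    # sign of the |1> amplitude flipped
```

Each Pauli matrix is both Hermitian and unitary, so applying the same gate twice returns the original state.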
The Hadamard gate \(H = \frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}\) creates
superposition:
\(H|0\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}, \quad H|1\rangle = \frac{|0\rangle -
|1\rangle}{\sqrt{2}}\).
For n qubits, single-qubit gates are tensored to act on the full system. For example, the 2-qubit Hadamard
gate is:
\(H^{\otimes 2} = H \otimes H = \frac{1}{2} \begin{pmatrix}1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1\\ 1 & 1 & -1 &
-1\\ 1 & -1 & -1 & 1\end{pmatrix}\).
Applying \(H^{\otimes 2}\) twice in sequence yields the identity operation \(I_4\).
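Both statements can be checked directly. The following sketch builds the two-qubit Hadamard with `np.kron`; the variable names are illustrative.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
H2 = np.kron(H, H)                     # two-qubit Hadamard, a 4x4 matrix

ket00 = np.array([1.0, 0.0, 0.0, 0.0])
uniform = H2 @ ket00                   # equal superposition of all four basis states
roundtrip = H2 @ H2                    # should equal the 4x4 identity
```

Applied to \(|00\rangle\), the gate produces amplitude \(1/2\) on every basis state, which is the usual starting point of many gate-based algorithms.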
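Controlled gates can generate entanglement when the control qubit is in superposition. Example: the controlled-NOT (CNOT) gate, which flips the target qubit if the control qubit is \(|1\rangle\):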
\(\text{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0
\end{pmatrix}, \quad \text{CNOT}|10\rangle = |11\rangle\).
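To illustrate how a controlled gate produces entanglement, the sketch below applies a Hadamard to the first qubit of \(|00\rangle\) and then a CNOT. The circuit is a standard textbook example chosen here for illustration.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

ket00 = np.array([1.0, 0.0, 0.0, 0.0])
ket10 = np.array([0.0, 0.0, 1.0, 0.0])

flipped = CNOT @ ket10                    # |10> -> |11>
bell = CNOT @ np.kron(H, I2) @ ket00      # (|00> + |11>) / sqrt(2)

# Rank 2 of the coefficient matrix means the state is entangled.
rank = np.linalg.matrix_rank(bell.reshape(2, 2))
```

Acting on a basis state such as \(|10\rangle\), CNOT only permutes basis states; entanglement appears once the control qubit is in superposition.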
Parameterized gates are unitary rotations that depend on a real-valued parameter \(\theta\):
\(U(\theta) = e^{i \theta G} = \cos(\theta) I + i \sin(\theta) G\),
where \(G\) is Hermitian, satisfies \(G^2 = I\) (as the Pauli matrices do, which is what makes the closed form valid), and generates rotations around a Bloch sphere axis. Parameterized Quantum Circuits (PQC) use such gates to prepare states \(|\psi(\theta)\rangle = U(\theta)|0\rangle\), optimizing \(\theta\) to minimize a cost function \(\langle \psi(\theta)|M|\psi(\theta)\rangle\).
This approach underpins variational quantum algorithms and Quantum Machine Learning.
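A one-parameter variational loop can be sketched as follows. The choices \(G = Y\), \(M = Z\), and a plain grid search in place of a gradient-based optimizer are assumptions made for illustration; the closed form for \(U(\theta)\) relies on \(G^2 = I\), which holds for Pauli matrices.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def U(theta, G=Y):
    """e^{i theta G} = cos(theta) I + i sin(theta) G, valid when G @ G = I."""
    return np.cos(theta) * I2 + 1j * np.sin(theta) * G

def cost(theta, M=Z):
    """Expectation value <psi(theta)| M |psi(theta)> with |psi(theta)> = U(theta)|0>."""
    psi = U(theta) @ np.array([1, 0], dtype=complex)
    return float(np.real(np.vdot(psi, M @ psi)))

# Grid search over theta stands in for the classical optimizer.
thetas = np.linspace(0.0, np.pi, 181)
costs = np.array([cost(t) for t in thetas])
best_theta = thetas[np.argmin(costs)]
best_cost = costs.min()
```

With these choices the cost reduces to \(\cos(2\theta)\), so the search settles at \(\theta = \pi/2\), where the prepared state is \(|1\rangle\) up to a sign. On hardware, the expectation value would be estimated from repeated measurements rather than computed exactly.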
Unlike gate-based QC, AQC evolves a quantum system continuously under a time-dependent Hamiltonian \(H(t)\). The system is prepared in the ground state of an initial Hamiltonian \(H_I\) and evolves slowly toward a problem Hamiltonian \(H_P\) whose ground state encodes the solution to an optimization problem.
The typical problem Hamiltonian is an Ising-type Hamiltonian:
\(H_P = \sum_{i,j} J_{i,j} \sigma^z_i \sigma^z_j + \sum_i b_i \sigma^z_i\)
with \(\sigma^z_i\) the Pauli-Z operator on qubit \(i\) embedded in the \(n\)-qubit space:
\(\sigma^z_i = I \otimes \dots \otimes I \otimes \sigma^z \otimes I \otimes \dots \otimes I,\quad \sigma^z
= \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix},\; I = \begin{pmatrix}1 & 0\\0 & 1\end{pmatrix}\).
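For small \(n\), the problem Hamiltonian can be assembled explicitly as a dense matrix. The sketch below does so for a three-qubit toy instance; the couplings `J`, the biases `b`, and the helper names are hypothetical, and the dense \(2^n \times 2^n\) construction is only feasible for a handful of qubits.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
SZ = np.diag([1.0, -1.0])

def sigma_z(i, n):
    """Pauli-Z on qubit i, identity on the other n-1 qubits."""
    return reduce(np.kron, [SZ if k == i else I2 for k in range(n)])

def problem_hamiltonian(J, b):
    """H_P = sum_{i,j} J[i,j] sz_i sz_j + sum_i b[i] sz_i as a dense matrix."""
    n = len(b)
    ops = [sigma_z(i, n) for i in range(n)]
    H = sum(b[i] * ops[i] for i in range(n))
    for i in range(n):
        for j in range(n):
            H = H + J[i, j] * ops[i] @ ops[j]
    return H

# Hypothetical toy instance with three spins.
J = np.array([[0.0, 1.0, -0.5],
              [0.0, 0.0, 0.8],
              [0.0, 0.0, 0.0]])
b = np.array([0.3, -0.2, 0.1])

H_P = problem_hamiltonian(J, b)
energies = np.diag(H_P)                      # H_P is diagonal in the computational basis
ground_index = int(np.argmin(energies))
# Bit 0 maps to spin +1 and bit 1 to spin -1 (eigenvalues of sigma^z).
ground_spins = [1 - 2 * int(bit) for bit in format(ground_index, "03b")]
```

Since every term is a product of \(\sigma^z\) operators, \(H_P\) is diagonal and each diagonal entry is the Ising energy of the corresponding spin configuration.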
The ground state of \(H_P\) corresponds to the solution of the Ising problem:
\(\min_{s \in \{-1,+1\}^n} s^\top J s + s^\top b\)
or equivalently the QUBO formulation:
\(\min_{x \in \{0,1\}^n} x^\top Q x + x^\top c\).
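The equivalence follows from a change of variables. Assuming the convention \(s = 2x - 1\) (the opposite sign convention works analogously), substituting into the Ising objective gives \(Q = 4J\), \(c = 2b - 2(J + J^\top)\mathbf{1}\), and a constant offset \(\mathbf{1}^\top J \mathbf{1} - b^\top \mathbf{1}\) that does not affect the minimizer. The sketch below verifies this by enumeration on a hypothetical toy instance.

```python
import itertools
import numpy as np

def ising_to_qubo(J, b):
    """Map min_s s^T J s + s^T b to min_x x^T Q x + x^T c (+ offset) via s = 2x - 1."""
    ones = np.ones(len(b))
    Q = 4.0 * J
    c = 2.0 * b - 2.0 * (J + J.T) @ ones
    offset = ones @ J @ ones - b @ ones
    return Q, c, offset

# Hypothetical toy instance.
J = np.array([[0.0, 1.0, -0.5],
              [0.0, 0.0, 0.8],
              [0.0, 0.0, 0.0]])
b = np.array([0.3, -0.2, 0.1])
Q, c, offset = ising_to_qubo(J, b)

# Compare both objectives on every binary assignment.
max_error = 0.0
for bits in itertools.product([0, 1], repeat=len(b)):
    x = np.array(bits, dtype=float)
    s = 2 * x - 1
    ising_energy = s @ J @ s + s @ b
    qubo_energy = x @ Q @ x + x @ c + offset
    max_error = max(max_error, abs(ising_energy - qubo_energy))
```

Because the two objectives differ only by the constant offset, they share the same minimizers under the mapping between \(s\) and \(x\).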
The system evolves under a time-dependent Hamiltonian:
\(H(t) = (1-f(t)) H_I + f(t) H_P,\quad f: [0,T] \to [0,1]\).
With \(f(0) = 0\) and \(f(T) = 1\), and starting in the ground state of \(H_I\), a sufficiently slow evolution ensures the system remains in the instantaneous ground state of \(H(t)\) until it reaches the ground state of \(H_P\) at \(t = T\), according to the Quantum Adiabatic Theorem.
The state can be expanded in the eigenbasis of \(H(t)\):
\(|\psi(t)\rangle = \sum_k c_k(t) |k(t)\rangle, \quad H(t)|k(t)\rangle = \lambda_k(t)|k(t)\rangle\).
A common choice for the initial Hamiltonian is:
\(H_I = - \sum_i \kappa \sigma^x_i, \quad \sigma^x = \begin{pmatrix}0 & 1\\1 & 0\end{pmatrix}\).
The minimum energy gap between the ground and first excited state during evolution, the
spectral gap, governs the required adiabatic runtime \(T\).
For non-degenerate ground states, the gap remains positive, ensuring the adiabatic theorem holds.
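For very small systems, the spectral gap can be computed by diagonalizing \(H(t)\) along the schedule. The sketch below does this for a hypothetical two-qubit instance, parametrizing the interpolation directly by the value of \(f \in [0, 1]\); the couplings and \(\kappa = 1\) are assumptions.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def embed(op, i, n):
    """Place a single-qubit operator on qubit i of an n-qubit register."""
    return reduce(np.kron, [op if k == i else I2 for k in range(n)])

n, kappa = 2, 1.0
H_I = -kappa * sum(embed(X, i, n) for i in range(n))
# Toy problem Hamiltonian: one coupling and one bias (hypothetical values).
H_P = embed(Z, 0, n) @ embed(Z, 1, n) + 0.5 * embed(Z, 0, n)

def spectral_gap(f):
    """Gap between the two lowest eigenvalues of (1 - f) H_I + f H_P."""
    eigenvalues = np.linalg.eigvalsh((1 - f) * H_I + f * H_P)
    return eigenvalues[1] - eigenvalues[0]

schedule = np.linspace(0.0, 1.0, 201)
gaps = np.array([spectral_gap(f) for f in schedule])
min_gap = gaps.min()
```

A smaller minimum gap calls for a longer runtime \(T\). For realistic problem sizes the gap cannot be obtained this way, because the matrix dimension grows as \(2^n\).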
Quantum annealers implement AQC physically, possibly non-adiabatically, returning the ground state with some probability. Repeated runs improve the likelihood of finding the optimal solution.
Classical analogues include simulated annealing:
\(P = \min \Big(1, \exp\Big(\frac{E(x^{(i-1)}) - E(x^{(i)}_\text{candidate})}{T_i}\Big)\Big)\).
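The acceptance rule above can be turned into a short simulated-annealing loop over spin configurations. This is a minimal sketch: the single-spin-flip proposal, the geometric temperature schedule, and the toy Ising instance are all assumptions made for illustration.

```python
import math
import numpy as np

def acceptance_probability(e_prev, e_cand, temperature):
    """Metropolis rule: min(1, exp((E_prev - E_cand) / T))."""
    delta = e_prev - e_cand
    if delta >= 0:            # candidate is no worse: always accept
        return 1.0
    return math.exp(delta / temperature)

def simulated_annealing(energy, n, temperatures, seed=0):
    """Minimize energy(s) over s in {-1,+1}^n using single-spin flips."""
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=n)
    current_e = energy(s)
    best_s, best_e = s.copy(), current_e
    for temperature in temperatures:
        candidate = s.copy()
        candidate[rng.integers(n)] *= -1      # flip one random spin
        cand_e = energy(candidate)
        if rng.random() < acceptance_probability(current_e, cand_e, temperature):
            s, current_e = candidate, cand_e
            if current_e < best_e:            # remember the best state seen so far
                best_s, best_e = s.copy(), current_e
    return best_s, best_e

# Hypothetical toy Ising instance.
J = np.array([[0.0, 1.0, -0.5],
              [0.0, 0.0, 0.8],
              [0.0, 0.0, 0.0]])
b = np.array([0.3, -0.2, 0.1])

def energy(s):
    return float(s @ J @ s + b @ s)

schedule = np.geomspace(2.0, 0.01, num=500)   # decreasing temperatures T_i
best_s, best_e = simulated_annealing(energy, n=3, temperatures=schedule)
```

At high temperature, worse candidates are accepted often, which lets the search escape local minima; as \(T_i\) decreases, the walk becomes increasingly greedy. Like a quantum annealer, the procedure is stochastic and is typically run several times.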
Simulated quantum annealing extends this by sampling multiple states, simulating tunneling, superposition,
and entanglement.
Quantum-enhanced computer vision (QeCV) approaches based on quantum annealing follow a six-step pipeline from problem formulation to solution interpretation. This pipeline can be executed in a single sweep or iteratively, updating and re-annealing the QUBO until convergence.
The six key steps are:
The flowchart below visualizes this process.
Gate-based quantum computing requires classical data to be encoded as quantum states before processing with quantum circuits. Quantum-enhanced computer vision (QeCV) approaches typically follow a workflow of four steps:
The flowchart below visualizes this process.
The table lists some notable quantum-enhanced computer vision (QeCV) approaches in the literature. It
categorises methods based on: the quantum hardware used (adiabatic (AQC) or gate-based (GQC)), encoding
strategy, problem type, and classical/quantum hybrid techniques.
| Method | Paradigm | Problem | Input type | Problem size | # Qubits |
|---|---|---|---|---|---|
| QA, CVPR’20 | AQC | Transformation estimation, Point set alignment | Point clouds | ≤ 5k points | ~140 |
| IQT, CVPR’22 | AQC | Transformation estimation | Point clouds | ≤ 1.5k points | ~15 |
| QuAnt, ICLR’23 | AQC | Transformation estimation, Point set alignment, Mesh alignment | Point clouds, meshes | ≤ 2k points, 5 mesh vertices | ~15 |
| QuCOOP, CVPR’25 | AQC | Point set alignment, Mesh alignment | Point clouds, meshes | ≤ 50 points, 502 mesh vertices | ~36 |
| QGM, 3DV’20 | AQC | Graph matching | Graphs | ≤ 4 graph nodes | ~50 |
| Q-Match, ICCV’21 | AQC | Mesh alignment | Meshes | ≤ 502 mesh vertices | ~250 |
| CcuantuMM, CVPR’23 | AQC | Mesh alignment | Meshes | ≤ 1k mesh vertices | ~40 |
| QSync, CVPR’21 | AQC | Permutation synchronisation, Graph matching | Permutation matrices | ≤ 3×3 permutation matrices, 8 views | ~72 |
| QSQS, ECCV’20 | AQC | Object detection | Bounding boxes | ≤ 45 bounding boxes | ~45 |
| QMOT, CVPR’22 | AQC | Object tracking | Tracks, detections | ≤ 3 tracks, 5 frames | ~100 |
| Doan et al., CVPR’22 | AQC | Robust fitting | Data points | ≤ 100 points | ~100 |
| DeQUMF, CVPR’23 | AQC | Multi-model fitting | Data points | ≤ 1k models, 250 points | ~100 |
| Pandey et al., CVPRW’25 | AQC | Robust multi-model fitting | Data points | ≤ 2k models, 10k points | ~120 |
| Bauckhage et al., LWDA’18 | AQC | k-means clustering | Data points | ≤ 16 points | ~16 |
| Arthur and Date, QIP’21 | AQC | Balanced k-means clustering | Data points | ≤ 21 points, k = 3 | ~64 |
| Nguyen et al., ArXiv’23 | AQC | k-means clustering | Feature vectors | - | - |
| Zaech et al., CVPR’24 | AQC | Balanced k-means clustering | Data points | ≤ 45 points, k = 3 | ~45 |
| Choong et al., CVPR’23 | AQC | Single image super-resolution | Images, dictionary | 15×20 LR → 45×60 HR | ~100 |
| QMSVM, IEEE’23 | AQC | Multi-class support vector machines | Feature vectors | ≤ 60 vectors, 3 classes | ~360 |
| Zardini et al., ArXiv’24 | AQC | Multi-class support vector machines | Feature vectors | ≤ 24 vectors, 3 classes | ~144 |
| Q-Seg, IEEE’24 | AQC | Unsupervised image segmentation | Image patches | ≤ 32×32 images | - |
| QuMoSeg, ECCV’22 | AQC | Motion segmentation | Landmark points | ≤ 16 landmarks, 2 motions | ~128 |
| Santos et al., MDPI’18 | AQC | Stereo matching | Stereo images | ≤ 15×15 image patches | - |
| Heidari et al., IVCNZ’21 | AQC | Stereo matching | Stereo images | ≤ 383×434 image patches | - |
| Braunstein et al., 3DV’24 | AQC | Stereo matching | Stereo images | - | - |
| Hur et al., QMI’22 | GQC | Binary image classification | Fashion-MNIST | ≤ length-32 feature vectors | 8 |
| QDCNN, IOP’20 | GQC | Multi-class image classification | MNIST, GTSRB | ≤ 32×32 images | - |
| sQCNN-3D, Elsevier’23 | GQC | Multi-class point cloud classification | ModelNet, ShapeNet | ≤ 32×32×32 voxel grids | 4 |
| HQNN-Parallel, IOP’24 | GQC | Multi-class image classification | Medical MNIST, CIFAR | ≤ {64, 28, 32}² images | 5 |
| ATP, CVPR’25 | GQC | Multi-class image classification | Fashion-MNIST, CIFAR | ≤ {64, 28, 32}² images | - |
| Chin et al., ACCV’20 | GQC | Robust fitting | Data points | - | - |
| Yang et al., ECCV’24 | GQC | Robust fitting | Data points | ≤ 4 points | 19 |
| 3D-QAE, BMVC’23 | GQC | Point cloud auto-encoding | Point clouds | ≤ 16 points | 6 |
| MosaiQ, ICCV’23 | GQC | Image generation | Fashion-MNIST | ≤ length-10 noise vectors | 5 |
| Huang et al., APS’23 | GQC | Image generation | Handwritten digits | ≤ length-32 latent vectors | 6 |
| Kolle et al., ArXiv’24 | GQC | Denoising diffusion model | Fashion-MNIST, CIFAR | ≤ 32×32 images | 7 |
| Piatkowski, ArXiv’22 | GQC | Bundle adjustment | Sets of images | ≤ 32×32 image patches | - |
| QIREN, ICML’24 | GQC | Neural fields, implicit neural representations | Coordinates | ≤ 64×64 images | 6 |
| QVF, ArXiv’25 | GQC | Neural fields, implicit neural representations | Coordinates | Images, 3D shapes | 6 |
@article{meli2025quantum,
title={Quantum-enhanced Computer Vision: Going Beyond Classical Algorithms},
author={Meli, Natacha Kuete and Wang, Shuteng and Benkner, Marcel Seelbach and Sasdelli, Michele and Chin, Tat-Jun and Birdal, Tolga and Moeller, Michael and Golyanik, Vladislav},
journal={arXiv preprint arXiv:2510.07317},
year={2025}
}
Natacha Kuete Meli
natacha.kuetemeli@uni-siegen.de
Vladislav Golyanik
golyanik@mpi-inf.mpg.de