# The living QAOA reference

updated 2021-10-02

## What is QAOA?

QAOA is short for “Quantum Approximate Optimization Algorithm”.

• Protocol for approximately solving an optimization problem on a quantum computer
• Can be implemented on near-term devices!
• Variable parameter $$p$$ (depth) – the optimal performance increases with $$p$$, and QAOA solves$$^*$$ the problem as $$p \to \infty$$ ($$^*$$since QAOA recovers adiabatic evolution in this limit, it solves the problem only if the adiabatic algorithm does)
• In order to use this algorithm optimally, you have to tune $$2p$$ parameters (this gets hard as $$p$$ grows!)
• This is a ‘local’ algorithm, which may limit its power.
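To make the protocol concrete, here is a minimal sketch of depth-$$p$$ QAOA for MAX-CUT, simulated by brute-force statevector evolution in plain NumPy (no quantum SDK). The graph, angles, and function name are illustrative choices, not from any particular reference:

```python
import numpy as np

def qaoa_maxcut_expectation(n, edges, gammas, betas):
    """Expected cut value of depth-p QAOA on a MAX-CUT instance,
    simulated by brute-force statevector evolution (O(2^n) memory)."""
    dim = 2 ** n
    bits = (np.arange(dim)[:, None] >> np.arange(n)) & 1   # bitstring table
    cost = sum(bits[:, i] ^ bits[:, j] for i, j in edges)  # cut size per string
    state = np.full(dim, 1 / np.sqrt(dim), dtype=complex)  # uniform |+>^n start
    for gamma, beta in zip(gammas, betas):
        state = np.exp(-1j * gamma * cost) * state         # phase separator e^{-i gamma C}
        c, s = np.cos(beta), np.sin(beta)
        for q in range(n):                                 # mixer e^{-i beta X_q} per qubit
            mask = 1 << q
            idx0 = np.flatnonzero((np.arange(dim) & mask) == 0)
            a0, a1 = state[idx0], state[idx0 | mask]
            state[idx0] = c * a0 - 1j * s * a1
            state[idx0 | mask] = c * a1 - 1j * s * a0
    return float(np.sum(np.abs(state) ** 2 * cost))

# Example: the 4-cycle. A grid search over the two p=1 angles reaches
# the known optimum of expected cut 3 (out of a maximum cut of 4).
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
grid = np.linspace(-np.pi / 2, np.pi / 2, 121)
best = max(qaoa_maxcut_expectation(4, ring, [g], [b]) for g in grid for b in grid)
print(round(best, 2))
```

This is exponential in the number of qubits, so it is only useful for tiny instances, but it shows the alternating cost/mixer structure that every QAOA implementation shares.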

For deeper mathematical descriptions, see the original paper or one of the many online tutorials. Some important jargon: mixing terms, cost functions, circuit depth, and Trotterization.

See a list of papers tagged ‘QAOA’ on the arXiv.wiki.

## Performance analysis at low depth

The QAOA at low depth is easiest to analyze. Here is a table of known results:

| Problem | QAOA depth | Graph | Performance (satisfying fraction) | Best known algorithm? | Papers |
| --- | --- | --- | --- | --- | --- |
| MAX-CUT | 1 | triangle-free | $$1/2 + 0.30/\sqrt{D}$$ | No | Wang+ 2018, Hastings 2019 |
| MAX-CUT | 1 | any | see formula | ? (KM project in progress) | Wang+ 2018, Hastings 2019 |
| MAX-CUT | 2 | girth above 5 | $$1/2 + 0.41/\sqrt{D}$$ | No | Marwaha 2021 |
| MAX-3-XOR (E3LIN2) | 1 | no overlapping constraints, or random signs | $$1/2 + 0.33/\sqrt{D}$$ | No | Farhi+ 2015, Barak+ 2015, Lin+ 2016, Marwaha+ 2021 |
| MAX-3-XOR (E3LIN2) | 1 | any | at least $$1/2 + \Theta(1/\sqrt{D \log D})$$ | So far, no | Farhi+ 2015, Barak+ 2015 |
| MAX-k-XOR (EkLIN2) | 1 | no overlapping constraints, or random signs | see formula | So far, yes (!) | Marwaha+ 2021 |
| SK model | 1-12 | infinite size | see formula | ? (No if there is no overlap gap) | Farhi+ 2019; see also El Alaoui+ 2020 |
| Max Independent Set | 1 | any | at least $$\Theta(n/d)$$ | So far, no (KM project in progress) | Farhi+ 2020 |
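The triangle-free MAX-CUT entry comes from a closed-form $$p=1$$ expression: as reported in Wang+ 2018, on a triangle-free $$D$$-regular graph the expected cut fraction is $$1/2 + \tfrac{1}{2}\sin(4\beta)\sin(\gamma)\cos^{D-1}(\gamma)$$. A few lines of NumPy (a sketch; the function name is ours) recover the $$0.30/\sqrt{D}$$ coefficient numerically:

```python
import numpy as np

def p1_maxcut_advantage(D, num=20001):
    """Best advantage over 1/2 for p=1 QAOA on a triangle-free D-regular
    graph, using the closed form from Wang+ 2018: set sin(4*beta) = 1 and
    maximize (1/2) * sin(gamma) * cos(gamma)**(D-1) over gamma."""
    gamma = np.linspace(0.0, np.pi / 2, num)
    return 0.5 * np.max(np.sin(gamma) * np.cos(gamma) ** (D - 1))

# The advantage times sqrt(D) approaches (1/2) * exp(-1/2) ~ 0.303,
# matching the 1/2 + 0.30/sqrt(D) entry in the table.
for D in (3, 10, 100, 1000):
    print(D, p1_maxcut_advantage(D) * np.sqrt(D))
```

The limit follows by substituting $$\gamma = t/\sqrt{D}$$, so that $$\sin(\gamma)\cos^{D-1}(\gamma) \approx (t/\sqrt{D})\,e^{-t^2/2}$$, which is maximized at $$t = 1$$.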

There are some other results on low-depth QAOA (perhaps could be added to the table above):

• Marika Svensson’s papers
• The performance of QAOA at $$p=1$$ for a “QUBO” (quadratic cost function) is known
• Stuart Hadfield’s thesis (for example, on MaxDiCut)
• NASA QuAIL’s paper on Grover search with QAOA
• Claes+ 2021 on mixed-spin glass models
• the ‘toy’ Hamming-weight ramp and variations (there is a bound showing only states near weight $$n/2$$ matter for $$p=1$$); and maximally constrained 3-SAT with a single satisfying assignment. These allow perfect solution with $$p=1$$.

The QAOA at low depth has some known performance limitations. For example, see Farhi+ 2020a and Farhi+ 2020b, Bravyi+ 2019, and Chou+ 2021.

The graph structure may impact QAOA performance; see Herrman+ 2021 for a study on small graphs.

## Choosing optimal parameters

At high depth, the algorithm’s success depends on choosing good parameters. There are different strategies for setting them, depending on your goal. (Note: optimally tuning the parameters is an NP-hard problem!) Here is a table of parameter-setting strategies:

| Strategy | Description | Drawbacks | References |
| --- | --- | --- | --- |
| brute force | Pre-compute the optimal parameters, then run the algorithm with them. | Although the parameters are found, this takes time exponential in the graph size. | Farhi+ 2014 |
| variational | Run the algorithm several times on a quantum computer, then use the results to choose new parameters. Repeat until convergence. | This requires using a quantum computer many times to find the right parameters. Noise in quantum computers may make it more challenging to choose improved parameters at each step. | Farhi+ 2014 |
| interpolative or Fourier methods | Solve for optimal $$\beta$$ and $$\gamma$$ at small $$p$$, and use them to guess optimal parameters at larger $$p$$. | This does not guarantee that the parameters are the best possible for QAOA. | Zhou+ 2018 |
| machine learning | Use a classical optimization method to compute nearly-optimal parameters. | The computed parameters may not be optimal, because of barren plateaus. | many; for example Khairy+ 2019 |
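The interpolative strategy is easy to sketch. Zhou+ 2018 observe that optimized angles form a smooth schedule in the step index, so the depth-$$p$$ optimum can be stretched onto $$p+1$$ points as a starting guess for the next optimization (applied once per angle sequence). A sketch in NumPy, with an illustrative function name:

```python
import numpy as np

def interp_next_depth(angles):
    """Sketch of the INTERP heuristic from Zhou+ 2018: stretch an
    optimized depth-p schedule (e.g. the gammas) onto p + 1 points
    by linear interpolation, to seed the depth-(p+1) optimization."""
    p = len(angles)
    old_grid = np.arange(p)                  # step indices 0 .. p-1
    new_grid = np.linspace(0, p - 1, p + 1)  # p+1 points on the same span
    return np.interp(new_grid, old_grid, np.asarray(angles, dtype=float))

# A smooth 3-step schedule becomes a 4-step schedule with the same shape.
seed = interp_next_depth([0.2, 0.4, 0.6])
print(seed)
```

Zhou+ 2018 also describe a Fourier-space variant (optimizing a few frequency components rather than the angles directly); the linear version above is the simplest instance of the idea.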

Barren plateaus pose a challenge to numerically calculating optimal parameters. Parameter landscapes have many local optima and a lot of “flat” regions (“barren plateaus”). This means that local changes in parameters (e.g. gradient methods) may not improve the QAOA, making it very difficult to identify optimal parameters. See McClean+ 2018 and Wang+ 2020.

Some studies suggest that optimal QAOA parameters can transfer from graph instance to graph instance: see Lotshaw+ 2021 for a study on small graphs, Galda+ 2021, and Wurtz+ 2021. This also holds on large graphs; see the theoretical result on instance independence in Brandao+ 2018.

For certain problems with instance independence you can get analytical expressions for the optimal QAOA parameters. This happens for local problems like MAX-CUT on degree-3 graphs and for the SK model. On these problems, one can classically find the universal angles that apply to any (large) problem instance.

QAOA may perform well with merely good (not necessarily optimal) parameters, and simulations suggest it does. This suggests (but doesn’t prove!) that finding optimal parameters at large $$p$$ is not only difficult, but also unnecessary.

There are also strategies that treat QAOA parameter optimization as a control problem; see Niu+ 2019, Dong+ 2019, Wiersema+ 2020.

## QAOA performance under noise and errors

The quality of solutions found by QAOA (without error correction or mitigation) quickly decays under noise. Noise also makes it harder to find optimal parameters.

• Franca+ 2020 shows that assuming each gate in the circuit fails with probability q, at a depth scaling like $$1/q$$ the expected energy of the output string is provably worse than that obtained by efficient classical algorithms. Interestingly, their results show that this can even happen at depths of the order $$1/(10q)$$, i.e. when only a small fraction of the qubits has been corrupted. This indicates that very small noise rates might be necessary to obtain an advantage if implementing the QAOA unitaries in the physical device requires a large depth.
• Under similar assumptions, Wang+ 2020 shows that the gradient of the parameters of the QAOA circuit is exponentially small at depths of order $$n/q$$, where $$n$$ is the number of qubits. This shows that the circuits are essentially untrainable at large depths.
• Quiroz+ 2021 shows that the performance of QAOA decays exponentially with errors in the parameters. See also Marshall+ 2020, Xue+ 2019.

The rapid decay of the quality of QAOA due to noise has also been observed experimentally in Harrigan+ 2021.

There are also strategies to deal with errors:

• Shaydulin+ 2021 shows how symmetries in the QAOA ansatz can be used to mitigate errors, and applies this approach on quantum hardware for MaxCut.
• Botelho+ 2021 extends the symmetry approach to the Quantum Alternating Operator Ansatz.

## Extensions to QAOA

Several adjustments to QAOA have been proposed. Some of these are new protocols that use QAOA as a subroutine; others add penalty terms, use thermal state based objectives, or adjust the mixing Hamiltonian. Here is a list:

Extension How it works References
ST-QAOA (Spanning Tree QAOA) Changes the mixer to a sum of $$X_i X_j$$ terms on a spanning tree Wurtz+ 2021a
CD-QAOA (Counter-diabatic QAOA) Adds terms to the mixer to mimic corrections to adiabaticity Wurtz+ 2021b
Quantum Alternating Operator Ansatz Generalization of QAOA (parameterized unitaries and mixers; incorporating constraints into the mixers) Hadfield+ 2017, Hadfield+ 2021
RQAOA (Recursive QAOA) Recursively generates a more and more constrained problem. Uses QAOA (or some other quantum routine) to add constraints. Bravyi+ 2019
GM-QAOA (Grover Mixer QAOA) and Th-QAOA (Threshold-based QAOA) Change the mixers to “flip phase” for a marked element (GM) or above a certain threshold (Th); imitates Grover search operators Bärtschi+ 2020, Golden+ 2021
Warm-start QAOA Approximately solve the problem with a different method; then use that output as the initial state to QAOA Egger+ 2020, Tate+ 2020
Multi-angle QAOA Separate the cost Hamiltonian into individual terms, and use a different angle per term at every step Herrman+ 2021

Some extensions use alternative cost functions besides the cost Hamiltonian, including CVaR (Conditional Value-at-Risk) (Barkoutsos+ 2019), feedback-based approaches (Magann+ 2021), and Gibbs sampling. These aim to estimate higher moments of the distribution to improve the “best” (vs expected) performance of QAOA.
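As a concrete example of an alternative objective, here is a sketch of the CVaR aggregation from Barkoutsos+ 2019: instead of averaging all measured costs, average only the best $$\alpha$$-fraction of samples. The function name and the maximization convention are our illustrative choices:

```python
import numpy as np

def cvar_objective(costs, alpha=0.1):
    """CVaR_alpha of a batch of sampled cost values (maximization
    convention): the mean of the best ceil(alpha * N) samples,
    following the idea in Barkoutsos+ 2019. alpha = 1 recovers the
    usual expected-value objective."""
    ordered = np.sort(np.asarray(costs, dtype=float))[::-1]  # best first
    k = max(1, int(np.ceil(alpha * len(ordered))))
    return float(ordered[:k].mean())

samples = [3, 1, 4, 1, 5, 9, 2, 6]          # e.g. cut values from circuit shots
print(cvar_objective(samples, alpha=0.25))  # mean of the top 2 samples: 7.5
print(cvar_objective(samples, alpha=1.0))   # plain mean: 3.875
```

Feeding this aggregate (rather than the plain mean) to the classical optimizer rewards parameter settings whose best samples are good, which is what matters when you keep only the best measured bitstring.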

## Open problems

• Which problems and families of instances can we show that QAOA provides advantage over classical algorithms?
• How can we analyze QAOA at higher depth? (for example, some ideas in Hadfield+ 2021)
  • What happens at log depth?
  • What happens at poly depth?
  • How does the QAOA performance scale with $$p$$?
• How can we better choose the QAOA parameters?
• How do we best implement constrained optimization?
  • When are penalty terms useful?
  • What are the optimal choices of mixers? (comparing $$p$$, or circuit size)
• How can we best mitigate error from noise?

## Common misconceptions

• **“The expected performance is what matters.”** You really want the highest value QAOA returns: when you run the experiment many times, you take the best sample, not the average one. But we use the average because it is easier to calculate, and bound the chance of higher performance with tail bounds (see the figure in Barak+ 2021). (Caveat: at high enough $$n$$ and $$p$$, there may be concentration – basically every bitstring you sample will achieve exactly the expected performance.) The QAOA may also perform well even if the overlap with the optimal value is low.
• **“QAOA is just adiabatic computation.”** Although there are many similarities, there are settings where QAOA improves upon adiabatic computation. See Zhou+ 2018 and also Brady+ 2021. Other comparisons bound the depth $$p$$ for QAOA to approximate adiabatic computation, and compare the control strategies (Bapat+ 2018).
• **“Using QAOA gives quantum advantage.”** Although sampling from a $$p=1$$ QAOA distribution is not possible classically (Farhi+ 2016), simply using QAOA does not necessarily mean you are calculating things classical computers cannot. In fact, there are no examples (yet?) of provable quantum advantage using QAOA on an optimization problem.