Reverse annealing for nonnegative/binary matrix factorization

It was recently shown that quantum annealing can be used as an effective, fast subroutine in certain types of matrix factorization algorithms. The quantum annealing algorithm performed best for quick, approximate answers, but performance rapidly plateaued. In this paper, we utilize reverse annealing instead of forward annealing in the quantum annealing subroutine for nonnegative/binary matrix factorization problems. After an initial global search with forward annealing, reverse annealing performs a series of local searches that refine existing solutions. The combination of forward and reverse annealing significantly improves performance compared to forward annealing alone for all but the shortest run times.


Introduction
Due to the slowing progress of classical computing [1], new computational architectures [2,3,4] have gained much interest in recent years. One such architecture is quantum annealing [5]. Recently, D-Wave's quantum annealing hardware [6,7] has introduced a new form of annealing: reverse annealing [8,9]. Here, we explore the use of reverse annealing in the context of Nonnegative/Binary Matrix Factorization (NBMF), which has shown some promise in combination with quantum annealing [10].
The NBMF algorithm factors a matrix A into the product of a nonnegative, real-valued matrix B and a binary matrix C. This factorization is useful in machine learning contexts such as learning facial features. The algorithm employs an alternating least-squares approach, where each iteration includes the solution of a binary least squares problem and a nonnegative least squares problem. The binary least squares problem is solved with the quantum annealer. It has been shown that a quantum annealer provided noticeable speed-up compared to two classical solvers for the binary least squares problem [10].
One downside of the quantum annealer approach is that improvement in solution quality from iteration to iteration quickly plateaus. This is because the forward annealing approach that was used previously could only perform global searches when solving the binary least squares problem. This ignores the solutions from previous iterations, which are likely good starting points for the next iteration. Instead, the annealing process almost always produces a factor matrix that is very different from the factor matrix at the previous iteration. In practice, this means that the algorithm hops around solution space at random, quickly finding good solutions but never refining them beyond a certain level of accuracy.
Fortunately, the latest iteration of the D-Wave hardware, the 2000Q, allows us to explore solutions around some initial classical state. This process is known as reverse annealing. In this paper, we utilize reverse annealing to improve performance of the NBMF algorithm. Specifically, we can use reverse annealing to explore local minima near an initial state defined by the results of the previous iteration of the algorithm. This significantly reduces the iteration-over-iteration change in the algorithm, allowing promising solutions to be refined rather than discarded.
The paper is structured as follows. We review the NBMF algorithm in Section 2, and then introduce reverse annealing and our methodology for calibrating the reverse anneal process in Section 3. We compare the performance of the forward annealing version of the algorithm with the modified reverse annealing implementation in Section 4 and offer concluding observations in Section 5.

Review of NBMF Algorithm
The NBMF algorithm takes a real-valued n × m matrix A and finds B and C such that

$$A \approx BC, \tag{1}$$

where B is a nonnegative n × k matrix and C is a binary k × m matrix. Generally, a small value of k is used, so that the factorization is low rank. The outline of the algorithm is straightforward. After randomly initializing a seed matrix $C^{(0)}$, each iteration follows an alternating least squares approach: find

$$B = \arg\min_{X \in \mathbb{R}_{\geq 0}^{n \times k}} \lVert A - XC \rVert_F \tag{2}$$

and find

$$C = \arg\min_{X \in \{0,1\}^{k \times m}} \lVert A - BX \rVert_F. \tag{3}$$

An important feature of the NBMF algorithm is that eq. (3) can be efficiently implemented on the D-Wave quantum annealer. In particular, the columns $C_j$ can be solved for independently:

$$C_j = \arg\min_{q \in \{0,1\}^{k}} \lVert A_j - Bq \rVert_2. \tag{4}$$

This is equivalent to the Quadratic Unconstrained Binary Optimization (QUBO)

$$C_j = \arg\min_{q \in \{0,1\}^{k}} \left( \sum_{i=1}^{k} a_i q_i + \sum_{i<l} b_{il}\, q_i q_l \right), \tag{5}$$

where

$$a_i = \sum_{p=1}^{n} B_{pi} \left( B_{pi} - 2A_{pj} \right), \tag{6}$$

$$b_{il} = 2 \sum_{p=1}^{n} B_{pi} B_{pl}. \tag{7}$$

It is important to note that this is generally a fully-connected QUBO, thus requiring an embedding of the complete graph onto the physical hardware. However, this complete graph has k vertices, equal to the rank of the factorization. This allows large matrices to be factored even though the number of qubits on the hardware is limited. For full details of the algorithm implementation, see [10]. For consistency, we use the same data set (2,429 facial images) and rank k = 35 as previous work [10].
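The equivalence between the column-wise binary least squares problem and a QUBO can be checked numerically. The following is a minimal sketch, assuming NumPy and a toy problem small enough to enumerate; the helper names `build_qubo` and `brute_force_column` are hypothetical, and the exhaustive search merely stands in for the annealer. It relies on the identity $\lVert A_j - Bq \rVert_2^2 - \lVert A_j \rVert_2^2 = q^T B^T B q - 2 A_j^T B q$ together with $q_i^2 = q_i$ for binary variables.

```python
import itertools
import numpy as np

def build_qubo(B, a_j):
    """Upper-triangular Q with q^T Q q = ||a_j - B q||^2 - ||a_j||^2 for binary q."""
    gram = B.T @ B                          # k x k inner products of the columns of B
    Q = 2.0 * np.triu(gram, k=1)            # couplings between distinct variables (i < l)
    # Linear terms sit on the diagonal because q_i^2 = q_i for binary q_i.
    Q[np.diag_indices_from(Q)] = np.diag(gram) - 2.0 * (B.T @ a_j)
    return Q

def brute_force_column(B, a_j):
    """Exhaustively solve the binary least squares problem (toy sizes only)."""
    k = B.shape[1]
    candidates = (np.array(bits, dtype=float)
                  for bits in itertools.product((0, 1), repeat=k))
    return min(candidates, key=lambda q: np.linalg.norm(a_j - B @ q))

rng = np.random.default_rng(0)
B = rng.random((6, 4))                      # toy nonnegative factor: n = 6, k = 4
a_j = rng.random(6)                         # one column of A
Q = build_qubo(B, a_j)

# Compare the QUBO energy with the shifted squared residual for every binary q.
max_gap = 0.0
for bits in itertools.product((0, 1), repeat=B.shape[1]):
    q = np.array(bits, dtype=float)
    qubo_energy = q @ Q @ q
    shifted_residual = np.linalg.norm(a_j - B @ q) ** 2 - a_j @ a_j
    max_gap = max(max_gap, abs(qubo_energy - shifted_residual))

q_best = brute_force_column(B, a_j)
```

Because the two objectives differ only by the constant $\lVert A_j \rVert_2^2$, they share the same minimizer, so `q_best` also minimizes the QUBO. The k × k size of `Q` is what makes the embedding requirement depend on the rank rather than on the dimensions of A.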

Reverse Annealing: Method, Calibration & Timing
The NBMF algorithm begins with a random initialization of the B and C matrices. Reverse annealing from this random starting point is ineffective and requires many iterations to achieve results comparable to forward annealing. Conducting a single iteration with forward annealing and then switching to reverse annealing produces much better results (increasing the number of forward annealing iterations beyond this does not offer noticeable improvements). This is in line with the idea that forward annealing performs a global search and reverse annealing performs a local search. Intuitively, it is advantageous to start with a global search and then transition to a local search. In the rest of this section we calibrate the reverse anneal process after the initial round of forward anneals.
For a given reversal distance r and time $t_r$, the anneal schedule we use is specified as a list of tuples (eq. (8)), where the first entry in each tuple is the elapsed time (in microseconds) and the second entry is the dimensionless anneal parameter s ∈ [0, 1] (s = 1 is a fully annealed system). The physical interpretation of eq. (8) is that we begin in a specific annealed state at t = 0, "warm" the system up to a certain temperature (parametrized by r), hold the system at that temperature for a time $t_r$, and then re-anneal the system. In addition to specifying the reverse anneal schedule, we must also specify the initial state. As discussed in the introduction, the NBMF algorithm naturally provides an initial configuration based on the results of the previous iteration of the algorithm. Specifically, if we are solving for $C^{(i)}_j$, i.e. the jth column of C at the ith iteration, we can use $C^{(i-1)}_j$ as the starting point of our reverse anneal process.
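The shape of such a schedule can be sketched in a few lines. This is an illustration under stated assumptions, not the schedule of eq. (8): it assumes that the hold value of the anneal parameter is s = 1 − r, and that the ramps away from and back to s = 1 each last 10 µs (chosen only because it reproduces the 30 µs total reverse anneal time quoted later for $t_r$ = 10 µs). The function name `reverse_anneal_schedule` and the `ramp_us` parameter are hypothetical.

```python
def reverse_anneal_schedule(r, t_r, ramp_us=10.0):
    """Return [(time_us, s), ...] for a reverse anneal.

    Assumptions (not taken from the source): the system is held at
    s = 1 - r, and both ramps take `ramp_us` microseconds.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("reversal distance r must lie in [0, 1]")
    if t_r < 0.0 or ramp_us <= 0.0:
        raise ValueError("t_r must be non-negative and ramp_us positive")
    s_hold = 1.0 - r
    return [
        (0.0, 1.0),                      # start in a fully annealed (classical) state
        (ramp_us, s_hold),               # reverse: lower s to the hold value
        (ramp_us + t_r, s_hold),         # pause for the reversal time t_r
        (2.0 * ramp_us + t_r, 1.0),      # re-anneal back to s = 1
    ]

schedule = reverse_anneal_schedule(r=0.45, t_r=10.0)
total_time_us = schedule[-1][0]          # elapsed time at the final schedule point
```

The initial state itself would be the previous iteration's column $C^{(i-1)}_j$ translated into the annealer's variable encoding; how that is submitted to the hardware is not specified here and is left out of the sketch.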
We characterize the efficacy of a reverse anneal sample by seeing if it:
• is the same as the initial state,
• has a lower energy than the initial state (good),
• has a higher energy than the initial state (bad).
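The three categories above can be tallied as follows. This is a minimal sketch assuming NumPy, a QUBO stored as a matrix `Q` with energy $q^T Q q$, and samples given as binary vectors; the names `qubo_energy` and `classify_samples` are hypothetical. One detail the text does not address is a sample that differs from the initial state but has the same energy; the sketch counts it as "higher" (that is, not an improvement), which is an assumption.

```python
import numpy as np

def qubo_energy(Q, q):
    """Energy q^T Q q of a binary vector q."""
    q = np.asarray(q, dtype=float)
    return float(q @ Q @ q)

def classify_samples(Q, initial_state, samples, tol=1e-12):
    """Fraction of samples that are the same as, lower than, or higher than the initial state."""
    e_initial = qubo_energy(Q, initial_state)
    counts = {"same": 0, "lower": 0, "higher": 0}
    for sample in samples:
        if np.array_equal(sample, initial_state):
            counts["same"] += 1
        elif qubo_energy(Q, sample) < e_initial - tol:
            counts["lower"] += 1
        else:
            # Assumption: different states with equal energy are not improvements.
            counts["higher"] += 1
    total = len(samples)
    return {label: count / total for label, count in counts.items()}

# Toy QUBO with energies E(00) = 0, E(10) = -1, E(01) = 1, E(11) = 2.
Q = np.array([[-1.0, 2.0],
              [0.0, 1.0]])
initial_state = [0, 0]
samples = [[0, 0], [1, 0], [0, 1], [1, 1]]
fractions = classify_samples(Q, initial_state, samples)
```

Sweeping such a tally over the reversal distance yields curves of the kind discussed in the calibration that follows.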
The frequency with which samples fall into each category gives us an idea of how effective the reverse anneal is at finding improved solutions. First, let us examine the effect of the reversal time, $t_r$. We start by randomly selecting a QUBO generated during an evaluation of the NBMF algorithm, after the first round of forward anneals. In Fig. 1 we see that increasing $t_r$ does not significantly increase the likelihood of discovering better states. The peak reversal distance decreases as $t_r$ increases; however, the maximum percentage of samples with better ground states is similar. We therefore choose $t_r$ = 10µs as the reversal time for this application.
Next we study the effect of the reversal distance r. Fig. 2 shows the effect of reversal distance for four randomly chosen QUBOs taken from the NBMF algorithm (again after the first iteration of forward annealing). Here we see that the likelihood of a reverse anneal discovering a lower energy state varies from QUBO to QUBO; however, the peak likelihood consistently occurs at or near r = 0.45. For this reason, we use a reverse anneal schedule for the NBMF algorithm with r = 0.45 and $t_r$ = 10µs.
We note that the choice of these parameters has some dependence on the matrix that is being factored. For example, the same calibration procedure evaluated on a matrix with random values (as opposed to the highly structured facial imagery data) revealed an optimal reversal distance of r = 0.2. There is additional computational overhead related to the reverse anneal process, such as configuring the hardware into the chosen initial state before each anneal. Therefore, for the purposes of comparing forward and reverse anneal efficacy, we will look at quality of solution vs. total QPU access time (as opposed to annealing time × number of anneals, as was done in [10]). The total QPU access time is calculated via

total QPU access time = (anneal + readout + delay) × number of samples + QPU programming time    (9)

Forward and reverse anneals share identical readout and QPU programming times (123µs and 8001µs, respectively). As previously discussed, the forward anneal takes 20µs while the reverse anneal takes 30µs. The major difference is in the 'delay' time, as this is the period when the quantum annealer is reset to the initial state between anneals. For the D-Wave 2000Q used for this study, located at Los Alamos National Laboratory, the delay time per sample in the forward anneal case is 21µs, while the delay time per reverse anneal sample is 520µs. So we see that the biggest time commitment in doing reverse anneals comes not from the longer anneal schedule but from the repeated state preparation.
When comparing the reverse anneal results against the original forward anneal version of the algorithm, we allot each method equal QPU access time. Given the timing values discussed above, we find that the ratio

number of reverse anneals = 0.24 × (number of forward anneals)    (10)

results in equivalent total QPU access time. The remaining important variables are the number of anneals per QUBO and the total number of iterations for the algorithm to run. We discuss these in the following section.
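The arithmetic behind eqs. (9) and (10) can be reproduced directly from the timing values quoted above. The sketch below is a worked check rather than part of the algorithm; the function and variable names are hypothetical, and all times are in microseconds.

```python
def qpu_access_time_us(anneal_us, readout_us, delay_us, num_samples, programming_us):
    """Total QPU access time for one QUBO, following eq. (9)."""
    return (anneal_us + readout_us + delay_us) * num_samples + programming_us

READOUT_US = 123.0
PROGRAMMING_US = 8001.0

forward_per_sample_us = 20.0 + READOUT_US + 21.0     # anneal + readout + delay
reverse_per_sample_us = 30.0 + READOUT_US + 520.0

# Number of reverse anneals that fit in the time of one forward anneal, eq. (10).
ratio = forward_per_sample_us / reverse_per_sample_us

# Share of the extra per-sample time that is due to the longer delay.
delay_share = (520.0 - 21.0) / (reverse_per_sample_us - forward_per_sample_us)

forward_total_us = qpu_access_time_us(20.0, READOUT_US, 21.0, 1000, PROGRAMMING_US)
reverse_total_us = qpu_access_time_us(30.0, READOUT_US, 520.0, 240, PROGRAMMING_US)
```

Per sample, a forward anneal costs 164 µs and a reverse anneal 673 µs, which gives the factor of roughly 0.24. With 1000 forward or 240 reverse anneals, the per-QUBO totals (172,001 µs and 169,521 µs) agree to within about 1.5%.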

Results
In this section we use reverse annealing in the NBMF algorithm to factor the dataset of 2,429 facial images studied in [10] into a 2429 × 35 nonnegative matrix B and a 35 × 2429 binary matrix C. In this application, the columns of the C matrix can be interpreted as decompositions of each face into 35 component features. First, we will examine the differences between the two algorithms for a fixed number of anneals. We will then study the efficacy of the two algorithms as a function of total QPU access time. Fig. 3 shows the results of the two algorithms with 6182 seconds of total QPU access time (equivalent to 1000 forward anneals or 240 reverse anneals per QUBO). The reverse anneal algorithm shows consistent improvement for many more iterations, and produces a better result than forward anneal by the third iteration.
Recall that our hypothesis, outlined in the introduction, is that reverse annealing will outperform forward annealing due to more refinement of existing solutions as opposed to generation of entirely new solutions per QUBO. If we define [...] and the % change in C as the Hamming distance between $C^{(i+1)}$ and $C^{(i)}$ divided by the size of C, then we can look at the iteration-over-iteration change in the B and C matrices to see if this is indeed the case; see Fig. 4. The iterative [...]

[...] time, forward annealing results in superior performance. However, once the total QPU access time exceeds ≈ 210s, which corresponds to 7 reverse anneals per QUBO, reverse annealing overtakes forward annealing, eventually plateauing at approximately 12% improvement over the forward annealing algorithm.
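The iteration-over-iteration change in C used in this section, the Hamming distance between successive binary factors divided by the number of entries, is simple to compute. The sketch below assumes NumPy; the function name `percent_change_binary` is hypothetical, and scaling by 100 is an assumption suggested by the "% change" wording.

```python
import numpy as np

def percent_change_binary(C_new, C_old):
    """Hamming distance between two binary matrices, as a percentage of their size."""
    C_new = np.asarray(C_new)
    C_old = np.asarray(C_old)
    if C_new.shape != C_old.shape:
        raise ValueError("matrices must have the same shape")
    hamming_distance = np.count_nonzero(C_new != C_old)   # number of flipped entries
    return 100.0 * hamming_distance / C_new.size

C_old = np.array([[0, 1, 1],
                  [1, 0, 0]])
C_new = np.array([[0, 1, 0],
                  [1, 1, 0]])
change = percent_change_binary(C_new, C_old)   # 2 of 6 entries differ
```

A value near zero means the new factor is a refinement of the previous one, whereas a value near 50% is what two unrelated random binary matrices would give.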
The reason that forward annealing outperforms reverse annealing for small sample sizes is straightforward. For a single reverse anneal sample, the likelihood of finding a better state than the initial configuration is quite low (≤ 25%, sometimes much lower, see Fig. 2). Therefore, when the number of reverse anneals per QUBO is small, most iterations will result in no change. Forward annealing might be finding worse or different solutions for each QUBO, but the fact that they are new solutions for each iteration means that overall the algorithm can improve. As the number of reverse annealing samples increases, the chance of finding a better/different solution increases significantly, resulting in improved performance.
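A rough feel for this effect comes from treating each reverse anneal sample as an independent trial with success probability p, so that the chance of at least one improved sample among N is 1 − (1 − p)^N. Independence is an assumption made here for illustration only; samples from the hardware may be correlated, and p varies from QUBO to QUBO. The function name is hypothetical.

```python
def prob_at_least_one_improvement(p_single, num_samples):
    """Chance that at least one of num_samples independent samples improves on the initial state."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    return 1.0 - (1.0 - p_single) ** num_samples

# p = 0.25 is the upper end of the single-sample likelihood quoted in the text.
p_one = prob_at_least_one_improvement(0.25, 1)
p_seven = prob_at_least_one_improvement(0.25, 7)
p_many = prob_at_least_one_improvement(0.25, 240)
```

Under these assumptions the chance rises from 25% for a single sample to roughly 87% for seven samples and is effectively certain for 240 samples. For QUBOs with a much smaller p, correspondingly more samples are needed before most iterations produce a change.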

Conclusion
The results of this work suggest that reverse annealing improves the quality of the NBMF factorization by 12% for this application. This improvement is seen when the number of reverse anneals evaluated per QUBO is at least 7 (which is equivalent in QPU access time to 29 forward anneals). In [10], it was observed that quantum annealing had the largest performance gains relative to classical benchmarks in the short annealing timeframe, O(10) forward anneals per QUBO. Reverse annealing improves performance in the longer annealing timeframe, thus further establishing quantum annealing as a strong approach for nonnegative/binary matrix factorization.
In addition to characterizing the performance in terms of the quality of the factorization given a fixed time, it could be characterized in terms of how long it takes to obtain a factorization of a given quality. By this standard, reverse annealing would also perform well once the target factorization error is set sufficiently low. Since NBMF with forward annealing tends to plateau at a worse factorization quality, the speed-up with reverse annealing would be very large once the target factorization quality is set beyond this plateau.
Our results could be improved upon in several ways. First, it is possible that the optimal reverse anneal schedule could depend on how many iterations have already occurred (i.e., as better solutions become harder to find). It is also our hope that future quantum annealing hardware will feature more rapid state initialization, as this accounts for over 98% of the additional time related to reverse annealing. This would improve the performance of NBMF with reverse annealing but leave the performance of NBMF with forward annealing unchanged. Lastly, the exact nature of the matrix being factorized appears to play a role in determining how effective the algorithm is, and this could be explored further.

Figure 1: Comparison of $t_r$ = 10µs vs. 100µs. For a given reversal distance, the height of the green area indicates the probability that a reverse anneal sample will have a lower energy than the initial state.

Figure 2: The effect of reversal distance on reverse annealing efficacy for several randomly chosen QUBOs in the NBMF algorithm, each with $t_r$ = 10µs. A reversal distance of r = 0.45 gave the greatest chance of discovering a better sample than the initial state across all QUBOs.

Figure 3: Comparison of forward and reverse anneal versions of the NBMF algorithm with 1000 forward anneals and 240 reverse anneals per QUBO, corresponding to a total QPU access time of 6182 seconds over the full evaluation of each algorithm.

Figure 4: Iterative improvement of B and C matrices for forward and reverse anneal versions of the NBMF algorithm during the evaluation described in Fig. 3.

Figure 5: Comparison of reverse and forward annealing versions of the NBMF algorithm after seven iterations for a range of annealing times. Reverse annealing results in up to a 12% increase in performance.