I found this material online, and I do not understand why equation (5) is equivalent to equation (6). How is this derived?

Given a dictionary D, a vector x has sparsity s if it can be written exactly as a linear combination of s columns of D. An important result that underlies all Sparse representation classification frameworks is the guarantee provided by the sparse recovery result that for a feature vector x with sparsity bounded from above by a constant depending on D, x can be recovered by

# Answer

This is not an example of a Lagrange multiplier, and the two equations are not equivalent. However, the paper doesn't claim that they are: the text states that formula (5) is "modified" to get formula (6).
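For reference, the two formulas under discussion are presumably the standard basis-pursuit pair (my reconstruction, since the equations themselves aren't quoted above):

```latex
% (5): hard equality constraint (basis pursuit)
\min_{c} \; \|c\|_1 \quad \text{subject to} \quad Dc = x

% (6): the constraint replaced by a penalty term, with lambda > 0
\min_{c} \; \|x - Dc\|_2^2 + \lambda \|c\|_1
```

The rest of the answer explains why these two problems are related but not equivalent.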

Using a Lagrange multiplier would lead to a coupled system of two equations. Note how formula (6) doesn't exactly enforce the constraint `Dc = x`; it only minimizes its residual. That's not the same thing. The solution `c` of (6) will typically not satisfy `Dc = x`, whereas the solution `c` of (5) always has to satisfy it by definition.

What (6) actually does is express the constraint as a penalty term. The parameter `lambda` controls how much emphasis is put on minimizing the l1-norm of `c` versus minimizing the residual of the constraint, `x - Dc`.

So, (5) puts a hard constraint on the permitted values of `c`, whereas (6) basically says, "I've got these two things I'd both like to be somewhat small... find me a good compromise."