I have some large arrays each with i elements, call them X, Y, Z, for which I need to find some values a, b--where a and b are real numbers between 0 and 1--such that, for the following functions,

```
r = X - a*Y - b*Z
r_av = Sum(r)/i
rms = Sum((r - r_av)^2), summing over the i pixels
```

I want to minimize the rms. Basically I'm looking to minimize the scatter in r, and thus need to find the right a and b to do that. So far I have thought to do this in nested loops in one of two ways: either 1)just looping through a range of possible a,b and then selecting out the smallest rms, or 2)inserting a while statement so that the loop will terminate once rms stops decreasing with decreasing a,b for instance. Here's some pseudocode for these:

```
1) List
for a = 1
for b = 1
calculate m
b = b - .001
a = a - .001
loop 1000 times
sort m values, from smallest
print (a,b) corresponding to smallest m
2) Terminate
for a = 1
for b = 1
calculate m
while m > previous step,
b = b - .001
a = a - .001
```

Is one of these preferable? Or is there yet another, better way to go about this? Any tips would be greatly appreciated.

# Best How To :

There is already a handy formula for least squares fitting.

I came up with two different ways to solve your problem.

For the first one, consider the matrix `K`

:

```
L = len(X)
K = np.identity(L) - np.ones((L, L)) / L
```

In your case, `A`

and `B`

are defined as:

```
A = K.dot(np.array([Y, Z]).transpose())
B = K.dot(np.array([X]).transpose())
```

Apply the formula to find C that minimizes the error `A * C - B`

:

```
C = np.linalg.inv(np.transpose(A).dot(A))
C = C.dot(np.transpose(A)).dot(B)
```

Then the result is:

```
a, b = C.reshape(2)
```

Also, note that numpy already provides linalg.lstsq that does the exact same thing:

```
a, b = np.linalg.lstsq(A, B)[0].reshape(2)
```

A simpler way is to define A as:

```
A = np.array([Y, Z, [1]*len(X)]).transpose()
```

Then solve it against `X`

to get the coefficients and the mean:

```
a, b, mean = np.linalg.lstsq(A, X)[0]
```

If you need a proof of this result, have a look at this post.

Example:

```
>>> import numpy as np
>>> X = [5, 7, 9, 5]
>>> Y = [2, 0, 4, 1]
>>> Z = [7, 2, 4, 6]
>>> A = np.array([Y, Z, [1] * len(X)]).transpose()
>>> a, b, mean = np.linalg.lstsq(A, X)[0]
>>> print(a, b, mean)
0.860082304527 -0.736625514403 8.49382716049
```