I am trying to create a prolog program that allows to convert a list into a list with the same length consisting of only 1 element from the original list. This element must be chosen in such a way that a minimal number of elements from the original list needs to be altered and all solutions are provided through backtracking e.g. `[a,b]`

can become either `[a,a]`

or `[b,b]`

and `[a,b,a]`

should become `[a,a,a]`

.

As you might have noticed, this is the same problem as finding the element with the most occurrences and making a new list with the same length as the original list containing only that one element. This resulted in the following code:

```
make_all_equal(List, Cost, Result):-
sort(0, @=<, List, Sorted),
occurs_most(Sorted, X, Nr),
length(List, N),
Cost is N - Nr,
repeat(X, N, Result).
occurs_most([], _, 0).
occurs_most([E|List], X, Nr):-
count(E, List, 1, N, Left),
occurs_most(Left, Y, Nr1),
(Nr1 =:= N ->
(X = Y, Nr = Nr1
; X = E, Nr = N) % I would like some backtracking here
;
(Nr1 > N ->
X = Y, Nr = Nr1
;
X = E, Nr = N
)
).
count(_, [], Acc, Acc, []).
count(X, [X|T], Acc, N, Tail):-
Acc1 is Acc + 1,
count(X, T, Acc1, N, Tail).
count(X, [Y|T], Acc, Acc, [Y|T]):-
X \= Y.
repeat(_, 0, []):- !. % There is no existing predicate, is there?
repeat(X, N, [X|T]):-
N > 0,
N1 is N - 1,
repeat(X, N1, T).
```

This code works, but as you might have noticed, the definition of `occurs_most/3`

looks terrible with all those if-statements. I also want to be able to get all solutions through backtracking as I did.

If anybody could help me with the `occurs_most/3`

predicate or better solution strategies for this seemingly simple problem, I would be very thankful. I'm afraid I've been trying for too long already.

PS: this is not a homework, but rather something like a 100th prolog problem...

# Best How To :

# Determinstic variant

First a more efficient but deterministic approach:

```
occurs_most([],_,0).
occurs_most(List,X,Nr) :-
msort(List,[H|T]),
most_sort(T,H,1,H,1,X,Nr).
most_sort([Hb|T],Ha,Na,Hb,Nb,Hr,Nr) :-
!,
Nb1 is Nb+1,
most_sort(T,Ha,Na,Hb,Nb1,Hr,Nr).
most_sort([Hc|T],_,Na,Hb,Nb,Hr,Nr) :-
Nb > Na,
!,
most_sort(T,Hb,Nb,Hc,1,Hr,Nr).
most_sort([Hc|T],Ha,Na,_,_,Hr,Nr) :-
most_sort(T,Ha,Na,Hc,1,Hr,Nr).
most_sort([],Ha,Na,_,Nb,Ha,Na) :-
Na >= Nb,
!.
most_sort([],_,_,Hb,Nb,Hb,Nb).
```

First you use `msort/2`

to sort the list. Then you iterate over the list. Each time you keep track of the currently most occuring one. From the moment the new head `Hc`

differs from the previous one (`Hb`

), you know you will never visit a `Hb`

again (because of the transitivity of the order relation). So you keep counting the number of times in the current sequence. In case the sequence ends, you compare it with the previous sequence. In case it is larger, you accept that one.

# Nondeterminstic variant

Now we can turn the predicate into a non-determinstic one:

```
occurs_most([],_,0).
occurs_most(List,X,Nr) :-
msort(List,[H|T]),
most_sort(T,H,1,X,Nr).
most_sort([Ha|T],Ha,Na,Hr,Nr) :-
!,
Na1 is Na+1,
most_sort(T,Ha,Na1,Hr,Nr).
most_sort([Hb|T],Ha,Na,Hr,Nr) :-
most_sort(T,Hb,1,Hc,Nc),
(Nc =< Na ->
((Hr = Ha,Nr = Na);
(Nc = Na ->
(Hr = Hc,Nr = Nc)
)
);
(Hr = Hc,Nr = Nc)
).
most_sort([],Ha,Na,Ha,Na).
```

A problem with this approach is that if there are multiple strikes that are less than on the right, we will repeat our current strike a few times (we will solve this later). For example (`occurs_most([a,b,b,c,d],X,C)`

) will give twice `L=b,C=2`

simply because `c`

is propagated back as well as `d`

; and for both, we will pass `b`

.

In this version, we don't need to keep track of the currently found maximum. We only work on the current stike. From the moment we reach the end of the list, we return the length of the current strike. Furthermore if we start a new strike, we first look to the strikes on the right. Than we compare this with the current one. If the current one less than, we only let the ones on the right pass. In case these are equal, we both pass th strikes on the right and the current one. If our own strike is larger than the one(s) on the right, we let only the current strike pass.

This algorithm runs in *O(n log n)* (for sorting) and *O(n)* for finding the value that occurs most.

# Getting rid of the duplicate answers

We can get rid of the duplicated answers, by simply first construct a bag of tail strikes:

```
most_sort([Hb|T],Ha,Na,Hr,Nr) :-
findall(Hx/Nx,most_sort(T,Hb,1,Hx,Nx),Bag),
Bag = [_/Nc|_],
(
(Nc =< Na -> (Hr = Ha,Nr = Na); fail);
(Nc >= Na -> member(Hr/Nr,Bag); fail)
).
```

We know there is definitely something in the bag, because there are still elements on the right of the list, that will form a new strike. We collect these in the bag. We furthermore know that these elements all have the same count (otherwise they would not have passed other count tests). So we take the first element from the bag. Inspect the length, in case the length is less than or equal, we first answer with our own strike. In case the strikes in the bag are larger or equal, we pass all members in the bag.

# Boosting it a bit further

Because you use the `occurs_most`

frequently, on the same list, you can optimize the algorithm a bit, by sorting only once in the `make_all_equal`

method). Furthermore you can also put `length/2`

in the front, since the length of a list is fixed so you don't calculate the length *every time you find such most occurring value*. Finally you can boost `repeat/2`

as well: only construct one list with *one* variable, and then instantiate the single variable will save you a lot of work (say the list is thousands of elements long, you can do instantiation in *O(1)*).

```
make_all_equal(List, Cost, Result):-
length(List, N),
msort(List,Sorted),
repeat(X, N, Result),
occurs_most(Sorted, X, Nr),
Cost is N - Nr.
occurs_most([],_,0).
occurs_most([H|T],X,Nr) :-
most_sort(T,H,1,X,Nr).
most_sort([Ha|T],Ha,Na,Hr,Nr) :-
!,
Na1 is Na+1,
most_sort(T,Ha,Na1,Hr,Nr).
most_sort([Hb|T],Ha,Na,Hr,Nr) :-
findall(Hx/Nx,most_sort(T,Hb,1,Hx,Nx),Bag),
Bag = [_/Nc|_],
(
(Nc =< Na -> (Hr = Ha,Nr = Na); fail);
(Nc >= Na -> member(Hr/Nr,Bag); fail)
).
most_sort([],Ha,Na,Ha,Na).
repeat(_, 0, []):-
!.
repeat(X, N, [X|T]) :-
N > 0,
N1 is N - 1,
repeat(X, N1, T).
```