I have the following problem and I only have a slight idea about it:

Consider a tape storage problem. Given `n`

files of length `l1,...,ln`

and frequencies with which they are accessed `f1,...,fn`

, where sum of all frequencies is 1 and `0<fi<1`

. "Optimal" means to minimize the average retrieval time of these files. For example, if you have two files, each of length `10, 4`

and frequency `0.8, 0.2`

, if store file 1 first, the average retrieval time is `10 x 0.8 + 14 x 0.2 = 10.8`

.

Design an algorithm to find the optimal order and prove that it works.

My thoughts: Larger frequencies & larger length in front of the order, but which factor should be given a higher priority?

# Best How To :

There is no need for Dynamic Programming for this problem. It is a simple sorting problem, with a slight twist.

Find `frequency / length`

for each file. This gets you, how frequent is each unit length access for the file. Sort these in descending order since each file is "penalized" by the previous file's length.

**Proof:**

Assume that the order is fixed somehow. Then, swapping any two adjacent files will not affect any of the other `li * fi`

other than the location of the swap. Call the left one in the swap `x`

and other one `y == x+1`

. Then, the change in average retrieval time is `ly*fx - lx*fy`

. If `fx/lx < fy/ly`

, then `ly*fx < lx*fy`

which means `ly*fx - lx*fy < 0`

, which means that the average retrieval time decreased. This means that we are at an advantage to do this swap. We keep doing such swaps whenever `fx/lx < fy/ly`

and we will get to a point when this is no longer possible. This must be the optimal answer.

Swapping adjacent things whenever `left < right`

, is similar to bubble sorting in descending order, so we can just generally use any sort instead.