I'm trying to implement parallelism to this function I want it to take as many threads as possible, and write the results to a file.

The results need to be written in the file in the incrementing order so the first result needs to be written first the second second and so on.

The keyGen function is simply an MD5 of the integer m which is used as the start point for each chain. Reduction32 is a reduction function it takes the first 8 byte adds t and returns that value. When a chain reaches its endpoint it is stored in the binary file.

Is there a smart way to make this parallel? without screwing up the order the endpoints are stored in?

```
void tableGenerator32(uint32_t * text){
int mMax = 33554432, lMax = 236;
int m, t, i;
uint16_t * temp;
uint16_t * key, ep[2];
uint32_t tp;
FILE * write_ptr;
write_ptr = fopen("table32bits.bin", "wb");
for(m = 0; m < mMax ; m++){
key = keyGen(m);
for (t = 0; t < lMax; t++){
keyschedule(key);
temp = kasumi_enc(text);
tp = reduction32(t,temp);
temp[0]=tp>>16;
temp[1]=tp;
for(i=0; i < 8; i++){
key[i]=temp[i%2];
}
}
for(i=0;i<2;i++)
ep[i] = key[i];
fwrite(ep,sizeof(ep),1,write_ptr);
}
fclose(write_ptr);
}
```