I'm writing a program to encrypt files (large and small). My current method is to read 1024 bytes from a file, encrypt those bytes, write them to a temporary file, and repeat until finished. Once this process completes, the original file is deleted and the temporary file is renamed to take the name of the original.
Here is a sample piece of code that processes n bytes (n being 1024):
private void processChunk(BinaryReader Input, BinaryWriter Output, int n)
{
    // Read n bytes from the input stream (may return fewer at end of file)
    byte[] Data = Input.ReadBytes(n);

    // Generate the same number of keystream bytes from the stream cipher
    byte[] cipherData = StreamCipher.OutputBytes(Data.Length);

    // XOR each input byte with the corresponding keystream byte
    for (int x = 0; x < Data.Length; x++)
        Data[x] ^= cipherData[x];

    // Write the encrypted bytes to the output stream
    Output.Write(Data);
}
So I'm pretty sure I can't multi-thread the encryption algorithm itself, because the keystream bytes depend on the bytes generated before them. But can the file reading, the file writing, and the CPU work be run in parallel?
What's the best strategy to take here?
Answer:
Spontaneously, I would suggest to run three threads in parallel:
- A reader thread that reads chunks of data into memory.
- An encryption thread doing all the work.
- A writer thread that writes the encrypted data to disk.
The three threads communicate via two queues, such as the BlockingCollection<T> provided by .NET 4. See "Fast and Best Producer/consumer queue technique BlockingCollection vs concurrent Queue".
So thread 1 fills queue 1, thread 2 reads queue 1 and fills queue 2, and thread 3 reads queue 2. If any of the threads is faster than the others, the BlockingCollection will block the reading or writing thread until the thread on the other side has caught up. For example, if the BlockingCollection is set to a max size of 10, the reading thread will block after it has read 10 data chunks ahead of the encryption thread.
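A minimal sketch of this pipeline, reusing your XOR loop and assuming a `StreamCipher.OutputBytes` method like the one in your question (the file names, chunk size, and queue capacity are illustrative):

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

const int ChunkSize = 1024;

// Two bounded queues: reader -> encryptor, encryptor -> writer.
// The capacity of 10 caps how far any thread can run ahead.
var readQueue = new BlockingCollection<byte[]>(boundedCapacity: 10);
var writeQueue = new BlockingCollection<byte[]>(boundedCapacity: 10);

// Thread 1: read chunks of data into memory.
var reader = Task.Run(() =>
{
    using (var input = File.OpenRead("plain.bin"))
    {
        var buffer = new byte[ChunkSize];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            var chunk = new byte[read];
            Array.Copy(buffer, chunk, read);
            readQueue.Add(chunk); // blocks when the queue is full
        }
    }
    readQueue.CompleteAdding();
});

// Thread 2: the encryption thread doing all the work.
var encryptor = Task.Run(() =>
{
    foreach (var chunk in readQueue.GetConsumingEnumerable())
    {
        byte[] keystream = StreamCipher.OutputBytes(chunk.Length); // assumed API
        for (int i = 0; i < chunk.Length; i++)
            chunk[i] ^= keystream[i];
        writeQueue.Add(chunk);
    }
    writeQueue.CompleteAdding();
});

// Thread 3: write the encrypted data to disk.
var writer = Task.Run(() =>
{
    using (var output = File.Create("cipher.tmp"))
    {
        foreach (var chunk in writeQueue.GetConsumingEnumerable())
            output.Write(chunk, 0, chunk.Length);
    }
});

Task.WaitAll(reader, encryptor, writer);
```

`GetConsumingEnumerable` ends cleanly once the producer has called `CompleteAdding` and the queue drains, so no explicit end-of-stream sentinel is needed.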
One more observation: Input.ReadBytes will allocate a new byte array on the heap for every read. This array will be discarded after the current chunk is processed, so if you have large files and a fast encryption algorithm, memory allocation and garbage collection could actually noticeably impact the performance (.Net zeros memory buffers upon allocation). Instead, you could use a pool of buffers that are reserved and returned by the read and encryption threads, and use the Stream.Read method that accepts an existing buffer to write into.
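To illustrate that idea, here is a hypothetical sketch of such a buffer pool, again built on a BlockingCollection (the pool size, chunk size, and `inputStream` variable are assumptions, with `inputStream` standing in for an already-open FileStream):

```csharp
// A fixed pool of reusable buffers: allocated once, recycled forever.
var bufferPool = new BlockingCollection<byte[]>(boundedCapacity: 10);
for (int i = 0; i < 10; i++)
    bufferPool.Add(new byte[1024]);

// Reader side: take a free buffer and fill it with Stream.Read,
// which writes into the existing array instead of allocating a new one.
byte[] buffer = bufferPool.Take(); // blocks if no buffer is free
int bytesRead = inputStream.Read(buffer, 0, buffer.Length);

// ... encrypt buffer[0..bytesRead] and write it to the output file ...

// Writer side: once the chunk is on disk, return the buffer for reuse.
bufferPool.Add(buffer);
```

Because `Take` blocks when the pool is empty, the pool doubles as a back-pressure mechanism: the reader cannot get more than 10 chunks ahead of the writer, just like the bounded queues above.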