Occasionally, you need a more robust solution to solve a problem.
In my last post, I wrote about the horrors of this small code snippet:
    public byte[] Serialize(object o)
    {
        using (var stream = new MemoryStream())
        {
            MySerializer.Serialize(stream, o);
            return stream.ToArray();
        }
    }
One way to alleviate the memory pressure caused by frequently creating and destroying large objects is to tell the .NET garbage collector to compact the Large Object Heap (LOH):
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
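One detail worth knowing: CompactOnce is a one-shot request. It is honored by the next blocking full collection, after which the setting reverts to Default. Here is a minimal sketch, where the class and method names are hypothetical and the explicit GC.Collect() call is only there to make the compaction happen right away:

```csharp
using System;
using System.Runtime;

public static class LohCompaction
{
    // Hypothetical helper: requests a one-time LOH compaction and forces
    // a blocking full collection so the compaction happens immediately.
    public static GCLargeObjectHeapCompactionMode CompactNow()
    {
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;

        // Blocking full collection; the LOH is compacted as part of it.
        GC.Collect();

        // Report the mode that is in effect after the collection.
        return GCSettings.LargeObjectHeapCompactionMode;
    }
}
```

Forcing a full collection is expensive, so you would probably only do this at a quiet moment rather than on a hot path.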
However, this solution, while it may reduce the memory footprint of your application, does nothing to really solve the initial problem of allocating all that memory in the first place. As I mentioned in my last post, one way to accomplish that goal is to use a buffer pool.
The Buffer Manager
Instead of writing my own buffer manager, I am going to use the implementation Microsoft provides in Windows Communication Foundation (WCF), called BufferManager. Now that WCF is open source, you can read the implementation of BufferManager for yourself.
The BufferManager essentially sets aside a pool of byte arrays for reuse, up to a total size that you specify. You create an instance of a BufferManager by calling the static CreateBufferManager() method as follows:
BufferManager.CreateBufferManager(maxPoolSize, maxBufferSize);
The first parameter, maxPoolSize, is the maximum amount of memory, in bytes, that you want the BufferManager to keep in its pool in total. The second parameter, maxBufferSize, is the maximum size, in bytes, of an individual buffer you can obtain from the pool.
To ask for a buffer from the pool, you call the TakeBuffer(n) method, passing the size of the buffer you need, up to the size specified as maxBufferSize. To release the buffer back to the pool, you call the ReturnBuffer(buffer) method and pass back the buffer you previously took.
One noteworthy characteristic of the BufferManager is that requested sizes are rounded up to the next power of 2. That is, if you call TakeBuffer(100000); you will receive a buffer of size 2^17 (131,072), since 2^16 (65,536) is smaller than 100,000.
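The rounding arithmetic can be sketched as follows. This is only an illustration of the calculation described above, not BufferManager's actual code, and the class and method names are hypothetical:

```csharp
using System;

public static class BufferSizing
{
    // Illustrative only: returns the smallest power of 2 that is
    // greater than or equal to the requested size.
    public static int RoundUpToPowerOfTwo(int requested)
    {
        // Guard the range so the shift below can never overflow an int.
        if (requested <= 0 || requested > (1 << 30))
        {
            throw new ArgumentOutOfRangeException(nameof(requested));
        }

        int size = 1;
        while (size < requested)
        {
            size <<= 1; // 1, 2, 4, 8, ...
        }
        return size;
    }
}
```

For a request of 100,000 bytes, the loop passes 65,536 (too small) and stops at 131,072.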
WARNING: Once you return a buffer to the pool, you still hold a reference to that memory, so it is advisable to set your buffer variable to null immediately afterwards to avoid accidentally using it.
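One way to make the take/return discipline harder to get wrong is to wrap the work in try/finally. The sketch below is an assumption about how you might structure it: the helper name and the checksum work are hypothetical, and it needs a reference to the assembly that contains System.ServiceModel.Channels (System.ServiceModel on the .NET Framework).

```csharp
using System;
using System.ServiceModel.Channels;

public static class PooledBufferExample
{
    // Hypothetical helper: fills a pooled buffer and returns a checksum,
    // so no reference to the pooled array escapes this method.
    public static int SumWithPooledBuffer(BufferManager pool, int size)
    {
        byte[] buffer = pool.TakeBuffer(size);
        try
        {
            // The buffer may be larger than requested; only touch 'size' bytes.
            int sum = 0;
            for (int i = 0; i < size; i++)
            {
                buffer[i] = (byte)(i % 256);
                sum += buffer[i];
            }
            return sum;
        }
        finally
        {
            // Always hand the buffer back, even if the work above throws,
            // and drop the local reference so it cannot be reused by mistake.
            pool.ReturnBuffer(buffer);
            buffer = null;
        }
    }
}
```

A call might look like this, with pool sizes chosen arbitrarily for illustration:

```csharp
var pool = BufferManager.CreateBufferManager(1024 * 1024, 128 * 1024);
int sum = PooledBufferExample.SumWithPooledBuffer(pool, 1000);
```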
BufferManager and MemoryStream Together At Last
It is certainly possible to use the BufferManager and a MemoryStream together and get some relief from creating large objects.
For example, to prevent a MemoryStream from allocating its own buffer, you can pass one into its constructor as follows:
    var bm = BufferManager.CreateBufferManager(maxPoolSize, maxBufferSize);
    var buffer = bm.TakeBuffer(131072);
    using (var ms = new MemoryStream(buffer))
    {
        // Do work.
    }
    bm.ReturnBuffer(buffer);
    buffer = null;
This approach prevents MemoryStream from creating an internal buffer, which saves you both the large object and the allocation itself.
Not Everything Comes Up Roses
One of the problems with the BufferManager/MemoryStream approach is the fixed buffer size. Of course, MemoryStream can use a smaller portion of the buffer than it is given, but it will not increase its capacity; writing past the end of the supplied buffer throws a NotSupportedException. In some scenarios, it is desirable to have the buffer grow as needed, hence the almost ubiquitous use of MemoryStream’s parameterless constructor.
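The fixed capacity is easy to observe with nothing but System.IO. In this minimal sketch (the class and method names are hypothetical), a stream wrapped around a caller-supplied array accepts writes up to the array's length and throws a NotSupportedException beyond that:

```csharp
using System;
using System.IO;

public static class FixedCapacityDemo
{
    // Returns true if writing 'bytesToWrite' bytes into a stream wrapped
    // around a 'bufferSize'-byte array fails because the stream cannot grow.
    public static bool WriteOverflows(int bufferSize, int bytesToWrite)
    {
        var buffer = new byte[bufferSize];
        using (var ms = new MemoryStream(buffer))
        {
            try
            {
                ms.Write(new byte[bytesToWrite], 0, bytesToWrite);
                return false; // Everything fit in the supplied buffer.
            }
            catch (NotSupportedException)
            {
                return true; // The stream refused to expand.
            }
        }
    }
}
```

If you take this route, you need to size the pooled buffer for the largest payload you expect, or be ready to retry with a bigger buffer.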
Another issue with MemoryStream is the ToArray() method. This method is typically used to return the data in the stream as a byte array. To keep the result separate from the internal buffer, it allocates a new array and copies the stream's contents into it. Unfortunately, this could very well generate another large object. Furthermore, when the stream wraps a buffer you supplied, you will get the entire buffer back, even if you only wrote to half of it.
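To make the difference concrete, here is a small sketch (the class and method names are hypothetical) that compares ToArray() on a stream over a supplied buffer with ToArray() on a stream created by the parameterless constructor:

```csharp
using System.IO;

public static class ToArrayDemo
{
    // Stream over a caller-supplied buffer: ToArray() copies the whole buffer.
    public static int WrappedToArrayLength(int bufferSize, int bytesToWrite)
    {
        var buffer = new byte[bufferSize];
        using (var ms = new MemoryStream(buffer))
        {
            ms.Write(new byte[bytesToWrite], 0, bytesToWrite);
            return ms.ToArray().Length;
        }
    }

    // Growable stream: ToArray() copies only the bytes that were written.
    public static int GrowableToArrayLength(int bytesToWrite)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(new byte[bytesToWrite], 0, bytesToWrite);
            return ms.ToArray().Length;
        }
    }
}
```

Writing 10 bytes into a stream over a 1,024-byte buffer yields a 1,024-byte copy, while the growable stream yields a 10-byte copy.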
One way to work around this issue is to not use the ToArray() method. Instead, you could read the data out of the MemoryStream into your own array, obtained from the BufferManager. Unfortunately, this forces you to keep track of how much data you wrote to the stream.
    var bm = BufferManager.CreateBufferManager(maxPoolSize, maxBufferSize);
    var buffer = bm.TakeBuffer(131072);
    byte[] output;
    using (var ms = new MemoryStream(buffer))
    {
        // Assumes the serializer reports how many bytes it wrote.
        var bytesWritten = MySerializer.Serialize(ms, o);

        // Do more work.

        output = bm.TakeBuffer(bytesWritten);
        ms.Position = 0; // Rewind; Read() starts at the current position.
        ms.Read(output, 0, bytesWritten);
    }
    bm.ReturnBuffer(buffer);
    buffer = null;

    // Use 'output' for something.

    bm.ReturnBuffer(output);
    output = null;
Since BufferManager will give you a buffer sized to a power of 2, the buffer you receive will probably be larger than bytesWritten. Typically, a serializer will not return the number of bytes it serialized the object into, so this may be a problem for you to solve. It really is unfortunate that MemoryStream's Length will not tell you, since it is set to the Capacity when a buffer is provided at construction.
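One possible workaround, offered here as an assumption rather than something the serializer guarantees: if the serializer writes sequentially from the start of the stream and never seeks, the stream's Position after serialization equals the number of bytes written, even though Length reports the full buffer size. A minimal sketch, with hypothetical class and method names:

```csharp
using System;
using System.IO;

public static class WrittenBytesDemo
{
    // Hypothetical helper: copies only the bytes written so far from
    // 'source' (the array the stream wraps) into 'destination' and returns
    // the count. Assumes sequential writes from position 0 with no seeking.
    public static int CopyWritten(MemoryStream ms, byte[] source, byte[] destination)
    {
        int bytesWritten = (int)ms.Position;
        if (bytesWritten > destination.Length)
        {
            throw new ArgumentException("Destination is too small.", nameof(destination));
        }

        Buffer.BlockCopy(source, 0, destination, 0, bytesWritten);
        return bytesWritten;
    }
}
```

In the pooled scenario, both arrays would come from the BufferManager; the destination would need to be taken only after the write finishes, once the count is known.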
Finally, there is an issue with BufferManager itself: it does not always zero out the buffer when you take or return one. It is certainly possible to return a buffer and then take one that still contains data from its previous use. Array.Clear() should be pretty fast, but it’s another step you need to perform after taking or before returning a buffer if you feel it is necessary.
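If you decide the scrub is worth it, it only needs to cover the portion of the buffer you actually touched. A minimal sketch, with a hypothetical helper name, that you could call just before returning a buffer to the pool:

```csharp
using System;

public static class BufferScrubber
{
    // Hypothetical helper: zeroes the first 'usedBytes' bytes of a buffer,
    // clamped to the buffer's length, so stale data does not leak to the
    // next caller that takes the same array from the pool.
    public static void Scrub(byte[] buffer, int usedBytes)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));

        int count = Math.Min(Math.Max(usedBytes, 0), buffer.Length);
        Array.Clear(buffer, 0, count);
    }
}
```

Clearing only the used range keeps the cost proportional to the data written rather than to the rounded-up buffer size.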
It’s a Solution, Not The Only Solution
If it seems more trouble than it’s worth to use the BufferManager with a MemoryStream, you may be right. You can fall back to just having the garbage collector clean up after you, or we can look at other alternatives. In a future post, I’ll discuss using a custom implementation to replace the MemoryStream.