DOT NET TRICKS: Memory Allocation

Showing posts with label Memory Allocation. Show all posts

Monday, February 27, 2012

Inter-Process Communication using MemoryMappingFile

In an operating system, a Memory Mapping file are virtual storage place which has direct byte to byte correlation between the Virtual Address Space and the corresponding physical storage. So when we access the Virtual Address space via a memory mapping file we are directly communicating with the kernel space where the file is actually loaded. The portion of calculation between the physical storage and logical storage is hence avoided.

Memory Mapping files allows application to access files in the same way as memory. Generally as address between the physical storage and virtual memory address space, we cannot access the physical address directly. But using Memory Mapping Files, the process loads a specific range of address within the process address space with which the storage of memory into the file can be done by just assigning value to a dereferenced pointers. The IO operation on a MemoryMapping file is so fast that from the programmers point of view it seems to be like accessing the memory rather than actual physical storage. To increase performance memory mapping files are not actually stored to the disk file as well, but rather it will be stored automatically in background when FlushViewOfFile is executed or paging file is written. To Read more about memory mapping files, read here.

Memory Mapping file inside .NET Framework

.NET introduces a separate namespace to handle Memory Mapping files. Previously, we needed to do this using unmanaged Api's but with the introduction of managed API into the .NET framework library, it becomes very easy to handle MemoryMapping file directly from .NET library.

As memory mapping files are loaded into memory on a separate range of address space, two process can easily share the same page file instance and thus interprocess communication can be made with fast access to memory. It is recommended to back data with an actual disk file when large data is loaded into memory, so that there is no memory leak on the system when there is large memory pressure.

Optimizing INPC Objects against memory leaks using WeakEvents

Working with WPF has always been a fun. Dealing with animation and richness in UI to its optimum level often gives you an edge to present something to your client and to ensure that your client shouts with a "WOW!". The Wow factor of applications can give you high rank in the first place, but increases expectation from the software. It is practically very hard to maintain with this expectation as time progresses. The survival of the fittest chooses one which best suits to the problem.

One of the major problems that developers face today is the memory leaks in an applications. Often the software that is built looks great but does not follow basic guidelines to ensure that the application is not memory hungry or even there are no existing memory leaks.

Note : By memory leak we mean, some portion of memory is not reclaimed by the garbage collector even though the object is not in use.

By this way, the memory usage of the application increases at a certain extent and finally crashes with the OutOfMemoryException. To detect a memory leak there exists a large number of tools of which some are free while others are used as commercial purposes. Most of these problems can be fixed by either using Disposable pattern ( IDisposable interface) or manually de - referencing each and every object that are not in use. Sometimes, this can be also done in destructor / finalizers in .net too (but remember using destructors lose GC cycle).

Internals of .NET Objects and Use of SOS

Well, now getting deeper into the facts, lets talk about how objects are created in .NET and how type system is laid out in memory for this post in my Internals Series. As this is going to be very deep dive post, I would recommend to read this only if you want to kill your time to know the internal details of .NET runtime and also you have considerable working experience with the CLR types and type system.

Recently I have been talking with somebody regarding the actual difference between the C++ type system and managed C# type system. I fact the CLR Type system is different from the former as any object (not a value type) is in memory contains a baggage of information when laid out in memory. This makes CLR objects considerable different from traditional C++ programs.

Classification of Types

In .NET there are mainly two kind of Types.

Value Types (derived from System.ValueType)
Reference Type (derived directly from System.Object)

Even though ValueTypes are internally inherited from System.Object in its core, but CLR treats them very differently. Indeed from your own perception the Value Types are actually allocated in stacks (occationally) while reference types are allocated in Heaps. This is to reduce the additional contension of GC heaps for Heap allocation, GC cycles, occasional call to OS for additional memory needs etc. The object that is allocated in managed Heap is called Managed Object and the pointer that is allocated in stack to refer to the actual object in heap is called Object Reference (which is sometimes called as Managed Pointer).

ValueType and ReferenceType : Under the Hood Part 2

Well, if you have read my previous post, you should be already clear how memory of ValueTypes and ReferenceTypes are allocated and De-allocated internally in terms of IL. Here in this post, I am going to cover some more concepts behind ValueTypes and ReferenceTypes and what exactly comprises of them.

In my previous post on the series, I have told you that any type that inherits from System.ValueTypes is stored in Stack while any type that is not inherited from System.ValueType is stored in Heap. Well, the statement is not correct totally.

ValueTypes and ReferenceTypes : Under the Hood

In .NET, Value Type and Reference Types are forms an element of confusion between both developers and the students. Many of us take this as granted that Value Types are allocated in Thread Stack ( a 1 MB local stack created per Thread) and on each method calls the local value types are allocated in the Stack such that after the call ends, the object is deallocated. On the contrary, the reference types we know are those which are always allocated in heap (which is not always true, I will discuss later) and even though they are used as locals, and will be deallocated only after an interval by a separate Thread that is running with any .NET process (called finalizer thread) which occationally starts finding the memory blocks on Heap storage and compact and store only the reachable objects called Garbage Collector. Well, in this post, I am not going to cover the details of Garbage Collection, but rather I will focus more on Value Types and Reference Types that I know personally to clear any doubt regarding them in terms of IL constructs. So after reading the post, you will know some of the basics of IL too.

Hidden Facts of C# Structures in terms of MSIL

Did you ever thought of a program without using a single structure declared in it ? Never thought so na? I did that already but found it would be impossible to do that, at least it does not make sense as the whole program is internally running on messages passed from one module to another in terms of structures.

Well, well hold on. If you are thinking that I am talking about custom structures, you are then gone too far around. Actually I am just talking about declaration of ValueTypes. Yes, if you want to do a calculation, by any means you need an integer, and System.Int32 is actually a structure. So for my own program which runs without a single ValueType declared in it turns out to be a sequence of Print statements. Does'nt makes sense huh! Really, even I thought so. Thus I found it is meaningless to have a program which just calls a few library methods (taking away the fact that library methods can also create ValueType inside it) and prints arbitrary strings on the output window.

Please Note : As struct is actually an unit of ValueType, I have used it synonymously in the post.

So what makes us use Structures so often?

The first and foremost reason why we use structures is that

They are very simple with fast accessibility.
It is also created on Thread local stack, which makes it available only to the current scope and destroyed automatically after the call is returned. It doesn't need any external invisible body to clear up the memory allocated henceforth.
Most of the primitives have already been declared as struct.
In case of struct we are dealing with the object itself, rather than with the reference of it.
Implicit and explicit operators work great with struct and most of them are included already to the API.

Download Sample - 24KB

Garbage Collection Notifications in .NET 4.0

Memory management is primary for any application. From the very beginning, we have used destructors, or deleted the allocated memory whilst using the other programming languages like C or C++. C# on the other hand being a proprietor of .NET framework provides us a new feature so that the programmer does not have to bother about the memory deallocation and the framework does it automatically. The basic usage is :

Always leave local variable.
Set class variables, events etc to null.

This leaves us with lots of queries. When does the garbage collection executes? How does it affect the current application? Can I invoke Garbage collection when my system / application is idle ?

These questions might come to any developer who has just came to .NET environment. As for me too, I was doing the application just blindly taking it account that this might be the basic usage of .NET applications, and there might be a hidden hand for me who works for me in background. Until after few days, I got an alternative to call the Garbage collection using

GC.Collect()

But according to the documentation, GC.Collect() is a request.It is not guaranteed that the Collection process will start after calling the method and the memory is reclaim So, there is still uncertainty whether actually the Garbage Collection will occur or not.

Why GC.Collect is discouraged ?

GC.Collect actually forces the Garbage collection to invoke its collection process out of its regular cycle. It potentially decreases the performance of the application which calls it. As GC.Collect runs in the thread on which it is called, it starts and quickly finishes the call. In such phase of GC collection, actually the memory is not been reclaimed, rather it just produces a thorough scan on objects that are out of scope. The memory ultimately freed whenever Full Garbage collection takes place.
For reference you can read Scotts blog entry.

Whats new in .NET 4.0 (or .NET 3.5 SP1)
Well, GC is changed a bit with the introduction of .NET 4.0, so that the programmer have better flexibility on how to use GC. One of such changes is the signature of GC.Collect()

Garbage Collection Algorithm with the use of WeakReference

We all know .NET objects deallocates memory using Garbage Collection. Garbage collection is a special process that hooks in to the object hierarchy randomly and collects all the objects that are not reachable to the application running. Let us make Garbage collection a bit clear before moving to the alternatives.

Garbage Collection Algorithm

In .NET, every object is allocated using Managed Heap. We call it managed as every object that is allocated within the .NET environment is in explicit observation of GC. When we start an application, it creates its own address space where the memory used by the application would be stored. The runtime maintains a pointer which points to the base object of the heap. Now as the objects are created, the runtime first checks whether the object can be created within the reserved space, if it can it creates the object and returns the pointer to the location, so that the application can maintain a Strong Reference to the object. I have specifically used the term Strong Reference for the object which is reachable from the application. Eventually the pointer shifts to the next base address space.

When GC strikes with the assumption that all objects are garbage, it first finds all the Strong References that are global to the application, known as Application Roots and go on object by object. As it moves from object to object, it creates a Graph of all the objects that it finds from the application Roots, such that every object in the Graph is unique. When this process is finished, the Graph will contain all the objects that are somehow reachable to the application. Now as the GC already identified the objects that are not garbage to the application, it goes on Compaction. It linearly traverses to all the objects and shifts the objects that are reachable to non reachable space which we call as Heap Compaction. As the pointers are moved during the Heap compaction, all the pointers are reevaluated again so that the application roots are pointing to the same reference again.

WeakReference as an Exception

On each GC cycle, a large number of objects are collected to release the memory pressure of the application. As I have already stated, that it finds all the objects that are somehow reachable to the Application Roots. The references that are not collected during the Garbage Collection are called StrongReference, as by the definition of StrongReference, the objects that are reachable to the GC are called StrongReference objects.

This creates a problem. GC is indeterminate. It randomly starts deallocating memory. So say if one have to work with thousand bytes of data at a time, and after it removes the references of the object it had to rely on the time when GC strikes again and removes the reference. You can use GC.Collect to request the GC to start collecting, but this is also a request.

Now say you have to use the large object once again, and you removed all the references to the object and need to create the object again. Here comes huge memory pressure. So in such situation you have :

Already removed all references of the object.
Garbage collection didnt strike and removed the address allocated.
You need the object again.

In such a case, even though the object still in the application memory area, you still need to create another object. Here comes the use of WeakReference.

Download Sample Application - 27 KB

Memory Management in .NET

In .NET memory is managed through the use of Managed Heaps. Generally in case of other languages, memory is managed through the Operating System directly. The program is allocated with some specific amount of memory for its use from the Raw memory allocated by the Operating system and then used up by the program. In case of .NET environment, the memory is managed through the CLR (Common Language Runtime) directly and hence we call .NET memory management as Managed Memory Management.

Allocation of Memory

Generally .NET is hosted using Host process, during debugging .NET creates a process using VSHost.exe which gives the programmer the basic debugging facilities of the IDE and also direct managed memory management of the CLR. After deploying your application, the CLR creates the process in the name of its executable and allocates memory directly through Managed Heaps.

When CLR is loaded, generally two managed heaps are allocated; one is for small objects and other for Large Objects. We generally call it as SOH (Small Object Heap) and LOH (Large Object Heap). Now when any process requests for memory, it transfers the request to CLR, it then assigns memory from these Managed Heaps based on their size. Generally, SOH is assigned for the memory request when size of the memory is less than 83 KBs( 85,000 bytes). If it is greater than this, it allocates memory from LOH. On more and more requests of memory .NET commits memory in smaller chunks.

Now let’s come to processes. Generally a process can invoke multiple threads, as multi-threading is supported in .NET directly. Now when a process creates a new thread, it creates its own stack, i.e. for the main thread .NET creates a new Stack which keeps track of all informations associated with that particular thread. It keeps informations regarding the current state of the thread, number of nested calls etc. But every thread is using the same Heap for memory. That means, Heaps are shared through all threads.

Upon request of memory from a thread say, .NET allocates its memory from the shared Heap and moves its pointer to the next address location. This is in contrast to all other programming languages like C++ in which memory is allocated in linked lists directly managed by the Operating system, and each time memory requests is made by a process, Operating system searches for the big enough block. Still .NET win32 application has the limitation of maximum 2GB memory allocation for a single process.

32 bit processors have 32 bits of address space for locating a single byte of data. This means each 2^32 unique address locations that each byte of data can locate to, means 4.2 billion unique addresses (4GB). This 4GB memory is evenly distributed into two parts, 2 GB for Kernel and 2 GB for application usage.

De- Allocation of Memory

De - allocation of memory is also different from normal Win32 applications..NET has a sophisticated mechanism to de-allocate memory called Garbage Collector. Garbage Collector creates a thread that runs throughout the runtime environment, which traces through the code running under .NET. .NET keeps track of all the accessible paths to the objects in the code through the Graph of objects it creates. The relationships between the Object and the process associated with that object are maintained through a Graph. When garbage collection is triggered it deems every object in the graph as garbage and traverses recursively to all the associated paths of the graph associated with the object looking for reachable objects. Every time the Garbage collector reaches an object, it marks the object as reachable. Now after finishing this task, garbage collector knows which objects are reachable and which aren’t. The unreachable objects are treated as Garbage to the garbage collector. Next, it releases all the unreachable objects and overwrites the reachable objects with the Unreachable ones during the garbage collection process. All unreachable objects are purged from the graph. Garbage collection is generally invoked when heap is getting exhausted or when application is exited or a process running under managed environment is killed.

Garbage collector generally doesn’t take an object as Garbage if it implements Finalize method. During the process of garbage collection, it first looks for the object finalization from metadata. If the object has implemented Finalize(), garbage collector doesn’t make this object as unreachable, but it is assigned to as Reachable and a reference of it is placed to the Finalization queue. Finalize is also handled by a separate thread called Finalizer thread which traces through the finalizer queue and calls the finalize of each of those objects and then marks for garbage collection. Thus, if an object is holding an expensive resource, the finalize should be used. But there is also a problem with this, if we use finalize method, the object may remain in memory for long even the object is unreachable. Also, Finalize method is called through a separate thread, so there is no way to invoke it manually when the object life cycle ends.

Because of this, .NET provides a more sophisticated implementation of memory management called Dispose, which could be invoked manually during object destruction. The only thing that we need is to write the code to release memory in the Dispose and call it manually and not in finalize as Finalize() delays the garbage collection process.

Cost of Finalize in your Program:

Now let us talk about the cost that you have to bear if you have implemented indeterministic approach of .NET and included Finalize in your class. To make it clear you must know how GC works in CLR:

Generation 0 object means the objects that we have declared after last garbage collection is invoked. 1st Generation objects means which is persisting for last 1 GC cycle. Likewise 2nd Generation objects and so on. Now GC does imposes 10 examinies for 0 to 1 generation objects before doing actual Garbage Collection. For 1 to 2 Generation objects it does 100 examinees before collecting.

Now lets think of Finalize, an object that implemented Finalize will remain 9 cycle more than it would actually collected. If it still not finalized, it would move to Geeration 2 and have to go through 100 examinees to be collected. Thus use of Finalize is generally very expensive in your program.

IDisposable implementation:

For Deterministic approach of resource deallocation, microsoft introduced IDisposable interface to clear up all the resources that may be expensive.

Let us take an example :

Protected virtual void Dispose(bool isDisposing) { if(IsDisposed) return; if(isDisposing) { // Dispose all Managed Resources } IsDisposed = true; GC.SuppressFinalize(this); }

Now let us explain,
The first line indicates an if condition statement, Here I have checked if the object is already disposed or not. This is very essential, as in code one can call dispose a multiple times, we need to always check whether the object is already disposed or not. Then we did the disposing, and then made IsDisposed to true.
Now GC.SuppressFinalize will suppress the call to finalize if it is there. This is because, if the user already disposed the object and cleared up all the expensive resources using deterministic approach of deallocation, we dont need the GC to wait to call Indeterministic Finalize method during the Garbage Collection process.

For local objects, we can call dispose directly after using the object. We can also make use of Using block or try/catch block for automatic disposal of objects.

Note: In case of USING, you must remember it works only with the objects that Implements IDisposable. If you use object that dont have implemented IDisposable interface in using block, .NET will through error.

DOT NET TRICKS

Monday, February 27, 2012

Inter-Process Communication using MemoryMappingFile

Saturday, February 4, 2012

Optimizing INPC Objects against memory leaks using WeakEvents

Monday, September 26, 2011

Internals of .NET Objects and Use of SOS

Sunday, July 17, 2011

ValueType and ReferenceType : Under the Hood Part 2

Saturday, July 16, 2011

ValueTypes and ReferenceTypes : Under the Hood

Saturday, October 16, 2010

Hidden Facts of C# Structures in terms of MSIL

Thursday, August 12, 2010

Garbage Collection Notifications in .NET 4.0

Sunday, August 8, 2010

Garbage Collection Algorithm with the use of WeakReference

Thursday, October 9, 2008

Memory Management in .NET

Author's new book

Join me to get updated

About Me