MSIL Programming Part 2

By Ajay Yadav on 11/1/2014
Language: CIL
Technology: .NET
Platform: Windows
License: CPOL
Views: 21,227

Note: This article is the second of a two-part series on MSIL programming. Also see MSIL Programming Part 1.

Introduction

The goal of this article is to explain the CIL syntax and semantics for defining object-oriented language elements such as namespace, interface, field, class, etc. CIL programming is typically done using non-specialized editors that do not provide Intellisense and other rich-environment features available in the Visual Studio IDE. Yet IL developers can still implement all the same constructs they would typically implement using higher level languages, such as C#. And having an understanding of the inner workings of CIL and .NET is beneficial, especially while doing code optimization, debugging, or reverse engineering (such as malicious code detection or subverting security mechanisms).

Field Metadata

Fields allocate storage for data of .NET types, such as numeric, character, and string types. You declare a field with the .field directive, which takes three parts: flags (access modifiers and other attributes), the field's type, and its name. A field's type can be either a value type or a reference type.

.field <flags> <type> <name>

The field flags determine the accessibility of the field inside and outside the assembly (public, private), its contract (static, initonly, literal), and reserved attributes (marshal, rtspecialname). The type indicates the kind of data (string, character, numeric types, etc.) to be stored at this location.

Data can be declared either at class level (as fields of the type) or locally (as local variables inside a method body). The code in Syntax 1 shows data fields defined in the class-level scope.

Syntax 1: Field Declarations in IL Code

// Integer data type
.field private int32 iVal

// Float data type
.field private float32 fVal

// String data type
.field private string sVal

// Character data type
.field private char cVal

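A default value can also be attached to a class-level field declaration, as follows. Note that, according to the ECMA-335 specification, this value is only recorded in metadata and is intended mainly for literal (constant) fields; the runtime does not use it to initialize an ordinary field automatically.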

// Declaration with assignment 
.field private int32 iVal = int32(50)
.field private string sVal = "Ajay"
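Because a metadata default is not applied to an ordinary instance field at run time, the assignment is normally emitted inside the constructor. The following is a minimal sketch of that pattern, not code from the original article; the assembly name initDemo and the class cilFields.InitDemo are hypothetical, and it assumes the ilasm tool from the .NET Framework.

```cil
// Hypothetical names: assembly initDemo, class cilFields.InitDemo
.assembly extern mscorlib { }
.assembly initDemo { }

.class public auto ansi beforefieldinit cilFields.InitDemo
       extends [mscorlib]System.Object
{
  .field private int32 iVal

  .method public hidebysig specialname rtspecialname
          instance void .ctor() cil managed
  {
    .maxstack 2
    ldarg.0                  // this
    call instance void [mscorlib]System.Object::.ctor()
    ldarg.0                  // this
    ldc.i4.s 50              // value to store
    stfld int32 cilFields.InitDemo::iVal   // this.iVal = 50
    ret
  }
}
```

With this constructor in place, every new instance starts with iVal set to 50, regardless of any default recorded in metadata.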

Data items that are defined inside a method body are considered to be local variables, and are declared using the .locals directive. Here, we define a local integer variable.

.locals init ([0] int32 x)

The code in Listing 1 demonstrates some common operations using both local variables and a class-level field (declared outside the method).

Listing 1: Field Declaration in IL Code

.module cilFields.exe

.class private auto ansi beforefieldinit cilFields.fldsDemo
                     extends [mscorlib]System.Object
{
  .field private int32 x
  .method public hidebysig instance void testCal() cil managed
  {
    .maxstack  2
    .locals init ([0] int32 z,  [1] int32 y)
    IL_0000:  nop
    IL_0001:  ldc.i4.s   50
    IL_0003:  stloc.1
    IL_0004:  ldarg.0
    IL_0005:  ldfld      int32 cilFields.fldsDemo::x
    IL_000a:  ldloc.1
    IL_000b:  add
    IL_000c:  stloc.0
    IL_000d:  ldstr      "Fields Demo:: Calculation is {0}"
    IL_0012:  ldloc.0
    IL_0013:  box        [mscorlib]System.Int32
    IL_0018:  call       void [mscorlib]System.Console::WriteLine(string, object)
    IL_001d:  nop
    IL_001e:  ret
  }
}

Properties Metadata

Properties enable strict control over access to the internal state of an object. To the caller they behave like a public field, and the notation to access a property is the same as for a public field on the instance. Underneath, a property is a shorthand for accessor methods that read and write a field. The .property directive defines a property and ties it to its accessor methods through .get and .set, as shown in Syntax 2 and Listing 2.

Syntax 2: Properties Declaration in IL Code

.property instance int32 iVal()
{
  .get instance int32 NamespaceName.ClassName::get_iVal()
  .set instance void NamespaceName.ClassName::set_iVal(int32)
}

Listing 2: Properties Declaration in IL Code

.class private auto ansi beforefieldinit cilProperties.cPrptDemo extends [mscorlib]System.Object
{
  .field private string 'Color__Field'
  
  .method public hidebysig specialname instance string get_Color() cil managed
  {
    .maxstack  1
    .locals init (string V_0)
    IL_0000:  ldarg.0
    IL_0001:  ldfld      string cilProperties.cPrptDemo::'Color__Field'
    IL_0006:  stloc.0
    IL_0007:  br.s       IL_0009
    IL_0009:  ldloc.0
    IL_000a:  ret
  } 

  .method public hidebysig specialname instance void set_Color(string 'value') cil managed
  {
    .maxstack  8
    IL_0000:  ldarg.0
    IL_0001:  ldarg.1
    IL_0002:  stfld      string cilProperties.cPrptDemo::'Color__Field'
    IL_0007:  ret
  } 

  .method public hidebysig instance void Display() cil managed
  {
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldstr      "Property Demo::Color is {0}"
    IL_0006:  ldarg.0
    IL_0007:  call       instance string cilProperties.cPrptDemo::get_Color()
    IL_000c:  call       void [mscorlib]System.Console::WriteLine(string,object)
    IL_0011:  nop
    IL_0012:  ret
  } 

  .property instance string Color()
  {
    .get instance string cilProperties.cPrptDemo::get_Color()
    .set instance void cilProperties.cPrptDemo::set_Color(string)
  } 
}
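To see how a caller uses such a property, it may help to note that IL has no property-access instruction: reading or writing a property is just a call to its accessor method. The sketch below is a self-contained illustration under that assumption; the assembly propUsage, the class propUsage.Car, and the field colorValue are hypothetical names, not part of the listing above.

```cil
// Hypothetical names: assembly propUsage, class propUsage.Car, field colorValue
.assembly extern mscorlib { }
.assembly propUsage { }

.class public auto ansi beforefieldinit propUsage.Car
       extends [mscorlib]System.Object
{
  .field private string colorValue

  .method public hidebysig specialname rtspecialname
          instance void .ctor() cil managed
  {
    .maxstack 1
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor()
    ret
  }

  .method public hidebysig specialname instance string get_Color() cil managed
  {
    .maxstack 1
    ldarg.0
    ldfld string propUsage.Car::colorValue
    ret
  }

  .method public hidebysig specialname instance void set_Color(string 'value') cil managed
  {
    .maxstack 2
    ldarg.0
    ldarg.1
    stfld string propUsage.Car::colorValue
    ret
  }

  .property instance string Color()
  {
    .get instance string propUsage.Car::get_Color()
    .set instance void propUsage.Car::set_Color(string)
  }

  .method public hidebysig static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    .locals init ([0] class propUsage.Car theCar)
    newobj instance void propUsage.Car::.ctor()
    stloc.0
    ldloc.0
    ldstr "Red"
    callvirt instance void propUsage.Car::set_Color(string)  // theCar.Color = "Red"
    ldloc.0
    callvirt instance string propUsage.Car::get_Color()      // read theCar.Color
    call void [mscorlib]System.Console::WriteLine(string)
    ret
  }
}
```

The .property block itself generates no executable code; it only lets tools and compilers associate the name Color with the two accessor methods.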

Namespace

A namespace is a collection of related .NET types, such as classes, interfaces, etc., contained within an assembly. A single assembly can contain more than one namespace. In IL code, a namespace is declared using the .namespace directive, as shown in Syntax 3.

Syntax 3: Namespace Declaration

.namespace testNamespace
{
    // Classes declaration section
}

Namespaces can also be nested.

.namespace parent
{
    // Classes declaration section

    .namespace child
    {
        // Classes declaration section
    }
}

// Or nested namespace can be specified as
.namespace parent.child { }
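In practice, the namespace simply becomes the prefix of each contained type's full name, and that full name is what other code uses to refer to the type. A minimal sketch, assuming hypothetical names (assembly nsDemo, classes Greeter and Program):

```cil
// Hypothetical names: assembly nsDemo, classes Greeter and Program
.assembly extern mscorlib { }
.assembly nsDemo { }

.namespace parent.child
{
  .class public auto ansi Greeter extends [mscorlib]System.Object
  {
    .method public hidebysig static void Hello() cil managed
    {
      .maxstack 1
      ldstr "Hello from parent.child.Greeter"
      call void [mscorlib]System.Console::WriteLine(string)
      ret
    }
  }
}

.class public auto ansi Program extends [mscorlib]System.Object
{
  .method public hidebysig static void Main() cil managed
  {
    .entrypoint
    .maxstack 1
    // The type is referenced by its full name: namespace prefix plus class name
    call void parent.child.Greeter::Hello()
    ret
  }
}
```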

It's important to remember that a namespace is not a separate metadata item and has no metadata token of its own; it exists only as the prefix of each type's full name. The metadata in Figure 1 shows that none of the metadata items or tokens relates to a namespace.

Figure 1: Metadata

Class Metadata

A class type in IL code is defined using the .class directive. If no base class is specified, the class implicitly derives from the .NET System.Object base class. A class must always be referred to by its full name, even if it resides in the same assembly.

Syntax 3: Class Declaration

.namespace testNamespace
{
    // Classes declaration section

    .class public myClass
    {
        //Class members
    }
}

As we know, C# code controls the visibility of fields, methods, classes, and properties through various keywords such as public, private, abstract, sealed, etc. Table 1 describes the IL keywords for controlling the availability of types inside or outside an assembly.

Table 1: Visibility Attributes IL Code

Keyword (Attribute)                               Description
extends                                           Allows a child class to inherit from a base class
implements                                        Enables a class to implement the functionality of an interface
sealed                                            Defines a sealed class, which cannot be inherited from
abstract                                          Defines an abstract class
public, private, nested public, nested private    Controls the visibility of the type inside or outside the assembly
auto, sequential, explicit                        Controls how the type's fields are laid out in memory (auto is the default)

Constructor Metadata

Constructors are used to initialize types and their instances, and are defined in IL code as methods with the special names .ctor and .cctor. .ctor represents an instance-level constructor, while .cctor represents a static (type-level) constructor. Note that constructors do not return a value and so are declared as returning void. Syntax 4 demonstrates a default class constructor.

Syntax 4: Parameterless Constructor Declaration

.method public hidebysig specialname rtspecialname instance void .ctor() cil managed {}

In the code above, it is mandatory to include the specialname and rtspecialname attributes, which uniquely identify a constructor definition in the IL code. Syntax 5 demonstrates declaring a constructor that accepts an integer argument.

Syntax 5: Parameter Constructor Declaration

.field private int32 iValue 

.method public hidebysig specialname rtspecialname instance void .ctor(int32 i) cil managed 
{
    // Implementation code
}

Listing 3 demonstrates a parameterless instance constructor that calls the constructor of its base class, System.Object.

Listing 3: Constructor Declaration in IL Code

.module cilConstructor.exe

.class public auto ansi beforefieldinit cilConstructor.EntryPint
       extends [mscorlib]System.Object
{
    .method public hidebysig specialname rtspecialname 
            instance void  .ctor() cil managed
    {
        .maxstack  8
        IL_0000:  ldarg.0
        IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
        IL_0006:  ret
    } 
}
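The listing above covers only the instance constructor. For comparison, the following sketch shows what a static constructor (.cctor) might look like; the assembly cctorDemo, the class cctorDemo.Config, and the field Limit are hypothetical names chosen for illustration.

```cil
// Hypothetical names: assembly cctorDemo, class cctorDemo.Config, field Limit
.assembly extern mscorlib { }
.assembly cctorDemo { }

.class public auto ansi cctorDemo.Config extends [mscorlib]System.Object
{
  .field public static int32 Limit

  // Static constructor: marked static, takes no parameters, has no 'this'
  .method private hidebysig specialname rtspecialname static
          void .cctor() cil managed
  {
    .maxstack 1
    ldc.i4.s 100
    stsfld int32 cctorDemo.Config::Limit   // Config.Limit = 100
    ret
  }

  .method public hidebysig static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    ldstr "Limit is {0}"
    ldsfld int32 cctorDemo.Config::Limit
    box [mscorlib]System.Int32
    call void [mscorlib]System.Console::WriteLine(string, object)
    ret
  }
}
```

The runtime invokes .cctor itself before the type's static members are first used, so no IL code calls it explicitly.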

Interface Metadata

An interface describes the properties and methods that a class type must provide. Unlike a class, however, an interface cannot be instantiated. An interface has no base type: it cannot be derived from another type the way a class can, and it can only implement other interfaces. An interface must not be sealed, and the methods defined in an interface must be marked as virtual. Interface types are defined through the .class directive together with the interface keyword, as shown in Syntax 6.

Syntax 6: Interface Declaration

.namespace testNamespace
{
    // Interface declaration section

    .class public interface myInterface
    {
        // Properties and methods
    }
}

A class can expose and implement an interface using the implements keyword. Syntax 7 demonstrates this.

Syntax 7: Interface Implementation in Class

.class public myClass implements testNamespace.myInterface
{
    // Members utilization
}

Listing 4: Interface Declaration in IL Code

.class interface public abstract auto ansi cilInterface.ITestInterface
{
  .method public hidebysig newslot abstract 
         virtual instance void  sqrt(float64 i) cil managed { } 
} 

.class public auto ansi beforefieldinit cilInterface.cInrfDemo
       extends [mscorlib]System.Object
       implements cilInterface.ITestInterface
{
  .method public hidebysig newslot virtual final 
          instance void  sqrt(float64 i) cil managed
  {
    .maxstack  2
    .locals init ([0] float64 cal)
    IL_0000:  nop
    IL_0001:  ldarg.1
    IL_0002:  call       float64 [mscorlib]System.Math::Sqrt(float64)
    IL_0007:  stloc.0
    IL_0008:  ldstr      "Interface Demo:: Sqrt is {0}"
    IL_000d:  ldloc.0
    IL_000e:  box        [mscorlib]System.Double
    IL_0013:  call       void [mscorlib]System.Console::WriteLine(string,object)
    IL_0018:  nop
    IL_0019:  ret
  } 
} 
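Code that consumes an interface typically holds the object in a variable of the interface type and invokes its methods with callvirt. The following self-contained sketch illustrates this under assumed names (assembly ifaceDemo, interface IShape, class Square); it is not taken from the original listing.

```cil
// Hypothetical names: assembly ifaceDemo, interface IShape, class Square
.assembly extern mscorlib { }
.assembly ifaceDemo { }

.class interface public abstract auto ansi ifaceDemo.IShape
{
  .method public hidebysig newslot abstract virtual
          instance float64 Area() cil managed { }
}

.class public auto ansi beforefieldinit ifaceDemo.Square
       extends [mscorlib]System.Object
       implements ifaceDemo.IShape
{
  .method public hidebysig specialname rtspecialname
          instance void .ctor() cil managed
  {
    .maxstack 1
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor()
    ret
  }

  .method public hidebysig newslot virtual final
          instance float64 Area() cil managed
  {
    .maxstack 1
    ldc.r8 4.0               // fixed area, for illustration only
    ret
  }

  .method public hidebysig static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    .locals init ([0] class ifaceDemo.IShape shape)
    newobj instance void ifaceDemo.Square::.ctor()
    stloc.0                  // keep the object as an IShape reference
    ldstr "Area is {0}"
    ldloc.0
    callvirt instance float64 ifaceDemo.IShape::Area()  // dispatched to Square::Area
    box [mscorlib]System.Double
    call void [mscorlib]System.Console::WriteLine(string, object)
    ret
  }
}
```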

Structure Metadata

Structures are user-defined value types. They can contain any number of data fields, and members that operate on those fields. In IL, a structure is a class that extends System.ValueType (as in Listing 5) and must be defined as sealed. Syntax 8 demonstrates using the .class directive to define a structure.

Syntax 8: Structure Declaration

.namespace testNamespace
{
    // Structure declaration section
    .class public sealed myStructure extends [mscorlib]System.ValueType
    {
        // Members
    }
}

The program in Listing 5 defines a structure type that has one integer field, and a method that performs some operations on that field.

Listing 5: Structure Declaration in IL Code

.module cilStructure.exe

.class private sequential ansi sealed beforefieldinit cilStructure.sTest
       extends [mscorlib]System.ValueType
{
  .field public int32 y
  .method public hidebysig instance void square() cil managed
  {
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldarg.0
    IL_0002:  ldc.i4.4
    IL_0003:  stfld      int32 cilStructure.sTest::y
    IL_0008:  ldstr      "Square is {0}"
    IL_000d:  ldarg.0
    IL_000e:  ldfld      int32 cilStructure.sTest::y
    IL_0013:  ldarg.0
    IL_0014:  ldfld      int32 cilStructure.sTest::y
    IL_0019:  mul
    IL_001a:  box        [mscorlib]System.Int32
    IL_001f:  call       void [mscorlib]System.Console::WriteLine(string,object)
    IL_0024:  nop
    IL_0025:  ret
  } 
} 
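Because a structure is a value type, a caller usually works with it through the address of a local variable rather than through newobj. A minimal sketch of that usage, assuming hypothetical names (assembly structDemo, struct structDemo.Point with a field X):

```cil
// Hypothetical names: assembly structDemo, struct structDemo.Point, class Program
.assembly extern mscorlib { }
.assembly structDemo { }

.class public sequential ansi sealed beforefieldinit structDemo.Point
       extends [mscorlib]System.ValueType
{
  .field public int32 X
}

.class public auto ansi Program extends [mscorlib]System.Object
{
  .method public hidebysig static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    .locals init ([0] valuetype structDemo.Point p)
    ldloca.s 0               // address of the local struct
    initobj structDemo.Point // zero-initialize it in place
    ldloca.s 0
    ldc.i4.7
    stfld int32 structDemo.Point::X   // p.X = 7
    ldstr "X is {0}"
    ldloca.s 0
    ldfld int32 structDemo.Point::X
    box [mscorlib]System.Int32
    call void [mscorlib]System.Console::WriteLine(string, object)
    ret
  }
}
```

Note that the local is declared with the valuetype keyword, whereas reference types in the other listings use class.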

Enum Metadata

An enum is also a value type in the CLR. It derives from System.Enum (as in Listing 6) and must be marked using the sealed keyword in the IL code, as demonstrated in Syntax 9.

Syntax 9: Enum Declarations

.namespace testNamespace
{
    // Enumerator declaration section

    .class public sealed myEnum extends [mscorlib]System.Enum
    {
        // Members
    }
}

An enumeration typically contains constant (static literal) fields, each defined with a value within the range of the underlying type, plus a single instance field, value__, that holds the underlying value. Listing 6 illustrates this by defining three constant values: Red, Green, and Blue.

Listing 6: Enumerator Declaration in IL Code

.module cilEnum.exe

.class public auto ansi sealed cilEnum.eColor extends [mscorlib]System.Enum
{
    .field public specialname rtspecialname int32 value__
    .field public static literal valuetype cilEnum.eColor Red = int32(0x00000014)
    .field public static literal valuetype cilEnum.eColor Green = int32(0x00000032)
    .field public static literal valuetype cilEnum.eColor Blue = int32(0x00000050)
}
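Literal fields have no storage at run time, so code that uses an enum member loads the underlying constant directly rather than reading the field. The sketch below illustrates this with hypothetical names (assembly enumDemo, enum enumDemo.Color); boxing the integer as the enum type should let Console.WriteLine display the member's name instead of the number.

```cil
// Hypothetical names: assembly enumDemo, enum enumDemo.Color, class Program
.assembly extern mscorlib { }
.assembly enumDemo { }

.class public auto ansi sealed enumDemo.Color extends [mscorlib]System.Enum
{
  .field public specialname rtspecialname int32 value__
  .field public static literal valuetype enumDemo.Color Red = int32(20)
  .field public static literal valuetype enumDemo.Color Green = int32(50)
}

.class public auto ansi Program extends [mscorlib]System.Object
{
  .method public hidebysig static void Main() cil managed
  {
    .entrypoint
    .maxstack 1
    ldc.i4.s 50              // the constant behind Color.Green
    box enumDemo.Color       // box it as the enum type, not as Int32
    call void [mscorlib]System.Console::WriteLine(object)
    ret
  }
}
```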

Generics Metadata

Generics allow us to build open types with type parameters, which are turned into closed types at run time once type arguments are supplied. We can build generic classes that hold integers, strings, or any other object type. Generic collections are generally preferable to their non-generic counterparts, such as ArrayList, because they provide type safety and avoid boxing of value types.

Generic types are named using a backtick ( ` ) in IL code, followed by a numeric value that represents the number of generic type parameters.

Syntax 10: Generics Declarations

// Generic instantiation (an instruction inside a method body)

newobj instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()

The IL code above would map to the following C# code, where we instantiate the generic List<T> type with int as its type argument.

List<int> gObj= new List<int>();

Similarly, the code in Listing 7 uses a List<int> to store two integers and prints their sum, without needing any casts or unboxing when the elements are read back.

Listing 7: Generic Declaration in IL Code

.module cilGenerics.exe

.class private auto ansi beforefieldinit cilGenerics.cGenrcDemo  extends [mscorlib]System.Object
{
  .method public hidebysig instance void Addition() cil managed
  {
    .maxstack  4
    .locals init ([0] class [mscorlib]System.Collections.Generic.List`1<int32> iCal)
    IL_0000:  nop
    IL_0001:  newobj     instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()
    IL_0006:  stloc.0
    IL_0007:  ldloc.0
    IL_0008:  ldc.i4.s   10
    IL_000a:  callvirt   instance void class [mscorlib]System.Collections.Generic.List`1<int32>::Add(!0)
    IL_000f:  nop
    IL_0010:  ldloc.0
    IL_0011:  ldc.i4.s   20
    IL_0013:  callvirt   instance void class [mscorlib]System.Collections.Generic.List`1<int32>::Add(!0)
    IL_0018:  nop
    IL_0019:  ldstr      "Generic Demo::Addition is {0}"
    IL_001e:  ldloc.0
    IL_001f:  ldc.i4.0
    IL_0020:  callvirt   instance !0 class [mscorlib]System.Collections.Generic.List`1<int32>::get_Item(int32)
    IL_0025:  ldloc.0
    IL_0026:  ldc.i4.1
    IL_0027:  callvirt   instance !0 class [mscorlib]System.Collections.Generic.List`1<int32>::get_Item(int32)
    IL_002c:  add
    IL_002d:  box        [mscorlib]System.Int32
    IL_0032:  call       void [mscorlib]System.Console::WriteLine(string, object)
    IL_0037:  nop
    IL_0038:  ret
  }
}
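The listing above consumes an existing generic type. Defining one follows the same backtick convention. The following is a rough sketch of a single-parameter generic class, assuming hypothetical names (assembly genDemo, class genDemo.Holder`1 with a field Item and a method Get); inside the class, the type parameter is written as !T or by position as !0.

```cil
// Hypothetical names: assembly genDemo, class genDemo.Holder`1, class Program
.assembly extern mscorlib { }
.assembly genDemo { }

.class public auto ansi beforefieldinit genDemo.Holder`1<T>
       extends [mscorlib]System.Object
{
  .field private !T Item

  .method public hidebysig specialname rtspecialname
          instance void .ctor(!T startValue) cil managed
  {
    .maxstack 2
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor()
    ldarg.0
    ldarg.1
    stfld !0 class genDemo.Holder`1<!T>::Item   // this.Item = startValue
    ret
  }

  .method public hidebysig instance !T Get() cil managed
  {
    .maxstack 1
    ldarg.0
    ldfld !0 class genDemo.Holder`1<!T>::Item
    ret
  }
}

.class public auto ansi Program extends [mscorlib]System.Object
{
  .method public hidebysig static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    ldstr "Held value is {0}"
    ldc.i4.s 42
    newobj instance void class genDemo.Holder`1<int32>::.ctor(!0)  // closed type Holder<int>
    call instance !0 class genDemo.Holder`1<int32>::Get()
    box [mscorlib]System.Int32
    call void [mscorlib]System.Console::WriteLine(string, object)
    ret
  }
}
```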

Inheritance Metadata

Inheritance of types is a way in which the derived type guarantees support for all of the type contracts of the base class type. In addition, the derived type usually provides additional functionality or specialized behavior. In IL code, derived classes inherit the base class contracts through the extends keyword, as shown in Syntax 11.

Syntax 11: Inheritance Declarations

.namespace testNamespace
{
    // Inheritance declaration section

    .class public auto ansi beforefieldinit child_class_name extends base_class
    {
        // Members
    }
}

The code in Listing 8 demonstrates implementing inheritance. The Father class serves as a base class to Child class. Therefore, the child class can use all of the Father class functionality, as well as add new features.

Listing 8: Inheritance Declaration in IL Code

.module cilInheritance.exe

.class public auto ansi beforefieldinit cilInheritance.Father 
                                        extends [mscorlib]System.Object
{
  .method public hidebysig instance void FatherMethod() cil managed
  {
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldstr      "this property belong to Father"
    IL_0006:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_000b:  nop
    IL_000c:  ret
  } 
} 

.class public auto ansi beforefieldinit cilInheritance.Child 
                                        extends cilInheritance.Father
{
  .method public hidebysig instance void ChildMethod() cil managed
  {
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldstr      "this property belong to Child"
    IL_0006:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_000b:  nop
    IL_000c:  ret
  } 
} 
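To show the inherited member actually being used, the sketch below creates a derived object and calls a method that only the base class defines. It is self-contained and uses hypothetical names (assembly inhDemo, classes inhDemo.Father and inhDemo.Child); note that the derived constructor must call the base class constructor.

```cil
// Hypothetical names: assembly inhDemo, classes inhDemo.Father and inhDemo.Child
.assembly extern mscorlib { }
.assembly inhDemo { }

.class public auto ansi beforefieldinit inhDemo.Father
       extends [mscorlib]System.Object
{
  .method public hidebysig specialname rtspecialname
          instance void .ctor() cil managed
  {
    .maxstack 1
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor()
    ret
  }

  .method public hidebysig instance void FatherMethod() cil managed
  {
    .maxstack 1
    ldstr "FatherMethod called"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
  }
}

.class public auto ansi beforefieldinit inhDemo.Child
       extends inhDemo.Father
{
  .method public hidebysig specialname rtspecialname
          instance void .ctor() cil managed
  {
    .maxstack 1
    ldarg.0
    call instance void inhDemo.Father::.ctor()   // chain to the base constructor
    ret
  }

  .method public hidebysig static void Main() cil managed
  {
    .entrypoint
    .maxstack 1
    newobj instance void inhDemo.Child::.ctor()
    // The method is named through the class that declares it
    callvirt instance void inhDemo.Father::FatherMethod()
    ret
  }
}
```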

Polymorphism Metadata

The .NET framework implements polymorphism by letting a derived class override the base class' virtual methods. A virtual method definition is marked with the newslot attribute in the base class only, which creates a new virtual method slot for the defining class and any classes derived from it. The important point to remember is that the newslot attribute must not be specified on the overriding method in the derived class, as shown in Syntax 12.

Syntax 12: Polymorphism Declarations

// Polymorphic method declaration section (base class) virtual

.method public hidebysig newslot virtual 
         instance void method_name() cil managed     
    
// Polymorphic method declaration section (derived class) override

.method public hidebysig virtual 
         instance void method_name() cil managed     

The sample in Listing 9 applies polymorphism to the Calculation() method, which takes two numeric parameters. In the base class, an addition operation is performed, whereas in the derived class, the base class functionality is overridden and a multiplication operation is performed.

Listing 9: Polymorphism Declaration in IL Code

.module cilPolymorphism.exe

.class public auto ansi beforefieldinit cilPolymorphism.cBase extends [mscorlib]System.Object
{
  .method public hidebysig newslot virtual 
          instance void  Calculation(int32 x,int32 y) cil managed
  {
    .maxstack  2
    .locals init ([0] int32 z)
    IL_0000:  nop
    IL_0001:  ldarg.1
    IL_0002:  ldarg.2
    IL_0003:  add
    IL_0004:  stloc.0
    IL_0005:  ldstr      "Addition is {0}"
    IL_000a:  ldloc.0
    IL_000b:  box        [mscorlib]System.Int32
    IL_0010:  call       void [mscorlib]System.Console::WriteLine(string,object)
    IL_0015:  nop
    IL_0016:  ret
  } 
} 

.class public auto ansi beforefieldinit cilPolymorphism.cChild extends cilPolymorphism.cBase
{
  .method public hidebysig virtual 
          instance void Calculation(int32 x,int32 y) cil managed
  {
    .maxstack  2
    .locals init ([0] int32 z)
    IL_0000:  nop
    IL_0001:  ldarg.1
    IL_0002:  ldarg.2
    IL_0003:  mul
    IL_0004:  stloc.0
    IL_0005:  ldstr      "Multiplication is {0}"
    IL_000a:  ldloc.0
    IL_000b:  box        [mscorlib]System.Int32
    IL_0010:  call       void [mscorlib]System.Console::WriteLine(string,object)
    IL_0015:  nop
    IL_0016:  ret
  }
} 
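The effect of overriding becomes visible only when the method is invoked with callvirt through a reference of the base type. The following self-contained sketch demonstrates that dispatch; the assembly polyDemo and the classes polyDemo.Base and polyDemo.Derived are hypothetical names, not part of Listing 9.

```cil
// Hypothetical names: assembly polyDemo, classes polyDemo.Base and polyDemo.Derived
.assembly extern mscorlib { }
.assembly polyDemo { }

.class public auto ansi beforefieldinit polyDemo.Base
       extends [mscorlib]System.Object
{
  .method public hidebysig specialname rtspecialname
          instance void .ctor() cil managed
  {
    .maxstack 1
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor()
    ret
  }

  .method public hidebysig newslot virtual
          instance void Describe() cil managed
  {
    .maxstack 1
    ldstr "Base"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
  }
}

.class public auto ansi beforefieldinit polyDemo.Derived
       extends polyDemo.Base
{
  .method public hidebysig specialname rtspecialname
          instance void .ctor() cil managed
  {
    .maxstack 1
    ldarg.0
    call instance void polyDemo.Base::.ctor()
    ret
  }

  // Override: virtual without newslot reuses the base class slot
  .method public hidebysig virtual
          instance void Describe() cil managed
  {
    .maxstack 1
    ldstr "Derived"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
  }

  .method public hidebysig static void Main() cil managed
  {
    .entrypoint
    .maxstack 1
    .locals init ([0] class polyDemo.Base b)
    newobj instance void polyDemo.Derived::.ctor()
    stloc.0                  // a Derived object held as Base
    ldloc.0
    callvirt instance void polyDemo.Base::Describe()   // prints "Derived"
    ret
  }
}
```

Replacing callvirt with call here would bind to Base::Describe directly and bypass the override, which is why virtual calls are normally emitted as callvirt.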

Conclusion

This article has given an overview of IL coding and syntax. Much as we would in a higher-level language like C#, we have seen how to define the various inherent types of the CLR through IL directives and opcodes, and how to analyze the corresponding generated metadata. We have also looked more deeply at typical CLR programming constructs such as constructors, structures, and generics, as well as object-oriented programming features such as inheritance, interfaces, encapsulation, and polymorphism, all expressed in CIL.

End-User License

Use of this article and any related source code or other files is governed by the terms and conditions of The Code Project Open License.

Author Information

Ajay Yadav

Ajay Yadav is an author, Cyber Security Specialist, Subject-Matter Expert, Software Engineer, and System Programmer with more than eight years of work experience in diverse technology domains. He earned Master's and Bachelor's degrees in Computer Science, along with numerous premier professional certifications from Microsoft, EC-Council, and Red Hat. For several years, he has been researching Reverse Engineering, Secure Source Coding, Advanced Software Debugging, Vulnerability Assessment, System Programming, and Exploit Development. He is a regular contributor to various international programming journals and also assists the developer community by writing blogs, research articles, tutorials, training material, and books on sophisticated technology. His spare-time activities include tourism, movies, and meditation. He can be reached at om.ajay007@gmail.com.