Archive for the ‘Debugging’ Category

Microsoft “Roslyn” based REPL injection.

December 12th, 2011 by

Microsoft recently released their new Compiler API codename “Roslyn”. If you haven’t checked it out yet you should. Here’s the link: http://msdn.microsoft.com/en-us/roslyn/.

I wanted to get my hands a little dirty and play with the new API. I’ve been meaning to look into Managed DLL injection for a while to get code execution for process interrogation. There are times when we’re testing that we want to interrogate a process for framework level information. For whatever reasons we sometimes can’t compile the target with hooks. So it would be nice to have a way to execute code. Roslyn’s CSX files look like a great way to accomplish this so that’s what I’m trying to expose.

Currently this only works on 32 bit processes.

Let’s start by describing the architecture, as there are three things going on. The major components are the Injector, the Unmanaged Injectee and the Managed Injectee. The injector is the controller in this scenario; it’s responsible for the injection into the managed process and for communication between the components. Communication is handled via named pipes.

The injector uses a well-documented DLL injection technique via CreateRemoteThread and LoadLibrary. This loads the unmanaged DLL into the managed process. The unmanaged DLL actually handles the Managed DLL injection. I won’t go into unmanaged DLL injection as it’s a pretty well documented technique. I assume the reader understands these concepts.

From this point I assume the unmanaged DLL has been injected into the managed process.

After the unmanaged DLL is injected I need to make sure the correct version of the CLR is loaded. To accomplish this I use the CLR hosting APIs to determine the version of the CLR that is loaded by the process (provided there is one loaded). The host process must be running .NET 4.0 to support the Roslyn API. Because the early versions of the hosting APIs are deprecated, I check whether “msvcr100_clr0400.dll”, the C runtime library that ships with the .NET 4.0 CLR, is loaded. I check via GetModuleHandle. If it exists we know we are running .NET 4.0 and that the CLR is already loaded. Two birds down with a single stone.

hMod = GetModuleHandle(L"msvcr100_clr0400.dll");

Once we know the CLR is loaded and it’s 4.0 we can get an ICLRMetaHost interface pointer via:

hr = CLRCreateInstance( CLSID_CLRMetaHost,
IID_ICLRMetaHost,
(LPVOID*)&pMetaHost );

From the meta host we can get a handle to the running RunTimeHost via:

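ICLRRuntimeInfo *runTimeInfo = NULL; // the version string below is an assumption for .NET 4.0
hr = pMetaHost->GetRuntime(L"v4.0.30319", IID_ICLRRuntimeInfo, (LPVOID *)&runTimeInfo);
ICLRRuntimeHost *pClrHost = NULL;
hr = runTimeInfo->GetInterface(CLSID_CLRRuntimeHost, IID_ICLRRuntimeHost, (LPVOID *)&pClrHost);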

This will return a handle to the current RuntimeHost (or load the runtime if it isn’t running). The next call loads my Managed DLL and calls the entry method.

pClrHost->ExecuteInDefaultAppDomain(L"InjectedManagedDll_Net_4.dll", L"InjectedManagedDll_Net_4.InjectedClass", L"Test", L"TestArg" , &ret);

This loads the Managed DLL into the process. Once the Managed DLL’s Test method is called I create a managed thread.

public static int Test(string param)
{
    new Thread(new ThreadStart(ThreadFunc)).Start();
    return 666;
}

This thread then spawns a few more threads, sets up the named pipe communication channel and reports to the server that things are set up.

static void ThreadFunc()
{
    try
    {
        PipeClient.Instance.Start("CNIPipe");
    }
    catch (Exception e)
    {
        PipeClient.Instance.LogMessageToServer(e.Message);
    }
}

I then expose some simple messages back and forth between the injector and injectee and expose a simple REPL loosely based on this guy’s implementation: http://visualstudiomagazine.com/articles/2011/11/16/the-roslyn-scripting-api.aspx.

private ScriptHost()
{
    HashSet<Assembly> assemblys = new HashSet<Assembly>();
    assemblys.Add(Assembly.GetCallingAssembly());
    assemblys.Add(Assembly.GetEntryAssembly());
    assemblys.Add(Assembly.GetExecutingAssembly());

    List<string> namespaces = new List<string>() { "System", "System.Collections", "System.Collections.Generic" };

    ScriptEngine = new ScriptEngine(assemblys.ToList(), namespaces);

    Session = Session.Create(this);
}

public object Execute(string code)
{
    return ScriptEngine.Execute(code, Session);
}

This gets you a basic REPL inside another process. Next steps include making the communication API between the host and injectee better formed and handling both 32 and 64 bit processes. Stay tuned!

Microsoft CCI Framework for Deobfuscating .Net binaries. (Part 3)

February 18th, 2010 by

Renaming parts of the assembly.

So I promised this last week, but I’ve been busy on a new project. Below is some code that shows renaming, in this case of classes within namespaces. It iterates over each namespace renaming classes from Class1 -> ClassN. This is more useful for human readability and tracing logic. I leave it as an exercise to the reader to figure out how to rename other parts of the assembly. But hey, if you really need it and get stuck, let me know!

I’ll be posting a tool at some point that does all these different actions for you. Hopefully I’ll have an early release out by mid next month. I’m currently learning WPF well enough to build in visualizations of the control flow graph. That way after a mutator is applied you can visually see the results.

There is a dictionary in the mutator class that uses the namespace string as a key in order to know which class number I left off at. I test on the string length < 2 because the obfuscators I’ve seen that do this trick tend to just rename everything to some obscure Unicode code point of length 1. It’s just an easy stop condition. You can use any stop condition that suits your needs.

public override NamespaceTypeDefinition Visit(NamespaceTypeDefinition namespaceTypeDefinition)
{
  string key = namespaceTypeDefinition.ContainingUnitNamespace.Name.Value;
  if (!classDict.ContainsKey(key))
  {
     classDict.Add(key, 0);
  }
  if (namespaceTypeDefinition.Name.Value.Length < 2)
  {
     int i = classDict[key];
     namespaceTypeDefinition.Name = this.host.NameTable.GetNameFor(String.Format("Class{0}", i));
     i++;
     classDict[key] = i;
  }
  return base.Visit(namespaceTypeDefinition);
}

Microsoft CCI Framework for Deobfuscating .Net binaries. (Part 2)

February 4th, 2010 by

So yesterday I talked about using CCI to remove attributes from .Net binaries, specifically the SuppressIldasm attribute. I promised I’d put up some more code highlighting the framework’s benefits. First, some more detail on the binary I’m working with. It has been run through Babel -> Netz -> Babel again. My goals have been to reverse that: debabel -> unpack Netz -> rebuild the .exe -> debabel again. The first stage of Babel could be skipped, but why not analyze it?

Babel uses a couple of simple techniques to prevent programs like Reflector from analyzing protected binaries. These techniques are also found in other protections, so it’s useful to understand why they work and how they work. They are really very simple.

Today I’ll cover a simple but annoying technique being employed: inserting junk bytes. Babel inserts junk bytes into the IL stream of each method. When the method is opened in a disassembler such as Reflector, the disassembler fails: it does not recognize the byte sequences, so it can’t continue.

Below is an example of a method ildasm’ed after removing the “SuppressIldasm” attribute as described in the previous post.

.class public auto ansi beforefieldinit netz.NetzStarter
       extends [mscorlib]System.Object
{
  .field private static initonly string Property0
  .field private static initonly string Property1
  .field private static initonly string Property2
  .field private static class [System]System.Collections.Specialized.HybridDictionary Property3
  .field private static class [mscorlib]System.Resources.ResourceManager Property4
  .field private static class [mscorlib]System.Collections.ArrayList Property5
  .field private static bool Property6
  .method public hidebysig specialname rtspecialname
          instance void  .ctor() cil managed
  {
    // Code size       14 (0xe)
    .maxstack  8
    IL_0000:  br         IL_0007
 
    IL_0005:  unused
    IL_0006:  unused
    IL_0007:  ldarg.0
    IL_0008:  call       instance void [mscorlib]System.Object::.ctor()
    IL_000d:  ret
  } // end of method NetzStarter::.ctor

As you can see it does an unconditional branch over some “unused” bytes which are really invalid bytes. This way the logic of the program is maintained while confusing the disassembler. One technique I’ve read about to handle this is to use a hex editor to look for the branch opcode and nop out those bytes. However this is unreliable, as Babel inserts bytes not just at the start of the method.

Microsoft CCI to the rescue again!

So let’s use CCI to handle rebuilding the binary by replacing invalid bytes with nops. This way we can view this application in Reflector and navigate it. Below is the mutator class I wrote to handle NOP’ing invalid bytes. Again, a very simple solution. Now the code is visible in Reflector using the IL view. At least you get the “browsing” functionality and can easily go to functions and view their dependencies and cross-references.

public class InvalidCodeNOPReplace : MetadataMutator
{
	public InvalidCodeNOPReplace(IMetadataHost host)
	    : base(host)
	{
 
	}
 
	public override List<IOperation> Visit(List<IOperation> operations)
	{
	    operations = Utilities.ReplaceInvalidOpCodeAsNOP(operations);
 
	    return base.Visit(operations);
	}
}
 
public static List<IOperation> ReplaceInvalidOpCodeAsNOP(List<IOperation> ops)
{
    List<IOperation> newOps = new List<IOperation>();
    foreach (IOperation op in ops)
    {
 
	if (!IsValidOpCode(op.OperationCode))
	{
	    Operation o = new Operation();
	    o.Location = op.Location;
	    o.Offset = op.Offset;
	    o.OperationCode = OperationCode.Nop;
	    o.Value = 0x0;
	    newOps.Add(o);
	}
	else
	{
	    newOps.Add(op);
	}
    }
    return newOps;
}
 
private static void populateOpCodeDic(){
   OpCodes = new Dictionary<OperationCode,int>();
   foreach(int i in Enum.GetValues(typeof(OperationCode)))
   {
     OpCodes[(OperationCode)i] = i;
   }
}
public static bool IsValidOpCode(OperationCode opCode)
{
       if (OpCodes == null)
       {
            populateOpCodeDic();
       }
       return OpCodes.ContainsKey(opCode);
}
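
To see the same idea outside CCI, here is a byte-level sketch, written in C++ rather than C# so that it stands alone. It is only an illustration under stated assumptions: the toy table knows just the five opcodes that appear in the .ctor listing above (nop, ldarg.0, call, ret, br) and treats every other byte as junk, 0x24 is used as a stand-in for the two "unused" bytes, the call token bytes are placeholders, and the function names are hypothetical. None of this is CCI's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy table: operand size in bytes for the few single-byte IL opcodes used in
// the .ctor listing. Returns -1 for anything else (treated here as junk).
// A real decoder needs the full ECMA-335 opcode table, including the
// two-byte 0xFE-prefixed opcodes.
int operandSize(std::uint8_t opcode) {
    switch (opcode) {
        case 0x00: return 0;   // nop
        case 0x02: return 0;   // ldarg.0
        case 0x2A: return 0;   // ret
        case 0x28: return 4;   // call <method token>
        case 0x38: return 4;   // br <int32 relative offset>
        default:   return -1;  // not in the toy table
    }
}

// Linear sweep over a method body: keep known instructions and overwrite each
// unrecognised byte with nop (0x00). The length never changes, so existing
// branch offsets stay valid.
std::vector<std::uint8_t> nopInvalidBytes(std::vector<std::uint8_t> il) {
    std::size_t i = 0;
    while (i < il.size()) {
        int operands = operandSize(il[i]);
        if (operands < 0) {
            il[i] = 0x00;  // junk byte -> nop
            i += 1;
        } else {
            i += 1 + static_cast<std::size_t>(operands);  // skip opcode and operand
        }
    }
    return il;
}

// The 14-byte .ctor body from the listing, with 0x24 standing in for the two
// "unused" bytes (assumption) and placeholder bytes for the call token.
std::vector<std::uint8_t> sampleCtorBody() {
    return {0x38, 0x02, 0x00, 0x00, 0x00,  // IL_0000: br IL_0007
            0x24, 0x24,                    // IL_0005, IL_0006: junk
            0x02,                          // IL_0007: ldarg.0
            0x28, 0x01, 0x00, 0x00, 0x0A,  // IL_0008: call <token>
            0x2A};                         // IL_000d: ret
}
```

Calling nopInvalidBytes(sampleCtorBody()) leaves the branch, ldarg.0, call and ret untouched and turns only the two junk bytes into nops. Because the sweep walks every instruction, it also catches junk that is not at the start of a method, which is where the hex editor approach falls short.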

Unfortunately reconstructing the C# source doesn’t work at this stage due to the nops and invalid branching structure. However, I’m trying to work out a middle layer which can take a method body’s operations list, abstract it out, turn it into a control flow graph, optimize it and rewrite it. I’m still stuck at the rewriting part: I hit a small snag in the logic that I haven’t had time to work out just yet. Hopefully then the C# can be reconstructed.

Tomorrow I’ll post some simple methods to get readable names out of the method/properties/class names to make following logic easier.

*Edit forgot to add the IsValidOpCode method.

**Edit had to readd disappearing generic types.. Ugh!

Microsoft CCI Framework for Deobfuscating .Net binaries.

February 3rd, 2010 by

We had an issue recently crop up with an obfuscated .Net binary. I’ve been meaning to spend more time reversing .Net protected binaries so I started looking into it. Unfortunately everything I was reading on the forums and internet seemed difficult. Having recently read a little about Microsoft’s CCI framework, I thought this might be the best solution to the problem. Using a hex editor and looking for patterns seems hokey and a bit impractical.

So the first thing I decided to try was removing the SuppressIldasmAttribute attribute. Below is some example code doing just that using CCI and rewriting the file. This produces an executable that still works, whereas simply hex editing out the attribute leaves an executable that doesn’t run.

static void Main(string[] args)
{
     var host = new PeReader.DefaultHost();
     var module = host.LoadUnitFrom(args[0]) as IModule;
     var attributeRemover = new AttributeRemover(host);
     module = attributeRemover.Visit(module);
     Stream peStream = File.Create(module.Location + ".fixed");
     PeWriter.WritePeToStream(module, host, peStream);
     peStream.Close(); // flush the rewritten file to disk
     Console.Out.WriteLine("Finished");
}
 
/*
* Removes the static attribute atm SuppressIldasmAttribute.. can be modified to remove any attribute.
*/
 
public class AttributeRemover : MetadataMutator
{
 
     PlatformType pt;
 
     public AttributeRemover(IMetadataHost host)
                              : base(host)
     {
         pt = new PlatformType(host);
     }
 
     public override List<ICustomAttribute> Visit(List<ICustomAttribute> customAttributes)
     {
          for (int i = 0; i < customAttributes.Count; i++  )
          {
               if (customAttributes[i].Type.ToString() == "System.Runtime.CompilerServices.SuppressIldasmAttribute")
               {
                    customAttributes.RemoveAt(i);
                    break;
               }
          }
          return base.Visit(customAttributes);
     }
}

As you can see it requires very little code. Anyway, that’s enough for this post. I also have some more code I’ll be posting that uses CCI to rename methods/classes/properties from their “mangled names” and code that removes invalid OpCodes so Reflector works at the IL level. I’m still working on code that goes through and creates optimized methods with the invalid jumps removed, such that C# code can hopefully be reconstructed. We’ll see how that goes.

Some good .Net debugging info

February 12th, 2007 by

Visual Studio 2005/2008 debugging with sos.dll
The blog seems to have gone cold, so copying here for good luck.
http://blogs.msdn.com/vancem/archive/2006/09/05/742062.aspx

Vance Morrison's Weblog

Vance Morrison is currently an Architect on the .NET Runtime Team, specializing in performance issues with the runtime or managed code in general.

Digging deeper into managed code with Visual Studio: Using SOS

I have let my blog lapse for too long. I am back to blogging. I realized recently that we have simply not written down many interesting facts about how the runtime actually works. I want to fix this. Coming up in future blogs I am going to be doing a bit of an 'architectural overview' which describes the differences between managed and unmanaged code, but before I do that I realized that I have not even finished a blog entry I started in March.

In my blog How to use Visual Studio to investigate code generation questions in managed code, I talk about how to configure Visual Studio so that you can actually look at optimized code in the debugger (which sadly is not as trivial as you would like), and show how to look at the disassembly of managed code. Unfortunately managed code is hard to read without a guide, and so in this blog I will show you some very useful tips for reading managed assembly code.

In this blog entry I will show you the instructions that ACTUALLY need to get executed to do something as simple as assigning a string to a field of a class. Note that I am assuming a familiarity with X86 assembly code. If you are the type who never wants to read assembly code, you should stop reading now, because most of this blog is a step-by-step explanation of it.

I have attached the file InspectingManagedCode.zip, which contains a (trivial) project that I used for this example. You are STRONGLY encouraged to open it (you can browse it; the main file is Program.cs). Copy the files (simply drag the ‘InspectingManagedCode’ directory inside the ZIP to a directory of your choosing), launch the InspectingManagedCode.sln file and run the example. While the project is already set to build and run optimized code, you will still need to turn off ‘just my code’ and turn on JIT optimization as described in my previous blog to follow along.

The code in the attached example is pretty trivial.
class Program
{
    string myString;
    private Program()
    {
        myString = "foo";
    }
    static void Main(string[] args)
    {
        Program p = new Program();
    }
}

If you were to follow the instructions in the previous blog to see what code was generated for the body of ‘Main’ you would find the following code.

00000000  push        esi
00000001  mov         ecx,9181F4h
00000006  call        FFCB1264
0000000b  mov         esi,eax
0000000d  mov         eax,dword ptr ds:[0227307Ch]
00000013  lea         edx,[esi+4]
00000016  call        79222B78
0000001b  pop         esi
0000001c  ret

At first glance this code has little similarity to the source code: the original source has a call to the constructor ‘Program’ and the assembly code has two calls to strange hex addresses. There are also references to magic numbers like 9181F4H and 0227307CH. In this case the disassembly has not proven to be very valuable. What can we do?

Sadly if we try to peer into these CALL instructions we cannot; the debugger comes back with the very unhelpful message ‘There is no code at the specified location’. Actually Visual Studio is LYING to you. There really is code there, but it simply will not show it to you. I will show you techniques to get around this.

The key to unlocking the mysteries of managed code is a debug helper called SOS.DLL (it is a DLL that is shipped with the runtime). The DLL is what is called a ‘debugger extension’. Basically it implements functions that are useful for debugging the code associated with it (in this case the runtime). Other bloggers have also commented on the use of this DLL (do a web search for SOS.DLL for more).

In Visual Studio, you load SOS.DLL by opening the immediate window (Ctrl-D I) and typing

.load SOS.dll

If you do this you may get the message

SOS not available while Managed only debugging. 
To load SOS, enable unmanaged debugging in your project properties.

This message is actually reasonably helpful. By stopping the debugger (Shift F5), going to Solution Explorer (right hand pane), right clicking on the InspectingManagedCode project file, and selecting Properties, you will get the properties pane for the project. If you select the ‘Debug’ tab on the left side you will find 3 check boxes at the bottom, one of which is labeled ‘Enable unmanaged code debugging’. If you check this, you put the debugger into a mode where it can debug both managed and unmanaged code (which means you can then use SOS.DLL). I have already done this on the InspectingManagedCode project, but you will have to repeat this any time you need to use SOS. (Sadly the instructions for setting the debugger mode are different for C++.) Note that running the debugger to debug both managed and unmanaged code will slow the debugger down a bit (it loads the symbols for all the unmanaged DLLs), so you probably only want to do this on projects like this one where you want to use SOS.DLL.

Now you should be able to set a breakpoint in Main(), run the program (F5), go to the immediate window (Ctrl-D I) and type

.load SOS.dll

And get the message

extension C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos.dll loaded.

If you are curious, SOS.DLL has reasonably good help. If you type the command

!Help

It will give you a list of commands, and you can get help on individual commands by specifying the name, e.g.

!Help u

It will give you help on the ‘u’ (unassemble) command. All SOS commands need to be prefixed by a ! character so that the Visual Studio Debugger knows that it is an SOS command and not an immediate value to be interpreted (the normal meaning of text typed in the immediate window).

The unassemble SOS command is the command we are interested in. It will disassemble a managed routine, but do a much better job than Visual Studio presently does. Unfortunately, we need the address of the routine we want to disassemble, and Visual Studio goes to some length to hide this information. If you look at the disassembly for the code (CTRL-ALT-D), you will see that the address of the routine is never given, only the offset from the beginning of the method.

The way around this is to use the ‘Registers window’ (Ctrl-D R). I happen to like to put this window just above the immediate window and shrink it so that only the two lines that actually show values are showing. One of the registers is ‘EIP’, which stands for ‘Extended Instruction Pointer’. It is the address of the current instruction. In my particular invocation EIP has the value of 00DE0071, so I can do the command

!u 00DE0071

Which will disassemble the ENTIRE routine that the address 00DE0071 lives in. I like to right click in the immediate window and select ‘Clear All’ before I do this so the only thing in that window is the disassembly. On my machine I get the result

Normal JIT generated code
Program.Main(System.String[])
Begin 00de0070, size 1d
00DE0070 56              push        esi
>>> 00DE0071 B904309100  mov         ecx,913004h
00DE0076 E8A11FB2FF      call        0090201C (JitHelp: CORINFO_HELP_NEWSFAST)
00DE007B 8BF0            mov         esi,eax
00DE007D 8B053C302B02    mov         eax,dword ptr ds:[022B303Ch]
00DE0083 8D5604          lea         edx,[esi+4]
00DE0086 E8A5380979      call        79E73930
00DE008B 5E              pop         esi
00DE008C C3              ret

It is not unlike the version that Visual Studio produced, but there are differences:

1. You will note that the ‘call’ instruction is annotated with ‘JitHelp: CORINFO_HELP_NEWSFAST’, which makes it at least a bit clearer that this helper is used to create a New object (and is the fast version; we have many variations).

2. It printed the whole routine that 00DE0071 lives in and prints a >>> on the instruction corresponding to the 00DE0071 address.

3. While it did not print the name for the ‘call 79E73930’, notice that the HEX value is different than the value in the Visual Studio disassembly (79222B78). The value in the VS disassembly is simply WRONG (it is a bug no one bothered to fix).

So let’s take a look at the first two instructions.

00DE0071 B904309100      mov         ecx,913004h
00DE0076 E8A11FB2FF      call        0090201C (JitHelp: CORINFO_HELP_NEWSFAST)

I mentioned that this helper call creates a new object from the GC heap. To do so it needs to know the type of the object to be created. This is what the magic number 913004 does. Internally in the runtime, types are described by a structure called a MethodTable, and 913004 is the address of the MethodTable of the type to create. We can find out what type 913004 corresponds to by using the !DumpMT (dump Method Table) SOS command.

!DumpMT 913004h

Produces the output

EClass: 00911254
Module: 00912c14
Name: Program
mdToken: 02000002  (C:\Documents and Settings\vancem\My Documents\Visual Studio 2005\Projects\InspectingManagedCode\bin\Release\InspectingManagedCode.exe)
BaseSize: 0xc
ComponentSize: 0x0
Number of IFaces in IFaceMap: 0
Slots in VTable: 6

The only output of this that is interesting at this point is the ‘Name’ field, which as you can see, indicates that 913004 corresponds to the ‘Program’ type. Thus these first two instructions create a Program object. This Program object comes back from the helper with all its fields zeroed, so the next instructions in the program are the body of the constructor (the Program() constructor has been inlined into the body of Main()).

The next instructions

00DE007B 8BF0            mov         esi,eax
00DE007D 8B053C302B02    mov         eax,dword ptr ds:[022B303Ch]
00DE0083 8D5604          lea         edx,[esi+4]
00DE0086 E8A5380979      call        79E73930

Basically implement the statement ‘myString = “foo”’. The helper returns a pointer to the uninitialized object in the EAX register. The mov saves this into the ESI register. EAX is then loaded with what is at the address 022B303Ch. This happens to be the string “foo” (more on how it got there in a later blog). You can confirm this by going to the disassembly code, setting a breakpoint right after the mov eax,dword ptr ds:[022B303Ch] instruction and looking at the value of the EAX register in the ‘registers’ window. In my example it happens to be the value 012B1D44. You can then use the command

!DumpObj 012B1D44

Which will dump the managed object at this address. This will print

Name: System.String
MethodTable: 790fa3e0
EEClass: 790fa340
Size: 24(0x18) bytes
 (C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: foo
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
790fed1c  4000096        4         System.Int32  0 instance        4 m_arrayLength
790fed1c  4000097        8         System.Int32  0 instance        3 m_stringLength
790fbefc  4000098        c          System.Char  0 instance       66 m_firstChar
790fa3e0  4000099       10        System.String  0   shared   static Empty
    >> Domain:Value  0014c550:790d6584 <<
79124670  400009a       14        System.Char[]  0   shared   static WhitespaceChars
    >> Domain:Value  0014c550:012b186c <<

Again, most of the output is uninteresting at this point, except the Name field (which says it’s a string), and the ‘String’ field (which shows the string value is ‘foo’). So we have confirmed that this instruction loads up the address of the ‘foo’ string into the EAX register. What is left is

00DE0083 8D5604          lea         edx,[esi+4]
00DE0086 E8A5380979      call        79E73930

The first instruction ‘LEA’ may not be familiar to you. It is Load Effective Address (LEA). Basically it works just like a MOV instruction, but instead of moving what was AT the memory specified, it loads the ADDRESS of the memory. Another way of looking at this is to imagine a MOV instruction with the [] dropped (which represent memory fetching). Thus

00DE0083 8D5604          lea         edx,[esi+4]

Can be thought of as

00DE0083 8D5604          mov         edx, esi+4

That is, it adds 4 to ESI and places the result in EDX. Now remember ESI points at our newly created ‘Program’ object. We could find out all the fields of this object by dumping it. In my debugger ESI has the value of 012B1D5C so I can do

!DumpObj 012B1D5C

And get

Name: Program
MethodTable: 00913004
EEClass: 00911254
Size: 12(0xc) bytes
 (C:\Documents and Settings\vancem\My Documents\Visual Studio 2005\Projects\InspectingManagedCode\bin\Release\InspectingManagedCode.exe)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
790fa3e0  4000001        4        System.String  0 instance 00000000 myString

Which tells us that ESI points at a ‘Program’ object, that the total size of the object is 12 (more on that in a later blog), and that at offset 4 there is a field called ‘myString’ of type System.String that currently has the value of 0 (null).

So now we can make a pretty good guess that the LEA instruction is setting EDX to the address of the ‘myString’ field of the Program object. EAX has been set to the “foo” string, and next comes the mysterious

00DE0086 E8A5380979      call        79E73930

Ideally SOS would have annotated this helper. It is what we call a ‘WriteBarrier’. More on exactly what a write barrier is later, but for now the important thing to know is that ALL updates to OBJECT REFERENCES that live in the GC heap need to be done by calling a write barrier helper. Since the Program object lives in the heap, and we are updating an object reference pointer inside it, we need to use the write barrier.

The runtime actually has many write barriers. All the write barriers have an unusual calling convention. They all take the address to be updated in the EDX register. Then depending on the write barrier, they take the value to update in some other register (this particular write barrier is the most commonly used, and takes its argument in the EAX register). Logically all the write barrier does is (*EDX = EAX) (that is, update what EDX points at to be the value in EAX).
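
In C++ terms (a loose analogy, not what the JIT emits), the lea/call pair behaves like taking the address of a field and then storing through it. The sketch below is illustrative only: the struct layout and the names ProgramObject, writeBarrier and assignFoo are assumptions, and the real helper does additional work for the garbage collector that is omitted here.

```cpp
// Hypothetical stand-in for the managed Program object: a pointer-sized slot
// for the MethodTable pointer followed by the myString field. On 32-bit x86
// that puts myString at offset 4, matching [esi+4] in the listing.
struct ProgramObject {
    const void* methodTable;
    const char* myString;
};

// Logical effect of the write barrier helper: *EDX = EAX.
void writeBarrier(const char** destination /* EDX */, const char* value /* EAX */) {
    *destination = value;
}

// Mirrors the inlined constructor body: myString = "foo".
void assignFoo(ProgramObject* self /* ESI */) {
    const char* foo = "foo";               // mov eax,dword ptr ds:[...]  (load the value)
    const char** field = &self->myString;  // lea edx,[esi+4]             (compute the address only)
    writeBarrier(field, foo);              // call 79E73930               (store through the address)
}
```

The point of the analogy is the difference between the two middle lines: the mov reads memory to get the string reference, while the lea never touches memory and only produces the address of the field that the helper then writes to.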

That is about it for this example. The only instructions we did not cover are the PUSH ESI and POP ESI at the beginning and end of the routine. As anyone who deals with assembly code knows, this is simply saving and restoring ESI since we used it in the routine itself.

To recap, here are the instructions that actually got executed in the ‘Main’ program and what they do.

push        esi                               // save ESI
mov         ecx,913004h                       // ECX = MethodTable(Program)
call        0090201C                          // EAX = New Object (Program)
mov         esi,eax                           // ESI = this (new object)
mov         eax,dword ptr ds:[022B303Ch]      // EAX = “foo”
lea         edx,[esi+4]                       // EDX = &this.myString
call        79E73930                          // this.myString = EAX (“foo”)
pop         esi                               // restore ESI
ret                                           // return.

 

We just understood very deeply EXACTLY what happens when a particular piece of managed code executes. Hopefully that wasn’t so bad. Next time we will dig a bit into what this WriteBarrier is and exactly what it does (how expensive is it?). We will also dig into exactly what went on inside the ‘New’ helper. In later blogs I will go into how exactly other runtime features get converted to native code.

I hope you are enjoying this peek under the hood of the .NET Runtime.

 

Published Tuesday, September 05, 2006 7:55 PM by vancem

Attachment(s): InspectingManagedCode.zip

Comments

 

barrkel said:

Great info! Thanks.

BTW, when using windbg + sos to debug, what breakpoint (native: bp / bu) is best to set in order to use managed breakpoints (thus both !name2ee and !bpmd probably needed)? With a breakpoint on loading of mscorwks or calling of various CLR functions, when is the CLR booted up enough so that !name2ee etc. can work?

September 6, 2006 4:14 AM

 

vancem said:

The subject of using SOS in windbg will be the subject of a future blog, however, I can quickly answer your question.

The !bpmd (Breakpoint MethodDescriptor) is a command that will set a breakpoint on a managed method by name. For example, in the example the command

   !bpmd  InspectingManagedCode.exe Program.Main

Will set a breakpoint in the ‘Main’ routine of the example program in the ZIP file. Note that UNLIKE the !name2ee SOS command (which looks up a method or class by name), the method being referenced in the !bpmd command does NOT need to be loaded to work (it sets a ‘deferred’ breakpoint).

 

However to use ANY SOS command, you need to load SOS, and it turns out that SOS needs the .NET runtime DLL ‘mscorwks.dll’ to be loaded before it can load. There are a variety of techniques you can use. The one I use is

   bu mscorwks!EEStartup

This sets a breakpoint at the ‘EEStartup’ method in the .NET runtime DLL ‘mscorwks’. When this breakpoint hits you can do the command

   .loadby sos mscorwks

Which tells windbg to load sos.dll by searching in the directory where mscorwks lives. Once loaded you can execute a !bpmd command.

Finally if you need !name2ee to work and the module is not yet loaded, you should set a breakpoint (using the !bpmd command) in the module of interest, run to that breakpoint (now it is loaded), and then do the !name2ee command.

September 6, 2006 12:46 PM