# Sunday, March 05, 2006

Security lesson no.5: DLL Injection

Yesterday we saw a brilliant :) solution to a problem, using some techniques I always loved. Unfortunately in these days of troubles techniques like DLL injection and IAT patching are used mostly by malware than by useful and great software. So it is important for the software developer that cares about security to know how they work and what can be done to prevent them.

DLL Injection is the topic of this lesson, but we will also see what it is possible to do once our malicious DLL is inserted into another process address space: window subclassing, Virtual Memory walking (in search of private data like passwords, for example) and IAT overwrite.

Have fun! (and behave responsibly, as usual...)

DLLInjection.ppt (319.5 KB)

ex6-DLLinjection.zip (139.39 KB)
ex7-VMWalk.zip (338.91 KB)
ex8-IATOverwrite.zip (224.36 KB)
# Saturday, March 04, 2006

Intercepting Windows APIs

As I described in a previous entry, one of the few games I really enjoyed playing was Enemy Territory. It is a free, multi-player FPS based on the Quake 3 engine. It is class based: you chose a class and that dictates the ability of your soldier (and what he can do). I played with my fellow university mates: some of them created a clan (they even did one or two official tournaments) and they wanted to train (I was not particularly good.. I received the "easy frag" attribute!). Besides, it was good to relax an hour after lunch, before attending other lessons.

However, we had an hard time playing it... The admin won't let us use the computer lab for non didactic purposed. It is silly, if you ask me, especially it was not explicitly forbidden by college rules: for example, students and professors alike are allowed to use empty classrooms to play card games. So why can't we use an empty lab to play a free game? Since the labs were not under CCTV surveillance, we took the risk and played nonetheless (we were young.. :) ). But one day, an email from the admin warned me to not use that particular game anymore. How did they know? Simple: someone was checking all the files on the public directories (were the game was installed), which user owned them (using ACLs) and what kind of files they where.

Me and a friend of mine started to think about the problem. Initially we thought about manipulating the ACL to change the ownership (maybe to the Administrator.. it would be ironic!) of the game files, but it was impractical and it required privileges above those allowed to students, and we didn't want to do anything illegal (like an escalation of privileges). Our solution was simple: hide your programs, not only your data.

Once upon a time, programs consisted of a single .exe (or .com) file. Nowadays instead, an average application has thousands of files and DLLs in its installation directory. Think at Office, or at a game like Quake3. We wanted to execute a complete program out of a sigle data packed file, possibly compressed or encrypted. I'll discuss our ideas and the techniques we used, namely DLL injecting and API intercept and forwarding. We began to discuss seriously on the topic. Our first idea was to provide a DLL that was a proxy / interceptor for the msvcrt.dll, the C runtime of MS C++ compiler. This DLL contains the implementation of the C file handling function, such as fopen, fread, fseek. We can make a DLL with the same name, put it in the app directory (which come firts in the loader search path), export all the function of the original msvcrt.dll implementing file handling function and passing other function to the original DLL. Phew, a lot of work...msvcrt.dll exports 780 functions! We can already sense the calluses on our fingers! Furthermore, the C runtime can be statically linked to the exe, or the program could directly call Win32 API functions.

But wait, even fopen, fread, fseek and friends call Win32 API functions! So, plan B: intercept kernel32 functions! Despite her name, kernel32 is not a kernel module: is a simple user mode DLL that provides a nice API for the real kernel calls. So it can be intercepted... Calling the application we want to execute ot of the compound file victim, all we have to do is:

  1. Place some code in the victim process address space.
  2. Execute this code in order to:
    1. locate the IAT (Import Address Table) of the exe
    2. patch pointers in the IAT to point to OUR functions
  3. For now on, all calls to the patched functions will cause a jump not to the original kernel32 code, but to our functions.
The advantages of this appoach? It's more economic (we have to write only the functions we need), it works with (almost) every app (even with non C apps) and it's funny to code!

DLL injection

We need to place code and execute it in the address space of another process. This at first can seem impossible: every Win32 process has its Virtual Address Space and pointers range over this space, so it's impossible to access another process space [1][2]

The virtual address space: the lower 2GB are the user-mode space, and they are private for each process (see [1][2] for details)

Well, not really: how debuggers can work? With the help of the OS of course! We'll ask for help to the OS too. Our goal is to load a DLL in the victim address space: when a DLL is loaded, function \emph{DllMain} in the DLL is called, with dwReason equal to PROCESS_ATTACH. There are several methods to load a DLL in a process [3]:

  1. Windows HOOKS (the most ancient one). A Hook is a callback function called by windows every time a particular event occours: the most interesting one, when a top window is created or destroyed. We can then see if the application is interesting, and what to do with it. The nice thing is that the DLL that contains hook code is loaded into the other application address space.
  2. The registry. Somewhere in the registry, you can specify a key in which you place DLLs that have to be loaded in every process address space (\HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\AppInit_DLLs) . This is how mouse or video card DLLs end up in your address space. Drawback: you must have Admin rights to write to the registry, and you DLL is loaded in a lot of non-interesting processes. What a waste.
  3. Two magic Win32 functions: CreateRemoteThread and WriteProcessMemory [4].
Ritcher in [4] explains the magic very well. To summarize:
  1. Obtain the HPROCESS of the victim (via CreateProcess or via its pid).
  2. Reserve some space in the Virtual Address space of the victim with VirtualAllocEx.
  3. Use WriteProcessMemory to write the name of the DLL to load in the memory just reserved.
  4. Use CreateRemoteThread to load the library.

The virtual address space: the lower 2GB of the user-mode space, with the kernel32.dll loaded at the same address.

At a first time, we believed we needed to write shell code to execute LoadLibrary, and this is bad for 3 reasons:
(a) is difficult to write,
(b) with the new XP-SP2 NX (non execute) page protect flag we could have troubles.
Fortunately, we realized a fact: DLL are loaded in every process address space, so are private to each process. However, when you create a DLL you specify a "preferred load address'" at link time, and the OS loader will load the DLL at that address if it's free. This is due the fact that otherwise the loader must relocate the DLL, and this is a time consuming operation. This is particularly true for system DLLs, which are loaded always at the same address in every process. So, if we do a GetProcAddress against LoadLibrary in our process, we obtain the same address as in the victim process.

Scheme of the steps that lead zdll.dll to be loaded in victim's address space

We can pass to CreateRemoteThread the address of LoadLibrary as startup routine, and the name we wrote in victim address space as parameters as in figure.

IAT patching

Now we have our own code running in a thread in victim's address space. What can we do now? Everything. In particular, we can have access to PE data directories in our "host", the victim. The executables in Win32 (DLLs, exe, and even device drivers) follow a format called PE (Portable Executable). Every PE is divided in sections: export, import, resources, debug data, delayload, bound modules...[5][6].

The section we are interested in is the import section, with its IMAGE_IMPORT_DESCRIPTOR structure.

The import section, with its two parallel arrays of function pointers

The import section after the loader has done its work. The IAT now points to function entries in kernel32.dll

There's one IMAGE_IMPORT_DESCRIPTOR for each imported PE (executable or, a most common case, DLL). Each IMAGE_IMPORT_DESCRIPTOR points to two essentially identical arrays. The first one is the Import Address Table (IAT). The second one is called the Import Name Table (INT) and is used by the loader as a backup copy in case the IAT is overwritten in the binding process. Binding is an operation done to PE files before the link step, but this goes beyond the scope of this article. Matt Pietrek in [5] covers all the details. The IMAGE_THUNK_DATA structures in the IAT has two roles:

  • In the executable file, they contain either the ordinal of the imported API or an RVA (Relative Virtual Address, an offset from the base address at which the PE is loaded) to an IMAGE_IMPORT_BY_NAME structure. The functions we need to patch in DLLs are those with a name, so we look at those entries that contain an RVA. The IMAGE_IMPORT_BY_NAME structure is just a WORD, followed by a string naming the imported API.
  • When the loader starts the executable, it overwrites each IAT entry with the actual virtual address of the imported function 


The import section after zdll's DllMain has done its work. The IAT now points to function entries in zdll.dll

So we need to replace the addresses placed in the IAT by the loader with the addresses of our functions. Here the INT becomes important: how do we know which entry in the IAT we need to overwrite for, as an example, CreateFileA? We need to iterate through the entries of the IAT and INT together. The INT provides the name for the n-th entry, the IAT its VA. We simply overwrite the entry in the IAT with our own.


   while (1) // Loop forever (or until we break out)
      if ( pINT->u1.AddressOfData == 0 )

        ULONGLONG ordinal = -1;

        if ( IMAGE_SNAP_BY_ORDINAL32(pINT->u1.Ordinal) )
           ordinal = IMAGE_ORDINAL32(pINT->u1.Ordinal);           
        if ( ordinal != -1 )
           // We don't consider un-named functions
         ordinalName = (PIMAGE_IMPORT_BY_NAME)getPtrFromRVA((DWORD)(pINT->u1.AddressOfData));         
         const char* funcName = (const char*)ordinalName->Name;
         PDWORD oldFuncPointer (PDWORD)&(pIAT->u1.Function);
         if (funcName == "CreateFileA")
            pIAT->u1.Function = MyCreateFile;
      pINT++;         // Advance to next thunk
          pIAT++;         // Advance to next thunk

Compound file

So, at this point the only thing to do was to provide our own implementation of functions like CreateFile, WriteFile, SetFilePointer, FindFirstFile... and patch the IAT for kernel32 with them. But how can we  implement a file system in a single file? After some searching, I suggested that maybe Structured Storage, the way Microsoft calls its compound files, could be used: Word and Powerpoint uses them, for example.
It was only a suggestion, but the day after my mate came with an almost complete implementation based con Structured Storage functions and COM interfaces. Amazing! The last things to do were an application for building a compound file, and some cryptography to hide the content of the file. After all, this was the original goal :)

The final product worked. It was great! A piece of software complex as a video game was able to run with our own file APIs. We never used it (it was a bit too slow on startup, and we found a much simpler solution: network our notebooks), but it was fun, and I used the intercepting library we created for more interesting stuff!

[1] Jeffrey Richter. Load your 32-bit dll into another process’s address space using injlib. Microsof System Journal, May 1994.

[2] Jeffrey Richter. Advanced Windows Programming, 3rd edition. Microsoft Press, 1997.

[3] Mark Russinovich. Inside memory management, part 1. Windows and .NET Magazine, August 1998.

[4] Mark Russinovich. Inside memory management, part 2. Windows and .NET Magazine, September 1998.

[5] Matt Pietrek. Inside windows: An in-depth look into the win32 portable executable file format. MSDN Magazine, Feb. 2002.

[6] Matt Pietrek. Inside windows: An in-depth look into the win32 portable executable file format, part 2. MSDN Magazine, March 2002.

[7] Microsoft corp. Platform SDK: Structured storage. MSDN Library, April 2004.

# Thursday, March 02, 2006

Security lesson no.4: Integer overflow

To finish the cycle of lessons on overflow-based attacks, I couldn't miss a mention to integer arithmetic overflow. Integer arithmetic overflow is unharmful on its own, but can be combined with another type of attack, typically a buffer overflow. Consider the following code from a previous lesson:

int ConcatString(char *buf1, char *buf2,
    size_t len1, size_t len2)
   char buf[256];
   if((len1 + len2) > 256)
      return -1;
   memcpy(buf, buf1, len1);
   memcpy(buf + len1, buf2, len2);
   return 0;

it seems to avoid the buffer overflow problem with a simple check. However, this function is unsecure. Why? Discover it in my slides!

IntOverflow.ppt (128.5 KB)
# Wednesday, March 01, 2006

Security lesson no.3: Pointer Subterfuge

The last buffer overflow technique I treated in my lessons was Pointer Subterfuge. With this technique you try to clobber a function pointer, and make it point to a memory location containing your own code.

My students instantly objected: there is almost no function pointer in our code!
No? What about C++ objects? COM components? Kernel functions exposed as APIs?
A common way to intercept kernel-mode APIs is to patch the kernel’s system service table, a table made of function pointers!

Are you interested? Go ahead and read the powerpoint slides and the sample source!

PointerSubterfuge.ppt (255.5 KB)
ex9-vptrSmash.zip (38.35 KB)
# Monday, February 27, 2006

Security lesson no.2: Heap smashing

In a previous post I talked about my Software Attacks lessons for the Computer Security course at the University of Trento, where I was assistant professor.

Now is time for another lesson: it is again on buffer overflow, but using a more complex attack called Heap Smashing.
Have fun with my powerpoint slides and my sample code.


In short, use at your own risk.. :). The code was written and compiled using Microsoft VC++ 6.0 under Windows 2000. As I illustrated in the slides, the enhacement ini Windows XP SP2 should make this kind of technuque uneffective.

HeapOverflow.ppt (264 KB)
ex5-HeapSmash.zip (36.3 KB)
# Sunday, February 05, 2006

Synthesized proxies

Last time we saw a first solution to the problem of adding synchronization contracts to an arbitrary .NET class. The solution based on Context Attributes and Transparent Proxy had some limitations (performance, no access to class members), so I designed a tool based completely on SRE (System.Reflection.Emit) and System.Reflection. It can be thought of as a post-compiler that analyzes an assembly, custom attributes placed on classes and function, and write a small assembly that contains a proxy to the original one.

The proxy is written using metadata information of the real target object. We use facilities provided by System.Reflection in order to read the interface methods, the custom attributes and the fields and properties of the target object, and System.Reflection.Emit to emit code for the guarded methods and forwarders for the other public functions. The methods in the proxy validate the contract and update the state for subsequent validations, all in a automatic way.

The process to create a guarded component is the following:

  • write the class as usual, then add the synchronization contract writing guards for each method using the Guard attribute and Past LTL logic;
  • add a reference to the Attribute.dll assembly -the assembly containing the class implementing the Guard attribute- and compile your component. Note that the attribute can be used with every .NET language that supports custom attributes, like C#, Visual Basic, MC++, Delphi, and so on;
  • the .NET compiler will store the Guard attribute and its fields and values with the metadata associated to the method.
public class Test
bool g;
bool f = true;
[Guard("H (g or f)")] //using constructor
public string m(int i, out int j)
j = i;
return (i + 2).ToString();
[Guard(Formula = "H (g or f)")] //using public field
public string m(int i, out int j)
j = i;
return (i + 2).ToString();

The component is then passed through the post-compiler, that will post-process the assembly performing the following steps:

  • it walks the metadata of the given assembly, finding all the classes;
  • for each class it walks its methods, searching for the Guard attribute. If the attribute is found, the class is marked as forwarded;
  • it generates a new empty assembly;
  • for each forwarded class, it generates a class with the same public interface as the original one:
    • public, non-guarded methods are wrapped by the IL code necessary to perform the synchronization (acquiring the object-wide lock before forwarding the call to the original method and releasing it just after the method return)
    • public and private guarded methods are wrapped by the IL code necessary to perform the synchronization and conditional access, like shown here:
      .method public instance void  m() cil managed
        .maxstack  5
        IL_0000:  ldarg.0
        IL_0001:  call       void [mscorlib]Monitor::Enter(object)
          IL_0006:  br.s       IL_0008
          IL_0008:  ldarg.0
          IL_0009:  call       bool [mscorlib]Monitor::Wait(object)
          IL_000e:  pop
          IL_000f:  ldfld      bool Test::pre1
          IL_0010:  brfalse    IL_0008
          IL_0012:  ldc.i4.0   //"false"
          IL_0013:  ldfld      bool Test::a
          IL_0015:  ceq
          IL_0018:  stfld      bool Test::pre2
          IL_001e:  ldc.i4.0   //"false"
          IL_0021:  stfld      bool Test::pre1
          IL_0026:  ldarg.0
          IL_0027:  ldfld      class [TestLibrary]Test::GuardedTestLibrary.dll
          IL_0031:  call       instance void [TestLibrary]Test::m()
          IL_0036:  leave      IL_0048
        }  // end .try
          IL_003b:  ldarg.0
          IL_003c:  call       void [mscorlib]System.Threading.Monitor::PulseAll(object)
          IL_0041:  ldarg.0
          IL_0042:  call       void [mscorlib]System.Threading.Monitor::Exit(object)
          IL_0047:  endfinally
        }  // end handler
        IL_0048:  ret
      } // end of method Test::m
  • for each forwarded class, it generates constructors with the same signature that calls the ones of the original class.

A schema for the usage of the generated proxy

The generated assembly and its classes are saved in a separate dll, with the same name prefixed by "Guarded". This assembly should be referenced by the client applications in place of the original one, as shown in the Figure.

# Saturday, February 04, 2006

On anonymous delegates

Recently, doing some searches on anonymous methods/delagates, I came into an interesting puzzle posted by Brad Adams
The puzzle (and its explaination) are very interesting to understand what happens under the hood when an anonymous delegate is created. However, it took me some effort to understand what the compiler did using only the post and its comments, so I put Brad's code into Visual Studio, compiled an assembly, and inspected it with Reflector. The first case, the one with the delegator binding the loop variable i (num1, in the reverse engineered code) is translated in the following way:

private static void Main(string[] args)
      for (int num1 = 0; num1 < 10; num1++)
            Program.<>c__DisplayClass1 class1 = new Program.<>c__DisplayClass1();
            class1.j = num1;
            ThreadPool.QueueUserWorkItem(new WaitCallback(class1.<Main>b__0), null);

the anonymous delegate shares i between caller and callee. It is a closure, in a wide sense, in that it captures the state when it is created. However, since the C# compiler capture i "by reference" and in this case the closure is created before the for loop, the behaviour is different from the one expected in real closures (in fact, anonymous delagates are not closures)

The second example however (delegate bound to a variable local to the for loop) is different: the lifetime of j (the for block) forces the compiler to create and initalize a state object for the delegate at each iteration inside the loop:

private static void Main(string[] args)
WaitCallback callback1 = null;
Program.<>c__DisplayClass2 class1 = new Program.<>c__DisplayClass2();
class1.i = 0;
while (class1.i < 10)
if (callback1 == null)
callback1 = new WaitCallback(class1.<Main>b__0);
ThreadPool.QueueUserWorkItem(callback1, null);

public void <Main>b__0(object)

internal class Program
// Methods
public Program();
private static void Main(string[] args);

// Nested Types
private sealed class <>c__DisplayClass2
// Methods
public <>c__DisplayClass2();
public void <Main>b__0(object)

// Fields
public int i;

In a comment of Brad post, a member of the C# team pointed out that

"Anonymous methods do not capture read-only locals at a point in time, instead they actually capture the local by putting it on the heap, to share it between the outer method and all anonymous methods."

Surely this statement makes clear what is happening: however, this example prove one points I always believed: the only way to be sure of what a compiler do is to look at the assembler code it generates.

# Friday, February 03, 2006

Internet Explorer 7

Finally Microsoft released Internet Explorer 7 Beta 2 for non-Vista users! I downloaded and installed it immediately. I miss IE, it was surely faster and more comfortable to use than Firefox, but during the last year it surely become technologically obsolete.

So, what I like about Internet Explorer 7?

  • New toolbars, no menubar: it's great. It's neat, simple to use, leaves a lot of space for the pages. Who needs a complete File, Edit, View, Window, Help menu in a web browser? Finally someone understood that! A+!
  • Tabbed browsing: I do NOT use it in Firefox, because I always get lost in tabs. I think its a nice feature, and IE surely makes it simple to use and seamlessly integrated. QuickTabs are great, although Firefox have a similar extension (foXpose).
    (BTW, I don't really get tabs: why? They are even more confusing for inexperienced users, because the system already has another tab-bar, the application bar)
  • Search box (finally!). It is possible to add engines, but the number is still scarce: no Webster, no Wikipedia... MSDN is here, fortunately.
  • Search also from the address bar: Mozilla had it, why did they remove it from Firefox I will never get. And it uses the default search engine, too.
  • (Unchanged form IE6, but impossible to have with Firefox) Opening a new Tab/ Window: you can choose to open it showing exactly the same page you are watching now. In Firefox I am forced to open a blank page, then copy/paste the address between address bars.

And what is still behind Firefox? Or, what I hate of IE7

  • First, the most horrible think: ALIASED FONTS.They look terrible on my monitor, why if I disabled them in the control panel IE must impose its will and use aliased fonts? They are orrible to see in my opinion, make me wonder if there is fog in the room..
  • Download boxes: one box for download is surely better than Firefox download manager (awful, IMO) but fortunately firefox have a great extension, Download statusbar

  • Extension: if I don't like something in Firefox, there is an extension to modify it. I love extensions, and if IE7 will be programmable using .NET (maybe JScript.NET) I will be the first to use it and to write a lot of extensions.
  • One of the features I love in Firefox and Visual Studio: incremental search. Guys, you made such a nice work with the tool-bars and the notifications (the yellow ribbon), why ruin the user experience prompting a search box? (or a download dialog, btw)
  • On the same line: download files in the right place from the beginning. Why put them in the cache and then move them? I incidentally clicked "cancel" during this move operation (I was typing) and I had to download the file again.

Well, at the end of the list.. I still prefer Firefox. The only hope of IE is to be programmable, and then it will be mine.

More on yield

Yesterday I discovered and introduced the yield keyword. Today, I wanted to see some examples and to discover how it is implemented. Let's start with a classic example, the Fibonacci function, both in Python and in C#.
def fib():
   a, b = 0, 1
   while 1:
       yield b
       a, b = b, a + b
static IEnumerable<int> Fib(int max)
int a = 0;
int b = 1;
yield return 1;

for (int i = 0; i < max - 1; i++)
int c = a + b;
yield return c;

a = b;
b = c;

The two method are essentially equivalent. How are they implemented? The first thought that came into my mind when I saw the Fibonacci example was: the compiler traslate it to static variables. But  clearly this is not the case: if two clients call the same function, they will share its internal state, which is not desiderable.
The second guess was: well, as in anonymous delegates the current state is captured by an anonymous class generated by the compiler. The compiler generates it inseting one field for every local variable in the function, and then instantiate an object of that class for every call site. This explains the Fibonacci function, but what about this one?

static IEnumerable<string> gen2()

yield return "A";

string s = "A" + "B";
yield return s;

s = 1.ToString();
yield return s;

for (int i = 0; i < 5; ++i)
yield return i.ToString();
Clearly, execution of the function body must stop after a yield statement, like after a return statement, but it must also be resumed in the same place, not at the begininng of the function. So, we have to save also the point at which execution stopped, and resume it (with a jump right at the beginning of the function) right at the same point. Using the excellent Reflector tool by Lutz Roeder you can see in pseudo-code (reverse engineered C#) how this class is generated and how the function is wrapped inside MoveNext, with a big switch right at the beginning of the function that allows resuming execution at different points based on a state variable:
private sealed class <gen>d__7 : IEnumerable<string>, IEnumerable, 
IEnumerator<string>,IEnumerator, IDisposable { // Methods [DebuggerHidden] public <gen>d__7(int <>1__state);
      private bool MoveNext()
         switch (this.<>1__state)
            case 0:
                  this.<>1__state = -1;
                  this.<>2__current = "A";
                  this.<>1__state = 1;
                  return true;
            case 1:
                  this.<>1__state = -1;
                  this.<s>5__8 = "AB";
                  this.<>2__current = this.<s>5__8;
                  this.<>1__state = 2;
                  return true;
            case 2:
                  this.<>1__state = -1;
                  int num2 = 1;
                  this.<s>5__8 = num2.ToString();
                  this.<>2__current = this.<s>5__8;
                  this.<>1__state = 3;
                  return true;
            case 3:
                  this.<>1__state = -1;
                  this.<i>5__9 = 0;
                  while (this.<i>5__9 < 5)
                        this.<>2__current = this.<i>5__9.ToString();
                        this.<>1__state = 4;
                        return true;
                        this.<>1__state = -1;
            case 4:
                  goto Label_00DE;
         return false;
[DebuggerHidden] IEnumerator<string> IEnumerable<string>.GetEnumerator(); [DebuggerHidden] IEnumerator IEnumerable.GetEnumerator(); [DebuggerHidden] void IEnumerator.Reset(); void IDisposable.Dispose(); // Properties string IEnumerator<string>.Current { [DebuggerHidden] get; } object IEnumerator.Current { [DebuggerHidden] get; } // Fields private int <>1__state; private string <>2__current; public int <i>5__9; public string <s>5__8; }
All this reasoning happened yesterday with a guy that work on dynamic languages (I don't want to say who right now... I want only to say "cross your fingers for me!" I may have great news for you in the future). This guy correctly pointed out an issue with this approach: what if the yield statement is inside a try block? In CIL, you can't jump inside a try block: you can only jump at the first instruction of the block. I was puzzled at first: what can we do? The simplest solution was to not permit the mixing of try/catch and yield. But surely this is limitating (Python allow yield to be used almost everywhere, the only exception being yield not allowed in try-finally). The guy then gave me an hint: obviously, once you are inside the block, you can jump whenever you like inside that very block. So, the execution can be resumed using a sort of multiple-dispatch. It was all very clear at once: you have to make a first jump inside the right try block, then at the correct location inside the block. In a function like this one
def f():
       yield 1
           yield 2
           yield 3  # never get here
       except ZeroDivisionError:
           yield 4
           yield 5
           yield 6
       yield 7     # the "raise" above stops this
       yield 8
   yield 9
       x = 12
       yield 10
   yield 11

The compiler will translate this code in something like:

 L_0001: ldfld int32 YieldTest.Program/<gen>d__7::<>1__state
 L_0006: stloc.1 
 L_0007: ldloc.1 
 L_0008: switch (L_0031, L_0023, L_0025, L_0027, L_002c)
 L_0021: br.s L_0033
 L_0023: br.s L_0059
 L_0025: br.s L_0085
 L_0027: br L_00b2
 L_0033: switch (L_0034, L_0037)
 L_0034: ldc.i4 2 
 L_0035: stfld int32 YieldTest.Program/<gen>d__7::<>2__current
 L_0036: br L_0105

The first switch (at position L_0008) selects the try block, the second one (position L_0033) selects the yield statement inside that block. I tried to translate this Python function in C#, to see how the IL code generated by the C# compiler looks like, and with my great surprise the following code

static IEnumerable<int> f()
yield return 1;
yield return 2;
int k = 1 / 0;
yield return 3; //never get here
catch (System.DivideByZeroException)
yield return 4;
yield return 5;
yield return 6;
yield return 7; //the "raise" above stops this
yield return 8;
yield return 9;
int x = 12;
yield return 10;
yield return 11;

did not compiled at all! The compiler spits out the following errors:

error CS1626: Cannot yield a value in the body of a try block with a catch clause
error CS1631: Cannot yield a value in the body of a catch clause
error CS1625: Cannot yield in the body of a finally clause

Apparently, the C# team glissed over the problem by not allowing the same behaviour as in Python.

# Thursday, February 02, 2006

The yield statement

C# 2.0 introduced a new keyword, yield. I didn't paid many attention to this new keyword, assuming that anonimous delegates and generics were more interesting and that yield was only a way to wrap an "iterator pattern". I was wrong (but I have the excuse that the name given to this feature in C# is 'enumerators', and I think now that this name is a bit reductive and misleading). Yield exposes a feature known by the Python community as Generators.

They are a bit like continuations, because they return (or better, yield) a value and then when they are called again they resume execution right after the last yield.

In C#, yield must be used inside an iterator block. An iterator block is more or less a function whore return type must be IEnumerable, IEnumerator, IEnumerable, or IEnumerator (see MSDN).

using System;
using System.Collections.Generic;
using System.Text;

namespace YieldTest
class Program
static void Main(string[] args)
foreach(string s in gen())


static IEnumerable<string> gen()
Console.WriteLine("In the generator function");

Console.WriteLine("Give A");
yield return "A";

Console.WriteLine("Give B");
yield return "B";

It is easy to see why C# enumerators are very useful when implementing iteration over a sequence: as Raymond Chen points out in one of his blog posts there are two "models" for the enumerator-consumer pattern:

"The COM model for enumeration (enumeration objects) is biased towards making life easy for the consumer and hard for the producer. The enumeration object (producer) needs to be structured as a state machine, which can be quite onerous for complicated enumerators, for example, tree walking or composite enumeration."

"On the other hand, the callback model for producer (used by most Win32 functions) is biased towards making life easy for the enumerator and hard for the consumer. This time, it is the consumer that needs to be structured as a state machine, which is more work if the consumer is doing something complicated with each callback. (And even if not, you have to create a context structure to pass state from the caller, through the enumerator, to the callback.) "

In the first model, the caller calls Next() repeatedly and the enumerator has to find the next item and return it. Since the enumerator returns, it has to record state informations with a stack data structure, mimicking the call stack.

In the callback model, on the other and, the producer performs the operations it needs on the data structure (walks a tree, for example) and calls back the consumer through the callback as it finds items. This makes the producer implementation straightforward (in the case of the tree, a simple recursive function will do) but life is made harder for the consumer: it needs to maintain state across each callback.

It would be great to have the simpler approach on both sides, with the caller seeing a simple enumerator that returns items in order and the enumerator seeing a callback that it can throw item into. Raymond solution is a great piece of software based on fibers (user-scheduled threads), but as he points out fibers are hard to use, and it is very easy to make subtle error difficult to debug.

C# solution is in the yield statement and in generators.