# Thursday, 24 November 2005

Introducing synchronization contracts

Over the last year, during my thesis period and afterwards, I have seen a lot of interest around contracts in the software research community. Maybe I'm biased, since my thesis was on contracts, but Don Box claimed more than once that Indigo (now WCF) is based on contracts, and Singularity, the new prototype of a managed OS from Microsoft Research, is based on contracts as well.
In the same period, interest in concurrency has also been growing; the most evident sign is the attention raised by Herb Sutter's article "The Free Lunch Is Over".

The idea expressed in my thesis is: why not use contracts to specify concurrency constraints? We can use them to check whether a desired temporal behaviour is met, or whether an undesirable sequence of events happens. But we can do even more: we can enforce a desired temporal behaviour, generating synchronization code from the contract itself.

If you are interested in this technique, let me explain it in some more detail: in a contract-aware language, the correctness of a program can be expressed and enforced through contracts. Contracts are assertions on a single method (pre-conditions and post-conditions) or over a whole class (class or object invariants). In their simplest form, those assertions are expressed in boolean logic, but there are various "levels" of contracts:
  • basic or syntactic contracts (for example: XML schemas, IDL);
  • assertion-based contracts (for example: Eiffel);
  • behavioral contracts (assertions are not only on methods or classes, but on the control flow as well);
  • synchronization contracts (like the previous ones, but able to work in a concurrent environment, i.e. to check temporal behaviour);
  • QoS contracts (like the previous ones, but with the possibility to specify also numerical constraints (throughput, mean response time, etc.)).
Therefore a contract-aware language is often a dialect of an existing programming language with a few additional keywords for expressing assertions.
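To give a concrete feeling of the assertion-based level with ordinary tools, here is a tiny sketch of mine (plain C++ with runtime asserts, not CxC or Eiffel syntax) showing a pre-condition, a post-condition and a class invariant:

#include <cassert>

// Hypothetical example (my own, not CxC/Eiffel syntax): an assertion-based
// contract for a bounded counter, enforced with plain runtime checks.
class Counter
{
    int value;
    int max;

    // Class invariant: 0 <= value <= max.
    bool Invariant() const { return value >= 0 && value <= max; }

public:
    explicit Counter(int maxValue) : value(0), max(maxValue)
    {
        assert(maxValue > 0);      // pre-condition
        assert(Invariant());       // invariant established
    }

    void Increment()
    {
        assert(value < max);       // pre-condition
        int old = value;
        ++value;
        assert(value == old + 1);  // post-condition
        assert(Invariant());       // invariant preserved
    }
};

int main()
{
    Counter c(3);
    c.Increment();                 // all checks pass
    return 0;
}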

A compiler for a language supporting concurrency contracts will therefore automatically generate checks to detect at run-time whether an undesirable sequence of events happens. To express the assertions from which those checks are generated we use temporal logics, extensions of boolean logic that add operators for specifying temporal behaviour.

How does it work? We borrow techniques from the formal verification field; the temporal logics we use (LTL, linear temporal logic, and PLTL, LTL with Past) come from that field as well.
The life of an object can be seen as an execution trace, i.e. a sequence of traversed states. From its "birth" (construction) to its "death" (finalization and destruction) an object has an internal state, and this state is altered by external events through calls to its methods, using the interface it provides.

We can check whether a particular assertion holds by examining the sequence of states the object has traversed. Moreover, since the satisfaction of a Past LTL formula can be defined recursively, for those formulas the check needs to look only at the previous execution step.
For LTL formulae, instead, we make use of well-known algorithms to translate every formula into a finite state automaton.
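As a toy illustration (my own sketch, not the code CxC would generate), a Past-LTL style property such as "Read is allowed only if an Open happened in the past, with no Close since then" can be checked by keeping a one-bit summary of the past and updating it at every event, precisely because its satisfaction at step n depends only on step n-1:

#include <cassert>

// Toy run-time monitor (illustration only, not generated contract code) for
// the Past-LTL property "Read only if Open happened before, no Close since".
class FileMonitor
{
    bool openSeen;   // summary of the past: are we inside an Open..Close pair?

public:
    FileMonitor() : openSeen(false) {}

    // Each event updates the state using only the previous step.
    void OnOpen()  { openSeen = true; }
    void OnClose() { openSeen = false; }
    void OnRead()  { assert(openSeen && "temporal contract violated: Read before Open"); }
};

int main()
{
    FileMonitor m;
    m.OnOpen();
    m.OnRead();     // fine: an Open happened in the past
    m.OnClose();
    // m.OnRead();  // would fire the assertion: Read after Close
    return 0;
}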

This is for the checking side, but we wished to push it a little further...

# Tuesday, 22 November 2005

MSDN search on Firefox

I use Mozilla Firefox as my primary browser, mainly for its great extensibility through extensions/plugins/whatever. They are a big plus for every developer! Everybody knows that customization is not a primary goal for the mainstream market (most end-users prefer standardization); however, if you are a developer you try to tailor everything to your needs, from the development IDE to the web browser.
Having heard from Antimail about the new MSDN search engine, I could not resist adding it to the search engines in Firefox.. =) Here are the code and the png image for it.

BTW, I hope Microsoft will add a managed model to IE 7. Exposing the browser features to .NET languages would be great. I'm not talking about the HTML rendering component: that one is already an ActiveX object reusable in every application. I'm thinking about the whole browser: context menus, search bar, progress bar, address bar and menus.. It would be a pleasure to write browser plugins with .NET: for now, that is possible only in the excellent new Visual Studio 2005!

msdnplugin.src (.89 KB)
msdnplugin.png (.88 KB)

# Saturday, 19 November 2005

My first lesson was...

It has been over a year since my first lesson as an assistant professor at the University of Trento, held while I was writing my thesis. The course was on Security and Privacy, and my lessons were on software attacks and how to prevent them. I do not claim to be a security expert, I'm not, but my love for compilers and the inner workings of the OS, and my desire to know "how it works inside", gave me enough knowledge to show others what happens when a cracker gets control of your computer.
I always promised myself to publish my lessons on the web. I also have some examples, but I don't think I'm going to publish them. They are all innocuous, since they only affect themselves, but maybe it's not so responsible to leave them "free" in the open air.. =). My first lesson was on general vulnerabilities, an introduction to the lexicon and to the first and simplest type of software attack: the buffer overflow. If you want to read the whole lesson you can find it here (in PowerPoint format).

If you, like me, are lazy, here is a little appetizer of what you will find, to see if it's worth your time. I hope so!
When you compile a C program, the instructions are translated into ASM code (and then into machine code, but there the mapping is almost 1-1). As we saw some days ago, on an x86 machine most of the data (parameters, local space) is held on the stack. As an example, consider the following C code:

void f(int a, int b)
{
    char buffer[20];
    int i;
    int j;

    j = a;
    i = a + b;
}

int main()
{
    f(2, 3);
    return 0;
}

When we compile it, it is translated to the following ASM instructions:

; 3    : {

   push  ebp
   mov   ebp, esp
   sub   esp, 28              
   
; 4    :    char buffer[20];
; 5    :    int i;
; 6    :    int j;   
; 7    :    j = a;
; 8    :    i = a + b;

   mov   eax, DWORD PTR [ebp + 8h] ; a
   mov   DWORD PTR [ebp - 8], eax ; j EBP - 8
   add   eax, DWORD PTR [ebp + 0Ch] ; b
   mov   DWORD PTR [ebp - 4], eax ; i EBP - 4
 
; 9    : }

   mov   esp, ebp
   pop   ebp
   ret   0

; 12   : {

   push  ebp
   mov   ebp, esp
   sub   esp, 0

; 13   :    f(2, 3);

   push  3
   push  2
   call  _f

; 14   :    return 0;

   xor   eax, eax

; 15   : }

   mov   esp, ebp
   pop   ebp
   ret   0

From the assembly code, you can see how parameters and locals are translated into stack space. The return address (the point from which the function was called, and to which we should return) is also saved on the stack. Can you see it? If we fill the buffer with more than 20 characters we will spill over and "invade" the space of the other locals, of the return address, and of the other parameters.

void func()
{
    char buffer[20];
    char* current = buffer;
    int i;

    for (i = 0; i < 256; ++i)
    {
        *current = 'A';
        ++current;
    }
}

int main()
{
    func();
    return 0;
}

What will happen? A memory access error ("This program will be terminated" on Win32, "Segmentation fault" on Linux), surely. But, if you compile and execute the code, at which address will the fault happen? (Hint: the ASCII code for 'A', repeated four times, forms a memory address in the 32-bit virtual address space of your process.)

Now, with another example program, we try to spill the buffer in a clever way: knowing that at memory address 0x00401090 there is an "interesting" piece of code, we can try to "return" to it, instead of returning to main.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

//00401090
static char message[] = "AAAAAAAAAAAAAAAAAAAAAAAA\x90\x10\x40\x00";

void handleRequest()
{
    char buffer[20];
    strcpy(buffer, message);
    printf("Request: %s\n", buffer);
}

void doNastyThings()
{
    printf("He he!!\n");
}

int main()
{
    while (1)
    {
        handleRequest();
    }
    return 0;
}

Compile it with VC++ 6.0 (if you have another compiler, the address of the doNastyThings function will obviously change: fix it). Surprised to see "He he!!" on the console? Have I tickled your brain? So, to see how it works.. let's read my lesson!

BufferOverflow.ppt (162 KB)
# Friday, 18 November 2005

More on SPARC


Yesterday we briefly saw the architecture of Sun SPARC processors, and how the presence of a big register file and its organization into register windows affects the calling convention and the assembly programming model.
The idea seemed clever, and C programs should benefit greatly from this design (we know that function calls are a major performance issue on x86). However, it is not without reason that the SPARC-Solaris platform was nicknamed Slowlaris.. =)
The register window was designed without considering a real-world scenario: the presence of all those registers is a terrible slowdown on a multitasking operating system; when a context switch happens, all the registers of the thread being switched out must be dumped into memory. It's not like on x86, where only the general purpose registers need to be saved to memory; the other ones are "hidden", and used cleverly by the processor.
For SPARC processors, the number of registers (for an 8-window processor) is 136, i.e. about 1 KB of data. As CPU frequencies increase, memory bandwidth becomes a major bottleneck, and with a time slice of 50 ms (quite a long slice) that means about 20 KB of memory traffic wasted every second.
It may not seem much, but as an example I have read that IPC would be 5 times slower on a SPARC processor than on a comparable 8-register processor, if all the SPARC registers were saved and restored on every context switch.
Algorithms for lazy context switching have been proposed, but they are applicable only through a specially designed kernel or through hardware modifications.
Hence the "friendly" name Slowlaris for the SPARC-Solaris pair.

# Thursday, 17 November 2005

Porting Bugs


Today I worked with a developer from another company on porting some C programs they wrote for SPARC (with Solaris) to our machines, which are all Intel x86 based. It went pretty much smoothly; we were ready to face the endianness problem (for network communication and binary file reading). But we found a very interesting bug that never manifested on SPARC but fired immediately on Intel. The source was more or less:
int Function1(int paramA, int paramB)
{
    if (paramA == 0)
    {
        //bad value
        return 0;
    }

    if (paramB != 1 && paramB != 2)   //paramB must be 1 or 2
    {
        //bad value
        return 0;
    }

    //do stuff
    //call another function
    Function2();
}

int main()
{
    //...
    if (Function1(a, b) == 0)
        printf("Error: Function1 returned 0\n");
    //...
}

Can you already spot the bug? We did, after a little "tracing" (printf) (I know, it's the worst kind of debugging.. but we were compiling over ssh, in a shell, and it proved to be a quick solution).
If you have already guessed what comes next, you can stop reading and go see the solution. If not, you can enjoy a trip into the calling conventions of two very different platforms: a CISC and a register-rich RISC.

I know, defining a Pentium IV as a CISC is not quite correct. Pentium processors now have a superscalar architecture, out-of-order execution, and internally they use micro-ops that are very RISC-like. And they have tons of registers (does anybody know how many, exactly?). But the "programming interface", i.e. the instruction set and register set exposed by the CPU to the world, is CISC. It has only a handful of general purpose registers and a rich instruction set.

On the other side, the SPARC architecture, instruction set and register set are very different. Even function calling conventions on SPARC are deeply influenced by its early hardware design. Since the instruction set was reduced (in the first version there wasn't even an instruction to divide two numbers), the engineers decided to use the spare die space for a lot of registers. And in order to use them efficiently, they organized them into several "register windows".
On x86, every function has its own stack space (the stack frame). On SPARC every function also has its own "register space", a register window. A register window is a set of 24 registers: 8 %i (input) registers, 8 %o (output) and 8 %l (local). The registers are mapped in a clever way: the out registers of the caller become the in registers of the callee, so that the first 6 parameters are passed through the out-in register pairs (the remaining two hold the stack pointer and the return address). The first one has an additional special semantic:

%o0  (r08)  [3]  outgoing parameter 0 / return value from callee   
%i0  (r24)  [3]  incoming parameter 0 / return value to caller

This register, unlike the other input registers, is assumed by the caller to be preserved across a procedure call.
This mapping is done with the help of the SAVE and RESTORE instructions; the prolog/epilog structure on SPARC thus looks like this:

function:
   save  %sp, -K, %sp

   ; perform function
   ; put return value in register %i0 (if the function returns a value)

   ret        
   restore

   
It is important to note that in the SAVE instruction the source operands (first two parameters) are read from the old register window, and the destination operand (the rightmost parameter) is written to the new window. "%sp" is used as both source and destination, but the result is written into the stack pointer of the new window, which is a different register (the stack pointer of the old window is renamed and becomes the frame pointer).
   
The call sequence is:

call <function>
mov 10, %o0

    
The delay slot is often filled with an instruction that sets some parameters; in this example it loads the first parameter.

What is a delay slot?
When the control flow is altered (by an unconditional transfer instruction, like call, jmpl or a branch) the order of execution changes: the instruction after the jump is fetched at the same time as the jump itself.
This is done to keep the pipeline simple: when a jump instruction is executed, there is already another instruction in the pipeline - the instruction in the delay slot - and it will be executed before the processor actually has a chance to jump to the new location. This is the simplest scheme for the chip designer to implement, since the general pipelining mechanism can be used without making any exception for transfer instructions; and it is the fastest way of arranging things, since no instructions are discarded.
The SPARC assembly language programmer must constantly be aware of delay slots while coding any change in the flow of control, since the order of instruction execution is reversed from the order in which the instructions appear.


Returning to our problem: when function a calls function b, the %o registers of function a become the %i registers of function b.
If b does not touch %i0 (i.e. it reaches the end of the function without a return statement), %i0 - the return value - remains equal to the incoming parameter.
In the original code, paramA is always != 0 at that point (otherwise the function would have returned 0 explicitly), so the function returns to the caller with a return value != 0 (equal to the first parameter).

What happens when porting to Intel? Intel calling conventions are very different. Remember, x86 has only a few general purpose registers accessible through assembly language, so it has to pass parameters using the stack:

  • Arguments are pushed on the stack in reverse order.
  • The caller pops arguments after return.
  • Primitive data types, except floating point values, are returned in EAX or EAX:EDX depending on the size.
  • float and double are returned in fp0, i.e. the first floating point register.
  • Simple data structures with 8 bytes or less in size are returned in EAX:EDX.
  • Complex Class objects are returned in memory.
  • When a return is made in memory the caller passes a pointer to the memory location as the first parameter. The callee populates the memory, and returns the pointer. The caller pops the hidden pointer together with the rest of the arguments.

If no C return statement is found, when the control flow reaches the end of the function the ret instruction is executed and control returns to the caller. The return value, i.e. the one in the EAX register, is the last value that register held... possibly the return value of the last function called.
In our case, the last function called before the end (Function2) returned 0, and so did our Function1.
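The straightforward fix is, of course, to give every path an explicit return value. Here is a sketch (the success value 1 and the signature of Function2 are my own assumptions, since the original snippet does not show them):

int Function2(void);   /* assumed signature; the original snippet does not declare it */

int Function1(int paramA, int paramB)
{
    if (paramA == 0)
        return 0;                     /* bad value */

    if (paramB != 1 && paramB != 2)   /* paramB must be 1 or 2 */
        return 0;                     /* bad value */

    /* do stuff, call another function */
    Function2();
    return 1;                         /* explicit success value on every path */
}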

What is the lesson? Compiling and testing on one platform (even with a cross compiler, like the GCC we used) isn't enough. What surprised me was the complete absence of compiler errors or warnings. Maybe I am too used to tools that warn you when you do something wrong =) but that's the way I like them. When a compiler seems too strict or seems to give too many warnings, remember that an error detected by the compiler is one less bug in your product.

# Friday, 11 November 2005

Visual Studio 2005 installed!

Yesterday I was at the Microsoft 2005 Technical Conference, part of the 2005 Ready to Launch tour. We all got a Visual Studio 2005 Standard NFR; a very welcome present. It's true: Visual Studio 2005 rocks! Thank you to all the team for the great work, and to Cyrus for the beautiful IDE!

# Thursday, 10 November 2005

Enemy Territory fun!

I'm not a passionate gamer: I like games very much, but especially from a technical/developer point of view: often they are the expression of the state of the art of many fields, condensed into one product. Think about algorithms, graphics, artificial intelligence, physical simulation and numerical computation. I love games, I like experimenting with graphics very much, but I am a sporadic player (with the exception of RPGs like Star Wars KOTOR I and II, or NeverWinter Nights).

Back in my third year at university, however, some of my fellow students formed an Enemy Territory clan. Enemy Territory is a multiplayer-only, team-based FPS set in the Second World War. It is free, since it was supposed to be the multiplayer part of a single-player game, but the project for the latter failed.
You can choose your team, Axis or Allies, and a class: soldier, medic, engineer, covert ops, field ops. Every class has unique strengths and abilities: it's very fun to be a medic and "revive" your friends, or to be a covert ops and steal the uniform of a member of the other team, or to be a sniper....

The absence of single-player capabilities, and of any artificial intelligence at all (no bots), is stimulating: every opponent is a human, and so there are no predefined tactics or "stupid" behaviours (well, only humanly stupid behaviours).

There is a stock of predefined sentences to communicate quickly with the other members of the team over the internet, when you cannot reach the others "by voice": "I need a medic", "They are coming", etc. Among them, "We need an engineer" (mainly to defuse dynamite) and "Enemy in disguise" (when you see a covert ops wearing your uniform..). The sentences come both as a written banner at the bottom of the screen and as an audio file. In the shipped version the written sentences match the audio files, but it is possible to change the written part.

Someone on the internet did something that made me laugh for an hour or so: they changed "We need an engineer" to "We need a ninja here" and "Enemy in disguise" to "Enemy in the skies". Same pronunciation! And I saw someone from my team looking up, watching the skies..
And when things get really bad, it's so cool to go around and call a ninja to help you =)

# Wednesday, 09 November 2005

More on syntax highlight

As you may have noticed, I have "solved" my problem of posting code. Now, if you look back at last Sunday's post, I have a cool area delimited by a dashed border for my code, with syntax coloring for C++ and correct indentation. And it works on Firefox too.
My effort was very limited, since I did not code the entire solution by myself as I originally wanted. I'm not entirely satisfied, so maybe in the future I will do it, or simply replace the syntax highlight engine. DasBlog uses the syntax highlight engine from Thomas Johansen. His highlight engine (here) uses a simple API that takes the source code and a language definition file (written in XML), and spits out HTML. The code must be enclosed in <pre> tags, since the engine does not encode spaces as &nbsp;. This was the first bug in dasBlog, and I got rid of it easily. The second problem was: why doesn't the popup window that should let me enter the code show up?
The problem was in the showModal JavaScript function. It is a very handy function for showing modal windows in a browser, very good for creating rich-content dialogs. Unfortunately, Firefox does not implement it. I could go for two solutions: using an alternative function (or better, a set of functions like those implemented by SubImage) or not using a popup dialog at all. I went for the second one.
So, I was able to have the syntax highlighter I wanted in three steps:
  • redesign the Add Entry form;
  • add some HTML to the code returned by Thomas Johansen's highlighter (for the border and the indentation);
  • add an XML file for the C++ syntax.
It works pretty well; however, the highlighter has some problems: I wasn't able to highlight curly braces, for example, and it isn't able to recognize tokens not separated by whitespace:

// Separated tokens
if (a != b)

// Not working
if (a!=b)



For now it works well enough for me, but in the future I could code my own syntax highlight engine (for example following this article) or I could replace it with a more professional one, for example CodeHighlighter.
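If I ever write my own engine, the core of it will be a tokenizer that does not rely on whitespace: it has to peel off identifiers, numbers and (multi-character) operators by looking at character classes. A minimal sketch of the idea (mine, not Thomas Johansen's code):

#include <cctype>
#include <iostream>
#include <string>
#include <vector>

// Minimal sketch (not the dasBlog highlighter): split a line into tokens by
// character class, so that "a!=b" yields "a", "!=", "b" even without spaces.
std::vector<std::string> Tokenize(const std::string& line)
{
    static const std::string twoCharOps[] =
        { "!=", "==", "<=", ">=", "&&", "||", "++", "--", "+=", "-=" };

    std::vector<std::string> tokens;
    std::string::size_type i = 0;
    while (i < line.size())
    {
        unsigned char c = line[i];
        if (isspace(c)) { ++i; continue; }

        std::string token;
        if (isalpha(c) || c == '_')
        {
            while (i < line.size() && (isalnum((unsigned char)line[i]) || line[i] == '_'))
                token += line[i++];                 // identifier or keyword
        }
        else if (isdigit(c))
        {
            while (i < line.size() && isdigit((unsigned char)line[i]))
                token += line[i++];                 // number
        }
        else
        {
            token = line.substr(i, 2);              // try a two-character operator first
            bool matched = false;
            for (size_t k = 0; k < sizeof(twoCharOps) / sizeof(twoCharOps[0]); ++k)
                if (token == twoCharOps[k]) { matched = true; break; }
            if (!matched)
                token = line.substr(i, 1);          // single-character operator/punctuation
            i += token.size();
        }
        tokens.push_back(token);
    }
    return tokens;
}

int main()
{
    std::vector<std::string> tokens = Tokenize("if (a!=b) return x+1;");
    for (size_t k = 0; k < tokens.size(); ++k)
        std::cout << tokens[k] << '\n';
    return 0;
}

With this kind of splitting, "a!=b" produces the same three tokens as "a != b", so the highlighter could color both cases.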

Here is a screenshot of my modified form; if you like it, send me an email and I'll post here the source code.
# Tuesday, 08 November 2005

Managed OS and contracts

Today I had some time to flush my blog queue, and I found two interesting posts: the latest episode of Mark Russinovich's "war" against Sony and its buggy rootkit used as a DRM system (surely something against trusted computing.. I'd like to know if anyone knows Microsoft's official position) and Adi Oltean's Sunday post on the Singularity OS at http://research.microsoft.com/os/singularity/

I think the first one is surely worth discussing, but many (too many, since it reached the mainstream media) words have already been spent on it.

The second one came unexpected, and with a little sense of envy.. for not being there. Ever since my conversion to the CLR I have been dreaming and speculating about how to build an OS in "managed" code (or using as much managed code as possible). And now they have done it!
In the technical report there are many interesting ideas: from contracts (not so different from the ones we'll be using in Indigo, it seems) to the use of MSIL to produce verifiable code, the global optimizing compiler (better than a JIT) and the global garbage collector. And versioning, trusted components, and all the benefits the CLR brought to us... OS-wide!
As my MSc thesis was on concurrency contracts for the CLR, I think a very interesting part of the project is the Sing# programming language. I have had some experience with the Spec# programming language (a good project from another MSR team); I actually had the luck to speak with Schulte and Tillmann from the Spec# team at SAPS last year. Their work inspired mine on the CxC language. Now I really want to go and see how Sing# deals with concurrency, and whether it is possible to fit it with my idea of concurrency contracts. I will do further reading on the subject in the following days, time permitting as always. For those interested, you can find the paper here, and my thesis here.

# Sunday, 06 November 2005

C#, C++ and Rotor (2)


To see how the C# compiler works, what better way than to go and look at the sources? However, if you have ever written a compiler, you understand that figuring out who calls whom is not so simple. I decided to simplify my investigation a little with the help of.. a debugger =)
I set two breakpoints in two functions that I believed to be correlated, but that did not seem to touch each other anywhere in the code. It was strange, because both are parse functions and belong to the same class.

I placed the first breakpoint in ParseNamespaceBody. As you can see in parser.cpp, this function will in turn call ParseNamespace and ParseClass,
which in turn will call ParseMethod, ParseAccessor, ParseConstructor...
The second breakpoint was placed instead in ParseBlock. This is the function that parses a block of code, i.e. all the statements enclosed by two curly braces { }.
The interesting thing is that the two breakpoints are not hit one right after the other, but at different moments, with different stack traces.

This is the stack trace for the first breakpoint:
   
0006efe8 531fa3ad cscomp!CParser::ParseNamespaceBody+0x74 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\parser.cpp @ 1627]
0006effc 53216891 cscomp!CParser::ParseSourceModule+0x8d [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\parser.cpp @ 1602]
0006f18c 5320c226 cscomp!CSourceModule::ParseTopLevel+0x691 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcmod.cpp @ 3331]
0006f19c 53175940 cscomp!CSourceData::ParseTopLevel+0x16 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcdata.cpp @ 171]
0006f234 53175bd7 cscomp!COMPILER::ParseOneFile+0x130 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 1197]
0006f284 53174599 cscomp!COMPILER::DeclareOneFile+0x27 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 1228]
0006f324 5317767a cscomp!COMPILER::DeclareTypes+0x89 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 821]
0006f4d0 53173987 cscomp!COMPILER::CompileAll+0x13a [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 1676]
0006f65c 5317d0dd cscomp!COMPILER::Compile+0xd7 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 704]
0006fe48 0040db29 cscomp!CController::Compile+0xdd [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\controller.cpp @ 225]
0006ff50 0040e7ac csc!main+0x349 [d:\cxc\sscli\clr\src\csharp\csharp\scc\scc.cpp @ 2270]
0006ff64 79c8138d csc!run_main+0x1c [d:\cxc\sscli\pal\win32\crtstartup.c @ 49]
0006ff88 0040e70b rotor_pal!PAL_LocalFrame+0x3d [d:\cxc\sscli\pal\win32\exception.c @ 600]
0006ffc0 7c816d4f csc!mainCRTStartup+0x7b [d:\cxc\sscli\pal\win32\crtstartup.c @ 66]
0006fff0 00000000 kernel32!BaseProcessStart+0x23

   
And this is the trace for the second one:
   
0006eed4 5321849c cscomp!CSourceModule::ParseInteriorNode+0xbc [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcmod.cpp @ 4576]
0006ef7c 531e163b cscomp!CSourceModule::GetInteriorNode+0x32c [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcmod.cpp @ 3863]
0006ef98 531e14fb cscomp!CInteriorTree::Initialize+0x4b [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\inttree.cpp @ 80]
0006efc8 53217fe4 cscomp!CInteriorTree::CreateInstance+0x6b [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\inttree.cpp @ 61]
0006f068 5320c28a cscomp!CSourceModule::GetInteriorParseTree+0xd4 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcmod.cpp @ 3728]
0006f07c 5316d1e8 cscomp!CSourceData::GetInteriorParseTree+0x1a [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcdata.cpp @ 187]
0006f1f0 5316c44a cscomp!CLSDREC::compileMethod+0x6c8 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\clsdrec.cpp @ 6832]
0006f220 5316b728 cscomp!CLSDREC::CompileMember+0x9a [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\clsdrec.cpp @ 6569]
0006f268 5316c7b2 cscomp!CLSDREC::EnumMembersInEmitOrder+0x228 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\clsdrec.cpp @ 6327]
0006f2f0 5316c084 cscomp!CLSDREC::compileAggregate+0x252 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\clsdrec.cpp @ 6657]
0006f324 53177f1b cscomp!CLSDREC::compileNamespace+0xc4 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\clsdrec.cpp @ 6523]
0006f4d0 53173987 cscomp!COMPILER::CompileAll+0x9db [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 1922]
0006f65c 5317d0dd cscomp!COMPILER::Compile+0xd7 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 704]
0006fe48 0040db29 cscomp!CController::Compile+0xdd [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\controller.cpp @ 225]
0006ff50 0040e7ac csc!main+0x349 [d:\cxc\sscli\clr\src\csharp\csharp\scc\scc.cpp @ 2270]
0006ff64 79c8138d csc!run_main+0x1c [d:\cxc\sscli\pal\win32\crtstartup.c @ 49]
0006ff88 0040e70b rotor_pal!PAL_LocalFrame+0x3d [d:\cxc\sscli\pal\win32\exception.c @ 600]
0006ffc0 7c816d4f csc!mainCRTStartup+0x7b [d:\cxc\sscli\pal\win32\crtstartup.c @ 66]
0006fff0 00000000 kernel32!BaseProcessStart+0x23


As you can see, the lowest common ancestor is COMPILER::CompileAll. Then the traces diverge: the first one goes through DeclareTypes, the second one through compileNamespace. I started to guess what was happening, but I wanted to see it clearly.

So I examined the two functions and the functions called by them. The ParseMethod, ParseAccessor and ParseConstructor functions parse the declarations of classes, functions and properties. But then, when they reach the opening curly brace, they stop. Or, better, they call a SkipBlock function that searches the body of a class/method/etc. for other declarations (inner classes, for example) but skips the statements. They record the start of the method (the position of the first curly brace) and return.

Digging into ParseBlock, I saw that it will in turn call the functions responsible for generating the intermediate representation, a parse tree made of STATEMENTNODEs:

   STATEMENTNODE **ppNext = &pBlock->pStatements;

   while (CurToken() != TID_ENDFILE)
   {
       if ((iClose != -1 && Mark() >= iClose) ||
           (iClose == -1 &&
            (CurToken() == TID_CLOSECURLY ||
             CurToken() == TID_FINALLY ||
             CurToken() == TID_CATCH)))
           break;

       if (CurToken() == TID_CLOSECURLY) {
           Eat (TID_OPENCURLY);
           NextToken();
           continue;
       }

       *ppNext = ParseStatement (pBlock);
       ppNext = &(*ppNext)->pNext;
       ASSERT (*ppNext == NULL);
   }

    
The ParseStatement function drives the parsing, calling each time the right function among:

   CallStatementParser
   ParseLabeledStatement
   ParseDeclarationStatement
   ParseExpressionStatement

    
So this second phase will finally parse the statements. The pertinent code is in the ParseInteriorNode function:

m_pParser->Rewind (pMethod->iCond);
pMethod->pBody = (BLOCKNODE *)m_pParser->ParseBlock (pMethod);
 

Now we can finally see what happens, and why a C# compiler works without forward declarations and without separating declaration and implementation: it is a two-phase parser. The first phase collects information about symbols (class and function names) and builds the symbol table.
The second phase uses the (already complete) symbol table for name resolution, then builds the complete parse tree.
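To see why the two phases remove the need for forward declarations, here is a toy sketch of mine (not Rotor code): the first pass only records which names exist, the second pass resolves references against the complete table, so a body can freely mention something declared later in the file.

#include <iostream>
#include <set>
#include <string>
#include <vector>

// Toy illustration (not Rotor code) of two-phase compilation.
// "Declarations" are just names; "bodies" are lists of names they reference.
struct Method
{
    std::string name;
    std::vector<std::string> calls;   // names referenced in the body
};

int main()
{
    std::vector<Method> program;
    Method a; a.name = "First";  a.calls.push_back("Second");  // calls a method declared later
    Method b; b.name = "Second";
    program.push_back(a);
    program.push_back(b);

    // Phase 1: collect every declared name into the symbol table; bodies are skipped.
    std::set<std::string> symbols;
    for (size_t i = 0; i < program.size(); ++i)
        symbols.insert(program[i].name);

    // Phase 2: now that the table is complete, resolve the names used inside each body.
    for (size_t i = 0; i < program.size(); ++i)
        for (size_t j = 0; j < program[i].calls.size(); ++j)
            if (symbols.count(program[i].calls[j]) == 0)
                std::cout << "error: unknown name " << program[i].calls[j] << '\n';
            else
                std::cout << program[i].name << " -> " << program[i].calls[j] << " resolved\n";

    return 0;
}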