# Saturday, 06 August 2016

Old school code writing (sort of)

As I mentioned in my previous post, online resources on Hosting are pretty scarce. 

Also, writing a Host for the CLR requires some in-depth knowledge of topics you do not usually see in your day-to-day programming, like IO completion ports. Same for AppDomains: there is plenty of documentation and resources compared to Hosting, but still, some of the more advanced features and the underlying mechanisms (how does it work? how does a thread interact with, and know about, AppDomains?) are not something you can find in a forum. 

Luckily, I have been coding for long enough to have a programming library at home. Also, I have always been the kind of guy who wants to know not only how to use stuff, but how it really works, so I had plenty of books on how the CLR (and Windows) work at a low level. All the books I am going to list were already in my library!

The first one, a mandatory read, THE book on hosting the CLR:



Then, a couple of books from Richter:

  

The first one is very famous. I have the third edition (in Italian! :) ) which used to be titled "Advanced Windows". It is THE reference for the Win32 API.
If you go anywhere near CreateProcess and CreateThread, you need to have and read this book.

The second one has a title which is a bit misleading. It is actually a "part 2" for the first one, focused on highly threaded, concurrent applications. It is the best explanation I have ever read of APCs and IO Completion Ports.

  

A couple of very good books on the CLR to understand Type Loading and AppDomains.
A "softer" read before digging into...

  

...the Internals. You need to know what a TEB is and how it works when you are chasing Threads as they cross AppDomains.
And you need all the insider knowledge you can get if you have to debug cross-thread, managed-unmanaged transitions, and bugs that span asynchronous calls. 

My edition of the first book is actually called "Inside Windows NT". It is the second edition of the same book, which described the internals of NT3.1 (which was, despite the name, the first Windows running on the NT kernel), and was originally authored by Helen Custer. Helen worked closely with Dave Cutler's original NT team. My edition covers NT4, but it is still valid today. Actually, it is kind of fun to see how things evolved over the years: you can really see the evolution, how things changed with the transition from 32 to 64 bits (which my edition already covers, NT4 used to run on 64 bit Alphas), and how they changed it for security reasons. But the foundations and concepts are there: evolution, not revolution.

  

And finally, two books that really helped me while writing replacements for the ITask API: the first one to tell me how it should work, the second one to show me how to look inside the SSCLI for the relevant parts (how and when the Hosting code is called).

Of course, I did not read all these books before setting to work! But I have read them over the years, and having them in my bookshelf provided a quick and valuable reference during the development of my host for Pumpkin.
This is one of the (few) times when I'm grateful to have learned to program "before Google", in the late '90s/early '00s. Reading a book was the only way to learn. It was slow, but it really fixed the concepts in my mind. 

Or maybe I was just younger :)


# Thursday, 04 August 2016

IL rewriting + AppDomain sandboxing + Hosting

So, in the end, what went into Pumpkin?

Is control performed at compilation time or at execution time? And if it is at execution time, using which technique?

In general, compilation has a big pro (you can immediately notify the snippet creator that they did something wrong, and even prevent the code block from becoming an executable snippet) and a big con (you control only the code that is written: what if the user code calls down some (legitimate) path in the BCL that results in an undesired behaviour?).

AppDomain sandboxing has some big pros (simple, designed with security in mind) and a big con (no "direct" way to control some resource usage, like thread time or CPU time).
Hosting has a big advantage (fine control over everything, including "third-party" assemblies like the BCL) which is also its big disadvantage (you HAVE to do everything yourself).

So each of them can handle the same issue with different efficacy. Consider the issue of controlling thread creation:
  • at compilation, you "catch" constructs that create a new thread (new Thread, Task.Factory.StartNew, ThreadPool.QueueUserWorkItem, ...)
    • you have to find all of them, and live with the code that creates a thread indirectly.
    • but you can do wonderful things, like intercepting calls to thread and sync primitives and substitute them - run them on your own scheduler!
  • at runtime, you:
    • (AppDomain) check periodically. Count new threads from the last check.
    • (hosting) you are notified of thread creation, so you monitor it.
    • (debugger) you are notified as well, and you can even suspend the user code immediately before/after.

Another example:
  • at compilation, you control which namespaces can be used (indirectly controlling the assembly)
  • at runtime you can control which assemblies are really loaded (you are either notified OR asked to load them - and you can prevent the loading)

What I ended up doing is to use a mix of techniques. 

In particular, I implemented some compiler checks.
Then, run the compiled IL in a separate AppDomain with a restricted PermissionSet (sandboxing).
Then, run all the managed code in a hosted CLR.

I am not crazy...
 

Guess who is using the same technique? (well, not compiler checks/rewriting, but AppDomain sandboxing + Hosting?)
A piece of software that has the same problem, i.e. running unknown pieces of code from different third parties in a reliable, efficient way: IIS.
There is very little information on the subject; it is not one of those things for which you have extensive documentation already available. Sure, MSDN has documented it (MSDN has documentation for everything, thankfully), but there is no tutorial, nor Q&As on the subject on StackOverflow. But the pieces of information you find in blogs and articles suggest that this technology is used in two Microsoft products: SQL Server, for which the Hosting API was created, and IIS.

Also, this is a POC, so one of the goals is to let me explore different ways of doing the same thing, and assess robustness and speed of execution. Testing different technologies is part of the game :)

Building barriers: compilation time VS execution time

In order to obtain what we want, i.e. fine grained resource control for our "snippets", we can act at two levels:

  • compilation time
  • execution time

Furthermore, we can control execution in three ways:

  1. AppDomain sandboxing: "classical" way, tested, good for security
  2. Hosting the CLR: greater control on resource allocation
  3. Execute in a debugger: even greater control on the executed program. Can be slower, can be complex

Let's examine all the alternatives.

Control at compilation time

Here, as I mentioned, the perfect choice would be to use the new (and open-source) C# compiler.

It separates the compilation phases cleanly, has a nice API, and can be used to recognize "unsafe" or undesired code, like unsafe blocks, pointers, creation of unwanted classes or calls to undesired methods.

Basically, the idea is to parse the program text into a SyntaxTree, extract the nodes matching some criteria (e.g. DeclarationModifiers.Unsafe, calls to File.Read, ...), and raise an error. Another possibility is to write a CSharpSyntaxRewriter that encapsulates (for diagnostics) or completely replaces some classes or methods.
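
As a rough sketch of the idea (not code used in Pumpkin, for the reasons explained right below; snippetSource is a placeholder, and a real check would bind symbols through the semantic model instead of comparing strings):

using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class SnippetChecker
{
    public static void Check(string snippetSource)
    {
        var tree = CSharpSyntaxTree.ParseText(snippetSource);
        var root = tree.GetRoot();

        // Reject unsafe blocks outright...
        if (root.DescendantNodes().OfType<UnsafeStatementSyntax>().Any())
            throw new Exception("unsafe code is not allowed");

        // ...and flag calls to blacklisted methods (File.Read, File.ReadAllText, ...).
        var forbiddenCall = root.DescendantNodes()
                                .OfType<InvocationExpressionSyntax>()
                                .FirstOrDefault(call => call.Expression.ToString().StartsWith("File."));
        if (forbiddenCall != null)
            throw new Exception("call to a forbidden method: " + forbiddenCall);
    }
}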

Unfortunately, Roslyn is not an option: StackOverflow's requirements prevent the usage of this new compiler. Why? Well, users may want to show a bug, or ask about a particular behaviour they are seeing, in version 1 of C# (no generics) or version 2 (no extension methods, no anonymous delegates, etc.). So, for the sake of fidelity, it is required that the snippet can be compiled with an older version of the compiler (and no, the /langversion switch is not really the same thing).

An alternative is to act at a lower level: IL bytecode. 
It is possible to compile the program, and then inspect the bytecode and even modify it. You can detect all the kind of unsafe code you do not want to execute (unsafe, pointers, ...), detect the usage of Types you do not want to load (e.g. through a whitelist), insert "probes" into the code to help you catch runaway code.

I'm definitely NOT thinking about "solving" the halting problem with some fancy new static analysis technique... :) Don't worry!

I'm talking about intercepting calls to "problematic" methods and wrap them. So for example:

static void ThreadMethod() {
   while (true) {
      new Thread(ThreadMethod).Start();
   }
}
This is a sort of fork bomb

(funny aside: I really coded a fork bomb once, 15 years ago. It was on an old Digital Alpha machine running Digital UNIX we had at the university. The problem was that the machine was used as a terminal server powering all the dumb terminals in the class, so bringing it down meant the whole class halted... whoops!)

After passing it through the IL analyser/transpiler, the method is rewritten (recompiled) to:


static void ThreadMethod() {
   while (true) {
      new Wrapped_Thread(ThreadMethod).Start();
   }
}

And in Wrapped_Thread.Start() you can add "probes", perform every check you need, and allow or disallow certain behaviours or patterns. For example, something like: 

if (Monitor[currentSnippet].ThreadCount > MAX_THREADS)
  throw new TooManyThreadException();

if (OtherConditionThatWeWantToEnforce)
  ...

innerThread.Start();


You intercept all the code that deals with threads and wrap it: thread creation, synchronization object creation (and waits), setting thread priorities... and replace it with wrappers that perform the checks before actually calling the original code.

You can even insert "probes" at predefined points: inside loops (when you parse a while, or a for, or (at IL level), before you jump), before functions calls (to have the ability to check execution status before recursion). These "probes" may be used to perform checks, to yield the thread quantum more often (Thread.Sleep(0)), and/or to check execution time, so you are sure snippets will not take the CPU all by themselves. 

An initial version of Pumpkin used this very approach. I used the great Cecil project from Mono/Xamarin. IL rewriting is not trivial, but at least Cecil makes it less cumbersome. This sub-project is also on GitHub as ManagedPumpkin.
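
As an illustration of how the substitution can be done with Cecil, here is a minimal sketch of mine (not the actual ManagedPumpkin code; Wrapped_Thread is the wrapper discussed above, and nested types are skipped for brevity):

using System.Linq;
using System.Threading;
using Mono.Cecil;
using Mono.Cecil.Cil;

public static class ThreadRewriter
{
    public static void Rewrite(string inputPath, string outputPath)
    {
        var assembly = AssemblyDefinition.ReadAssembly(inputPath);
        var module = assembly.MainModule;

        // Reference to the wrapper constructor we want to call instead of Thread's.
        var wrappedCtor = module.ImportReference(
            typeof(Wrapped_Thread).GetConstructor(new[] { typeof(ThreadStart) }));

        foreach (var method in module.Types.SelectMany(t => t.Methods).Where(m => m.HasBody))
        {
            var il = method.Body.GetILProcessor();
            foreach (var instruction in method.Body.Instructions.ToList())
            {
                var ctor = instruction.Operand as MethodReference;
                if (instruction.OpCode == OpCodes.Newobj && ctor != null &&
                    ctor.DeclaringType.FullName == "System.Threading.Thread")
                {
                    // Swap "new Thread(...)" for "new Wrapped_Thread(...)".
                    il.Replace(instruction, il.Create(OpCodes.Newobj, wrappedCtor));
                }
            }
        }

        assembly.Write(outputPath);
    }
}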

And obviously, whatever solution we choose, we do not let the user change thread priorities: we may even run all the snippets in threads with *lower* priority, so the "snippet" manager/supervisor classes are always guaranteed to run.

Control at execution time

Let's start with the basics: AppDomain sandboxing is the bare minimum. We want to run the snippets in a separate AppDomain, with a custom PermissionSet. Possibly starting with an almost empty one. 

Why? Because AppDomains are a unit of isolation in the .NET CLI used to control the scope of execution and resource ownership. It is already there, with the explicit mission of isolating "questionable" assemblies into "partially trusted" AppDomains. You can select from a set of well-known permissions or customize them as appropriate. Sometimes you will hear this approach referred to as sandboxing.

There are plenty of examples of how to do that, and it should be simple to implement (see, for example, the PTRunner project).
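
For reference, a minimal sketch of that kind of sandbox, loosely following the shape of the MSDN "run partially trusted code in a sandbox" example (SnippetRunner and the path/assembly names are placeholders of mine):

using System;
using System.Reflection;
using System.Security;
using System.Security.Permissions;
using System.Security.Policy;

// Runs inside the sandboxed domain; MarshalByRefObject so it can be called across domains.
public class SnippetRunner : MarshalByRefObject
{
    public void Run(string snippetAssemblyName)
    {
        // Resolved from the sandbox's ApplicationBase; assumes a static Main(string[]) entry point.
        var snippet = Assembly.Load(snippetAssemblyName);
        snippet.EntryPoint.Invoke(null, new object[] { new string[0] });
    }
}

public static class Sandbox
{
    public static void RunSnippet(string snippetDirectory, string snippetAssemblyName)
    {
        // Grant only bare execution rights: no file system, no network, no UI.
        var permissions = new PermissionSet(PermissionState.None);
        permissions.AddPermission(new SecurityPermission(SecurityPermissionFlag.Execution));

        // Assumes the host assembly is strong-named, so it can run fully trusted in the sandbox.
        var fullTrust = typeof(SnippetRunner).Assembly.Evidence.GetHostEvidence<StrongName>();

        var setup = new AppDomainSetup { ApplicationBase = snippetDirectory };
        var sandbox = AppDomain.CreateDomain("SnippetSandbox", null, setup, permissions, fullTrust);

        // Create the runner inside the sandbox and call it through the cross-domain proxy.
        var handle = Activator.CreateInstanceFrom(
            sandbox,
            typeof(SnippetRunner).Assembly.ManifestModule.FullyQualifiedName,
            typeof(SnippetRunner).FullName);
        var runner = (SnippetRunner)handle.Unwrap();
        runner.Run(snippetAssemblyName);

        AppDomain.Unload(sandbox);
    }
}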

AppDomain sandboxing helps with the security aspect, but can do little about resource control. For that, we should look into some form of CLR hosting.

Hosting the CLR

"Hosting" the CLR means running it inside an executable, which is notified of several events and acts as a proxy between the managed code and the unmanaged runtime for some aspects of the execution. It can actually be done in two ways:

1. "Proper" hosting of the CLR, like ASP.NET and SQL Server do

Looking at what you can control through the hosting interfaces, you see that, for example, you can control and replace all the native implementations of "task-related" (thread) functions.
It MAY seem overkill, but it gives you complete control. For example, there was a time (a beta of CLR v2, IIRC) when it was possible to run the CLR on fibers instead of threads. This was dropped, but it gives you an idea of the level of control that can be obtained.

2. Hosting through the CLR Profiling API (link1, link2)

You can monitor (and DO!) a lot of things with it: I used it in the past to do on-the-fly IL rewriting (you are notified when a method is about to be JIT-ed, and you can modify the IL stream before the JIT runs). A past project of mine used it for a similar thing, monitoring thread synchronization... I should have talked about it on this blog years ago!

In particular, you can intercept all kinds of events related to memory usage, CPU usage, thread creation, assembly loading, ... (it is a profiler, after all!).
A hypothetical snippet manager running alongside the profiler (which you control, as it is part of your own executable) can then use a set of policies to say "enough!" and terminate the offending snippet's threads.

Debugging

Another project I did in the past involved using the managed debugging API to run code step-by-step.

This gives you plenty of control, even if you do not do step-by-step execution: you can make the debuggee "break into" the debugger at thread creation, exit, ... And you can issue a "break" at any time, effectively gaining complete control over the debugged process (after all, you are a debugger: it is your raison d'être to inspect running code). This can be done at regular intervals, preventing resource depletion by the snippet.

# Saturday, 30 July 2016

Pumpkin: a .NET "container"

More or less 20 months ago (gosh, time flies!) I started a side-project for a very famous company (StackExchange).
StackExchange had just launched, a few months before, a new feature on the main site of their network (StackOverflow).
This feature is called "Code Snippets", and it allows you to embed some sample HTML + JS code in a Question, or an Answer, and let the visitors of the page run it.
Of course, being JS, it runs inside the browser, and with a few focused precautions it can be made safe for both servers and clients (you do not want to leave an attack vector open on your servers, but you don't want your visitors to be attacked/exploited either!).

More details on how they implemented and safeguarded it can be found on their meta.stackoverflow.com site.

The feature got a lot of attention, and of course there were requests to extend it to other languages.
I was one of those that wanted more languages added, but I understood that JS was a very particular case.
Snippets in any other language would have meant an entirely different approach and an entirely different scale of complexity.

In October 2014 I visited NYC; before my visit I got in touch with David Fullerton, the "big boss" of the SO development team. We had been in touch since my previous "adventure", a few years before, when I interviewed for a position on their Q&A team. We chatted briefly about my past interview, and then he asked me a very interesting question: what would I add to StackOverflow? C# snippets immediately came to my mind.

We discussed it briefly, drafted up some ideas, added requirements in the process (discarding most of the ideas) and finally David asked if I would like to try it out, as an Open Source experiment sponsored by StackExchange.

... well, of course! Fun and challenging software, exchanging ideas with some of the most brilliant devs in the .NET ecosystem, and I get paid too! :)

So "Pumpink" was born. If you are curious, you can find it on my GitHub. It already contains a rather in-depth analysis about the structure of the project, and how it works.

Or, if you want to know "why?" instead of simply "how?", you can wait for the upcoming blog posts, in which I will detail some of the choices, problems and headaches that shaped the project.

# Monday, 15 December 2014

Strangest bug of the day

System.TypeLoadException:
Could not load type 'System.Diagnostic.Debug'
from assembly 'MyAssembly'

Uhm... Hello?
Why are you looking for 'System.Diagnostic.Debug' inside my assembly? Go look inside System!

To be fair, I was fiddling with the CLR Hosting APIs, in particular IHostAssemblyStore.
Using this interface, you can override the CLR assembly probing logic, and load your own assemblies from wherever you want.
This is particularly helpful if you are using your own AppDomainManager, and you want to load it inside your sandboxed domain, which you carefully configured to disallow loading of any external assembly.
If you try to do that, binding fails, obviously. And since the AppDomainManager is created and loaded by the unmanaged portion of the CLR, you don't get a chance to use AssemblyResolve (I tried; it didn't work).

But that does not really justify this behavior: the Fusion log viewer was reporting no errors, and the exception was wrong (or, at least, not one of the usual exceptions you should look for when debugging binding failures).

There is a tiny parameter, called pAssemblyId. The docs are quite clear: it is used internally to see if that assembly was already loaded.
If it is, the CLR does not try to load it again. Sounds like a good optimization, BUT:

The docs don't say it, but if you pass back "0", something very peculiar happens.
In my case, I had the JITter looking for types inside my assembly. Others had different issues.

That was definitely the strangest bug of the day :)

# Sunday, 12 January 2014

Integration of Razor and AngularJS (2)

I found the technique I showed in my previous post particularly useful when you need to "migrate" some fields from server-side to client-side. By "migrate" I mean: make fields which were initially only needed on the server side, for page rendering, available on the client-side, in order to access and manipulate them through JavaScript too (for user interaction through Angular, in this case).

Suppose you have a page to display some person details:

@model PersonVM
...
<h1>@Model.Name</h1>
<span>@Model.Description</span>

All is well, until you decide that you want to provide a way to edit the description. You do so by adding an input field (for example), plus an "edit" button:

<button ng-click="editing = true" ... >Edit</button>
<input ng-model="description" ng-show="editing" ... >
<span>{{description}}</span>

This modification requires you to move the person's description from the server-side ViewModel to the client-side $scope. That means changing the controller code as well: Razor does not need @Model.Description anymore, but you need to pass it (through JSON) to $scope.description:

$http.get("/Person/DetailsData" + $scope.personId).
            success(function (data, status) {
                $scope.description = data. description;

This is not bad, and the code still stays readable and compact using the "pattern" I described a couple of posts ago. But I'd rather not touch the controller at all.
Using the couple of directives I wrote, it is as simple as:

<button ng-click="editing = true" ... >Edit</button>
<input ng-model="description" ng-show="editing" value='@Model.Description' ... >
<span>{{description}}</span>

or

<button ng-click="editing = true" ... >Edit</button>
<input ng-model="description" ng-show="editing" ... >
<span ng-model="description" ng-content>@Model.Description</span>

No need for an additional $http request, nor for another controller action.
The "value" attibute on the input, or the element text of the span, will be written out by Razor as before; the new directives will use the content of the value attribute (or the element text) to initialize $scope.description. And without the need to do another round-trip to the server!
You don't even need to change the controller and split model fields between
ViewResult Person::Details 

and

JsonResult Person::DetailsData.

# Wednesday, 08 January 2014

Mixing AngularJS and ASP.NET MVC

MiniPoint is a web application created using a mix of AngularJS and MVC.

In the past I have used JavaScript libraries (in particular, jQuery) in conjunction with other web frameworks (ASP.NET pages, mainly).
Before beginning to work on MiniPoint, more or less 5 months ago, I needed to create a very simple example application to show how an external developer could use our OAuth provider for authentication.

A former colleague pointed me to AngularJS, and I was very impressed by it.

Let me put it straight: I like jQuery, and I think it's fantastic for two reasons: it just works everywhere, taking upon itself the burden of cross-browser scripting, and it lets you work with your existing web pages and improve them, significantly and progressively.
But for complex web applications, AngularJS is just.. so much cleaner!

The philosophy is very different (you should read this excellent answer if you have not read it yet); as highlighted in it, you don't design your page, and then change it with DOM manipulations (that's the jQuery way). You use the page to tell what you want to accomplish. It's much closer to what you would do in XAML, for example, supporting very well the MVVM concept.
This difference between jQuery and AngularJS actually reminds me of WinForms (or old Java/Android) programming VS. WPF: the approach of the former is to build the UI (often with a designer) and then change it through code. The latter leverages the power of data-binding to declare in the view what you want to accomplish, what is supposed to happen.
The view directly presents the intent.

The first AngularJS application I created was a pure AngularJS application: all the views were "static" (served by the web server, not generated) and all the code, all the behaviour (routes, controllers, ...) was in AngularJS and in a bunch of REST services I built with ServiceStack. AngularJS and ServiceStack seem made for each other: the approach was very clean, and it works really well if you have a rich SPA (Single Page Application). It is different from what I was used to, and I needed some time to wrap my head around it (I kept wishing I could control my views, my content, on the server).

So, for the next, bigger problem I said "well, let's have the best of both worlds: server-side generated (Razor) views with angular controllers, to have more control over the content; MVC controllers to serve the Views and their content".

Seems easy, but it has a couple of issues. jQuery is great for pages produced by something else (ASP.NET), where you design a page, and then you make it dynamic using jQuery. This is because jQuery was designed for augmentation and has grown incredibly in that direction. AngularJS excels in building JavaScript applications, entirely in angular.

It is possible to integrate AngularJS with MVC, of course, but you have different choices of how to pass, or better transform, data from an MVC Controller and the (View)Model it generates to the AngularJS controller and its "viewmodel" (the $scope). Choosing the right one is not always easy.

In plain MVC (producing a page with no client-side dynamics) you have a request coming in and (through routing) triggering an action (a method) on a Controller. The Controller then builds a Model (the ViewBag, or a typed (View)Model), selects a View and passes both the (View)Model and the View to a ViewEngine. The ViewEngine (Razor) uses the View as a template, and fills it with the data found in the (View)Model. The resulting html page is sent back to the client.

Why am I talking about a (View)Model, instead of just a plain Model? Because this is what I usually end up creating for all but the simplest Views. The data model, which holds the so-called "business objects", the ones you are going to persist in your database, is often different from what you are going to render on a page. What you want to show on a page is often the combination of two (or more) objects from the data model. There is a whole pattern built on this concept: MVVM. The pattern is widespread in frameworks with two-way binding mechanisms (Knockout.js, WPF, Silverlight); many (me included) find it beneficial even in frameworks like MVC; after all, the ViewBag is exactly that: a bag of data, drawn from your business objects, needed by the ViewEngine to correctly render a View.

However, instead of passing data through an opaque, untyped ViewBag, it is a good practice to build a class containing exactly the fields/data needed by the View (or, better, by Razor to build the View).

If you add AngularJS to the picture, you have to pass data not only to Razor, but to AngularJS as well. How can you do it? There are a couple of options:

Razor inside the script

<script>
   ...
   var id = @(Model.Id);
   ...
</script>

This approach works, of course: any data you need to share between the server and the client can be written directly in the script, by embedding it in the (cs)html page and generating the script with the page. I do not like this approach for a number of reasons, above all caching and clear separation of code (JavaScript) and view (html).

The url

I wanted to keep my controller code in a separate .js file, so I discarded this option. The next place where I looked was the url; after all, it has been used for ages to pass information from the client to the server. For a start, I needed to pass a single ID, a tiny little number, to angular. I already had this tiny number on the URL, as ASP.NET routing passes it to a Controller action (as a parameter) to identify a resource. As an example, suppose we have a list of persons, and we want the details relative to a single person with ID 3. ASP.NET routing expects a URL like:

http://localhost/Person/Details/3

This maps "automatically" (or better, by convention) to:

ActionResult PersonController::Details(int id)

If I want to get the same number in my angular controller, I could just get the very same url using the location service, and parse it like MVC routing does:

var id = $location.path().substring($location.path().lastIndexOf("/") + 1);

But it's kind of ugly, and I find it "hack-ish" to do it in this way, so I kept looking.

Razor inside the page (ng-init)

A better alternative is to use Razor to write "something" on the page, and then read it from the client script. You can use hidden fields, custom data- attributes and so on; fortunately, angular already provides an attribute for this purpose, ng-init.

<div xmlns:ng="http://angularjs.org" id="ng-app" class="ng-app:MiniModule" ng-app="MiniModule">
    <div class="row-fluid" ng-controller="PersonDetailsController" ng-init="personId=@Model.Id">

Angular injects it into the scope during initialization, so you can refer to it as $scope.personId.

Ajax

Finally, one of the most common ways to transfer data from the server to a script is through ajax calls. AngularJS has a great service ($http), very simple and powerful:

$http.get("/Person/GroupData/").
        success(function (data, status) {
            $scope.data = data;
        }).error(function (data, status) {
            $scope.alerts.push(data);
        });

On the server side, there is a

JsonResult PersonController::GroupData()

method which returns a JsonResult, encapsulating a Json object.
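
For completeness, a minimal sketch of what such an action could look like (GroupDTO is a placeholder of mine); note that Json() needs JsonRequestBehavior.AllowGet to answer GET requests:

using System.Web.Mvc;

public class GroupDTO
{
    public string Name { get; set; }
    public int MemberCount { get; set; }
}

public class PersonController : Controller
{
    public JsonResult GroupData()
    {
        // Build the DTO from the data model (details omitted).
        var data = new GroupDTO { Name = "Sample group", MemberCount = 3 };
        return Json(data, JsonRequestBehavior.AllowGet);
    }
}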


Mixing up

It is not convenient to use ng-init for large objects, or for many objects. On the other hand, you need a practical way to pass around an ID, to use ajax on resources that require it (like http://localhost/Person/Details/3).

The most sensible approach, which I ended up using, seems to be to use ng-init to pass around the id, and ajax to actually retrieve the data. In the current implementation of MiniPoint it seems to work quite well.
In general, when I have a resource (like Person) and I want to show and edit information and details linked to it, I have:
  • an object model (Person)
  • a ViewModel (PersonVM) which is populated in controller actions and passed to the View:

ActionResult PersonController::Details(int id) {
   ...
   return View(new PersonVM { ... });
}

@model PersonVM
...

<div xmlns:ng="http://angularjs.org" id="ng-app" class="ng-app:MiniModule" ng-app="MiniModule">
    <div class="row-fluid" ng-controller="PersonDetailsController" ng-init="personId=@Model.Id">
...

  • a Person data transfer object (PersonDTO) which is requested by the angular controller, populated by a "Data" controller action and then returned as JSON to that controller:

    // Defer ajax call to let ng-init assign the correct values
    $scope.$evalAsync(function () {
        $http.get("/Person/DetailsData/" + $scope.personId).
            success(function (data, status) {
                $scope.data = data;
                // ...
            }).error(function (data, status) {
                // error handling
            });
    });


JsonResult PersonController::DetailsData(int id) {
   ...
   return Json(new PersonDTO { ... });
}

# Wednesday, 18 September 2013

Where have I been?

I really don't know if anybody besides my friends and coworkers reads this blog, but... after some entries in the first half of this year, everything went silent.
The reason is simple: I was working, and in my free time... learning and working again.

The project I mentioned (the mini SharePoint-like replica) got momentum and a deadline. I am really enjoying working on it; I am using and putting together a lot of great techniques and frameworks. Some of them were already known to me (WF4 for workflow management, GPPG/Gplex for parser generation, ASP.NET MVC for web applications...) but some were really new (AngularJS... wow, I do like this framework!, and Razor... which is a joy to work with). It is tiresome to work, then get home (using my commuting time to read and learn) and then work again. Fortunately, my wife is super-supportive!

That, and my "regular" day-to-day work were we are also putting together some new and exciting applications. The most interesting, a "framework" for authentication and authorization from third-parties that uses a mix of OAuth, NFC and "custom glue" to authorize access to third-party applications without having to enter username-password (we use our smart cards as credentials).
Think about an open, public environment (a booth at an expo, or an one of the outside events for which our regions is famous, like the Christmas Market)



You have a booth where you are offering some great service subscription. You want people to stop by and subscribe to your services, but you do not want to bother them for too long, so filling out forms is out of the question; you do not even want them to sign in on a keyboard (try to make people enter a password at -5C... it's not easy).
I coded a simple prototype, an Android app that uses NFC to read the AltoAdige Pass that every traveller has (or should have :) ) as a set of credentials for authorization against our OAuth provider. The third-party app requests an access token, users are redirected to our domain and "log in" using the card, by waving it in front of an Android tablet. The process is secure (it involves the card processor and keys, mutual authentication, etc.) but easy and fast. They see the permissions and confirm (again by waving the card when the tablet asks them).

For now it is only a prototype, but... I find it interesting when pieces of (cool) technology fall together to produce something that is easy to use and actually makes people's lives a little bit easier.
With so many balls to juggle, time for blogging is... naught. But I will be back with more news on my projects, as soon as I have some time.

# Monday, 24 June 2013

A "Mini" workflow engine, part 4.1: from delegates to patterns

Today I will write just a short post, to wrap-up my considerations about how to represent bookmarks.
We have seen that the key to represent episodic execution of reactive programs is to save "continuations", marking points in the program text where to suspend (and resume) execution, together with program state.

We have seen two ways of represent bookmarks in C#: using delegates/Func<> objects, and using iterators.

Before going further, I would like to introduce a third way, a variation of the "Func<> object" solution.
Instead of persisting delegates (which I find kind of interesting and clever, but limiting) we turn to an old, plain pattern-based solution.

A pattern which recently gained a lot of attention is CQRS (Command Query Responsibility Segregation). It is a pattern that helps to simplify the design of distributed systems, and helps scaling out. It is particularly helpful in cloud environments (my team at MSR in Trento used it for our cloud project).

Having recently used it, I remembered how CQRS (like many other architectural level patterns) is built upon other patterns; in particular, I started thinking about the Command pattern (the "first half" of CQRS).
It looks like the Command pattern could fit quite well with our model; for example, we can turn the ReadLine activity:

[Serializable]
public class ReadLine : Activity
{
    public OutArgument<string> Text = new OutArgument<string>();

    protected override ActivityExecutionStatus Execute(WorkflowContext context)
    {
        context.CreateBookmark(this.Name, this.ContinueAt);
        return ActivityExecutionStatus.Executing;
    }

    void ContinueAt(WorkflowContext context, object value)
    {
        this.Text.Value = (string)value;
        CloseActivity(context);
    }
}
Into the following Command:

/* The Receiver class */
public class ReadLine : Activity
{
    public OutArgument<string> Text = new OutArgument<string>();

    protected override ActivityExecutionStatus Execute(WorkflowContext context)
    {
        context.CreateBookmark(this.Name, new CompleteReadLine(this));
        return ActivityExecutionStatus.Executing;
    }

    internal void ContinueAt(WorkflowContext context, object value)
    {
        this.Text.Value = (string)value;
        CloseActivity(context);
    }
}

public interface ICommand
{
    void Execute(WorkflowContext context);
}

/* The Command for starting reading a line - ConcreteCommand #1 */
[Serializable]
public class StartReadLine : ICommand
{
    private readonly ReadLine readLine;

    public StartReadLine(ReadLine readLine)
    {
        this.readLine = readLine;
    }

    public void Execute(WorkflowContext context)
    {
        readLine.Status = readLine.Execute(context);
    }
}

/* The Command for finishing reading a line - ConcreteCommand #2 */
[Serializable]
public class CompleteReadLine : ICommand
{
    private readonly ReadLine readLine;

    public CompleteReadLine(ReadLine readLine)
    {
        this.readLine = readLine;
    }

    public object Payload { get; set; }

    public void Execute(WorkflowContext context)
    {
        readLine.ContinueAt(context, Payload);
        readLine.Status = ActivityExecutionStatus.Closed;
    }
}
The idea is to substitute the "ContinueAt" delegates with Command objects, which will be created and enqueued to represent continuations.
The execution queue will then call the Execute method, instead of invoking the delegate, and will inspect the activity status accordingly (not shown here).
Why use a pattern, instead of making both Execute and ContinueAt abstract and scheduling the Activity itself on the execution queue? To maintain the same flexibility: in this way, we can have ContinueAt1 and ContinueAt2, wrapped by different Command objects, and each of them can be scheduled based on the outcome of Execute.
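
A minimal sketch of that execution queue (my assumption of its shape, since it is not shown here; WorkflowContext and ICommand are the types defined above):

using System.Collections.Generic;

public class CommandScheduler
{
    private readonly Queue<ICommand> commands = new Queue<ICommand>();
    private readonly WorkflowContext context;

    public CommandScheduler(WorkflowContext context)
    {
        this.context = context;
    }

    public void Post(ICommand command)
    {
        commands.Enqueue(command);
    }

    public void Run()
    {
        while (commands.Count > 0)
        {
            var command = commands.Dequeue();
            command.Execute(context);   // instead of invoking a ContinueAt delegate
        }
    }
}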

Using objects will also help with serialization: while Func<> objects and delegates are tightly coupled with the framework internals and therefore, as we have seen, require BinarySerialization, plain objects may be serialized using more flexible and/or higher-performance methods (like XmlSerialization or protobuf-net).

Yes, it is less immediate, more cumbersome, less elegant. But it is also a more flexible, more "portable" solution.
Next time: are we done with execution? Not quite.. we are still missing repeated execution of activities! (Yes, loops and gotos!)

# Monday, 10 June 2013

A "Mini" workflow engine. Part 4: "out-of-order" composite activities, internal bookmarks, and execution.

Last time, we have seen how a sequence of activities works with internal bookmarks to schedule the execution of the next step, introducing a queue to hold these bookmarks.
We left with a question: what if execution is not sequential?
In fact, one of the advantages of an extensible workflow framework is the ability to define custom statements; we are not limited to Sequence, IfThenElse, While, and so on but we can invent our own, like Parallel:

public class Parallel : CompositeActivity
{
    protected internal override ActivityExecutionStatus Execute(WorkflowInstanceContext context) {
        foreach (var activity in children)        
           context.RunProgramStatement(activity, this.ContinueAt);       
        return ActivityExecutionStatus.Executing;
    }

    private void ContinueAt(WorkflowInstanceContext context, object arg) {
        foreach (var activity in children)        
            // Check that every activity is closed...
            if (activity.ExecutionStatus != ActivityExecutionStatus.Closed)
                return;
        
         CloseActivity(context);
     }
}

And we can even have different behaviours here (like speculative execution: try several alternatives, choose the one that completes first).

Clearly, a queue does not work with such a statement. We can again use delegates to solve this issue, using a pattern that is used in several other scenarios (e.g. forms management).
When a composite activity schedules a child activity for execution, it subscribes to the activity, to get notified of its completion (when the activity will finish and become "closed"). Upon completion, it can then resume execution (the "internal bookmark" mechanism).

child.Close += this.ContinueAt;
Basically, we are moving the bookmark and (above all) its continuation to the activity itself. It will be the role of CloseActivity to find the delegate and fire it. This is exactly how WF works...

context.CloseActivity() {
  ...
  if (currentExecutingActivity.Close != null)
     currentExecutingActivity.Close(this, null);

}
... and it is the reason why WF needs to keep explicit track of the current executing activity.

Personally, I find the "currentExecutingActivity" part a little unnerving. I would rather say explicitly which activity is closing:

context.CloseActivity(this);
Moreover, I would enforce a call to CloseActivity with the correct argument (this)

public class Activity {
   private void CloseActivity() {
      context.CloseActivity(this); // internal
   }
}
Much better!

Now we have almost everything sorted out; the only piece missing is recursion. Remember: when we call the continuation directly, we end up with recursive behaviour. This may (or may not) be a problem; but what can we do to make the two pieces really independent?

We can introduce a simple execution queue:

public class ExecutionQueue
{
    private readonly BlockingCollection<Action> workItemQueue = new BlockingCollection<Action>();

    public void Post(Action workItem) {
           ...
    }
}
Now, each time we have a continuation (from an internal or external bookmark), we post it to the execution queue. This makes the code for RunProgramStatement much simpler:

internal void RunProgramStatement(Activity activity, Action<WorkflowContext, object> continueAt)
{
    logger.Debug("Context::RunProgramStatement");

    // Add the "bookmark"
    activity.Closed = continueAt;

    // Execute the activity
    Debug.Assert(activity.ExecutionStatus == ActivityExecutionStatus.Initialized);
    ExecuteActivity(activity);
} 
As you may have noticed, we are giving away a little optimization; a direct call when the activity completes immediately would be much faster.
Also, why would we need a queue? We could do just fine using Tasks:

Task.Factory.
    StartNew(() => activity.Execute(this)).
    ContinueWith((result) =>
    {
        // The activity already completed?
        if (result.Result == ActivityExecutionStatus.Closed)
            continueAt(this, null);
        else
        {
            // if it didn't complete, an external bookmark was created
            // and will be resumed. When this will happen, be ready!
            activity.OnClose = continueAt;
        }
    });


internal void ResumeBookmark(string bookmarkName, object payload) {            
    ...
    Task.Factory.
        StartNew(() => bookmark.ContinueAt(this, payload));
}

internal void CloseActivity(Activity activity) {
    if (activity.OnClose != null)
    Task.Factory.
        StartNew(() => activity.OnClose(this, null));
}
Which makes things nicer (and potentially parallel too). Potentially, because in a sequential composite activity like Sequence, a new Task will be fired only upon completion of another. Things for the Parallel activity, however, will be different. We will need to be careful with shared state and input and output arguments though, as we will see in a future post.

Are there other alternatives to a (single threaded) execution queue or to Tasks?

Well, every time I see this kind of pause-and-resume-from-here kind of execution, I see an IEnumerable. For example, I was really fond of the Microsoft CCR and how it used IEnumerable to make concurrent, data-flow style asynchronous execution easier to read:

    int totalSum = 0;
    var portInt = new Port<int>();

    // using CCR iterators we can write traditional loops
    // and still yield to asynchronous I/O !
    for (int i = 0; i < 10; i++)
    {
        // post current iteration value to a port
        portInt.Post(i);

        // yield until the receive is satisfied. No thread is blocked
        yield return portInt.Receive();               
        // retrieve value using simple assignment
        totalSum += portInt;
    }
    Console.WriteLine("Total:" + totalSum);

(Example code from MSDN)
I used the CCR, and I have to say that while it is initially awkward, it is a style that grows on you; it is very efficient, and it was a way of representing asynchronous execution with a "sequential" look, well before async/await.

The code for ReadLine would become much clearer:

public class ReadLine : Activity
{
    public OutArgument<string> Text = new OutArgument<string>();       

    protected override IEnumerator<Bookmark> Execute(WorkflowInstanceContext context, object value)
    {
        // We need to wait? We just yield control to our "executor"
        yield return new Bookmark(this);

        this.Text.Value = (string)value;

        // Finish!
        yield break;
        // This is like returning ActivityExecutionStatus.Close,
        // or calling context.CloseActivity();            
    }
}
Code for composite activities like sequence could be even cleaner:

protected internal override IEnumerator<Bookmark> Execute(WorkflowInstanceContext context)
{
    // Empty statement block
    if (children.Count == 0)
    {
        yield break;
    }
    else
    {
        foreach (var child in children)
        {
            foreach (var bookmark in context.RunProgramStatement(child))
                yield return bookmark;
        }                
    }
}

Or, with some LINQ goodness:

return children.SelectMany(child => context.RunProgramStatement(child));
The IEnumerable/yield pair has a very simple, clever implementation; it is basically a compiler-generated state machine. The compiler transforms your class to include scaffolding and memorize in which state the code is, and jumps execution to the right code when the method is invoked.

But remember, our purpose is exactly to save and persist this information: will this state machine be serialized as well?
According to this StackOverflow question, the answer seems to be positive, with the appropriate precautions; also, it seems that the idea is nothing new (actually, it is 5 years old!)
http://dotnet.agilekiwi.com/blog/2007/05/implementing-workflow-with-persistent.html

...and it also seems that the general thought is that you cannot do it reliably.

In fact, I would say you can: even if you do not consider that 5 years have passed and the implementation of yield is still the same, the CLR will not break existing code. At the IL level, the CLR just sees some classes, one (or more) variables that hold the state, and a method with conditions based on those variables. It is not worse (or better) than serializing delegates, something we are already doing.

The iterator-based workflow code sits in a branch of my local git, and I will give it a more serious try in the future; for now, let's play on the safe side and stick to the Task-based execution.

Until now, we mimicked quite closely the design choices of WF; I personally like the approach based on continuations, as I really like the code-as-data philosophy. However the serialization of a delegate can hardly be seen as "code-as-data": the code is there, in the assembly; it's only status information (the closure) that is serialized, and the problem is that you have little or no control over these compiler generated classes.

Therefore, even without IEnumerable, serialization of our MiniWorkflow looks problematic: at the moment, it is tied to BinarySerializer. I want to break free from this dependency, so I will move towards a design without delegates.

Next time: bookmarks without delegates

# Friday, 07 June 2013

A "Mini" workflow engine. Part 3: execution of sequences and internal bookmarks

A couple of posts ago, I introduced a basic ("Mini") workflow engine.
The set of classes used an execution model that uses bookmarks and continuations, and I mentioned that it may have problems.

Let's recap some basic points:
  • Activities are like "statements", basic building blocks.
  • Each activity has an "Execute" method, which can either complete immediately (returning a "Closed" status),
// CloseActivity can be made optional, we will see how and why in a later post
context.CloseActivity();
return ActivityExecutionStatus.Closed;

or create a bookmark. A bookmark has a name and a continuation, and it tells the runtime "I am not done, I need to wait for something. When that something is ready, call me back here (the continuation)". In this case, the Activity returns an "Executing" status.
  • There are two types of bookmarks: external and internal.
    External bookmarks are the "normal" ones:
context.CreateBookmark(this.Name, this.ContinueAt);
return ActivityExecutionStatus.Executing;
They are created when the activity needs to wait for an external stimulus (like user input from the command line, the completion of a service request, ...).
Internal bookmarks are used to chain activities: when the something "I need to wait for" is another activity, the activity request the runtime to execute it and call back when done:
context.RunProgramStatement(statements[0], ContinueAt);
return ActivityExecutionStatus.Executing;

In our first implementation, external bookmarks are really created, held into memory and serialized if needed:

public void CreateBookmark(string name, Action<WorkflowInstanceContext, object> continuation)
{
    bookmarks.Add(name, new Bookmark
        {
            ContinueAt = continuation,
            Name = name,
            ActivityExecutionContext = this
        });
}
From there, the runtime is able to resume a bookmark when an event that "unblocks" the continuation finally happens
(notice that WF uses queues, instead of single-slot bookmarks, which is quite handy in many cases - but for now, this will do just right):

internal void ResumeBookmark(string bookmarkName, object payload)
{            
    var bookmark = bookmarks[bookmarkName];
    bookmarks.Remove(bookmarkName);
    bookmark.ContinueAt(this, payload);
}
Internal bookmarks instead are created only if needed:

internal void RunProgramStatement(Activity activity, Action<WorkflowInstanceContext, object> continueAt)
{
    var result = activity.Execute(this);
    // The activity already completed?
    if (result == ActivityExecutionStatus.Closed)
        continueAt(this, null);
    else
    {
        // Save for later...
        InternalBookmark = new Bookmark
        {
            ContinueAt = continueAt,
            Name = "",
            ActivityExecutionContext = this
        };
    }
}
But who resumes these bookmarks?
CreateBookmark - ResumeBookmark are nicely paired, but RunProgramStatement seems not to have a correspondence.

When do we need to resume an internal bookmark? When the activity passed to RunProgramStatement completes. This seems like a perfect job for CloseActivity!
public void CloseActivity()
{
    logger.Debug("Context::CloseActivity");
    // Someone just completed an activity.
    // Do we need to resume something?
    if (InternalBookmark != null)
    {
        logger.Debug("Context: resuming internal bookmark");
        var continuation = InternalBookmark.ContinueAt;
        var context = InternalBookmark.ActivityExecutionContext;
        var value = InternalBookmark.Payload;
        InternalBookmark = null;
        continuation(context, value);
    }
}

However, this approach has two problems.
The first one is not really a problem, but we need to be aware of it: if the activities chained by internal bookmarks terminate immediately, the execution of our "sequence" of activities results in recursion:

public class Print : Activity
{
    protected override ActivityExecutionStatus Execute(WorkflowInstanceContext context)
    {
        Console.WriteLine("Hello");
        context.CloseActivity();
        return ActivityExecutionStatus.Closed;
    }
}

var sequence = new Sequence();
var p1 = new Print();
var p2 = new Print();
var p3 = new Print();
sequence.Statements.Add(p1);
sequence.Statements.Add(p2);
sequence.Statements.Add(p3);
Its execution will go like:

sequence.Execute
-context.RunProgramStatement
--p1.Execute
--sequence.ContinueAt
---context.RunProgramStatement
----p2.Execute
----sequence.ContinueAt
-----context.RunProgramStatement
------p3.Execute
------sequence.ContinueAt
-------context.CloseActivity (do nothing)
The second problem is more serious; let's consider this scenario:

var s1 = new Sequence();
var p1 = new Print();
var s2 = new Sequence();
var r2 = new Read();
var p3 = new Print();
s1.Statements.Add(p1);
s1.Statements.Add(s2);
s2.Statements.Add(r2);
s2.Statements.Add(p3);

And a possible execution which stops at r2, waiting for input:

s1.Execute
-context.RunProgramStatement
--p1.Execute
--s1.ContinueAt
---context.RunProgramStatement(s2, s1.ContinueAt)
----s2.Execute
-----context.RunProgramStatement(r2, s2.ContinueAt)
------r2.Execute
------(External Bookmark @r2.ContinueAt)
------(return Executing)
-----(Internal Bookmark @s2.ContinueAt)
-----(return)
----(return Executing)
---(Internal Bookmark @s1.ContinueAt) <-- Oops!

What happens when the external bookmark r2.ContinueAt is executed?
r2 will complete and call context.CloseActivity. This will, in turn, resume the internal bookmark.. which is not set to s2.ContinueAt, but to s1.ContinueAt.

We can avoid this situation by observing how RunProgramStatement/CloseActivity really look like the call/return instructions every programmer is familiar with (or, at least, every C/asm programmer is familiar with). Almost every general purpose language uses a stack to handle function calls (even if this is an implementation detail); in particular, the stack is used to record where the execution will go next, by remembering the "return address", the point in the code where to resume execution when the function returns. Sounds familiar?

| Function call | Activity execution |
| --- | --- |
| call f | RunProgramStatement(a) |
| - pushes the current IP on the stack | - calls a.Execute |
| - jumps to the function entry point | - saves continueAt in an internal bookmark |
| ... at the end of f ... | ... at completion of a ... |
| return | CloseActivity |
| - pops the return address off the stack | - takes continueAt from the internal bookmark |
| - jumps to the return address | - executes continueAt |

Notice that we swapped the jump to execution and the saving of the return point; I did it to perform an interesting "optimization" (omitting the bookmark if a.Execute terminates immediately; think of it as the equivalent of inlining the function call - and not really an optimization, as it comes with its own problems, as we have just seen).

Is it a safe choice? Provided that we are swapping the stack used by function calls with a queue, yes.
In standard, stack based function calls what you have to do is to pinpoint where to resume execution when the function being called returns. The function being called (the callee) can in turn call other functions, so point-to-return to (return addresses) need to be saved in a LIFO (Last In, First Out) fashion. A stack is perfect.

Here, we provide a point where to resume execution directly, and this point has no relation to the return point of execute (i.e. where we are now).
If the call to Execute does not close the activity, that means that execution is suspended. A suspended execution can, in turn, lead to the suspension of other parent activities, like in our nested sequences example. Now, when r2 is resumed, it terminates and calls its "return" (CloseActivity). The points to return to, in this case, need to be processed in a FIFO (First In, First Out) fashion.
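
A minimal sketch of that change, as I would expect it to look (my assumption, anticipating the next posts): the single internal bookmark slot in the context becomes a FIFO queue of bookmarks.

// Inside the context: internal bookmarks now go into a queue instead of a single slot.
private readonly Queue<Bookmark> internalBookmarks = new Queue<Bookmark>();

internal void RunProgramStatement(Activity activity, Action<WorkflowInstanceContext, object> continueAt)
{
    var result = activity.Execute(this);
    if (result == ActivityExecutionStatus.Closed)
        continueAt(this, null);
    else
        internalBookmarks.Enqueue(new Bookmark
        {
            ContinueAt = continueAt,
            Name = "",
            ActivityExecutionContext = this
        });
}

public void CloseActivity()
{
    // Someone just completed an activity: resume the oldest pending internal bookmark, if any.
    if (internalBookmarks.Count > 0)
    {
        var bookmark = internalBookmarks.Dequeue();
        bookmark.ContinueAt(bookmark.ActivityExecutionContext, bookmark.Payload);
    }
}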

Next time: a sequence is not the only kind of possible composite activity. Does a queue satisfy additional scenarios?

# Sunday, 02 June 2013

A "Mini" workflow engine. Part 2: on the serialization of Bookmarks

Last time, we have seen that Activities are composed in a way that allows them to be paused and resumed, using continuations in the form of bookmarks:
[Serializable]
internal class Bookmark
{
    public string Name { get; set; }
    public Action<ActivityExecutionContext, object> ContinueAt { get; set; }  
    public ActivityExecutionContext ActivityExecutionContext { get; set; }
}
A Bookmark has a relatively simple structure: a Name, a continuation (in the form of an Action) and a Context. Bookmarks are created by the Context, and the continuation is always a ContinueAt method implemented in an activity. Thus, serializing a Bookmark (and, transitively, its Context) should lead to the serialization of an object graph that holds the entire workflow code, its state and its "Program Counter".

However, serializing a delegate sounds kind of scary to me: are we sure that we can safely serialize something that is, essentially, a function pointer?
That's why I started looking for an answer to a particular question: what happens when I serialize a delegate, a Func<T> or an Action<T>?

I found some good questions and answers on StackOverflow; in particular, a question about the serialization of lambdas and closures (related to prevalence, a technique that provides transactional persistence by implementing checkpoints and redo journals in terms of serialization) and a couple of questions about delegate (anonymous and not) serialization (1, 2 and 3).

It looks like:
  • delegate objects (and Func<>) are not XML serializable, but they are BinarySerializable.
  • even anonymous delegates, and delegates with closures (i.e. anonymous classes) are serializable, provided that you do some work to serialize the compiler-generated classes (see for example this MSDN article, and this blog post)
  • still, there is a general agreement that the whole idea of serializing a delegate looks very risky

BinarySerializable looks like the only directly applicable approach; in our case, we are on the safe side: every function we "point to" from delegates and bookmarks is in classes that are serialized as well, and the objects and data accessed by the delegates are serialized too (they access only data that is defined in the activities themselves, which is saved completely in the context). BinarySerialization will save everything we need to pause and resume a workflow in a different process.

I implemented a very simple persistence of workflow instances using BinarySerializer for MiniWorkflow, and it does indeed work as expected.
You can grab the code on github and see it yourself.
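
For reference, the persistence itself boils down to very little code; this is a minimal sketch (assuming the instance graph - the context with its activities and bookmarks - is marked [Serializable]; WorkflowInstance is a placeholder name):

using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

public static class WorkflowStore
{
    public static void Passivate(WorkflowInstance instance, string path)
    {
        var formatter = new BinaryFormatter();
        using (var stream = File.Create(path))
            formatter.Serialize(stream, instance);
    }

    public static WorkflowInstance Resume(string path)
    {
        var formatter = new BinaryFormatter();
        using (var stream = File.OpenRead(path))
            return (WorkflowInstance)formatter.Deserialize(stream);
    }
}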

However, I would like to provide an alternative serialization mechanism for MiniWorkflow, less "risky". I do not like the idea of not knowing how my Bookmarks are serialized, I would like something more "visible", to understand if what I need really goes from memory to disk, and to be more in control.
Also, I am really curious to see how WF really serializes bookmarks; workflows are serialized/deserialized as XAML, but a workflow instance is something else. In fact, this is a crucial point: in the end, I will probably end up using WF4 (MiniWorkflow is -probably- just an experiment to understand the mechanics better), but I will need to implement serialization anyway. I do not want to take a direct dependency on SQL, so I will need to write a custom instance store, alternative to the standard SQL instance store.

I will explore which alternatives we have to a BinarySerializer; but before digging into this problem, I want to work out a better way to handle the flow of execution. Therefore, next time: calling continuations (recursion, work item queue or... call/cc?)

# Saturday, 01 June 2013

A "Mini" workflow engine. Part 1: resumable programs

WF achieves durability of programs in a rather clever way: using a "code-as-data" philosophy, it represents a workflow program as a series of classes (activities), whose state of execution is recorded at predefined points in time.

It then saves the whole program (code and status) when needed, through serialization.
The principle is similar to the APM/IAsyncResult pattern: an async function says to the caller "do not wait for me to complete, do something else", and to the OS/Framework "call me back when you are done (the callback), and I will resume what I have to do". The program becomes "thread agile": it does not hold onto a single thread, but can let a thread go and continue later on another one.

Workflows are more than that: they are process agile. They can let their whole process terminate, and resume later on another one. This is achieved by serializing the whole program on a durable storage.
The state is recorded using continuations; each activity represents a step, a "statement" in the workflow. But an activity does not directly call the following statement: it executes, and then tells the runtime "execute the following, then get back to me", providing a delegate to call upon completion: a continuation.
The runtime sees all these delegates, and can either execute them or save them. It can use a delegate to build a bookmark. Serializing the bookmark saves it to disk, along with all the program code and state: basically, the whole workflow object graph plus one delegate, the bookmark, that acts as a "program pointer" from which execution can be resumed.

The same mechanism is used for a (potentially long) wait for input: the activity tells the runtime "I am waiting for this, call me back here when done" using a delegate, and the runtime can use it to passivate (persist and unload) the whole workflow program.

I found it fascinating, but I wanted to understand better how it worked, and I was dubious about one or two bits. So I built a very limited, but functional, workflow runtime around the same principles. I have to say that, having used WF3 (and WF4 from the early CTPs too), I already had quite a good idea of how it could have been implemented, but I found Dharma Shukla and Bob Schmidt's "Essential Windows Workflow Foundation" very useful to cement my ideas and fill some gaps. Still, building a clone was the best way to fill in the last ones.

The main classes are:
  • an Activity class, the base class for all the workflow steps/statements; a workflow is a set of activities;
  • a WorkflowHandle, which represents an instance of a workflow;
  • a Bookmark, which holds a continuation, a "program pointer" inside the workflow instance;
  • a Context, which will hold all the bookmarks for the current workflow instance;
  • a WorkflowRuntime, which handles the lifecycle of a WorkflowInstance and dispatches inputs (resuming the appropriate Bookmark in the process)

A basic Activity is pretty simple:

[Serializable]
public abstract class Activity
{
   public Activity()
   {
     this.name = this.GetType().Name;
   }

   abstract protected internal ActivityExecutionStatus Execute(ActivityExecutionContext context);

   protected readonly string name;
   public string Name
   {
     get { return name; }
   }
}
It just has a name, and an "Execute" method that will be implemented by subclasses; it is how activities are composed that is interesting:

[Serializable]
public class Sequence : Activity
{
    int currentIndex;
    List<Activity> statements = new List<Activity>();
    public IList<Activity> Statements
    {
        get { return statements; }
    }

    protected internal override ActivityExecutionStatus Execute(ActivityExecutionContext context)
    {
        currentIndex = 0;
        // Empty statement block
        if (statements.Count == 0)
        {
            context.CloseActivity();
            return ActivityExecutionStatus.Closed;
        }
        else
        {
            context.RunProgramStatement(statements[0], ContinueAt);
            return ActivityExecutionStatus.Executing;
        }
    }

    public void ContinueAt(ActivityExecutionContext context, object value)
    {
        // If we've run all the statements, we're done
        if (++currentIndex == statements.Count) 
            context.CloseActivity();
        else // Else, run the next statement
            context.RunProgramStatement(statements[currentIndex], ContinueAt);
    }        
}
Now the Execute method needs to call Execute on the child activities, one at a time. But it does not do it in a single step, looping through the statements list and calling their Execute: that would not let the runtime handle them properly. Instead, the Execute method says to the runtime "run the first statement, then call me back". This is a pattern you see both in WF3 and WF4, even if in WF3 it is more explicit. It is called an "internal bookmark": you ask the runtime to set a bookmark for you, execute an activity, and the activity will resume the bookmark once done:
// This method says: the next statement to run is this. 
// When you are done, continue with that (call me back there)
internal void RunProgramStatement(Activity activity, Action<ActivityExecutionContext, object> continueAt)
{
    // This code replaces
    // context.Add(new Bookmark(activity.Name, ContinueAt));
    var result = activity.Execute(this);

    // The activity already completed?
    if (result == ActivityExecutionStatus.Closed)
        continueAt(this, null);
    else
    {
        // Save for later...
        InternalBookmark = new Bookmark
        {
            ContinueAt = continueAt,
            Name = "",
            ActivityExecutionContext = this
        };
    }
}
When an Activity completes, returning ActivityExecutionStatus.Closed or explicitly calling CloseActivity, the runtime looks for a previously set bookmark and, if found, resumes execution from it:
public void CloseActivity()
{
    // Someone just completed an activity.
    // Do we need to resume something?
    if (InternalBookmark != null)
    {
        var continuation = InternalBookmark.ContinueAt;
        var context = InternalBookmark.ActivityExecutionContext;
        var value = InternalBookmark.Payload;
        InternalBookmark = null;
        continuation(context, value);
    }
}
Can you already spot a problem with this way of handling the continuations and the control flow of the workflow program? Yes: it is recursive, and the recursion is broken only if an Activity explicitly sets a bookmark. In that case, the runtime simply returns to the main control point, waits for input and resumes the bookmark after it receives it.
For example, in this "ReadLine" activity:
[Serializable]
public class ReadLine : Activity
{
    public OutArgument<string> Text = new OutArgument<string>();       

    protected override ActivityExecutionStatus Execute(ActivityExecutionContext context)
    {
        //Waits for user input (from the command line, or from 
        //wherever it may come
        context.CreateBookmark(this.Name, this.ContinueAt);
        return ActivityExecutionStatus.Executing;
    }

    void ContinueAt(ActivityExecutionContext context, object value)
    {
        this.Text.Value = (string)value;
        context.CloseActivity();
    }
}
When the bookmark is created, the runtime stores it, and then the "Execute" method returns without any call to RunProgramStatement. The runtime knows the Activity is waiting for user input, and does not call any continuation: it stores the bookmark and waits for input.
public void CreateBookmark(string name, Action<ActivityExecutionContext, object> continuation)
{
    // var q = queuingService.GetWorkflowQueue(name);
    // q += continuation;
    bookmarks.Add(name, new Bookmark
        {
            ContinueAt = continuation,
            Name = name,
            ActivityExecutionContext = this
        });
} 
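For completeness, here is a minimal sketch of the dispatch side: when input arrives for a named bookmark, the runtime looks it up and invokes its continuation with the payload. The name ResumeBookmark and the Dictionary<string, Bookmark> backing store are my assumptions, not necessarily what MiniWorkflow does.

// A sketch, not the actual MiniWorkflow code: resume an externally created
// bookmark, passing along its payload (e.g. the line read from the console).
public void ResumeBookmark(string name, object payload)
{
    Bookmark bookmark;
    if (!bookmarks.TryGetValue(name, out bookmark))
        throw new InvalidOperationException("No bookmark named: " + name);

    // A bookmark is resumed at most once: remove it, then call the continuation
    bookmarks.Remove(name);
    bookmark.ContinueAt(bookmark.ActivityExecutionContext, payload);
}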
The next logical step is to use Bookmarks to save the program state to durable storage. This led me to ask what ended up being a rather controversial question: is it possible, or advisable, to serialize continuations (in any of the .NET forms in which they appear, i.e. delegates, EventHandlers, Func<> or Action<>)?

Therefore, next time: serializing Bookmarks

# Friday, 31 May 2013

Building a Sharepoint "clone"

Well, from my previous post it should be clear that it will not be a "full" Sharepoint clone :) That would be a daunting task!
Anyway, for the kind of application I have in mind, I need to replicate the following functionalities:
  • Everything is accessible via a web browser
  • The basic data collection will be a list of "things" (more on that later)
  • It will be possible to perform complex, collaborative, persisted, long-running operations to create and modify "things" and "lists of things"
  • Operations, lists and "things" need to be secured (have different access levels, with permissions assignable to users and roles)
  • It is possible to define a "thing" (and lists of them), but only for users in the correct role
  • It is possible to define operations using predefined basic blocks ("steps")
  • Steps will have automatically generated UI to gather user input, if necessary
For example: one of these "things" may be a project proposal.

Someone starts a project proposal, fills up some fields, and submit it to a manager. The manager can send it back, asking more details. Then someone is assigned to it, to explore the feasibility, for example. And many other steps can follow.

Each of these steps fills out a little part of the proposal (our "thing"). I will discuss later how I plan to build "things", and which shape they will have; for now, let's focus on steps.
First of all, note that I have not used the word "workflow" up to now.
Yes, this kind of step-by-step operation falls nicely into the definition of a workflow; if you look at the company documentation, or even if you ask an employee to describe the procedure to submit a proposal, these operations are described in a graphical way, with boxes and arrows and diamonds: a flowchart.

WF (the .NET workflow engine) reinforces this feeling by using a graphical language very similar to flowcharts as one of the ways to author a workflow. Besides the familiar authoring experience (which is, unfortunately, not without shortcomings in SP), WF adds many features we want for this kind of operation. Above all, we get resilience, durability and persistence. This is really useful in our scenario: the delays introduced by human interaction are very long, and a way to persist and resume programs enables both resilience and scalability (programs can pause and resume, and migrate from one machine to another too).

The plan (for this blog):
- A series of posts about workflows
- Comparison with an interesting alternative: dataflows
- Building and persisting "things" (Dynamic types)
- Some nice ASP.NET MVC features used to put everything together (and make the actual users of all of this happy!)

Everything is accompanied by code.
I will try to put most of what I am discussing here on github, pushing new stuff as I blog about it.
Eventually, I hope to push all the code I wrote (and will write) for this project on github, but I will have to discuss it with my "investor" before doing that.

Next time: Part 1 of the workflow series!

# Sunday, 26 May 2013

Back to C#

Three years ago, give or take, I was asked to help in assessing the technical choices for an interesting project.
The project aimed at realizing a platform (intranet, web) to help a company in dealing with all the quality-related processes and procedures; in particular, it should help in preparing and maintaining the high quality levels required by international standards (ISO-9001 and alike).

A very good friend of mine was asked to act as a program manager/product owner, and work as a proxy between the initial set of companies interested in the product (the customers) and the ISV that was in charge of developing the software.
My friend designed and detailed the specification for every aspect needed in the software, which was composed of three main modules: storage, review and approval processes for corporate documents and handbooks; collection, review and resolution of quality-related issues and process enhancement proposals; tracking and monitoring of supplies and suppliers.
All these processes follow nicely the pattern of a workflow: clear steps, with well defined responsibilities, and long-spans between steps. In short, reactive programs that need to be durable, recorded (for auditing) and persisted.

That said, we decided (in agreement with the ISV) to use Windows Workflow Foundation as the basic engine: it seemed a very good fit for the problem at hand. In particular, the ISV decided to use Sharepoint 2010 (which was in beta at that time). Personally, I think it was a good choice: I already had experience with WF, and SP seemed to offer all the missing bits and a lot of bells and whistles.

However, the ISV had no prior knowledge of SP 2010, and that showed in how they implemented the solution; it worked, but it required time to implement and was not nearly as flexible as expected. When faced with something more complex, something that required some effort to customize, they fell back to write custom code. A lot of code was written, and a little too much was hard-wired: for example, modifying a workflow (adding an activity, like "send an email", or exchanging two steps) should have been something easily done through the Sharepoint Designer, but ended up being something doable only with code, because most of the activities and the code to compose them was custom. Same for the web forms, for the workflow data (stored as XML data in custom SQL server tables instead of SP lists), for excel reports (which are provided out of the box... but only for SP lists).

Nonetheless, the software did its job, and did it quite well; the design phase was done carefully, and it met the expectations. More companies wanted to use it, and of course they wanted customizations and improvements.
I know all these details because I was asked to take over from where the ISV left off, and help with bug fixing, modifications and deployment of new installations. It was hard, more than it should have been; some things that should have been doable directly by the customers were possible only using code.
And deployment of Sharepoint solutions on top of an existing, running instance is a real, royal pain.
In short, we arrived at a point where some of the requests were simply not possible to fulfill. Some of the shortcomings are due to SP 2010 itself (we needed more fine-grained permissions and access control, a lighter footprint and resource usage, a more customizable user interface, easier and more flexible deployment options) and we could not address them easily. Two missing features especially were a no-go for further development: deployment in a shared hosting environment (cloud?) and, linked to it, the ability to use a storage engine that was NOT SQL Server.

Earlier this year, my friend and I met, and we decided to try and build a solution that was independent of SP, that retained the good bits (workflows, lists, reports) and improved upon them (more fine-grained permissions and access control, workflows that are easier to author, persistence over different storage engines, a more modern UI).
We decided also to try and do it in what we thought was the right way, more for "fun" than for "profit": that meant no heavy pressure on times, the ability to learn new stuff (ASP.NET MVC, websockets,..) and to re-build some parts of the stack that we saw as necessary (I have a daily job for a living in a software company, so this project was just for fun and to keep my abilities sharp).
I was happy to have an opportunity to go back to C#, to keep my training on .NET going, and to stay up to date on the latest .NET technologies: it had been almost two years without using C# professionally, and I was missing it! So, I said yes.
I decided also to keep a journal here, explaining what I have done and what I will do, what is working and what I will learn.

Next time: building a Sharepoint "clone"
# Friday, 24 March 2006

Rotor 2.0!

Finally! From JasonZ through Brad Abrams. I was waiting for it... I'm downloading it right now, and this week (if work permits) I'm going to dig into some cool new features, like anonymous methods and delegates (C# compiler) and Lightweight Code Generation (LCG, in the BCL).


# Sunday, 05 February 2006

Synthesized proxies

Last time we saw a first solution to the problem of adding synchronization contracts to an arbitrary .NET class. The solution based on Context Attributes and Transparent Proxy had some limitations (performance, no access to class members), so I designed a tool based completely on SRE (System.Reflection.Emit) and System.Reflection. It can be thought of as a post-compiler that analyzes an assembly and the custom attributes placed on classes and functions, and writes a small assembly that contains a proxy to the original one.

The proxy is written using the metadata of the real target object. We use facilities provided by System.Reflection in order to read the interface methods, the custom attributes and the fields and properties of the target object, and System.Reflection.Emit to emit code for the guarded methods and forwarders for the other public functions. The methods in the proxy validate the contract and update the state for subsequent validations, all in an automatic way.

The process to create a guarded component is the following:

  • write the class as usual, then add the synchronization contract writing guards for each method using the Guard attribute and Past LTL logic;
  • add a reference to the Attribute.dll assembly (the assembly containing the class implementing the Guard attribute) and compile your component. Note that the attribute can be used with every .NET language that supports custom attributes, like C#, Visual Basic, MC++, Delphi, and so on;
  • the .NET compiler will store the Guard attribute and its fields and values with the metadata associated with the method.
public class Test
{
    bool g;
    bool f = true;

    [Guard("H (g or f)")] // using the constructor
    public string m(int i, out int j)
    {
        j = i;
        return (i + 2).ToString();
    }

    [Guard(Formula = "H (g or f)")] // using the public field
    public string m2(int i, out int j)
    {
        j = i;
        return (i + 2).ToString();
    }
}

The component is then passed through the post-compiler, which will post-process the assembly, performing the following steps:

  • it walks the metadata of the given assembly, finding all the classes;
  • for each class it walks its methods, searching for the Guard attribute. If the attribute is found, the class is marked as forwarded;
  • it generates a new empty assembly;
  • for each forwarded class, it generates a class with the same public interface as the original one:
    • public, non-guarded methods are wrapped by the IL code necessary to perform the synchronization (acquiring the object-wide lock before forwarding the call to the original method and releasing it just after the method returns)
    • public and private guarded methods are wrapped by the IL code necessary to perform the synchronization and conditional access, as shown here (a C# rendering of this wrapper follows right after this list):
      .method public instance void  m() cil managed
      {
        .maxstack  5
        IL_0000:  ldarg.0
        IL_0001:  call       void [mscorlib]Monitor::Enter(object)
        .try
        {
          //check 
          IL_0006:  br.s       IL_0008
          IL_0008:  ldarg.0
          IL_0009:  call       bool [mscorlib]Monitor::Wait(object)
          IL_000e:  pop
          IL_000f:  ldfld      bool Test::pre1
          IL_0010:  brfalse    IL_0008
          //update
          IL_0012:  ldc.i4.0   //"false"
          IL_0013:  ldfld      bool Test::a
          IL_0015:  ceq
          IL_0018:  stfld      bool Test::pre2
          IL_001e:  ldc.i4.0   //"false"
          IL_0021:  stfld      bool Test::pre1
          //forward 
          IL_0026:  ldarg.0
          IL_0027:  ldfld      class [TestLibrary]Test::GuardedTestLibrary.dll
          IL_0031:  call       instance void [TestLibrary]Test::m()
          IL_0036:  leave      IL_0048
        }  // end .try
        finally
        {
          IL_003b:  ldarg.0
          IL_003c:  call       void [mscorlib]System.Threading.Monitor::PulseAll(object)
          IL_0041:  ldarg.0
          IL_0042:  call       void [mscorlib]System.Threading.Monitor::Exit(object)
          IL_0047:  endfinally
        }  // end handler
        IL_0048:  ret
      } // end of method Test::m
      
  • for each forwarded class, it generates constructors with the same signatures that call the ones of the original class.
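Expressed in C#, the wrapper emitted for a guarded method is roughly equivalent to the sketch below. It is my reconstruction from the IL listing above, not the generated code itself: the simplified Test matches the IL (a parameterless void m() and a boolean field a), and pre1/pre2 are the injected fields that record the past state of the guard's sub-formulae. For readability the state lives on the proxy here, while the real post-compiler stores it with the original type.

// A simplified original class, matching the IL listing above
// (parameterless void m(), boolean field a referenced by the guard).
public class Test
{
    public bool a;
    public void m() { /* original body */ }
}

// Roughly what the emitted "Guarded" wrapper does, rendered in C#.
public class GuardedTest
{
    private readonly Test target;   // the wrapped instance of the original class
    private bool pre1, pre2;        // injected state of the guard's sub-formulae

    public GuardedTest(Test target) { this.target = target; }

    public void m()
    {
        System.Threading.Monitor.Enter(this);
        try
        {
            // check: block until the guard holds
            while (!pre1)
                System.Threading.Monitor.Wait(this);

            // update: record the state needed by the "H" (historically) operator
            pre2 = (target.a == false);
            pre1 = false;

            // forward: call the original method
            target.m();
        }
        finally
        {
            // wake up threads blocked on their own guards, then release the lock
            System.Threading.Monitor.PulseAll(this);
            System.Threading.Monitor.Exit(this);
        }
    }
}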


A schema for the usage of the generated proxy

The generated assembly and its classes are saved in a separate dll, with the same name prefixed by "Guarded". This assembly should be referenced by the client applications in place of the original one, as shown in the Figure.

# Friday, 06 January 2006

Context Attributes and Transparent Proxy

When I started to design synchronization contracts, I wanted to play a little with my ideas before trying to implement the whole mechanism directly into the compiler and runtime monitor. I started to wonder how contracts could be introduced in the .NET platform, and at which level.
The answer is similar to the one given for AOP on .NET (more on this in a future post!): you can act at the compiler level (static weaving, in AOP parlance), at the execution engine level, and finally at the class library level.

Focusing on the last two, how can you insert code for checking and enforcing contracts? According to the various studies on interception and AOP on the .NET platform (1), there are three ways to intercept calls to methods on an object and do some pre- and post-processing:
  • Using Context Attributes and a Transparent Proxy;
  • Synthesizing a proxy that forwards calls to the original object;
  • Injecting MSIL code directly with the unmanaged .NET profiling APIs.
We will see how each of these methods work, and which is better for the current purpose. In this post, from now on, we will focus on the first method: using Context Attributes and a Transparent Proxy.

(1) NOTE: as Ted Neward points out in his Interception != AOP post, the term AOP is used incorrectly in many articles. I share his ideas on the subject, but for the purposes of this discussion the term interception will suffice.

Context Attributes and a Transparent Proxy

To cook a solution based on this technology, we need three ingredients:
  • Custom Attributes to mark methods with a formula
  • Reflection (based on .NET metadata) to explore fields and methods
  • .NET Interceptors to hijack execution and check the validity of a formula upon method invocation.
The BCL provides a complete set of managed classes (the Reflection API) that can be used to emit metadata and MSIL instructions in a rather simple way, from a managed application. In the managed world, the structure of libraries and classes is available through metadata; this reflection mechanism makes it possible to write applications that programmatically read the structure of existing types in an easy way, and also add code and fields to those types.

As for the attribute ingredient, the .NET Framework provides it out of the box with a mechanism called Attributes. According to the MSDN developer's guide,
Attributes are keyword-like descriptive declarations to annotate programming elements such as types, fields, methods, and properties. Attributes are saved with the metadata of a Microsoft .NET Framework file and can be used to describe your code to the runtime or to affect application behavior at run time. While the .NET Framework supplies many useful attributes, you can also design and deploy your own.
Attributes you design on your own are called custom attributes. Custom attributes are essentially traditional classes that derive directly or indirectly from System.Attribute. Just like traditional classes, custom attributes contain methods that store and retrieve data; the arguments for the attribute must match a constructor or a set of public fields in the class implementing the custom attribute.
Properties of the custom attribute class (like the element that they are intended for, if they are inherited and so on) are specified through a class-wide attribute: the AttributeUsageAttribute.
Here is an example, applied to our problem: an attribute to attach a formula to a method:
[AttributeUsage(AttributeTargets.Constructor |
                AttributeTargets.Method | AttributeTargets.Property,
                Inherited = true,
                AllowMultiple = true)]
public class GuardAttribute : Attribute
{
    public GuardAttribute()
    {
        Console.WriteLine("Eval Formula: " + Formula);
    }

    public GuardAttribute(string ltlFormula)
    {
        Formula = ltlFormula;
        Console.WriteLine("Eval Formula: " + Formula);
    }

    public string Formula;
}
And here is an example of application of our new custom attribute:
public class Test
{
    bool g;
    bool f = true;

    [Guard("H (g or f)")] // using the constructor
    public string m(int i, out int j)
    {
        j = i;
        return (i + 2).ToString();
    }

    [Guard(Formula = "H (g or f)")] // using the public field
    public string m2(int i, out int j)
    {
        j = i;
        return (i + 2).ToString();
    }
}

Attributes, like other metadata elements, can be accessed programmatically. Here is an example of a function that, given a type, scans its members to see if any is marked with our GuardAttribute:

public class AttributeConsumer
{
    Type type;

    public AttributeConsumer(Type type)
    {
        this.type = type;
    }

    public void findAttributes()
    {
        Type attType = typeof(GuardAttribute);

        foreach (MethodInfo m in type.GetMethods())
        {
            if (m.IsDefined(attType, true))
            {
                object[] atts = m.GetCustomAttributes(attType, true);
                GuardAttribute att = (GuardAttribute)atts[0];
                parseAttribute(att.Formula);
            }
        }
    }

    public void walkMembers(string s)
    {
        BindingFlags bf = BindingFlags.Static
                        | BindingFlags.Instance
                        | BindingFlags.Public
                        | BindingFlags.NonPublic
                        | BindingFlags.FlattenHierarchy;

        Console.WriteLine("Members for {0}", s);
        MemberInfo[] members = type.GetMember(s, bf);
        for (int i = 0; i < members.Length; ++i)
        {
            Console.WriteLine("{0} {1}",
                members[i].MemberType,
                members[i].Name);

            //inject additional Metadata for formulas

            //generate code for updating the formulas

            //inject it
        }
    }

    void injectMetadata()
    {
        //...
    }
}
If such an attribute is found, its formula is scanned, and appropriate fields to hold the previous status of sub-formulae (needed for recording temporal behaviour) are injected into the actual type.

But what about the code? We need to be notified when a method we are interested in (a method with an attached formula) is called. Here comes the third ingredient: .NET interceptors.

.NET interceptors are associated with the target component via metadata; the CLR uses the metadata to compose a stack of objects (called message sinks) that get notified of every method call. The composition usually happens when an instance of the target component is created. When the client calls a method on the target object, the call is intercepted by the framework and the message sinks get the chance to process the call and perform their service; finally, the object's method is called.
On the return from the call each sink in the chain is again invoked, giving it the possibility to post-process the call. This set of message sinks work together to provide a context for the component's method to execute.

Thanks to attributes, the metadata of the target component can be enriched with all the information necessary to bind the message sinks we want to the component itself. However, custom attributes alone are often not sufficient: if you need access to the call stack before and after each method to read the environment (like the parameters of a method call), you need an interceptor and the context in which the object lives: .NET interceptors can act only if we provide a context for the component.
Let's see how objects can live in a context, and how contexts and interceptors work together. In the .NET Framework, an application domain is a logical boundary that the common language runtime (CLR) creates within a single process. Components loaded by the .NET runtime are isolated from each other: they run independently of one another and cannot directly impact one another; they don't directly share memory and they can communicate only using .NET remoting (although this service is provided transparently by the framework). Components living in separate appdomains have separate contexts. For objects living in the same application domain, a context is provided for any class that derives from System.ContextBoundObject; when we create an instance of a subclass of ContextBoundObject, the .NET runtime will automatically create a separate context for the newly created object.

(Figure 1)

This diagram shows the flow of a call between a client class and an object in a different context or application domain.
In such a situation, the .NET framework performs the following steps:
  1. A transparent proxy is created. This proxy contains an interface identical to the recipient, so that the caller is kept in the dark about the ultimate location of the callee.
  2. The transparent proxy calls the real proxy, whose job it is to marshal the parameters of the method across the application domain. Before the target object receives the call there are zero or more message sink classes that get called. The first message sink pre-processes the message, sends it along to the next message sink in the stack of message sinks between client and object, and then post-processes the message. The next message sink does the same, and so on until the last sink is reached. Then the control is passed to the stack builder sink.
  3. The last sink in the chain is the stack builder sink. This sink takes the parameters and places them onto the stack before invoking the method in the receiving object.
  4. By doing this, the recipient remains as oblivious to the mechanism used to make the call as the initiator is.
  5. After calling the object, the stack builder sink serializes the outbound parameters and the return value, and returns to the previous message sink.
So, the object implementing our pre- and post-processing logic has to participate in this chain of message sinks.

For a class implementing a sink to be hooked to our target object, we first need to update our attribute to work with context-bound objects. This is done by deriving it from ContextAttribute instead of Attribute and implementing a method for returning a context property for that attribute:
[AttributeUsage(AttributeTargets.Class, Inherited = true)]
public class InterceptAttribute : ContextAttribute
{
    public InterceptAttribute() : base("Intercept")
    {
    }

    public override void Freeze(Context newContext)
    {
    }

    public override void GetPropertiesForNewContext(IConstructionCallMessage ctorMsg)
    {
        ctorMsg.ContextProperties.Add(new InterceptProperty());
    }

    public override bool IsContextOK(Context ctx, IConstructionCallMessage ctorMsg)
    {
        InterceptProperty p = ctx.GetProperty("Intercept") as InterceptProperty;
        if (p == null)
            return false;
        return true;
    }

    public override bool IsNewContextOK(Context newCtx)
    {
        InterceptProperty p = newCtx.GetProperty("Intercept") as InterceptProperty;
        if (p == null)
            return false;
        return true;
    }
}

[AttributeUsage(AttributeTargets.Constructor |
                AttributeTargets.Method | AttributeTargets.Property,
                Inherited = true,
                AllowMultiple = true)]
public class GuardAttribute : Attribute
{
    public GuardAttribute() {}

    public GuardAttribute(string ltlFormula)
    {
        Formula = ltlFormula;

        AttributeConsumer ac = new AttributeConsumer();

        //parse formula...
        LTLcomp parser = new LTLcomp(ac);
        parser.openGrammar(...);
        parser.parseSource(ltlFormula);
    }

    private string Formula;

    public void Process()
    {
        //evaluate Formula
    }

    public void PostProcess()
    {
        //update formula
    }
}
At object creation time, GetPropertiesForNewContext is called for each context attribute associated with the object.
This allows us to add our own context property to the list of properties of the context bound to our target object; the property, in turn, allows us to add a message sink to the chain of message sinks:
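Below is a minimal sketch of what such a context property and message sink could look like, built on the standard Remoting interfaces (IContextProperty, IContributeObjectSink, IMessageSink). It is my reconstruction, not the original code: the FindGuard helper and the exact wiring of Process/PostProcess are assumptions.

using System;
using System.Runtime.Remoting.Contexts;
using System.Runtime.Remoting.Messaging;

// The context property injected by InterceptAttribute: for every object created
// in the context, it contributes a message sink that wraps the existing chain.
public class InterceptProperty : IContextProperty, IContributeObjectSink
{
    public string Name { get { return "Intercept"; } }
    public void Freeze(Context newContext) { }
    public bool IsNewContextOK(Context newCtx) { return true; }

    public IMessageSink GetObjectSink(MarshalByRefObject obj, IMessageSink nextSink)
    {
        return new InterceptSink(nextSink);
    }
}

// The sink itself: pre-process, forward to the next sink, post-process.
public class InterceptSink : IMessageSink
{
    private readonly IMessageSink next;
    public InterceptSink(IMessageSink next) { this.next = next; }
    public IMessageSink NextSink { get { return next; } }

    public IMessage SyncProcessMessage(IMessage msg)
    {
        GuardAttribute guard = FindGuard(msg);
        if (guard != null) guard.Process();        // evaluate the formula before the call

        IMessage ret = next.SyncProcessMessage(msg);

        if (guard != null) guard.PostProcess();    // update the formula state after the call
        return ret;
    }

    public IMessageCtrl AsyncProcessMessage(IMessage msg, IMessageSink replySink)
    {
        return next.AsyncProcessMessage(msg, replySink);
    }

    private static GuardAttribute FindGuard(IMessage msg)
    {
        IMethodMessage call = msg as IMethodMessage;
        if (call == null) return null;
        return (GuardAttribute)Attribute.GetCustomAttribute(call.MethodBase, typeof(GuardAttribute));
    }
}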

The intercepting mechanism is really powerful. However, for our purposes it is not enough. It has three major disadvantages:
  • performance: the overhead of crossing the boundary of a context, or of an appdomain, isn't always acceptable (the cost of a function call grows roughly 20-fold, from some simple measurements I did). If you already need to do this (your component must live in another appdomain, or in another process, or even on another machine) there is no problem and almost no additional overhead, since the framework already needs to establish a proxy and marshal all method calls;
  • we have to modify the class(es) in the target component: they must inherit from ContextBoundObject. And since .NET doesn't support multiple inheritance, this is a rather serious issue;
  • only the parameters of the method call are accessible. The state of the target object, its fields and properties, is hidden. Since, to find out whether an object is accessible from a client, we need to inspect its state, this issue makes it very difficult to use the interception mechanism for our purposes.
Next time we'll see the first completely working solution: synthesized proxies.
# Sunday, 06 November 2005

C#, C++ and Rotor (2)


To see how the C# compiler works, what is better than going to look at the sources? However, if you have ever written a compiler, you can understand that figuring out who calls whom is not so simple. I decided to simplify my investigation a little with the help of.. a debugger =)
I set two breakpoints in two functions that I believed to be correlated, but that did not seem to come in touch anywhere in the code. It was strange, because the two are both parse functions and are in the same class.

I placed the first breakpoint in ParseNamespaceBody. As you can see in parser.cpp, this function will in turn call ParseNamespace, ParseClass,
which in turn will call ParseMethod, ParseAccessor, ParseConstructor...
The second breakpoint was placed instead in ParseBlock. This is the function that parses a block of code, i.e. all the statements enclosed by two curly braces { }.
The interesting thing is that the two breakpoints are not hit one after the other, but at different moments, with different stack traces.

This is the stack trace for the first breakpoint:
   
0006efe8 531fa3ad cscomp!CParser::ParseNamespaceBody+0x74 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\parser.cpp @ 1627]
0006effc 53216891 cscomp!CParser::ParseSourceModule+0x8d [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\parser.cpp @ 1602]
0006f18c 5320c226 cscomp!CSourceModule::ParseTopLevel+0x691 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcmod.cpp @ 3331]
0006f19c 53175940 cscomp!CSourceData::ParseTopLevel+0x16 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcdata.cpp @ 171]
0006f234 53175bd7 cscomp!COMPILER::ParseOneFile+0x130 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 1197]
0006f284 53174599 cscomp!COMPILER::DeclareOneFile+0x27 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 1228]
0006f324 5317767a cscomp!COMPILER::DeclareTypes+0x89 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 821]
0006f4d0 53173987 cscomp!COMPILER::CompileAll+0x13a [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 1676]
0006f65c 5317d0dd cscomp!COMPILER::Compile+0xd7 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 704]
0006fe48 0040db29 cscomp!CController::Compile+0xdd [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\controller.cpp @ 225]
0006ff50 0040e7ac csc!main+0x349 [d:\cxc\sscli\clr\src\csharp\csharp\scc\scc.cpp @ 2270]
0006ff64 79c8138d csc!run_main+0x1c [d:\cxc\sscli\pal\win32\crtstartup.c @ 49]
0006ff88 0040e70b rotor_pal!PAL_LocalFrame+0x3d [d:\cxc\sscli\pal\win32\exception.c @ 600]
0006ffc0 7c816d4f csc!mainCRTStartup+0x7b [d:\cxc\sscli\pal\win32\crtstartup.c @ 66]
0006fff0 00000000 kernel32!BaseProcessStart+0x23

   
And this is the trace for the second one:
   
0006eed4 5321849c cscomp!CSourceModule::ParseInteriorNode+0xbc [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcmod.cpp @ 4576]
0006ef7c 531e163b cscomp!CSourceModule::GetInteriorNode+0x32c [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcmod.cpp @ 3863]
0006ef98 531e14fb cscomp!CInteriorTree::Initialize+0x4b [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\inttree.cpp @ 80]
0006efc8 53217fe4 cscomp!CInteriorTree::CreateInstance+0x6b [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\inttree.cpp @ 61]
0006f068 5320c28a cscomp!CSourceModule::GetInteriorParseTree+0xd4 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcmod.cpp @ 3728]
0006f07c 5316d1e8 cscomp!CSourceData::GetInteriorParseTree+0x1a [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\srcdata.cpp @ 187]
0006f1f0 5316c44a cscomp!CLSDREC::compileMethod+0x6c8 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\clsdrec.cpp @ 6832]
0006f220 5316b728 cscomp!CLSDREC::CompileMember+0x9a [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\clsdrec.cpp @ 6569]
0006f268 5316c7b2 cscomp!CLSDREC::EnumMembersInEmitOrder+0x228 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\clsdrec.cpp @ 6327]
0006f2f0 5316c084 cscomp!CLSDREC::compileAggregate+0x252 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\clsdrec.cpp @ 6657]
0006f324 53177f1b cscomp!CLSDREC::compileNamespace+0xc4 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\clsdrec.cpp @ 6523]
0006f4d0 53173987 cscomp!COMPILER::CompileAll+0x9db [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 1922]
0006f65c 5317d0dd cscomp!COMPILER::Compile+0xd7 [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\compiler.cpp @ 704]
0006fe48 0040db29 cscomp!CController::Compile+0xdd [d:\cxc\sscli\clr\src\csharp\csharp\sccomp\controller.cpp @ 225]
0006ff50 0040e7ac csc!main+0x349 [d:\cxc\sscli\clr\src\csharp\csharp\scc\scc.cpp @ 2270]
0006ff64 79c8138d csc!run_main+0x1c [d:\cxc\sscli\pal\win32\crtstartup.c @ 49]
0006ff88 0040e70b rotor_pal!PAL_LocalFrame+0x3d [d:\cxc\sscli\pal\win32\exception.c @ 600]
0006ffc0 7c816d4f csc!mainCRTStartup+0x7b [d:\cxc\sscli\pal\win32\crtstartup.c @ 66]
0006fff0 00000000 kernel32!BaseProcessStart+0x23


As you can see, the lowest common ancestor is COMPILER::CompileAll. Then the traces diverge: the first one calls DeclareTypes, the second one compileNamespace. I started to guess what was happening, but I wanted to see it clearly.

So I examined the two functions and the functions called by them. The ParseMethod, ParseAccessor, ParseConstructor functions parse the declaration of classes, functions and properties. But then, when they reach the open curly brace, they stop. Or, better, they call a SkipBlock function that searches the body of a class/method/etc. for other declarations (inner classes, for example) but skips the statements. They record the start of the method (the position of the first curly) and return.

Digging into ParseBlock, I saw that it will in turn call the function responsible for generating the intermediate code, a parse tree made of STATEMENTNODEs:

STATEMENTNODE **ppNext = &pBlock->pStatements;
while (CurToken() != TID_ENDFILE)
{
    if ((iClose != -1 && Mark() >= iClose) ||
        (iClose == -1 &&
         (CurToken() == TID_CLOSECURLY ||
          CurToken() == TID_FINALLY ||
          CurToken() == TID_CATCH)))
        break;
    if (CurToken() == TID_CLOSECURLY) {
        Eat (TID_OPENCURLY);
        NextToken();
        continue;
    }
    *ppNext = ParseStatement (pBlock);
    ppNext = &(*ppNext)->pNext;
    ASSERT (*ppNext == NULL);
}

    
The ParseStatement function will drive the parsing, calling each time the right function among:
CallStatementParser
ParseLabeledStatement
ParseDeclarationStatement
ParseExpressionStatement

    
So this second phase will finally parse the statements. The pertinent code is in the ParseInteriorNode function:

m_pParser->Rewind (pMethod->iCond);
pMethod->pBody = (BLOCKNODE *)m_pParser->ParseBlock (pMethod);
 

Now we can finally see what happens, and why a C# compiler works without forward declarations or the separation of declaration and implementation: it is a two-phase parser. The first phase collects information about symbols (class and function names) and builds the symbol table.
The second phase uses the (already complete) symbol table for name resolution, then builds the complete parse tree.
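A small C# example of why the two phases are needed (my own illustration, not code from Rotor): the body of Main references a class that is declared later in the same file, so it can only be compiled once the first pass has already collected every type and member name.

// Main() uses Helper before its declaration appears in the file: the first pass
// records the names of Program, Main, Helper and SayHi; the second pass compiles
// the method bodies against the completed symbol table.
class Program
{
    static void Main()
    {
        Helper.SayHi();   // resolved during the second phase
    }
}

class Helper
{
    public static void SayHi()
    {
        System.Console.WriteLine("Hi");
    }
}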
            
           


# Saturday, 05 November 2005

C#, C++ and Rotor

As a C++ programmer, initially I was very disappointed when Microsoft took the .NET way. And I didn't really like C#. Today, I have a different opinion (especially thanks to anonymous delegates, generics, all the other version 2 goodies and LINQ), but at the beginning it was very hard for me to leave my beloved COM+/C++ for .NET/C#.

I love a lot of C++ features, among them the possibility of separating declaration and definition, even physically, into two separate files (.h and .cpp). This is a need that dates back to the design of the early C++ compilers, but it is also a way to encourage a correct forma mentis (and to increase the readability of the code).

Consider the following classes:

#pragma once

#include <iostream>

class Bar
{
public:

   void bar()
   {
      Foo* foo = new Foo();
      std::cout << "Bar" << std::endl;
   }
};

class Foo
{
public:
   void foo(Bar* bar)
   {
      bar->bar();
   }
};


There are two problems in this file: the obvious one, that it doesn't compile:

d:\VC\Test\Test.h(13) : error C2065: 'Foo' : undeclared identifier
d:\VC\Test\Test.h(13) : error C2065: 'foo' : undeclared identifier

and the second one, that it introduces a dependency on iostream for anybody who includes it.
Now, if we separate definition

class Bar
{
public:
   void bar();
};

class Foo
{
public:
   void foo(Bar* owner);
};

and implementation

#include ".\test.h"
#include <iostream>

void Bar::bar()
{
   Foo* foo = new Foo();
   std::cout << "Bar" << std::endl;
}

void Foo::foo(Bar* bar)
{
   bar->bar();
}

the two problems are solved. Do you think my example is a silly one? Sure, but think about the Model View Controller pattern, or about any system involving notifications, like a window manager...

In C# there is no such distinction: when you declare a type you must also give its implementation, the only exceptions being interfaces and abstract methods. So, how can a C# compiler deal with the above case? We will discover it in the next post: time to do a little digging in Rotor!
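For comparison, here is the C# equivalent of the example above (my own rendering, with the method names kept lowercase to mirror the C++ version): the mutual reference compiles as-is, with no forward declarations and no header/implementation split.

using System;

class Bar
{
    public void bar()
    {
        Foo foo = new Foo();   // Foo is declared later in the file: no forward declaration needed
        Console.WriteLine("Bar");
    }
}

class Foo
{
    public void foo(Bar bar)
    {
        bar.bar();
    }
}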