Monday, 10 June 2013

# A "Mini" workflow engine. Part 4: "out-of-order" composite activities, internal bookmarks, and execution.

Last time, we have seen how a sequence of activities works with internal bookmarks to schedule the execution of the next step, introducing a queue to hold these bookmarks.
We left with a question: what if execution is not sequential?
In fact, one of the advantages of an extensible workflow framework is the ability to define custom statements; we are not limited to Sequence, IfThenElse, While, and so on but we can invent our own, like Parallel:

public class Parallel : CompositeActivity
{
protected internal override ActivityExecutionStatus Execute(WorkflowInstanceContext context) {
foreach (var activity in children)
context.RunProgramStatement(activity, this.ContinueAt);
return ActivityExecutionStatus.Executing;
}

private void ContinueAt(WorkflowInstanceContext context, object arg) {
foreach (var activity in children)
//Check every activity is closes...
if (activity.ExecutionStatus != ActivityExecutionStatus.Closed)
return;

CloseActivity(context);
}
}

And even have different behaviours here (like speculative execution: try several alternative, choose the one that complete first).

Clearly, a queue does not work with such a statement. We can again use delegates to solve this issue, using a pattern that is used in several other scenarios (e.g. forms management).
When a composite activity schedules a child activity for execution, it subscribes to the activity, to get notified of its completion (when the activity will finish and become "closed"). Upon completion, it can then resume execution (the "internal bookmark" mechanism).

child.Close += this.ContinueAt;
Basically, we are moving the bookmark and (above all) its continuation to the activity itself. It will be the role of CloseActivity to find the delegate and fire it. This is exactly how WF works...

context.CloseActivity() {
...
if (currentExecutingActivity.Close != null)
currentExecutingActivity.Close(this, null);

}
... and it is the reason why WF needs to keep explicit track of the current executing activity.

Personally, I find the "currentExecutingActivity" part a little unnerving. I would rather say explicitly which activity is closing:

context.CloseActivity(this);

Moreover, I would enforce a call to CloseActivity with the correct argument (this)

public class Activity {
private CloseActivity() {
context.CloseActivity(this); //internal
Much better!

Now we almost everything sorted out; the only piece missing is about recursion. Remember, when we call the continuation directly, we end up with a recursive behaviour. This may (or may not) be a problem; but what can we do to make the two pieces really independent?

We can introduce a simple execution queue:

public class ExecutionQueue
{
private readonly BlockingCollection<Action> workItemQueue = new BlockingCollection<Action>();

public void Post(Action workItem) {
...
}
}
Now, each time we have a continuation (from an internal or external bookmark), we post it to the execution queue. This makes the code for RunActivity much simpler:

internal void RunProgramStatement(Activity activity, Action<WorkflowContext, object> continueAt)
{
logger.Debug("Context::RunProgramStatement");

activity.Closed = continueAt;

// Execute the a activity
Debug.Assert(activity.ExecutionStatus == ActivityExecutionStatus.Initialized);
ExecuteActivity(activity);
} 
As you may have noticed, we are giving away a little optimization; a direct call when the activity just completes should be much faster.
Also, why would we need a queue? We could do just fine using Tasks:

StartNew(() => activity.Execute(this)).
ContinueWith((result) =>
{
if (result.Result == ActivityExecutionStatus.Closed)
continueAt(this, null);
else
{
// if it didn't complete, an external bookmark was created
// and will be resumed. When this will happen, be ready!
activity.OnClose = continueAt;
}
});

internal void ResumeBookmark(string bookmarkName, object payload) {
...
}

internal void CloseActivity(Activity activity) {
if (activity.OnClose != null)
StartNew(() => activity.OnClose(this, null));
}
Which makes things nicer (and potentially parallel too). Potentially, because in a sequential composite activity like Sequence, a new Task will be fired only upon completion of another. Things for the Parallel activity, however, will be different. We will need to be careful with shared state and input and output arguments though, as we will see in a future post.

Are there other alternatives to a (single threaded) execution queue or to Tasks?

Well, every time I see this kind of pause-and-resume-from-here kind of execution, I see an IEnumerable. For example, I was really fond of the Microsoft CCR and how it used IEnumerable to make concurrent, data-flow style asynchronous execution easier to read:

    int totalSum = 0;
var portInt = new Port<int>();

// using CCR iterators we can write traditional loops
// and still yield to asynchronous I/O !
for (int i = 0; i < 10; i++)
{
// post current iteration value to a port
portInt.Post(i);

// retrieve value using simple assignment
totalSum += portInt;
}
Console.WriteLine("Total:" + totalSum);

(Example code from MSDN)
I used the CCR and I have to say that it is initially awkward, but then it is a style that grows on you, it is a very efficient, and it was way of representing asynchronous with a "sequential" look, well before async/await.

The code for ReadLine would become much clearer:

public class ReadLine : Activity
{
public OutArgument<string> Text = new OutArgument<string>();

protected override IEnumerator<Bookmark> Execute(WorkflowInstanceContext context, object value)
{
// We need to wait? We just yield control to our "executor"
yield return new Bookmark(this);

this.Text.Value = (string)value;

// Finish!
yield break;
// This is like returning ActivityExecutionStatus.Close,
// or calling context.CloseActivity();
}
}
Code for composite activities like sequence could be even cleaner:

protected internal override IEnumerator<Bookmark> Execute(WorkflowInstanceContext context)
{
// Empty statement block
if (children.Count == 0)
{
yield break;
}
else
{
foreach (var child in children)
{
foreach (var bookmark in context.RunProgramStatement(child))
yield return bookmark;
}
}
}


Or, with some LINQ goodness:

return children.SelectMany(child => context.RunProgramStatement(child));
The IEnumerable/yield pair have a very simple, clever implementation; it is basically a compiler-generated state machine. The compiler transforms your class to include scaffolding and memorize in which state the code is, and jump execution to the right code when the method is invoked.

But remember, our purpose is exactly to save and persist this information: will this state machine be serialized as well?
According to this StackOverflow question, the answer seem to be positive, with the appropriate precautions; also, it seems that the idea is nothing new (actually, it is 5 years old!)
http://dotnet.agilekiwi.com/blog/2007/05/implementing-workflow-with-persistent.html

...and it also seems that the general thought is that you cannot do reliably.

In fact, I would say you can: even if you do not consider that 5 years passed and the implementation of yield is still the same, the CLR will not break existing code. At IL level, the CLR just see some classes, one (or more) variables that holds the state, and a method with conditions based on those variables. It is not worse (or better) that serializing delegates, something we are already doing.

The iterator-based workflow code sits in a branch of my local git, and I will give it a more serious try in the future; for now, let's play on the safe side and stick to the Task-based execution.

Until now, we mimicked quite closely the design choices of WF; I personally like the approach based on continuations, as I really like the code-as-data philosophy. However the serialization of a delegate can hardly be seen as "code-as-data": the code is there, in the assembly; it's only status information (the closure) that is serialized, and the problem is that you have little or no control over these compiler generated classes.

Therefore, even without IEnumerable, serialization of our MiniWorkflow looks problematic: at the moment, it is tied to BinarySerializer. I want to break free from this dependency, so I will move towards a design without delegates.

Next time: bookmarks without delegates

Sunday, 05 February 2006

# Synthesized proxies

Last time we saw a first solution to the problem of adding synchronization contracts to an arbitrary .NET class. The solution based on Context Attributes and Transparent Proxy had some limitations (performance, no access to class members), so I designed a tool based completely on SRE (System.Reflection.Emit) and System.Reflection. It can be thought of as a post-compiler that analyzes an assembly, custom attributes placed on classes and function, and write a small assembly that contains a proxy to the original one.

The proxy is written using metadata information of the real target object. We use facilities provided by System.Reflection in order to read the interface methods, the custom attributes and the fields and properties of the target object, and System.Reflection.Emit to emit code for the guarded methods and forwarders for the other public functions. The methods in the proxy validate the contract and update the state for subsequent validations, all in a automatic way.

The process to create a guarded component is the following:

• write the class as usual, then add the synchronization contract writing guards for each method using the Guard attribute and Past LTL logic;
• add a reference to the Attribute.dll assembly -the assembly containing the class implementing the Guard attribute- and compile your component. Note that the attribute can be used with every .NET language that supports custom attributes, like C#, Visual Basic, MC++, Delphi, and so on;
• the .NET compiler will store the Guard attribute and its fields and values with the metadata associated to the method.
public class Test{   bool g;   bool f = true;   [Guard("H (g or f)")] //using constructor   public string m(int i, out int j)   {      j = i;      return (i + 2).ToString();   }   [Guard(Formula = "H (g or f)")] //using public field   public string m(int i, out int j)   {      j = i;      return (i + 2).ToString();   }}


The component is then passed through the post-compiler, that will post-process the assembly performing the following steps:

• it walks the metadata of the given assembly, finding all the classes;
• for each class it walks its methods, searching for the Guard attribute. If the attribute is found, the class is marked as forwarded;
• it generates a new empty assembly;
• for each forwarded class, it generates a class with the same public interface as the original one:
• public, non-guarded methods are wrapped by the IL code necessary to perform the synchronization (acquiring the object-wide lock before forwarding the call to the original method and releasing it just after the method return)
• public and private guarded methods are wrapped by the IL code necessary to perform the synchronization and conditional access, like shown here:
.method public instance void  m() cil managed
{
.maxstack  5
IL_0000:  ldarg.0
IL_0001:  call       void [mscorlib]Monitor::Enter(object)
.try
{
//check
IL_0006:  br.s       IL_0008
IL_0008:  ldarg.0
IL_0009:  call       bool [mscorlib]Monitor::Wait(object)
IL_000e:  pop
IL_000f:  ldfld      bool Test::pre1
IL_0010:  brfalse    IL_0008
//update
IL_0012:  ldc.i4.0   //"false"
IL_0013:  ldfld      bool Test::a
IL_0015:  ceq
IL_0018:  stfld      bool Test::pre2
IL_001e:  ldc.i4.0   //"false"
IL_0021:  stfld      bool Test::pre1
//forward
IL_0026:  ldarg.0
IL_0027:  ldfld      class [TestLibrary]Test::GuardedTestLibrary.dll
IL_0031:  call       instance void [TestLibrary]Test::m()
IL_0036:  leave      IL_0048
}  // end .try
finally
{
IL_003b:  ldarg.0
IL_0041:  ldarg.0
IL_0047:  endfinally
}  // end handler
IL_0048:  ret
} // end of method Test::m

• for each forwarded class, it generates constructors with the same signature that calls the ones of the original class.

A schema for the usage of the generated proxy

The generated assembly and its classes are saved in a separate dll, with the same name prefixed by "Guarded". This assembly should be referenced by the client applications in place of the original one, as shown in the Figure.

Tuesday, 10 January 2006

# On openMosix

The first clustering architecture I am going to speak about is openMosix. openMosix is a Linux-only solution, for reasons that will be clear, but the concepts are applicable to every OS with virtual memory architecture. I think that a port of these ideas to the Windows OSes can be very interesting, but enormously challenging (at least for developers that cannot access the sources) and maybe not so paying for: other architectures that require a shift in the concurrent/distributed programming paradigm can bring more benefits at last.

Anyway, openMosix is unique for its (almost) complete tranparency: processes can be migrated to other nodes, and distributed computing could happen, without any intervent on the user or programmer side. openMosix turns a cluster into a big multi-processor machine.

The openMosix architecture consists of two parts:
• a Preemptive Process Migration (PPM) mechanism and
• a set of algorithms for adaptive resource sharing.
Both parts are implemented at the kernel level, thus they are completely transparent to the application level.
The PPM can migrate any process, at anytime, to any available node. Usually, migrations are based on information provided by one of the resource sharing algorithms.
Each process has an home node, the machine where it was created. Every process seems to run at its home node, and all the processes of a user's session share the execution environment of the home node. Processes that migrate to other nodes use the new nodes resources (memory, files, etc.) whenever possible, but interact with the user's environment through the home node.
Until recently, the granularity of the work distribution in openMosix was the process. Users where able to run parallel applications by starting multiple processes in one node, and then the system distributed these processes to the best available nodes at that time; then the load-balancing algorithm running on each node decided when to relocate resources due to changes on nodes load. Thus, openMosix has no central control or master/slave relationship between nodes.

This model makes openMosix not so different from MPI-Beowulf clusters. Fortunately, recent work brought openMosix granularity down to thread level, enabling "migration of shared memory", i.e. the migration of pages of the process address space to other nodes. This feature permits to migrate multi-threaded applications.

(Figures from the MigShm technical report and presentation: The MAASK team (Maya, Asmita, Anuradha, Snehal, Krushna) designed and implemented the migration of shared memory on openMosix)

For process migration, openMosix creates a new memory descriptor on the remote node. This is fine for normal processes, but could cause problems for threads. Because a thread shares almost all of its memory pages with its parent (all but the thread stack and TLS) when threads of the same parent process are migrated, they need to share a common memory descriptor. If they have different descriptors, these threads could point to false segments.
When a thread is migrated, openMosix migrates only the user mode stack of that particular thread. The heap is migrated "on demand", paying attention to the case in which the same node is already executing threads of the same process to ensure consistency.

openMosix + MigShm control flow
Other features of the process are the ridefinition of shared-memory primitives (shalloc() etc.) and linux thread primitives, a transparent Eager Release consistency policy, and the addition of an algorithm for adaptive resource sharing based on the frequency of shared memory usage and the load across the cluster, so that threads are migrated in a way that decreases the remote accesses to the shared memory.

# Processes, Threads and Memory space

This piece of software is a very interesting and good technical quest, however the question is: it is really worth the effort? Could it scale well? Making processes, and above all developers, thinking that they only have to add threads can be misleading. And multi-thread programming requires locking, explicit synchronization, and to scale well a thoughtful administration of running threads. Threads and semaphores are starting to become uncomfortable even for multi-thread programming on a single machine.
My personal opinion is that the future is going in the other direction. There will be no shared memory, and distributed, multithreaded or clustered computation will all have the same interface, with no shared memory. The problem is that memory is lagging behind.

Processes where created for having different units of execution on the same CPU. When they were introduced, we had multiple processes all runnig in the same address space (directly into the physical address space, at that time).
Then, fortunately there was the advent of Virtual Memory, and of private virtual address spaces. We had a balanced situation: every process thought to be the only one in the machine, and to have a whole address space for its own purposes. Communication with other processes was possible, mainly message based. At that time, IPC was substantially the same if processes where on the same machine or in different machines: the main methods where sockets and named pipes.
The introduction of threads put again the system out of balance: every process had many threads of execution, sharing the same address space.

According to my historic Operating System textbook, a process is a program in execution
"with it’s current values of program counter, registers and variables; conceptually every process has it’s own virtual CPU" - A.S.Tanenbaum.
This is very close to the way modern OSes treat processes, running them in a context of execution virtually independent from the others.
"allow a lot of executions in the environment of a process, in wide measure independent the one from the others" - A.S. Tanenbaum.
However this definition for threads is not so close to reality: threads are not so independent among them, because they always share a primary resource (the common addressing space of the parent process).
openMosix "solves" this problem (making threads "independent" again) migrating trasparentely the required memory pages.

# Application Domains

To complicate this design, .NET brought us Application Domains. Application Domains are designed to be "light-weight" precesses, as Chris Brumme explains. But they are "unrelated" to threads.

# Wires?

In my opinion, we need light threads, let call them wires, that live in the managed space (so they not clobber the scheduler), have their memory and their message based  primitives for communication. Use should be simpler than threads; a good starting point may be Join-calculus, or C-omega, or any other language that support asynchronous or active functions. Those fuctions should map directly to wires, and the runtime will map them to native "tasks" (processes or threads or fibers) so that users can finally stop to worry about hacks to mitigate thread performance limitations (number of threads, thread pools, completion ports) and explicit synchronization (sempahores, mutexes, race conditions).
Wires could also adapt very well to a distributed environment: since they carry with them their data, they can be "detached" from a computational node and "re-attached" to a different destination node.

Friday, 06 January 2006

# Context Attributes and Transparent Proxy

When I started to design synchronization contracts, I wanted to play a little with my ideas before trying to implement the whole mechanism directly into the compiler and runtime monitor. I started to wonder how contracts could be introduced in the .NET platform, and at which level.
The answer is similar to the one given for AOP on .NET (more on this in a future post!): you can act at compiler level (static-weaving in AOP parlor), at the execution engine level, and finally at class library level.

Focusing on the last two, how can you insert code for checking and enforcing constracts? According to the various studies on interception and AOP on the .NET platform(1) there are three ways to intercept calls to methods on an object, and do some preprocessing and postprocessing:
• Using Context Attributes and a Transparent Proxy;
• Synthesize a proxy that forward calls to the original object;
• Injecting directly MSIL code with the Unmanaged .NET profiling APIs.
We will see how each of these methods work, and which is better for the current purpose. In this post, from now on, we will focus on the first method: using Context Attributes and a Transparent Proxy.

(1) NOTE: like Ted Neward points out in this its Interception != AOP post, the term AOP is used incorrectly in many articles. I share his ideas on the subject, but for the purposes of this discussion the term interception will suffice.

Context Attributes and a Transparent Proxy

To cook a solution based on this technology, we need three ingredients:
• Custom Attributes to mark methods with a formula
• Reflection (based on .NET metadata) to explore fileds and methods
• .NET Interceptors to hijack execution and check the validity of a formula upon method invocation.
The BCL provides a complete set of managed classes (the Reflection API) that can be used to emit metadata and MSIL instructions in a rather simple way, from a managed application. In the managed world, the structure of libraries and classes is available through the metadata; this reflection mechanism makes possible to write applications that programatically read the structure of existing types in an easy way, and also that add code and fields to those types.

For the second ingredient, the .NET Framework provide also a mechanism called Attributes. According to the MSDN developers guide,
Attributes are keyword-like descriptive declarations to annotate programming elements such as types, fields, methods, and properties. Attributes are saved with the metadata of a Microsoft .NET Framework file and can be used to describe your code to the runtime or to affect application behavior at run time. While the .NET Framework supplies many useful attributes, you can also design and deploy your own.
Attributes designed by your own are called custom attributes. Custom attributes are essentially traditional classes that derive directly or indirectly from System.Attribute. Just like traditional classes, custom attributes contain methods that store and retrieve data; arguments for the attribute must match a constructor or a set of public fields in the class implementing the custom attribute.
Properties of the custom attribute class (like the element that they are intended for, if they are inherited and so on) are specified through a class-wide attribute: the AttributeUsageAttribute.
Here is an example, applied at our problem: an attribute to attach a formula to a method:
[AttributeUsage(AttributeTargets.Constructor |   AttributeTargets.Method | AttributeTargets.Property,   Inherited = true,   AllowMultiple = true)]public class GuardAttribute : Attribute{   public GuardAttribute() {      Console.WriteLine("Eval Formula: " + Formula);   }   public GuardAttribute(string ltlFormula) {      Formula = ltlFormula;      Console.WriteLine("Eval Formula: " + Formula);   }   public string Formula;}

And here is an example of application of our new custom attribute:
public class Test{   bool g;   bool f = true;   [Guard("H (g or f)")] //using constructor   public string m(int i, out int j)   {      j = i;      return (i + 2).ToString();   }   [Guard(Formula = "H (g or f)")] //using public field   public string m(int i, out int j)   {      j = i;      return (i + 2).ToString();   }}

Attributes, like other metadata elements, can be accessed programmatically. Here is an example of a function that, given a type, scans its members to see if some is marked with our GuardAttribute:

public class AttributeConsumer{   Type type;      public AttributeConsumer(Type type)      {      this.type = type;   }      public void findAttributes()   {      Type attType = typeof(GuardAttribute);            foreach (MethodInfo m in type.GetMethods())       {         if (m.IsDefined(attType, true))         {            object[] atts = m.GetCustomAttributes(attType, true);            GuardAttribute att = (GuardAttribute)atts[0];            parseAttribute(att.Formula);         }      }   }      public void walkMembers(string s)   {      BindingFlags bf = BindingFlags.Static                  | BindingFlags.Instance                  | BindingFlags.Public                  | BindingFlags.NonPublic                  | BindingFlags.FlattenHierarchy ;            Console.WriteLine("Members for {0}", s);      MemberInfo[] members = type.GetMember(s, bf);      for (int i = 0; i < members.Length; ++i)      {         Console.WriteLine("{0} {1}",             members[i].MemberType,            members[i].Name);                     //inject additional Metadata for formulas                     //generate code for updating the formulas                     //inject it            }   }      void injectMetadata() {      //...             }}

If such an attribute is found, its formula is scanned, and appropriate fields to hold the previous status of sub-formulae (needed for recording temporal behaviour) are injected into the actual type.

But what about the code? We need to be notified when a method we are interested in (a method with an attached formula) is called. Here comes the third ingredient: .NET interceptors.

.NET interceptors are associated with the target component via metadata; the CLR uses the metadata to compose in a stack a set of objects (called message sinks) that get notified of every method call. The composition usually happens when an object instance of the target component is created. When the client calls a method on the target object, the call is intercepted by the framework, the message sinks get the chance of processing the call and performing their service; finally the object's method is called.
On the return from the call each sink in the chain is again invoked, giving it the possibility to post-process the call. This set of message sinks work together to provide a context for the component's method to execute.

Thanks to attributes, metadata of the target component can be enriched with all the information necessaries to bind the message sinks we want to the component itself. However, custom attiributes are often not sufficient: if you need access to the call stack before and after each method to read the environment (like parameters of a method call), this requires an interceptor and the context in which the object lives: .NET interceptors can act only if we provide for a context for the component.
Let's see how objects can live in a context, and how contexts and interceptors work togheter. In the .NET Framework, an application domain is a logical boundary that the common language runtime (CLR) creates within a single process. Components loaded by the .NET runtime are isolated form each other: they run independently of one another and cannot directly impact one another; they don't directly share memory and they can communicate only using .NET remoting (although this service is provided transparently by the framework). Components living in separate appdomains have separate contexts. For objects living in the same application domain, a context is provided for any class that derives from System.ContextBoundObject; when we create an instance of a subclass of ContextBoundObject the .NET runtime will automatically create a separate context for the newly created object.

This diagram shows the flow of a call between the a client class and an object in a different context or application domain.
In such a situation, the .NET framework performs the following steps:
1. A transparent proxy is created. This proxy contains an interface identical to the recipient, so that the caller is kept in the dark about the ultimate location of the callee.
2. The transparent proxy calls the real proxy, whose job it is to marshal the parameters of the method across the application domain. Before the target object receives the call there are zero or more message sink classes that get called. The first message sink pre-processes the message, sends it along to the next message sink in the stack of message sinks between client and object, and then post-processes the message. The next message sink does the same, and so on until the last sink is reached. Then the control is passed to the stack builder sink.
3. The last sink in the chain is the stack builder sink. This sink takes the parameters and places them onto the stack before invoking the method in the receiving object.
4. By doing this, the recipient remains as oblivious to the mechanism used to make the call as the initiator is.
5. After calling the object, the stack builder sink serializes the outbound parameters and the return value, and returns to the previous message sink.
So, the object implementing our pre- and post-processing logic have to participate in this chain of message sinks.

For a class implementing a sink to be hooked to our target object, we first need to update our attribute to work with context-bound objects. This is done by deriving it from ContextAttribute instead of Attribute and implementing a method for returning a context property for that attribute:
[AttributeUsage(AttributeTargets.Class, Inherited = true)]public class InterceptAttribute : ContextAttribute{       public InterceptAttribute() : base("Intercept")   {   }   public override void Freeze(Context newContext)   {               }   public override void       GetPropertiesForNewContext(IConstructionCallMessage ctorMsg)   {      ctorMsg.ContextProperties.Add( new InterceptProperty() );   }   public override bool IsContextOK(Context ctx,       IConstructionCallMessage ctorMsg)   {      InterceptProperty p = ctx.GetProperty("Intercept")          as InterceptProperty;      if(p == null)         return false;      return true;   }   public override bool IsNewContextOK(Context newCtx)   {      InterceptProperty p = newCtx.GetProperty("Intercept")          as InterceptProperty;      if(p == null)         return false;      return true;   }}[AttributeUsage(AttributeTargets.Constructor |     AttributeTargets.Method | AttributeTargets.Property,     Inherited = true,    AllowMultiple = true)]public class GuardAttribute : Attribute{    public GuardAttribute() {}   public GuardAttribute(string ltlFormula)   {      Formula = ltlFormula;      AttributeConsumer ac = new AttributeConsumer();      //parse formula...      LTLcomp parser = new LTLcomp(ac);      parser.openGrammar(...);      parser.parseSource(ltlFormula);   }   private string Formula;   public void Process()   {      //evalute Formula   }   public void PostProcess()   {      //update formula   }}

At object creation time GetPropertiesForNewContext is called for each context attribute associated with the object.
This allows us to add our own \emph{context property} to the list of properties of the context bound with our target object; the property in turn allows us to add a message sink to the chain of message sinks:

The intercepting mechanism is really powerful. However, for our purposes it is not enough. It has three major diadvantages:
• performance: the overhead of crossing the boundary of a context, or of an appdomain, isn't always acceptable (the cost of a function call is 20-fold, from some simple measures I did). If you already need to do this (your component must live in another appdomain, or in another process, or even in another machine) there is no problem and almost no overhead, since the framework need already to establish a proxy and marshal all method calls;
• we have to modify the class(es) in the target component. They must inherith from ContextBoundObject. And since .NET doesn't support multiple inhritance, this is a rather serious issue;
• only parameter of the method call are accessible. The state of the target object, its fields and properties, are hidden. Since to find out if an object is accessible from a client we need to inspect its state, this issue makes it very difficult to use the interception mechanism for our purposes.
Next time we'll see the first completely working solution: synthesized proxies.
Wednesday, 04 January 2006

# What am I reading right now?

• What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg. Eh, I knew it from my first Fortran program that floats (the standard 32 bit precision using REALs in that language) will give you rounding precision. And they accumulate pretty well! Very interesting reading, though.

• Practical Common Lisp: my struggling to become good at this (for me, and for now) weird language. Very good till this point.

• Language Support for Lightweight Transactions, by Tim Harris and Keir Fraser. I am  always interested in concurrency and distributed computing, so this is a must after MSDN Magazie January's End Bracket, by Joe Duffy

• Speaking of parallel computing, I learned a lot at work on current standards for grid computing. Basically, we are building a computing cluster, and two different projects come into the scene: openMosix and MPI (the second one is the protocol choosen for Windows Server 2003 CCE, too). The two use two very different approaches, each with his own drawbacks and strenghts. I want to study some more, especially on the openMosix front, and then expose here what I learned and my own ideas.

• And, last but not least, Hackers and Painters, by Paul Graham. Very interesting and stimulating reading, made even more stimulating by the fact that I agree with many of his opinions, but I totally disagree with many others. I find it difficult to understand how an open minded persons could fall trapped in the same mistakes he points out in other people. But maybe Paul wrote some of his pages only to please an audience.. He was going to sell his book, after all. I want to discuss this topic more deeply in the future; it surely deserves a post.

Sunday, 27 November 2005

# Jingle Bells time again...

Yesterday we had a snow shower, today is the first Sunday of Advent.. What should be more appropriate for a practical example than Jingle Bells? =)

Trono designed the Santa Claus problem as a new exercise in concurrent programming. He wished to have an exercise harder to solve than the ususal mutual exclusion problems, and indeed the Santa Claus problem involves three sorts of process that need to synchronize and cooperate at certain moments. The problem may be stated as follows:

Santa repeatedly sleeps until wakened by either all of his nine reindeer,
back from their holidays, or by a group of three of his ten elves.
If awakened by the reindeer, he harnesses each of them to his sleigh,
delivers toys with them and finally unharnesses them (allowing them
to go off on holiday). If awakened by a group of elves, he shows each
of the group into his study, consults with them on toy R&D and
finally shows them each out (allowing them to go back to work).
Santa should give priority to the reindeer in the case that there is
both a group of elves and a group of reindeer waiting.

Some parts of the problem, if solved with simple primitives like monitors or semaphores, are very tricky: Trono's original implementation had a bug itself, as pointed out by Ben Ari. Ben Ari solved the problem in a quite simple way using Ada95 synchronization primitives. He also showed how java monitors ar not much better at this task than the simple P-V semaphores used by Trono.
Unlike Java, Ada 95 do not use monitors with condition variables which must be explicitly signalled and waited upon. The language uses instead a construct similar to our wait preconditions: a boolean expression (called a barrier) associated with each method. If the barrier evaluates to true, the thread calling the method is allowed to execute it, it nobody is already 'holding' that object (access to objects is mutual exclusive).
If the barrier is false, threads calling the method are queued. When a method exits, other threads are signalled, so they can re-evaluate therir barrier and see if they are allowed to execute the method.

Wait preconditions acts in the same manner, with two exceptions:
1. it is possible to use temporal logic
2. it is possible to refer to meta-variables representing methods.
Later, Nick Benton solved the problem using Polyphonic C# (now Comega). Comega is based on a paradigm derived from join calculus, whose basis are chords and asynch methods. To make it short, synchronization is message-based, and execution derives from the ability of a chord to 'consume' messages.

For example:

static void reindeerBack() & static async  reinWaiting(int r){  if (r == 8)    reindeerReady(); // last reindeer  else    reinWaiting(r + 1);}

the method reindeerBack will be executed only if there is a reinWaiting 'message' ready to be consumed. Then will send the reindeerReady 'message' that will be consumed by the Santa's thread. This will wake up and do its job with the reindeer.

Polyphonic C# solution is nice, and the code very compact. However, it could be quite hard to understand if you are not familiar with process algebras, or with concurrency based on messages.

For our solution, we take an approach that combines the two. Synchronization contracts let us do so, since it is possible to use wait preconditions like Ada95 write barriers. On the other side, we can mimick (in a very restricted way, but sufficient for this example) the method/message consumption of Comega.

We use the Room class from Ben-Ari solution (we'll call it Queue, and this class will replace class nway in the Comega solution, since they solve the same problem).

class Queue{   private int queueSize;        public Queue(int size)   {      queueSize = size;   }     void Accept() //Open_Door   {      queue.clear();      enterPossible = true;   }     public void Entry(Object o)      requires_wait(enterPossible == true)   {      queue.add(o);      waiting = waiting + 1;      if (waiting < queueSize)      {         waitForLastMember();      }      else      {         waiting = waiting - 1;         enterPossible = false;         roomDoor = true;      }   }     private void waitForLastMember(Object o)      requires_wait(queue.size() == groupSize)   {      waiting = waiting - 1;      if (waiting = 0)      {         //i'm the last one         roomDoor = false;                 santa.Wake(this.queue);      }   }   public bool Full()   {      return queue.size() == queueSize;   }     public void WaitFull()      requires_wait(queue.size() == queueSize);  }

Then, to wake Santa, we do not use a special keyword like Ada95 (we do not have one, at this point), but somethig similar to Comega:

class Santa{   //...   void WakedByReindeer()   {      harness.WaitFull();         DeliverToys();      unharness.WaitFull();           ReindeerQueue.Accept();   }     void Wake() { }     void Sleep()      require_wait(pre(@Wake))   {      if (ReindeerQueue.Full())      {         //Santa's wake and Go to Deliver (when all harnessed..)         WakedByReindeer();      }      else         WakedByElves();   }}

We do not allow, with the actual implementation, to override methods with same signature but different wait preconditions, otherwise the Comega solution:

static void waittobewoken() & static async reindeerready() { }static void waittobewoken() & static async elvesready() { }

could be mapped to:

void Sleep()      require_wait(pre(@ReindeerWake));void Sleep()      require_wait(pre(@ElvesWake));

Surely, Comega solution is more elegant in this point, but our solution is not so bad, and we hadn't the need to introduce a keyword, like select in Ada95.

The key point is that synchronization contracts are flexible; they allow the programmer to use the style best tailored to the task, or best tailored to her own experience.

References:

N. Benton
Jingle Bells: Solving the Santa Claus Problem in Polyphonic C#
Microsoft Research, 2003

M. Ben-Ari
How to solve the santa claus problem.
Concurrency: Practice & Experience, 1998.

Saturday, 26 November 2005

# Synchronization contracts: an example of syntax

I'd like to introduce very briefly the syntax of synchronization contracts.
As we have seen in the previuos two posts, we use them to express
• assertions on temporal behaviour
• synchronization
In the first group we have three keywords, mutated from DbC (those familiar with the concept will be at home here): requires, ensures and invariant.

In the second group, we have the variants of preconditions and postconditions we introduced last time: requires_wait and ensures_wait. In the base form of a synchonization contract language, these keywords are all we need. Well, plus the LTL keywords, of course.
Their meaning is pretty obvious: for a complete introduction, see chapter 5 of my thesis. Here I will introduce only some of them:
• pre(cond): cond was true at the previous step
• once(cond): cond was true once
• cond1 since cond2: cond2 became true, and from that moment cond1 was true also
• cond1 until cond2: cond1 must hold until cond2 is true
These operators will be useful to express conditions on the flow of the program: the first three are PLTL operators, the last one is a classical LTL operator.

class Stream{    bool ready;    invariant(@recevive -> (!@send until @endReceive));    invariant(@receive -> finally(endReceive));    void init();    void receive()       requires(ready && once(@init));    void endReceive()    void send()    void endSend()  }

Pretty simple, huh? the @method construct is used to express the concept "the method has been executed", and is a pretty shortcut for not introducing a ton of state variables. Here synchronization contracts are used to specify behaviour, and to check it. But let's see another example:

public class HistoryBuffer{   invariant(current == max -> (!@Put until @Get));   public virtual object Get()      requires_wait(current > 0);   public virtual object Get()      requires_wait(current < max);   {      --current;      object ret = buf[current];      return ret;   }   public virtual object GGet()      requires_wait(pre(@Put))   {      return Get();   }}

Now we have chacks (the invariant) and also synchronization (the guards on Put, Get and GGet). The GGet function have the requirement to be executed only if the previous operation is a Put (i.e. it does not permit two subsequent Get operations). See how it is simple to express such a concept, much simpler than with a stadard programming language.
Next time we'll see an interesting synchronization problem and how it can be solved with synchronization contracts.

Friday, 25 November 2005

# Contracts and Inheritance anomaly.. A solution?

Despite the dozens of concurrent object languages born in the research laboratories few have seen practical use: objects and concurrency constructs have failed to work well together.

This happenes because the object oriented methodology and the synchronization of concurrent activities are orthogonal concepts. The puprose of Concurrent Object Oriented Language is to find a way to resolve automatically this orthogonality.

The famous anomaly in the combination of synchronisation of concurrent activities and inheritance is well known as the Inheritance Anomaly: when inheriting from a base class to add a new method, unrelated to the other methods in the class, concurrency constructs force the redefinition of all other methods of a class.

There is not "solution" to the inheritance anomaly: as long as synchronization constructs and application code are interwinded, the problem stands.
However, we can mitigate it separating the two as much as possible. If we could put no synchronization code at all, obviously the anomaly is solved.
Therefore, we use synchronization constract not only to check the temporal behaviour of objects, but to "orchestrate" them. To put it simple, a method is not executed till its precondition is statisfied; assertions are turned into synchronization primitives. A method will not return until its postcondition is satisfied (so we have a rendez-vous point). This approach make it possible to write monitor-like code using only precondition and poscondition, without writing code in the method body.

Of course, not all scenarios can be modeled using only synchronization contracts, but temporal logic should be powerful enough for many common scenarios. Next post, after all this speaking, we will finally see a real application in a common scnario.

Thursday, 24 November 2005

# Introducing synchronization contracts

In the last year, during my thesis period and later, I have heard a lot of interest from the software research community around contracts. Maybe I'm biased, since my thesis was on contracts, but Don Box claimed more than once that Indigo (now WCF) is based on contracts, and Singularity, the new prototipe of managed OS from Microsoft Research, is based as contracts as well.
In the same period, we also had an increasing interest in concurrency; the most evident sign is the interest raised by Herb Sutter's article "The free lunch is over".

The idea expressed in my thesis is: why do not use contracts to specify concurrency constraints? We can use them to check if a desired temporal behaviour is met, or if an undesiderable sequence of events happens. But we can do even more: we can enforce a desired temporal behaviuor, generating synchronization code from the contract isself.

If you are interested in this technique, let me explain it in some more details: in a contractual aware language, the correctness of a program can be expressed and enforced through contracts. Contracts are assertions on a single method (pre-conditions and post-conditions) or over a whole class (class or object invariants). In their simplest form, those assetions are expressed in boolean logic, but there are various "levels" for contracts:
• basic or syntactic contracts (for example: XML schemas, IDL);
• assertion-based contracts (for example: Eiffel);
• behavioral contracts (assertions are not only on methods or classes, but on the control flow as well);
• synchronization contracts (like the previous ones, but able to work in a concurrent environment, i.e. to check temporal behaviour);
• QoS contracts (like the previous ones, but with the possibility to specify also numerical constraints (throughput, mean response time, etc.)).
Therefore a contract aware language is often a dialect of an original programming languages with few additional keywords for expressing assertions.

A compiler for a language supporting concurrent contracts, therefore, will automatically generate checks to see at run-time if a undesiderable sequence of events happens. To express assertions from which the check will be generated we use temporal logics, an extension of boolean logic that adds operators for specifying temporal behaviour.

How it works? We mutuate techinques from formal cheching techniques. The temporal logics we use (LTL -linear temporal logic- and PLTL -LTL with Past-) are mutuated from this field as well.
The life of an object can be seen as an execution trace, i.e. a sequence of traversed states. From its "birth" (construction) to its "death" (finalization and destruction) an object has an internal state, and this state is altered by external events through callings to its method, using the interface it provides.

We can check if a particular assertion holds by examining the sequence of state the object traversed. Moreover, since the satisfaction of a Past LTL formula can be defined in a recursive manner, for those formulas we can do it only looking at the previous execution step.
For LTL formulae, instead, we make use of well-known algorithms to translate every formula in a finite state automaton.

This is for the checking side, but we wished to push it a little further...

Tuesday, 08 November 2005

# Managed OS and contracts

Today I had some time to flush my blog queue, and I found two interesting posts: the last episode of Mark Russinovich "war" against Sony and its buggy rootkit used as a DRM system (surely something agaist trused computing.. I'd like to know if anyone knows Microsoft's official position) and Adi Oltean's Sunday post on the Singularity OS at http://research.microsoft.com/os/singularity/

I think the first one is surely worth discussion, but many (too many, since it reached the mainstream media) words have already been spent.

The second one came unexpected, and with a little sense of envy.. for not being there. It's since my conversion to the CLR that I'm dreaming and speculating on how to build an OS in "managed" code (or using as much managed code as possible). And now they have done it!
In the technical report there are many interesting ideas: from contracts (not so different from the one's we'll be using in Indingo, it seems) to the use of MSIL to produce verifialble code, the global optimizing compiler (better than a JIT) or the global garbage collector. And versioning, trusted components, and all the benefits the CLR brought to us... OS wide!
As my MSC thesis was on concurrency contracts for the CLR, I think a very interesting part of the project is the Sing# programming language. I had experience of the Spec# programming language (a good project from another MSR team). I actually had the luck to speak with Schulte and Tillmann from the Spec# team at SAPS last year. Their work inspired mine on the CxC language. Now I really want to go and see how Sing# works with concurrency, and if it is possible to fit it with my idea of concurrency contracts. I will do further reading on the subject in the following days, time permitting as always. For those interested, you can find the paper here, and my thesis here.