# Thursday, 04 August 2016

Building barriers: compilation time VS execution time

In order to obtain what we want, i.e. fine-grained resource control for our "snippets", we can act at two levels:

  • compilation time
  • execution time

Furthermore, we can control execution in three ways:

  1. AppDomain sandboxing: "classical" way, tested, good for security
  2. Hosting the CLR: greater control over resource allocation
  3. Executing in a debugger: even greater control over the executed program. Can be slower, can be complex

Let's examine all the alternatives.

Control at compilation time

Here, as I mentioned, the perfect choice would be to use the new (and open-source) C# compiler.

It separates the compilation phases cleanly, has a nice API, and can be used to recognize "unsafe" or undesired code, like unsafe blocks, pointers, creation of unwanted classes or calls to undesired methods.

Basically, the idea is to parse the program text into a SyntaxTree, extract the nodes matching some criteria (e.g. DeclarationModifiers.Unsafe, calls to File.Read, ...), and raise an error. Another possibility is to write a CSharpSyntaxRewriter that encapsulates (for diagnostics) or completely replaces some classes or methods.
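As a rough illustration of the idea, here is a minimal sketch of such a scan. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package; SnippetScanner, FindViolations and the ForbiddenCalls list are hypothetical names invented for this example, and it looks for the unsafe keyword token (SyntaxKind.UnsafeKeyword) rather than using DeclarationModifiers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class SnippetScanner
{
    // Hypothetical blacklist, for illustration only.
    static readonly string[] ForbiddenCalls = { "File.ReadAllText", "File.Delete" };

    public static List<string> FindViolations(string source)
    {
        var violations = new List<string>();
        SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();

        // Every 'unsafe' keyword: covers unsafe blocks and the 'unsafe' modifier.
        foreach (SyntaxToken token in root.DescendantTokens()
                     .Where(t => t.IsKind(SyntaxKind.UnsafeKeyword)))
        {
            int line = token.GetLocation().GetLineSpan().StartLinePosition.Line + 1;
            violations.Add("unsafe code at line " + line);
        }

        // Purely textual match on the callee expression. A real check would
        // resolve symbols through a SemanticModel instead of comparing strings.
        foreach (var call in root.DescendantNodes().OfType<InvocationExpressionSyntax>())
        {
            string callee = call.Expression.ToString();
            if (ForbiddenCalls.Any(f => callee.EndsWith(f, StringComparison.Ordinal)))
                violations.Add("forbidden call: " + callee);
        }

        return violations;
    }
}
```

The textual match is easy to fool (aliases, reflection, delegates), so this should be read as a first filter, not as a security boundary.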

Unfortunately, Roslyn is not an option: StackOverflow requirements prevent the usage of this new compiler. Why? Well, users may want to show a bug, or ask about a particular behaviour they are seeing in version 1 of C# (no generics), or version 2 (no extension methods, no lambda expressions, etc.). So, for the sake of fidelity, it is required that the snippet can be compiled with an older version of the compiler (and no, the /langversion switch is not really the same thing).

An alternative is to act at a lower level: IL bytecode.
It is possible to compile the program, and then inspect the bytecode and even modify it. You can detect all the kinds of unsafe code you do not want to execute (unsafe, pointers, ...), detect the usage of Types you do not want to load (e.g. through a whitelist), and insert "probes" into the code to help you catch runaway code.
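To make the whitelist idea concrete, here is a small sketch using Mono.Cecil (assuming version 0.10 or later, where ModuleDefinition is disposable). TypeWhitelist, FindDisallowedTypes and the contents of AllowedNamespaces are hypothetical, chosen only for illustration; the sketch lists every type reference of a compiled assembly whose namespace is not on the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Mono.Cecil;

public static class TypeWhitelist
{
    // Hypothetical whitelist, for illustration only.
    static readonly HashSet<string> AllowedNamespaces = new HashSet<string>
    {
        "System", "System.Collections.Generic", "System.Text"
    };

    // Nested types report an empty namespace, so walk up to the outermost type.
    static string NamespaceOf(TypeReference type)
    {
        while (type.DeclaringType != null)
            type = type.DeclaringType;
        return type.Namespace;
    }

    // Returns the full names of all referenced types outside the whitelist.
    public static List<string> FindDisallowedTypes(string assemblyPath)
    {
        using (ModuleDefinition module = ModuleDefinition.ReadModule(assemblyPath))
        {
            return module.GetTypeReferences()
                .Where(t => !AllowedNamespaces.Contains(NamespaceOf(t)))
                .Select(t => t.FullName)
                .Distinct()
                .ToList();
        }
    }
}
```

Filtering by namespace is coarse (the System namespace alone contains types a sandbox would not want to expose), so a real whitelist would more likely list individual types and members.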

I'm definitely NOT thinking about "solving" the halting problem with some fancy new static analysis technique... :) Don't worry!

I'm talking about intercepting calls to "problematic" methods and wrapping them. So for example:

static void ThreadMethod() {
   while (true) {
      new Thread(ThreadMethod).Start();
   }
}
This is a sort of fork bomb

(funny aside: I really coded a fork bomb once, 15 years ago. It was on an old Digital Alpha machine running Digital UNIX we had at the university. The problem was that the machine was used as a terminal server powering all the dumb terminals in the class, so bringing it down meant the whole class halted... whoops!)

After passing it through the IL analyser/transpiler, the method is rewritten (compiled) to:


static void ThreadMethod() {
   while (true) {
      new Wrapped_Thread(ThreadMethod).Start();
   }
}

And in Wrapped_Thread.Start() you can add "probes", perform every check you need, and allow or disallow certain behaviours or patterns. For example, something like: 

if (Monitor[currentSnippet].ThreadCount > MAX_THREADS)
  throw new TooManyThreadException();

if (OtherConditionThatWeWantToEnforce)
  ...

innerThread.Start();


You intercept all the code that deals with threads and wrap it: thread creation, synchronization object creation (and wait), setting thread priority ... and replace them with wrappers that do checks before actually calling the original code.
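For illustration, a wrapper along these lines might look like the sketch below. It is not the code used in Pumpkin: it uses only System.Threading, tracks a single global counter instead of a per-snippet Monitor, and the MAX_THREADS value, the Join method and the LiveThreads property are assumptions made for the example.

```csharp
using System;
using System.Threading;

// Hypothetical exception type, named after the one used in the check above.
public class TooManyThreadException : Exception
{
    public TooManyThreadException(string message) : base(message) { }
}

// Hypothetical stand-in that the IL rewriter would substitute for Thread.
public class Wrapped_Thread
{
    public const int MAX_THREADS = 4;   // illustrative limit
    static int liveThreads;             // one counter: assumes a single snippet

    readonly Thread innerThread;

    public Wrapped_Thread(ThreadStart start)
    {
        innerThread = new Thread(() =>
        {
            // Release the slot when the snippet's thread body finishes.
            try { start(); }
            finally { Interlocked.Decrement(ref liveThreads); }
        });
    }

    public void Start()
    {
        // Reserve a slot first; give it back if the limit is exceeded.
        if (Interlocked.Increment(ref liveThreads) > MAX_THREADS)
        {
            Interlocked.Decrement(ref liveThreads);
            throw new TooManyThreadException("Snippet exceeded " + MAX_THREADS + " threads");
        }

        try { innerThread.Start(); }
        catch { Interlocked.Decrement(ref liveThreads); throw; }
    }

    public void Join() { innerThread.Join(); }

    public static int LiveThreads => Volatile.Read(ref liveThreads);
}
```

With this in place, the rewritten fork bomb fails with TooManyThreadException once the limit is reached, instead of exhausting the machine.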

You can even insert "probes" at predefined points: inside loops (when you parse a while or a for or, at IL level, before you jump) and before function calls (to have the ability to check execution status before recursion). These "probes" may be used to perform checks, to yield the thread quantum more often (Thread.Sleep(0)), and/or to check execution time, so you are sure snippets will not take the CPU all by themselves.
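A probe of this kind could be as small as the following sketch. Probe, Check, Budget and SnippetTimeoutException are hypothetical names, and the budget is measured as wall-clock time since the first probe hit on the current thread, which is only an approximation of CPU time.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical exception raised when a snippet runs for too long.
public class SnippetTimeoutException : Exception
{
    public SnippetTimeoutException(string message) : base(message) { }
}

public static class Probe
{
    // Illustrative budget; a supervisor would set this per snippet.
    public static TimeSpan Budget = TimeSpan.FromSeconds(2);

    // One stopwatch per thread, started at the first probe hit on that thread.
    [ThreadStatic] static Stopwatch clock;

    // The rewriter would insert a call to this at loop heads and before calls.
    public static void Check()
    {
        if (clock == null)
            clock = Stopwatch.StartNew();

        if (clock.Elapsed > Budget)
            throw new SnippetTimeoutException("Snippet exceeded its time budget");

        // Give up the rest of the quantum so other threads get a chance to run.
        Thread.Sleep(0);
    }
}
```

After rewriting, a loop such as while (true) { ... } would start its body with Probe.Check(), so even an infinite loop ends with an exception once the budget is spent.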

An initial version of Pumpkin used this very approach. I used the great Cecil project from Mono/Xamarin. IL rewriting is not trivial, but at least Cecil makes it less cumbersome. This sub-project is also on GitHub as ManagedPumpkin.

And obviously, whatever solution we may choose, we do not let the user change thread priorities: we may even run all the snippets in a thread with *lower* priority, so the "snippet" manager/supervisor classes are always guaranteed to run.

Control at execution time

Let's start with the basics: AppDomain sandboxing is the bare minimum. We want to run the snippets in a separate AppDomain, with a custom PermissionSet. Possibly starting with an almost empty one. 

Why? Because AppDomains are a unit of isolation in the .NET CLI used to control the scope of execution and resource ownership. It is already there, with the explicit mission of isolating "questionable" assemblies into "partially trusted" AppDomains. You can select from a set of well-known permissions or customize them as appropriate. Sometimes you will hear this approach referred to as sandboxing.

There are plenty of examples on how to do that, so it should be simple to implement (see, for example, the PTRunner project).
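As a starting point, creating such a domain with an almost empty PermissionSet could look like the sketch below. It only applies to the .NET Framework (AppDomain sandboxing is not available on .NET Core and later); the Sandbox class, the domain name and the snippetDirectory parameter are hypothetical.

```csharp
using System;
using System.Security;
using System.Security.Permissions;

public static class Sandbox
{
    // Creates a partially trusted AppDomain rooted at snippetDirectory.
    public static AppDomain Create(string snippetDirectory)
    {
        // Start from nothing and grant only the right to execute code.
        var permissions = new PermissionSet(PermissionState.None);
        permissions.AddPermission(
            new SecurityPermission(SecurityPermissionFlag.Execution));

        // Assemblies are resolved from this directory only.
        var setup = new AppDomainSetup { ApplicationBase = snippetDirectory };

        // No evidence and no full-trust assemblies: everything loaded into
        // the domain runs with the grant set above.
        return AppDomain.CreateDomain("SnippetSandbox", null, setup, permissions);
    }
}
```

The host would then load and run the snippet inside the returned domain, and call AppDomain.Unload on it when the snippet finishes or misbehaves.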

AppDomain sandboxing helps with the security aspect, but can do little about resource control. For that, we should look into some form of CLR hosting.

Hosting the CLR

"Hosting" the CLR means running it inside an executable, which is notified of several events and acts as a proxy between the managed code and the unmanaged runtime for some aspects of the execution. It can actually be done in two ways:

1. "Proper" hosting of the CLR, like ASP.NET and SQL Server do

Looking at what you can control through the hosting interface, you see that, for example, you can control and replace all the native implementations of "task-related" (thread) functions.
It MAY seem overkill. But it gives you complete control. For example, there was a time (a beta of CLR v2 IIRC) in which it was possible to run the CLR on fibers, instead of threads. This was dropped, but gives you an idea of the level of control that can be obtained.

2. Hosting through the CLR Profiling API (link1, link2)

You can monitor (and DO!) a lot of things with it. I have used it in the past for on-the-fly IL rewriting: you are notified when a method is about to be JIT-ed, and you can modify the IL stream before the JIT compiles it. (A past project of mine used it for a similar purpose, monitoring thread synchronization... I should have talked about it on this blog years ago!)

In particular, you can intercept all kinds of events related to memory usage, CPU usage, thread creation, assembly loading, ... (it is a profiler, after all!).
A hypothetical snippet manager running alongside the profiler (which you control, as it is part of your own executable) can then use a set of policies to say "enough!" and terminate the offending snippet's threads.

Debugging

Another project I did in the past involved using the managed debugging API to run code step-by-step.

This gives you plenty of control, even if you do not do step-by-step execution: you can make the debugged code "break into" the debugger at thread creation, exit, ... And you can issue a "break" at any time, effectively gaining complete control over the debugged process (after all, you are a debugger: it is your raison d'être to inspect running code). This can be done at regular intervals, preventing resource depletion by the snippet.
