# Wednesday, 26 April 2006

Did you know...

...that there are (more than) 21 algorithms to compute the factorial of a number?
I discovered this fact today, while implementing the Fisher's exact test for a 2x2 contingency table.
(For those interested: Fisher's exact test for a 2x2 contingency table, like the more famous Chi-square test, is used for the analysis of categorical frequency data: its aim is to determine whether the two categorical variables are associated.
We use it to extract the most significant genes from a gene set experiment: the two categories are "clue present/clue absent", while the two data sets are the control set and the experiment set)
Fisher exact test requires the computation of 9 factorials for each iteration, and one of them is on the total. Typically, applets and code you can found on the internet can handle data up to a total of about 100-200 before giving up due to overflow problems. Even code that uses libraries for arithmetic with arbitrary precision can handle numbers up to 6000-12000 before becoming unsuable.

Here, we are facing two problems:
  • numbers (integers and doubles) provided by the machine (physichal CPU or VM) have a finite precision and extension
  • the simple and dumb algorithm (the one that performs all the multiplications) is very SLOW
The solution is to adopt a good library that handles multiplications of very large numbers in an efficient way (using Karatsuba multiplication or a FFT-based multiplication), and forget about the naive algorithm. At this page you can find implementations and benchmarks for the 21 different algorithms I mentioned. The most efficients use prime numbers (or, better, primorials, i.e. the multiplication of the n first primes), but even the simple and elegant Split Recursive performs a lot better than the naive algorithm!

Notable news

Some funny / useful things I found browsing last week:
  • From Brad Adams, .NET Framework 2.0 Poster
    It is the replication of the version 1.1 I found shipped with Visual Studio 2003; unfortunaltely the same poster wasn't in my Visual Studio 2005 box. Great!
  • selk'bag
    The SELK’BAG is a sleeping bag you wear. The entire bag fits like a glove. You can walk with it, sleep in a sit position.. I think it would be great to have one of these for my work place in summer: the ice-conditioning system makes our offices freezing cold!
  • TiwyFeeds
     "Take It With You" Feed Reader: the idea is great, the realization is poor..

    But I'd definitely love a feed reader 'a-la' gmail, where you can store all your feeds and news for an unlimited amount of time, with a searchable interface!
  • Krugle
    A search engine for developers: personally I use google + MSN desktop search (for code on my machine). But a tool tailored to developers is surely great! I'd like to have Visual Studio search and navigation facilities in a desktop serch application...

     

# Wednesday, 19 April 2006

User interfaces

There is a lot of hype around new user interfaces lately: the new Vista glass look, how it compares to the always improving Mac OSX interface, and the Linux response (Xgl).
These are all nice and eye candy evolutions of existing interfaces, though. Like I discussed in a previous post, and as Jeff Atwood frequently reports in his articles, there is a lot of room for UI improvement. Embedded and ubiquitous search is a key improvement in my opinion. But surely future interfaces will bring more interesting features, even without a radical shift of paradigm. In the last "In our time" column of IEEE Computer (March 2006), David Alan Grier talks about pervesive computing. The definition of pervasive computing dates back to 1991, and it is due to Mark Weiser. Mark wrote that "the most profound technologies are those that disappear", "weave themselves into the fabric of everyday life".
Mark war an engineer working at (guess where?) Xerox PARC. Exactly, the same lab where GUI as we know them today were born. 
He was aware of the importance of such ideas, but he also felt that they were incompatible with the idea of a pervasive computing environment. A windowing operating system, no matter how sophisticated, only makes "the computer screen a demanding focus of attention, it does not fade into the background".
I agree with him: where I work I often see people that are so focused on the screen to forget somehow they primary work. They are so focused on how, they forget what.

But is it really a defect of the window paradigm?
I don't think so. Especially after seeing "The Island". Take a look at this desktop environment:



do you see the computer? The operating system?
Maybe a closer look...






The desktop... is the desk. It is amazing to see it in action! Using some sort of mouse, the principal (The character with glasses) "throws" a window at Lincoln (the guy in white). Then Lincoln begins to draw, like on a sheet of paper on the desk. And when he finishes, he passes the window with the drawing back to the principal.
And there is no shift of paradigm. There are windows, and folders, and documents on a desk. A mouse to grab and move them. But the desktop IS the desk!

This is pervasive computing, for me. And I really want one of those desks... :)
# Wednesday, 05 April 2006

I am Windows 2000

This is ironic...

You are Windows 2000 SP3.  You're a steady and reliable friend.  People think you're all business, but with your recent therapy you've become a little more playful.
Which OS are You?


But I like Win2000 very much!
# Friday, 24 March 2006

Rotor 2.0!

Finally! From JasonZ through Brad Adams. I was waiting for it... I'm downloading it right now, and this week (if work premits me to do it) I'm going to digg into some new cool features, like anonimous methods and delegates (C# compiler) and Lightweight Code Generation (LCG - BCL).


# Monday, 06 March 2006

Security lesson no.6: .NET Security

Finally, we reached the last topic in our cycle of security lessons on software attacks: the security model of .NET. We will see how CAS work, what are evidences and strong names, etc. I'll also give an hint about the "weakest link" in this model.

NOTE: my assumptions were made for version 1.1 of the framework. Some things where updated in 2.0 (in particular, there are good news on how the new version cope with the "weakest link".. but I want to speak about this point more precisely in a future post, since the work I did on this topic allowed me to learn a lot about the .NET runtime/loader and the Windows loader as well!)

dotNETSecurity.ppt (619.5 KB)

Did you know...

...that there are (more than) 21 algorithms to compute the factorial of a number?
I discovered this fact today, while implementing the Fisher's exact test for a 2x2 contingency table.
(For those interested: Fisher's exact test for a 2x2 contingency table, like the more famous Chi-square test, is used for the analysis of categorical frequency data: its aim is to determine whether the two categorical variables are associated.
We use it to extract the most significant genes from a gene set experiment: the two categories are "clue present/clue absent", while the two data sets are the control set and the experiment set)
Fisher exact test requires the computation of 9 factorials, and one of them is on the total. Typically, applets and code you can found on the internet can handle data up to a total of about 100-200 before giving up due to overflow problems. Even code that uses libraries for arithmetic with arbitrary precision can handle numbers up to 6000-12000 before becoming unsuable.

Here, we are facing two problems:
  • numbers (integers and doubles) provided by the machine (physichal CPU or VM) have a finite precision and extension
  • the simple and dumb algorithm (the one that performs all the multiplications) is very SLOW
The solution is to adopt a good library that handles multiplications of very large numbers in an efficient way (using Karatsuba multiplication or a FFT-based multiplication), and forget about the naive algorithm. At this page you can find implementations and benchmarks for the 21 different algorithms I mentioned. The most efficients use prime numbers (or, better, primorials, i.e. the multiplication of the n first primes), but even the simple and elegant Split Recursive performs a lot better than the naive algorithm!

# Sunday, 05 March 2006

Security lesson no.5: DLL Injection

Yesterday we saw a brilliant :) solution to a problem, using some techniques I always loved. Unfortunately in these days of troubles techniques like DLL injection and IAT patching are used mostly by malware than by useful and great software. So it is important for the software developer that cares about security to know how they work and what can be done to prevent them.

DLL Injection is the topic of this lesson, but we will also see what it is possible to do once our malicious DLL is inserted into another process address space: window subclassing, Virtual Memory walking (in search of private data like passwords, for example) and IAT overwrite.

Have fun! (and behave responsibly, as usual...)

DLLInjection.ppt (319.5 KB)

ex6-DLLinjection.zip (139.39 KB)
ex7-VMWalk.zip (338.91 KB)
ex8-IATOverwrite.zip (224.36 KB)
# Saturday, 04 March 2006

Intercepting Windows APIs

As I described in a previous entry, one of the few games I really enjoyed playing was Enemy Territory. It is a free, multi-player FPS based on the Quake 3 engine. It is class based: you chose a class and that dictates the ability of your soldier (and what he can do). I played with my fellow university mates: some of them created a clan (they even did one or two official tournaments) and they wanted to train (I was not particularly good.. I received the "easy frag" attribute!). Besides, it was good to relax an hour after lunch, before attending other lessons.

However, we had an hard time playing it... The admin won't let us use the computer lab for non didactic purposed. It is silly, if you ask me, especially it was not explicitly forbidden by college rules: for example, students and professors alike are allowed to use empty classrooms to play card games. So why can't we use an empty lab to play a free game? Since the labs were not under CCTV surveillance, we took the risk and played nonetheless (we were young.. :) ). But one day, an email from the admin warned me to not use that particular game anymore. How did they know? Simple: someone was checking all the files on the public directories (were the game was installed), which user owned them (using ACLs) and what kind of files they where.

Me and a friend of mine started to think about the problem. Initially we thought about manipulating the ACL to change the ownership (maybe to the Administrator.. it would be ironic!) of the game files, but it was impractical and it required privileges above those allowed to students, and we didn't want to do anything illegal (like an escalation of privileges). Our solution was simple: hide your programs, not only your data.

Once upon a time, programs consisted of a single .exe (or .com) file. Nowadays instead, an average application has thousands of files and DLLs in its installation directory. Think at Office, or at a game like Quake3. We wanted to execute a complete program out of a sigle data packed file, possibly compressed or encrypted. I'll discuss our ideas and the techniques we used, namely DLL injecting and API intercept and forwarding. We began to discuss seriously on the topic. Our first idea was to provide a DLL that was a proxy / interceptor for the msvcrt.dll, the C runtime of MS C++ compiler. This DLL contains the implementation of the C file handling function, such as fopen, fread, fseek. We can make a DLL with the same name, put it in the app directory (which come firts in the loader search path), export all the function of the original msvcrt.dll implementing file handling function and passing other function to the original DLL. Phew, a lot of work...msvcrt.dll exports 780 functions! We can already sense the calluses on our fingers! Furthermore, the C runtime can be statically linked to the exe, or the program could directly call Win32 API functions.

But wait, even fopen, fread, fseek and friends call Win32 API functions! So, plan B: intercept kernel32 functions! Despite her name, kernel32 is not a kernel module: is a simple user mode DLL that provides a nice API for the real kernel calls. So it can be intercepted... Calling the application we want to execute ot of the compound file victim, all we have to do is:

  1. Place some code in the victim process address space.
  2. Execute this code in order to:
    1. locate the IAT (Import Address Table) of the exe
    2. patch pointers in the IAT to point to OUR functions
  3. For now on, all calls to the patched functions will cause a jump not to the original kernel32 code, but to our functions.
The advantages of this appoach? It's more economic (we have to write only the functions we need), it works with (almost) every app (even with non C apps) and it's funny to code!

DLL injection

We need to place code and execute it in the address space of another process. This at first can seem impossible: every Win32 process has its Virtual Address Space and pointers range over this space, so it's impossible to access another process space [1][2]

The virtual address space: the lower 2GB are the user-mode space, and they are private for each process (see [1][2] for details)

Well, not really: how debuggers can work? With the help of the OS of course! We'll ask for help to the OS too. Our goal is to load a DLL in the victim address space: when a DLL is loaded, function \emph{DllMain} in the DLL is called, with dwReason equal to PROCESS_ATTACH. There are several methods to load a DLL in a process [3]:

  1. Windows HOOKS (the most ancient one). A Hook is a callback function called by windows every time a particular event occours: the most interesting one, when a top window is created or destroyed. We can then see if the application is interesting, and what to do with it. The nice thing is that the DLL that contains hook code is loaded into the other application address space.
  2. The registry. Somewhere in the registry, you can specify a key in which you place DLLs that have to be loaded in every process address space (\HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\AppInit_DLLs) . This is how mouse or video card DLLs end up in your address space. Drawback: you must have Admin rights to write to the registry, and you DLL is loaded in a lot of non-interesting processes. What a waste.
  3. Two magic Win32 functions: CreateRemoteThread and WriteProcessMemory [4].
Ritcher in [4] explains the magic very well. To summarize:
  1. Obtain the HPROCESS of the victim (via CreateProcess or via its pid).
  2. Reserve some space in the Virtual Address space of the victim with VirtualAllocEx.
  3. Use WriteProcessMemory to write the name of the DLL to load in the memory just reserved.
  4. Use CreateRemoteThread to load the library.

The virtual address space: the lower 2GB of the user-mode space, with the kernel32.dll loaded at the same address.

At a first time, we believed we needed to write shell code to execute LoadLibrary, and this is bad for 3 reasons:
(a) is difficult to write,
(b) with the new XP-SP2 NX (non execute) page protect flag we could have troubles.
Fortunately, we realized a fact: DLL are loaded in every process address space, so are private to each process. However, when you create a DLL you specify a "preferred load address'" at link time, and the OS loader will load the DLL at that address if it's free. This is due the fact that otherwise the loader must relocate the DLL, and this is a time consuming operation. This is particularly true for system DLLs, which are loaded always at the same address in every process. So, if we do a GetProcAddress against LoadLibrary in our process, we obtain the same address as in the victim process.

Scheme of the steps that lead zdll.dll to be loaded in victim's address space

We can pass to CreateRemoteThread the address of LoadLibrary as startup routine, and the name we wrote in victim address space as parameters as in figure.

IAT patching

Now we have our own code running in a thread in victim's address space. What can we do now? Everything. In particular, we can have access to PE data directories in our "host", the victim. The executables in Win32 (DLLs, exe, and even device drivers) follow a format called PE (Portable Executable). Every PE is divided in sections: export, import, resources, debug data, delayload, bound modules...[5][6].

The section we are interested in is the import section, with its IMAGE_IMPORT_DESCRIPTOR structure.

The import section, with its two parallel arrays of function pointers

The import section after the loader has done its work. The IAT now points to function entries in kernel32.dll

There's one IMAGE_IMPORT_DESCRIPTOR for each imported PE (executable or, a most common case, DLL). Each IMAGE_IMPORT_DESCRIPTOR points to two essentially identical arrays. The first one is the Import Address Table (IAT). The second one is called the Import Name Table (INT) and is used by the loader as a backup copy in case the IAT is overwritten in the binding process. Binding is an operation done to PE files before the link step, but this goes beyond the scope of this article. Matt Pietrek in [5] covers all the details. The IMAGE_THUNK_DATA structures in the IAT has two roles:

  • In the executable file, they contain either the ordinal of the imported API or an RVA (Relative Virtual Address, an offset from the base address at which the PE is loaded) to an IMAGE_IMPORT_BY_NAME structure. The functions we need to patch in DLLs are those with a name, so we look at those entries that contain an RVA. The IMAGE_IMPORT_BY_NAME structure is just a WORD, followed by a string naming the imported API.
  • When the loader starts the executable, it overwrites each IAT entry with the actual virtual address of the imported function 

 

The import section after zdll's DllMain has done its work. The IAT now points to function entries in zdll.dll

So we need to replace the addresses placed in the IAT by the loader with the addresses of our functions. Here the INT becomes important: how do we know which entry in the IAT we need to overwrite for, as an example, CreateFileA? We need to iterate through the entries of the IAT and INT together. The INT provides the name for the n-th entry, the IAT its VA. We simply overwrite the entry in the IAT with our own.

void patchIAT(PIMAGE_THUNK_DATA32 pINT, PIMAGE_THUNK_DATA32 pIAT)
{
   PIMAGE_IMPORT_BY_NAME ordinalName;

   while (1) // Loop forever (or until we break out)
   {
      if ( pINT->u1.AddressOfData == 0 )
           break;

        ULONGLONG ordinal = -1;

        if ( IMAGE_SNAP_BY_ORDINAL32(pINT->u1.Ordinal) )
           ordinal = IMAGE_ORDINAL32(pINT->u1.Ordinal);           
       
        if ( ordinal != -1 )
      {
           // We don't consider un-named functions
      }
      else
      {
         ordinalName = (PIMAGE_IMPORT_BY_NAME)getPtrFromRVA((DWORD)(pINT->u1.AddressOfData));         
        
         const char* funcName = (const char*)ordinalName->Name;
         PDWORD oldFuncPointer (PDWORD)&(pIAT->u1.Function);
        
         if (funcName == "CreateFileA")
         {
            pIAT->u1.Function = MyCreateFile;
            break;
         }        
      }     
       
      pINT++;         // Advance to next thunk
          pIAT++;         // Advance to next thunk
   }   
}


Compound file

So, at this point the only thing to do was to provide our own implementation of functions like CreateFile, WriteFile, SetFilePointer, FindFirstFile... and patch the IAT for kernel32 with them. But how can we  implement a file system in a single file? After some searching, I suggested that maybe Structured Storage, the way Microsoft calls its compound files, could be used: Word and Powerpoint uses them, for example.
It was only a suggestion, but the day after my mate came with an almost complete implementation based con Structured Storage functions and COM interfaces. Amazing! The last things to do were an application for building a compound file, and some cryptography to hide the content of the file. After all, this was the original goal :)

The final product worked. It was great! A piece of software complex as a video game was able to run with our own file APIs. We never used it (it was a bit too slow on startup, and we found a much simpler solution: network our notebooks), but it was fun, and I used the intercepting library we created for more interesting stuff!



[1] Jeffrey Richter. Load your 32-bit dll into another process’s address space using injlib. Microsof System Journal, May 1994.

[2] Jeffrey Richter. Advanced Windows Programming, 3rd edition. Microsoft Press, 1997.

[3] Mark Russinovich. Inside memory management, part 1. Windows and .NET Magazine, August 1998.

[4] Mark Russinovich. Inside memory management, part 2. Windows and .NET Magazine, September 1998.

[5] Matt Pietrek. Inside windows: An in-depth look into the win32 portable executable file format. MSDN Magazine, Feb. 2002.

[6] Matt Pietrek. Inside windows: An in-depth look into the win32 portable executable file format, part 2. MSDN Magazine, March 2002.

[7] Microsoft corp. Platform SDK: Structured storage. MSDN Library, April 2004.


# Thursday, 02 March 2006

Security lesson no.4: Integer overflow

To finish the cycle of lessons on overflow-based attacks, I couldn't miss a mention to integer arithmetic overflow. Integer arithmetic overflow is unharmful on its own, but can be combined with another type of attack, typically a buffer overflow. Consider the following code from a previous lesson:

int ConcatString(char *buf1, char *buf2,
    size_t len1, size_t len2)
{
   char buf[256];
   if((len1 + len2) > 256)
      return -1;
   memcpy(buf, buf1, len1);
   memcpy(buf + len1, buf2, len2);
   return 0;
}

it seems to avoid the buffer overflow problem with a simple check. However, this function is unsecure. Why? Discover it in my slides!

IntOverflow.ppt (128.5 KB)