# Wednesday, January 04, 2006

What am I going to say?

Now that 2006 has arrived, it is time for a little plan for this blog. Something I want to share is my experience adding contracts (synchronization contracts, to be precise) to the .NET Framework. I ran various trials: at the BCL, CLR and compiler level. Each one has advantages and disadvantages. Personally, I think the one acting at the CLR level is the most interesting (it uses the Profiling API and the Unmanaged Metadata API to dynamically inject the verification and update code!). So, expect at least three posts on the three methods.

Next, I'd like to spend some more words on concurrency and parallel computing.

And since I'm starting a new project on bioinformatics at work (fold prediction), I should be able to write some posts on this topic too.

Finally, if they want me to keep fighting with Castor while porting damn Java web applications, I'll have to vent my frustration here.. =)

# Wednesday, December 14, 2005

Meaningful programming T-shirts

From Jeff Prosise

Comment my code?

Why do you think they call it code?

Perfectly true. The same idea is expressed in a book I'm reading (Hackers and Painters, by Paul Graham: an interesting book, with some very good points and some very weak spots.. more on this in a later post, when I finish it).

The shirt is obviously intended to be ironic.. but there is truth in the statement. A programming language is the perfect way to express algorithms; if you have to go and comment it.. you are not writing good code! Comments should be placed only in the right and meaningful spots (bleargh to automatic tools: what were they thinking when they promoted the use of these tools?)

On a similar note, another T-shirt (or cup) I always wanted to have is

It compiles!

Let's ship it!

(Also the subtitle of my blog). The message is funny, but it shouldn't be: I hate testing, and my dream for the future is a really automatic and clever tool that runs when you build your program. I love static analysis, and remember: an error the compiler can catch is one bug less in your product!

# Tuesday, December 13, 2005

On the way of OS history

In my last post I wrote that Windows/386 (on a Compaq Prolinea 486) was my first OS. Next came Windows 3.1 (and 3.11), and then? If you are guessing Windows 95, you are wrong.. My next OS, in 1994 if I recall correctly, was Windows NT 3.1! A friend of mine was a system administrator at my father's office, so when he bought NT 3.5 he gave NT 3.1 to me! What a big leap =) I still had dual boot with DOS (for games) but I think I can say I pioneered the 32-bit era. I have never left NT since then, and I always had dual-boot machines (though OSes were really expensive at that time! I remember I changed machine when it was OS upgrade time, to get a cheaper OEM version): Win95 / NT 4.0, Win95 / Win2000.. and finally a single OS with XP!

My favorite OS still remains NT 3.5. I never had it on one of my machines, but I used to work on it. It was beautiful, a "pure" OS, if you get what I mean. On the other hand, the one I admired the most was Windows 95. It was, and is, an engineering masterpiece of balance. How they managed to make it so compatible with a plethora of legacy applications, bring 32 bits to userland, and keep it fairly stable.. it's a mystery!

# Monday, December 12, 2005

I wish I was there...

...but at the original presentation I was only 5!
Surely, this amusing presentation by Charles Petzold is a bit late =) , but I found it fascinating. I hope we will not lose the memories of the early programming days, as too often happens in many other fields!
I have never used Windows 1.0; my first version was Windows/386, a recompiled version of Windows 2.0 (which was meant to run on the 286, I believe). I still have the diskettes and manuals!

I grew up as a computer enthusiast first, and as a computer programmer later, in the Windows world, so it is no surprise that I try my best to defend it against useless attacks on how buggy or annoying or insecure... or whatever Windows is!

Happy birthday Windows!

# Sunday, December 11, 2005

Sorry for the delay

Wow! It's been a long time since my last post.. Too much to do at work, plus a little holiday with a trip to Florence.. =)
Here in Italy we had a pleasant long weekend (we call it a "bridge" when a holiday falls close to the weekend), and I especially enjoyed it at my grandma's in Florence.
It snowed a little the week before. It was only about 20 cm, but I think we are no longer used to snow: traffic was a mess, and the children were so happy =) It used to snow a lot more when I was a child: when I was five we had a huge snowfall that brought us almost 1 meter of snow!

So, back to this blog's subject: programming! At work we had a lot to do because our servers finally arrived. There were 4 beautiful AMD Opterons at 2.6 GHz that I had to set up as an HPC cluster. An HPC cluster is not for redundancy or fail-over, nor for load distribution in a web-server fashion: it is about high performance computing. So, I had to learn about ways of configuring and using the cluster, and because I found the information both interesting and amusing, the next few entries will be about clusters!

# Tuesday, November 22, 2005

MSDN search on Firefox

I use Mozilla Firefox as my primary browser, mainly for its great extensibility through extensions/plugins/whatever. They are a big plus for every developer! Everybody knows that customization is not a primary goal for the mainstream market (most end users prefer standardization); however, if you are a developer you try to tailor everything to your needs, from the development IDE to the web browser.
Having heard from Antimail about the new MSDN search engine, I could not resist adding it to the search engines in Firefox.. =) Here are the code and the PNG image for it.

BTW, I hope Microsoft will add a managed model to IE 7. Exposing the browser features to .NET languages would be great. I'm not talking about the HTML rendering component: that one is already an ActiveX object reusable in every application. I am thinking about the whole browser: context menus, search bar, progress, address and menu bars.. It would be a pleasure to write browser plugins with .NET: for now, it is possible only in the excellent new Visual Studio 2005!

msdnplugin.src (.89 KB)
msdnplugin.png (.88 KB)

# Friday, November 18, 2005

More on SPARC

Yesterday we briefly saw the architecture of Sun SPARC processors, and how the presence of a big register file and its organization into register windows affects the calling convention and the assembly programming model.
The idea seems clever, and C programs should benefit greatly from this design (we know that function calls are a significant performance cost on x86). However, it is with good reason that the SPARC-Solaris platform was called Slowlaris.. =)
The register window was designed without considering a real-world scenario: the presence of all those registers is a terrible slowdown on a multitasking operating system; when a context switch happens, all the registers of the thread being switched out must be dumped to memory. It's not like on x86, where only the general purpose registers need to be pushed to memory; the other ones are "hidden", and used cleverly by the processor.
For SPARC processors, the number of registers is 136 (for an 8-window processor), i.e. about 1 KB. With increasing CPU frequencies, memory bandwidth is a major bottleneck, and with a time slice of 50 ms (a rather long time slice) the memory traffic wasted is about 20 KB every second.
It may not seem like much, but as an example I have read that IPC would be 5 times slower on a SPARC processor than on a comparable 8-register processor, if all SPARC registers are saved and restored on each context switch.
Algorithms for lazy context switching have been proposed, but they are applicable only with a specially designed kernel or with hardware modifications.
Hence the "friendly" name Slowlaris for the SPARC-Solaris pair.
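
As a rough check of the figures above, here is a small sketch in C of the arithmetic. It assumes 4-byte registers, 16 registers per window plus 8 globals, and that every register is written out once and read back once per context switch; the function names are made up for this example.

```c
#include <assert.h>

/* Registers in a windowed register file: each window contributes
   16 registers (8 in + 8 local; the outs overlap the next window's ins),
   plus 8 globals shared by all windows. */
unsigned long window_registers(unsigned long windows)
{
    return windows * 16UL + 8UL;
}

/* Bytes moved per context switch, assuming every register is saved once
   and restored once (hence the factor 2). */
unsigned long bytes_per_switch(unsigned long windows, unsigned long reg_bytes)
{
    return 2UL * window_registers(windows) * reg_bytes;
}

/* Bytes moved per second for a given time slice in milliseconds. */
unsigned long bytes_per_second(unsigned long windows, unsigned long reg_bytes,
                               unsigned long slice_ms)
{
    assert(slice_ms > 0 && slice_ms <= 1000);
    return bytes_per_switch(windows, reg_bytes) * (1000UL / slice_ms);
}
```

With 8 windows, 4-byte registers and a 50 ms slice this gives 136 registers, 1088 bytes per switch and 21760 bytes per second, which matches the roughly 1 KB and 20 KB per second quoted above.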

# Thursday, November 17, 2005

Porting Bugs

Today I worked with a developer from another company on porting some C programs they wrote for SPARC (with Solaris) to our machines, which are all based on Intel x86. It went pretty smoothly; we were ready to face the endianness problem (for network communication and reading binary files). But we found a very interesting bug, which didn't manifest on SPARC but fired immediately on Intel. The source was more or less:
    #include <stdio.h>

    int Function1(int paramA, int paramB)
    {
        if (paramA == 0)
            //bad value
            return 0;

        if (paramB != 1 && paramB != 2)
            //bad value
            return 0;

        //do stuff
        //call another function (Function2)
    }

    int main()
    {
        int a = 1, b = 2;   //example values, not from the original program

        if (Function1(a, b) == 0)
            printf("Error: Function1 returned 0\n");
    }

Can you already spot the bug? We did, after a little "tracing" with printf (I know, it is the worst kind of debugging.. but we were compiling over ssh, in a shell, and it proved to be a quick solution).
If you have already guessed what's next, you can stop reading and skip to the solution. If not, you can enjoy a trip through the calling conventions of two very different platforms: a CISC and a register-rich RISC.

I know, defining a Pentium 4 as CISC is not quite correct. Pentium processors now have a superscalar architecture and out-of-order execution, and internally they use micro-ops that are very RISC-like. And they have tons of registers (does anybody know how many, exactly?). But the "programming interface", i.e. the instruction set and register set exposed by the CPU to the world, is CISC. It has very few general purpose registers (EAX, EBX, ECX, EDX and a handful of more specialized ones), and a rich instruction set.

On the other side, the SPARC architecture, instruction set and register set are very different. Even function calling conventions on SPARC are deeply influenced by its early hardware design. Since the instruction set was reduced (in the first version there wasn't even an instruction to divide two numbers), the engineers thought of filling the spared die space with a lot of registers. And in order to use them efficiently, they organized them in several "register windows".
On x86, every function has its own stack space (the stack frame). On SPARC every function also has its own "register space", a register window. A register window is a set of 24 registers: 8 %i (input) registers, 8 %o (output) and 8 %l (local). The registers are mapped in a clever way: the out registers of the caller become the in registers of the callee, so that the first 6 parameters are passed through the out/in registers %o0-%o5 (the remaining two hold the stack pointer and the return address). The first one has special additional semantics:

%o0  (r08)  [3]  outgoing parameter 0 / return value from callee   
%i0  (r24)  [3]  incoming parameter 0 / return value to caller

This register, unlike the other input registers, is assumed by the caller to be preserved across a procedure call.
This mapping is done with the help of the SAVE and RESTORE instructions, so the prolog/epilog structure on SPARC looks like:

   save  %sp, -K, %sp

   ; perform function
   ; put return value in register %i0 (if the function returns a value)

   ret
   restore    ; executed in the delay slot of ret

It is important to note that in the SAVE instruction the source operands (first two parameters) are read from the old register window, and the destination operand (the rightmost parameter) is written to the new window. "%sp" is used as both source and destination, but the result is written into the stack pointer of the new window, which is a different register (the stack pointer of the old window is renamed and becomes the frame pointer).
The call sequence is:

call <function>
mov 10, %o0

The delay slot is often filled with an instruction that sets up a parameter; in this example it loads the first parameter.

What is a delay slot?
When the control flow is altered (by a control transfer instruction, like call, jmpl or a branch) the order of execution appears inverted: the instruction after the transfer instruction is fetched while the transfer instruction itself is still being processed.
This is done to implement a simple pipeline solution: when a jump instruction is performed, there is already another instruction in the pipeline. That instruction, the one in the delay slot, will be executed before the processor actually has a chance to jump to the new location. This is the simplest scheme for the chip designer to implement, since the general pipelining mechanism can be used without making any exceptions for transfer instructions; and it is the fastest way of arranging things, since no instructions are discarded.
The SPARC assembly language programmer must constantly be aware of delay slots while coding any change in the flow of control, since the order of instruction execution is reversed with respect to the order in which the instructions appear.

Returning to our problem: when function a calls function b, the %o registers in function a become the %i registers in function b.
If b does not touch %i0 (it reaches the end of the function without a return statement), %i0, i.e. the return value, remains equal to the incoming first parameter.
In the original code, paramA is always != 0 at that point (otherwise the function returns early with value 0), so the function returns to the caller with a return value != 0 (equal to the first parameter).
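
To make the overlap concrete, here is a toy model in C of a windowed register file. It is only a sketch under simplifying assumptions: the window pointer moves forward on save (real SPARC decrements it), there are no overflow traps, and all the names (RegFile, simulate_call and so on) are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NWINDOWS 8
#define REGS_PER_WINDOW 16  /* 8 in + 8 local; the 8 out are the next window's in */

typedef struct {
    uint32_t phys[NWINDOWS * REGS_PER_WINDOW];
    int cwp;                /* current window pointer */
} RegFile;

/* %i<i> of the current window. */
static uint32_t *in_reg(RegFile *rf, int i)
{
    assert(i >= 0 && i < 8);
    return &rf->phys[rf->cwp * REGS_PER_WINDOW + i];
}

/* %o<i> of the current window: physically the %i<i> of the next window. */
static uint32_t *out_reg(RegFile *rf, int i)
{
    int next = (rf->cwp + 1) % NWINDOWS;
    assert(i >= 0 && i < 8);
    return &rf->phys[next * REGS_PER_WINDOW + i];
}

static void save_window(RegFile *rf)    { rf->cwp = (rf->cwp + 1) % NWINDOWS; }
static void restore_window(RegFile *rf) { rf->cwp = (rf->cwp + NWINDOWS - 1) % NWINDOWS; }

/* Simulates one call: the caller passes first_arg in %o0; the callee
   optionally writes return_value into %i0 before returning. The result
   is what the caller then reads back from %o0. */
uint32_t simulate_call(uint32_t first_arg, int callee_sets_return,
                       uint32_t return_value)
{
    RegFile rf = {0};

    *out_reg(&rf, 0) = first_arg;        /* caller: mov first_arg, %o0 */
    save_window(&rf);                    /* callee prolog: save        */
    if (callee_sets_return)
        *in_reg(&rf, 0) = return_value;  /* callee: mov value, %i0     */
    restore_window(&rf);                 /* callee epilog: restore     */
    return *out_reg(&rf, 0);             /* caller reads %o0           */
}
```

In this model simulate_call(7, 0, 0) gives back 7: a callee that never writes %i0 appears to return its own first argument, which is exactly what hid the bug on SPARC.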

What happens when porting to Intel? Intel calling conventions are very different. Remember, x86 has only a few general purpose registers accessible through assembly language, so it passes parameters on the stack:

  • Arguments are pushed on the stack in reverse order.
  • The caller pops arguments after return.
  • Primitive data types, except floating point values, are returned in EAX or EAX:EDX depending on the size.
  • float and double are returned in ST(0), i.e. the top of the x87 floating point register stack.
  • Simple data structures with 8 bytes or less in size are returned in EAX:EDX.
  • Complex Class objects are returned in memory.
  • When a return is made in memory the caller passes a pointer to the memory location as the first parameter. The callee populates the memory, and returns the pointer. The caller pops the hidden pointer together with the rest of the arguments.

If no C return statement is found, when the control flow reaches the end of the function the ret instruction is executed and control returns to the caller. The return value, i.e. the one in the EAX register, is whatever value that register last held... possibly the return value of the last function called.
In our case, the last function called before the end (Function2) returned 0, and so did our Function1.

What is the lesson? Compiling and testing on one platform (even with a cross-platform compiler, like the GCC we used) isn't enough. What surprised me was the complete absence of compiler errors or warnings. Maybe I am too used to tools that warn you if you do something wrong =) but that's the way I like them. When a compiler seems too strict or gives you too many warnings, remember that an error detected by the compiler is one bug less in your product.
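
For completeness, a corrected version of the function could look like the sketch below. The success value 1 and the body of Function2 are assumptions (the original program is not shown in full), and the range check uses && on the assumption that the intent was to reject anything other than 1 or 2.

```c
/* Hypothetical stand-in for the "other function" called at the end. */
static int Function2(void)
{
    return 0;
}

int Function1(int paramA, int paramB)
{
    if (paramA == 0)
        return 0;               /* bad value */

    if (paramB != 1 && paramB != 2)
        return 0;               /* bad value */

    /* do stuff */
    Function2();

    return 1;                   /* explicit success value on every platform */
}
```

Assuming GCC, compiling with -Wall turns on -Wreturn-type, which reports the original version with "control reaches end of non-void function".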

# Friday, November 11, 2005

Visual Studio 2005 installed!

Yesterday I was at the Microsoft 2005 Technical Conference, part of the 2005 Ready to Launch tour. We all got Visual Studio 2005 Standard NFR; a very welcome present. It's true: Visual Studio 2005 rocks! Thank you to the whole team for the great work, and to Cyrus for the beautiful IDE!

# Thursday, November 10, 2005

Enemy Territory fun!

I'm not a passionate gamer: I like games very much, but especially from a technical/developer point of view: often they are the expression of the state of the art in many fields, condensed in one product. Think about algorithms, graphics, artificial intelligence, physical simulation and numerical computation. I love games, and I very much like experimenting with graphics, but I am a sporadic gamer (with the exception of RPGs like Star Wars KOTOR I and II and Neverwinter Nights).

Back in my third year at university, however, some of my fellow students formed an Enemy Territory clan. Enemy Territory is a multiplayer-only, team-based FPS set in the Second World War. It is free, since it was meant to be the multiplayer part of a single-player game, but the project for the latter failed.
You can choose your team, Axis or Allies, and a class: soldier, medic, engineer, covert ops, field ops. Every class has unique strengths and abilities: it's great fun to be a medic and "revive" your friends, or to be a covert op and steal the uniform of a member of the other team, or to be a sniper....

The absence of single-player capabilities, and of any artificial intelligence at all (no bots), is stimulating: every opponent is a human, and so there are no predefined tactics or "stupid" behaviours (well, only humanly stupid behaviour).

There is a stock of predefined sentences to communicate quickly with the other members of the team over the internet, when you can't reach the others "by voice", like "I need a medic", "They are coming", etc. Among them, "We need an engineer" (mainly to defuse dynamite) and "Enemy in disguise" (when you see a covert ops with your uniform..). The sentences come as both a written banner at the bottom of the screen and an audio file. In the shipped version the written sentences match the audio files, but it is possible to change the written part.

Someone on the internet did something that made me laugh for an hour or so: they replaced "We need an engineer" with "We need a ninja here" and "Enemy in disguise" with "Enemy in the skies". Same pronunciation! And I saw someone from my team looking up, watching the skies..
And when things get really bad, it's so cool to go around and call a ninja to help you =)

# Wednesday, November 09, 2005

More on syntax highlight

As you may have noticed, I have "solved" my problem of posting code. Now, if you look back at last Sunday's post, I have a cool area bounded by a dashed border for my code, with syntax coloring for C++ and correct indentation. And it works on Firefox too.
My effort was very limited, since I did not code the entire solution by myself as I originally wanted. I'm not entirely satisfied, so maybe in the future I will do it, or simply replace the syntax highlight engine. DasBlog uses the syntax highlight engine from Thomas Johansen. His highlight engine (here) uses a simple API that takes the source code and a language definition file (written in XML), and spits out HTML. The code must be enclosed in <pre> tags, since the engine does not encode spaces as &nbsp;. This was the first bug in dasBlog, and I got rid of it easily. The second problem was: why does the popup window that should let me enter the code not show up?
The problem was in the showModalDialog JavaScript function. It is a very handy function for showing modal windows in a browser, very good for creating rich-content dialogs. Unfortunately, Firefox does not implement it. I could go for two solutions: using an alternative function (or better, a set of functionality like that implemented by SubImage) or not using a popup dialog at all. I went for the second one.
So, I was able to have the syntax highlighter I wanted in three steps:
  • Redesign the Add Entry form;
  • Add some HTML to the code returned by Thomas Johansen's highlighter (for the border and the indentation);
  • Add an XML file for the C++ syntax.
It works pretty well; however, the highlighter has some problems: I wasn't able to highlight curly braces, for example, and it isn't able to recognize tokens not separated by whitespace:

// Separated tokens
if (a != b)

// Not working
if (a!=b)
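
To illustrate the second limitation, here is a minimal sketch in C of a scanner that does not rely on whitespace to separate tokens. It is not how Thomas Johansen's engine works (that one is driven by an XML language definition); the names tokenize and MAX_TOKEN_LEN and the small operator table are assumptions made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_TOKEN_LEN 32

/* Returns 1 if s starts with one of a few two-character operators. */
static int starts_with_two_char_op(const char *s)
{
    static const char *const ops[] = { "!=", "==", "<=", ">=", "&&", "||" };
    size_t i;

    for (i = 0; i < sizeof ops / sizeof ops[0]; i++) {
        if (s[0] == ops[i][0] && s[1] == ops[i][1])
            return 1;
    }
    return 0;
}

/* Splits src into tokens: identifiers/numbers, two-character operators
   and single punctuation characters. Whitespace is skipped but is not
   required between tokens. Returns the number of tokens stored. */
size_t tokenize(const char *src, char tokens[][MAX_TOKEN_LEN], size_t max_tokens)
{
    size_t count = 0;

    assert(src != NULL && tokens != NULL);

    while (*src != '\0' && count < max_tokens) {
        size_t len = 1;     /* default: one punctuation character */
        size_t copy_len;

        if (isspace((unsigned char)*src)) {
            src++;
            continue;
        }
        if (isalnum((unsigned char)*src) || *src == '_') {
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
        } else if (starts_with_two_char_op(src)) {
            len = 2;
        }

        /* Copy at most MAX_TOKEN_LEN - 1 characters, always terminated. */
        copy_len = len < (size_t)(MAX_TOKEN_LEN - 1) ? len : (size_t)(MAX_TOKEN_LEN - 1);
        memcpy(tokens[count], src, copy_len);
        tokens[count][copy_len] = '\0';
        count++;
        src += len;
    }
    return count;
}
```

Both "if (a != b)" and "if (a!=b)" come out as the same six tokens (if, (, a, !=, b, )), so a highlighter built on a scanner like this could color them identically.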

For now it works pretty well for me, but in the future I could code my own syntax highlight engine (for example following this article) or I could replace it with a more professional one, for example CodeHighlighter.

Here is a screenshot of my modified form; if you like it, send me an email and I'll post the source code here.

# Sunday, November 06, 2005

Blog engine, Code and firefox

DasBlog has a nice feature: since it uses the FreeTextBox component to allow editing of the blog, it also has an "Insert Code" button. However, this component is less than satisfactory, the main problems being:
  • Lack of C++ language support (as you can see in yesterday's post: the highlighting is for C#)
  • It does not work on Firefox (Mozilla browsers do not have the showModalDialog function)
  • It does not maintain indentation (very bad!)
  • I wish to separate code and free text, maybe enclosing code in a nice box
So, I think I am going to implement something myself, that supports at least C++ (maybe /CLI)! Another pet project for my list =)

# Saturday, November 05, 2005

Blog engine

Finally I managed to set up a blog in my web space.
The first problem was: find a ready-made solution, or craft a simple blog engine by myself? Since, unfortunately, time is finite, I had to go for the first one.
I have already declared my love for .NET (and I will go into details on why I find ASP.NET the coolest solution for web applications in a future post), so I had to go for an ASP.NET solution. I found two good blog engines: CommunityServer/.Text/Subtext and dasBlog.
I think that CommunityServer is well beyond my needs; however, I like .Text/Subtext. Unfortunately, my hosting provider does not have MS SQL... at least, not without paying too much (and even in that case, without direct access to the server! Which means, no stored procedures).
I found a nice article by Robert Love on customizing .Text for use with Borland InterBase. And Phillip J Haack has plans to port Subtext to MySQL in the future. So this became one of my (many) pet projects. I plan to port .Text/Subtext to pure SQL, by writing a neutral DataProvider. It should then be feasible to use it with MySQL or even with a good embedded SQL engine like Firebird or SharpHSQL... For now, I am using the excellent dasBlog, which requires very little effort to deploy and set up!

Blog topics

For this blog, I have four topics in mind:
  • Speak about my work as a "bioinformatician"
  • Rotor
  • Transcribe my security lessons here
  • Random thoughts
I plan to start today with something about Rotor, the Microsoft Shared Source CLI (a good article on MSDN here).
I find Rotor one of the great things Microsoft has done in recent years (the others being the MSDN blogs and the shift towards security). For those who don't know what I am talking about, Rotor is the source code of .NET. It is actually mainly the production code of the CLR (the execution engine, Fusion, ...), of the C# compiler and of the JScript compiler. There is even a limited BCL. The only different parts (for copyright reasons, I suppose) are the Just In Time compiler and the Garbage Collector. And it compiles and runs on BSD and Mac OS X too!
I have found it invaluable for my work at the university, both on the security side and on the compiler side.

# Friday, November 04, 2005

Hello Blog World!

My name is Lorenzo, and I am an Italian developer. I am currently employed in a public institute for bioinformatics, in a small town not too far from where I live. At work, I focus on developing web applications for the management of laboratory data, as well as on techniques to annotate genes and gene products (i.e. predict and assign a function to them). This mainly involves algorithms on graphs and text mining.
My real interests, however, are Windows, .NET and programming languages.
Why? Well.. Windows was my first OS. I grew up with it, gathering more knowledge over the years. I really like exploring the internals of the NT kernel, the Win32 API, the COM(+) programming model, and now .NET.
I am really fond of C++ and its multi-paradigm nature, and I also like C# and ML very much.
I thought about writing a blog to share knowledge and experience with others, both on bioinformatics and on the other subjects. I'd love to know if anyone reads me. So please, comment! Thank you.


finally, my blog!