Coding for fun - Algorithms
http://www.dematte.org/
All the things I like about coding
Lorenzo Dematté · Wed, 26 Apr 2006 08:03:38 GMT
Did you know...

...that there are (more than) 21 algorithms to compute the factorial of a number?
I discovered this fact today, while implementing Fisher's exact test for a 2x2
contingency table.
(For those interested: Fisher's exact test (http://faculty.vassar.edu/lowry/webtext.html)
for a 2x2 contingency table, like the more famous chi-square test, is used for the
analysis of categorical frequency data: its aim is to determine whether two
categorical variables are associated.
We use it to extract the most significant genes from a gene set experiment: the two
categories are "clue present/clue absent", while the two data sets are the control
set and the experiment set.)
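To make the test concrete: the probability of a single 2x2 table under the null
hypothesis is hypergeometric, a ratio of nine factorials. A minimal Python sketch
(the function name is mine; Python's built-in integers have arbitrary precision,
so the computation is exact, at the cost discussed below):

```python
from math import factorial

def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2x2 table [[a, b], [c, d]].

    This is the quantity Fisher's test sums over: nine factorials,
    one of them on the grand total n = a + b + c + d.
    """
    n = a + b + c + d
    num = (factorial(a + b) * factorial(c + d)
           * factorial(a + c) * factorial(b + d))
    den = (factorial(n) * factorial(a) * factorial(b)
           * factorial(c) * factorial(d))
    return num / den
```

For Fisher's classic tea-tasting table [[3, 1], [1, 3]] this gives 16/70, about 0.229.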
Fisher's exact test requires the computation of 9 factorials for each table it
considers, one of them on the grand total. Typically, the applets and code you can
find on the internet (e.g. http://faculty.vassar.edu/lowry/VassarStats.html) can
handle data up to a total of about 100-200 before giving up due to overflow problems.
Even code that uses arbitrary-precision arithmetic libraries can only handle numbers
up to 6000-12000 before becoming unusable.

Here we are facing two problems:

- the numbers (integers and doubles) provided by the machine (physical CPU or VM)
  have finite precision and range;
- the simple, naive algorithm (the one that performs all the multiplications one
  by one) is very SLOW.
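The first problem is easy to reproduce; a quick Python check of where the fixed-width
machine types give up:

```python
import math

# 21! is the first factorial that no longer fits in a signed 64-bit integer.
print(math.factorial(20) <= 2**63 - 1)   # True
print(math.factorial(21) <= 2**63 - 1)   # False

# An IEEE double holds larger magnitudes but runs out soon after:
# 170! still fits (about 7.26e306), while 171! exceeds the maximum double.
print(float(math.factorial(170)))
# float(math.factorial(171)) would raise OverflowError
```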

The solution is to adopt a good library that handles multiplication of very large
numbers efficiently (using Karatsuba or FFT-based multiplication), and to forget
about the naive algorithm. At
http://www.luschny.de/math/factorial/FastFactorialFunctions.htm you can find
implementations and benchmarks for the 21 different algorithms I mentioned. The most
efficient ones use prime numbers (or, better, primorials, i.e. the product of the
first n primes), but even the simple and elegant Split Recursive algorithm performs
a lot better than the naive one!
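To give a flavour of why, here is a binary-splitting sketch in Python of the idea
behind Split Recursive (the underlying balanced-product trick, not Luschny's exact
formulation): instead of accumulating one huge running product, it always multiplies
operands of comparable size, which is exactly where subquadratic multiplication like
Karatsuba pays off.

```python
def product(lo, hi):
    """Product of all integers in [lo, hi], computed by halving the range."""
    if lo > hi:
        return 1
    if lo == hi:
        return lo
    mid = (lo + hi) // 2
    # Each half is a product of numbers of similar magnitude, so the
    # final multiplications combine operands of balanced size.
    return product(lo, mid) * product(mid + 1, hi)

def factorial_split(n):
    """n! via balanced binary splitting instead of a left-to-right loop."""
    return product(2, n)
```

On an interpreter whose big integers use Karatsuba multiplication (as CPython's do),
this should already outperform the naive left-to-right loop noticeably once n reaches
the tens of thousands.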