<h1>Generating integer partitions</h1>
<p>Jerome Kelleher, 2014-05-26</p>
<p class="first last">Efficient algorithms to generate all integer partitions in Python.</p>
<p>The purpose of this page is to give an informal presentation of the algorithms
I developed for my PhD thesis and subsequently turned into a research
<a class="reference external" href="http://arxiv.org/abs/0909.2331">article</a>. The gist of this
work is that we can generate integer partitions much more efficiently if we
encode them as ascending compositions rather than the conventional descending
compositions. As it turns out, ascending compositions are much easier to
generate. I'm not going to argue this point here, since it's something
I've done at great length elsewhere; instead, let's take a quick overview
of the main points and look at the algorithms themselves.</p>
<div class="section" id="ascending-compositions">
<h2>Ascending Compositions</h2>
<p>An integer partition
is an expression of a positive integer <em>n</em> as an
unordered collection of positive integers.
A composition, on the other hand, is an expression
of <em>n</em> as an ordered collection of positive integers.
For example,
1 + 1 + 2,
1 + 2 + 1
and
2 + 1 + 1
are three different compositions of 4, but they all represent the same partition
of 4. Ascending compositions are the compositions
of <em>n</em> in which the parts are in ascending (non-decreasing) order. For example,
the ascending compositions of 5 are:</p>
<pre class="literal-block">
1 + 1 + 1 + 1 + 1
1 + 1 + 1 + 2
1 + 1 + 3
1 + 2 + 2
1 + 4
2 + 3
5
</pre>
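<p>As a rough illustration of the definition (this is not one of the algorithms
from the article), the naive recursive sketch below enumerates ascending
compositions by requiring each part to be at least as large as the one before
it. The function name <em>ascending_compositions</em> and its <em>smallest</em>
parameter are made up for this example.</p>

```python
def ascending_compositions(n, smallest=1):
    """Yield every ascending composition of n whose parts are all >= smallest.

    Naive recursive sketch for illustration only; it makes no attempt
    to be efficient.
    """
    if n == 0:
        # Nothing left to distribute: the empty composition completes a result.
        yield []
        return
    for first in range(smallest, n + 1):
        # Every later part must be at least as large as the part just chosen.
        for rest in ascending_compositions(n - first, first):
            yield [first] + rest


for parts in ascending_compositions(5):
    print(" + ".join(str(part) for part in parts))
```

<p>For <em>n</em> = 5 this prints the seven compositions listed above, in the same order.</p>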
<p>Every partition corresponds to exactly one ascending composition, so generating
all ascending compositions is equivalent to generating all partitions. For
historical reasons, partition generation algorithms have nearly all generated
descending compositions (see
<a class="reference external" href="http://en.wikipedia.org/wiki/Partition_(number_theory)">Wikipedia</a>,
<a class="reference external" href="http://mathworld.wolfram.com/Partition.html">Mathworld</a> or
Ruskey's <a class="reference external" href="http://theory.cs.uvic.ca/inf/nump/NumPartition.html">Combinatorial Object Server</a>, for example).
There are definite advantages, however, to working with ascending compositions
instead.</p>
</div>
<div class="section" id="iterative-algorithm">
<h2>Iterative Algorithm</h2>
<p>Let's take a look at one algorithm to generate all ascending compositions.
This algorithm is written as a Python
<a class="reference external" href="http://www.python.org/dev/peps/pep-0255">generator</a>, which is a very neat way
of writing combinatorial generation algorithms.</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">rule_asc</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
    <span class="n">a</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)]</span>
    <span class="n">k</span> <span class="o">=</span> <span class="mi">1</span>
    <span class="n">a</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">n</span>
    <span class="k">while</span> <span class="n">k</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">x</span> <span class="o">=</span> <span class="n">a</span><span class="p">[</span><span class="n">k</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="mi">1</span>
        <span class="n">y</span> <span class="o">=</span> <span class="n">a</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="o">-</span> <span class="mi">1</span>
        <span class="n">k</span> <span class="o">-=</span> <span class="mi">1</span>
        <span class="c1"># Append parts equal to x while the remainder y allows it.</span>
        <span class="k">while</span> <span class="n">x</span> <span class="o">&lt;=</span> <span class="n">y</span><span class="p">:</span>
            <span class="n">a</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="n">x</span>
            <span class="n">y</span> <span class="o">-=</span> <span class="n">x</span>
            <span class="n">k</span> <span class="o">+=</span> <span class="mi">1</span>
        <span class="c1"># The last part absorbs whatever remains.</span>
        <span class="n">a</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
        <span class="k">yield</span> <span class="n">a</span><span class="p">[:</span><span class="n">k</span> <span class="o">+</span> <span class="mi">1</span><span class="p">]</span>
</pre></div>
<p>Although this algorithm is very simple, it is also very efficient. It is
<em>Constant Amortised Time</em>, which means that the average computation per
partition that is output is constant.</p>
<p>We can prove this fairly easily by looking at the two while loops and the
variable k. Since the <strong>yield</strong> statement is executed exactly once for every
iteration of the outer while loop, we know that it must iterate exactly <em>p(n)</em>
times (where <em>p(n)</em> is the number of partitions of <em>n</em>; see <a class="reference external" href="http://mathworld.wolfram.com/PartitionFunctionP.html">Mathworld</a> or Sloane's sequence
<a class="reference external" href="https://oeis.org/A000041">A000041</a> for details
and properties). Therefore, we know that there must be exactly <em>p(n)</em>
decrement operations on k (since k -= 1 occurs only in the outer loop).
Then, since k is initially 1 and the algorithm terminates when k is 0, we know
that there must be <em>p(n)</em> - 1 increment operations on k. Since the only
increment operations occur in the inner while loop, we know that the body of this
loop is executed exactly <em>p(n)</em> - 1 times in total, and so the total running time of the
algorithm is proportional to <em>p(n)</em>. In other words, the algorithm is
constant amortised time.</p>
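<p>The counting argument can also be checked empirically. The sketch below
copies the two loops of <em>rule_asc</em> but counts iterations instead of
yielding partitions; the function name <em>count_loop_iterations</em> and the
counters <em>outer</em> and <em>inner</em> are additions made for this
illustration.</p>

```python
def count_loop_iterations(n):
    """Run the rule_asc loops for n, counting iterations instead of yielding.

    Returns (outer, inner): the number of outer-loop iterations (one per
    partition) and the total number of inner-loop iterations over the run.
    """
    a = [0] * (n + 1)
    k = 1
    a[1] = n
    outer = inner = 0
    while k != 0:
        outer += 1  # one decrement of k, and one partition would be yielded
        x = a[k - 1] + 1
        y = a[k] - 1
        k -= 1
        while x <= y:
            inner += 1  # one increment of k
            a[k] = x
            y -= x
            k += 1
        a[k] = x + y
    return outer, inner


for n in (1, 5, 10):
    outer, inner = count_loop_iterations(n)
    print(n, outer, inner)
```

<p>For each <em>n</em> tried, the inner count should come out exactly one less than
the outer count, which is what the argument above predicts.</p>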
</div>
<div class="section" id="most-efficient-algorithm">
<h2>Most Efficient Algorithm</h2>
<p>If it's speed you're looking for, here is the most efficient known algorithm to
generate all partitions of a positive integer.</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">accel_asc</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
    <span class="n">a</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)]</span>
    <span class="n">k</span> <span class="o">=</span> <span class="mi">1</span>
    <span class="n">y</span> <span class="o">=</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">1</span>
    <span class="k">while</span> <span class="n">k</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">x</span> <span class="o">=</span> <span class="n">a</span><span class="p">[</span><span class="n">k</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="mi">1</span>
        <span class="n">k</span> <span class="o">-=</span> <span class="mi">1</span>
        <span class="c1"># Extend the composition with parts equal to x.</span>
        <span class="k">while</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">x</span> <span class="o">&lt;=</span> <span class="n">y</span><span class="p">:</span>
            <span class="n">a</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="n">x</span>
            <span class="n">y</span> <span class="o">-=</span> <span class="n">x</span>
            <span class="n">k</span> <span class="o">+=</span> <span class="mi">1</span>
        <span class="n">l</span> <span class="o">=</span> <span class="n">k</span> <span class="o">+</span> <span class="mi">1</span>
        <span class="c1"># Visit the run of compositions ending in x, y, shifting one unit</span>
        <span class="c1"># from the last part to the second-to-last part each time.</span>
        <span class="k">while</span> <span class="n">x</span> <span class="o">&lt;=</span> <span class="n">y</span><span class="p">:</span>
            <span class="n">a</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="n">x</span>
            <span class="n">a</span><span class="p">[</span><span class="n">l</span><span class="p">]</span> <span class="o">=</span> <span class="n">y</span>
            <span class="k">yield</span> <span class="n">a</span><span class="p">[:</span><span class="n">k</span> <span class="o">+</span> <span class="mi">2</span><span class="p">]</span>
            <span class="n">x</span> <span class="o">+=</span> <span class="mi">1</span>
            <span class="n">y</span> <span class="o">-=</span> <span class="mi">1</span>
        <span class="c1"># Visit the composition whose last part is x + y.</span>
        <span class="n">a</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
        <span class="n">y</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span> <span class="o">-</span> <span class="mi">1</span>
        <span class="k">yield</span> <span class="n">a</span><span class="p">[:</span><span class="n">k</span> <span class="o">+</span> <span class="mi">1</span><span class="p">]</span>
</pre></div>
<p>This algorithm is a modification of the algorithm above. It gains its
extra efficiency by exploiting structure in the set of ascending compositions
to make many transitions more efficient. Consider, for example, the following
sequence of partitions of 10:</p>
<pre class="literal-block">
1 + 1 + 2 + 6
1 + 1 + 3 + 5
1 + 1 + 4 + 4
</pre>
<p>These transitions can be made very efficiently, since all we need to do is
add one to the second-to-last part and subtract one from the last part. The algorithm
above takes advantage of this, and it is the most efficient known algorithm to
generate partitions (it has been
<a class="reference external" href="http://arxiv.org/abs/0909.2331">shown</a>
to be more efficient than Zoghbi and Stojmenovic's excellent
<a class="reference external" href="http://www.site.uottawa.ca/~ivan/F49-int-part.pdf">ZS1 algorithm</a>).</p>
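<p>To make this kind of transition concrete, here is a small standalone sketch
that isolates it. It is not part of <em>accel_asc</em> itself; the helper name
<em>run_from</em> is hypothetical, and it assumes its argument is an ascending
composition with at least two parts.</p>

```python
def run_from(parts):
    """Yield parts, then each successor obtained by moving one unit from the
    last part to the second-to-last part, for as long as the result is still
    in ascending order.

    Assumes parts is an ascending composition with at least two parts.
    """
    a = list(parts)
    yield list(a)
    # Mirrors the `while x <= y` test in accel_asc, with x and y being the
    # values the last two parts would take after the next shift.
    while a[-2] + 1 <= a[-1] - 1:
        a[-2] += 1
        a[-1] -= 1
        yield list(a)


for composition in run_from([1, 1, 2, 6]):
    print(" + ".join(str(part) for part in composition))
```

<p>Starting from 1 + 1 + 2 + 6, this prints the three partitions of 10 shown
above. Each step touches only the last two parts, however long the
composition is.</p>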
</div>