Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Commits on Source (61)
Showing with 1941 additions and 753 deletions
MONDRIAAN version 4.0 (this version released August 2013):
MONDRIAAN version 4.2.1 (this version released August 2019):
------------------------
Copyright May 2002 (version 1.0): Brendan Vastenhouw and Rob H. Bisseling
Copyright July 2008 (version 2.0): Rob H. Bisseling, Wouter Meesen, Tristan van Leeuwen,
......@@ -7,7 +7,12 @@ Copyright July 2010 (version 3.0): Rob H. Bisseling, Bas Fagginger Auer,
Albert-Jan Yzelman, Brendan Vastenhouw.
Prototype Matlab interface: Ken Stanley.
Copyright August 2013 (version 4.0): Rob H. Bisseling, Bas Fagginger Auer,
Albert-Jan Yzelman, Daniel Pelt.
Daniel Pelt, Albert-Jan Yzelman.
Copyright November 2016 (version 4.1): Rob H. Bisseling, Bas Fagginger Auer,
Marco van Oort, Daniel Pelt, Albert-Jan Yzelman.
Copyright September 2017 (version 4.2): Rob H. Bisseling, Bas Fagginger Auer,
Marco van Oort, Daniel Pelt, Albert-Jan Yzelman.
Copyright August 2019 (version 4.2.1): Rob H. Bisseling, Marco van Oort.
Extensive documentation is found in the `docs' directory.
......@@ -39,7 +44,12 @@ The Lambda-times-Lambda-minus-one communication volume metric:
The medium-grain partitioning strategy:
Daniel M. Pelt and Rob H. Bisseling
"A medium-grain method for fast 2D bipartitioning of sparse matrices",
submitted for publication (2013).
Proceedings IEEE International Parallel & Distributed Processing Symposium 2014,
IEEE Press, pp. 529-539.
The exact algorithm for obtaining an optimal partitioning of small matrices for 2 processors:
Daniel M. Pelt and Rob H. Bisseling
"An exact algorithm for sparse matrix bipartitioning",
Journal of Parallel and Distributed Computing, 85 (2015) pp. 79-90.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser Public License as published by
......
......@@ -4,13 +4,24 @@
<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
<link href="style.css" rel="stylesheet" type="text/css">
<link href="print.css" rel="stylesheet" type="text/css" media="print">
<title>Partitioning a hypergraph using Mondriaan</title>
</head>
<body>
<div id="mainContainer">
<div id="pageNav">
<div><a href="USERS_GUIDE.html">Mondriaan</a></div>
<div><a href="MATLAB.html">MATLAB</a></div>
<div><a href="USERS_GUIDE_OPT.html">MondriaanOpt</a></div>
</div>
<h2>Partitioning a hypergraph using Mondriaan</h2>
<hr>
<p>
......@@ -29,15 +40,15 @@ Download the latest version of
Mondriaan</a>. Uncompress with
</p>
<ul>
<li><tt>% tar xzvf mondriaan4.tar.gz</tt></li>
<li><code>% tar xzvf mondriaan4.tar.gz</code></li>
</ul>
<p>
This will create a directory <tt>Mondriaan4</tt>
This will create a directory <code>Mondriaan4</code>
which contains all the files of the Mondriaan package.
Run
</p>
<ul>
<li><tt>% make</tt></li>
<li><code>% make</code></li>
</ul>
<p>
which will build Mondriaan and the associated tools.
......@@ -53,9 +64,9 @@ for each vertex, and a row for each hyperedge or net.
</p>
<div class="image squareimage"><img src="hypergraph.gif" alt=""><div class="caption">Figure 1</div></div>
<p>
For example, we can consider the hypergraph <tt>G = (V, E)</tt> from Figure 1 with
vertices <tt>V = {1, 2, 3, 4, 5}</tt> and nets <tt>E = {{1}, {1, 2}, {2, 3, 4}, {3, 4}}</tt>
as a matrix <tt>A</tt> with 5 columns and 4 rows in Matrix Market format:
For example, we can consider the hypergraph <code>G = (V, E)</code> from Figure 1 with
vertices <code>V = {1, 2, 3, 4, 5}</code> and nets <code>E = {{1}, {1, 2}, {2, 3, 4}, {3, 4}}</code>
as a matrix <code>A</code> with 5 columns and 4 rows in Matrix Market format:
</p>
<pre>
%%MatrixMarket weightedmatrix coordinate pattern general
......@@ -75,8 +86,8 @@ as a matrix <tt>A</tt> with 5 columns and 4 rows in Matrix Market format:
1
</pre>
<p>
Here the values <tt>4 5 8 2</tt> indicate that this is a matrix with 4 rows (nets), 5 columns (vertices), 8 entries (total number of vertices in all nets), and weighted columns (the value 2 equals 10 in binary: weighted columns, unweighted rows).
Then after all nonzeroes, <tt>1 1, 2 1, ...</tt>, we find the vertex weights, which are set to 1 for all five vertices.
Here the values <code>4 5 8 2</code> indicate that this is a matrix with 4 rows (nets), 5 columns (vertices), 8 entries (total number of vertices in all nets), and weighted columns (the value 2 equals 10 in binary: weighted columns, unweighted rows).
Then after all nonzeroes, <code>1 1, 2 1, ...</code>, we find the vertex weights, which are set to 1 for all five vertices.
</p>
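<p>
The layout described above can be sketched in a few lines of Python. This is only an illustration: the function name <code>hypergraph_to_mtx</code> and its arguments are hypothetical and not part of Mondriaan, and the ordering of the nonzeroes is an assumption based on the example on this page (header line, size line ending in 2, one line per nonzero, then one weight per column).
</p>

```python
def hypergraph_to_mtx(num_vertices, nets, vertex_weights):
    """Return Matrix Market text with one row per net and one column per vertex.

    Hypothetical helper for illustration; vertices are numbered from 1.
    """
    if len(vertex_weights) != num_vertices:
        raise ValueError("need exactly one weight per vertex")
    # One nonzero (row, column) for every vertex contained in every net.
    entries = [(row, v) for row, net in enumerate(nets, start=1) for v in sorted(net)]
    lines = ["%%MatrixMarket weightedmatrix coordinate pattern general"]
    # rows (nets), columns (vertices), nonzeroes, 2 = weighted columns only
    lines.append(f"{len(nets)} {num_vertices} {len(entries)} 2")
    lines.extend(f"{row} {col}" for row, col in entries)
    # The column (vertex) weights follow after all nonzeroes.
    lines.extend(str(w) for w in vertex_weights)
    return "\n".join(lines) + "\n"


# The hypergraph of Figure 1, with all vertex weights set to 1.
text = hypergraph_to_mtx(5, [{1}, {1, 2}, {2, 3, 4}, {3, 4}], [1] * 5)
print(text, end="")
```

<p>
Writing the returned text to a file such as <code>foo.mtx</code> gives an input with explicit vertex weights, so that the default column weighting is not used.
</p>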
<p>
Providing the vertex weights is <b>essential</b>, because otherwise Mondriaan will by default weigh all the columns by the number of nonzeroes contained in them, which will lead to unbalanced hypergraph partitions.
......@@ -84,23 +95,23 @@ Providing the vertex weights is <b>essential</b>, because otherwise Mondriaan wi
<h3>Setting the proper Mondriaan options</h3>
<p>
Now that we have our hypergraph as a Matrix Market file, say <a href="hypergraph.mtx"><tt>foo.mtx</tt></a>, we can use Mondriaan to partition it.
First we go to the <tt>tools/</tt> directory.
Now that we have our hypergraph as a Matrix Market file, say <a href="hypergraph.mtx"><code>foo.mtx</code></a>, we can use Mondriaan to partition it.
First we go to the <code>tools/</code> directory.
</p>
<ul>
<li><tt>% cd tools</tt></li>
<li><code>% cd tools</code></li>
</ul>
<p>
Here the options file <tt>Mondriaan.defaults</tt> should set <tt>SplitStrategy</tt> to <tt>onedimcol</tt>, because we want Mondriaan to partition the matrix columns, which correspond to the hypergraph vertices.
Then we partition <tt>foo.mtx</tt> in two parts with a maximum imbalance of 10% by running
Here the options file <code>Mondriaan.defaults</code> should set <code>SplitStrategy</code> to <code>onedimcol</code>, because we want Mondriaan to partition the matrix columns, which correspond to the hypergraph vertices.
Then we partition <code>foo.mtx</code> in two parts with a maximum imbalance of 10% by running
</p>
<ul>
<li><tt>% ./Mondriaan foo.mtx 2 0.1</tt></li>
<li><code>% ./Mondriaan foo.mtx 2 0.1</code></li>
</ul>
<h3>Extracting the hypergraph partitioning from the matrix partitioning</h3>
<p>
After performing the matrix partitioning, the file <tt>foo.mtx-v2</tt> contains the vector distribution of the columns
After performing the matrix partitioning, the file <code>foo.mtx-v2</code> contains the vector distribution of the columns
</p>
<pre>
5 2
......@@ -111,10 +122,13 @@ After performing the matrix partitioning, the file <tt>foo.mtx-v2</tt> contains
5 1
</pre>
<p>
The first line <tt>5 2</tt> contains the number of columns (5) and the number of parts to which they have been assigned (2).
The first line <code>5 2</code> contains the number of columns (5) and the number of parts to which they have been assigned (2).
Following this line are the column indices and the parts to which the columns have been assigned.
Because the column indices correspond directly to the vertices of our hypergraph, we see that our hypergraph has been partitioned into two parts: <tt>{1, 2, 5}</tt> and <tt>{3, 4}</tt>, which was to be expected if you look at Figure 1.
Because the column indices correspond directly to the vertices of our hypergraph, we see that our hypergraph has been partitioned into two parts: <code>{1, 2, 5}</code> and <code>{3, 4}</code>, which was to be expected if you look at Figure 1.
</p>
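<p>
Reading such a vector distribution file back into a hypergraph partitioning can be sketched as follows. The function name <code>read_vector_distribution</code> is hypothetical, and the sample text is a reconstruction: only its first and last lines appear on this page, and the part labels 1 and 2 for the remaining lines are an assumption consistent with the partitioning stated above.
</p>

```python
from collections import defaultdict


def read_vector_distribution(text):
    """Parse 'n P' followed by 'index part' lines into {part: [indices]}.

    Hypothetical helper for illustration; it accepts whatever part labels
    appear in the file rather than assuming a numbering scheme.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    num_items, num_parts = int(rows[0][0]), int(rows[0][1])
    parts = defaultdict(list)
    for index, part in rows[1:]:
        parts[int(part)].append(int(index))
    if sum(len(members) for members in parts.values()) != num_items:
        raise ValueError("number of entries does not match the header")
    return num_parts, dict(parts)


# Reconstructed sample (assumed labels, see the note above).
sample = "5 2\n1 1\n2 1\n3 2\n4 2\n5 1\n"
num_parts, parts = read_vector_distribution(sample)
print(num_parts, parts)
```

<p>
Because the column indices correspond directly to the vertices, the lists returned per part are the vertex sets of the hypergraph partitioning.
</p>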
</div>
</body>
</html>
......
......@@ -4,13 +4,24 @@
<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
<link href="style.css" rel="stylesheet" type="text/css">
<link href="print.css" rel="stylesheet" type="text/css" media="print">
<title>Mondriaan and MATLAB</title>
</head>
<body>
<div id="mainContainer">
<div id="pageNav">
<div><a href="USERS_GUIDE.html">Mondriaan</a></div>
<div><a href="HYPERGRAPH.html">Hypergraphs</a></div>
<div><a href="USERS_GUIDE_OPT.html">MondriaanOpt</a></div>
</div>
<h2>Mondriaan and MATLAB</h2>
<hr>
<p>
......@@ -21,6 +32,12 @@ a look at the <a href="USERS_GUIDE.html">user's guide</a>.
</p>
<hr>
<h3>Known issues</h3>
<p>Unfortunately, the MATLAB interface for Mondriaan no longer works with the most recent
MATLAB versions. We do not plan to repair this in the near future.
However, volunteers for this task are always welcome!
</p>
<h3>How to download and install Mondriaan</h3>
<p>
Download the latest version from the
......@@ -28,46 +45,46 @@ Download the latest version from the
Mondriaan software homepage</a>. Uncompress with
</p>
<ul>
<li><tt>% tar xzvf mondriaan4.tar.gz</tt></li>
<li><code>% tar xzvf mondriaan4.tar.gz</code></li>
</ul>
<p>
This will create a directory <tt>Mondriaan4</tt>
This will create a directory <code>Mondriaan4</code>
which contains all the files of the Mondriaan package.
To enable MATLAB support, open the file <tt>Mondriaan4/mondriaan.mk</tt>
To enable MATLAB support, open the file <code>Mondriaan4/mondriaan.mk</code>
with a text-editor and look for a line which looks similar to
</p>
<ul>
<li><tt>#MATLABHOMEDIR := /usr/local/matlab</tt></li>
<li><code>#MATLABHOMEDIR := /usr/local/matlab</code></li>
</ul>
<p>
Change the directory on the right-hand side to your installation
directory of MATLAB and remove the <tt>#</tt> in front of the line,
directory of MATLAB and remove the <code>#</code> in front of the line,
such that it looks similar to
</p>
<ul>
<li><tt>MATLABHOMEDIR := /your/matlab/installation/directory</tt></li>
<li><code>MATLABHOMEDIR := /your/matlab/installation/directory</code></li>
</ul>
<p>
Furthermore make sure that the variable <tt>MEXSUFFIX</tt> is set to the proper
Furthermore make sure that the variable <code>MEXSUFFIX</code> is set to the proper
extension for MATLAB binary files for your system (from the Mathworks <a href="http://www.mathworks.nl/help/matlab/ref/mexext.html">site</a>):
</p>
<table border="1">
<tr><td><b>Platform</b></td><td><b><tt>MEXSUFFIX</tt></b></td></tr>
<tr><td>Linux (32-bit)</td><td><tt>mexglx</tt></td></tr>
<tr><td>Linux (64-bit)</td><td><tt>mexa64</tt></td></tr>
<tr><td>Apple Macintosh (32-bit)</td><td><tt>mexmaci</tt></td></tr>
<tr><td>Apple Macintosh (64-bit)</td><td><tt>mexmaci64</tt></td></tr>
<tr><td>Microsoft Windows (32-bit)</td><td><tt>mexw32</tt></td></tr>
<tr><td>Microsoft Windows (64-bit)</td><td><tt>mexw64</tt></td></tr>
<tr><td><b>Platform</b></td><td><b><code>MEXSUFFIX</code></b></td></tr>
<tr><td>Linux (32-bit)</td><td><code>mexglx</code></td></tr>
<tr><td>Linux (64-bit)</td><td><code>mexa64</code></td></tr>
<tr><td>Apple Macintosh (32-bit)</td><td><code>mexmaci</code></td></tr>
<tr><td>Apple Macintosh (64-bit)</td><td><code>mexmaci64</code></td></tr>
<tr><td>Microsoft Windows (32-bit)</td><td><code>mexw32</code></td></tr>
<tr><td>Microsoft Windows (64-bit)</td><td><code>mexw64</code></td></tr>
</table>
<p>
For example: on a 32-bit Macintosh system we would have <tt>MEXSUFFIX := mexmaci</tt>.
For example: on a 32-bit Macintosh system we would have <code>MEXSUFFIX := mexmaci</code>.
</p>
<p>
Now we are ready to compile Mondriaan. Run
</p>
<ul>
<li><tt>% make</tt></li>
<li><code>% make</code></li>
</ul>
<p>
which will build Mondriaan and the associated tools.
......@@ -80,35 +97,35 @@ MATLAB interface of Mondriaan.
</p>
<p>
As test matrix we can use <a href="http://www.staff.science.uu.nl/~bisse101/Matrices/tbdmatlab.mtx.gz">tbdmatlab.mtx.gz</a>
from the Mondriaan website. The archive should be extracted to the <tt>Mondriaan4/tools</tt> directory.
from the Mondriaan website. The archive should be extracted to the <code>Mondriaan4/tools</code> directory.
</p>
<p>
Start MATLAB and navigate to the <tt>Mondriaan4/tools</tt> directory in the <i>Current Directory</i>
Start MATLAB and navigate to the <code>Mondriaan4/tools</code> directory in the <i>Current Directory</i>
subwindow.
To read and view <tt>tbdmatlab.mtx</tt>, issue
To read and view <code>tbdmatlab.mtx</code>, issue
</p>
<ul>
<li><tt>A = mmread('tbdmatlab.mtx');</tt></li>
<li><tt>spy(A)</tt></li>
<li><code>A = mmread('tbdmatlab.mtx');</code></li>
<li><code>spy(A)</code></li>
</ul>
<p>
We can partition the matrix <tt>A</tt> among 30 processors with a maximum imbalance of 3% by using
the <tt>mondriaan</tt> function in MATLAB
We can partition the matrix <code>A</code> among 30 processors with a maximum imbalance of 3% by using
the <code>mondriaan</code> function in MATLAB
</p>
<ul>
<li><tt>[I, s] = mondriaan(A, 30, 0.03);</tt></li>
<li><code>[I, s] = mondriaan(A, 30, 0.03);</code></li>
</ul>
<p>
where <tt>I</tt> is the same matrix as <tt>A</tt>, only with the real values
where <code>I</code> is the same matrix as <code>A</code>, only with the real values
of all the matrix nonzeroes set to the index of the processor to which
the nonzero was assigned, and <tt>s</tt> contains partitioning information.
the nonzero was assigned, and <code>s</code> contains partitioning information.
Full output can be generated with
</p>
<ul>
<li><tt>[I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A, 30, 0.03, 2);</tt></li>
<li><code>[I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A, 30, 0.03, 2);</code></li>
</ul>
<p>
where the last parameter (<tt>2</tt>) is the desired permutation method (see below).
where the last parameter (<code>2</code>) is the desired permutation method (see below).
Here, p and q are permutation vectors, r and c are row-boundaries and column-boundaries
corresponding to the ordering's block structure, rh and ch store the separator hierarchy
information, the matrix B stores the reordered matrix PAQ (in MATLAB terminology:
......@@ -116,7 +133,7 @@ B=A(p,q)) and finally u and v contain the indices of the processors to which the
components are assigned (for parallel multiplication of <i>u = A*v</i>).
See the <a href="USERS_GUIDE.html">User's Guide</a> for full details on these output
vectors and matrices. For particulars on the boundary and hierarchy functions, jump to
the appriopiate section <a href="USERS_GUIDE.html#SBDoutput">here</a>.
the appropriate section <a href="USERS_GUIDE.html#SBDoutput">here</a>.
</p>
<table border="1">
<tr><td><b>Value</b><td><b>Ordering</b></td></tr>
......@@ -134,10 +151,10 @@ of the input matrix A. This is done by setting a fifth parameter, such that the
call becomes:
</p>
<ul>
<li><tt>[I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A, 30, 0.03, 2, symm);</tt></li>
<li><code>[I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A, 30, 0.03, 2, symm);</code></li>
</ul>
<p>where symm is 0 by default (if the parameter is not given), and indicates A is
not symmetric. If <tt>symm</tt> takes a value 1 or 2, A is assumed symmetric
not symmetric. If <code>symm</code> takes a value 1 or 2, A is assumed symmetric
and <em>only the lower triangular part of A is passed through to Mondriaan</em>. This
is exactly the same as using the regular (terminal-based) Mondriaan application on a
symmetric matrix with the options SymmetricMatrix_UseSingleEntry set to <b>yes</b> and
......@@ -148,14 +165,14 @@ taken as usual from the Mondriaan.defaults file. Recommended is to use the fineg
or symmetric finegrain strategies. Others will work, but may not minimise the
communication volume during parallel sparse matrix-vector multiplication when
considering the full matrix A.</p>
<p>Setting <tt>symm</tt> to 2 will indicate the matrix is structurally symmetric,
<p>Setting <code>symm</code> to 2 will indicate the matrix is structurally symmetric,
but as said before, still only the lower triangular part of A is passed through to
Mondriaan. This makes no difference for any of the output parameters, except for B,
which would, for <tt>symm</tt>=1, return an incorrect full matrix PAQ as the full
reordered matrix is inferred only from the lower triangular part. Setting <tt>symm</tt>
which would, for <code>symm</code>=1, return an incorrect full matrix PAQ as the full
reordered matrix is inferred only from the lower triangular part. Setting <code>symm</code>
to 2 prevents this by automatically postprocessing B by rebuilding PAQ using the
output parameters p and q.</p>
<p>Note that setting <tt>symm</tt> equal to 1 or 2 yields symmetric permutations
<p>Note that setting <code>symm</code> equal to 1 or 2 yields symmetric permutations
(B=PAP<sup>T</sup>). Also note that it is not checked whether the input matrix really
is symmetric, and as such unsymmetric matrices can also be passed through this method.
This probably does not yield any meaningful results.</p>
......@@ -164,7 +181,7 @@ This probably does not yield any meaningful results.</p>
<p>We present two small examples of using Matlab in conjunction with Mondriaan; the
first will be on speeding up the sequential sparse matrix-vector multiply, the
second will illustrate speeding up the sequential sparse LU decomposition.
Assumed is that the working directory is <tt>Mondriaan4/tools</tt>. Also available
Assumed is that the working directory is <code>Mondriaan4/tools</code>. Also available
should be:
<ul>
<li>the tbdlinux matrix (available through Rob Bisseling's
......@@ -181,14 +198,14 @@ system jitter.
[<a href="#cite1">1</a>],
[<a href="#cite2">2</a>]):</h5>
<p>
<tt>
<code>
&gt;&gt; A=mmread('tbdlinux.mtx');<br>
&gt;&gt; [I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A,50,0.1,2);<br>
&gt;&gt; x=rand(size(A,2),1);<br>
&gt;&gt; tic, for i=1:1000 A*x; end, toc<br>
Elapsed time is 24.707203 seconds.<br>
&gt;&gt; tic, z=x(q), for i=1:1000 B*z; end, toc<br>
Elapsed time is 19.786526 seconds.</tt>
Elapsed time is 19.786526 seconds.</code>
</p>
<p>
Using Mondriaan to transform the tbdlinux matrix into SBD form thus yields a modest 20 percent
......@@ -203,7 +220,7 @@ for details.</p>
[<a href="#cite3">3</a>],
[<a href="#cite4">4</a>]):</h5>
<p>
<tt>
<code>
&gt;&gt; A=mmread('west0497.mtx');<br>
&gt;&gt; [I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A,10,0.1,3);<br>
&gt;&gt; tic, for i=1:1000 [L,U,lu_P] = lu(A); end, toc<br>
......@@ -222,52 +239,69 @@ ans =<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;4647<br>
<br>
</tt>
</code>
</p>
<p>Here the use of Mondriaan with BBD ordering lets the stock MATLAB
LU algorithm run almost a factor of 2 faster, and reduces the fill-in
by almost a factor of 3. Note that this is not the UMFPACK version of
the LU algorithm, which employs its own reordering techniques
(amongst others); see <tt>help lu</tt> within MATLAB.</p>
(amongst others); see <code>help lu</code> within MATLAB.</p>
<h3>Visualisation</h3>
<p>
We can also directly visualise the partitioning process by using <tt>mondriaanplot</tt>
We can also directly visualise the partitioning process by using <code>mondriaanplot</code>
in the following fashion:
</p>
<ul>
<li><tt>mondriaanplot(A, 30, 0.03, 2);</tt></li>
<li><code>mondriaanplot(A, 30, 0.03, 2);</code></li>
</ul>
<p>
This concludes this small tutorial.
More information is available through issueing <tt>help mondriaan</tt> from within MATLAB.
More information is available through issuing <code>help mondriaan</code> from within MATLAB.
</p>
<h3>MondriaanOpt</h3>
<p>
Besides Mondriaan itself, MondriaanOpt is also available in MATLAB through the MatlabMondriaanOpt MEX routine.
Example MATLAB functions are given in mondriaanOpt.m and mondriaanOptPlot.m.
The interface of mondriaanOpt is as follows:
</p>
<ul>
<li><code>[I, s] = mondriaanOpt(A, Imbalance, Volume)</code></li>
</ul>
<p>
Here, <code>A</code> is the sparse matrix to be partitioned, <code>Imbalance</code> is the maximum allowed load imbalance,
<code>Volume</code> is the initial upper bound on the volume, <code>I</code> contains the partitioning information and
<code>s</code> contains statistics about the run.
For more information, type <code>help mondriaanOpt</code> or <code>help mondriaanOptPlot</code> in MATLAB.
</p>
<h3>References</h3>
<p>
[<a name="cite1" href="http://www.staff.science.uu.nl/~bisse101/Mondriaan/yzelman09.pdf">1</a>]
[<a id="cite1" href="http://www.staff.science.uu.nl/~bisse101/Mondriaan/yzelman09.pdf">1</a>]
<em>Cache-oblivious sparse matrix-vector multiplication by using sparse matrix partitioning methods</em>,
A. N. Yzelman and Rob H. Bisseling, SIAM Journal on Scientific Computing, Vol. 31, Issue 4, pp. 3128-3154 (2009).<br>
[<a name="cite2" href="http://www.sciencedirect.com/science/article/pii/S0167819111001062">2</a>]
[<a id="cite2" href="http://www.sciencedirect.com/science/article/pii/S0167819111001062">2</a>]
<em>Two-dimensional cache-oblivious sparse matrix-vector multiplication</em>,
A. N. Yzelman and Rob H. Bisseling, Parallel Computing, Vol. 37, Issue 12, pp. 806-819 (2011).<br>
[<a name="cite3" href="http://www.cerfacs.fr/files/cerfacs_algo/conferences/PastWorkshops/CSC05/11_Catalyurek_Aykanat.pdf">3</a>]
[<a id="cite3" href="http://www.cerfacs.fr/files/cerfacs_algo/conferences/PastWorkshops/CSC05/11_Catalyurek_Aykanat.pdf">3</a>]
<em>Hypergraph-partitioning-based sparse matrix ordering</em>,
&Uuml;mit V. &Ccedil;ataly&uuml;rek and C. Aykanat, Second International Workshop on Combinatorial Scientific Computing, CERFACS, 2005.<br>
[<a name="cite4" href="http://www.sandia.gov/~egboman/papers/HUND.pdf">4</a>]
[<a id="cite4" href="http://www.sandia.gov/~egboman/papers/HUND.pdf">4</a>]
<em>Hypergraph-based Unsymmetric Nested Dissection Ordering for Sparse LU Factorization</em>,
L. Grigori, E. G. Boman, S. Donfack, and T. A. Davis, SIAM Journal on Scientific Computing,
Vol. 32, Issue 6, pp. 3426-3446 (2010).<br>
</p>
<hr>
<p>
Last updated: 21st of August, 2013.<br><br>
Last updated: August 8, 2019.<br><br>
July 27, 2010 by Bas Fagginger Auer,<br>
December 10, 2010 by A. N. Yzelman,<br>
March 27, 2012 by Bas Fagginger Auer,<br>
August 29, 2013 by Rob Bisseling and Bas Fagginger Auer.<br><br>
August 29, 2013 by Rob Bisseling and Bas Fagginger Auer,<br>
November 3, 2016 by Marco van Oort,<br>
September 7, 2017 by Marco van Oort.<br><br>
To <a href="http://www.staff.science.uu.nl/~bisse101/Mondriaan">
Home page Mondriaan package</a>.</p>
......@@ -281,6 +315,8 @@ Home page Mondriaan package</a>.</p>
</p>
<hr>
</div>
</body>
</html>
......
......@@ -4,24 +4,40 @@
<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
<link href="style.css" rel="stylesheet" type="text/css">
<link href="print.css" rel="stylesheet" type="text/css" media="print">
<script type="text/javascript" src="script.js"></script>
<title>User's guide MondriaanOpt</title>
</head>
<body>
<div id="mainContainer">
<div id="pageNav">
<div><a href="USERS_GUIDE.html">Mondriaan</a></div>
<div><a href="MATLAB.html">MATLAB</a></div>
<div><a href="HYPERGRAPH.html">Hypergraphs</a></div>
</div>
<h2>User's guide MondriaanOpt</h2>
<div id="top">
<div><a href="#inst">Installing</a></div>
<div><a href="#outp">Output</a></div>
<div><a href="#opts">Options</a></div>
<div><a href="#matl">MATLAB</a></div>
<div id="menuPos"></div>
<div id="menu">
<div id="menuItems">
<div><a href="#inst">Installing</a></div>
<div><a href="#outp">Output</a></div>
<div><a href="#opts">Options</a></div>
<div><a href="#constraint">Imbalance constraint</a></div>
<div><a href="#matl">MATLAB</a></div>
</div>
</div>
<hr>
<p>
<p class="updateNote">
This page is continuously being improved and updated;
therefore, a more recent version may be obtained
<a href="http://www.staff.science.uu.nl/~bisse101/Mondriaan/Docs/USERS_GUIDE_OPT.html">
......@@ -30,39 +46,57 @@ This offline version is bundled with the software for your convenience.
</p>
<hr>
<h3><a name="inst">How to install MondriaanOpt</a></h3>
<p>
Whereas Mondriaan uses heuristics to obtain good partitionings for sparse matrix-vector multiplication for any number of processors,
MondriaanOpt calculates an optimal solution to this partitioning problem with 2 processors. More precisely, it
calculates a partitioning with minimum communication volume among all solutions that obey the imbalance constraint.
</p>
<p>
A database of problems that have already been solved with MondriaanOpt can be found <a href="http://www.staff.science.uu.nl/~bisse101/Mondriaan/Opt/">online</a>.
</p>
<h3><a class="anchor" id="inst">How to install MondriaanOpt</a></h3>
<p>
MondriaanOpt comes packaged with the Mondriaan software. Refer to <a href="./USERS_GUIDE.html">this page</a> for
instructions on using Mondriaan. MondriaanOpt is automatically compiled when you compile Mondriaan. The executable
is then available at <tt>tools/MondriaanOpt</tt>.
is then available at <code>tools/MondriaanOpt</code>.
</p>
<h3><a class="anchor" id="run">How to run MondriaanOpt</a></h3>
<p>
Whereas Mondriaan uses heuristics to obtain good partitionings for sparse matrix-vector multiplication for any number of processors,
MondriaanOpt will calculate an actual optimal solution for this partitioning problem with 2 processors. More precisely, it will
calculate a partitioning with minimum volume among all solutions that obey the imbalance constraint.
The MondriaanOpt program has the following interface:
</p>
<ul><li><code>% ./tools/MondriaanOpt matrix [P [eps]] [options]</code></li></ul>
<p>
One, two or three parameters may be passed, after which further options may be given.
Either [eps], -e or -k must be passed, and it is advised to pass -v (see <a href="#opts">options</a>).
Take note that while MondriaanOpt may be called with the same parameters as Mondriaan, the actual problem
being solved may be <a href="#constraint">slightly different</a>.
</p>
<h3><a name="run">How to run MondriaanOpt</a></h3>
<p>
Go inside the directory <tt>Mondriaan4</tt> and type
Some equivalent examples are:
</p>
<ul>
<li><tt>% cd tools</tt></li>
<li><tt>% ./MondriaanOpt -m ../tests/arc130.mtx -e 0.03 -v 20</tt></li>
<li><code>% ./tools/MondriaanOpt tests/arc130.mtx 2 0.03 -v 17</code></li>
<li><code>% ./tools/MondriaanOpt tests/arc130.mtx -e 0.03 -v 17</code></li>
<li><code>% ./tools/MondriaanOpt tests/arc130.mtx -k 660 -v 17</code></li>
</ul>
<p>
if you want to partition the <tt>arc130.mtx</tt> matrix (Matrix Market file format)
The above examples partition the <code>arc130.mtx</code> matrix (Matrix Market file format)
for 2 processors with at most 3% load imbalance, knowing that solutions must exist with
volume at most 20. The matrix should be the full relative path; <em>in the above example
output is saved in the Mondriaan tests folder</em> (<tt>../tests/</tt>).
volume at most 17. The matrix should be given as a relative path to the matrix file; <em>in the above examples
output is saved in the Mondriaan tests folder</em> (<code>tests/</code>).
</p>
<h3><a name="outp">Output</a></h3>
<h3><a class="anchor" id="outp">Output</a></h3>
<p>The <tt>MondriaanOpt</tt> tool yields, after a successful run on an input matrix,
<p>The <code>MondriaanOpt</code> tool yields, after a successful run on an input matrix,
various output files. All possible output files are described below. Typically,
the output filenames are that of the input matrix filename, modified with a small
descriptor and the number of parts <i>x(=2)</i>.</p>
descriptor and the number of parts <i>(=2)</i>.</p>
<h4><u>Formats with free nonzeros</u></h4>
<p><i>
......@@ -71,139 +105,219 @@ All free nonzeros will be assigned index 3.
(Free nonzeros are nonzeros that are not assigned to a processor because assigning
them to either one will not influence the communication volume.)
To stress the potential presence of free nonzeros, the number of processors (2) in the
filename is followed by a suffix <tt>f</tt>
filename is followed by a suffix <code>f</code>.
</i></p>
<h4>Processor indices (<tt>-I2f</tt>)</h4>
<p> The <tt>MondriaanOpt</tt> program writes the processor indices of each
nonzero to the Matrix Market file <tt>input-2f.mtx</tt> where the value of each
<div class="indent4">
<h4>Processor indices (<code>-I2f</code>)</h4>
<p> The <code>MondriaanOpt</code> program writes the processor indices of each
nonzero to the Matrix Market file <code>input-2f.mtx</code> where the value of each
nonzero is replaced by the processor index to which the nonzero has been assigned.
</p>
</div>
<h4>Graphical output (<tt>-2f.svg</tt>)</h4>
<p>Besides textual output, also an SVG graphic is written to the file <tt>input-2f.svg</tt>,
<div class="indent4">
<h4>Graphical output (<code>-2f.svg</code>)</h4>
<p>If the option <code>-svg</code> is given, at the end of the algorithm an SVG graphic is written to the file <code>input-2f.svg</code>,
containing a visualisation of the partitioning.
</p>
</div>
<h4><u>Formats without free nonzeros</u></h4>
<p><i>
Here, the free nonzeros of a partitioning are distributed among the two processors in
such a way that load imbalance is kept at a minimum.
Note that whenever we write <code>P</code> for the number of processors below, it implicitly equals 2.
</i></p>
<h4>Distributed matrix (<tt>-Px</tt>)</h4>
<p> The <tt>MondriaanOpt</tt> program
writes the distributed matrix to a file called <tt>input-Px</tt>,
where <tt>input</tt> is the name of the input matrix, where x equals 3 when the
distribution includes free nonzeros, and x equals 2 otherwise.
<div class="indent4">
<h4>Distributed matrix (<code>-P2</code>)</h4>
<p> The <code>MondriaanOpt</code> program
writes the distributed matrix to a file called <code>input-P2</code>,
where <code>input</code> is the name of the input matrix.
We use an adapted Matrix Market format, with this structure:
<br>
<tt>%%MatrixMarket distributed-matrix coordinate real general<br>
<code>%%MatrixMarket distributed-matrix coordinate real general<br>
m n nnz P<br>
Pstart[0]</tt> ( this should be 0 )<br>
Pstart[0]</code> ( this should be 0 )<br>
...<br>
...<br>
...<br>
<tt>Pstart[P]</tt>( this should be nnz )<br>
<tt>A.i[0] A.j[0] A.value[0]</tt>
<code>Pstart[P]</code>( this should be nnz )<br>
<code>A.i[0] A.j[0] A.value[0]</code>
...<br>
...<br>
...<br>
<tt>A.i[nnz-1] A.j[nnz-1] A.value[nnz-1]</tt>
<code>A.i[nnz-1] A.j[nnz-1] A.value[nnz-1]</code>
<br>
Here, <tt>Pstart[k]</tt> points to the start of the nonzeroes
Here, <code>Pstart[k]</code> points to the start of the nonzeroes
of processor k.
</p>
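<p>
Splitting the nonzeroes of such a file per processor can be sketched as follows, using <code>Pstart</code> exactly as described above. The function name <code>read_distributed_matrix</code> and the small sample matrix are hypothetical; the sketch assumes that the file contains no comment lines other than the header.
</p>

```python
def read_distributed_matrix(text):
    """Return a list with, for each processor, its (i, j, value) triples.

    Hypothetical helper for illustration of the adapted Matrix Market format.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines[0].startswith("%%MatrixMarket distributed-matrix"):
        raise ValueError("not a distributed-matrix file")
    m, n, nnz, P = (int(tok) for tok in lines[1].split())
    # Pstart has P+1 entries: Pstart[0] = 0, ..., Pstart[P] = nnz.
    pstart = [int(lines[2 + k]) for k in range(P + 1)]
    if pstart[0] != 0 or pstart[P] != nnz:
        raise ValueError("Pstart must run from 0 to nnz")
    entries = []
    for line in lines[3 + P : 3 + P + nnz]:
        i, j, value = line.split()
        entries.append((int(i), int(j), float(value)))
    # Processor k owns the nonzeroes Pstart[k], ..., Pstart[k+1]-1.
    return [entries[pstart[k] : pstart[k + 1]] for k in range(P)]


# Hypothetical 2 x 2 matrix with 3 nonzeroes distributed over 2 processors.
sample = """%%MatrixMarket distributed-matrix coordinate real general
2 2 3 2
0
2
3
1 1 4.0
2 1 5.0
2 2 6.0
"""
blocks = read_distributed_matrix(sample)
print(blocks)
```

<p>
The same slicing by <code>Pstart</code> applies to the processor index file, since its nonzeroes are stored in the same order.
</p>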
</div>
<h4>Processor indices (<tt>-Ix</tt>)</h4>
<p> The <tt>MondriaanOpt</tt> program
also writes the processor indices of each nonzero to the Matrix Market file <tt>input-Ix</tt>
<div class="indent4">
<h4>Processor indices (<code>-I2</code>)</h4>
<p> The <code>MondriaanOpt</code> program
also writes the processor indices of each nonzero to the Matrix Market file <code>input-I2</code>
where the value of each nonzero is replaced by the processor index to which
the nonzero has been assigned. The order of the nonzeroes is exactly that of the distributed matrix (<tt>-Px</tt>).
the nonzero has been assigned. The order of the nonzeroes is exactly that of the distributed matrix (<code>-P2</code>).
</p>
</div>
<h4>Cartesian submatrices (<tt>-Cx</tt>)</h4>
<div class="indent4">
<h4>Cartesian submatrices (<code>-C2</code>)</h4>
<p> The program writes the row index sets I(q)
and column index sets J(q) of the Cartesian submatrix I(q) x J(q)
for the processors q=1,...,P to the file called <tt>input-Cx</tt>.
for the processors q=1,...,P to the file called <code>input-C2</code>.
This file is additional information, useful e.g. for visualisation,
and you may not need it.
</p>
</div>
<div class="indent4">
<h4>Graphical output (<code>-2.svg</code>)</h4>
<p>If the option <code>-svg</code> is given, at the end of the algorithm an SVG graphic is written to the file <code>input-2.svg</code>,
containing a visualisation of the partitioning.
</p>
</div>
<h3><a name="opts">Program options</a></h3>
<h4><u>Output to <code>stdout</code>/<code>stderr</code></u></h4>
<p>
The MondriaanOpt options can be passed in the command line.
An overview of the options is given below.
In a successful run, at the end of execution general statistics are written to <code>stdout</code>.
Also, during such a run, every <code>2^23 = 8388608</code> iterations the current depth in the tree is written to <code>stderr</code>, in the format <code>`current depth`/`maximum depth`</code>.
Last but not least, every time a new solution is found which improves on the previous solution regarding total volume, a message is written to <code>stderr</code> reporting the newly found volume and load distribution in the format <code>`load P0`, `load P1`, `load Free`</code>.
</p>
<h3><a class="anchor" id="opts">Program options</a></h3>
<p>
The MondriaanOpt program has the following interface:
</p>
<ul><li><code>% ./tools/MondriaanOpt matrix [P [eps]] [options]</code></li></ul>
<p>
One, two or three parameters may be passed, after which further options may be given.
An overview of the available parameters and options is given below.
</p>
<div class="indent4">
<h4><u>Parameters</u></h4>
<table>
<thead>
<tr>
<th>Option</th>
<th>Value</th>
<th>Parameter</th>
<th>Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>-m</td>
<td><code>matrix</code></td>
<td>Matrix file</td>
<td><i>Required.</i> The input matrix file in Matrix Market (.mtx) format</td>
</tr>
<tr>
<td><code>P</code></td>
<td>Number of processors</td>
<td>Present for consistency with other Mondriaan* commands. This parameter, if given, must be equal to 2.</td>
</tr>
<tr>
<td><code>eps</code></td>
<td>Load imbalance</td>
<td>The maximum allowed load imbalance</td>
</tr>
</tbody>
</table>
</div>
<div class="indent4">
<h4><u>Options</u></h4>
<p>
Apart from the matrix, at least one of [eps], -e or -k must be given, defining the maximum allowed load imbalance.
</p>
<table>
<thead>
<tr>
<th>Option</th>
<th>Value</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>-v</td>
<td>Volume</td>
<td><i>Required.</i> The starting upper bound volume</td>
<td>
<i>Recommended.</i> The starting upper bound volume.
This defaults to <code>min(m,n)+1</code>, with <code>m</code> and <code>n</code> denoting the dimensions of the matrix to be partitioned.
While this is a valid upper bound, you may wish to pass a tighter upper bound to reduce computing time.
</td>
</tr>
<tr>
<td>-e</td>
<td>Load imbalance</td>
<td><i>Required if -k is not passed.</i> The allowed load imbalance</td>
<td>The maximum allowed load imbalance</td>
</tr>
<tr>
<td>-k</td>
<td>Number of nonzeros</td>
<td><i>Required if -e is not passed.</i> The maximum allowed number of nonzeros per part</td>
<td>The maximum allowed number of nonzeros per part</td>
</tr>
<tr>
<td>-t</td>
<td>Seconds</td>
<td>Max running time in seconds</td>
</tr>
<tr>
<td>-r</td>
<td>Dumpfile</td>
<td>Resume with given dumpfile</td>
</tr>
<tr>
<td>-h</td>
<td><i>None</i></td>
<td>Show help</td>
</tr>
<tr>
<td>-svg</td>
<td><i>None</i></td>
<td>Write visualisations of the partitioning to <code>.svg</code> files</td>
</tr>
</tbody>
</table>
</div>
<h3><a class="anchor" id="constraint">Difference in imbalance constraints</a></h3>
<p>
While the command line interface of MondriaanOpt can be used just like that of Mondriaan, the two programs solve subtly different problems.
More precisely, with <code>N</code> being the total number of nonzeros, <code>p</code> being the total number of processors (which equals 2) and
<code>load</code> the number of nonzeros assigned to a processor, compare:
</p>
<ul>
<li>the imbalance constraint <code>load &lt;= (1+epsilon) (N/p)</code>, which is used in Mondriaan (in the code, this amounts to <code>floor( ((1+epsilon)*N)/p )</code>), and</li>
<li>the imbalance constraint <code>load &lt;= (1+epsilon) ceil(N/p)</code> which is used in MondriaanOpt.</li>
</ul>
<p>
As <code>p=2</code>, this difference matters only when <code>N</code> is odd.
In [<a href="#cite1">1,p.2</a>] it is explained that this different choice was made to ensure feasibility of the problem, even if <code>epsilon=0</code>.
</p>
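<p>
The two constraints can be compared with a minimal sketch. The function names below are hypothetical and are not part of the Mondriaan sources. Rounding the MondriaanOpt bound down to an integer load is an assumption made for illustration.
</p>

```c
#include <assert.h>

/* Mondriaan bound: floor(((1+eps)*N)/p), as in the expression quoted above.
 * The cast truncates, which equals floor() for non-negative values. */
long mondriaan_max_load(long N, int p, double eps) {
    assert(N >= 0 && p > 0 && eps >= 0.0);
    return (long)(((1.0 + eps) * (double)N) / p);
}

/* MondriaanOpt bound: (1+eps)*ceil(N/p).
 * Assumption: the result is rounded down to an integer number of nonzeros. */
long mondriaanopt_max_load(long N, int p, double eps) {
    long ceil_avg;
    assert(N >= 0 && p > 0 && eps >= 0.0);
    ceil_avg = (N + p - 1) / p; /* integer ceil(N/p) */
    return (long)((1.0 + eps) * (double)ceil_avg);
}

/* A bound is feasible if p parts of at most max_load nonzeros can hold all N. */
int is_feasible(long N, int p, long max_load) {
    return (long)p * max_load >= N;
}
```

<p>
With <code>N=7</code>, <code>p=2</code> and <code>epsilon=0</code>, the first bound is 3 and cannot accommodate all 7 nonzeros, while the second bound is 4 and can. For even <code>N</code> the two bounds coincide when <code>epsilon=0</code>.
</p>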
<h3><a name="matl">Using MondriaanOpt in MATLAB</a></h3>
<h3><a class="anchor" id="matl">Using MondriaanOpt in MATLAB</a></h3>
<p>
For more information about MATLAB usage, please see the <a href="MATLAB.html">Mondriaan MATLAB guide</a>.
MondriaanOpt is available in Matlab using the <tt>MatlabMondriaanOpt</tt> MEX routine. Example matlab files
are given as <tt>mondriaanOpt.m</tt> and <tt>mondriaanOptPlot.m</tt>.
MondriaanOpt is available in MATLAB using the <code>MatlabMondriaanOpt</code> MEX routine. Example MATLAB files
are given as <code>mondriaanOpt.m</code> and <code>mondriaanOptPlot.m</code>.
</p>
<h3>References</h3>
<p>
[<a name="cite1" href="http://doi.org/10.1016/j.jpdc.2015.06.005">1</a>]
[<a id="cite1" href="http://doi.org/10.1016/j.jpdc.2015.06.005">1</a>]
<em>An exact algorithm for sparse matrix bipartitioning</em>,
Daniel M. Pelt and Rob H. Bisseling, <i>Journal of Parallel and Distributed Computing</i>, <b>85</b> (2015) pp. 79-90.
</p>
<hr>
<p>
Last updated: October 3, 2016.<br><br>
October 3, 2016 by Marco van Oort.<br><br>
Last updated: September 7, 2017.<br><br>
November 4, 2016 by Marco van Oort,<br>
September 7, 2017 by Marco van Oort.<br><br>
To <a href="http://www.staff.science.uu.nl/~bisse101/Mondriaan">
the Mondriaan package home page</a>.</p>
......@@ -218,6 +332,8 @@ the Mondriaan package home page</a>.</p>
<hr>
</div>
</body>
</html>
......
body {
text-align: left;
background-color: #DDD;
color: black;
}
#pageNav {
display: none;
}
#menu {
position: relative;
height: 40px;
width: 100%;
text-align: center;
}
#menu A {
color: black;
}
div#menu div {
display: inline;
position: relative;
margin-right: 17px;
}
.center {
width:100%;
text-align:center;
}
.image {
float:left;
padding:20px;
}
.squareimage img {
width:256px;
height:256px;
padding:10px;
border-style:solid;
border-width:2px;
}
.wideimage img {
width:572px;
height:207px;
padding:10px;
border-style:solid;
border-width:2px;
}
.image .caption {
font-size:small;
}
var menuPos = 0, isFixed = false;
window.addEventListener('load', function(evt) {
// Check whether we have a menuPos element
var menuPosEl = document.getElementById('menuPos');
/* getElementById() returns null (not undefined) when the element is missing */
if (!menuPosEl) {
return;
}
// Get position of menu
menuPos = menuPosEl.getBoundingClientRect().top - document.getElementsByTagName('html')[0].getBoundingClientRect().top;
// Fix menu when we scroll under it
document.addEventListener('scroll', function(evt) {
var isDown = document.body.scrollTop > menuPos || document.documentElement.scrollTop > menuPos;
if (isDown && !isFixed) {
isFixed = true;
var menu = document.getElementById('menu');
document.getElementById('menuPos').style.height = menu.offsetHeight+"px";
menu.className = "fixed";
}
else if(!isDown && isFixed) {
isFixed = false;
document.getElementById('menuPos').style.height = "0px";
document.getElementById('menu').className = "";
}
});
});
* {
margin: 0px;
padding: 0px;
border: 0px;
}
body {
text-align: left;
background-color: #DDD;
color: black;
color: #333;
font-family: Verdana;
font-size: 10pt;
}
h2 {
margin-bottom: 0.5em;
}
h3 {
border-bottom: 1px solid #888;
margin-top: 30px;
}
h4 {
margin-top: 1em;
}
li {
margin-left: 45px;
}
#top {
p {
margin: 1em 0px;
}
div#mainContainer {
width: 80%;
margin: 10px auto;
background-color: #fff;
border-radius: 20px;
padding: 20px;
position: relative;
}
#menu {
position: relative;
height: 40px;
width: 100%;
background-color: #f8f8f8;
border: 1px solid #999;
border-width: 1px 0px;
padding: 8px 0px;
}
#menuItems {
max-width: 800px;
text-align: center;
margin: 0px auto;
height: 16px;
overflow: hidden;
}
#menuItems:hover {
height: auto;
overflow: auto;
}
#menu.fixed {
width: 80%;
position: fixed;
top: 0px;
left: 0px;
right: 0px;
margin: 0px auto;
}
#top A {
#menu A {
color: black;
}
div#top div {
div#menuItems div {
display: inline;
position: relative;
margin-right: 17px;
}
/* Make sure anchor links arrive where they should (+2em due to fixed menu) */
a.anchor:before { content: ''; display: block; position: relative; width: 0; height: 2em; margin-top: -2em }
#pageNav {
float: right;
text-align: right;
}
#pageNav a {
color: black;
}
div#pageNav div {
display: inline;
position: relative;
margin-right: 17px;
}
p.updateNote {
border: 1px solid #ccc;
font-style: italic;
padding: 5px;
margin: 10px 5px;
background-color: #f8f8f8;
}
.center {
width:100%;
text-align:center;
......@@ -34,7 +119,7 @@ div#top div {
.squareimage img {
width:256px;
height:256px;
max-width: 100%;
padding:10px;
border-style:solid;
border-width:2px;
......@@ -42,7 +127,7 @@ div#top div {
.wideimage img {
width:572px;
height:207px;
max-width: 100%;
padding:10px;
border-style:solid;
border-width:2px;
......@@ -52,3 +137,32 @@ div#top div {
font-size:small;
}
div.indent4 {
border-left: 1px solid #888;
padding-left: 10px;
margin-top: 15px;
margin-bottom: 25px;
}
ul.options li {
padding-bottom: 10px;
}
code.option_name {
font-style: italic;
}
@media (max-width:600px) {
div#mainContainer {
width: auto;
margin: 0px;
border-radius: 0px;
padding: 10px;
}
#menu.fixed {
width: auto;
margin: 0px 10px;
}
}
......@@ -8,9 +8,12 @@
# Absolute path of the directory which contains this file (included last in every Makefile in the subdirectories).
MONDRIAANHOMEDIR:= $(abspath $(dir $(lastword $(MAKEFILE_LIST))))
MONDRIAANCURRENTVERSION := 4.0
MONDRIAANCURRENTVERSION := 4.2.1
MONDRIAANMAJORVERSION := 4
# Gainbucket type must be either LIST or ARRAY
GAINBUCKET_TYPE := ARRAY
# ==== Matlab support ====
#
# To enable Matlab support, please uncomment the following line and insert the correct (global) Matlab home directory and the correct suffix to Matlab MEX files for your architecture.
......@@ -29,10 +32,12 @@ MEXSUFFIX := mexa64
# ==== Compiler flags ====
# Debug/verbose, standard (default), performance flags.
#CFLAGS := -Wall -Wextra -Wshadow -Wno-unused-parameter -ansi -pedantic -O2 -DTIME -DUNIX -DINFO -g -DMONDRIAANVERSION=\"${MONDRIAANCURRENTVERSION}\"
#CFLAGS := -Wall -O2 -DMONDRIAANVERSION=\"${MONDRIAANCURRENTVERSION}\"
CFLAGS := -Wall -O3 -ffast-math -funroll-loops -fomit-frame-pointer -std=c99 -DMONDRIAANVERSION=\"${MONDRIAANCURRENTVERSION}\"
#CFLAGS := -Wall -O3 -pg -ffast-math -funroll-loops -std=c99 -DMONDRIAANVERSION=\"${MONDRIAANCURRENTVERSION}\"
CFLAGS := -DMONDRIAANVERSION=\"${MONDRIAANCURRENTVERSION}\" -DGAINBUCKET_${GAINBUCKET_TYPE}
#CFLAGS := ${CFLAGS} -Wall -Wextra -Wshadow -Wno-unused-parameter -ansi -pedantic -O2 -DTIME -DUNIX -DINFO -g
#CFLAGS := ${CFLAGS} -Wall -O2
CFLAGS := ${CFLAGS} -Wall -O3 -ffast-math -funroll-loops -fomit-frame-pointer -std=c99
#CFLAGS := ${CFLAGS} -Wall -O3 -pg -ffast-math -funroll-loops -std=c99
# ==== Standard compilation options (it should not be necessary to change these) ====
......
......@@ -16,6 +16,9 @@ int SplitMatrixKLFM(struct sparsematrix *pT, int k, int i, int dir,
int SplitMatrixSimple(struct sparsematrix *pT, int k, int i,
long weightlo, long weighthi, const struct opts *pOptions);
int SplitMatrixZeroVolume(struct sparsematrix *pT, int k, int i,
long weightlo, long weighthi, const struct opts *pOptions);
#ifdef USE_PATOH
struct patohnz {
int P;
......@@ -407,6 +410,17 @@ int DistributeMatrixMondriaan(struct sparsematrix *pT, int P, double eps, const
/* Setup Mondriaan options. */
maxweight = ((1 + eps) * totweight) / P; /* rounded down */
#ifdef INFO
if(ceil(totweight/(double)P) > maxweight) {
/* Compute minimum epsilon, rounded up at 5th decimal place */
double eps_min = ceil(totweight/(double)P) * (P/(double)totweight) - 1;
eps_min = ceil(eps_min * 100000)/100000;
fprintf(stderr, "Info: The posed problem is infeasible, hence the resulting matrix distribution will not satisfy the balance constraint.\n");
fprintf(stderr, " For the problem to be feasible, epsilon should be at least %.5lf.\n", eps_min);
}
#endif
if (pOptions->SplitStrategy == OneDimRow)
dir = ROW;
......@@ -484,7 +498,13 @@ int DistributeMatrixMondriaan(struct sparsematrix *pT, int P, double eps, const
#ifdef INFO2
printf(" ******** Split part %d from %d parts******** \n", i, k);
#endif
if (pOptions->SplitMethod == Simple) {
int foundZVS = FALSE; /* Whether a zero volume split has been found */
if (pOptions->ZeroVolumeSearch == ZeroVolYes && (foundZVS = SplitMatrixZeroVolume(pT, k, i, weightlo, weighthi, pOptions))) {
#ifdef INFO
printf("Found zero volume split!\n");
#endif
}
else if (pOptions->SplitMethod == Simple) {
/* Simple split of the matrix only based on load balance,
not minimising communication volume.
Useful for testing and debugging */
......@@ -503,6 +523,43 @@ int DistributeMatrixMondriaan(struct sparsematrix *pT, int P, double eps, const
fprintf(stderr, "DistributeMatrixMondriaan(): Unknown SplitMethod!\n");
return FALSE;
}
/* Shift weight and procs */
for (j = k; j > i+1; j--) {
weight[j] = weight[j-1];
procs[j] = procs[j-1];
}
k++; /* new number of parts */
weight[i] = ComputeWeight(pT, pT->Pstart[i], pT->Pstart[i+1]-1, NULL, pOptions);
weight[i+1] = ComputeWeight(pT, pT->Pstart[i+1], pT->Pstart[i+2]-1, NULL, pOptions);
if (weight[i] < 0 || weight[i + 1] < 0) {
fprintf(stderr, "DistributeMatrixMondriaan(): Unable to compute weights!\n");
return FALSE;
}
procslo = procs[i]/2;
procshi = (procs[i]%2==0 ? procslo : procslo+1);
if (weight[i] <= weight[i+1]) {
procs[i] = procslo;
procs[i+1] = procshi;
} else {
procs[i] = procshi;
procs[i+1] = procslo;
}
/* Apply free nonzero search if enabled, but only for (symmetric) finegrain and mediumgrain strategies */
if(pOptions->ImproveFreeNonzeros == FreeNonzerosYes && !foundZVS && (pOptions->SplitStrategy == FineGrain ||
pOptions->SplitStrategy == SFineGrain || pOptions->SplitStrategy == MediumGrain)) {
ImproveFreeNonzeros(pT, pOptions, procs, i, i+1);
weight[i] = ComputeWeight(pT, pT->Pstart[i], pT->Pstart[i+1]-1, NULL, pOptions);
weight[i+1] = ComputeWeight(pT, pT->Pstart[i+1], pT->Pstart[i+2]-1, NULL, pOptions);
}
#ifdef INFO2
printf(" Pstart[%d] = %ld ", i, pT->Pstart[i]);
printf("Pstart[%d] = %ld ", i+1, pT->Pstart[i+1]);
......@@ -584,34 +641,6 @@ int DistributeMatrixMondriaan(struct sparsematrix *pT, int P, double eps, const
}
}
}
/* Shift weight and procs */
for (j = k; j > i+1; j--) {
weight[j] = weight[j-1];
procs[j] = procs[j-1];
}
k++; /* new number of parts */
weight[i] = ComputeWeight(pT, pT->Pstart[i], pT->Pstart[i+1]-1, NULL, pOptions);
weight[i+1] = ComputeWeight(pT, pT->Pstart[i+1], pT->Pstart[i+2]-1, NULL, pOptions);
if (weight[i] < 0 || weight[i + 1] < 0) {
fprintf(stderr, "DistributeMatrixMondriaan(): Unable to compute weights!\n");
return FALSE;
}
procslo = procs[i]/2;
procshi = (procs[i]%2==0 ? procslo : procslo+1);
if (weight[i] <= weight[i+1]) {
procs[i] = procslo;
procs[i+1] = procshi;
} else {
procs[i] = procshi;
procs[i+1] = procslo;
}
/* Check if there is a part that is too large */
done = TRUE;
......@@ -663,6 +692,62 @@ int DistributeMatrixMondriaan(struct sparsematrix *pT, int P, double eps, const
printf(" Column lambda histogram:\n");
VerifyLambdas(pT->ColLambda, pT->n, P);
#endif
if(pOptions->CheckUpperBound == CheckUpperBoundYes) {
/* Check whether we can apply the upper bound algorithm */
int isSymmetric = ((pT->MMTypeCode[3]=='S' || pT->MMTypeCode[3]=='K' || pT->MMTypeCode[3]=='H') &&
pOptions->SymmetricMatrix_UseSingleEntry == SingleEntYes);
int hasDummies = (pT->NrDummies > 0);
int isColWeighted = (pT->MMTypeCode[0] == 'W' && pT->NrColWeights > 0);
/* To support OrderPermutation, new code should be written to recompute the permutation from scratch */
int orderPermute = (pOptions->OrderPermutation != OrderNone);
/* Do not apply the upperBound check when classic Mondriaan is used, as this algorithm may cut in two directions */
int canApply = (pOptions->SplitStrategy == FineGrain || pOptions->SplitStrategy == SFineGrain || pOptions->SplitStrategy == MediumGrain);
if(!isSymmetric && !hasDummies && !isColWeighted && !orderPermute && canApply) {
/* Compute volume. It should be at most (min(m,n)+1)(P-1) */
long ComVol1, ComVol2, tmp;
CalcCom(pT, NULL, (pT->m < pT->n)?ROW:COL, &ComVol1, &tmp, &tmp, &tmp, &tmp);
CalcCom(pT, NULL, (pT->m < pT->n)?COL:ROW, &ComVol2, &tmp, &tmp, &tmp, &tmp);
long upperBound = (((pT->m < pT->n)?pT->m:pT->n)+1)*(P-1);
if(ComVol1+ComVol2 > upperBound) {
#ifdef INFO
printf("Info: Achieved volume %ld is larger than upper bound %ld. Attempting to generate upper bound solution.\n", ComVol1+ComVol2, upperBound);
#endif
if (!SplitMatrixUpperBound(pT, P, pOptions)) {
fprintf(stderr, "DistributeMatrixMondriaan(): Unable to compute upper bound solution!\n");
}
else {
/* Update variables to reflect new distribution */
k = P;
for (j = 0; j < P; j++) {
weight[j] = ComputeWeight(pT, pT->Pstart[j], pT->Pstart[j+1]-1, NULL, pOptions);
procs[j] = 1;
}
#ifdef INFO2
printf(" Number of parts = %d \n", k);
printf(" Pstart = ");
for (j = 0; j <= P; j++)
printf("%ld ", pT->Pstart[j]);
printf("\n\n");
#endif
#ifdef INFO2
/* Print all lambdas. */
printf(" Row lambda histogram:\n");
VerifyLambdas(pT->RowLambda, pT->m, P);
printf(" Column lambda histogram:\n");
VerifyLambdas(pT->ColLambda, pT->n, P);
#endif
}
}
}
}
/* Set matrix type code to distributed */
pT->MMTypeCode[0] = 'D';
......@@ -1505,3 +1590,111 @@ int SplitMatrixSimple(struct sparsematrix *pT, int k, int i,
return TRUE;
} /* end SplitMatrixSimple */
int SplitMatrixZeroVolume(struct sparsematrix *pT, int k, int i,
long weightlo, long weighthi, const struct opts *pOptions) {
/* This function splits part i of the sparse matrix T into two parts,
the first with weight <= weightlo and the second with weight <= weighthi.
The split is performed by searching for a split with zero communication
volume. If such a split is found, it is applied to pT. Otherwise, pT is
left untouched.
Input: T sparse matrix,
k current number of parts, 1 <= k < P,
i number of part to be split, 0 <= i < k,
weightlo = smallest upper bound for part weight, belongs to part 0
weighthi = largest upper bound for part weight, belongs to part 1.
Output: T sparse matrix.
The following applies if a zero volume split is found:
The nonzeros of the new part i (the first part) are in positions
pT->Pstart[i], pT->Pstart[i+1]-1.
The nonzeros of the new part i+1 (the second) are in positions
pT->Pstart[i+1], pT->Pstart[i+2]-1.
All parts > i+1 have been shifted.
*/
long lo, hi, mid = 0, nz, weight;
int j;
struct sparsematrix A;
if (!pT || !pOptions) {
fprintf(stderr, "SplitMatrixZeroVolume(): Null arguments!\n");
return FALSE;
}
lo = pT->Pstart[i];
hi = pT->Pstart[i+1]-1;
nz = hi-lo+1;
if (nz > 0) {
weight = ComputeWeight(pT, lo, hi, NULL, pOptions);
if (weight > weightlo + weighthi || weight < 0) {
/* Desired split is infeasible */
fprintf(stderr, "SplitMatrixZeroVolume(): desired split is infeasible!\n");
return FALSE;
}
/* Copy info from T to A */
A = *pT; /* A has same size as T, and same other parameters, */
A.i = &(pT->i[lo]); /* but only a subset of the nonzeros */
A.j = &(pT->j[lo]);
if (A.MMTypeCode[2] != 'P')
A.ReValue = &(pT->ReValue[lo]);
if (A.MMTypeCode[2] == 'C')
A.ImValue = &(pT->ImValue[lo]);
A.NrNzElts = nz;
#ifdef INFO
#ifdef TIME
clock_t starttime, endtime;
double cputime;
starttime = clock();
#ifdef UNIX
struct timeval starttime1, endtime1;
gettimeofday(&starttime1, NULL);
#endif
#endif
#endif
/* Run zero volume search. If a zero volume split is found, ZeroVolumeSearch()
* will apply this split directly; we then only need to update Pstart.
*/
int foundZeroVolumePartition = ZeroVolumeSearch(&A, weightlo, weighthi, &mid, pOptions);
#ifdef INFO
#ifdef TIME
endtime = clock();
cputime = ((double) (endtime - starttime)) / CLOCKS_PER_SEC;
printf(" ZeroVolumeSearch CPU-time : %f seconds\n", cputime);
#ifdef UNIX
gettimeofday(&endtime1, NULL);
printf(" ZeroVolumeSearch elapsed time: %f seconds\n",
(endtime1.tv_sec - starttime1.tv_sec) +
(endtime1.tv_usec - starttime1.tv_usec) / 1000000.0);
#endif
fflush(stdout);
#endif
#endif
if(!foundZeroVolumePartition) {
return FALSE;
}
}
else {
mid = 0; /* Pstart[i] = Pstart[i+1] */
}
/* Shift Pstart for parts > i */
for (j = k; j > i; j--)
pT->Pstart[j+1] = pT->Pstart[j];
/* Register new splitting point.
* mid lies in [0,hi-lo], translate it to [lo,hi] */
pT->Pstart[i+1] = lo + mid;
return TRUE;
} /* end SplitMatrixZeroVolume */
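
The Pstart bookkeeping at the end of SplitMatrixZeroVolume() can be illustrated in isolation. The following is a minimal sketch, not code from the package: the helper name split_part and its return value are hypothetical, and it assumes the array has room for one extra entry.

```c
#include <assert.h>

/* Split part i of k parts at offset mid, counted from the start of part i.
 * Pstart must have room for k+2 entries. Returns the new number of parts. */
int split_part(long *Pstart, int k, int i, long mid) {
    int j;

    assert(Pstart != 0 && k >= 1 && i >= 0 && i < k);
    assert(mid >= 0 && mid <= Pstart[i+1] - Pstart[i]);

    /* Shift the start positions of all parts > i one slot to the right */
    for (j = k; j > i; j--)
        Pstart[j+1] = Pstart[j];

    /* Register the new splitting point, translated from [0,size] to absolute */
    Pstart[i+1] = Pstart[i] + mid;

    return k + 1;
}
```

For two parts with Pstart = {0, 4, 10}, splitting part 0 at offset 3 gives Pstart = {0, 3, 4, 10}: the old part 1 keeps its nonzeros but is now part 2.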
......@@ -7,6 +7,9 @@
#include "Sort.h"
#include "SparseMatrix.h"
#include "Remembrance.h"
#include "ZeroVolumeSearch.h"
#include "FreeNonzeros.h"
#include "SplitMatrixUpperBound.h"
/* Function declarations for DistributeMat.c */
long ComputeWeight(const struct sparsematrix *pT, long lo, long hi, long *wnz, const struct opts *pOptions);
......
......@@ -923,3 +923,75 @@ int WriteVectorCollection(long int **X, const char *name, const long i, const lo
return TRUE;
} /* end WriteVectorCollection */
long * ReadVector(const char base, long *l, int *P, FILE *fp) {
/* Base vector-reading function. Automatically adapts base of array to read.
Assumes that we are reading a distribution vector with P>0.
base is the base value of X (usually 0).
l is the length of the vector read (X).
P is the number of processors.
The return value is the array of vector values (X), or NULL if an error occurs.
*/
long t;
if (!fp) {
fprintf(stderr, "ReadVector(): Null arguments!\n");
return NULL;
}
if (ferror(fp)) {
fprintf(stderr, "ReadVector(): Unable to read from stream!\n");
return NULL;
}
int SizeRead=FALSE, count=0;
char *line, linebuffer[MAX_LINE_LENGTH];
while (SizeRead == FALSE &&
(line = fgets(linebuffer, MAX_LINE_LENGTH, fp)) != NULL) {
/* a new line has been read */
if (strlen(line) > 0) {
if (linebuffer[0] != '%') {
/* The size line */
/* Read l and P from the line */
count = sscanf(line, "%ld %d\n", l, P);
if (count < 2 || *l < 1 || *P < 0) {
fprintf(stderr, "ReadVector(): Error in vector size!\n");
return NULL;
}
else {
SizeRead = TRUE;
}
}
}
}
if(*P == 0) {
fprintf(stderr, "ReadVector(): This function has not been implemented for undistributed vectors!\n");
return NULL;
}
long *X = (long *)malloc((*l)*sizeof(long));
long tmp;
if (X == NULL) {
fprintf(stderr, "ReadVector(): Not enough memory!\n");
return NULL;
}
for (t = 0; t < *l; t++) {
if(fscanf(fp, "%ld %ld\n", &tmp, &(X[t])) != 2) {
fprintf(stderr, "ReadVector(): Read Error!\n");
free(X); /* do not leak the partially filled vector */
return NULL;
}
X[t] -= (1-base);
}
return X;
} /* end ReadVector */
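
The two small steps inside ReadVector(), parsing the size line and converting the stored values to the requested base, can be sketched separately. The helper names below are hypothetical. Unlike ReadVector(), this sketch folds the P == 0 rejection into the size check, and it assumes the file stores values with base 1, which is what the adjustment `X[t] -= (1-base)` implies.

```c
#include <assert.h>
#include <stdio.h>

/* Parse a size line of the form "l P". Returns 1 on success, 0 otherwise.
 * Comment lines (starting with '%') and undistributed vectors (P == 0)
 * are rejected. */
int parse_size_line(const char *line, long *l, int *P) {
    assert(line != NULL && l != NULL && P != NULL);
    if (line[0] == '%')
        return 0;
    if (sscanf(line, "%ld %d", l, P) < 2)
        return 0;
    return *l >= 1 && *P >= 1;
}

/* Convert a stored (1-based) value to the requested base (usually 0). */
long to_base(long stored, char base) {
    return stored - (1 - base);
}
```

With base 0, a stored processor number 1 becomes 0. With base 1, stored values are returned unchanged.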
......@@ -33,5 +33,7 @@ void PrintVecStatistics(int P, long *Ns, long *Nr, long *Nv);
int WriteVector(const long int *X, const char base, const char *name, long l, int P, FILE *fp, const struct opts *pOptions);
int WriteVectorDistribution(const long int *X, const char *name, long l, int P, FILE *fp, const struct opts *pOptions);
int WriteVectorCollection(long int **X, const char *name, const long i, const long *j, FILE *fp);
long * ReadVector(const char base, long *l, int *P, FILE *fp);
#endif /* __DistributeVecLib_h__ */
#include "FreeNonzeros.h"
/**
* Swap nonzeros s and t of matrix pM
*/
void SwapNonzero(struct sparsematrix *pM, long s, long t) {
SwapLong(pM->i, s, t);
SwapLong(pM->j, s, t);
if(pM->MMTypeCode[2] != 'P')
SwapDouble(pM->ReValue, s, t);
if(pM->MMTypeCode[2] == 'C')
SwapDouble(pM->ImValue, s, t);
} /* end SwapNonzero */
/**
* Improve load balance by moving free nonzeros between two specified processors.
* This function works for general and symmetric matrices.
* It works when dummy nonzeros are present: dummies will not be moved as they do not contribute weight.
* It does not work when weights are based on column weights.
*
* Input:
* pOptions Options struct
* procs abs(procs[i]) = Number of processors each part should still be distributed over
* p1, p2 Processors to consider
*
* Input/output:
* pM The matrix
*
* Return value: FALSE if an error occurred, TRUE otherwise
*/
int ImproveFreeNonzeros(struct sparsematrix *pM, const struct opts *pOptions, const int *procs, int p1, int p2) {
long nnz1 = ComputeWeight(pM, pM->Pstart[p1], pM->Pstart[p1+1]-1, NULL, pOptions);
long nnz2 = ComputeWeight(pM, pM->Pstart[p2], pM->Pstart[p2+1]-1, NULL, pOptions);
long i, j, t, nzWeight;
int symmetric = pM->m == pM->n &&
(pM->MMTypeCode[3]=='S' || pM->MMTypeCode[3]=='K' || pM->MMTypeCode[3]=='H') &&
pOptions->SymmetricMatrix_UseSingleEntry == SingleEntYes;
int dummies = pM->m == pM->n && pM->NrDummies > 0;
if(nnz1/abs(procs[p1]) > nnz2/abs(procs[p2])) {
long tmp = p1;
p1 = p2;
p2 = tmp;
tmp = nnz1;
nnz1 = nnz2;
nnz2 = tmp;
}
/* Now p2 is relatively larger than p1 */
if((nnz1+1)/(double)abs(procs[p1]) >= (nnz2-1)/(double)abs(procs[p2])) {
return TRUE;
}
char *rowsP1 = (char *)malloc(pM->m * sizeof(char));
char *colsP1 = (char *)malloc(pM->n * sizeof(char));
if(rowsP1 == NULL || colsP1 == NULL) {
fprintf(stderr, "ImproveFreeNonzeros(): Not enough memory!\n");
if(rowsP1 != NULL)
free(rowsP1);
if(colsP1 != NULL)
free(colsP1);
return FALSE;
}
for(i=0; i<pM->m; ++i) {
rowsP1[i] = 0;
}
for(j=0; j<pM->n; ++j) {
colsP1[j] = 0;
}
/* In what columns/rows does P(1) have nonzeros? */
for(t=pM->Pstart[p1]; t<pM->Pstart[p1+1]; ++t) {
rowsP1[pM->i[t]] = 1;
colsP1[pM->j[t]] = 1;
if(symmetric) {
rowsP1[pM->j[t]] = 1;
colsP1[pM->i[t]] = 1;
}
}
for(t=pM->Pstart[p2]; t<pM->Pstart[p2+1]; ++t) {
if(rowsP1[pM->i[t]] == 1 && colsP1[pM->j[t]] == 1) {
/* This nonzero is free. As p2 is relatively large, move it */
if(dummies && pM->i[t] == pM->j[t] && pM->dummy[pM->i[t]]) {
continue;
}
nzWeight = (symmetric && pM->i[t] != pM->j[t])?2:1;
if((nnz1+nzWeight)/(double)abs(procs[p1]) > (nnz2-nzWeight)/(double)abs(procs[p2])) {
free(rowsP1);
free(colsP1);
return TRUE;
}
/* Move the nonzero */
if(p2 > p1) {
SwapNonzero(pM, t, pM->Pstart[p2]);
++pM->Pstart[p2];
}
else {
SwapNonzero(pM, t, pM->Pstart[p2+1]-1);
--pM->Pstart[p2+1];
}
/* Update bookkeeping */
nnz1 += nzWeight;
nnz2 -= nzWeight;
}
}
/* Finish */
free(rowsP1);
free(colsP1);
return TRUE;
} /* end ImproveFreeNonzeros */
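
The notion of a free nonzero used above can be shown on plain index arrays. This is a minimal sketch under simplifying assumptions (general matrix, no dummies, unit weights). The function name and the array-based signature are hypothetical and do not exist in the package. A nonzero of part p2 counts as free when both its row and its column already contain a nonzero of part p1.

```c
#include <assert.h>
#include <stdlib.h>

/* Count the nonzeros of part p2 whose row and column both already
 * contain a nonzero of part p1. I and J hold the row and column indices,
 * Pstart the start positions of the parts, and the matrix is m x n.
 * Returns -1 if memory allocation fails. */
long count_free_nonzeros(const long *I, const long *J, const long *Pstart,
                         long m, long n, int p1, int p2) {
    long t, count = 0;
    char *rowsP1, *colsP1;

    assert(I != NULL && J != NULL && Pstart != NULL && m > 0 && n > 0);

    /* calloc zero-initialises the marker arrays */
    rowsP1 = calloc((size_t)m, sizeof(char));
    colsP1 = calloc((size_t)n, sizeof(char));
    if (rowsP1 == NULL || colsP1 == NULL) {
        free(rowsP1);
        free(colsP1);
        return -1;
    }

    /* Mark the rows and columns in which p1 has nonzeros */
    for (t = Pstart[p1]; t < Pstart[p1+1]; ++t) {
        rowsP1[I[t]] = 1;
        colsP1[J[t]] = 1;
    }

    /* Moving such a nonzero of p2 to p1 cuts no additional row or column */
    for (t = Pstart[p2]; t < Pstart[p2+1]; ++t)
        if (rowsP1[I[t]] && colsP1[J[t]])
            ++count;

    free(rowsP1);
    free(colsP1);
    return count;
}
```

For a 3 x 3 matrix where part 0 owns (0,0) and (1,1) and part 1 owns (0,1), (2,2) and (1,0), the nonzeros (0,1) and (1,0) are free with respect to part 0, while (2,2) is not.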
#ifndef __FreeNonzeros_h__
#define __FreeNonzeros_h__
#include "Sort.h"
#include "SparseMatrix.h"
#include "DistributeMat.h"
struct freeIndex {
long t;
long numProcs;
};
void SwapNonzero(struct sparsematrix *pM, long s, long t);
int ImproveFreeNonzeros(struct sparsematrix *pM, const struct opts *pOptions, const int *procs, int p1, int p2);
#endif /* __FreeNonzeros_h__ */
#include "GainBucket.h"
struct bucketentry *BucketInsert(struct gainbucket *pGB, long key, long VtxNr) {
/* This function inserts vertex number VtxNr in the gainbucket GB,
in the appropriate bucket with key value. The vertex must not be
present already. A bucketentry representing VtxNr is allocated
and a pointer to this bucketentry is returned. */
struct bucket *pB, **ppB;
struct bucketentry *pE;
if (!pGB) {
fprintf(stderr, "BucketInsert(): Null argument!\n");
return NULL;
}
ppB = &(pGB->Root);
pB = *ppB;
/*## While the key is smaller than the value in the buckets,
go through the bucket list: ##*/
while (*ppB != NULL) {
pB = *ppB; /* pB points to current bucket */
if (key < pB->value)
ppB = &(pB->next); /* points to next bucket */
else
break;
}
if (pB == NULL) {
/*## Bucket list is empty. Create first bucket: ##*/
*ppB = (struct bucket *) malloc(sizeof(struct bucket));
if (*ppB == NULL) {
fprintf(stderr, "BucketInsert(): Not enough memory for first bucket!\n");
return NULL;
}
(*ppB)->value = key;
(*ppB)->entry = NULL;
(*ppB)->prev = (*ppB)->next = NULL;
pGB->NrBuckets++;
} else if (key < pB->value) {
/*## Create new bucket at the end of the current list: ##*/
/* *ppB == NULL */
*ppB = (struct bucket *) malloc(sizeof(struct bucket));
if (*ppB == NULL) {
fprintf(stderr, "BucketInsert(): Not enough memory for new bucket!\n");
return NULL;
}
(*ppB)->value = key;
(*ppB)->entry = NULL;
(*ppB)->prev = pB;
(*ppB)->next = NULL;
pGB->NrBuckets++;
} else if (key > pB->value) {
/*## Create new bucket between the previous and the current
bucket in the list: ##*/
*ppB = (struct bucket *) malloc(sizeof(struct bucket));
if (*ppB == NULL) {
fprintf(stderr, "BucketInsert(): Not enough memory for new bucket!\n");
return NULL;
}
(*ppB)->value = key;
(*ppB)->entry = NULL;
(*ppB)->prev = pB->prev;
(*ppB)->next = pB;
if (pB->prev != NULL)
(pB->prev)->next = *ppB;
else
pGB->Root = *ppB;
pB->prev = *ppB;
pGB->NrBuckets++;
}
/* otherwise key == pB->value (and hence pB == *ppB) */
/*## Insert the vertex number into a new bucketentry at
the start of the bucket: ##*/
pE = (struct bucketentry *) malloc(sizeof(struct bucketentry));
if (pE == NULL) {
fprintf(stderr, "BucketInsert(): Not enough memory for bucket entry!\n");
return NULL;
}
pE->vtxnr = VtxNr;
pE->bucket = *ppB;
pE->prev = NULL;
pE->next = (*ppB)->entry;
if (pE->next != NULL)
(pE->next)->prev = pE;
(*ppB)->entry = pE;
return pE;
} /* end BucketInsert */
struct bucketentry *BucketMove(struct gainbucket *pGB, struct bucketentry *pE,
long key) {
/* This function moves the vertex number represented by bucketentry pE
in the gainbucket GB to the appropriate bucket with key value.
The key value must differ from the current value.
A pointer to the new bucketentry is returned.
The memory space of the original bucketentry is freed.
If its bucket has become empty, the space of the bucket is also freed. */
long VtxNr;
struct bucket *pB;
if (!pGB || !pE) {
fprintf(stderr, "BucketMove(): Null arguments!\n");
return NULL;
}
VtxNr = pE->vtxnr;
pB = pE->bucket;
if (pB->value == key) {
fprintf(stderr, "BucketMove(): Destination bucket equals source!\n");
return NULL;
}
/*## Remove this entry from the bucket: ##*/
/* Adjust forward links */
if (pE->prev != NULL)
(pE->prev)->next = pE->next;
else
pB->entry = pE->next;
/* Adjust backward links */
if (pE->next != NULL)
(pE->next)->prev = pE->prev;
if (pE != NULL)
free(pE);
/*## If this bucket is now empty, remove it from the list: ##*/
if (pB->entry == NULL) {
if (pB->prev != NULL)
(pB->prev)->next = pB->next;
else
pGB->Root = pB->next;
if (pB->next != NULL)
(pB->next)->prev = pB->prev;
if (pB != NULL)
free(pB);
pGB->NrBuckets--;
}
/*## Insert the vertex in the bucket of the new key: ##*/
pE = BucketInsert(pGB, key, VtxNr);
return pE;
} /* end BucketMove */
long BucketDeleteMax(struct gainbucket *pGB) {
/* This function deletes the first vertex from the first bucket.
This vertex has the maximum gain value.
The function returns the vertex number.
The first bucket must exist and it should not be empty.
The memory space of the original bucketentry is freed.
If its bucket has become empty, the space of the bucket is also freed. */
long VtxNr;
struct bucket *pB;
struct bucketentry *pE;
if (!pGB) {
fprintf(stderr, "BucketDeleteMax(): Null arguments!\n");
return -1;
}
pB = pGB->Root;
pE = (pB != NULL) ? pB->entry : NULL; /* do not dereference an empty list */
if (!pB || !pE) {
fprintf(stderr, "BucketDeleteMax(): Internal error!\n");
return -1;
}
VtxNr = pE->vtxnr;
/*## Remove this entry from the bucket: ##*/
pB->entry = pE->next;
if (pE->next != NULL)
(pE->next)->prev = NULL;
free(pE);
/*## If this bucket is now empty, remove it from the list: ##*/
if (pB->entry == NULL) {
pGB->Root = pB->next;
if (pB->next != NULL)
(pB->next)->prev = NULL;
free(pB);
pGB->NrBuckets--;
}
return(VtxNr);
} /* end BucketDeleteMax */
long GainBucketGetMaxVal(struct gainbucket *pGB) {
/* This function gives the value of the vertex with the maximum
gain value, if the gainbucket data structure is not empty.
Otherwise, it returns LONG_MIN. */
if (!pGB) return LONG_MIN;
if (pGB->NrBuckets > 0)
return((pGB->Root)->value);
else
return(LONG_MIN);
} /* end GainBucketGetMaxVal */
long GainBucketGetMaxValVertexNr(struct gainbucket *pGB) {
/* This function gives the number of the vertex with the maximum
gain value, if the gainbucket data structure is not empty.
Otherwise, it returns LONG_MIN. */
if (!pGB) return LONG_MIN;
if (pGB->NrBuckets > 0)
return(((pGB->Root)->entry)->vtxnr);
else
return(LONG_MIN);
} /* end GainBucketGetMaxValVertexNr */
int ClearGainBucket(struct gainbucket *pGB) {
/* This function deletes all vertices and buckets
and frees the corresponding memory space.
As a result, pGB->Root = NULL and pGB->NrBuckets = 0. */
struct bucket *pB;
struct bucketentry *pE;
if (!pGB) {
fprintf(stderr, "ClearGainBucket(): Null argument!\n");
return FALSE;
}
while ((pB = pGB->Root) != NULL) {
pGB->Root = pB->next;
/*## Remove all entries from this bucket: ##*/
while ((pE = pB->entry) != NULL) {
pB->entry = pE->next;
free(pE);
}
/*## Remove this bucket from the list: ##*/
free(pB);
pGB->NrBuckets--;
}
return TRUE;
} /* end ClearGainBucket */
#if defined(GAINBUCKET_LIST)
#include "GainBucketList.c"
#elif defined(GAINBUCKET_ARRAY)
#include "GainBucketArray.c"
#else
#error "A gain bucket structure should be selected in mondriaan.mk"
#endif
/* This file defines Gainbucket, a data structure
which contains numbers of data items and their values.
The data item can be a vertex and the value its gain in a move.
The numbers are integers >=0, and the values are integers
without restriction. We call the items vertices and their values gains.
The vertices are sorted in order of decreasing gain.
Vertices with the same gain value are stored together in a bucket,
implemented as a doubly linked list. The entries of a bucket,
representing vertices, are called bucketentries.
The list is terminated by NULL at both ends.
The buckets themselves are also linked in a doubly linked list.
This list is also terminated by NULL at both ends. */
#ifndef __GainBucket_h__
#define __GainBucket_h__
#include "Options.h"
struct bucketentry {
long vtxnr; /* vertex number for this bucketentry */
struct bucket *bucket; /* pointer to the bucket containing this entry */
struct bucketentry *prev; /* pointer to previous bucketentry
in the list of entries*/
struct bucketentry *next; /* pointer to next bucketentry */
};
struct bucket {
long value; /* value for the entries in this bucket */
struct bucket *prev; /* pointer to previous bucket
in the list of buckets */
struct bucket *next; /* pointer to next bucket */
struct bucketentry *entry; /* pointer to first bucketentry in this bucket */
};
struct gainbucket {
long NrBuckets; /* number of buckets in the gainbucket */
struct bucket *Root; /* pointer to the first bucket
in the list of buckets */
};
struct bucketentry *BucketInsert(struct gainbucket *pGB, long key, long VtxNr);
struct bucketentry *BucketMove(struct gainbucket *pGB, struct bucketentry *pE,
long key);
long BucketDeleteMax(struct gainbucket *pGB);
long GainBucketGetMaxVal(struct gainbucket *pGB);
long GainBucketGetMaxValVertexNr(struct gainbucket *pGB);
int ClearGainBucket(struct gainbucket *pGB);
#endif /* __GainBucket_h__ */
/* This file is a placeholder that redirects to either
the Linked List implementation or the Array implementation,
as may be chosen in mondriaan.mk */
#if defined(GAINBUCKET_LIST)
#include "GainBucketList.h"
#elif defined(GAINBUCKET_ARRAY)
#include "GainBucketArray.h"
#else
#error "A gain bucket structure should be selected in mondriaan.mk"
#endif
#include "GainBucket.h"
int InitGainBucket(struct gainbucket *pGB, long MaxValue) {
/* Initialize empty GainBucket structure.
Must be called before using any other gainbucket method. */
if (!pGB) {
fprintf(stderr, "InitGainBucket(): Null parameter!\n");
return FALSE;
}
pGB->MaxValue = MaxValue;
pGB->MaxPresentValue = LONG_MIN;
pGB->Root = (struct bucket*)calloc(2*MaxValue+1, sizeof(struct bucket));
if (pGB->Root == NULL) {
fprintf(stderr, "InitGainBucket(): Not enough memory!\n");
return FALSE;
}
long k;
for(k=-MaxValue; k<=MaxValue; ++k) {
pGB->Root[pGB->MaxValue+k].value = k;
pGB->Root[pGB->MaxValue+k].entry = NULL;
}
return TRUE;
} /* end InitGainBucket */
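/* Illustration only, not part of the Mondriaan sources: a minimal sketch
of the index mapping used above, where a gain k in [-MaxValue, MaxValue]
is stored at array position MaxValue+k. The names gain_slots, slots_init,
slot_index and slots_free are hypothetical, and each slot holds only a
counter instead of a list of bucketentries. */

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the array-based gainbucket. */
struct gain_slots {
    long max_value;   /* gains lie in [-max_value, max_value] */
    long *count;      /* 2*max_value+1 slots, one per possible gain */
};

/* Allocate zero-initialised slots; returns 1 on success, 0 on failure. */
int slots_init(struct gain_slots *s, long max_value) {
    assert(s != NULL && max_value >= 0);
    s->max_value = max_value;
    s->count = calloc((size_t)(2 * max_value + 1), sizeof *s->count);
    return s->count != NULL;
}

/* Map a gain value to its slot index; returns -1 if it is out of range. */
long slot_index(const struct gain_slots *s, long gain) {
    assert(s != NULL);
    if (gain > s->max_value || gain < -s->max_value)
        return -1;
    return s->max_value + gain;
}

/* Release the slot array. */
void slots_free(struct gain_slots *s) {
    assert(s != NULL);
    free(s->count);
    s->count = NULL;
}
```

/* With max_value = 3 the sketch allocates 7 slots: gain -3 maps to
index 0, gain 0 to index 3 and gain 3 to index 6. */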
struct bucketentry *BucketInsert(struct gainbucket *pGB, long key, long VtxNr) {
/* This function inserts vertex number VtxNr in the gainbucket GB,
in the appropriate bucket with key value. The vertex must not be
present already. A bucketentry representing VtxNr is allocated
and a pointer to this bucketentry is returned. */
struct bucket *pB;
struct bucketentry *pE;
if (!pGB || !pGB->Root) {
fprintf(stderr, "BucketInsert(): Null parameter!\n");
return NULL;
}
if(key > pGB->MaxValue || key < -pGB->MaxValue) {
fprintf(stderr, "BucketInsert(): Invalid key!\n");
return NULL;
}
pB = &(pGB->Root[pGB->MaxValue+key]);
/*## Insert the vertex number into a new bucketentry at
the start of the bucket: ##*/
pE = (struct bucketentry *) malloc(sizeof(struct bucketentry));
if (pE == NULL) {
return NULL;
}
pE->vtxnr = VtxNr;
pE->bucket = pB;
pE->prev = NULL;
pE->next = pB->entry;
if (pE->next != NULL) {
(pE->next)->prev = pE;
}
else {
pGB->NrBuckets++;
if(key > pGB->MaxPresentValue)
pGB->MaxPresentValue = key;
}
pB->entry = pE;
return pE;
} /* end BucketInsert */
struct bucketentry *BucketMove(struct gainbucket *pGB, struct bucketentry *pE,
long key) {
/* This function moves the vertex number represented by bucketentry pE
in the gainbucket GB to the appropriate bucket with key value.
The key value must differ from the current value.
A pointer to the bucketentry is returned. */
struct bucket *pB;
if (!pGB || !pE || !pGB->Root) {
fprintf(stderr, "BucketMove(): Null parameter!\n");
return NULL;
}
pB = pE->bucket;
if (pB->value == key) {
return NULL;
}
if(key > pGB->MaxValue || key < -pGB->MaxValue) {
fprintf(stderr, "BucketMove(): Invalid key!\n");
return NULL;
}
/*## Remove this entry from the bucket: ##*/
/* Adjust forward links */
if (pE->prev != NULL)
(pE->prev)->next = pE->next;
else
pB->entry = pE->next;
/* Adjust backward links */
if (pE->next != NULL)
(pE->next)->prev = pE->prev;
/*## Check empty bucket: ##*/
if (pB->entry == NULL) {
pGB->NrBuckets--;
/* If this bucket was the max bucket and the new value is lower, find the new max bucket */
if(pB->value == pGB->MaxPresentValue && key < pGB->MaxPresentValue) {
pGB->MaxPresentValue = LONG_MIN;
for(pB--;pB>=pGB->Root;pB--) {
if(pB->entry != NULL) {
pGB->MaxPresentValue = pB->value;
break;
}
if(pB->value < key)
break;
}
}
}
pB = &(pGB->Root[pGB->MaxValue+key]);
/*## Reassign the bucketentry to the new bucket: ##*/
pE->bucket = pB;
pE->prev = NULL;
pE->next = pB->entry;
if (pE->next != NULL) {
(pE->next)->prev = pE;
}
else {
pGB->NrBuckets++;
if(key > pGB->MaxPresentValue)
pGB->MaxPresentValue = key;
}
pB->entry = pE;
return pE;
} /* end BucketMove */
long BucketDeleteMax(struct gainbucket *pGB) {
/* This function deletes the first vertex from the first bucket.
This vertex has the maximum gain value.
The function returns the vertex number.
The first bucket must exist and it should not be empty.
The memory space of the original bucketentry is freed. */
long VtxNr;
struct bucket *pB;
struct bucketentry *pE;
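if (!pGB || !pGB->Root) {
fprintf(stderr, "BucketDeleteMax(): Null parameter!\n");
return -1;
}
/* An empty gainbucket has MaxPresentValue == LONG_MIN,
which must not be used to compute an array index. */
if (pGB->MaxPresentValue == LONG_MIN) {
return -1;
}
pB = &(pGB->Root[pGB->MaxValue+pGB->MaxPresentValue]);
pE = pB->entry;
if (!pE) {
return -1;
}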
VtxNr = pE->vtxnr;
/*## Remove this entry from the bucket: ##*/
pB->entry = pE->next;
if (pE->next != NULL)
(pE->next)->prev = NULL;
free(pE);
/*## If this bucket is now empty, find the new max bucket: ##*/
if (pB->entry == NULL) {
pGB->NrBuckets--;
pGB->MaxPresentValue = LONG_MIN;
for(pB--;pB>=pGB->Root;pB--) {
if(pB->entry != NULL) {
pGB->MaxPresentValue = pB->value;
break;
}
}
}
return(VtxNr);
} /* end BucketDeleteMax */
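/* Illustration only, not part of the Mondriaan sources: a minimal sketch
of the downward scan used above to find the new maximum present value
after the top bucket has become empty. The function next_max_gain and the
count array (number of entries per gain, indexed by max_value+gain) are
hypothetical simplifications. */

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Starting just below from_gain, return the largest gain whose slot is
   non-empty, or LONG_MIN if no such slot exists. */
long next_max_gain(const long *count, long max_value, long from_gain) {
    long g;
    assert(count != NULL && max_value >= 0);
    assert(from_gain >= -max_value && from_gain <= max_value);
    for (g = from_gain - 1; g >= -max_value; g--) {
        if (count[max_value + g] > 0)
            return g;
    }
    return LONG_MIN;
}
```

/* The scan costs time proportional to the number of empty slots skipped,
which is the price paid for constant-time access to a bucket by its value. */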
long GainBucketGetMaxVal(struct gainbucket *pGB) {
/* This function gives the value of the vertex with the maximum
gain value, if the gainbucket data structure is not empty.
Otherwise, it returns LONG_MIN. */
if (!pGB || !pGB->Root) {
return LONG_MIN;
}
if (pGB->NrBuckets > 0)
return((pGB->Root[pGB->MaxValue+pGB->MaxPresentValue]).value);
else
return(LONG_MIN);
} /* end GainBucketGetMaxVal */
long GainBucketGetMaxValVertexNr(struct gainbucket *pGB) {
/* This function gives the number of the vertex with the maximum
gain value, if the gainbucket data structure is not empty.
Otherwise, it returns LONG_MIN. */
if (!pGB || !pGB->Root) {
return LONG_MIN;
}
if (pGB->NrBuckets > 0)
return(((pGB->Root[pGB->MaxValue+pGB->MaxPresentValue]).entry)->vtxnr);
else
return(LONG_MIN);
} /* end GainBucketGetMaxValVertexNr */
int ClearGainBucket(struct gainbucket *pGB) {
/* This function deletes all bucketentries
and frees the corresponding memory space.
As a result, pGB->NrBuckets = 0. */
struct bucket *pB;
struct bucketentry *pE;
if (!pGB) {
fprintf(stderr, "ClearGainBucket(): Null parameter!\n");
return FALSE;
}
if(pGB->Root == NULL) {
return TRUE;
}
for(pB=&(pGB->Root[2*pGB->MaxValue]); pB>=pGB->Root; pB--) {
if(pB->entry == NULL) {
continue;
}
/*## Remove all entries from this bucket: ##*/
while ((pE = pB->entry) != NULL) {
pB->entry = pE->next;
free(pE);
}
pGB->NrBuckets--;
}
pGB->MaxPresentValue = LONG_MIN;
return TRUE;
} /* end ClearGainBucket */
int DeleteGainBucket(struct gainbucket *pGB) {
/* This function deletes the GainBucket structure
and frees all corresponding memory space.
As a result, pGB->Root = NULL and the structure
cannot be used any more until InitGainBucket() is
called. */
if (!pGB) {
fprintf(stderr, "DeleteGainBucket(): Null parameter!\n");
return FALSE;
}
if(!ClearGainBucket(pGB)) {
return FALSE;
}
if(pGB->Root == NULL)
return TRUE;
free(pGB->Root);
pGB->Root = NULL;
return TRUE;
} /* end DeleteGainBucket */
/* This file defines Gainbucket, a data structure
which contains numbers of data items and their values.
The data item can be a vertex and the value its gain in a move.
The numbers are integers >=0, and the values are integers
in the range [-MaxValue, MaxValue], where MaxValue is set by InitGainBucket().
We call the items vertices and their values gains.
The vertices are sorted in order of decreasing gain.
Vertices with the same gain value are stored together in a bucket,
implemented as a doubly linked list. The entries of a bucket,
representing vertices, are called bucketentries.
The list is terminated by NULL at both ends.
The buckets themselves are stored in an array. */
#ifndef __GainBucket_h__
#define __GainBucket_h__
#include "Options.h"
struct bucketentry {
long vtxnr; /* vertex number for this bucketentry */
struct bucket *bucket; /* pointer to the bucket containing this entry */
struct bucketentry *prev; /* pointer to previous bucketentry
in the list of entries*/
struct bucketentry *next; /* pointer to next bucketentry */
};
struct bucket {
long value; /* value for the entries in this bucket */
struct bucketentry *entry; /* pointer to first bucketentry in this bucket */
};
struct gainbucket {
long NrBuckets; /* number of non-empty buckets in the gainbucket */
long MaxValue; /* Maximum bucket value possible */
long MaxPresentValue; /* Maximum bucket value present,
or LONG_MIN if there is none */
struct bucket *Root; /* pointer to the array of 2*MaxValue+1 buckets;
the bucket for value k is Root[MaxValue+k] */
};
int InitGainBucket(struct gainbucket *pGB, long MaxValue);
struct bucketentry *BucketInsert(struct gainbucket *pGB, long key, long VtxNr);
struct bucketentry *BucketMove(struct gainbucket *pGB, struct bucketentry *pE,
long key);
long BucketDeleteMax(struct gainbucket *pGB);
long GainBucketGetMaxVal(struct gainbucket *pGB);
long GainBucketGetMaxValVertexNr(struct gainbucket *pGB);
int ClearGainBucket(struct gainbucket *pGB);
int DeleteGainBucket(struct gainbucket *pGB);
#endif /* __GainBucket_h__ */
#include "GainBucket.h"
int InitGainBucket(struct gainbucket *pGB, long MaxValue) {
/* Initialize empty GainBucket structure.
Must be called before using any other gainbucket method. */
/* Empty, this linked list implementation does not require initialization */
return TRUE;
} /* end InitGainBucket */
struct bucketentry *_BucketInsert(struct gainbucket *pGB, struct bucketentry *pE, long key, long VtxNr) {
/* Private function, to be used by functions in this file only.
This function inserts vertex number VtxNr in the gainbucket GB,
in the appropriate bucket with key value. The vertex must not be
present already. It uses the bucket entry pE provided by the
calling function. */
struct bucket *pB, **ppB;
if (!pGB || !pE) {
fprintf(stderr, "_BucketInsert(): Null argument!\n");
return NULL;
}
ppB = &(pGB->Root);
pB = *ppB;
/*## While the key is smaller than the value in the buckets,
go through the bucket list: ##*/
while (*ppB != NULL) {
pB = *ppB; /* pB points to current bucket */
if (key < pB->value)
ppB = &(pB->next); /* points to next bucket */
else
break;
}
if (pB == NULL) {
/*## Bucket list is empty. Create first bucket: ##*/
*ppB = (struct bucket *) malloc(sizeof(struct bucket));
if (*ppB == NULL) {
fprintf(stderr, "_BucketInsert(): Not enough memory for first bucket!\n");
return NULL;
}
(*ppB)->value = key;
(*ppB)->entry = NULL;
(*ppB)->prev = (*ppB)->next = NULL;
pGB->NrBuckets++;
} else if (key < pB->value) {
/*## Create new bucket at the end of the current list: ##*/
/* *ppB == NULL */
*ppB = (struct bucket *) malloc(sizeof(struct bucket));
if (*ppB == NULL) {
fprintf(stderr, "_BucketInsert(): Not enough memory for new bucket!\n");
return NULL;
}
(*ppB)->value = key;
(*ppB)->entry = NULL;
(*ppB)->prev = pB;
(*ppB)->next = NULL;
pGB->NrBuckets++;
} else if (key > pB->value) {
/*## Create new bucket between the previous and the current
bucket in the list: ##*/
*ppB = (struct bucket *) malloc(sizeof(struct bucket));
if (*ppB == NULL) {
*ppB = pB; /* restore the link so that the list is not truncated */
fprintf(stderr, "_BucketInsert(): Not enough memory for new bucket!\n");
return NULL;
}
(*ppB)->value = key;
(*ppB)->entry = NULL;
(*ppB)->prev = pB->prev;
(*ppB)->next = pB;
if (pB->prev != NULL)
(pB->prev)->next = *ppB;
else
pGB->Root = *ppB;
pB->prev = *ppB;
pGB->NrBuckets++;
}
/* otherwise key == pB->value (and hence pB == *ppB) */
/*## Insert the vertex number into a new bucketentry at
the start of the bucket: ##*/
pE->vtxnr = VtxNr;
pE->bucket = *ppB;
pE->prev = NULL;
pE->next = (*ppB)->entry;
if (pE->next != NULL)
(pE->next)->prev = pE;
(*ppB)->entry = pE;
return pE;
} /* end _BucketInsert */
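/* Illustration only, not part of the Mondriaan sources: a minimal sketch
of the pointer-to-pointer walk used above to find or create the bucket
for a key in a list sorted by decreasing value. The names node,
find_or_insert and free_list are hypothetical, and the list is singly
linked to keep the example short. */

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical simplified bucket: one node per distinct gain value. */
struct node {
    long value;
    struct node *next;
};

/* Find or create the node for key in a list sorted by decreasing value.
   Returns the node, or NULL if allocation fails (list left unchanged). */
struct node *find_or_insert(struct node **head, long key) {
    struct node **pp = head, *n;
    assert(head != NULL);
    /* Skip all nodes whose value is larger than the key. */
    while (*pp != NULL && key < (*pp)->value)
        pp = &(*pp)->next;
    if (*pp != NULL && (*pp)->value == key)
        return *pp;
    /* Allocate first and link afterwards, so a failure changes nothing. */
    n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = key;
    n->next = *pp;   /* splice in before the first smaller node */
    *pp = n;
    return n;
}

/* Free all nodes of the list. */
void free_list(struct node *head) {
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

/* Inserting the keys 2, 5 and -1 in that order yields the list 5, 2, -1;
a second request for key 2 returns the existing node. */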
struct bucketentry *BucketInsert(struct gainbucket *pGB, long key, long VtxNr) {
/* This function inserts vertex number VtxNr in the gainbucket GB,
in the appropriate bucket with key value. The vertex must not be
present already. A bucketentry representing VtxNr is allocated
and a pointer to this bucketentry is returned. */
struct bucketentry *pE, *pR;
pE = (struct bucketentry *) malloc(sizeof(struct bucketentry));
if (pE == NULL) {
fprintf(stderr, "BucketInsert(): Not enough memory for bucket entry!\n");
return NULL;
}
pR = _BucketInsert(pGB, pE, key, VtxNr);
if (pR == NULL)
free(pE); /* the entry was not linked in, so free it to avoid a leak */
return pR;
} /* end BucketInsert */
struct bucketentry *BucketMove(struct gainbucket *pGB, struct bucketentry *pE,
long key) {
/* This function moves the vertex number represented by bucketentry pE
in the gainbucket GB to the appropriate bucket with key value.
The key value must differ from the current value.
A pointer to the bucketentry is returned.
If the original bucket has become empty, the space of the bucket is freed. */
long VtxNr;
struct bucket *pB;
if (!pGB || !pE) {
fprintf(stderr, "BucketMove(): Null arguments!\n");
return NULL;
}
VtxNr = pE->vtxnr;
pB = pE->bucket;
if (pB->value == key) {
fprintf(stderr, "BucketMove(): Destination bucket equals source!\n");
return NULL;
}
/*## Remove this entry from the bucket: ##*/
/* Adjust forward links */
if (pE->prev != NULL)
(pE->prev)->next = pE->next;
else
pB->entry = pE->next;
/* Adjust backward links */
if (pE->next != NULL)
(pE->next)->prev = pE->prev;
/*## If this bucket is now empty, remove it from the list: ##*/
if (pB->entry == NULL) {
if (pB->prev != NULL)
(pB->prev)->next = pB->next;
else
pGB->Root = pB->next;
if (pB->next != NULL)
(pB->next)->prev = pB->prev;
free(pB); /* pB is known to be non-NULL here */
pGB->NrBuckets--;
}
/*## Insert the vertex in the bucket of the new key: ##*/
return _BucketInsert(pGB, pE, key, VtxNr);
} /* end BucketMove */
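/* Illustration only, not part of the Mondriaan sources: a minimal sketch
of the two list operations that BucketMove combines, namely unlinking an
entry from a NULL-terminated doubly linked list and pushing it at the
front of another one. The names dnode, unlink_node and push_front are
hypothetical. */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified entry of a doubly linked list. */
struct dnode {
    long id;
    struct dnode *prev, *next;
};

/* Unlink e from the list whose first element is *head; e is not freed. */
void unlink_node(struct dnode **head, struct dnode *e) {
    assert(head != NULL && e != NULL);
    if (e->prev != NULL)
        e->prev->next = e->next;   /* adjust forward link */
    else
        *head = e->next;           /* e was the first element */
    if (e->next != NULL)
        e->next->prev = e->prev;   /* adjust backward link */
    e->prev = e->next = NULL;
}

/* Insert e at the start of the list whose first element is *head. */
void push_front(struct dnode **head, struct dnode *e) {
    assert(head != NULL && e != NULL);
    e->prev = NULL;
    e->next = *head;
    if (*head != NULL)
        (*head)->prev = e;
    *head = e;
}
```

/* Both operations take constant time, which is why each vertex keeps a
pointer to its own entry instead of searching for it. */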
long BucketDeleteMax(struct gainbucket *pGB) {
/* This function deletes the first vertex from the first bucket.
This vertex has the maximum gain value.
The function returns the vertex number.
The first bucket must exist and it should not be empty.
The memory space of the original bucketentry is freed.
If its bucket has become empty, the space of the bucket is also freed. */
long VtxNr;
struct bucket *pB;
struct bucketentry *pE;
if (!pGB) {
fprintf(stderr, "BucketDeleteMax(): Null arguments!\n");
return -1;
}
pB = pGB->Root;
if (!pB || !pB->entry) {
fprintf(stderr, "BucketDeleteMax(): Internal error!\n");
return -1;
}
pE = pB->entry;
VtxNr = pE->vtxnr;
/*## Remove this entry from the bucket: ##*/
pB->entry = pE->next;
if (pE->next != NULL)
(pE->next)->prev = NULL;
free(pE);
/*## If this bucket is now empty, remove it from the list: ##*/
if (pB->entry == NULL) {
pGB->Root = pB->next;
if (pB->next != NULL)
(pB->next)->prev = NULL;
free(pB);
pGB->NrBuckets--;
}
return(VtxNr);
} /* end BucketDeleteMax */
long GainBucketGetMaxVal(struct gainbucket *pGB) {
/* This function gives the value of the vertex with the maximum
gain value, if the gainbucket data structure is not empty.
Otherwise, it returns LONG_MIN. */
if (!pGB) return LONG_MIN;
if (pGB->NrBuckets > 0)
return((pGB->Root)->value);
else
return(LONG_MIN);
} /* end GainBucketGetMaxVal */
long GainBucketGetMaxValVertexNr(struct gainbucket *pGB) {
/* This function gives the number of the vertex with the maximum
gain value, if the gainbucket data structure is not empty.
Otherwise, it returns LONG_MIN. */
if (!pGB) return LONG_MIN;
if (pGB->NrBuckets > 0)
return(((pGB->Root)->entry)->vtxnr);
else
return(LONG_MIN);
} /* end GainBucketGetMaxValVertexNr */
int ClearGainBucket(struct gainbucket *pGB) {
/* This function deletes all vertices and buckets
and frees the corresponding memory space.
As a result, pGB->Root = NULL and pGB->NrBuckets = 0. */
struct bucket *pB;
struct bucketentry *pE;
if (!pGB) {
fprintf(stderr, "ClearGainBucket(): Null argument!\n");
return FALSE;
}
while ((pB = pGB->Root) != NULL) {
pGB->Root = pB->next;
/*## Remove all entries from this bucket: ##*/
while ((pE = pB->entry) != NULL) {
pB->entry = pE->next;
free(pE);
}
/*## Remove this bucket from the list: ##*/
free(pB);
pGB->NrBuckets--;
}
return TRUE;
} /* end ClearGainBucket */
int DeleteGainBucket(struct gainbucket *pGB) {
/* This function is the counterpart of InitGainBucket().
In this linked list implementation it does nothing: buckets are
allocated and freed on demand, so there is no fixed storage to release.
It does not free any remaining buckets or bucketentries;
use ClearGainBucket() for that. */
return TRUE;
} /* end DeleteGainBucket */