root/trunk/README.txt

Revision 1, 7.6 kB (checked in by root, 3 years ago)

initial import

1 Paralogic Beowulf Performance Suite V 1.3-1
2 December 6, 2002
3 Doug Eadline deadline@plogic.com
4 www.plogic.com/bps
5
6 Purpose:
7 ========
8
9 This package is a collection of performance analysis programs for use
10 with Beowulf clusters. The suite itself provides a graphical user
11 interface for running the programs as well as html file generation of output.
12
13
14 Quick Start:
15 ============
16
17
18 1) Install the rpm - "rpm -ivh <rpmfile>"
19    If the rpm fails dependencies, use the source rpm.
20    (See below for more information.)
21 2) Either use the Paralogic module facility or make
22    sure your MPI and compiler paths are set correctly.
23    You will need MPICH_HOME set to your MPICH path,
24    LAM_HOME set to your LAM-MPI path, and MPIPRO_HOME set to
25    your MPI-PRO path. Also, if you wish to
26    use LAM-MPI, you will need LAM's bin directory in your PATH
27    so that LAM can start on the nodes.
28 3) Run xbps  -  xbps &
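
Step 2 can be sketched as the shell fragment below. The install prefixes under /opt are placeholders chosen for illustration, not paths shipped with BPS; substitute the locations of your own MPI installations.

```shell
# Hypothetical install prefixes; adjust to match your cluster.
# Values already present in the environment are kept.
export MPICH_HOME="${MPICH_HOME:-/opt/mpich}"
export LAM_HOME="${LAM_HOME:-/opt/lam}"
export MPIPRO_HOME="${MPIPRO_HOME:-/opt/mpipro}"

# LAM needs its bin directory in PATH so it can start on the nodes.
case ":$PATH:" in
  *":$LAM_HOME/bin:"*) ;;                      # already present
  *) PATH="$LAM_HOME/bin:$PATH"; export PATH ;;
esac

# Show the settings the benchmark scripts will see.
echo "MPICH_HOME=$MPICH_HOME"
echo "LAM_HOME=$LAM_HOME"
echo "MPIPRO_HOME=$MPIPRO_HOME"
```

Placing these lines in your shell startup file is one way to make the settings visible on every node, provided the home directory is shared.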
29
30
31 Important Notes:
32 ================
33
34 The bps suite is best run as an ordinary (non-root) user. Some of the
35 tests (e.g. NAS parallel) will not run as root.
36
37 Not all features of the command line interface are available in the GUI.
38
39 When using Netpipe/Netperf Benchmarks, rsh with no password must be
40 permitted between the nodes upon which the benchmark is to be run.
41 This behavior is typical of most clusters.
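
Before running Netpipe or Netperf, it may help to confirm that rsh really works without a password. The sketch below is not part of BPS; the function name check_rsh and the node names in the usage comment are made up for illustration.

```shell
# RSH may be overridden (for example with ssh) when trying this sketch.
RSH="${RSH:-rsh}"

# check_rsh NODE...: print "ok" or "FAILED" for each node and return
# non-zero if any node could not run a trivial command.
check_rsh() {
  status=0
  for node in "$@"; do
    # -n redirects stdin from /dev/null so the remote shell never
    # waits for input from the terminal.
    if "$RSH" -n "$node" true >/dev/null 2>&1; then
      echo "$node: ok"
    else
      echo "$node: FAILED"
      status=1
    fi
  done
  return $status
}

# Example (node names are placeholders):
#   check_rsh node1 node2
```

A node reported as FAILED is worth fixing before starting the network benchmarks, since they depend on the same remote access.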
42
43 Under normal operation, xbps will always preserve the existing log directory.
44 This feature is to ensure previous results will not be overwritten. You can
45 copy previous log files (from log directories) into the current log directory
46 for bps-html conversion.
47
48 Also, the tests have been designed so that the bps rpm only needs to be
49 installed on the head node. For this to work, the bps log directory must be
50 mounted on all nodes (e.g. under /home).
51  
52 When using the NAS Parallel Benchmarks it is advisable to use the MPI
53 implementations that Paralogic uses for its own benchmarking. However, rather
54 than limit potential BPS users, these are not part of the required packages
55 list. The benchmark scripts rely on the two environment variables LAM_HOME
56 and MPICH_HOME (for LAM-MPI and MPICH). If you are having problems with the
57 NAS benchmarks, extract the npb.tar.gz archive in the /usr/bps/src directory
58 and try running the scripts by hand. Consult the README.plogic file for
59 more information. Also, if you wish to use the Portland Group or
60 the Intel compilers, make sure you have these properly configured.
61
62 Any suggestions for improving the tests are welcome.
63 Please email the BPS mailing list:  bps@plogic.com
64
65
66 Install Procedure:
67 ==================
68
69 Using the rpm file: 
70 (version numbers may vary)
71
72   rpm -i bps-1.2-7.i386.rpm
73
74 Using the source rpm file:
75 (Do this only if the rpm does not install on your system)
76
77   rpm -i bps-1.2-7.src.rpm  (install src rpm)
78
79   rpm -bb bps.spec  (build the rpm)
80
81   rpm -i /usr/src/redhat/RPMS/i386/bps-1.2-7.i386.rpm  (install the rpm)
82
83
84 Using the tarball:
85
86  tar -xvzf <bps tarball>.tar.gz
87  cd <bps dir>
88  sh build-all
89
90 This will put all important files in ~bps/bin and ~bps/src.
91
92
93 Usage:
94 ======
95
96 xbps
97         Run bps in graphical mode. This mode is a bit easier to use than
98         the command line mode.
99
100 bps
101         Run the benchmarks included in bps from the command line.
102
103   Options:
104     -b                            bonnie++
105     -s                            stream
106     -f <send node>,<receive node> netperf to remote node
107     -p <send node>,<receive node> netpipe to remote node
108     -n <compiler>,<#processors>,  NAS parallel benchmarks
109      <test size>,<MPI>,           compiler={gnu,pgi,intel}
110      <machine1,machine2,...>      test size={A,B,C,dummy}
111                                   MPI={mpich,lam,mpipro}
112     -k                            keep NAS directory when finished
113     -u                            unixbench
114     -m                            lmbench
115     -l <log_dir>                  benchmark log directory
116     -w                            preserve existing log directory
117     -i <mboard manufacturer>,     machine information
118        <mboard model>,<memory>,
119        <interconnect>,<linux ver>
120     -v                            show version
121     -h                            show this help
122
123 bps-html <log directory>
124
125         generate html output files based on files in <log directory>           
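
As an illustration of the -n argument format, the sketch below assembles the comma-separated value from its parts and prints a candidate command line rather than running it. The helper nas_spec, the node names, and the log directory are assumptions made for this example; combining -n with -k and -l in one invocation is inferred from the option list, not confirmed by it.

```shell
# nas_spec COMPILER NPROCS CLASS MPI MACHINE...
# Joins its arguments with commas, in the order the -n option lists them.
nas_spec() {
  ( IFS=,; printf '%s\n' "$*" )
}

# Placeholder values: GNU compilers, 4 processors, class A, MPICH, two nodes.
spec=$(nas_spec gnu 4 A mpich node1 node2)

# Print the command instead of running it, since bps may not be installed.
echo "bps -n $spec -k -l \$HOME/bps-log"
# prints: bps -n gnu,4,A,mpich,node1,node2 -k -l $HOME/bps-log
```

Once the benchmarks have finished, passing the same log directory to bps-html generates the html output.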
126
127
128 In Case of Problems:
129 ====================
130
131 The BPS suite is a collection of many tests. You should have minimal or
132 no problems with the single machine tests. As more machines are involved in
133 the tests, there is more room for configuration errors to arise.
134
135 If a test does not run, the best thing to do is to check the "test_name.log"
136 file in the log directory. In the case of the NAS tests, the results are
137 in the form npb.COMPILER.MPI.CLASS.PROCESSORS.  In general, if you are having
138 problems with a test it may be best to run it from the command line. In the
139 case of the NAS suite, the "-k" option will keep the npb directory
140 in the log directory so you can run the tests more directly by using
141 the "run_suite" script in the npb directory. Also the README.plogic
142 file in the npb directory should provide more information on how the tests
143 are run and how to resolve possible problems.
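
When sorting through a log directory, the NAS result names can be split back into their fields. The helper below is a sketch, not something shipped with BPS; parse_npb_name is a made-up name, and it assumes the result name has exactly the five dot-separated parts shown above with no extra suffix.

```shell
# parse_npb_name NAME: split npb.COMPILER.MPI.CLASS.PROCESSORS into fields.
# Returns non-zero if NAME does not have that shape.
parse_npb_name() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the name on dots into $1..$5
  IFS=$old_ifs
  [ "$#" -eq 5 ] && [ "$1" = "npb" ] || return 1
  echo "compiler=$2 mpi=$3 class=$4 processors=$5"
}

parse_npb_name npb.gnu.mpich.A.4
# prints: compiler=gnu mpi=mpich class=A processors=4
```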
144
145
146 Background:
147 ===========
148
149 General:
150 http://www.plogic.com/bps
151
152 bonnie++ - hard drive performance
153 Reference: http://www.coker.com.au/bonnie++/
154
155 stream - memory performance
156 Reference: http://www.cs.virginia.edu/stream/
157
158 netperf - general network performance
159 Reference: http://www.netperf.org/netperf/NetperfPage.html
160
161 netpipe - detailed network performance
162 Reference: http://www.scl.ameslab.gov/Projects/ClusterCookbook/nprun.html
163
164 unixbench - general Unix benchmarks
165 Reference: http://www.linuxdoc.org/HOWTO/Benchmarking-HOWTO.html#toc3
166
167 LMbench - low level benchmarks
168 Reference: http://www.bitmover.com/lmbench/
169
170 NAS - parallel tests
171 Reference: http://www.nas.nasa.gov/Software/NPB/
172
173 The following is a description of the NAS tests.
174
175 BT is a simulated CFD application that uses an implicit
176   algorithm to solve 3-dimensional (3D) compressible Navier-Stokes
177   equations. The finite differences solution to the problem
178   is based on an Alternating Direction Implicit (ADI) approximate
179   factorization that decouples the x, y, and z dimensions.
180   The resulting systems are Block-Tridiagonal of 5x5 blocks
181   and are solved sequentially along each dimension.
182
183 SP is a simulated CFD application that has a similar structure
184   to BT. The finite differences solution to the problem
185   is based on a Beam-Warming approximate factorization that
186   decouples the x, y, and z dimensions. The resulting system
187   has scalar Pentadiagonal bands of linear equations that
188   are solved sequentially along each dimension.
189
190 LU is a simulated CFD application that uses the symmetric successive
191   over-relaxation (SSOR) method to solve a seven-block-diagonal
192   system resulting from finite difference discretization
193   of the Navier-Stokes equations in 3D by splitting it into
194   block Lower and Upper triangular systems.
195
196 FT contains the computational kernel of a 3D fast Fourier
197   Transform (FFT)-based spectral method. FT performs three
198   one-dimensional (1D) FFTs, one for each dimension.
199
200 MG uses a V-cycle MultiGrid method to compute the solution
201   of the 3D scalar Poisson equation. The algorithm works
202   continuously on a set of grids that are made between coarse
203   and fine. It tests both short- and long-distance data movement.
204
205 CG uses a Conjugate Gradient method to compute an approximation
206   to the smallest eigenvalue of a large, sparse, unstructured
207   matrix. This kernel tests unstructured grid computations
208   and communications by using a matrix with randomly generated
209   locations of entries.
210
211 EP is an Embarrassingly Parallel benchmark. It generates
212   pairs of Gaussian random deviates according to a specific
213   scheme. The goal is to establish the reference point for
214   peak performance of a given platform.
215