Beowulf Performance Suite V 1.3-1
March 22, 2005
Douglas Eadline Douglas@Eadline.org
www.basement-supercomputing.com

Purpose:
========

This package is a collection of performance analysis programs for use
with Beowulf clusters. The suite itself provides a graphical user
interface for running the programs and generates HTML files from the output.


Quick Start:
============


1) Install the rpm - "rpm -ivh <rpmfile>"
   If the rpm fails its dependency checks because packages are
   missing, add the packages and retry.
   (See below for more information.)
2) To run the NAS suite
   You will need MPICH_HOME set to your MPICH path and
   LAM_HOME set to your LAM-MPI path.  Also, if you wish to
   use LAM-MPI, you will need to make sure LAM's bin path is in your PATH
   so that LAM can start on the nodes.
3) "man bps" and the README.bps file in the
    nas tar ball are your friends.
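The environment described in step 2 might be set up in a shell startup
file along the following lines. This is only a sketch: /opt/mpich and
/opt/lam are placeholder paths, not locations that bps requires, and the
sketch assumes that LAM's executables live in $LAM_HOME/bin.

```shell
# Hypothetical install locations; substitute the paths used on your cluster.
MPICH_HOME=${MPICH_HOME:-/opt/mpich}
LAM_HOME=${LAM_HOME:-/opt/lam}
export MPICH_HOME LAM_HOME

# Prepend LAM's bin directory to PATH unless it is already there
# (assumes the LAM executables are in $LAM_HOME/bin).
case ":$PATH:" in
    *":$LAM_HOME/bin:"*) ;;
    *) PATH="$LAM_HOME/bin:$PATH" ;;
esac
export PATH
```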

Important Notes:
================

The bps suite is best run as a regular user. Some of the tests (e.g. NAS
parallel) will not run as root.

Not all features of the command line interface are available in the GUI.

When using the Netpipe/Netperf benchmarks, rsh with no password must be
permitted between the nodes on which the benchmark is to be run.
Most clusters are already configured this way.

Under normal operation, bps will always overwrite the existing log
directory. You can use the -w option to prevent this from happening.
In addition, you can copy previous log files (from older log directories)
into the current log directory for bps-html conversion.
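The copy step could look like the sketch below. The function name
keep_old_logs and the directory names in the example are illustrative
only and are not part of bps; files that already exist in the current
log directory are left untouched.

```shell
# keep_old_logs OLD_DIR CUR_DIR
# Copy regular files from an older log directory into the current one,
# skipping any file name that already exists in CUR_DIR.
keep_old_logs() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        [ -e "$2/${f##*/}" ] || cp -p "$f" "$2/"
    done
}

# Example (hypothetical directories):
#   keep_old_logs ~/bps-logs.old ~/bps-logs
#   bps-html ~/bps-logs
```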

Also, the tests have been designed so that the bps rpm only needs to be
installed on the head node. For this to work, the bps log directory must be
mounted on all nodes (e.g. under /home).

The NAS tests were originally designed to work with the LAM, MPICH, and
MPI/PRO MPIs and the GNU, PGI, and Intel compilers.  Current versions of
the MPI/PRO, PGI, and Intel packages have not been tested, which means
they probably will not work.

If problems result when using the NAS Parallel Benchmarks, please
see the NAS documentation for more information. Normally the problems
involve running MPI, compiler configuration, or linking. To make it as easy
as possible, the benchmark scripts have been written to rely on the
two environment variables LAM_HOME and MPICH_HOME for LAM-MPI and
MPICH. These variables MUST point to the appropriate MPI installation.
If you are having problems with the NAS benchmarks, extract the
npb.tar.gz archive in the /opt/bps/src directory and try running the scripts
manually. (You may also use the -k option to preserve the directory from
which the NAS tests were run. This directory will be under your bps-log
directory.) Consult the README.bps file for more information.
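Before running the scripts by hand, a quick check of the two variables
can rule out the most common misconfiguration. The helper below is a
sketch, not part of bps: the name check_mpi_home is made up, and it only
tests that a variable is set and names an existing directory, not that
the MPI installation found there actually works.

```shell
# check_mpi_home VAR_NAME
# Succeeds if the named variable is set and points to an existing directory.
check_mpi_home() {
    eval "dir=\${$1:-}"
    if [ -z "$dir" ]; then
        echo "$1 is not set" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "$1=$dir is not a directory" >&2
        return 1
    fi
}

# Example:
#   check_mpi_home LAM_HOME
#   check_mpi_home MPICH_HOME
```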


Install Procedure:
==================

Using the rpm file:
(version numbers may vary)

  rpm -i bps-1.3-1.i386.rpm

Using the source rpm file:
(Do this only if the rpm does not install on your system)

  rpm -i bps-1.3-1.src.rpm  (install src rpm)

  rpmbuild -bb bps.spec  (build the rpm)

  rpm -i /usr/src/redhat/RPMS/i386/bps-1.3-1.i386.rpm  (install the rpm)


Using the tarball:

  tar -xvzf <bps tarball>.tar.gz
  cd <bps dir>
  sh build-all

This will put all important files in ~bps/bin and ~bps/src.


Usage:
======

bps
        run benchmarks included in bps from command line

  Options:
    -b                            bonnie++
    -s                            stream
    -f <send node>,<receive node> netperf to remote node
    -p <send node>,<receive node> netpipe to remote node
    -n <compiler>,<#processors>,  NAS parallel benchmarks
     <test size>,<MPI>,           compiler={gnu,pgi,intel}
     <machine1,machine2,...>      test size={A,B,C,dummy}
                                  MPI={mpich,lam,mpipro}
    -k                            keep NAS directory when finished
    -u                            unixbench
    -m                            lmbench
    -l <log_dir>                  benchmark log directory
    -w                            preserve existing log directory
    -i <mboard manufacturer>,     machine information
       <mboard model>,<memory>,
       <interconnect>,<linux ver>
    -v                            show version
    -h                            show this help

bps-html <log directory>

        generate html output files based on files in <log directory>
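Because the -n option packs several comma-separated fields into one
argument, it can help to assemble that argument with a small helper.
The sketch below is not part of bps: nas_arg and the node names are made
up for illustration, the field order follows the option list above, and
it assumes that at least one machine is listed.

```shell
# nas_arg COMPILER NPROCS SIZE MPI MACHINE...
# Print the comma-separated argument in the form shown for "bps -n".
nas_arg() {
    if [ "$#" -lt 5 ]; then
        echo "usage: nas_arg compiler nprocs size mpi machine..." >&2
        return 1
    fi
    # Inside the subshell, "$*" joins the arguments with the first
    # character of IFS, here a comma.
    ( IFS=,; printf '%s\n' "$*" )
}

# Example (hypothetical node names):
#   bps -n "$(nas_arg gnu 4 A mpich node1 node2)"
# which runs: bps -n gnu,4,A,mpich,node1,node2
```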


In Case of Problems:
====================

The BPS suite is a collection of many tests. You should have minimal or
no problems with the single machine tests. As more machines are involved
in the tests, there is room for more configuration errors to arise.

If a test does not run, the best thing to do is to check the "test_name.log"
file in the log directory. In the case of the NAS tests, the results are
in the form npb.COMPILER.MPI.CLASS.PROCESSORS.  In general, if you are
having problems with a test it may be best to run it from the command line.
In the case of the NAS suite, the "-k" option will keep the npb directory
in the log directory so you can run the tests more directly by using
the "run_suite" script in the npb directory. Also the README.bps
file in the npb directory should provide more information on how the tests
are run and how to resolve possible problems.
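Since the NAS result names encode the run parameters, they can be split
back into fields when sorting through a log directory. This is a minimal
sketch that assumes none of the fields contains a dot; the function name
npb_fields and the sample name in the example are illustrative only.

```shell
# npb_fields NAME
# Split npb.COMPILER.MPI.CLASS.PROCESSORS into its four fields.
npb_fields() {
    case $1 in
        npb.*.*.*.*) ;;
        *) echo "not a NAS result name: $1" >&2; return 1 ;;
    esac
    # In a subshell: disable globbing, split the name on dots,
    # then drop the leading "npb" field.
    ( set -f; IFS=.; set -- $1; shift
      printf 'compiler=%s mpi=%s class=%s processors=%s\n' "$1" "$2" "$3" "$4" )
}

# Example:
#   npb_fields npb.gnu.mpich.A.4
```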


Background:
===========

General:
http://www.basement-supercomputing.com

bonnie++ - hard drive performance
Reference: http://www.coker.com.au/bonnie++/

stream - memory performance
Reference: http://www.cs.virginia.edu/stream/

netperf - general network performance
Reference: http://www.netperf.org/netperf/NetperfPage.html

netpipe - detailed network performance
Reference: http://www.scl.ameslab.gov/Projects/ClusterCookbook/nprun.html

unixbench - general Unix benchmarks
Reference: http://www.linuxdoc.org/HOWTO/Benchmarking-HOWTO.html#toc3

LMbench - low level benchmarks
Reference: http://www.bitmover.com/lmbench/

NAS - parallel tests
Reference: http://www.nas.nasa.gov/Software/NPB/

The following is a description of the NAS tests.

BT is a simulated CFD application that uses an implicit
  algorithm to solve 3-dimensional (3D) compressible Navier-Stokes
  equations. The finite differences solution to the problem
  is based on an Alternating Direction Implicit (ADI) approximate
  factorization that decouples the x, y, and z dimensions.
  The resulting systems are Block-Tridiagonal of 5x5 blocks
  and are solved sequentially along each dimension.

SP is a simulated CFD application that has a similar structure
  to BT. The finite differences solution to the problem
  is based on a Beam-Warming approximate factorization that
  decouples the x, y, and z dimensions. The resulting system
  has scalar Pentadiagonal bands of linear equations that
  are solved sequentially along each dimension.

LU is a simulated CFD application that uses the symmetric successive
  over-relaxation (SSOR) method to solve a seven-block-diagonal
  system resulting from finite difference discretization
  of the Navier-Stokes equations in 3D by splitting it into
  block Lower and Upper triangular systems.

FT contains the computational kernel of a 3D fast Fourier
  Transform (FFT)-based spectral method. FT performs three
  one-dimensional (1D) FFTs, one for each dimension.

MG uses a V-cycle MultiGrid method to compute the solution
  of the 3D scalar Poisson equation. The algorithm works
  continuously on a set of grids that are made between coarse
  and fine. It tests both short and long distance data movement.

CG uses a Conjugate Gradient method to compute an approximation
  to the smallest eigenvalue of a large, sparse, unstructured
  matrix. This kernel tests unstructured grid computations
  and communications by using a matrix with randomly generated
  locations of entries.

EP is an Embarrassingly Parallel benchmark. It generates
  pairs of Gaussian random deviates according to a specific
  scheme. The goal is to establish the reference point for
  peak performance of a given platform.