
DGEMM for Alpha Chip

This is an optimized BLAS library (Level 1, Level 3, and some Level 2 routines) for Alpha, together with a few LAPACK routines (GESV, LASWP, GETF2, GETRF). Some of the Level 3 routines (GEMM, TRSM, TRMM, SYMM) and LAPACK routines (GETRF, GESV) are parallelized with POSIX threads.
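
For readers unfamiliar with the operation the library is named after, the following is a minimal, unoptimized sketch of what a GEMM routine computes (C = alpha * A * B + beta * C, column-major storage, no transposition). The function name ref_dgemm and its argument list are hypothetical and chosen for illustration; this is not the library's code or its calling interface.

```c
#include <assert.h>
#include <stddef.h>

/* Reference GEMM sketch: C = alpha * A * B + beta * C.
 * A is m x k, B is k x n, C is m x n, all stored column-major with
 * leading dimensions lda, ldb, ldc. Hypothetical helper for illustration;
 * an optimized library blocks and unrolls these loops heavily. */
void ref_dgemm(size_t m, size_t n, size_t k,
               double alpha, const double *a, size_t lda,
               const double *b, size_t ldb,
               double beta, double *c, size_t ldc)
{
    /* A leading dimension must be at least the number of rows stored. */
    assert(lda >= m && ldb >= k && ldc >= m);

    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < m; i++) {
            double sum = 0.0;
            for (size_t p = 0; p < k; p++)
                sum += a[i + p * lda] * b[p + j * ldb];

            /* When beta is zero, C is overwritten without being read. */
            if (beta == 0.0)
                c[i + j * ldc] = alpha * sum;
            else
                c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
        }
    }
}
```

The triple loop performs about 2*m*n*k floating-point operations, which is why GEMM dominates the run time of blocked LU factorizations such as GETRF.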

Linpack TPP currently reaches a peak of 1069 MFlops (the average is about 1060 MFlops) on an ES40 with a 677MHz 21264, DDR second-level cache, and Tru64 UNIX V5.1. This is faster than the 1031 MFlops that Compaq announced. On Linux, TPP runs slightly slower.
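
To put the MFlops figures in context, the sketch below converts between run time and MFlops. It assumes the conventional Linpack operation count of 2n^3/3 + 2n^2 and the usual TPP problem size of n = 1000; neither is stated on this page, and the function names are hypothetical.

```c
#include <assert.h>

/* Conventional Linpack operation count for solving an n x n system.
 * Assumption: the standard 2n^3/3 + 2n^2 convention applies here. */
double linpack_flops(double n)
{
    return (2.0 / 3.0) * n * n * n + 2.0 * n * n;
}

/* MFlops achieved when the solve takes the given number of seconds. */
double linpack_mflops(double n, double seconds)
{
    assert(seconds > 0.0);
    return linpack_flops(n) / seconds / 1.0e6;
}

/* Run time in seconds implied by a reported MFlops rate. */
double linpack_seconds(double n, double mflops)
{
    assert(mflops > 0.0);
    return linpack_flops(n) / (mflops * 1.0e6);
}
```

Under these assumptions, a rate of 1069 MFlops at n = 1000 corresponds to a solve time of roughly 0.63 seconds.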

To use the SMP version, link this library with libpthread.a (the "-lpthread" link option). You can set the number of CPUs with the environment variable "STATABO_NUM_THREADS" (currently, the maximum number of CPUs is 64). If this variable is not set, the library assumes that you have 2 CPUs.
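
The thread-count rule described above can be sketched as follows. This is an illustration only, not the library's actual code: the function names are hypothetical, and the handling of non-numeric values and of values above 64 (fall back to the default, clamp to the maximum) is an assumption, since the page states only the default of 2 and the maximum of 64.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_THREADS 2   /* used when the variable is not set */
#define MAX_THREADS     64  /* maximum number of CPUs stated above */

/* Turn the raw variable value into a thread count.
 * Assumption: unparsable or non-positive values fall back to the default,
 * and values above the maximum are clamped. */
int resolve_thread_count(const char *value)
{
    int result = DEFAULT_THREADS;

    if (value != NULL && *value != '\0') {
        char *end = NULL;
        errno = 0;
        long parsed = strtol(value, &end, 10);

        /* Accept only a clean, fully consumed positive integer. */
        if (errno == 0 && *end == '\0' && parsed >= 1)
            result = parsed > MAX_THREADS ? MAX_THREADS : (int)parsed;
    }

    assert(result >= 1 && result <= MAX_THREADS);
    return result;
}

/* Read the variable named on this page from the process environment. */
int thread_count_from_env(void)
{
    return resolve_thread_count(getenv("STATABO_NUM_THREADS"));
}
```

With the variable unset, thread_count_from_env() returns 2; with the variable set to 4 in the environment before the program starts, it returns 4.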

Current Version:   20000719

License Type:   GPL

Home Site:
http://members.jcom.home.ne.jp/kgoto/

Source Code Availability:

Yes

Available Binary Packages:

  • Debian Package:   No
  • RedHat RPM Package:   No
  • Other Packages:   No

Targeted Platforms:

Linux/Alpha and Tru64 UNIX

Software/Hardware Requirements:

The library detects the EV4/EV5/EV6 architecture automatically at run time, so no manual configuration is needed.

Other Links:
http://cnls.lanl.gov/avalon/ (Avalon, an Alpha Linux cluster which uses DGEMM for Alpha software)

Mailing Lists/USENET News Groups:

None

User Comments:

  • None



Comments? SAL@KachinaTech.COM
Copyright © 1995-2001 by Herng-Jeng Jou
Copyright © 1997-2001 by Kachina Technologies, Inc.
All rights reserved.