International Research Journal of Commerce , Arts and Science

 ( Online- ISSN 2319 - 9202 )     New DOI : 10.32804/CASIRJ

Impact Factor* - 6.2311


**Need Help in Content editing, Data Analysis.

Research Gateway

Adv For Editing Content

   No of Download : 88    Submit Your Rating     Cite This   Download        Certificate

DISTRIBUTED ALGORITHMS AND THEIR COMMUNICATION COSTS

    3 Author(s):  ARVIND SINGH , RAKESH KUMAR KATARE , R.K TIWARI

Vol -  5, Issue- 12 ,         Page(s) : 6 - 20  (2014 ) DOI : https://doi.org/10.32804/CASIRJ

Abstract

Abstract We have analyzed the communication-optimal distributed algorithms for dense linear algebra computations in terms of communication costs and performance improvements. We consider all of the fundamental computations-BLAS, Cholesky and symmetric-indefinite factorizations, LU and QR decompositions, eigenvalue and singular value decompositions and compare the best distributed algorithms with the lower bounds.We considered communication between a fast memory of size M and a slow memory of unbounded size. and track both the number of words and messages that an algorithm moves.

  1. M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. “Cache-Oblivious Algorithms". In: Proceedings of the 40th Annual Symposium on Foundations of Computer Science. FOCS '99. Washington, DC, USA: IEEE Computer Society, 1999, p. 285.
  2. G. Ballard, J. Demmel, O. Holtz, and O. Schwartz. “Communication-optimal Parallel and Distributed Cholesky Decomposition". In: SIAM Journal on Scientific Computing 32.6 (2010), pp. 3495-3523.
  3. G. Ballard, D. Becker, J. Demmel, J. Dongarra, A. Druinsky, I. Peled, O. Schwartz, S. Toledo, and I. Yamazaki. Communication-Avoiding Symmetric-Indefinite Factorization. Tech. rep. UCB/EECS-2013-127. EECS Department, University of California, Berkeley, July 2013.
  4. G. Ballard, J. Demmel, B. Lipshitz, O. Schwartz, and S. Toledo. “Communication efficient Gaussian elimination with partial pivoting using a shape morphing data layout". In: Proceedings of the 25th ACM Symposium on Parallelism in Algorithms and Architectures. SPAA '13. Montreal, Quebec, Canada: ACM, 2013, pp. 232-240.
  5. G. Ballard, J. Demmel, and N. Knight. Avoiding Communication in Successive Band Reduction. Tech. rep. UCB/EECS-2013-131. EECS Department, University of California, Berkeley, July 2013.
  6. G. Ballard, J. Demmel, and I. Dumitriu. Communication-optimal parallel and sequential eigenvalue and singular value algorithms. EECS Technical Report EECS-2011-14. UC Berkeley, Feb. 2011.
  7. G. Ballard, J. Demmel, O. Holtz, and O. Schwartz. Distributed Communication Bounds for Fast Linear Algebra. Tech. rep. EECS-2012-36. UC Berkeley, Mar. 2012.
  8. E. Elmroth and F. Gustavson. “New serial and parallel recursive QR factorization algorithms for SMP systems". In: Applied Parallel Computing. Large Scale Scientific and Industrial Problems. Ed. by B. K_agstrom et al. Vol. 1541. Lecture Notes in Computer Science. Springer, 1998, pp. 120-128.
  9. F. G. Gustavson. “Recursion leads to automatic variable blocking for dense linear-algebra algorithms". In: IBM Journal of Research and Development 41.6 (1997), pp. 737-756.
  10. S. Toledo. “Locality of Reference in LU Decomposition with Partial Pivoting". In: SIAM Journal on Matrix Analysis and Applications 18.4 (1997), pp. 1065-1081.
  11. R. Agarwal, F. Gustavson, and M. Zubair. “A high-performance matrix-multiplication algorithm on a distributed-memory parallel computer, using overlapped communication". In: IBM Journal of Research and Development 38.6 (1994), pp. 673-681.
  12. A. Aggarwal and J.S. Vitter. “The input/output complexity of sorting and related problems". In: Communications of the ACM 31.9 (1988), pp. 1116-1127.
  13. E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov, and D. Sorensen. LAPACK Users' Guide. Also available from http://www.netlib.org/lapack/. Philadelphia, PA, USA: SIAM, 1992.
  14. N. Ahmed and K. Pingali. “Automatic Generation of Block-Recursive Codes". In: Proceedings from the 6th International Euro-Par Conference on Parallel Processing. Euro-Par '00. London, UK: Springer-Verlag, 2000, pp. 368-378.
  15. Miroslav Rozloznik, Gil Shklarski, and Sivan Toledo. “Partitioned triangular tridiagonalization". In: ACM Transactions on Mathematical Software 37 (4 2011), 38:1-38:16.
  16. L. Grigori, J. Demmel, and H. Xiang. “CALU: A Communication Optimal LU Factorization Algorithm". In: SIAM Journal on Matrix Analysis and Applications 32.4 (2011), pp. 1317-1350.
  17. L. Trefethen and R. Schreiber. \”Average-Case Stability of Gaussian Elimination". In: SIAM Journal on Matrix Analysis and Applications 11 (1990), pp. 335-360.
  18. J. Demmel. Applied Numerical Linear Algebra. SIAM, 1997.
  19. C. Bischof and C. Van Loan. “The WY representation for products of Householder matrices". In: SIAM Journal on Scientific and Statistical Computing 8.1 (1987).
  20. R. Schreiber and C. Van Loan. “A storage-efficient WY representation for products of Householder transformations". In: SIAM Journal on Scientific and Statistical Computing 10.1 (1989), pp. 53-57.
  21. E. Elmroth and F. Gustavson. “New serial and parallel recursive QR factorization algorithms for SMP systems". In: Applied Parallel Computing. Large Scale Scientific and Industrial Problems. Ed. by B. K_agstrom et al. Vol. 1541. Lecture Notes in Computer Science. Springer, 1998, pp. 120-128.
  22. J. D. Frens and D. S. Wise. “QR factorization with Mortonordered quadtree matrices for memory reuse and parallelism". In: ACM SIGPLAN Notices 38.10 (2003), pp. 144-154.
  23. J. Demmel, L. Grigori, M. Hoemmen, and J. Langou. “Communication-optimal Parallel and Distributed QR and LU Factorizations". In: SIAM Journal on Scientific Computing 34.1 (2012), A206-A239.
  24. G. H. Golub, R. J. Plemmons, and A. Sameh. “Parallel block schemes for large-scale least-squares computations". In: High-speed computing: scientific applications and algorithm design. Champaign, IL, USA: University of Illinois Press, 1988, pp. 171- 179.
  25. A. Buttari, J. Langou, J. Kurzak, and J. J. Dongarra. A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures. Tech. rep. 191. LAPACK Working Note, Sept. 2007.
  26. B. C. Gunter and R. A. Van De Geijn. “Parallel out-of-core computation and updating of the QR factorization". In: ACM Transactions on Mathematical Software 31.1 (2005), pp. 60-78.
  27. G. Ballard, A. Buluc_, J. Demmel, L. Grigori, B. Lipshitz, O. Schwartz, and S. Toledo. “Communication optimal parallel multiplication of sparse random matrices". In: Pro-ceedings of the 25th ACM Symposium on Parallelism in Algorithms and Architectures. SPAA '13. Montreal, Quebec, Canada: ACM, 2013, pp. 222-231.
  28. J. Demmel, O. Marques, B. Parlett, and C. Vomel. “Performance and accuracy of LAPACK's symmetric tridiagonal eigensolvers". In: SIAM Journal on Scientific Computing 30.3 (Mar. 2008), pp. 1508-1526.
  29. C. Bischof, B. Lang, and X. Sun. “A Framework for Symmetric Band Reduction". In: ACM Transactions on Mathematical Software 26.4 (Dec. 2000), pp. 581-601.
  30. C. H. Bischof, B. Lang, and X. Sun. “Algorithm 807: The SBR Toolbox-software for successive band reduction". In: ACM Transactions on Mathematical Software 26.4 (Dec. 2000), pp. 602-616.
  31. Y. Nakatsukasa and N. Higham. Stable and efficient spectral divide and conquer algorithms for the symmetric eigenvalue decomposition and the SVD. Tech. rep. MIMS EPrint 2012.52. The University of Manchester, 2012.
  32. L. Karlsson and B. K_agstrom. “Parallel two-stage reduction to Hessenberg form using dynamic scheduling on shared-memory architectures". In: Parallel Computing 37.12 (2011). 6th International Workshop on Parallel Matrix Algorithms and Applications (PMAA'10), pp. 771 -782.
  33. S. Mohanty and S. Gopalan. “I/O efficient QR and QZ algorithms". In: High Performance Computing (HiPC), 2012 19th International Conference on. 2012, pp. 1- 9.
  34. Z. Bai, J. Demmel, and M. Gu. “An inverse free parallel spectral divide and conquer algorithm for nonsymmetric eigenproblems". In: Numerische Mathematik 76.3 (1997), pp. 279-308.
  35. J. Demmel, I. Dumitriu, and O. Holtz. “Fast linear algebra is stable". In: Numerische Mathematik 108.1 (2007). 10.1007/s00211-007-0114-x, pp. 59-91.

*Contents are provided by Authors of articles. Please contact us if you having any query.






Bank Details