Recent Publications of J. (Ram) Ramanujam

Research supported by

  • NSF CPA (CISE-CCF) grant 0811457
  • NSF CPA (CISE-CCF) grant 0541409
  • NSF CSR (CISE-CNS) grant 0509442
  • NSF NER grant 0508245
  • NSF (CISE-CNS) grant 0601411
  • Environmental Protection Agency (EPA) grant
  • NSF ITR grant 0121706
  • NSF grant 0103933 (CISE) with LSU ECE match and LSU CAPITAL match
  • NSF grant 0073800 (CISE)
  • NSF Young Investigator Award 9457768 (CISE) with a match from Portland Group Inc.


    Copyrights to many of the following papers are held by the publishers. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

     

    2009

    • Q. Lu, C. Alias, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, P. Sadayappan, Y. Chen, H. Lin and T. Ngai, "Data Layout Transformation for Enhancing Locality on NUCA Chip Multiprocessors," in Proc. 18th International Conference on Parallel Architectures and Compilation Techniques (PACT 09), Raleigh, NC, September 2009.  

    • A. Hartono, M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy, B. Norris, J. Ramanujam, and P. Sadayappan, "Parametric Multi-Level Tiling of Imperfectly Nested Loops," in Proc. 23rd ACM International Conference on Supercomputing, Yorktown Heights, New York, June 2009.

    • M. Baskaran, N. Vydhyanathan, U. Bondhugula, J. Ramanujam, A. Rountev, and P. Sadayappan, "Compiler-Assisted Dynamic Scheduling for Effective Parallelization of Loop Nests on Multicore Processors," in Proc. 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2009), Raleigh, NC, February 2009.  

    • R. Sankaran, B. Ullmer, K. Kallakuri, S. Jandhyala, C. Toole, J. Ramanujam, and C. Laan, "Decoupling Interaction Hardware Design Using Libraries of Reusable Electronics," in Proc. 3rd International Conference on Tangible and Embedded Interaction (TEI'09), Cambridge, UK, February 2009.  

    • Hassan Salamy and J. Ramanujam, "A Framework for Task Scheduling and Memory Partitioning for Multi-Processor System-on-Chip," in Proc. 4th International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC 2009), Paphos, Cyprus, January 2009.  

    • U. Bondhugula, M. Baskaran, A. Hartono, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan, "A Polyhedral Framework for Automatic Parallelization and Locality Optimization," in Proc. 14th Workshop on Compilers for Parallel Computers (CPC 2009), Zurich, Switzerland, January 2009.  


     

    2008

    • Hassan Salamy and J. Ramanujam, "Optimal Address Register Allocation for Arrays in DSP Applications," in Proc. 6th IEEE Workshop on Embedded Systems for Real-Time Multimedia (ESTIMedia 2008), Atlanta, GA, October 2008. 

    • Hassan Salamy and J. Ramanujam, "Storage Optimization through Code Size Reduction for Digital Signal Processors," in Proc. 6th IEEE Workshop on Embedded Systems for Real-Time Multimedia (ESTIMedia 2008), Atlanta, GA, October 2008.  

    • Jinpyo Hong and J. Ramanujam, "Address Register Allocation in Digital Signal Processors," in Proc. 2008 International Conference on Embedded Systems and Software (ICESS-08), Chengdu, China, July 2008.  

    • Jinpyo Hong and J. Ramanujam, "Scheduling DAGs for Fixed-point DSP Processors by Using Worm Partitions," in Proc. 2008 International Conference on Embedded Systems and Software (ICESS-08), Chengdu, China, July 2008.  

    • U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, "A Practical and Automatic Polyhedral Program Optimization System," in Proc. ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation (PLDI 08), Tucson, June 2008. [pdf] [Extended version]

    • M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan, "A Compiler Framework for Optimization of Affine Loop Nests for General Purpose Computations on GPUs," in Proc. 22nd ACM International Conference on Supercomputing, Island of Kos, Greece, June 2008. [pdf] [Extended version]

    • U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan, "Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model," in Proc. CC 2008 - International Conference on Compiler Construction, Budapest, Hungary, March-April 2008. [pdf]  [Extended version]  

    • M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan, "Automatic Data Movement and Computation Mapping for Multi-level Parallel Architectures with Explicitly Managed Memories," in Proc. 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2008), Salt Lake City, UT, February 2008. [pdf] [Extended version]

    • U. Bondhugula, M. Baskaran, A. Hartono, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan, "Towards Effective Automatic Parallelization for Multicore Systems," in Proc. Workshop on Next Generation Software (NGS 2008), held in conjunction with the 22nd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2008), Miami, FL, April 2008. [pdf]  


     

    2007

    • E. Ayguade, G. Baumgartner, J. Ramanujam, and P. Sadayappan (editors), Languages and Compilers for Parallel Computing, Springer-Verlag, 2007.  

    • X. Gao, S. Krishnamoorthy, S. Sahoo, C. Lam, G. Baumgartner, J. Ramanujam, and P. Sadayappan, "Efficient Search-Space Pruning for Integrated Fusion and Tiling Transformations," Concurrency and Computation: Practice and Experience, 2007. [pdf]  

    • S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev and P. Sadayappan, "Effective Automatic Parallelization of Stencil Computations," in Proc. ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation (PLDI 07), San Diego, June 2007. [pdf]  

    • U. Bondhugula, J. Ramanujam, and P. Sadayappan, "Automatic Mapping of Nested Loops to FPGAs," in Proc. ACM SIGPLAN 2007 Symposium on Principles and Practice of Parallel Programming (PPoPP 07), San Jose, CA, March 2007. [pdf]  

    • U. Bondhugula, J. Ramanujam, and P. Sadayappan. PLUTO: A Practical and Fully Automatic Polyhedral Program Optimization System. Technical Report OSU-CISRC-11/07-TR70, Department of Computer Science and Engineering, Ohio State University, November 2007. [pdf]

    • U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan. Affine Transformations for Communication Minimal Parallelization and Locality Optimization of Arbitrarily Nested Loop Sequences. Technical Report OSU-CISRC-5/07-TR43, Department of Computer Science and Engineering, Ohio State University, May 2007. [pdf]  

    • S. Pinnepalli, Jinpyo Hong, J. Ramanujam, and Doris Carver, "Code Size Optimization for Embedded Processors using Commutative Transformations," in Proc. 13th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA-07), Daegu, Korea, August 2007. [pdf]

    • Jinpyo Hong and J. Ramanujam, "Memory Offset Assignment for DSPs," in Proc. 2007 International Conference on Embedded Systems and Software (ICESS-07), Daegu, Korea, May 2007. [pdf]

    • Hassan Salamy and J. Ramanujam, "An Effective Heuristic for Simple Offset Assignment with Variable Coalescing," Languages and Compilers for Parallel Computing, (C. Cascaval et al. Eds.), Lecture Notes in Computer Science, Springer-Verlag, 2007. [pdf]  


     

    2006

     

    • S. Krishnan, S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Ramanujam, P. Sadayappan, and V. Choppella, "Efficient Synthesis of Out-of-Core Algorithms Using a Nonlinear Optimization Solver," Journal of Parallel and Distributed Computing, vol. 66, no. 5, pp. 659-673, May 2006. [pdf]  

    • G. Chen, M. Kandemir, M. J. Irwin, and J. Ramanujam, "Reducing code size through address register assignment," ACM Transactions on Embedded Computing Systems (TECS), vol. 5, no. 1, pp. 225-258, February 2006. [pdf]

    • M. Kandemir, J. Ramanujam, and U. Sezer, "Improving the Energy Behavior of Block Buffering Using Compiler Optimizations," ACM Transactions on Design Automation of Electronic Systems, vol. 11, no. 1, pp. 228-250, January 2006. [pdf]  

    • J. Ramanujam, J. Hong, M. Kandemir, A. Narayan, and A. Agarwal, "Estimating and Reducing the Memory Requirements of Signal Processing Codes for Embedded Processor Systems," IEEE Transactions on Signal Processing, vol. 54, no. 1, pp. 286-294, January 2006. [pdf]

    • A. Auer, G. Baumgartner, D. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R. Harrison, S. Krishnamoorthy, S. Krishnan, C. Lam, Q. Lu, M. Nooijen, R. Pitzer, J. Ramanujam, P. Sadayappan, and A. Sibiryakov, "Automatic Code Generation for Many-Body Electronic Structure Methods: The Tensor Contraction Engine," Molecular Physics, vol. 104, no. 2, pp. 211-228, January 2006. [pdf]

    • Atef Allam and J. Ramanujam, "Dynamic Memory Usage Optimization using ILP," Proc. 2nd International Computer Engineering Conference: Engineering the Information Society (ICENCO 2006), Cairo, Egypt, December 2006. [pdf]  

    • Atef Allam and J. Ramanujam, "ILP and Iterative LP Solutions for Peak and Average Power Optimization in HLS," Proc. 2nd International Computer Engineering Conference: Engineering the Information Society (ICENCO 2006), Cairo, Egypt, December 2006. [pdf]  

    • Atef Allam and J. Ramanujam, "Modified Force-Directed Scheduling for Peak and Average Power Optimization using Multiple Supply-Voltages," in Proc. International Conference on Integrated Circuit Design and Technology (ICICDT), Padova, Italy, May 2006. [pdf]  

    • Atef Allam and J. Ramanujam, "Simultaneous Peak and Average Power Optimization in Synchronous Sequential Designs Using Retiming and Multiple Supply Voltages," in Proc. International Conference on Integrated Circuit Design and Technology (ICICDT), Padova, Italy, May 2006. [pdf]  

    • A. Hartono, Q. Lu, X. Gao, S. Krishnamoorthy, M. Nooijen, G. Baumgartner, D. Bernholdt, R. Pitzer, J. Ramanujam, A. Rountev, and P. Sadayappan, "Identifying Cost-Effective Common Subexpressions to Reduce Operation Count in Tensor Contraction Evaluations," in Proc. International Conference on Computational Science 2006 (ICCS 2006), Reading, UK, Lecture Notes in Computer Science, Springer-Verlag, 2006. [pdf]  

    • A. Allam, J. Ramanujam, G. Baumgartner, and P. Sadayappan, "Memory Minimization for Tensor Contractions using Integer Linear Programming," Proc. Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL-06), held in conjunction with the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2006), Rhodes, Greece, April 2006. [pdf]  

    • X. Gao, S. Krishnamoorthy, S. Sahoo, C. Lam, G. Baumgartner, J. Ramanujam, and P. Sadayappan, "Efficient Search-Space Pruning for Integrated Fusion and Tiling Transformations," in Languages and Compilers for Parallel Computing, (E. Ayguade et al. Eds.), Lecture Notes in Computer Science, Springer-Verlag, 2006. [pdf]

    • X. Gao, S. Krishnamoorthy, Q. Lu, V. Choppella, G. Baumgartner, J. Ramanujam, and P. Sadayappan, "Search-Based Performance-Model Driven Optimization for Compilation of Tensor Contraction Expressions," in Proc. 12th Workshop on Compilers for Parallel Computers (CPC 2006), A Coruna, Spain, January 2006.

     

    2005

     

    • G. Baumgartner, A. Auer, D. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R. Harrison, S. Hirata, S. Krishnamoorthy, S. Krishnan, C. Lam, Q. Lu, M. Nooijen, R. Pitzer, J. Ramanujam, P. Sadayappan, and A. Sibiryakov, "Synthesis of High-Performance Parallel Programs for a Class of ab initio Quantum Chemistry Models," Proceedings of the IEEE, vol. 93, no. 2, pp. 276-292, February 2005. [pdf]  

    • X. Gao, S. Sahoo, Q. Lu, G. Baumgartner, C. Lam, J. Ramanujam, and P. Sadayappan, "Performance Modeling and Optimization of Parallel Out-of-Core Tensor Contractions," in Proc. ACM SIGPLAN 2005 Symposium on Principles and Practice of Parallel Programming, Chicago, IL, June 2005.  [pdf]

    • A. Hartono, A. Sibiryakov, M. Nooijen, G. Baumgartner, D.E. Bernholdt, S. Hirata, C. Lam, R. Pitzer, J. Ramanujam, and P. Sadayappan, "Automated Operation Minimization of Tensor Contraction Expressions in Electronic Structure Calculations," in Proc. International Conference on Computational Science 2005 (ICCS 2005), Atlanta, GA, May 2005.  [pdf]

    • Q. Lu, X. Gao, S. Krishnamoorthy, G. Baumgartner, J. Ramanujam, and P. Sadayappan, "Empirical Performance-Model Driven Data Layout Optimization," Languages and Compilers for Parallel Computing, (R. Eigenmann et al. Eds.), Lecture Notes in Computer Science, Springer-Verlag, 2005.  [pdf]


     

    2004

     

    • L. Benini, M. Kandemir, and J. Ramanujam (editors), Compilers and Operating Systems for Low Power, Kluwer Academic Publishers, Boston, MA, January 2004. ISBN: 1-4020-7573-1.

    • M. Kandemir, I. Kadayif, A. Choudhary, J. Ramanujam, and I. Kolcu, "Compiler-directed scratch pad memory optimization for embedded multiprocessors," IEEE Transactions on VLSI (TVLSI), vol. 12, no. 3, pp. 281-287, March 2004. [pdf]

    • M. Kandemir, J. Ramanujam, M. Irwin, V. Narayanan, I. Kadayif, and A. Parikh, "A Compiler Based Approach for Dynamically Managing Scratch-Pad Memories in Embedded Systems," IEEE Transactions on Computer-Aided Design, vol. 23, no. 2, pp. 243-260, February 2004.  [pdf]

    • G. Baumgartner, D. Bernholdt, V. Choppella, J. Ramanujam, and P. Sadayappan, "A High-Level Approach to Synthesis of High-Performance Codes for Quantum Chemistry: The Tensor Contraction Engine," in Proc. 11th Workshop on Compilers for Parallel Computers (CPC 2004), Chiemsee, Germany, July 2004. 

    • Efficient Synthesis of Out-of-core Algorithms Using a Nonlinear Optimization Solver, Sandhya Krishnan, Sriram Krishnamoorthy, Gerald Baumgartner, Chi-Chung Lam, J. Ramanujam, and P. Sadayappan. In Proceedings of the 18th International Parallel and Distributed Processing Symposium (2004 IPDPS Conference), Santa Fe, April 2004. (Best Paper Award)   [pdf]

    • Memory-Constrained Data Locality Optimization for Tensor Contractions, Alina Bibireata, Sandhya Krishnan, Gerald Baumgartner, Daniel Cociorva, Chi-Chung Lam, P. Sadayappan, J. Ramanujam, David E. Bernholdt, and Venkatesh Choppella, Languages and Compilers for Parallel Computing, L. Rauchwerger et al. (Eds.), Springer-Verlag, 2004.  [pdf]


     

    2003

     


     

    2002

     

    • An I/O conscious tiling strategy for disk-resident data sets, by M. Kandemir, A. Choudhary, and J. Ramanujam. The Journal of Supercomputing, 21(3):257-284, 2002. [pdf]

    • A High-Level Approach to Synthesis of High-Performance Codes for Quantum Chemistry. G. Baumgartner, D.E. Bernholdt, D. Cociorva, R. Harrison, S. Hirata, C. Lam, M. Nooijen, R. Pitzer, J. Ramanujam, P. Sadayappan. In Proceedings of Supercomputing 2002, Baltimore, Maryland, November 2002.

    • Memory-Constrained Communication Minimization for a Class of Array Computations. D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam. In Proceedings of the 15th International Workshop on Languages and Compilers for Parallel Computing (LCPC '02), College Park, Maryland, July 2002. [pdf]

    • Automatic Synthesis of High-Performance Codes for Quantum Chemistry Applications. G. Baumgartner, D.E. Bernholdt, D. Cociorva, R. Harrison, C. Lam, M. Nooijen, J. Ramanujam, P. Sadayappan. In Proceedings of the Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL-02), New York, New York, June 2002.

    • Space-Time Trade-Off Optimization for a Class of Electronic Structure Calculations. D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam, M. Nooijen, D.E. Bernholdt, R. Harrison. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (PLDI '02), Berlin, Germany, June 2002, pp. 177-186.

    • A Performance Optimization Framework for Compilation of Tensor Contraction Expressions into Parallel Programs. G. Baumgartner, D.E. Bernholdt, D. Cociorva, R. Harrison, C. Lam, M. Nooijen, J. Ramanujam, P. Sadayappan. In Proceedings of the 7th International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS '02), Fort Lauderdale, Florida, April 2002.

    • Address code and arithmetic optimizations for embedded systems, by J. Ramanujam, S. Deshpande, J. Hong, and M. Kandemir. In Proc. VLSI Design/ASPDAC 2002, Bangalore, India, January 2002. [pdf]

    • A heuristic for clock selection in high-level synthesis, by J. Ramanujam, S. Krishnamoorthy, J. Hong, and M. Kandemir. In Proc. VLSI Design/ASPDAC 2002, Bangalore, India, January 2002. [pdf]

    • Strategies for improving data locality in embedded applications, by N. Crosbie, M. Kandemir, I. Kolcu, J. Ramanujam, and A. Choudhary. In Proc. VLSI Design/ASPDAC 2002, Bangalore, India, January 2002. [pdf]

    • Automatic Data Distribution, by J. Ramanujam. In The Compiler Design Handbook: Optimizations and Machine Code Generation, (Y. N. Srikant and P. Shankar: Eds.), Chapter 12, pages 409-459, CRC Press, Boca Raton, FL, 2002.

    • Integer lattice based methods for local address generation for block-cyclic distributions, by J. Ramanujam. In Compiler Optimizations for Scalable Parallel Systems - Languages, Compilation Techniques, and Run Time Systems, S. Pande and D. P. Agrawal (Eds.), Lecture Notes in Computer Science, Volume 1808, pages 597-645, Springer-Verlag, 2001. [pdf]


    J. Ramanujam
    Ritter Distinguished Professor
    Department of Electrical and Computer Engineering
        and Center for Computation and Technology
    Louisiana State University
    Baton Rouge, LA 70803-5901, USA

    Phone: +1 (225) 578-5628
    Office: +1 (225) 578-5241
    Fax: +1 (225) 578-5200
    E-mail: j x r {@} ece.lsu.edu

    Last modified: May 2009