Research supported by
NSF CPA (CISE-CCF) grant 0811457
NSF CPA (CISE-CCF) grant 0541409
NSF CSR (CISE-CNS) grant 0509442
NSF NER grant 0508245
NSF (CISE-CNS) grant 0601411
Environmental Protection Agency (EPA) grant
NSF ITR grant 0121706
NSF grant 0103933 (CISE) with LSU ECE match
and LSU CAPITAL match
NSF grant 0073800 (CISE)
NSF Young Investigator Award 9457768 (CISE) with a match from
Portland Group Inc.
Copyrights to many of the following papers are held by the
publishers. It is understood that all persons copying this information
will adhere to the terms and constraints invoked by each author's
copyright. These works may not be reposted without the explicit
permission of the copyright holder.
-
H. Salamy and J. Ramanujam,
"Storage Optimization through Offset Assignment with
Variable Coalescing,"
ACM Transactions on Embedded Computing Systems (TECS),
2010.
-
L.-N. Pouchet, U. Bondhugula, C. Bastoul, A. Cohen,
J. Ramanujam and P. Sadayappan,
"Combined Iterative and Model-driven Optimization
in an Automatic Parallelization Framework,"
in Proc. ACM/IEEE Conference on High Performance Computing SC10,
New Orleans, LA, November 2010.
-
M. Baskaran, A. Hartono, S. Tavarageri, T. Henretty,
J. Ramanujam, and P. Sadayappan,
"Parameterized Tiling Revisited,"
International Symposium on
Code Generation and Optimization (CGO), Toronto, April 2010.
-
S. Tavarageri, A. Hartono, M. Baskaran, L.-N. Pouchet,
J. Ramanujam, and P. Sadayappan,
"Parametric Tiling of Affine Loop Nests,"
in Proc. 15th Workshop on Compilers for Parallel Computers (CPC 2010),
Vienna, Austria, July 2010.
-
A. Hartono, M. Baskaran, J. Ramanujam, and P. Sadayappan,
"DynTile: Parametric Tiled Loop Generation for Parallel
Execution on Multicore Processors,"
24th International
Parallel and Distributed Processing Symposium (2010 IPDPS
Conference), Atlanta, April 2010.
-
M. Baskaran, J. Ramanujam, and P. Sadayappan,
"Automatic C-to-CUDA Code Generation for Affine Programs,"
International Conference on Compiler
Construction (CC), Cyprus, March 2010.
-
Q. Lu, C. Alias, U. Bondhugula, S. Krishnamoorthy,
J. Ramanujam, A. Rountev, P. Sadayappan,
Y. Chen, H. Lin and T. Ngai,
"Data Layout Transformation for Enhancing Locality
on NUCA Chip Multiprocessors,"
in Proc. 18th International Conference on
Parallel Architectures and Compilation Techniques (PACT 09),
Raleigh, NC, September 2009.
-
Z. Yun, Z. Lei, G. Allen, D. S. Katz, T. Kosar, S. Jha, J. Ramanujam,
"An innovative application execution toolkit for
multicluster grids,"
in Proc. CLUSTER 2009,
New Orleans, LA, September 2009. pp. 1-4.
-
A. Hartono, M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy,
B. Norris, J. Ramanujam, and P. Sadayappan,
"Parametric Multi-Level Tiling of Imperfectly Nested Loops,"
in Proc. 23rd ACM International Conference on
Supercomputing,
Yorktown Heights, New York, June 2009.
-
M. Baskaran, N. Vydhyanathan, U. Bondhugula, J. Ramanujam,
A. Rountev, and P. Sadayappan,
"Compiler-Assisted Dynamic Scheduling for Effective
Parallelization of Loop Nests on Multicore Processors,"
in Proc. 14th ACM SIGPLAN Symposium on Principles
and Practice of Parallel Programming (PPoPP 2009),
Raleigh, NC, February 2009.
-
A. Hartono, Q. Lu, T. Henretty, S. Krishnamoorthy, H. Zhang,
G. Baumgartner, D. E. Bernholdt, M. Nooijen, R. Pitzer,
J. Ramanujam, and P. Sadayappan,
"Performance Optimization of Tensor Contraction Expressions
for Many-Body Methods in Quantum Chemistry,"
The Journal of Physical Chemistry A,
Vol. 113 (45), pp. 12715-12723, 2009.
-
R. Sankaran, B. Ullmer, K. Kallakuri, S. Jandhyala, C. Toole,
J. Ramanujam, and C. Laan, "Decoupling Interaction Hardware
Design Using Libraries of Reusable Electronics,"
in Proc. 3rd International Conference on Tangible and
Embedded Interaction (TEI'09),
Cambridge, UK, February 2009. pp. 331-337.
-
Hassan Salamy and J. Ramanujam,
"A Framework for Task Scheduling and Memory Partitioning for
Multi-Processor System-on-Chip,"
in Proc. 4th International Conference on High Performance
and Embedded Architectures and Compilers (HiPEAC 2009),
Paphos, Cyprus, January 2009.
-
U. Bondhugula, M. Baskaran, A. Hartono,
S. Krishnamoorthy, J. Ramanujam,
A. Rountev, and P. Sadayappan,
"A Polyhedral Framework for Automatic Parallelization
and Locality Optimization,"
in Proc. 14th Workshop
on Compilers for Parallel Computers (CPC 2009),
Zurich, Switzerland, January 2009.
-
Hassan Salamy and J. Ramanujam,
"Optimal Address Register Allocation for Arrays in DSP
Applications,"
in Proc. 6th IEEE Workshop on Embedded Systems for
Real-Time Multimedia (ESTIMedia 2008),
Atlanta, GA, October 2008.
-
Hassan Salamy and J. Ramanujam,
"Storage Optimization through Code Size Reduction for
Digital Signal Processors,"
in Proc. 6th IEEE Workshop on Embedded Systems for
Real-Time Multimedia (ESTIMedia 2008),
Atlanta, GA, October 2008.
- Jinpyo Hong and J. Ramanujam,
"Address Register Allocation in Digital Signal Processors,"
in Proc. 2008 International Conference on Embedded Systems
and Software (ICESS-08),
Chengdu, China, July 2008.
- Jinpyo Hong and J. Ramanujam,
"Scheduling DAGs for Fixed-point DSP Processors
by Using Worm Partitions,"
in Proc. 2008 International Conference on Embedded Systems
and Software (ICESS-08),
Chengdu, China, July 2008.
-
U. Bondhugula, A. Hartono, J. Ramanujam,
and P. Sadayappan,
"A Practical and Automatic Polyhedral Program Optimization System,"
Proc. ACM SIGPLAN 2008 Conference
on Programming Language Design and Implementation (PLDI 08),
Tucson, June 2008.
[pdf]
[Extended version]
-
M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam,
A. Rountev, and P. Sadayappan,
"A Compiler Framework for Optimization of Affine Loop
Nests for General Purpose Computations on GPUs,"
in Proc. 22nd ACM International Conference on
Supercomputing,
Island of Kos, Greece, June 2008.
[pdf]
[Extended version]
-
U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam,
A. Rountev, and P. Sadayappan,
"Automatic Transformations for Communication-Minimized
Parallelization and Locality Optimization in the
Polyhedral Model,"
in Proc. CC 2008 - International Conference on
Compiler Construction, Budapest, Hungary, March-April 2008.
[pdf]
[Extended version]
- M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J.
Ramanujam,
A. Rountev and P. Sadayappan, "Automatic Data Movement and Computation
Mapping for Multi-level Parallel Architectures with Explicitly Managed
Memories," in Proc. 13th ACM SIGPLAN Symposium on Principles and
Practice of Parallel Programming, (PPoPP 2008), Salt Lake City,
UT, February 2008.
[pdf]
[Extended version]
- U. Bondhugula, M. Baskaran, A. Hartono, S.
Krishnamoorthy,
J. Ramanujam, A. Rountev, and P. Sadayappan,
"Towards Effective Automatic Parallelization for Multicore
Systems,"
in Proc. Workshop on Next Generation Software (NGS 2008),
held in conjunction with the
22nd IEEE International Parallel and Distributed Processing
Symposium
(IPDPS 2008), Miami, FL, April 2008.
[pdf]
- E. Ayguade, G. Baumgartner, J.
Ramanujam, and P. Sadayappan
(editors),
Languages and Compilers for Parallel Computing,
Springer-Verlag, 2007.
- X. Gao, S. Krishnamoorthy, S. Sahoo, C. Lam, G.
Baumgartner, J.
Ramanujam, and P. Sadayappan, "Efficient Search-Space Pruning for
Integrated Fusion and Tiling Transformations," Concurrency and
Computation: Practice and Experience, 2007.
[pdf]
- S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J.
Ramanujam,
A. Rountev and P. Sadayappan, "Effective Automatic Parallelization
of Stencil Computations," in Proc. ACM SIGPLAN 2007 Conference
on Programming Language Design and Implementation (PLDI 07),
San Diego, June 2007.
[pdf]
- U. Bondhugula, J. Ramanujam, and P. Sadayappan,
"Automatic
Mapping of Nested Loops to FPGAs," in Proc. ACM SIGPLAN 2007
Symposium on Principles and Practice of Parallel Programming
(PPoPP 07), San Jose, CA, March 2007.
[pdf]
-
U. Bondhugula, J. Ramanujam, and P. Sadayappan.
PLUTO: A Practical and Fully Automatic Polyhedral Program
Optimization System.
Technical Report OSU-CISRC-11/07-TR70, Department of Computer
Science and Engineering, Ohio State University, November 2007.
[pdf]
- U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J.
Ramanujam,
A. Rountev, and P. Sadayappan.
Affine Transformations for Communication Minimal Parallelization
and Locality Optimization of Arbitrarily Nested Loop Sequences.
Technical Report OSU-CISRC-5/07-TR43, Department of Computer
Science and Engineering, Ohio State University, May 2007.
[pdf]
- S. Pinnepalli, Jinpyo
Hong, J. Ramanujam, and
Doris Carver, "Code Size Optimization for Embedded Processors
using Commutative Transformations," in Proc. The 13th IEEE
International Conference on Embedded and Real-Time Computing Systems
and Applications (RTCSA-07), Daegu, Korea, August 2007.
[pdf]
- Jinpyo Hong and J. Ramanujam, "Memory Offset
Assignment for DSPs," in Proc. 2007 International Conference
on Embedded Systems and Software (ICESS-07), Daegu, Korea, May
2007.
[pdf]
- Hassan Salamy and J. Ramanujam, "An Effective Heuristic
for
Simple Offset Assignment with Variable Coalescing," Languages
and Compilers for Parallel Computing, (C. Cascaval et al. Eds.),
Lecture Notes in Computer Science, Springer-Verlag, 2007.
[pdf]
- S. Krishnan, S. Krishnamoorthy,
G. Baumgartner, C. Lam,
J. Ramanujam, P. Sadayappan, and V. Choppella, "Efficient Synthesis of
Out-of-Core Algorithms Using a Nonlinear Optimization Solver,"
Journal of Parallel and Distributed Computing, vol. 66,
no. 5,
pp. 659-673, May 2006.
[pdf]
- G. Chen, M. Kandemir, M. J. Irwin, and J. Ramanujam,
"Reducing
code size through address register assignment,"
ACM Transactions on Embedded Computing Systems (TECS), vol. 5, no. 1,
pp. 225-258, February 2006.
[pdf]
- M. Kandemir, J. Ramanujam, and U. Sezer, "Improving the
Energy
Behavior of Block Buffering Using Compiler Optimizations," ACM
Transactions on Design Automation of Electronic Systems, vol.
11,
no. 1, pp. 228-250, January 2006.
[pdf]
- J. Ramanujam, J. Hong, M. Kandemir, A. Narayan, and A.
Agarwal,
"Estimating and Reducing the Memory Requirements of Signal
Processing Codes for Embedded Processor Systems," IEEE
Transactions on Signal Processing, vol. 54, no. 1, pp. 286-294,
January 2006.
[pdf]
- A. Auer, G. Baumgartner, D. Bernholdt,
A. Bibireata, V. Choppella, D. Cociorva,
X. Gao, R. Harrison, S. Krishnamoorthy,
S. Krishnan, C. Lam, Q. Lu, M. Nooijen,
R. Pitzer, J. Ramanujam, P. Sadayappan, and
A. Sibiryakov, "Automatic Code Generation for Many-Body
Electronic Structure Methods: The Tensor Contraction Engine,"
Molecular Physics, vol. 104, no. 2,
pp. 211-228, January 2006.
[pdf]
- Atef Allam and J. Ramanujam, "Dynamic Memory Usage
Optimization
using ILP," Proc. 2nd International Computer Engineering
Conference: Engineering the Information Society (ICENCO 2006),
Cairo, Egypt, December 2006.
[pdf]
- Atef Allam and J. Ramanujam, "ILP and Iterative LP
Solutions
for Peak and Average Power Optimization in HLS," Proc. 2nd
International Computer Engineering Conference: Engineering the
Information Society (ICENCO 2006), Cairo, Egypt, December 2006.
[pdf]
- Atef Allam and J. Ramanujam, "Modified Force-Directed
Scheduling for Peak and Average Power Optimization using Multiple
Supply-Voltages," in Proc. International Conference on
Integrated Circuit Design and Technology (ICICDT), Padova,
Italy, May 2006.
[pdf]
- Atef Allam and J. Ramanujam, "Simultaneous Peak and
Average Power Optimization in Synchronous Sequential Designs Using
Retiming and Multiple Supply Voltages," in
Proc. International Conference on Integrated Circuit Design and
Technology (ICICDT), Padova, Italy, May 2006.
[pdf]
- A. Hartono, Q. Lu, X. Gao, S. Krishnamoorthy, M.
Nooijen,
G. Baumgartner, D. Bernholdt, R. Pitzer, J. Ramanujam, A. Rountev,
and P. Sadayappan, "Identifying Cost-Effective Common
Subexpressions to Reduce Operation Count in Tensor Contraction
Evaluations," in Proc. International Conference on
Computational Science 2006 (ICCS 2006), Reading, UK, Lecture
Notes in Computer Science, Springer-Verlag, 2006.
[pdf]
- A. Allam, J. Ramanujam, G. Baumgartner, and P.
Sadayappan, "Memory
Minimization for Tensor Contractions using Integer Linear
Programming,"
Proc. Workshop on Performance Optimization for High-Level
Languages and Libraries (POHLL-06), held in conjunction with
the 20th IEEE International Parallel & Distributed Processing
Symposium (IPDPS 2006), Rhodes, Greece, April 2006.
[pdf]
- X. Gao, S. Krishnamoorthy, S. Sahoo, C. Lam, G.
Baumgartner,
J. Ramanujam, and P. Sadayappan, "Efficient Search-Space Pruning
for Integrated Fusion and Tiling Transformations," in
Languages and Compilers for Parallel Computing, (E. Ayguade
et al. Eds.), Lecture Notes in Computer Science, Springer-Verlag,
2006.
[pdf]
- X. Gao, S. Krishnamoorthy, Q. Lu, V. Choppella, G.
Baumgartner, J. Ramanujam, and P. Sadayappan, "Search-Based
Performance-Model Driven Optimization for Compilation of Tensor
Contraction Expressions," in Proc. 12th Workshop on Compilers
for Parallel Computers (CPC 2006), A Coruna, Spain, January
2006.
- G. Baumgartner,
A. Auer, D. Bernholdt, A. Bibireata,
V. Choppella, D. Cociorva, X. Gao,
R. Harrison, S. Hirata,
S. Krishnamoorthy, S. Krishnan, C. Lam,
Q. Lu, M. Nooijen, R. Pitzer,
J. Ramanujam, P. Sadayappan, and A. Sibiryakov,
"Synthesis of High-Performance Parallel Programs for a Class of
ab initio Quantum Chemistry Models,"
Proceedings of the IEEE, vol. 93, no. 2, pp. 276-292,
February 2005.
[pdf]
- X. Gao, S. Sahoo, Q. Lu, G. Baumgartner, C. Lam, J.
Ramanujam, and
P. Sadayappan, "Performance Modeling and Optimization of
Parallel Out-of-Core Tensor Contractions," in Proc. ACM
SIGPLAN 2005 Symposium on Principles and Practice of Parallel
Programming, Chicago, IL, June 2005.
[pdf]
- A. Hartono, A. Sibiryakov, M. Nooijen, G. Baumgartner,
D.E. Bernholdt, S. Hirata, C. Lam, R. Pitzer, J. Ramanujam, and
P. Sadayappan, "Automated Operation Minimization of Tensor
Contraction Expressions in Electronic Structure Calculations," in
Proc. International Conference on Computational Science 2005
(ICCS 2005), Atlanta, GA, May 2005.
[pdf]
- Q. Lu, X. Gao, S. Krishnamoorthy,
G. Baumgartner, J. Ramanujam, and P. Sadayappan,
"Empirical Performance-Model Driven Data Layout Optimization,"
Languages and Compilers for Parallel Computing,
(R. Eigenmann et al. Eds.),
Lecture Notes in Computer Science,
Springer-Verlag, 2005.
[pdf]
- L. Benini, M. Kandemir, and J.
Ramanujam (editors),
Compilers
and Operating Systems for Low Power,
Kluwer Academic Publishers, Boston, MA, January 2004.
ISBN: 1-4020-7573-1.
- M. Kandemir, I. Kadayif, A. Choudhary, J. Ramanujam, and
I. Kolcu,
"Compiler-directed scratch pad memory optimization for embedded
multiprocessors,"
IEEE Transactions on VLSI (TVLSI),
vol. 12, no. 3, pp. 281-287, March 2004.
[pdf]
- M. Kandemir, J. Ramanujam, M. Irwin,
V. Narayanan, I. Kadayif,
and A. Parikh, "A Compiler Based Approach for Dynamically
Managing Scratch-Pad Memories in Embedded Systems,"
IEEE Transactions on Computer-Aided Design,
vol. 23, no. 2, pp. 243-260, February 2004.
[pdf]
- G. Baumgartner, D. Bernholdt,
V. Choppella,
J. Ramanujam, and P. Sadayappan, "A High-Level
Approach to Synthesis of High-Performance Codes for Quantum
Chemistry: The Tensor Contraction Engine," in
Proc. 11th Workshop on Compilers for Parallel Computers
(CPC 2004), Chiemsee, Germany, July 2004.
- Efficient Synthesis of Out-of-core Algorithms Using a
Nonlinear Optimization Solver, Sandhya Krishnan, Sriram
Krishnamoorthy, Gerald Baumgartner, Chi-Chung Lam, J. Ramanujam, and
P. Sadayappan. In Proceedings of the 18th International
Parallel and Distributed Processing Symposium (2004 IPDPS
Conference), Santa Fe, April 2004. (Best Paper
Award)
[pdf]
-
Memory-Constrained Data Locality Optimization for Tensor Contractions,
Alina Bibireata, Sandhya Krishnan, Gerald Baumgartner, Daniel
Cociorva, Chi-Chung Lam, P. Sadayappan, J. Ramanujam, David
E. Bernholdt, and Venkatesh Choppella,
Languages and Compilers for Parallel Computing,
L. Rauchwerger et al. (Eds.), Springer-Verlag, 2004.
[pdf]
- Chua-Huang Huang and J.
Ramanujam
(editors),
Proceedings
of The 2003 International
Conference on Parallel Processing Workshops.
IEEE Computer Society Press, October 2003. ISBN:
9780769520186.
ISBN (10-digit): 0769520189.
-
Reducing false sharing and improving spatial locality in a
unified compilation framework, by
M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee.
IEEE Transactions on Parallel and Distributed Systems,
14(4):337-354, April 2003.
[pdf]
- Data Locality Optimization for Synthesis of Efficient Out-of-Core Algorithms,
Sandhya Krishnan, Sriram Krishnamoorthy, Gerald Baumgartner, Daniel
Cociorva, Chi-Chung Lam, P. Sadayappan, J. Ramanujam, David
E. Bernholdt, and Venkatesh Choppella. In Proceedings of the
International Conference on High-Performance Computing (HiPC
'03), Hyderabad, India, December 2003, Springer-Verlag,
Lecture Notes in Computer Science. (Best Paper Award)
- Global Communication Optimization for Tensor Contraction Expressions
under Memory Constraints, D. Cociorva, X. Gao, S. Krishnan,
G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam. In
Proceedings of the International Parallel and
Distributed Processing Symposium, Nice, France,
April, 2003.
- An I/O conscious tiling strategy
for disk-resident data sets,
by M. Kandemir, A. Choudhary, and J. Ramanujam.
The Journal of Supercomputing,
21(3):257-284, 2002.
[pdf]
-
A High-Level Approach to Synthesis of High-Performance Codes for
Quantum Chemistry
G. Baumgartner, D.E. Bernholdt, D. Cociorva, R. Harrison, S. Hirata,
C. Lam, M. Nooijen, R. Pitzer, J. Ramanujam, P. Sadayappan.
In Proceedings of Supercomputing 2002,
Baltimore, Maryland, November 2002.
-
Memory-Constrained Communication Minimization for a Class of
Array Computations
D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam.
In Proceedings of the 15th International Workshop
on Languages and Compilers for Parallel Computing (LCPC '02),
College Park, Maryland, July 2002.
[pdf]
-
Automatic Synthesis of High-Performance Codes for Quantum
Chemistry Applications
G. Baumgartner, D.E. Bernholdt, D. Cociorva, R. Harrison,
C. Lam, M. Nooijen, J. Ramanujam, P. Sadayappan.
In Proceedings of the Workshop on Performance
Optimization for High-Level Languages and Libraries
(POHLL-02), New York, New York, June 2002.
-
Space-Time Trade-Off Optimization for a Class of Electronic
Structure Calculations
D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan,
J. Ramanujam, M. Nooijen, D.E. Bernholdt, R. Harrison.
In Proceedings of the ACM SIGPLAN 2002 Conference on
Programming Language Design and Implementation (PLDI '02),
Berlin, Germany, June 2002, pp. 177-186.
-
A Performance Optimization Framework for Compilation of Tensor
Contraction Expressions into Parallel Programs.
G. Baumgartner, D.E. Bernholdt, D. Cociorva, R. Harrison,
C. Lam, M. Nooijen, J. Ramanujam, P. Sadayappan.
In Proceedings of the 7th International Workshop on
High-Level Parallel Programming Models and Supportive
Environments (HIPS '02), Fort Lauderdale, Florida, April
2002.
- Address code and arithmetic optimizations for
embedded systems,
by J. Ramanujam, S. Deshpande, J. Hong, and M. Kandemir.
In Proc. VLSI Design/ASPDAC 2002, Bangalore, India,
January 2002.
[pdf]
- A heuristic for clock selection in high-level synthesis,
by J. Ramanujam, S. Krishnamoorthy, J. Hong, and M. Kandemir.
In Proc. VLSI Design/ASPDAC 2002, Bangalore,
India, January 2002.
[pdf]
-
Strategies for improving data locality in embedded applications,
by N. Crosbie, M. Kandemir, I. Kolcu, J. Ramanujam, A. Choudhary.
In Proc. VLSI Design/ASPDAC 2002, Bangalore, India, January 2002.
[pdf]
-
Automatic Data Distribution,
by J. Ramanujam.
In The Compiler Design Handbook: Optimizations and Machine
Code
Generation, (Y. N. Srikant and P. Shankar: Eds.), Chapter
12,
pages 409-459, CRC Press, Boca Raton, FL, 2002.
-
Integer lattice based methods for local address
generation for block-cyclic distributions,
by J. Ramanujam. In
Compiler Optimizations for Scalable Parallel Systems -
Languages, Compilation Techniques, and Run Time Systems,
S. Pande and D. P. Agrawal (Eds.), Lecture Notes in Computer
Science, Volume 1808, pages 597-645, Springer-Verlag, 2001.
[pdf]
J. Ramanujam
Ritter Distinguished Professor
Department of Electrical and Computer Engineering
and Center for Computation and Technology
Louisiana State University
Baton Rouge, LA 70803-5901, USA
Phone: +1 (225) 578-5628
Office: +1 (225) 578-5241
Fax: +1 (225) 578-5200
E-mail: j x r {@} ece.lsu.edu