Hassan Salamy and J. Ramanujam, "An Effective Heuristic for
Simple Offset Assignment with Variable Coalescing," Languages
and Compilers for Parallel Computing, (C. Cascaval et al. Eds.),
Lecture Notes in Computer Science, Springer-Verlag, 2007.
[pdf]
- S. Krishnan, S. Krishnamoorthy, G. Baumgartner, C. Lam,
J. Ramanujam, P. Sadayappan, and V. Choppella, "Efficient Synthesis of
Out-of-Core Algorithms Using a Nonlinear Optimization Solver,"
Journal of Parallel and Distributed Computing, vol. 66, no. 5,
pp. 659-673, May 2006.
[pdf]
- G. Chen, M. Kandemir, M. J. Irwin, and J. Ramanujam, "Reducing
code size through address register assignment,"
ACM Transactions on Embedded Computing Systems (TECS), vol. 5, no. 1,
pp. 225-258, February 2006.
[pdf]
- M. Kandemir, J. Ramanujam, and U. Sezer, "Improving the Energy
Behavior of Block Buffering Using Compiler Optimizations," ACM
Transactions on Design Automation of Electronic Systems, vol. 11,
no. 1, pp. 228-250, January 2006.
[pdf]
- J. Ramanujam, J. Hong, M. Kandemir, A. Narayan, and A. Agarwal,
"Estimating and Reducing the Memory Requirements of Signal
Processing Codes for Embedded Processor Systems," IEEE
Transactions on Signal Processing, vol. 54, no. 1, pp. 286-294,
January 2006.
[pdf]
- A. Auer, G. Baumgartner, D. Bernholdt,
A. Bibireata, V. Choppella, D. Cociorva,
X. Gao, R. Harrison, S. Krishnamoorthy,
S. Krishnan, C. Lam, Q. Lu, M. Nooijen,
R. Pitzer, J. Ramanujam, P. Sadayappan, and
A. Sibiryakov, "Automatic Code Generation for Many-Body
Electronic Structure Methods: The Tensor Contraction Engine,"
Molecular Physics, vol. 104, no. 2,
pp. 211-228, January 2006.
[pdf]
- Atef Allam and J. Ramanujam, "Dynamic Memory Usage Optimization
using ILP," Proc. 2nd International Computer Engineering
Conference: Engineering the Information Society (ICENCO 2006),
Cairo, Egypt, December 2006.
[pdf]
- Atef Allam and J. Ramanujam, "ILP and Iterative LP Solutions
for Peak and Average Power Optimization in HLS," Proc. 2nd
International Computer Engineering Conference: Engineering the
Information Society (ICENCO 2006), Cairo, Egypt, December 2006.
[pdf]
- Atef Allam and J. Ramanujam, "Modified Force-Directed
Scheduling for Peak and Average Power Optimization using Multiple
Supply-Voltages," in Proc. International Conference on
Integrated Circuit Design and Technology (ICICDT), Padova,
Italy, May 2006.
[pdf]
- Atef Allam and J. Ramanujam, "Simultaneous Peak and
Average Power Optimization in Synchronous Sequential Designs Using
Retiming and Multiple Supply Voltages," in
Proc. International Conference on Integrated Circuit Design and
Technology (ICICDT), Padova, Italy, May 2006.
[pdf]
- A. Hartono, Q. Lu, X. Gao, S. Krishnamoorthy, M. Nooijen,
G. Baumgartner, D. Bernholdt, R. Pitzer, J. Ramanujam, A. Rountev,
and P. Sadayappan, "Identifying Cost-Effective Common
Subexpressions to Reduce Operation Count in Tensor Contraction
Evaluations," in Proc. International Conference on
Computational Science 2006 (ICCS 2006), Reading, UK, Lecture
Notes in Computer Science, Springer-Verlag, 2006.
[pdf]
- A. Allam, J. Ramanujam, G. Baumgartner, and P. Sadayappan, "Memory
Minimization for Tensor Contractions using Integer Linear Programming,"
Proc. Workshop on Performance Optimization for High-Level
Languages and Libraries (POHLL-06), held in conjunction with
the 20th IEEE International Parallel & Distributed Processing
Symposium (IPDPS 2006), Rhodes, Greece, April 2006.
[pdf]
- X. Gao, S. Krishnamoorthy, S. Sahoo, C. Lam, G. Baumgartner,
J. Ramanujam, and P. Sadayappan, "Efficient Search-Space Pruning
for Integrated Fusion and Tiling Transformations," in
Languages and Compilers for Parallel Computing, (E. Ayguade
et al. Eds.), Lecture Notes in Computer Science, Springer-Verlag,
2006.
[pdf]
- X. Gao, S. Krishnamoorthy, Q. Lu, V. Choppella, G.
Baumgartner, J. Ramanujam, and P. Sadayappan, "Search-Based
Performance-Model Driven Optimization for Compilation of Tensor
Contraction Expressions," in Proc. 12th Workshop on Compilers
for Parallel Computers (CPC 2006), A Coruna, Spain, January
2006.
- G. Baumgartner, A. Auer, D. Bernholdt, A. Bibireata,
V. Choppella, D. Cociorva, X. Gao, R. Harrison, S. Hirata,
S. Krishnamoorthy, S. Krishnan, C. Lam, Q. Lu, M. Nooijen, R. Pitzer,
J. Ramanujam, P. Sadayappan, and A. Sibiryakov,
"Synthesis of High-Performance Parallel Programs for a Class of ab
initio Quantum Chemistry Models,"
Proceedings of the IEEE, vol. 93, no. 2, pp. 276-292, February 2005.
[pdf]
- X. Gao, S. Sahoo, Q. Lu, G. Baumgartner, C. Lam, J. Ramanujam, and
P. Sadayappan, "Performance Modeling and Optimization of
Parallel Out-of-Core Tensor Contractions," in Proc. ACM
SIGPLAN 2005 Symposium on Principles and Practice of Parallel
Programming, Chicago, IL, June 2005.
[pdf]
- A. Hartono, A. Sibiryakov, M. Nooijen, G. Baumgartner,
D.E. Bernholdt, S. Hirata, C. Lam, R. Pitzer, J. Ramanujam, and
P. Sadayappan, "Automated Operation Minimization of Tensor
Contraction Expressions in Electronic Structure Calculations," in
Proc. International Conference on Computational Science 2005
(ICCS 2005), Atlanta, GA, May 2005.
[pdf]
- Q. Lu, X. Gao, S. Krishnamoorthy,
G. Baumgartner, J. Ramanujam, and P. Sadayappan,
"Empirical Performance-Model Driven Data Layout Optimization,"
Languages and Compilers for Parallel Computing,
(R. Eigenmann et al. Eds.),
Lecture Notes in Computer Science,
Springer-Verlag, 2005.
[pdf]
- L. Benini, M. Kandemir, and J. Ramanujam (editors),
Compilers and Operating Systems for Low Power,
Kluwer Academic Publishers, Boston, MA, January 2004.
ISBN: 1-4020-7573-1.
- M. Kandemir, I. Kadayif, A. Choudhary, J. Ramanujam, and I. Kolcu,
"Compiler-directed scratch pad memory optimization for embedded
multiprocessors,"
IEEE Transactions on VLSI (TVLSI),
vol. 12, no. 3, pp. 281-287, March 2004.
[pdf]
- M. Kandemir, J. Ramanujam, M. Irwin, V. Narayanan, I. Kadayif,
and A. Parikh, "A Compiler Based Approach for Dynamically
Managing Scratch-Pad Memories in Embedded Systems,"
IEEE Transactions on Computer-Aided Design,
vol. 23, no. 2, pp. 243-260, February 2004.
[pdf]
- G. Baumgartner, D. Bernholdt, V. Choppella,
J. Ramanujam, and P. Sadayappan, "A High-Level
Approach to Synthesis of High-Performance Codes for Quantum
Chemistry: The Tensor Contraction Engine," in
Proc. 11th Workshop on Compilers for Parallel Computers
(CPC 2004), Chiemsee, Germany, July 2004.
- Efficient Synthesis of Out-of-core Algorithms Using a
Nonlinear Optimization Solver, Sandhya Krishnan, Sriram
Krishnamoorthy, Gerald Baumgartner, Chi-Chung Lam, J. Ramanujam, and
P. Sadayappan. In Proceedings of the 18th International
Parallel and Distributed Processing Symposium (2004 IPDPS
Conference), Santa Fe, April 2004. (Best Paper
Award)
[pdf]
- Memory-Constrained Data Locality Optimization for Tensor Contractions,
Alina Bibireata, Sandhya Krishnan, Gerald Baumgartner, Daniel
Cociorva, Chi-Chung Lam, P. Sadayappan, J. Ramanujam, David
E. Bernholdt, and Venkatesh Choppella,
Languages and Compilers for Parallel Computing,
L. Rauchwerger et al. (Eds.), Springer-Verlag, 2004.
[pdf]
- Chua-Huang Huang and J. Ramanujam
(editors),
Proceedings of The 2003 International
Conference on Parallel Processing Workshops.
IEEE Computer Society Press, October 2003. ISBN:
9780769520186.
ISBN (10-digit): 0769520189.
- Reducing false sharing and improving spatial locality in a
unified compilation framework, by
M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee.
IEEE Transactions on Parallel and Distributed Systems,
14(4):337-354, April 2003.
[pdf]
- Data Locality
Optimization for Synthesis of Efficient Out-of-Core Algorithms,
Sandhya Krishnan, Sriram Krishnamoorthy, Gerald Baumgartner, Daniel
Cociorva, Chi-Chung Lam, P. Sadayappan, J. Ramanujam, David
E. Bernholdt, and Venkatesh Choppella. In Proceedings of the
International Conference on High-Performance Computing (HiPC
'03), Hyderabad, India, December 2003, Springer-Verlag, Lecture
Notes in Computer Science. (Best Paper Award)
- Global Communication
Optimization for Tensor Contraction Expressions under Memory
Constraints, D. Cociorva, X. Gao, S. Krishnan,
G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam. In
Proceedings of the International Parallel and
Distributed Processing Symposium, Nice, France,
April, 2003.
- An I/O conscious tiling strategy for disk-resident data sets,
by M. Kandemir, A. Choudhary, and J. Ramanujam.
The Journal of Supercomputing,
21(3):257-284, 2002.
[pdf]
- A High-Level Approach to Synthesis of High-Performance Codes for
Quantum Chemistry,
G. Baumgartner, D.E. Bernholdt, D. Cociorva, R. Harrison, S. Hirata,
C. Lam, M. Nooijen, R. Pitzer, J. Ramanujam, P. Sadayappan.
In Proceedings of Supercomputing 2002,
Baltimore, Maryland, November 2002.
- Memory-Constrained Communication Minimization for a Class of
Array Computations,
D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam.
In Proceedings of the 15th International Workshop
on Languages and Compilers for Parallel Computing (LCPC '02),
College Park, Maryland, July 2002.
[pdf]
- Automatic Synthesis of High-Performance Codes for Quantum
Chemistry Applications,
G. Baumgartner, D.E. Bernholdt, D. Cociorva, R. Harrison,
C. Lam, M. Nooijen, J. Ramanujam, P. Sadayappan.
In Proceedings of the Workshop on Performance
Optimization for High-Level Languages and Libraries
(POHLL-02), New York, New York, June 2002.
- Space-Time Trade-Off Optimization for a Class of Electronic
Structure Calculations,
D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan,
J. Ramanujam, M. Nooijen, D.E. Bernholdt, R. Harrison.
In Proceedings of the ACM SIGPLAN 2002 Conference on
Programming Language Design and Implementation (PLDI '02),
Berlin, Germany, June 2002, pp. 177-186.
- A Performance Optimization Framework for Compilation of Tensor
Contraction Expressions into Parallel Programs.
G. Baumgartner, D.E. Bernholdt, D. Cociorva, R. Harrison,
C. Lam, M. Nooijen, J. Ramanujam, P. Sadayappan.
In Proceedings of the 7th International Workshop on
High-Level Parallel Programming Models and Supportive
Environments (HIPS '02), Fort Lauderdale, Florida, April 2002.
- Address code and arithmetic optimizations for embedded systems,
by J. Ramanujam, S. Deshpande, J. Hong, and M. Kandemir.
In Proc. VLSI Design/ASPDAC 2002 Bangalore, India,
January 2002.
[pdf]
- A heuristic for clock selection in high-level synthesis,
by J. Ramanujam, S. Krishnamoorthy, J. Hong, and M. Kandemir.
In Proc. VLSI Design/ASPDAC 2002 Bangalore,
India, January 2002.
[pdf]
- Strategies for improving data locality in embedded applications,
by N. Crosbie, M. Kandemir, I. Kolcu, J. Ramanujam, A. Choudhary.
In Proc. VLSI Design/ASPDAC
2002 Bangalore, India, January 2002.
[pdf]
- Automatic Data Distribution,
by J. Ramanujam.
In The Compiler Design Handbook: Optimizations and Machine Code
Generation, (Y. N. Srikant and P. Shankar: Eds.), Chapter 12,
pages 409-459, CRC Press, Boca Raton, FL, 2002.
- Integer lattice based methods for local address
generation for block-cyclic distributions,
by J. Ramanujam. In
Compiler Optimizations for Scalable Parallel Systems -
Languages, Compilation Techniques, and Run Time Systems,
S. Pande and D. P. Agrawal (Eds.), Lecture Notes in Computer
Science, Volume 1808, pages 597-645, Springer-Verlag, 2001.
[pdf]
J. Ramanujam
Ritter Distinguished Professor
Department of Electrical and Computer Engineering
and Center for Computation and Technology
Louisiana State University
Baton Rouge, LA 70803-5901, USA
Phone: +1 (225) 578-5628
Office: +1 (225) 578-5241
Fax: +1 (225) 578-5200
E-mail: j x r {@} ece.lsu.edu