Hyperblock-Based Unified Cluster Assignment and Modulo Scheduling
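  • Abstract

    To raise instruction-level parallelism (ILP), very long instruction word (VLIW) processors typically employ multiple functional units, which in turn require a multi-ported register file. As the number of ports grows, however, the register file becomes more complex and its clock frequency harder to raise, making it a system bottleneck. Clustering is an effective way to address this problem: it reduces the number of register-file ports per cluster without affecting the ILP the processor can offer. It also poses a challenge for the compiler, which must distribute instructions and operands sensibly across clusters to achieve good ILP. This paper proposes a hyperblock-based compilation method for clustered VLIW architectures that performs cluster assignment and modulo scheduling in a unified manner. Hyperblocks enlarge the scheduling scope to expose more ILP and allow loop bodies containing control flow to be handled, which widens the applicability of modulo scheduling. Cluster assignment and modulo scheduling of the instructions in a hyperblock are carried out together, which yields better optimization than performing them in separate phases, because the unified approach seeks a global optimum rather than a local optimum for each phase. Experimental results in the YHFT-DSP/700 compiler show that, compared with the ITSS algorithm, the proposed algorithm achieves better optimization results.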

  • Authors

    Hu Dinglei  Chen Shuming  Liu Chunlin

  • Affiliation

    School of Computer Science, National University of Defense Technology, Changsha 410073

  • Issue

    2007, No. 8 (indexed in ISTIC, EI, PKU)

  • Keywords

    very long instruction word (VLIW); compiler; hyperblock; clustering; modulo scheduling; instruction-level parallelism
