Static Worst-Case Execution Time Optimization using DPSO for ASIP Architecture
Introduction: The use of application-specific instructions significantly improves the energy consumption, performance, and code size of configurable processors. These instructions are designed by converting patterns of application-specific operations into efficient complex instructions. This research was presented at the ICITKM Conference, University of Delhi, India, in 2017.
Methods: Static analysis was a prominent research method during the late 1980s. However, end-to-end measurement remains the standard approach in industrial settings. Static analysis tools work at a high level to determine the program structure, operating either on source code or on a disassembled binary executable. They can also work at a low level, provided that timing information for the real hardware on which the task executes is available in the required detail.
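As a rough illustration of how the high-level and low-level views combine, the sketch below assumes a loop-free control-flow graph whose basic blocks carry worst-case cycle costs taken from a hardware timing model, and bounds the execution time by the longest path through the graph. The graph, block names, and cycle counts are hypothetical and are not taken from the paper; practical analyses must also handle loop bounds, caches, and pipeline effects.

```python
from functools import lru_cache

# Hypothetical control-flow graph (result of the high-level analysis):
# each basic block maps to its successor blocks. It must be acyclic here.
CFG = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

# Hypothetical worst-case cycle cost per block (result of the low-level analysis).
BLOCK_CYCLES = {"entry": 4, "cond": 2, "then": 10, "else": 25, "exit": 3}


def wcet_cycles(cfg, cost, start):
    """Return an upper bound in cycles: the longest path from `start` in a DAG."""

    @lru_cache(maxsize=None)
    def longest(block):
        # Cost of this block plus the most expensive continuation (0 at a sink).
        tail = max((longest(succ) for succ in cfg[block]), default=0)
        return cost[block] + tail

    return longest(start)


if __name__ == "__main__":
    print(wcet_cycles(CFG, BLOCK_CYCLES, "entry"))
```

In this toy graph the bound follows the more expensive "else" branch, which is the kind of structural worst case a measurement campaign can miss if that branch is never exercised.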
Results: We experimented with, tested, and evaluated the approach on an H.264 encoder application that uses nine custom instructions (CIs), covering most of the computation-intensive kernels. Multimedia applications in the field of computer vision are frequently subject to hard real-time constraints. The H.264 encoder has a complicated control flow with many decisions and nested loops. The parameters evaluated were the number of A partitions (300 slices each on a Xilinx Virtex-7), the reconfiguration bandwidth, and the ratio of CPU frequency to fabric frequency, fCPU/ffabric. ffabric is held constant at 100 MHz, and we selected multiples of this value for fCPU that resemble realistic values. Note that although we expect the WCET in seconds (WCET cycles / fCPU) to be lower (better) with higher fCPU, the WCET in cycles increases (at a constant ffabric) because hardware CIs perform fewer computations on the reconfigurable fabric within one CPU cycle.
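The relation between WCET in cycles and WCET in seconds can be made concrete with a small worked example. The sketch below assumes a simplified model that is not taken from the paper: a fixed number of software CPU cycles, plus a number of hardware CI invocations that each occupy the fabric for a fixed number of fabric cycles. Only the 100 MHz fabric clock comes from the text; the function name and all workload numbers are hypothetical.

```python
F_FABRIC_HZ = 100_000_000  # from the text: fabric clock held constant at 100 MHz


def wcet_for_ratio(ratio, sw_cycles, ci_calls, ci_fabric_cycles):
    """Return (WCET in CPU cycles, WCET in seconds) for fCPU = ratio * ffabric.

    Assumed model: a CI that needs `ci_fabric_cycles` fabric cycles keeps the
    CPU waiting for `ci_fabric_cycles * ratio` CPU cycles.
    """
    f_cpu_hz = ratio * F_FABRIC_HZ
    ci_cpu_cycles = ci_fabric_cycles * ratio
    cycles = sw_cycles + ci_calls * ci_cpu_cycles
    seconds = cycles / f_cpu_hz  # WCET in seconds = WCET cycles / fCPU
    return cycles, seconds


if __name__ == "__main__":
    # Hypothetical workload: 1,000,000 software cycles and 10,000 CI calls
    # of 20 fabric cycles each.
    for ratio in (1, 2, 4):
        cycles, seconds = wcet_for_ratio(ratio, 1_000_000, 10_000, 20)
        print(f"fCPU/ffabric={ratio}: {cycles} cycles, {seconds * 1e3:.2f} ms")
```

Under these assumed numbers the cycle count grows from 1,200,000 to 1,800,000 as the ratio rises from 1 to 4, while the time in seconds falls from 12 ms to 4.5 ms, which matches the trend described above.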
License
Copyright (c) 2018 Ingeniería Solidaria

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.