Foreign-Literature Translation: H.264 Real-Time Implementation on DSP

3,300 words; 17,000 English characters; 5,300 Chinese characters

Graduation Project (Thesis): Foreign-Language Reference Material and Translation

Translation title:

Student name:        Student ID:

Major:

School:

Supervisor:

Title: Lecturer, Engineer

December 8, 2011

H.264 Real Time Implementation on DSP

Excerpted from International Broadcasting Convention 2003, pp. 488-499

A. Dueñas, J. C. Pujol, A. Martín, C. Peláez, F. Díaz, G. Gomez, F. Martin

PRODYS, Spain; Universidad CIII de Madrid, Spain

Abstract

H.264/MPEG-4 AVC is the latest-generation standard from the ITU-T and ISO/IEC MPEG standardization bodies, and it builds on the success of the previous ITU-T and MPEG standards. H.264/MPEG-4 AVC offers major advantages compared with previous standards, among them its high compression efficiency, although it demands a high degree of computational complexity to make use of the advantages of its new techniques. In this paper several implementation issues of the H.264/MPEG-4 AVC standard h[...]

Introduction

The recently standardized H.264/MPEG-4 AVC video coder (1) (formerly known as ITU-T H.26L) is the result of the work carried out by a Joint Video Team (JVT) formed by the Video Coding Experts Group (VCEG) of the International Telecommunication Union (ITU-T) and the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO/IEC). It is added as a new version of the ITU-T H.26x standards, as well as Part 10 of the MPEG-4 Standard (also called MPEG-4 Advanced Video Coding) and, due to[...]

This standard, born under a ‘clean slate’ of ‘back to basics’ paradigm, is designed to improve compression efficiency (even at very low bit rates) compared to existing video coders. It aims to cover a wide variety of applications, ranging from digital storage media and television broadcasting to Internet streaming and real-time audiovisual communication. Typically, a distinction between ‘conversational’ and ‘non-conversational’ applications is established in order to impose the appro[...]

Another important characteristic of this standard is that two abstraction layers are sharply separated: a Video Coding Layer (VCL) and a Network Adaptation Layer (NAL). This separation reflects the new standard's aim of being totally network-agnostic. In addition, special emphasis is placed on recovering from the errors introduced by the transmission networks (see, for example, Stockhammer (3) or Sullivan (4)).

Broadcasting is one of the applications where this new standard is expected to have special relevance. However, efficient real-time DSP implementations are needed to realize its potential performance gains while maintaining an acceptable degree of computational complexity. In our preliminary experiments we have focused on the baseline profile, though our main interest is in broadcasting, which will lead us shortly to the consideration of the characteristics of the mai[...]

The remainder of this paper is organized as follows. The next section analyzes the main technical characteristics of the new coder, stressing its new features with respect to previous standards. Following that, a preliminary evaluation of the key features in terms of rate-distortion characteristic versus computational cost is presented. Finally, an assessment of the performance of some of these techniques is offered. Some conclusions and further directions close this paper.

H.264/MPEG-4 AVC technical features

In the following sections a brief description of the most relevant novelties of the H.264 standard is presented. We focus on those algorithmic features supported by the baseline profile coder. One of its most important characteristics is the wide range of possible implementations of the basic components of the coder, which results from the number of prediction modes, picture subdivisions and motion vector references allowed. A real time DSP implementation of such a great[...]

Intra prediction modes

In contrast to previous video coding standards, spatial prediction is always performed in H.264/MPEG-4 AVC. Two different segmentation modes are allowed: Intra_4 × 4 and Intra_16 × 16. With the first one, nine prediction modes are available for the luminance component: eight directional prediction modes plus one DC. The Intra_16 × 16 mode, intended for smooth areas, provides four luminance prediction modes (three directional and one DC). Chrominance information can also be predict[...]

As can be imagined, an exhaustive search for the most convenient mode results in a significant computational burden, and a DSP implementation requires an efficient algorithm to perform this operation.

Motion Compensation

Motion compensation is another building block in which a plethora of prediction modes is available. Each P-type macroblock can be partitioned into several block sizes and shapes (rectangular shapes are allowed together with the traditional square forms) to allow an accurate description of motion. Initially, 16 × 16, 16 × 8, 8 × 16 and 8 × 8 blocks are available, and the last one can be further divided into 8 × 4, 4 × 8 and 4 × 4 blocks or coded in Intra_[...]

Unrestricted motion vectors and half- and quarter-sample accuracy are permitted. Multiple reference frames are also allowed, which requires storing these reference frames at both the coder and decoder ends. Therefore, a smart buffer management system must be designed to cope with that complexity efficiently.

De-blocking filtering

Visible block segmentation boundaries can be one of the most noticeable artifacts in a coded video sequence, and consequently the mitigation of this effect is extremely important. A conditional smoothing filter is applied to the edges of each macroblock with 5 degrees of smoothness (including a no-filtering degree) that depend on the type of content present in the filtered and adjacent macroblocks. It is worth noting that this filter, in contrast with previous video coders, [...]

Transform coding and coefficient quantization

Simultaneous scaling and transformation of each macroblock is performed using 4 × 4 DCT transforms, where a distinction is established between Intra_16 × 16 DC luminance coefficients and the rest of the 4 × 4 luminance and AC chrominance components of the remaining modes. Chrominance DC coefficients, however, employ a 2 × 2 DCT transform.

Non-linear scalar quantization is performed following a zig-zag scan order (an alternative field scan is available for field parameters). For both chrominance and luminance parameters, there are 52 available quantization steps, designed so that the scaling factor increases by approximately 12% between consecutive quantization steps. The use of this new type of fixed-point transform is beneficial for a fixed-point DSP implementation, because its implementation will[...]

Entropy coding

A Universal Variable Length Coding (UVLC) method using Exp-Golomb codes is employed for all syntax elements except the quantized transform coefficients. For these coefficients two entropy coding methods are available in this coder: CAVLC (Context Adaptive Variable Length Coding) and CABAC (Context Adaptive Binary Arithmetic Coding). However, the baseline profile supports only CAVLC; CABAC is added in the Main Profile. These new methods, primarily CABAC, increase substantially the[...]

Performance evaluation

General description

To test the performance of the new features of H.264, some well-known video sequences have been used: ‘foreman’ (QCIF format), ‘football’ (CIF), ‘tempete’ (CIF) and ‘paris’ (CIF).

The software used for all the experiments is based on the JM61d version of the H.264/AVC reference software.

We have performed two main types of test on a Pentium CPU: an assessment of the potential benefits of the de-blocking filter in combination with different quantization parameters, and an evaluation of those obtainable by using multiple reference frames and macroblock segmentations. A 1 GHz Pentium III with 128 MB of RAM has been used for the former experiments and a 2 GHz Pentium 4 with 256 MB for the latter. Some further DSP testing was performed on a Texas Instruments TMS320DM642™ at 600 MHz DSP[...]

DSP Device Architecture

The DSP implementation described in the present paper has been developed on general purpose Digital Signal Processors. The TMS320DM642™ device from Texas Instruments has been chosen for this implementation; it has a Very Long Instruction Word (VLIW) processor core, with a highly parallel and deterministic architecture with eight functional units. It is a high performance fixed-point Digital Signal Processor clocked at 600 MHz and capable of providing 4,800 MIPS, with 16 + 16 Kbytes of L1 ca[...]
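To put the clock and MIPS figures in perspective, a rough per-macroblock cycle budget can be estimated. The sketch below is only back-of-the-envelope arithmetic: it assumes one instruction per functional unit per cycle (which is how eight units at 600 MHz give the quoted 4,800 MIPS) and a 25 frames/s target, which is an illustrative assumption and not a figure from the paper. The function names are hypothetical.

```python
# Rough cycle budget per 16x16 macroblock on the DSP described in the text.
CLOCK_HZ = 600_000_000   # 600 MHz, from the text
FUNCTIONAL_UNITS = 8     # eight functional units, from the text
FRAME_RATE = 25          # assumption: 25 frames/s target, not stated in the paper


def peak_mips(clock_hz=CLOCK_HZ, units=FUNCTIONAL_UNITS):
    """Peak millions of instructions per second if every unit issues each cycle."""
    return clock_hz * units // 1_000_000


def macroblocks_per_frame(width, height):
    """Number of 16x16 macroblocks in a frame of the given luminance size."""
    return (width // 16) * (height // 16)


def cycles_per_macroblock(width, height, fps=FRAME_RATE, clock_hz=CLOCK_HZ):
    """Clock cycles available per macroblock for real-time coding."""
    return clock_hz // (macroblocks_per_frame(width, height) * fps)


print(peak_mips())                       # 4800
print(macroblocks_per_frame(352, 288))   # 396 macroblocks in a CIF frame
print(cycles_per_macroblock(352, 288))   # 60606 cycles per CIF macroblock
print(cycles_per_macroblock(176, 144))   # 242424 cycles per QCIF macroblock
```

Under these assumptions every coding decision for a CIF macroblock, including the motion search over all partitions and reference frames, has to fit in roughly sixty thousand cycles, which is why exhaustive searches are hard to afford in real time.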

De-blocking filter and Quantization

In this section some experimental results are presented to assess the effectiveness of de-blocking filtering. For motion estimation in P-planes, block sizes of 16 × 16, 16 × 8, 8 × 16 and 8 × 8 have been used. A single slice per plane is used. In addition, Flexible Macroblock Ordering is disabled. Skipped frames are not allowed, and motion estimation is performed using 5 reference images.

To experimentally assess the de-blocking filter, it has been selectively enabled or disabled.

In addition, the experiments have been carried out for several values of the quantization parameter (QP). The finer the quantization, the smaller the QP used and, consequently, the more faithful the images are to the original ones. For our experiments, we have used the following values of the de-blocking filter parameters: ‘-2’ for the offset used to access the α and C0/2 tables, and ‘-1’ for the offset used to access the β tables.
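The relation between QP and quantization coarseness can be made concrete with a small sketch. It assumes the commonly cited H.264 step-size table, in which the step size doubles every 6 QP values; the six base values below come from that table and are not given in the paper. A doubling every 6 steps corresponds to a factor of about 1.12 per step, consistent with the roughly 12% increase mentioned earlier.

```python
# Quantizer step size as a function of QP (0..51, i.e. 52 steps).
# Assumption: base step sizes for QP 0..5 as commonly tabulated for H.264;
# the paper itself only states the ~12% growth per step.
BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep(qp):
    """Return the quantizer step size for a given QP."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))


# Average growth factor per QP step over the whole range.
growth = (qstep(51) / qstep(0)) ** (1 / 51)

for qp in (25, 30, 35, 40, 45):
    print(qp, qstep(qp))
print(round(growth, 3))
```

With these values QP = 30 gives a step of 20 and QP = 40 a step of 64, so moving from the "fine" to the "coarse" operating points discussed in this section more than triples the step size.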

Now that the experimental setup has been described, we proceed to present the results obtained. We have conducted three types of measurements: average coding time per image, resulting bit rate and SNR of the luminance component. With respect to the sequences, we have selected two of them to illustrate this graphically (see Figure 1).

Figure 1: Mean coding time per frame vs. the quantization parameter QP

As can be observed, the use of the de-blocking filter does not entail a considerable increase in the coding time. For example, the increase for the ‘football’ sequence is around 3%. If a coarse quantization is used (QP >= 40), the increase in the coding time is negligible. Having shown that one of the potential disadvantages of the de-blocking filter is not very significant, we now evaluate its potential advantages.

It is expected that the inclusion of the de-blocking filter would allow a reduction in the resulting bit rate, since it aims at eliminating the artificial high frequencies that appear at the block borders (edge effects), providing smoother images. Results concerning bit rate and luminance SNR for the ‘football’ sequence are illustrated in Figure 2. Very similar results are obtained for other video sequences.

Figure 2: (a) Resulting bit rate vs. QP (b) Luminance SNR vs. QP

As can be seen, the bit rate reduction obtained is negligible for fine quantization (0.1% for QP = 30); however, the coarser the quantization becomes, the higher the bit rate reduction obtained (from 5% to 8.5% for QP = 40).

With respect to the SNR, the results do not seem to be good. Nevertheless, perceptually, the de-blocking filter turns out to be very effective in coarse quantization cases (QP higher than 40).
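For readers who want to reproduce a luminance quality figure of this kind, the following is a minimal sketch. It assumes the measure is the peak signal-to-noise ratio (PSNR) over 8-bit luminance samples, as reference encoders commonly report; the paper only says "SNR", so the exact definition used by the authors is an assumption here, and `luma_psnr` is a hypothetical helper name.

```python
import math


def luma_psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized sequences of luminance samples."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between original and decoded samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak * peak / mse)


# Tiny illustrative example: four samples with errors of +2, -2, +1, -1.
original = [16, 32, 64, 128]
decoded = [18, 30, 65, 127]
print(round(luma_psnr(original, decoded), 2))
```

A sample-wise measure like this penalizes the smoothing introduced by the de-blocking filter even when the picture looks better, which is one plausible reading of why the SNR results and the perceptual impression diverge.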

Summarizing, the convenience of using the de-blocking filter depends on the application. Moreover, this technique can be implemented very efficiently on the selected DSP, because it can be parallelized well and its impact on the execution time will be minimal, while offering potential benefits (bit rate reduction and quality increase) primarily in coarsely quantized video. As we have already mentioned, the inclusion of the de-blocking filter is optional and therefore,[...]

Multiple reference frames and macroblock segmentation

Since multiple reference frames and macroblock segmentation are closely related techniques, they have been tested together to evaluate their overall performance. First, we show how multiple reference frames can improve coding efficiency; second, some video sequences are processed combining multiple reference frames and several subdivision parameters.

Fully exhaustive motion estimation is used on ‘foreman’, ‘football’ and ‘paris’, which are divided into subsequences of 12 images. For the encoding of the current image, ‘N’, the three previous images, ‘N-1’, ‘N-2’ and ‘N-3’, and the first image have been tested as reference frames. The macroblock size is set to 8 × 8 and a search range of [-32,32] is used. For this test, we have evaluated the rate of MVs that point to each reference frame.

Figure 3: Rate of MVs that point to the different reference frames
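The experiment just described can be illustrated with a minimal sketch of exhaustive block matching over several reference frames. The matching cost (sum of absolute differences, SAD), the tiny search range and the synthetic frames are all assumptions chosen to keep the example small; the paper uses 8 × 8 blocks with a [-32,32] range and unrestricted vectors, whereas this sketch simply skips candidates that fall outside the frame.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, refs, bx, by, size=8, search=2):
    """Try every displacement in every reference; return (cost, ref_idx, dx, dy)."""
    height, width = len(cur), len(cur[0])
    best = None
    for ref_idx, ref in enumerate(refs):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                x0, y0 = bx + dx, by + dy
                # Simplification: ignore candidates outside the frame.
                if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                    continue
                cost = sad(cur, ref, bx, by, dx, dy, size)
                if best is None or cost < best[0]:
                    best = (cost, ref_idx, dx, dy)
    return best


# Synthetic 16x16 frames: the older reference holds a displaced copy of the
# current picture, the nearer one is a flat grey frame.
SIZE = 16
cur = [[(3 * x * x + 5 * y * y + 7 * x * y) % 251 for x in range(SIZE)]
       for y in range(SIZE)]
flat = [[128] * SIZE for _ in range(SIZE)]
shifted = [[cur[y + 1][x - 1] if y + 1 < SIZE and x - 1 >= 0 else 0
            for x in range(SIZE)] for y in range(SIZE)]

best = full_search(cur, [flat, shifted], bx=4, by=4)
print(best)  # (0, 1, 1, -1): perfect match in the second reference
```

Counting how often each `ref_idx` wins over all blocks of a sequence gives the kind of per-reference MV rate that Figure 3 reports. The cost of the search grows linearly with the number of reference frames, which is the complexity issue the paper raises for DSP implementations.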

Figure 3 depicts the results for ‘foreman’ and ‘football’ (400 frames have been coded for ‘foreman’ and 90 for ‘football’). As can be seen, the use of 3 reference images is advantageous. Even for the ‘football’ sequence, which contains a great amount of motion, there is still a non-negligible percentage of MVs that point to the third reference frame. However, as might be expected, this rate decreases as the distance between the reference frame and the current frame increases.

Having established the utility of multiple reference frames, we have encoded ‘paris’, ‘tempete’ and ‘football’ using different macroblock segmentations. To make the notation briefer, let ‘10000’ denote a maximum macroblock subdivision of 16 × 16; ‘11000’, macroblocks of 16 × 8 or 8 × 16; ‘11100’ is used if a subdivision of 8 × 8 is allowed; ‘11110’ for 8 × 4 or 4 × 8; and ‘11111’ for any other macroblock size.

The sequences have been coded using all the combinations of 1 to 4 reference frames (always the adjacent previous ones), ‘10000’ to ‘11111’ macroblock subdivision and different QP values (25, 35 and 45). The common settings are: sequence type ‘IPPP’, no search range restrictions, de-blocking filter disabled and no RD-optimized mode decision.

Because of the huge amount of data generated, only the most relevant results are presented.

Figure 4: Average coding time per sequence

Let us first show the time taken to code every single sequence. Since the reference software uses fully exhaustive motion search, the time spent is almost the same for every combination of coding settings, with a negligible variance. Under these conditions, time has nearly a linear behavior. Though this might be seen as obvious, it is worth noting that smart algorithms able to provide a good estimate of the optimal combination of these two parameters can considerably change this performance, but[...]

It is important to emphasize that the experiments have been configured so that, for a given video sequence, the SNR remains almost constant irrespective of the number of reference frames and the macroblock segmentation used. This enables us to make comparisons in terms of the compression rate obtained for a given quality.

Figure 5 shows the bit rate obtained for ‘football’ and ‘paris’, which have exhibited very different behaviors. It is worth noting that the best compression for ‘football’ is achieved when more reference frames are used, rather than smaller macroblock subdivisions. On the other hand, ‘paris’ is compressed more if a smaller macroblock size is allowed. These results lead us to choose a balance between the number of reference images and the minimum macroblock size: a diagonal move in the grid of co[...]

Figure 5: Bit rate

If we focus on ‘paris’, two reference frames and ‘11000’ macroblock size, we get 14% better compression with 64% more time, for the same SNR. Three reference frames and ‘11100’ achieve an 18% lower bit rate, spending 160% of the time used by the encoder with one reference frame and 16 × 16 macroblock size.

It is worth mentioning that, instead of a strictly decreasing function, we find a local minimum at ‘11110’. As our final aim is the maximization of the compression rate, which we cannot do directly, we pursue the minimization of the prediction error (which most of the time results in a maximum compression rate). However, there are situations in which the two objectives do not converge uniformly.
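The grid of settings explored in these experiments, and the five-character segmentation labels (‘10000’ to ‘11111’), can be sketched as follows. The mapping of each flag position to block sizes follows the notation introduced for these tests, with one assumption: the last flag, described only as "any other macroblock size", is read here as 4 × 4. The names are hypothetical and the sketch only enumerates configurations; it does not run an encoder.

```python
from itertools import product

# Block sizes enabled by each flag position of a label such as '11100'.
PARTITIONS_BY_FLAG = (
    ("16x16",),
    ("16x8", "8x16"),
    ("8x8",),
    ("8x4", "4x8"),
    ("4x4",),  # assumption: "any other macroblock size" taken to mean 4x4
)

REFERENCE_FRAMES = (1, 2, 3, 4)
SEGMENTATIONS = ("10000", "11000", "11100", "11110", "11111")
QP_VALUES = (25, 35, 45)


def allowed_partitions(label):
    """Expand a five-flag segmentation label into the list of allowed block sizes."""
    if len(label) != 5 or set(label) - {"0", "1"}:
        raise ValueError("label must be five characters of '0' or '1'")
    sizes = []
    for flag, group in zip(label, PARTITIONS_BY_FLAG):
        if flag == "1":
            sizes.extend(group)
    return sizes


def experiment_grid():
    """All (reference frames, segmentation, QP) combinations that were coded."""
    return list(product(REFERENCE_FRAMES, SEGMENTATIONS, QP_VALUES))


grid = experiment_grid()
print(len(grid))                    # 60 configurations per sequence
print(allowed_partitions("11100"))  # ['16x16', '16x8', '8x16', '8x8']
```

Laying the settings out this way makes the "diagonal move" concrete: it corresponds to increasing the number of reference frames and the number of enabled flags together, rather than exhausting one axis first.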

It is therefore worth noting that when using 4 × 4 macroblocks size, we need to pay close attention to the[...]

Conclusions

In this paper, some of the new features of the ITU-T H.264/MPEG-4 AVC standard have been evaluated, such as de-blocking filtering, multiple reference frames and macroblock subdivision for motion estimation. Compression performance, quality and coding time have been analyzed for different sequences.

The type of application and the needs of the user will be crucial in deciding whether to use the de-blocking filter. Its impact on the coding time is not substantial and, when using fine quantization, the benefits of its use are not very noticeable. Nevertheless, if bandwidth restrictions are imposed, and therefore quality is not high, the de-blocking filter is able to reduce bandwidth by around 5% and the subjective quality of the video obtained becomes considerably[...]

It has been shown that a good combination of multiple reference frames and macroblock segmentation can increase the compression efficiency of the coder, although it substantially increases the processing time required to perform the complete motion estimation.

The processing power limitations of current DSP devices will make it necessary to consider a number of tradeoffs in order to implement these techniques in real time in commercial products. Advanced algorithms need to be used to perform the multiple reference frame search as well as the macroblock segmentation. They will also be needed to take advantage of what these techniques offer without significantly increasing the required processing time of the complete coder to the poi[...]

Real-Time Implementation of H.264 on DSP

Excerpted from International Broadcasting Convention 2003, pp. 488-499

A. Dueñas, J. C. Pujol, A. Martín, C. Peláez, F. Díaz, G. Gomez, F. Martin

PRODYS, Spain; Universidad CIII de Madrid, Spain

Abstract

H.264/MPEG-4 AVC is the latest-generation standard from the ITU-T and ISO/IEC MPEG standardization bodies, and it builds on the success of the previous ITU-T and MPEG standards. Compared with previous standards, H.264/MPEG-4 AVC offers major advantages, including its high compression efficiency, but it demands a high degree of computational complexity in order to make full use of its new techniques. This paper examines several implementation issues of the H.264/MPEG-4 AVC standard and addresses some of the real-time problems that arise in this context. An implementation of this scheme on a general-purpose DSP that meets the requirements of the broadcast environment should offer very significant advantages over an MPEG-2 coding system.

Introduction

The recent H.264/MPEG-4 AVC video coder (1) (formerly known as ITU-T H.26L) was developed by the Joint Video Team (JVT) of the ITU-T VCEG (Video Coding Experts Group) and the ISO/IEC MPEG (Moving Picture Experts Group). Owing to the reputation and experience of these two organizations in this field and to the performance results achieved by the new coder, this version of the ITU-T H.26x standards, which is also Part 10 of MPEG-4 (also called MPEG-4 Advanced Video Coding), will be developed further, and it plays an important role in the broadcasting market.

This standard, born under a ‘back to basics’ paradigm, improves compression efficiency (even at very low bit rates) compared with existing video compression codecs. It aims to cover a wide variety of applications, from digital storage media, television broadcasting and Internet streaming to real-time audiovisual communication. Typically, a distinction is drawn between conversational and non-conversational applications in order to impose appropriate delay constraints.

Another important characteristic is that the standard is structured in two layers: a Video Coding Layer (VCL) and a Network Adaptation Layer (NAL). This reflects the new standard's aim of being completely network-agnostic. In addition, special emphasis is placed on recovering from errors introduced by transmission networks (see Stockhammer (3) or Sullivan (4)).

Broadcasting is one of the applications where this new standard is expected to have special relevance. However, real-time performance and efficient DSP implementations must be achieved in order to realize its potential performance gains while keeping computational complexity at an acceptable level. In our preliminary experiments we have focused on the baseline profile, but our main interest is broadcasting, which will shortly lead us to consider the characteristics of the main profile.

The remainder of this paper is organized as follows. The next section analyzes the main technical characteristics of the new coder, stressing its new features with respect to previous standards. After that, a preliminary evaluation of the key features in terms of rate-distortion characteristic versus computational cost is presented. Finally, an assessment of the performance of some of these techniques is offered. Some conclusions and further directions close the paper.

H.264/MPEG-4 AVC technical features

The following sections briefly describe the most relevant novelties of the H.264 standard. We focus on the algorithmic features supported by the baseline profile coder. One of its most important characteristics is the wide range of possible implementations of the basic components of the coder, which results from the number of prediction modes, picture subdivisions and motion vector references allowed. A great degree of freedom[...]
