Professional H.265/HEVC Encoder LSI Toward High-Quality 4K/8K Broadcast Infrastructure

Hiroe Iwasaki, Takayuki Onishi, Ken Nakamura, Koyo Nitta, Takashi Sano, Yukikuni Nishida, Kazuya Yokohari, Jia Su, Naoki Ono, Ritsu Kusaba, Atsushi Sagata, Mitsuo Ikeda, and Atsushi Shimizu

NTT Media Intelligence Laboratories, NTT Corporation

Code Name: NARA (Next-generation encoder Architecture for Real-time HEVC Application)

Copyright©2015 NTT corp. All Rights Reserved.

Source: hotchips.org/wp-content/uploads/hc_archives/hc27/...




Outline

• Introduction and Background

  – History of NTT’s Video CODEC LSIs

  – Roadmaps toward 4K/8K UHDTV

  – Latest video coding standard (HEVC)

  – Requirements for 4K/8K broadcasting

• NARA Architecture

• NARA Key Features and Functions

  – Single-chip configuration

  – Multi-chip configuration

• NARA Chip Implementation

• Target Applications

• Conclusion


History of NTT’s video CODEC LSIs

[Figure: timeline of NTT's video CODEC LSIs, moving from SDTV to HDTV to UHDTV, from analog to digital, and from MPEG-2 to H.264 to H.265.]

LSIs shown, with year and venue where the figure gives one:

• ENC-C/-M (’95) (HotChips7)

• SuperENC (’98) (HotChips10)

• SuperENCII (’00)

• VASA (’02) (HotChips14)

• ISIL (’02) (CICC2003)

• ISIL-II (’07) (CoolChips X)

• SARA (’07) (HotChips19)

• SARA/D (’08) (CoolChips XI)

• NARA (’15)

Other figure labels: Encoder PC Card (ICCE2000); Encoder PCI Board; Portable HDTV Encoder (ICCE2001); NHK/NTT-COM Board; HDTV MPEG-2 Encoder; HDTV MPEG-2 Decoder; HDTV H.264 Encoder; HDTV H.264 Decoder; MPEG-2/H.264 Transcoder


Roadmap toward 4K/8K UHDTV

• 4K test broadcast over satellite in 2014, 8K in 2016

• 4K/8K commercial broadcast TV programs in 2020

[Figure: 4K/8K Roadmap in Japan. Timeline 2011, 2012, 2013, 2014, 2016, 2020 with the labels: Analog to Digital Terrestrial; CATV/IPTV; 4K Test Broadcast; 8K Test Broadcast; Rio de Janeiro soccer games; Rio de Janeiro sporting event; Tokyo sporting event. Frame sizes compared: 1920 x 1080, 3840 x 2160, and 7680 x 4320 pixels.]

Reference: Interim Report of 4K/8K Roadmap Follow-up Meeting (MIC, Japan)
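The three frame sizes in the roadmap figure differ by a factor of four in pixel count at each step. A short Python check makes that concrete; the names `FORMATS` and `pixels_per_frame` are illustrative and do not come from the slides.

```python
# Frame sizes named in the roadmap figure: (width, height) in pixels.
FORMATS = {
    "HDTV": (1920, 1080),
    "4K": (3840, 2160),
    "8K": (7680, 4320),
}


def pixels_per_frame(width, height):
    """Number of luma pixels in one frame."""
    return width * height


base = pixels_per_frame(*FORMATS["HDTV"])
for name, (w, h) in FORMATS.items():
    n = pixels_per_frame(w, h)
    print(f"{name}: {w}x{h} = {n:,} pixels ({n // base}x HDTV)")
```

One 8K frame therefore carries as many pixels as sixteen HDTV frames, which is the scale of the encoding problem the rest of the deck addresses.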


HEVC – High Efficiency Video Coding

• The latest video coding standard (Jan. 2013, Range extensions Apr. 2014)

• Achieves half the bit rate of H.264 and one quarter that of MPEG-2; a key technology for 4K/8K

[Figure: Video coding standards history. Bit rate (log scale) against year, with axis ticks at 1990, 1996, 1999, 2003, and 2013. Standards shown: H.261 (ISDN videophone); MPEG-2 (Digital TV, DVD); MPEG-4 (Cellular videophone); H.264 (Blu-ray, camcorder); H.265/HEVC (4K/8K Broadcast & Distribution).]


What is HEVC?

• Existing encoding flows, but “adaptive and exhaustive” combination of prediction tools

[Figure: encoding flow. Video frames → subtract predicted pixels → Transform → Quantize → Entropy coding → Encoded stream (10011010…). Feedback path: Inv. quantize → Inv. transform → Loop filtering → Frame memory (locally decoded frames) → Inter prediction (motion vector search) or Intra prediction → Predicted pixels.]

Intra (within-a-frame) prediction

• H.264: 8 directions, 4x4..16x16 blocks

• HEVC: 33 pred. directions, 4x4..32x32 blocks

Inter (inter-frame) prediction

• H.264: 8x8, 8x16, 16x8, 16x16 blocks (*4x4 sub-MB available, but rarely used)

• HEVC: a Coding Tree Unit (CTU, max. 64x64) is divided into Coding Units (CU); CUs carry Prediction Units (PU) with Motion Vectors (MV) and TUs

[Figure: example result, a 64 x 64 CTU partitioned into CUs, PUs, TUs, and Intra blocks.]
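The CTU-to-CU partitioning on this slide is a quadtree: a block is either kept whole or split into four half-size blocks, recursively. The sketch below is a minimal illustration, not the encoder's method. It assumes a minimum CU size of 8x8, and `should_split` is a hypothetical callback standing in for the real cost-based decision.

```python
def split_ctu(x, y, size, should_split, min_cu=8):
    """Return a list of (x, y, size) coding units covering one CTU.

    should_split(x, y, size) is a hypothetical callback; a real encoder
    would decide from prediction and coding costs.
    """
    if size > min_cu and should_split(x, y, size):
        half = size // 2
        cus = []
        # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                cus.extend(split_ctu(x + dx, y + dy, half, should_split, min_cu))
        return cus
    return [(x, y, size)]


def toy_rule(x, y, size):
    """Toy rule: split the 64x64 root, then keep splitting only at the origin."""
    return size == 64 or (x == 0 and y == 0)


cus = split_ctu(0, 0, 64, toy_rule)
for cu in cus:
    print(cu)
```

With the toy rule the CTU ends up as four 8x8, three 16x16, and three 32x32 CUs, which together tile the full 64x64 area.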


HEVC encoding complexity

• About 30x the processing time of MPEG-2 and 5x that of H.264

Reference software: MPEG-2: MSSG; H.264: JM18.5; HEVC: HM12.1

[Figure: 4K video coding time comparison. Bars for MPEG-2, H.264, and HEVC; horizontal axis: processing time ratio (MPEG-2 = 1), from 0 to 35. Components of each bar: Motion search, Freq. transform, Q+IQ, CABAC, Intra prediction, Deblocking, SAO, Others.]


Requirements for 4K/8K broadcasting

• Practical 4K/8K broadcast infrastructure in 2020

• Latest video coding standard (H.265/HEVC) for high compression

• Color signal robustness against tandem encoding

• High bitrate of up to 600 Mbps

NARA: Professional H.265/HEVC encoder LSI toward high-quality 4K/8K broadcast infrastructure


Main concepts for NARA architecture

• Application-specific hardware blocks for processes with high computational complexity, such as precise motion estimation

• Hierarchical pipeline scheme for deciding the optimal hierarchical coding/prediction/transform unit sizes with high compression

• Single-chip 4K configuration and multi-chip 8K configuration for practical encoding systems


NARA block diagram

[Figure: NARA block diagram. Blocks shown: VIF (Video Data in); Prediction Core; PRISC; IFE; MED; IPD; IIM; MC; Motion Estimation (ME) engines (WME, MME, IME, FME); 8K Configurable Reference Picture Image Cache; Reference Picture Image Bus; Dual Coding Core; CRISC; TQ / ITIQ; DF / SAO; CABAC; TRISC; MRISC; BSO; MUX; Audio; Output; Bus Interconnect; MBUS; DDR I/F (DDR x 3); PCIe I/F (PCIe, Multi-chip Stream In); Host CPU on the Host BUS. Values annotated in the diagram: 210Mbit, 5120bit, 768 GOPS, 600Mbps.]

VIF: Video Interface
IFE: Image Feature Extraction
MED: Multi-block-size Edge Detector
IPD: Intra Prediction
WME: Wide-range Motion Estimation
MME: Multi-Block-Size Motion Estimation
IME: Integer pixel Motion Estimation
FME: Fractional pixel Motion Estimation
MC: Motion Compensation
IIM: Intra-Inter Mode Decision
MBUS: Memory BUS
TQ: Transform and Quantization
ITIQ: Inverse Transform and Quantization
DF: Deblocking Filter
SAO: Sample Adaptive Offset filtering
BSO: Bit Stream Out
MUX: Multiplexer
PRISC: Prediction Core RISC
CRISC: Coding Core RISC
MRISC: Middle-level RISC
TRISC: Top-level RISC


NARA pipeline policies

• Parallel processing for precise motion estimation achieving better coding efficiency

• Strictly sequential calculation (conforming to HEVC standard) desirable for mode decision to precisely evaluate coding bit costs

• Short pipeline stages for efficient rate control


NARA pipeline scheme

[Figure: CTU-based pipeline, one stage per CTU (1CTU). Stages in the order shown: WME, MME, IME, Hpel-FME, Qpel-FME, Bipred-FME, IIM, MC, TQ/ITIQ, DF, SAO, CABAC. Annotations: pixel data transfer and filtering; block-size-parallel motion search; squeeze from 4 to 3 sizes by parallel pre-decision; full-tournament mode decision.]

• NARA adopts a CTU-based hierarchical pipeline scheme

• Filter mode decisions for coding efficiency while keeping image quality:

  – Wide-range ME (WME) -> Multi-block-size ME (MME) -> Integer ME (IME) -> Fractional ME (FME) -> Inter/Intra Mode Decision (IIM)
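The WME -> MME -> IME -> FME -> IIM chain can be read as a funnel: each stage scores the surviving candidates more precisely than the one before and passes on fewer of them. The sketch below is a toy model of that idea only. The one-dimensional motion vectors, the cost functions, and the keep counts are all assumptions made for illustration and are not taken from the slides.

```python
def funnel(candidates, stages):
    """Pass candidates through (name, cost_fn, keep) stages.

    After each stage only the `keep` lowest-cost candidates survive.
    Returns the final survivors and a trace of survivor counts per stage.
    """
    survivors = list(candidates)
    trace = []
    for name, cost_fn, keep in stages:
        survivors = sorted(survivors, key=cost_fn)[:keep]
        trace.append((name, len(survivors)))
    return survivors, trace


# Toy setup: candidates are integer offsets and the true motion is 13.
# Earlier stages use a coarser cost; later stages are finer.
TRUE_MV = 13
stages = [
    ("WME", lambda mv: abs(mv - TRUE_MV) // 16, 32),  # coarse, wide range
    ("MME", lambda mv: abs(mv - TRUE_MV) // 8, 16),
    ("IME", lambda mv: abs(mv - TRUE_MV) // 2, 4),
    ("FME", lambda mv: abs(mv - TRUE_MV), 2),
    ("IIM", lambda mv: abs(mv - TRUE_MV), 1),          # final decision
]

best, trace = funnel(range(-64, 65), stages)
print(best, trace)
```

The point of the structure is that the expensive, precise evaluation at the end only ever sees a handful of candidates, while the wide search at the front stays cheap.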


Parallel pre-decision in MME

[Figure: parallel pre-decision in the MME stage of the pipeline. SADs for 8x8, 16x16, 32x32, and 64x64 blocks feed a judgment step.]

• Motion estimation for all block sizes by parallel pre-decision

• Filter into three block sizes by parallel pre-decision
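One common way to evaluate several block sizes at once is to compute 8x8 SADs a single time per candidate motion vector and sum them 2x2 to obtain the 16x16, 32x32, and 64x64 SADs. The sketch below uses that reuse and then drops the most expensive of the four sizes. It is an illustration under stated assumptions, not NARA's judgment logic: the per-block penalty `mv_cost`, the padding `PAD`, and the rule "drop the costliest size" are all hypothetical, and the loop runs sequentially where hardware would work in parallel.

```python
import random

CTU, PAD = 64, 4  # PAD: reference margin so small displacements stay in bounds


def sad8_grid(cur, ref, dx, dy):
    """8x8 grid of 8x8-block SADs between the CTU and ref displaced by (dx, dy)."""
    grid = [[0] * 8 for _ in range(8)]
    for y in range(CTU):
        for x in range(CTU):
            grid[y // 8][x // 8] += abs(cur[y][x] - ref[y + PAD + dy][x + PAD + dx])
    return grid


def merge(grid):
    """Sum each 2x2 group of SADs to get the SADs of the next larger block size."""
    n = len(grid) // 2
    return [[grid[2 * j][2 * i] + grid[2 * j][2 * i + 1]
             + grid[2 * j + 1][2 * i] + grid[2 * j + 1][2 * i + 1]
             for i in range(n)] for j in range(n)]


def pre_decide(cur, ref, mvs, mv_cost=64):
    """Return (three block sizes kept, cost per size)."""
    best = {}  # block size -> grid of minimum SAD per block over all mvs
    for dx, dy in mvs:
        grid, size = sad8_grid(cur, ref, dx, dy), 8
        while True:
            prev = best.get(size)
            best[size] = grid if prev is None else [
                [min(a, b) for a, b in zip(r0, r1)] for r0, r1 in zip(prev, grid)]
            if size == CTU:
                break
            grid, size = merge(grid), size * 2
    # Total SAD plus a hypothetical overhead for each block's motion vector.
    cost = {s: sum(map(sum, g)) + mv_cost * len(g) ** 2 for s, g in best.items()}
    kept = sorted(sorted(cost, key=cost.get)[:3])
    return kept, cost


random.seed(0)
side = CTU + 2 * PAD
ref = [[random.randrange(256) for _ in range(side)] for _ in range(side)]
# Toy input: the current CTU is the reference displaced by (1, 0), i.e. uniform motion.
cur = [[ref[y + PAD][x + PAD + 1] for x in range(CTU)] for y in range(CTU)]
mvs = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
kept, cost = pre_decide(cur, ref, mvs)
print(kept, cost)
```

With uniform motion every size finds a perfect match, so the only difference is the overhead term and the 8x8 size is the one squeezed out.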


Block-size parallel motion search in FME

[Figure: block-size parallel motion search in the FME stages of the pipeline, with three searches side by side: 16x16, 32x32, and 8x8/64x64.]

• Motion estimation for 3 block sizes in block-size parallel motion search


Full-tournament mode decision in IIM

[Figure: full-tournament mode decision in the IIM stage of the pipeline, over 4x4 / 8x8, 16x16, 32x32, and 64x64 blocks in Z-scan order.]

• Strictly sequential calculation by full-tournament mode decision

• Strictly sequential calculation (conforming to HEVC standard) desirable for mode decision to precisely evaluate coding bit costs
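One reading of "full-tournament" is a bottom-up comparison: at every level of the quadtree the summed cost of the four children, visited in Z-scan order, plays against the cost of the unsplit parent, and the cheaper side advances. The sketch below follows that reading. It is an assumption about the slide rather than a description of the IIM hardware, and `toy_cost` is a made-up cost function.

```python
def z_order(n):
    """(x, y) coordinates of an n x n grid in Z-scan order (n a power of two)."""
    if n == 1:
        return [(0, 0)]
    half = n // 2
    sub = z_order(half)
    return [(x + ox, y + oy) for oy in (0, half) for ox in (0, half) for x, y in sub]


def tournament(x, y, size, cost, min_size=8):
    """Return (best_cost, partition) for the block at (x, y).

    cost(x, y, size) is a hypothetical cost of coding the block unsplit.
    Children are evaluated sequentially in Z-scan order.
    """
    whole = cost(x, y, size)
    if size == min_size:
        return whole, [(x, y, size)]
    half = size // 2
    split_cost, parts = 0, []
    for oy in (0, half):
        for ox in (0, half):
            c, p = tournament(x + ox, y + oy, half, cost, min_size)
            split_cost += c
            parts += p
    if split_cost < whole:
        return split_cost, parts
    return whole, [(x, y, size)]


def toy_cost(x, y, size):
    """Fixed overhead of 16 per block; large blocks over the busy corner cost more."""
    busy = x < 16 and y < 16 and size > 8
    return 16 + (size * size if busy else 0)


best_cost, parts = tournament(0, 0, 64, toy_cost)
print(best_cost, parts)
print(z_order(2))
```

Because every comparison depends on the finished result of the four children before it, the evaluation is inherently sequential, which matches the slide's emphasis on strictly sequential calculation.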


NARA configurations

• Capability

  – Single-chip processing up to 4K 60fps 4:2:2

  – Multi-chip scalability up to 8K 60fps

[Figure: a single chip encodes 3840 x 2160 video at 60 frames/sec to HEVC in 4:2:0 or 4:2:2; multiple chips together encode 7680 x 4320 video to HEVC.]
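To put the single-chip capability in perspective, the uncompressed bit rate of a 4K 60 fps 4:2:2 signal can be compared with the 600 Mbps maximum output rate quoted in the requirements. The calculation assumes 10-bit samples (the deck lists a Main 4:2:2 10 profile) and a 3840 x 2160 frame; the function and table names are illustrative.

```python
# Average samples per pixel (one luma plus two subsampled chroma planes).
SAMPLES_PER_PIXEL = {"4:2:0": 1.5, "4:2:2": 2.0, "4:4:4": 3.0}


def raw_bitrate_bps(width, height, fps, chroma, bit_depth):
    """Uncompressed video bit rate in bits per second."""
    return width * height * fps * SAMPLES_PER_PIXEL[chroma] * bit_depth


raw = raw_bitrate_bps(3840, 2160, 60, "4:2:2", 10)  # assumes 10-bit samples
ratio = raw / 600e6
print(f"raw: {raw / 1e9:.2f} Gbps, compression at 600 Mbps: {ratio:.1f}:1")
```

Under these assumptions the raw signal is just under 10 Gbps, so even the highest contribution bit rate still implies a compression ratio of roughly 17:1.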


Multi-chip configuration

[Figure: multi-chip configuration. Chips #0 to #3 and the Host CPU are connected through a PCIe Switch. The 8K picture is split horizontally (slice or tile) into regions #0 to #3, each handled by one 4K chip. Chips exchange the reference image and the DF&SAO image near slice/tile boundaries; the dimension labels 128 and 4 appear alongside the Reference Picture Transfer Region and the DF&SAO Transfer Region. The stream accumulates from chip to chip (slice/tile #0, then #0+#1, then #0+#1+#2) and leaves as one Concatenated Stream Out.]
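The boundary exchange in this figure can be sketched as a small geometry calculation. The code below assumes the picture is cut into horizontal strips aligned to 64-line CTU rows, and that the 128 and 4 labels mean 128 lines of reference picture and 4 lines of DF&SAO data on each side of a boundary. Both are interpretations of the figure, and all function and key names are hypothetical.

```python
def split_rows(height, chips, ctu=64):
    """Split `height` lines into `chips` strips aligned to CTU rows.

    Returns a list of (top, bottom) line ranges, bottom exclusive.
    """
    ctu_rows = -(-height // ctu)       # ceiling division
    per_chip = -(-ctu_rows // chips)
    bounds = []
    for i in range(chips):
        top = min(i * per_chip * ctu, height)
        bottom = min((i + 1) * per_chip * ctu, height)
        bounds.append((top, bottom))
    return bounds


def transfer_regions(bounds, height, ref_margin=128, filt_margin=4):
    """Line ranges each chip needs, including data owned by its neighbours.

    ref_margin and filt_margin are assumed readings of the 128 and 4 labels.
    """
    regions = []
    for top, bottom in bounds:
        regions.append({
            "strip": (top, bottom),
            "reference": (max(0, top - ref_margin), min(height, bottom + ref_margin)),
            "df_sao": (max(0, top - filt_margin), min(height, bottom + filt_margin)),
        })
    return regions


bounds = split_rows(4320, 4)
regions = transfer_regions(bounds, 4320)
for chip, region in enumerate(regions):
    print(chip, region)
```

The margins are small next to a strip of about 1,080 lines, which suggests why the exchange can run over a PCIe switch without each chip holding the full 8K reference picture.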


Key NARA features and functions

• Complicated HEVC processing is mapped to a hierarchical pipeline scheme based on coding tree units (CTUs)

• The hierarchical pipeline achieves wide-range motion estimation with a ±3847.75 x ±1926.75 search range and optimizes HEVC’s high-precision prediction mode decision

• Single-chip configuration: 4K/60p 4:2:2 real-time encoding with ultra-low delay for field pickup units (FPUs), high bitrate of up to 600 Mbps for contribution, multi-channel encoding for cloud systems, and multi-standard encoding for smooth migration

• Multi-chip configuration: ultra-high definition TV encoded beyond 4K, with motion estimation and loop filtering across split boundaries when each chip encodes a partitioned frame

• Suitable for HEVC-based tandem encoding, with 4:2:2 for keeping good color information and two-pass encoding for higher compression in final distribution


NARA chip implementation

[Figure: NARA chip layout. Labelled regions: Reference Picture Image Cache; Dual Coding Core; PCIe; DDR ch0; DDR ch1; DDR ch2; BSO; MUX; MME; IFE; IME; FME; WME; Bipred FME; ETHER; Audio; VIF; IIM; MED; IPD; MC; TRISC; MRISC (two instances); RISC; PRISC; H264.]


Physical features

Technology: 28nm CMOS

Number of transistors: 83M gates

Clock frequency: Max 600 MHz

Supply voltage: Core: 0.9 V; IO: 1.8/3.3 V; DDR3: 1.5 V; PCIe and 3G-SDI: 0.9/1.8 V

Power consumption: Approximately 15.0 W

Package: 1152 pin FCBGA (35 x 35 mm)

External memories: DDR3
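The maximum clock above, combined with a 4K 60 fps input, gives a rough throughput budget per CTU. This is a back-of-envelope sketch only: it assumes 64x64 CTUs and a 3840 x 2160 frame, and it treats the chip as delivering one finished CTU per slot, which ignores the dual coding core and the parallel engines. The function name is illustrative.

```python
import math


def cycles_per_ctu(width, height, fps, clock_hz, ctu=64):
    """Average clock cycles available per CTU at a given frame rate."""
    ctus_per_frame = math.ceil(width / ctu) * math.ceil(height / ctu)
    return clock_hz / (ctus_per_frame * fps)


budget = cycles_per_ctu(3840, 2160, 60, 600e6)
print(f"{budget:.0f} cycles per CTU")  # about 4,900 cycles
```

A budget of a few thousand cycles per 64x64 block helps explain why the architecture relies on dedicated engines and short pipeline stages rather than software on the RISC cores.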


Functional features

Video Profile: H.265/HEVC: Main, Main 10, Main 4:2:2 10; H.264/AVC: Baseline, Main, High, High422

Motion search range: -3847.75/+3847.75 (H), -1926.75/+1926.75 (V)

Resolution and video rate: Single-chip: 4096x2160 at up to 60 frames per second; Multi-chip: 7680x4320 at up to 60 frames per second

Others:

• Audio: Serial I/F x 2 Port

• Stream Out: Parallel x 1 / Serial x 4

• PCIe: Gen.2 x 8 Lane

• Ethernet: 1000/100/10 Mbps with MAC

• Others: User PES input, STC input/output
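The fractional part of the search range (.75) reflects quarter-pixel precision, so the range is easier to reason about in integer quarter-pel units. The sketch below converts the quoted limits and reports the smallest two's-complement width that holds them. The helper names are illustrative, and the bit widths are a derived observation, not a figure from the slides.

```python
from fractions import Fraction


def to_qpel(pixels):
    """Convert a displacement in pixels (given as a string) to quarter-pel units."""
    q = Fraction(pixels) * 4
    if q.denominator != 1:
        raise ValueError("not a multiple of 1/4 pixel")
    return int(q)


def signed_bits(max_abs):
    """Smallest two's-complement width that holds -max_abs..+max_abs."""
    return max_abs.bit_length() + 1


for axis, limit in (("H", "3847.75"), ("V", "1926.75")):
    q = to_qpel(limit)
    print(f"{axis}: +/-{limit} px = +/-{q} quarter-pel, {signed_bits(q)} bits signed")
```

The horizontal limit is 15391 quarter-pel units and the vertical limit is 7707, so the two components fit in 15 and 14 signed bits respectively.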


Target applications (1)

[Figure: Digital TV Broadcasting Network Service. Labels: Original; Embedded CODEC; Portable Microwave link; Satellite; HDTV CODEC (two instances); TV Station; Edge; Local TV Station; TV System; Contribution Transmission; Distribution Transmission; 3 times encoding and decoding; 420; 422.]


Target applications (2)

[Figure: application areas of 4K/8K UHDTV technology. Dollar figures shown in the figure: $68B, $192B, $21B, $17B, $7B, $7B, $20B, $0.6B (*1USD = 120JPY).]

• Mobile/TV Broadcast: mobile phones, consumer TVs, broadcast devices (cameras, editors, encoders)

• Advertising: digital signage, outdoor display LEDs

• Education/Academic: museums, galleries

• Cinema: digital cinema (projectors, screens), box office sales

• Conferencing/Presentation: conferencing systems/services, projectors

• Security/Surveillance: security (surveillance) cameras, industrial camera sensors

• Medical: medical monitors, endoscopic systems

• Industrial Design: CAD, CAM, CG (mechanical, automotive, industrial designs)

Reference: Interim Report of 4K/8K Roadmap Follow-up Meeting (MIC, Japan)


Conclusion

• Developed: single-chip 4K 60fps 4:2:2 HEVC video encoder LSI, scalable to 8K 60fps

• 8K scalability achieved through inter-chip connectivity and parallel processing functions

• The NARA architecture has a hierarchical pipeline scheme for CTUs

NARA is a key professional H.265/HEVC encoder LSI toward high-quality 4K/8K broadcast infrastructure