Document: JVET-M_Notes_dFE
Joint Video Experts Team (JVET)
of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11
13th Meeting: Marrakech, MA, 9–18 Jan. 2019
Title: Meeting Report of the 13th Meeting of the Joint Video Experts Team (JVET),
Marrakech, MA, 9–18 January 2019
Status: Report document from the chairs of JVET
Purpose: Report
Author(s) or Contact(s): Gary Sullivan
Microsoft Corp. Tel: +1 425 703 5308
1 Microsoft Way Email: garysull@microsoft.com
Redmond, WA 98052 USA
Jens-Rainer Ohm
Institute of Communication Engineering Tel: +49 241 80 27671
RWTH Aachen Email: ohm@ient.rwth-aachen.de
Melatener Straße 23
D-52074 Aachen
Source: Chairs of JVET
_____________________________
1 Summary
The Joint Video Experts Team (JVET) of ITU-T WP3/16 and ISO/IEC JTC 1/SC 29/WG 11 held its
thirteenth meeting during 9–18 January 2019 at the Palais des Congrès Mogador (Tourist Zone Agdal,
40000, Marrakech, Morocco, Tel: + 212 530 530 530). The JVET meeting was held under the
chairmanship of Dr Gary Sullivan (Microsoft/USA) and Dr Jens-Rainer Ohm (RWTH Aachen/Germany).
For rapid access to particular topics in this report, a subject categorization is found (with hyperlinks) in
section 2.13 of this document. It is further noted that the unabbreviated name of JVET was formerly
“Joint Video Exploration Team”, but the parent bodies modified the name when the work entered the
phase of formal development of a new standard. The name Versatile Video Coding (VVC) was chosen in
April 2018 as the informal nickname for the new standard.
The JVET meeting began at approximately 0910 hours on Wednesday 9 January 2019. Meeting sessions
were held on all days (including weekend days) until the meeting was closed at approximately 1318 hours
on Friday 18 January 2019. Approximately 268 people attended the JVET meeting, and approximately
830 input documents, 17 AHG reports, and 13 CE summary reports were discussed. The meeting took
place in a collocated fashion with a meeting of WG11 – one of the two parent bodies of the JVET. The
subject matter of the JVET meeting activities consisted of developing video coding technology with a
compression capability that significantly exceeds that of the current HEVC standard, or otherwise gives
better support regarding the requirements of future application domains of video coding. As a primary
goal, the JVET meeting reviewed the work that was performed in the interim period since the twelfth
JVET meeting in producing a third draft of the VVC standard and the third version of the associated VVC
test model (VTM). Further important goals were reviewing the results of 13 Core Experiments (CEs),
reviewing other technical input on novel aspects of video coding technology, producing the next
versions of the VVC draft text and VTM, and planning next steps for further investigation of candidate
technology towards the formal standard development.
The JVET produced 18 output documents from the meeting:
JVET-M1001 Versatile Video Coding specification text (Draft 4)
JVET-M1002 Algorithm description for Versatile Video Coding and Test Model 4 (VTM 4)
2 Administrative topics
2.1 Organization
The ITU-T/ISO/IEC Joint Video Experts Team (JVET) is a group of video coding experts from the ITU-
T Study Group 16 Visual Coding Experts Group (VCEG) and the ISO/IEC JTC 1/SC 29/WG 11 Moving
Picture Experts Group (MPEG). The parent bodies of the JVET are ITU-T WP3/16 and ISO/IEC
JTC 1/SC 29/WG 11.
The Joint Video Experts Team (JVET) of ITU-T WP3/16 and ISO/IEC JTC 1/SC 29/WG 11 held its
thirteenth meeting during 9–18 January 2019 at the Palais des Congrès Mogador (Tourist Zone Agdal,
40000, Marrakech, Morocco, Tel: + 212 530 530 530). The JVET meeting was held under the
chairmanship of Dr Gary Sullivan (Microsoft/USA) and Dr Jens-Rainer Ohm (RWTH Aachen/Germany).
It is further noted that the unabbreviated name of JVET was formerly “Joint Video Exploration Team”,
but the parent bodies modified the name when the work entered the phase of formal development of a
new standard. The name Versatile Video Coding (VVC) was chosen in April 2018 as the informal
nickname for the new standard.
2.5 Attendance
The list of participants in the JVET meeting can be found in Annex B of this report.
The meeting was open to those qualified to participate either in ITU-T WP3/16 or ISO/IEC JTC 1/SC 29/
WG 11 (including experts who had been personally invited as permitted by ITU-T or ISO/IEC policies).
Participants had been reminded of the need to be properly qualified to attend. Those seeking further
information regarding qualifications to attend future meetings may contact the responsible coordinators.
2.6 Agenda
The agenda for the meeting was as follows:
Opening remarks and review of meeting logistics and communication practices
IPR policy reminder and declarations
Contribution document allocation
Review of results of the previous meeting
Reports of ad hoc group (AHG) activities
Reports of core experiments planned at the previous meeting
Consideration of contributions and communications on project guidance
Consideration of additional video coding technology contributions
Consideration of information contributions
Coordination activities
Approval of output documents and associated editing periods
Future planning: Determination of next steps, discussion of working methods, communication
practices, establishment of coordinated experiments, establishment of AHGs, meeting planning,
other planning issues
Other business as appropriate for consideration
2.10 Terminology
Some terminology used in this report is explained below:
ACT: Adaptive colour transform.
AI: All-intra.
AIF: Adaptive interpolation filtering.
ALF: Adaptive loop filter.
AMP: Asymmetric motion partitioning – a motion prediction partitioning for which the
sub-regions of a region are not equal in size (in HEVC, being N/2x2N and 3N/2x2N or 2NxN/2 and
2Nx3N/2 with 2N equal to 16 or 32 for the luma component).
AMVP: Adaptive motion vector prediction.
AMT or MTS: Adaptive multi-core transform, or multiple transform set.
AMVR: (Locally) adaptive motion vector resolution.
APS: Adaptation parameter set.
ARC: Adaptive resolution change / conversion (synonymous with DRC, and a form of RPR).
ARSS: Adaptive reference sample smoothing.
ATMVP or “subblock-based temporal merging candidates”: Alternative temporal motion vector
prediction.
AU: Access unit.
AUD: Access unit delimiter.
AVC: Advanced video coding – the video coding standard formally published as ITU-T
Recommendation H.264 and ISO/IEC 14496-10.
BA: Block adaptive.
BC: See CPR or IBC.
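As an illustration of the AMP geometry in the definition above, a small sketch (hypothetical helper, not part of any JVET software) enumerating the HEVC AMP sub-region sizes:

```python
def amp_partitions(size):
    """Enumerate HEVC AMP sub-region sizes (width, height) for a 2Nx2N luma
    block, per the definition above: N/2x2N with 3N/2x2N (vertical split) and
    2NxN/2 with 2Nx3N/2 (horizontal split), where 2N is 16 or 32."""
    assert size in (16, 32)  # size is 2N for the luma component
    n = size // 2
    vertical = [(n // 2, size), (3 * n // 2, size)]    # e.g. nLx2N / nRx2N
    horizontal = [(size, n // 2), (size, 3 * n // 2)]  # e.g. 2NxnU / 2NxnD
    return vertical + horizontal
```

For a 16x16 block this yields 4x16 with 12x16 and 16x4 with 16x12, matching the unequal-size sub-regions described in the glossary entry.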
Track A (238) was generally chaired by GJS, and Track B (272) by JRO.
As a general remark, it was established that in Track B “further study” meant that a technology should be
studied in the next CE on the subject area, whereas if such a remark is missing it implicitly meant the
technology should not be studied in a CE. If further study in an AHG was expected, that would be
explicitly expressed. In Track A, “further study” (expressed by itself) did not necessarily indicate whether
the encouraged study should be in a CE, an AHG, or an unstructured effort.
JVET-M0002 JVET AHG report: Draft text and test model algorithm description editing
(AHG2) [B. Bross, J. Chen, J. Boyce, S. Kim, S. Liu, Y. Ye]
This document reported the work of the JVET ad hoc group on draft text and test model algorithm
description editing (AHG2) between the 12th meeting in Macao, CN (3–12 October 2018) and the 13th
meeting in Marrakech, MA (9–18 January 2019).
At the 12th JVET meeting, it was decided to include more coding features for intra-picture prediction,
inter-picture prediction, transform coefficient coding, transform, adaptive loop filtering and a tile-group
based high-level syntax in the third draft of Versatile Video Coding (VVC D3) and the VVC Test
Model 3 (VTM3).
JVET-M0004 JVET AHG report: Test material and visual assessment (AHG4) [T. Suzuki,
V. Baroncini, R. Chernyak, P. Hanhart, A. Norkin, J. Ye]
The test sequences used for the CfP and CTC are available on ftp://jvet@ftp.ient.rwth-aachen.de in
directory “/jvet-cfp” (accredited members of JVET may contact the JVET chairs for login information).
Due to copyright restrictions, the JVET database of test sequences is only available to accredited
members of JVET (i.e. members of ISO/IEC MPEG and ITU-T VCEG).
There was discussion that the current directory structure of the test sequence ftp site is not well suited to
the current activities. The ftp directory was created during the preparation of the CfE and CfP, and the
same directory structure is still used. It was suggested, for example, to redesign the directory structure as
follows:
ctc/ : Contains the active test set of the common testing conditions
ahg/ : Contains subdirectories with sequences under consideration. The subfolder might be structured
by meeting period (e.g. named by the doc-number of the corresponding meeting report?)
ce/ : Contains subdirectories for data exchange for specific CEs (already implemented, see
ce/JVET-{K,L}1031_Deblocking)
upload : stays as before
JVET-M0006 JVET AHG Report: 360 video conversion software development (AHG6)
[Y. He, K. Choi]
This document summarized activities on 360-degree video content conversion software development
between the 12th (3–12 Oct. 2018) and the 13th (9–18 January 2019) JVET meetings.
A brief summary for these activities is as follows:
The 360Lib-8.0 software package included the following changes:
Projection format conversion:
o Chroma sample location type support (JVET-L0238).
Configurations:
o Added chroma sample location type for the output in the encoding parameter settings.
Software:
Fixed the compilation error reported by ticket #118 to support GCC 8.2.1.
360Lib-8.0 related release:
360Lib-8.0rc1 with support of VTM-3.0rc1 was released on Nov. 16, 2018;
360Lib-8.0 with support of VTM-3.0 was released on Nov. 22, 2018;
The 360Lib software is developed using a Subversion repository located at:
https://jvet.hhi.fraunhofer.de/svn/svn_360Lib/
The released version of 360Lib-8.0 can be found at:
https://jvet.hhi.fraunhofer.de/svn/svn_360Lib/tags/360Lib-8.0/
360Lib-8.0 testing results can be found at:
ftp.ient.rwth-aachen.de/testresults/360Lib-8.0
360Lib bug tracker
https://hevc.hhi.fraunhofer.de/trac/jem/newticket?component=360Lib
The first table below gives the projection format comparison using VTM-3.0 according to the 360° video
CTC (JVET-L1012). It compares padded hybrid equi-angular cubemap (PHEC) coding and padded
equirectangular projection (PERP) coding using VTM-3.0.
The following table compares VTM-3.0 with PHEC coding and HM-16.16 with CMP coding.
The AHG recommended to continue software development of the 360Lib software package, to generate
CTC VTM anchors according to the 360° video CTC, and to finalize the reporting template for the
common test conditions.
JVET-M0008 JVET AHG report: 360° video coding tools and test conditions (AHG8)
[J. Boyce, K. Choi, P. Hanhart, J.-L. Lin]
This document summarized the activity of AHG8: 360º video coding tools and test conditions, between
the 12th meeting in Macao, CN (3–12 October 2018) and the 13th meeting in Marrakech, MA (9–18
January 2019).
There was no AHG email activity on the main JVET reflector, jvet@lists.rwth-aachen.de, with an
[AHG8] indication in message headers.
There were four non-CE-related input documents identified (three contributions and one cross-check)
related to 360º video coding, which are listed below. In addition, CE13 on projection formats is related to
360º video coding and has ten input documents, which are covered in the CE report in JVET-M0033.
There are four additional CE13-related input documents (three contributions and one cross-check) listed
below.
360 video contributions not related to CE13
o JVET-M0225 AHG8: On wrap around motion compensation [B. Choi, W. Feng, S. Liu
(Tencent)]
o JVET-M0368 AHG8: 360Lib support for chroma sample location in PHEC blending process
[C.-H. Shih, Y.-H. Lee, J.-L. Lin, Y.-C. Chang, C.-C. Ju (MediaTek)]
o JVET-M0452 AHG8: Hemisphere cubemap projection format [J. Boyce, M. Dmytrychenko
(Intel)]
Crosschecks of 360 video contributions not related to CE13
o JVET-M0644 Crosscheck of JVET-M0368 (AHG8: 360Lib support for chroma sample
location in PHEC blending process) [P. Hanhart (InterDigital)]
CE13-related contributions
o JVET-M0322 CE13-related: In-loop filters disabled across face discontinuities on PHEC with
2-pixel padding [Yule Sun, Xuchang Huangfu, Lu Yu (Zhejiang Univ.)]
JVET-M0009 JVET AHG report: Neural Networks in Video Coding (AHG9) [S. Liu,
B. Choi, K. Kawamura, Y. Li, L. Wang, P. Wu, H. Yang]
This document summarized the activity of AHG9: neural networks in video coding between the 12th
meeting in Macao, CN (3–12 Oct 2018) and the 13th meeting in Marrakech, MA (9–18 Jan. 2019).
The AHG used the main JVET reflector, jvet@lists.rwth-aachen.de, with [AHG9] in message headers.
Subjects such as software sharing, training data and process, neural network structure and complexity,
etc. had been actively discussed among proponents and participants.
Some contributions to previous meetings were noted in the AHG report, with participants in the AHG
discussions. Following a BoG (see JVET-L0704 of the previous meeting) and plenary discussions in the
previous meeting, a Gitlab repository (https://vcgit.hhi.fraunhofer.de/jvet-l-ahg9/VVCSoftware_VTM)
had been set up for AHG9-related proponents to voluntarily share their software for interested parties to
explore. As of the beginning of the current meeting, proponents of two proposals had uploaded their
software: JVET-J0032 (by USTC) and JVET-L0242 (by Wuhan Univ.).
High resolution images in the DIV2K set (https://data.vision.ee.ethz.ch/cvl/DIV2K/) were used as the
base image dataset for offline training, which consists of 800 images for training, and 100 images for
validation. The original images (in RGB format) were first converted to YUV420 format, and then
compressed using QP values {22, 27, 32, 37} to generate the training images (in YUV420 format).
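The RGB-to-YUV conversion step described above might be sketched as follows (the report does not specify the conversion matrix; BT.601 full-range coefficients and a simple 2x2 chroma average for the 4:2:0 subsampling are assumed here for illustration):

```python
def rgb_to_yuv(r, g, b):
    # BT.601 full-range RGB -> YCbCr conversion (one common choice; the AHG
    # report does not state which matrix was actually used).
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    v = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, u, v

def subsample_420(plane):
    # 4:2:0 chroma subsampling: average each 2x2 block of a chroma plane.
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]
```

The subsequent compression with QP values {22, 27, 32, 37} would then be done with the reference encoder and is not shown here.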
Proponents were welcome to use other image and video training datasets, for purposes such as
inter-frame prediction or online training, but should clearly report the training dataset used in their
proposals. A reporting template for describing the training stage was established. In the inference stage,
the coding schemes use the model parameters for prediction. Information to be provided about the
inference stage, with a reporting template, was described in the report. More details on the work are
described in JVET-L1006.
An informational contribution JVET-M0691 provided a summary of some AHG9 related coding tools
with compression performance and complexity analyses.
The organized tests were implemented on top of VTM3 and tests were conducted under the common test
conditions (CTC) for VTM3.
AHG9 related input documents for this meeting were identified as follows.
JVET-M0159, AHG9: Convolutional neural network loop filter [Y.-L. Hsiao, C.-Y. Chen, T.-D.
Chuang, C.-W. Hsu, Y.-W. Huang, S.-M. Lei (MediaTek)]
JVET-M0011 JVET AHG report: Screen Content Coding (AHG11) [S. Liu, J. Boyce,
A. Filippov, Y.-C. Sun, J. Xu, M. Zhou]
This document summarized the activity of AHG11: Screen Content Coding between the 12th meeting in
Macao, CN (3–12 Oct 2018) and the 13th meeting in Marrakech, MA (9–18 Jan. 2019).
The AHG used the main JVET reflector, jvet@lists.rwth-aachen.de, with [AHG11] in message headers.
There were about a dozen emails exchanged on the JVET reflector with some discussions about test
sequences. There were also some email discussions about the interaction between CPR and the inter
coding tools adopted in Macao. Through the discussions, some mismatches between the VTM3 software
and the VVC Draft 3 specification were identified. Some possible solutions are suggested in
JVET-M0409. More in-depth technical discussions were carried out on the CE8 mailing list.
In total there were 26 CPR-related technical contributions, 8 palette-related technical contributions, and 7
other SCC-related technical contributions identified for this meeting that were noted in the report:
CPR related:
o JVET-M0151, CE8-related: Virtual search area for current picture referencing (CPR) [L.
Pham Van, T. Hsieh, W.-J. Chien, V. Seregin, H. Wang, M. Karczewicz (Qualcomm)]
o JVET-M0174, CE8-related: Removal of subblock-based chroma MC in CPR [C.-Y. Lai, T.-
D. Chuang, Y.-L. Hsiao, C.-Y. Chen, Y.-W. Huang, S.-M. Lei (MediaTek)]
o JVET-M0175, CE8-related: Clarification on interaction between CPR and other inter coding
tools [C.-Y. Lai, T.-D. Chuang, Y.-L. Hsiao, C.-Y. Chen, Y.-W. Huang, S.-M. Lei
(MediaTek)]
o JVET-M0254, Non-CE8: Subblock Operation Removal for Chroma CPR [J. Xu, K. Zhang,
L. Zhang, H. Liu, Y. Wang, P. Zhao, D. Hong (Bytedance)]
JVET-M0013 JVET AHG report: Tool reporting procedure (AHG13) [W.-J. Chien,
J. Boyce, R. Chernyak, R. Hashimoto, Y.-W. Huang, S. Liu, D. Luo]
This document summarized the activity of AHG13: “Tool reporting procedure” between the 12th Meeting
in Macao, CN (3–12 Oct. 2018) and the 13th meeting in Marrakech, MA (9–18 Jan. 2019). Tool on/off
experimental results vs. VTM anchor are provided for the tools specified in JVET-L1005.
The initial version of JVET-L1005 “Methodology and reporting template for tool testing” was provided
on Oct 27th. The document contained a reporting template.
All tests described in JVET-L1005 were conducted, except for 67IPM and PDPC. VTM tool tests were
conducted on the VTM-3.0 software with the VTM configuration, switching off a specific tool either in
configuration files or via macros. Tool tests of 67IPM and PDPC were not conducted because VTM-3.0
had no associated configuration setting or macro with which to disable those coding tools.
The tested tools, testers, and cross-checkers are listed in the tables below.
Chroma dual tree (CST): JVET-K0230, JVET-K0556; tested in AI, RA, and LDB.
Tester: Tzu-Der Chuang (peter.chuang@mediatek.com); cross-checker: Wei-Jung Chien
(wchien@qti.qualcomm.com).
Dependent quantization (DQ): JVET-K0072, JVET-L0274; tested in AI, RA, and LDB.
Tester: Yuwen He (yuwen.he@interdigital.com); cross-checker: Wei-Jung Chien
(wchien@qti.qualcomm.com).
Cross-component linear model (CCLM): JVET-K0190, JVET-L0085, JVET-L0136, JVET-L0191,
JVET-L0338, JVET-L0340; tested in AI, RA, and LDB.
Tester: Shan Liu (shanl; leolzhao@tencent.com); cross-checker: Roman Chernyak
(chernyak.roman@huawei.com).
Multiple transform set (MTS): JVET-K0171, JVET-K0173, JVET-K0096, JVET-L0059, JVET-L0111,
JVET-L0118, JVET-L0285; tested in AI, RA, and LDB.
Tester: Shan Liu (shanl; xinzzhao@tencent.com); cross-checker: Wei-Jung Chien
(wchien@qti.qualcomm.com).
67 intra prediction modes + 6 MPM intra mode coding + wide-angle intra prediction (67IPM; test not
available): JVET-K0529, JVET-K0368, JVET-L0165, JVET-L0279.
Tester: Shan Liu (shanl; xinzzhao@tencent.com); cross-checker: Roman Chernyak
(chernyak.roman@huawei.com).
Position dependent prediction combination (PDPC; test not available): JVET-K0063.
Tester: Shan Liu (shanl; leolzhao@tencent.com); cross-checker: Wei-Jung Chien
(wchien@qti.qualcomm.com).
Adaptive loop filter (ALF): JVET-K0371, JVET-L0082, JVET-L0083, JVET-L0147, JVET-L0392,
JVET-L0664; tested in AI, RA, and LDB.
Tester: Yuwen He (yuwen.he@interdigital.com); cross-checker: Wei-Jung Chien
(wchien@qti.qualcomm.com).
Affine motion model (AFF): JVET-L0045, JVET-L0047, JVET-L0142, JVET-L0265, JVET-L0260,
JVET-L0271, JVET-L0632, JVET-L0694; tested in RA and LDB.
Tester: Shan Liu (shanl; guichunli@tencent.com); cross-checker: Roman Chernyak
(chernyak.roman@huawei.com).
The results of the tests are summarized in the tables below. The spreadsheet attached to the AHG report
provides additional data. Scatter plots are also provided for the tested tools in the random access
configuration, comparing PSNR-Y based BD-rate on the Y axis vs. each of the encoder runtime ratio, the
decoder runtime ratio, and a weighted average of the encoder and decoder runtime ratios,
(Enc + a*Dec)/(a+1), with a configurable weight a. The example weighting is set to a = 6 and can be
adjusted in the spreadsheet attached to this report.
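The weighted runtime metric above can be sketched as a small helper (function name hypothetical):

```python
def weighted_runtime_ratio(enc_ratio, dec_ratio, a=6):
    """Weighted average of encoder and decoder runtime ratios,
    (Enc + a*Dec) / (a + 1), with configurable weight a.
    The AHG report's example weighting uses a = 6, emphasizing
    decoder runtime over encoder runtime."""
    return (enc_ratio + a * dec_ratio) / (a + 1)
```

For example, a tool with a 129% encoder and 102% decoder runtime ratio yields a weighted ratio of about 106% with the default weight.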
Full experimental results and configuration files can be found at the link below:
https://hevc.hhi.fraunhofer.de/svn/svn_VVCTestConfig/branches/VTM-3.0/
There were no bit rate or PSNR differences between testers and cross-checkers.
Encoder and Decoder runtime ratios provided by both the testers and cross-checkers were included in the
reporting template, to identify whether there were significant runtime differences.
Simulation results in all-intra configuration (AI) of VTM tool “off” test. (VTM anchor)
AI
Tester Tester XChecker XChecker
Abbreviation BDR-Y BDR-U BDR-V EncTime DecTime EncTime DecTime
CST 2.14% -3.28% -2.62% 129% 102% 131% 99%
DQ 1.91% 1.15% 0.86% 82% 101% 84% 101%
CCLM 2.07% 18.76% 18.66% 99% 100% 99% 100%
MTS 2.81% 2.35% 2.38% 47% 85% 47% 84%
ALF 2.25% 3.05% 3.18% 99% 88% 100% 90%
MRLP 0.54% 0.26% 0.28% 95% 98% 95% 103%
CPR -0.27% -0.40% -0.33% 137% 100% 133% 100%
SAO 0.30% 0.41% 0.72% 100% 101% 100% 98%
Simulation results in random access configuration (RA) of VTM tool “off” test. (VTM anchor)
RA
Tester Tester XChecker XChecker
Abbreviation BDR-Y BDR-U BDR-V EncTime DecTime EncTime DecTime
CST 0.30% 2.44% 2.58% 101% 101% 100% 100%
DQ 1.66% 0.51% 0.06% 95% 102% 96% 102%
CCLM 0.95% 16.61% 16.67% 99% 100% 100% 103%
MTS 1.26% 1.14% 1.28% 90% 97% 89% 98%
ALF 3.68% 3.54% 3.15% 100% 87% 101% 87%
AFF 2.57% 1.85% 1.79% 89% 98% 89% 97%
SBTMC 0.53% 0.42% 0.48% 100% 99% 100% 99%
AMVR 0.88% 1.39% 1.41% 91% 101% 91% 101%
HMVP 0.42% 0.49% 0.49% 102% 100% 101% 100%
PMC 0.13% 0.07% 0.05% 100% 100% 100% 100%
TAP 0.37% 0.66% 0.67% 90% 101% 90% 103%
BDOF 1.19% 0.39% 0.26% 93% 96% 91% 93%
CIIP 0.47% 0.23% 0.28% 96% 100% 95% 100%
MMVD 0.86% 0.63% 0.65% 89% 101% 88% 100%
BPWA 0.44% 0.65% 0.66% 97% 101% 97% 100%
Simulation results in low-delay B configuration (LDB) of VTM tool “off” test. (VTM anchor)
LDB
Tester Tester XChecker XChecker
Abbreviation BDR-Y BDR-U BDR-V EncTime DecTime EncTime DecTime
CST 0.15% -0.54% -0.03% 101% 100% 100% 101%
DQ 1.48% 0.86% -0.31% 95% 103% 96% 101%
CCLM 0.08% 4.25% 4.67% 100% 100% 100% 105%
MTS 0.34% 0.54% 0.67% 96% 102% 96% 100%
ALF 2.59% 3.38% 3.40% 101% 89% 101% 89%
AFF 2.10% 1.34% 1.51% 82% 96% 81% 95%
SBTMC 0.63% 0.73% 0.57% 100% 97% 101% 99%
AMVR 0.54% 0.94% 0.97% 87% 101% 89% 101%
HMVP 0.25% 0.21% 0.28% 100% 98% 102% 102%
PMC 0.01% -0.08% -0.11% 100% 103% 100% 100%
TAP 0.89% 1.19% 1.18% 88% 104% 87% 104%
CIIP 0.47% 0.54% 0.59% 96% 102% 95% 100%
MMVD 0.21% 0.09% -0.02% 95% 99% 95% 100%
BPWA 0.31% 0.29% 0.25% 96% 100% 97% 101%
MRLP 0.12% 0.16% 0.15% 100% 98% 100% 100%
CPR 0.14% 0.26% 0.01% 107% 99% 106% 100%
SAO 1.41% 3.25% 4.20% 101% 97% 100% 98%
Pixel usage and memory bandwidth results of VTM tool “off” test. (VTM anchor)
              AI           RA                                   LDB
Abbreviation  Pixel usage  Pixel usage  Ave mem BW  Max mem BW  Pixel usage  Ave mem BW  Max mem BW
CCLM 51% 3% 1%
ALF 99% 62% 59%
AFF 24% 28%
SBTMC 16% 95% 94% 15% 97% 98%
AMVR 4% 102% 102% 2% 101% 100%
TAP 2% 100% 5% 98%
BDOF 42% 98%
CIIP 2% 99% 2% 100%
MMVD 10% 99% 8% 98%
BPWA 10% 100% 100% 7% 100% 99%
MRLP 7% 1% 0%
JVET-M0090 On the use of chroma QP offsets in the VVC common test conditions
[C. Helmrich, H. Schwarz, D. Marpe, T. Wiegand (HHI)]
Discussed Thursday 17 January ~2200 (GJS & F. Bossen).
Since version 2 of the VTM reference software for the Versatile Video Coding (VVC) standard, chroma
QP offsets of value 1 have been used for testing according to the common test conditions (CTC), but only
in (Intra) frames for which dual-tree coding has been enabled. This contribution reports that, in
comparison with the HEVC reference software, HM 16.x, the current VTM 3.x version provides much
higher BD-rate gains in the chroma channels than in the luma channel. To counteract this uneven
JVET-M0135 On adaptive resolution change (ARC) for VVC [Hendry, Y.-K Wang,
J. Chen (Huawei), T. Davies, A. Fuldseth (Cisco), Y.-C Sun, T.-S Chang,
J. Lou (Alibaba)]
This contribution discusses use cases or application scenarios that it was asserted would benefit from an
adaptive resolution change (ARC) capability where the spatial resolution may change at a non-IRAP
picture. A preliminary design of ARC was presented and suggested to be a place-holder just to trigger the
discussion. It was proposed that the support of ARC for VVC be discussed, and if there is sufficient
interest, either to start a CE or include aspects of ARC support as one or more mandates of an AHG.
The need to switch may be motivated by a desire to reduce bit rate – being required to send an I frame is
obviously a problem for that purpose. Example use cases include:
Rate adaption in video telephony and conferencing
Active speaker change in multi-party video conferencing
Fast start in streaming
Adaptive stream switching in streaming
The basic tool constraints for supporting ARC are as follows:
The spatial resolution may differ from the nominal resolution by a factor of 0.5, applied to both
dimensions. The spatial resolution may increase or decrease, yielding scaling ratios of 0.5 and 2.0.
The aspect ratio and chroma format of the video are not changed.
The cropping areas are scaled in proportion to the spatial resolution.
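The stated scaling rule can be sketched as follows (function name hypothetical; integer dimensions and a cropping window given as (left, top, right, bottom) are assumed):

```python
def arc_scaled_format(width, height, crop, downscale=True):
    # ARC as constrained above: the resolution changes by a factor of 0.5 in
    # both dimensions (or back up by 2.0); aspect ratio and chroma format are
    # unchanged, and the cropping window scales with the spatial resolution.
    s = 0.5 if downscale else 2.0
    scaled = lambda v: int(v * s)
    left, top, right, bottom = crop
    return (scaled(width), scaled(height),
            (scaled(left), scaled(top), scaled(right), scaled(bottom)))
```

For instance, a 1920x1080 picture with a full-picture cropping window downscales to 960x540 with a correspondingly scaled window, and scaling back up restores the original format.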
JVET-M0259 Use cases and proposed design choices for adaptive resolution changing
(ARC) [M. M. Hannuksela, A. Aminlou (Nokia)]
The following high-level design choices were proposed for VVC version 1:
1. It was proposed to include a reference picture resampling process in VVC version 1 for the following
use cases:
Usage of efficient prediction structures (e.g. so-called open groups of pictures) in adaptive
streaming without compromising the fast and seamless representation switching capability
between representations of different properties, such as different spatial resolutions.
Adapting low-delay conversational video content to network conditions and application-originated
resolution changes without significant delay or delay variation.
2. The VVC design was proposed to allow merging of a sub-picture originating from a random-access
picture and another sub-picture originating from a non-random-access picture into the same coded
picture conforming to VVC. This is asserted to enable efficient handling of viewing orientation
changes in mixed-quality and mixed-resolution viewport-adaptive 360° streaming. This design choice
is discussed also in JVET-M0388.
3. It was proposed to include a sub-picture-wise resampling process in VVC version 1. This was
asserted to enable efficient prediction structure for more efficient handling of viewing orientation
changes in mixed-resolution viewport-adaptive 360° streaming.
Section 5.13 ("Support for Adaptive Streaming") of MPEG N 17074 was reported to include the
following requirement for VVC:
The standard shall support fast representation switching in the case of adaptive streaming services
that offer multiple representations of the same content, each having different properties (e.g. spatial
resolution or sample bit depth). The standard shall enable the use of efficient prediction structures
(e.g. so-called open groups of pictures) without compromising from the fast and seamless
representation switching capability between representations of different properties, such as different
spatial resolutions.
The contribution suggested that using this with a CRA picture can provide a gradual quality change.
Significant overhead was reported: +29% for LDB (RA results were incomplete).
The contribution suggested some normative changes to reduce the overhead.
The contribution proposed to use the provided software for future AHG14 experiments.
No action was taken on this except to encourage AHG14 to consider / decide on the desirability and use
of the proposed software. It could be uploaded to an AHG14 space on Gitlab.
JVET-M0360 Video coding based on cross RAP referencing (CRR) [H. Yu, X. Gao,
Q. Yuan, X. Lin, L. Yu (Zhejiang Univ.), Y. Fan, Y. Zhao, H. Yang, Y.-K.
Wang, J. Chen (Huawei)] [late]
This contribution proposes an approach to allow inter prediction referencing across random access points
(RAPs), referred to as cross-RAP referencing (CRR). Simulation results reportedly show BD coding gains
in the range from 11.21% to 25.91% based on VTM3.0 in VVC CTC and from 19.6% to 37.38% based
on HM16.15 in HEVC CTC. A brief description of signalling, decoding process, and random access
operations is provided. It is proposed to start a CE on this.
The idea of a “library picture” provided by external (unicast) means and referenced across what are
otherwise random access points was described.
JVET-M0514 Removal of CIP from Multi-hypothesis Intra Prediction [C.-C. Chen, W.-J.
Chien, M. Karczewicz (Qualcomm)]
[Should this be in a different section?]
This contribution was presented Thursday 17 January 1830. Chaired by F. Bossen.
This report proposes to remove the constrained intra prediction (CIP) scheme from multi-hypothesis intra
prediction (MHIntra). Specifically, the CIP flag was proposed to not take effect on the derivation process
of reference samples when MHIntra flag is on. It was reported that the proposed method reduces the
Y/U/V BD-rate loss from VTM-3.0 with CIP=1 by 0.83%/0.44%/0.43% for Random Access and
0.93%/2.01%/2.09% for Low Delay B.
It was remarked that there was no CIP in the current design, and that there had been no agreement to add it.
No action was thus taken on the proposal at this time.
5 Core Experiments
5.1 CE1: Partitioning (3)
JVET-M0021 CE1: Summary report on partitioning [J. Ma, F. Le Léannec, M. W. Park]
This contribution was discussed on Wednesday 9 January at 1530–1700 (chaired by GJS).
This document summarizes CE1 on partitioning (JVET-L1021). Two tests were conducted by
proponents and cross-checked by at least one cross-checker each. There were no mismatches reported to
the coordinators.
The current VTM software has split restrictions implemented to allow 64x64 pipelining. More precisely,
the concept of (square-shaped) virtual pipeline data units (VPDUs) is used to allow for 64x64 pipelining
inside the picture.
The software used for both tests can be found at:
https://vcgit.hhi.fraunhofer.de/JVET-L-CE1/VVCSoftware_VTM.git
SubCE1.1.1
(JVET-M0446, Tencent)
Proponents study non-square VPDUs to enhance RD performance and to unify with the boundary
partitioning of the current VTM software.
Conditions for CU_TRIH_SPLIT and CU_TRIV_SPLIT are changed so that, e.g., 128x32 and 32x128
block sizes are possible. Sub-partitions of such blocks are also allowed. However, 128x64 and 64x128
blocks are not allowed to be split using a ternary split, whereas 128x128 can be split using a ternary split.
Results do not affect AI.
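As an illustration of the 1:2:1 ternary split geometry discussed above (helper names hypothetical; a simplified sketch covering only the cases stated here, not the full VVC split rules):

```python
def ternary_split(width, height, vertical):
    # A ternary split divides a block 1:2:1 along one dimension.
    if vertical:
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    return [(width, height // 4), (width, height // 2), (width, height // 4)]

def ternary_split_allowed(width, height):
    # Per the stated conditions: 128x128 may be ternary split (yielding
    # 32x128 or 128x32 outer parts), but 128x64 and 64x128 may not.
    return (width, height) not in ((128, 64), (64, 128))
```

A vertical ternary split of a 128x128 block thus yields 32x128, 64x128, and 32x128 parts, which is how the rectangular 32x128 sizes arise.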
Example for different VPDU structures. Numbers denote the processing order of VPDUs
The cross-checker indicated that case 2 had more cases than case 1 where the immediately previous
decoded region was needed for prediction (e.g., intra prediction). The proponent indicated that the worst
case was the same in both cases, and that if what matters is the average rather than the worst case, case 2
would be more rarely encountered.
It was suggested that the key test sequences for this issue are the high resolution ones, since there is
generally little gain for using very large blocks with low-resolution test sequences.
The VTM uses 64x64 VPDUs (for luma). The proposal tries to gain coding efficiency by allowing the
rectangular shapes 32x128/128x32.
In the current draft spec, these rectangular shapes are available at the edge of the picture.
SubCE1.2.1
(JVET-M0236, Canon)
Proponents extend the current TU tiling such that TUs do not cross the boundaries of a 64x64 grid. It is
also asserted that with this technology the ternary split at the top level becomes available, and therefore
improved RD performance is possible.
Remarks:
The gain is substantial (0.59% for Class A in RA configuration)
There is a substantial encoder complexity increase (~27%)
This has a latency issue for decoding, since the TU coefficient data needs to be buffered up for VPDU
pipeline processing. A related contribution JVET-M0237 proposes to shift this buffering burden to
the encoder.
No action was taken on this.
Related proposals JVET-M0195 and JVET-M0285 allow PUs to straddle the VPDU boundaries but
require there to be no residual in those cases (0.38% benefit for Class A in RA configuration with an
encoder complexity increase of 19% if allowing all modes with zero residual, or 0.2% benefit for Class A
in RA configuration with an encoder complexity increase of 5% if allowing only skip mode in these
cases). No action was taken on these either.
JVET-M0236 CE1: Transform tiling for pipelined processing of CTUs (Test 1.2.1)
[C. Rosewarne, A. Dorrell (Canon)]
JVET-M0446 CE1: Rectangular virtual pipeline data unit (test 1.1.1) and supplementary
results [M. Xu, X. Li, S. Liu (Tencent)]
2.2.1a has loss, but does not really solve a problem in terms of complexity
2.2.1b uses different neighbour positions (A1/B1) for the context of affine flag coding (which currently
uses A2/B3), claiming that these are positions different from those used in merge. However, other
elements (e.g. AMVR, triangular, quadtree/MTT split flag) also use A2/B3. Therefore, making this
change only for the affine flag does not seem to be a unification/simplification.
The current affine merge has 4 context-coded bins. Both tests 2.2.2a/b reduce this to only one (as in
normal merge mode), where 2.2.2a uses different contexts for affine and normal merge, whereas 2.2.2b
uses the same. 2.2.2a has no loss in RA and very small loss in LB, whereas 2.2.2b also shows 0.07% loss in
RA.
2.2.3.x establishes a history-based motion vector prediction for affine, however the HMVP candidates are
different (specifically collected from affine coded blocks) than in normal HMVP.
It is commented that no coordinates are used for a/b/c (which theoretically might lead to a wrong model
depending on distance).
a/b/c have additional complexity in list derivation and storage of history candidates, which is not justified
by the small gain (at most 0.07%, for 2.2.3c).
2.2.3d is interesting as it reduces the need of local storage for spatial CPMV (control point motion
vectors) and replaces CPMV by history-based candidates. It however also disables the inheritance at CTU
boundaries, which is asserted to be the main reason for the loss of 0.07% (there are CE related
contributions JVET-M0432, JVET-M0110, JVET-M0168, JVET-M0262, JVET-M0270) that tackle this
issue in various ways. The approach should be further studied to solve the complexity reduction with less
penalty in terms of compression performance.
Test 2.2.4.x signals offset for affine merge mode (applied to all CPMV). The method b based on picture
height only provides small additional benefit. The method d has some additional complexity for scaling,
without showing benefit over c (which just uses mirroring). Gain is on average between 0.16% (for a) and
0.20% for c/d, generally higher for high resolution sequences.
It was initially recommended in Track B to adopt JVET-M0431, test CE3-2.2.4c, but on top of 2.2.4a
(using distance offsets 1/2-, 1-, 2-, 4-, and 8-pel as per table 2.1 of JVET-M0431), and to add a high-level
enabling flag.
Specification text was to be provided and reviewed.
This decision was later reverted. See the discussion of this topic in the notes of the plenary of Sunday 13
January in section 10.1. It was agreed that this should be further studied in a CE.
2.2.5 is a similar approach with same number of offset candidates (total 20), but supports larger offset
values, and different combinations. Gives less compression benefit.
2.2.6.x reduce the number of constructed affine candidates from 6 to 3. However, loss of 0.07% is
observed in LDB (where constructed candidates may be used more often). Furthermore, the study of
AHG16 shows that affine merge is not the most critical path in VVC. Therefore, proposals which reduce
complexity in this area should rather come with very low penalty in compression.
2.2.7.a removes “mixed” spatial-temporal constructed candidates, replacing them by pure temporal
candidates. The total number of candidates is unchanged, but the list construction (pruning) becomes
more complicated. Gain is 0.08%/0.03% for RA/LB
2.2.7b keeps the original candidates, and adds three more pure temporal candidates. The list construction
(pruning) has significantly higher complexity than current VVC (almost 3x number of MV comparisons).
Gain is 0.1%/0.07% for RA/LDB
Some of the reported gain may also be due to encoder changes. No good tradeoff of performance
compared to additional complexity. Also reports increased encoder runtime.
2.2.8.x adds some pruning processes to affine MV prediction and merge modes. Benefit in terms of
compression not obvious. (Note that the technology had more promising compression benefit before
VTM3).
Test 2.3.1.a adds PMVP as a new mode to determine subblock vectors. 2.3.1.a3 imposes the same
restrictions as current (non-affine) subblock MC. Low gain (around 0.1%), whereas encoder/decoder run
time increases. The CE report does not have a detailed complexity analysis, but it can be asserted to be
not insignificant, as MV scaling is necessary for each subblock.
Test 2.3.1.c puts PMVP as additional candidate into the affine merge list (but uses 8x8 subblocks) –
therefore, loss is observed, probably as affine usually has 4x4 subblocks, and/or merge list size is
reduced.
PMVP had 0.6% gain before, which is now down to 0.1%. This does not justify adding another subblock
mode with additional processing.
Complexity analysis:
Test#                Bandwidth for       Affine MVs  Others (additional operations for MV
                     uni/bi-prediction   storage     clipping or derivation, and so on)
Translational model  6.25 / 8.56
Affine model         8.33 / 16.66
2.4.1.a              8.33 / 11.84        Same        2 comparisons per CU (1 for width/height + 1 for uni/bi)
2.4.1.b              8.33 / 11.84        Same        1 comparison per CU (1 for uni/bi)
2.4.1.c              8.33 / 11.84        Same        1 comparison per CU (1 for uni/bi)
2.4.2                8.33 / 8.56         Same        MV clipping for subblk MVs
2.4.3.a              8.33 / 11.84        Same        3 comparisons per CU (1 for width/height + 1 for uni/bi + 1 for restriction)
2.4.3.b              8.33 / 8.56         Same        1 comparison per CU (1 for uni/bi)
2.4.3.c              8.33 / 9.53         Same        13 comparisons per CU (1 for uni/bi + 12 for restriction)
1. Reduce worst case memory bandwidth for affine. This can be done by disallowing 4x4 subblocks
(Tests 2.4.1, 2.4.3b, 2.4.4). All of these come with some penalty in compression performance (typically
0.1% when 4x8/8x4 SB are used for bi-prediction, 0.2+% when an 8x8 restriction is used). It is noted that these
approaches also reduce the number of computations for motion compensation (with 4x4 blocks, more lines
have to be interpolated at block boundaries before the vertical interpolation can be performed).
It is also mentioned that a possible impact on visual quality might occur when using larger subblocks in
affine MC, in particular for sequences with stronger affine motion. This has not been investigated yet.
Other approaches (Tests 2.4.2, 2.4.3a/c, 2.4.5) keep 4x4 subblocks and impose some restrictions (e.g. by
clipping, grouping, or adaptive selection of subblock size based on CPMV), such that adjacent subblocks'
vectors do not differ too much and joint memory access can be made. These come with practically
no loss in compression, but may require some additional logic. Keeping 4x4 subblocks without losing
compression performance appears to be the more desirable concept, but more study is necessary to understand
the impact of the different approaches by which it can be achieved. The following aspects are important:
- Number of operations at decoder (for determining SB MVs and MC)
- More detailed study of worst case memory bandwidth in relation to memory access models, cache
size, etc.
- Possibility of formulating by encoder/bitstream restrictions
- Possibility of applying such concepts not only for affine, but also for other small-size CU cases
- Possible impact on subjective quality
2. Reduction of local buffers for storing affine-related CPMV inheritance (2.4.7, 2.4.8). 2.2.3.d targets
the same issue. It appears from CE results (and possibly from CE-related contributions) that this
can be achieved with almost no loss in compression. However, here too a more detailed analysis is necessary
in terms of additional logic and the effective saving in the number of bits for local buffers.
Further analysis of the different approaches in BoG (Y. He), also review the CE related proposals, and
suggest approaches to be further studied in upcoming CE. It is unquestionable that something is necessary
in VVC to restrict the worst case memory bandwidth, but the optimum solution of achieving this may not
be identified yet.
3. Sub-Test 2.4.6 enables 6-parameter model inheritance across CTU boundary. However in VTM3 this
does not seem to have any benefit.
The subsequent notes contain descriptions of technology which were copied from JVET-M0022. Actions
taken are noted above.
JVET-M0053 CE2.4.7 Size constrain for inherited affine motion prediction [H. Huang, W.-
J. Chien, M. Karczewicz (Qualcomm)]
It is proposed to disable inherited affine motion prediction from a neighbouring affine coded block if the
width or height of the neighbouring block is less than a threshold.
JVET-M0054 CE2.4.6 Modified affine inheritance from above CTU [H. Huang, W.-J.
Chien, M. Karczewicz (Qualcomm)]
In the proposed method, a 6-parameter affine inherited candidate is derived from a neighbouring block in
the above CTU as illustrated in Fig. 1. The top-left CPMV v0 and the top-right CPMV v1 are derived from
the bottom-left MV vLB and the bottom-right MV vRB of the neighbouring candidate block:

v0 = ((vRB − vLB) / neiW) * (posCurX − posNeiX) + vLB
v1 = ((vRB − vLB) / neiW) * (posCurX + curW − posNeiX) + vLB

If the MV from the left neighbour, vleft, is available and has the same reference picture as vLB:
v2 = vleft;
Otherwise, if the MV from the below-left neighbour, vbelowleft, is available and has the same reference
picture as vLB:
v2 = vbelowleft;
Otherwise:
(deltaHorX, deltaHorY) = (vRB − vLB) / neiW
deltaVer = (−deltaHorY, deltaHorX)
v2 = ((vRB − vLB) / neiW) * (posCurX − posNeiX) + deltaVer * curH + vLB
where curH is the height of the current block.
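As a non-normative illustration, the derivation above can be sketched as follows (floating-point arithmetic for readability, whereas the codec uses fixed-point; the same-reference-picture checks for the left and below-left neighbours are omitted and replaced by simple availability arguments):

```python
# Sketch of the CE2.4.6 CPMV inheritance described above (JVET-M0054).
# MVs are 2-D tuples; names (vLB, vRB, neiW, posCurX, ...) follow the text.

def derive_inherited_cpmvs(vLB, vRB, posNeiX, neiW,
                           posCurX, curW, curH,
                           v_left=None, v_belowleft=None):
    """Derive CPMVs (v0, v1, v2) of the current block from the bottom-left
    MV vLB and bottom-right MV vRB of a neighbour in the above CTU row."""
    dHorX = (vRB[0] - vLB[0]) / neiW   # horizontal gradient of the model
    dHorY = (vRB[1] - vLB[1]) / neiW
    dx0 = posCurX - posNeiX
    v0 = (dHorX * dx0 + vLB[0], dHorY * dx0 + vLB[1])
    dx1 = posCurX + curW - posNeiX
    v1 = (dHorX * dx1 + vLB[0], dHorY * dx1 + vLB[1])
    if v_left is not None:             # same-reference check omitted here
        v2 = v_left
    elif v_belowleft is not None:
        v2 = v_belowleft
    else:
        # Vertical gradient assumed to be the 90-degree rotation of the
        # horizontal gradient, as in the last equation above.
        dVerX, dVerY = -dHorY, dHorX
        v2 = (dHorX * dx0 + dVerX * curH + vLB[0],
              dHorY * dx0 + dVerY * curH + vLB[1])
    return v0, v1, v2
```

For a neighbour of width 8 with vLB = (0,0) and vRB = (4,0) and no left/below-left MV, the fallback branch yields a rotational v2 below the top-left corner.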
JVET-M0125 CE2: History Based Affine Motion Candidate (Test 2.2.3) [J. Zhao, S. Kim
(LGE), G. Li, X. Xu, X. Li, S. Liu (Tencent)]
In these tests, history based affine MV candidate (Affine HMVP) methods will be applied to affine merge
list or affine AMVP list generation or both. The following aspects will be investigated:
- Generation of a table to store motion information of coded affine CUs
- The derivation of affine HMVP candidates from the stored affine motion information
- Simplified pruning for affine HMVP candidates
- Impact of various history table sizes, e.g. 4, 6, 8
- Coexistence with or replacement of existing affine inherited candidates
JVET-M0165 CE2.5.1: Simplification of SbTMVP [C.-C. Chen, C.-W. Hsu, Y.-W. Huang,
S.-M. Lei (MediaTek)]
In JVET-L0092, a simplification method for ATMVP was described that simply disables ATMVP for
small CU sizes. This method can reduce the hardware processing cycles for small CUs. Currently, the
sub-block merge mode is disabled when the width or height is smaller than 8. This CE test disables
ATMVP when the number of samples in one CU is smaller than 128, 256, or another value. Within this CE,
the performance of the method described in JVET-L0092 is studied for different requirements, such
as disabling ATMVP for small area, small width, small height, and so on. Due to the combination of ATMVP
and affine merge into a unified subblock-based merge list, the coherence of the disabling rule with
affine merge is also studied.
JVET-M0227 CE2.5.2: A second ATMVP candidate [Y.-W. Chen, X. Wang (Kwai Inc.)]
In the method proposed in JVET-L0105, a new ATMVP-like merge candidate is generated and inserted
into the merge candidate list in addition to the original ATMVP. The derivation process of the new
ATMVP is similar to the original ATMVP, with slightly different rules used in selecting motion vectors
(MV) from corresponding reference sub-blocks. More specifically, the MV selection rules for TMVP
from a reference block are used in deriving MVs for each subblock under the new ATMVP mode.
JVET-M0246 CE2: Adaptive Motion Vector Resolution for Affine Inter Mode (Test 2.1.2)
[H. Liu, K. Zhang, L. Zhang, J. Xu, Y. Wang, P. Zhao, D. Hong (Bytedance)]
The contribution extends Adaptive Motion Vector Resolution (AMVR) to affine inter mode coding,
wherein MVD in affine inter mode can be coded with adaptive precision.
JVET-M0262 CE2: Affine model inheritance from single-line motion vectors (Test 2.4.8)
[K. Zhang, L. Zhang, H. Liu, J. Xu, Y. Wang, P. Zhao, D. Hong (Bytedance)]
In affine model inheritance, fewer MVs stored in neighbouring blocks are required.
JVET-M0282 CE2: Affine motion predictor pruning (test 2.2.8) [A. Robert, F. Le Léannec,
T. Poirier, F. Galpin (Technicolor)]
This proposal includes two aspects. First, before any pruning operation takes place, the involved motion
vectors are computed in their final representation. This means the MVs are set to the appropriate precision
level, rounded to the right motion vector resolution or clipped, before being compared to candidates
already present in the concerned list. The second aspect consists of adding some pruning operations between
candidates to improve the trade-off between coding efficiency and complexity.
JVET-M0302 CE2: Merge Mode with Regression-based Motion Vector Field (Test 2.3.3)
[R. Ghaznavi-Youvalari, A. Aminlou, J. Lainema (Nokia)]
JVET-L0171 proposed a new sub-block merge mode with Regression-based Motion Vector Field
(RMVF). RMVF uses a 6-parameter motion model for calculating the motion vectors of sub-blocks.
[ MVx_subPU ]   [ a_xx  a_xy ] [ X_subPU ]   [ b_x ]
[ MVy_subPU ] = [ a_yx  a_yy ] [ Y_subPU ] + [ b_y ]
Motion parameters are calculated based on one line and one row of spatially neighbouring 4x4 sub-blocks,
using their motion vectors and centre locations as input to a linear regression method.
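Conceptually, the regression step can be sketched as an ordinary least-squares fit of the 6-parameter model above from the neighbouring sub-block MVs and their centre positions (floating-point numpy here purely for illustration; the normative derivation uses fixed-point arithmetic):

```python
# Illustrative least-squares fit of the 6-parameter RMVF model above.
import numpy as np

def fit_rmvf(centers, mvs):
    """Fit parameters so that MV ~= [a_xx a_xy b_x; a_yx a_yy b_y] @ (X, Y, 1).

    centers: list of (X, Y) sub-block centre positions
    mvs:     list of (MVx, MVy) for those sub-blocks
    """
    X = np.column_stack([centers, np.ones(len(centers))])  # shape (N, 3)
    # Two independent linear regressions, one per MV component.
    params, *_ = np.linalg.lstsq(X, np.asarray(mvs, dtype=float), rcond=None)
    return params.T  # rows: (a_xx, a_xy, b_x) and (a_yx, a_yy, b_y)

def rmvf_subblock_mv(params, x, y):
    """Evaluate the fitted model at a sub-block centre (x, y)."""
    return params @ np.array([x, y, 1.0])
```

With purely translational neighbour MVs, the fit degenerates to a zero affine part and a constant offset, as expected.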
JVET-M0309 CE2: Memory bandwidth reduction for affine mode (test 2.4.2) [J. Li, R.-L.
Liao, C. S. Lim (Panasonic)]
To reduce the worst-case memory bandwidth of an affine CU, the sub-block motion vectors of the affine
CU are constrained to be within a pre-defined motion vector field which is calculated according to the
motion vector of first sub-block, the size of the affine CU and the prediction direction of the affine CU
(i.e. uni-prediction or bi-prediction). Assuming that the memory is retrieved per CU instead of per sub-block,
and that the motion vectors of the first (top-left) sub-block and of the i-th sub-block are (v0x, v0y) and
(vix, viy), respectively, the values of vix and viy are subject to the following constraints:
vix ∊ [v0x-H, v0x+H],
viy ∊ [v0y-V, v0y+V].
The values H and V are chosen to guarantee that the worst case memory bandwidth of the affine CU does
not exceed that of normal inter MC of a 8×8 bi-prediction block. If the motion vector of any sub-block
exceeds the pre-defined motion vector field, the motion vector is clipped.
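The clipping itself can be sketched as below. The derivation of H and V from the CU size and prediction direction (so that the worst case does not exceed an 8x8 bi-predicted block) is not reproduced here; they are passed in as assumed inputs:

```python
# Sketch of the JVET-M0309 sub-block MV clipping described above: each
# sub-block MV of an affine CU is clipped into the pre-defined field
# [v0x - H, v0x + H] x [v0y - V, v0y + V] around the first sub-block's MV.
# H and V are assumed to have been derived from CU size / prediction
# direction elsewhere.

def clip_subblock_mv(mv, mv0, H, V):
    """Clip sub-block MV (vix, viy) into the allowed field around (v0x, v0y)."""
    vix = min(max(mv[0], mv0[0] - H), mv0[0] + H)
    viy = min(max(mv[1], mv0[1] - V), mv0[1] + V)
    return (vix, viy)
```

A vector already inside the field is returned unchanged; only outliers are pulled to the field boundary.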
JVET-M0380 CE2: Affine Merge flag coding (Test 2.2.1) [G. Laroche, C. Gisquet, P. Onno,
J. Taquet (Canon)]
Test: CE2.2.1.a (Bypass Affine flag for Skip CU)
The affine motion compensation prediction is particularly efficient for complex motion like rotation,
zoom in, zoom out and perspective motion. The Skip mode is efficient for static content or constant
motion. It seems that these modes compensate different kinds of motion and are not compatible. This test
proposes to bypass the affine mode flag coding when the current CU is “skip” coded.
Test: CE2.2.1.b (Affine flag context simplification)
This test aligns the block positions used to derive the affine flag context with the Merge candidate
positions. The new proposed block positions for the Affine flag context are A1 and B1 instead of A2 and
B3 in VTM3. With this modification, only 5 block positions (instead of 7) need to be checked before
decoding or parsing the Affine flag.
JVET-M0420 CE2: Adaptive precision for affine MVD coding (Test 2.1.1) [J. Luo, Y. He,
X. Xiu (InterDigital)]
In VTM-3.0, affine MVD precision for the signalling is fixed ({1/4-pel, 1/4-pel, 1/4-pel}). Adaptive
precision is proposed for affine MVD signalling. Three precisions are proposed for affine MVD coding at
those control points. The precision set {prec_TL, prec_TR, prec_BL} is used to indicate the MVD
precision for top-left, top-right and bottom-left control points. For 6-parameter affine model, three
precision sets are used: {1-pel, 1/4-pel, 1/4-pel}, {1/4-pel, 1/4-pel, 1/4-pel}, and {1/8-pel, 1/8-pel, 1/8-
pel}. For 4-parameter affine mode, the same three precisions are used, except that the precision for
bottom-left control point is not needed.
Distance IDX      0        1      2      3      4
Distance-offset   1/2-pel  1-pel  2-pel  4-pel  8-pel
The direction index can represent four directions as shown below, where only x or y direction
may have an MV difference, but not in both directions.
Offset Direction IDX   00   01   10   11
x-dir-factor           +1   –1    0    0
y-dir-factor            0    0   +1   –1
If the inter prediction is uni-prediction, the signalled distance offset is applied in the signalled offset
direction to each control point predictor; the result is the MV value of each control point.
If the inter prediction is bi-prediction, the signalled distance offset is applied in the signalled offset
direction to the control point predictor's L0 motion vector, and the offset applied to the L1 MV
is applied on a mirrored basis.
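Combining the two tables above, the offset application can be sketched as follows (quarter-pel units assumed for illustration, so 1/2-pel = 2; the mirroring for L1 assumes L0 and L1 point in opposite temporal directions, as described):

```python
# Sketch of applying the signalled distance/direction offset to a control
# point MV predictor, per the tables above. Units here are quarter-pel.

# Distance-offset table in quarter-pel units: 1/2-, 1-, 2-, 4-, 8-pel.
DISTANCE_OFFSET = [2, 4, 8, 16, 32]
# (x-dir-factor, y-dir-factor) for direction indices 00, 01, 10, 11.
DIRECTION = [(+1, 0), (-1, 0), (0, +1), (0, -1)]

def apply_cpmv_offset(mv_l0, mv_l1, dist_idx, dir_idx):
    """Apply the offset to L0; mirror it for L1 when bi-predicted."""
    dx = DIRECTION[dir_idx][0] * DISTANCE_OFFSET[dist_idx]
    dy = DIRECTION[dir_idx][1] * DISTANCE_OFFSET[dist_idx]
    new_l0 = (mv_l0[0] + dx, mv_l0[1] + dy)
    if mv_l1 is None:                                   # uni-prediction
        return new_l0, None
    return new_l0, (mv_l1[0] - dx, mv_l1[1] - dy)       # mirrored for L1
```

Note that only one of the two MV components is ever changed, since each direction index has exactly one non-zero factor.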
JVET-M0472 CE2: Affine sub-block size restrictions (Test 2.4.4) [H. Chen, T. Solovyev,
H. Yang, J. Chen (Huawei)]
In affine mode, different sub-block shapes are used, depending on the prediction direction and the
CU size, as shown in the table below.
a) For uni-prediction affine coded CU, the sub-block size is set equal to 4x4.
JVET-M0476 CE2: Control point MV offset for Affine merge mode (Test 2.2.5) [Y.-C.
Yang, Y.-J. Chang (Foxconn)]
New Affine merge candidates are generated based on the MV offsets of the first Affine merge candidate
or the first sub-block MV merge candidate. In Uni-prediction, the MV offsets are applied to the MVs of
the first candidate. In Bi-prediction with List 0 and List 1 in the same direction, the MV offsets are
applied to the MVs of the first candidate as follows:
MVnew(L0), i = MVold(L0) + MVoffset(i)
MVnew(L1), i = MVold(L1) + MVoffset(i)
In Bi-prediction with List 0 and List 1 in opposite directions, the MV offsets are applied to the MVs of
the first candidate as follows:
MVnew(L0), i = MVold(L0) + MVoffset(i)
MVnew(L1), i = MVold(L1) - MVoffset(i)
On top of new adoptions in Affine merge aspect, this sub-test will test various offset directions combined
with various offset magnitudes for the proposed Affine MV offset merge candidates. The various
numbers of the Affine MV offset merge candidates will be tested.
JVET-M0477 CE2: Simplification of Affine constructed merge candidates (Test 2.2.6) and
supplementary results [Y.-C. Yang, Y.-J. Chang (Foxconn)]
The first aspect in JVET-L0390 is related to reduction of the maximum number of available merge
candidates. This test focuses on the simplification of the constructed Affine merge candidates. Define 6-
param constructed candidates and 4-param constructed candidates as follows:
0: Null
1: [CP0, CP1, CP2]
2: [CP0, CP1, CP3]
JVET-M0485 CE2: Sub-block MV clip in planar motion vector prediction (test 2.3.2)
[M. Gao, X. Li, M. Xu, S. Liu (Tencent)]
The MVs for all 4x4 sub-blocks inside the 4x16, 16x4 or 8x8 blocks are constrained such that the max
difference between integer parts of these sub-blocks is no more than a given threshold.
JVET-M0488 CE2: Sub-block MV clip in affine prediction (test 2.4.5) [M. Gao, X. Li,
M. Xu, S. Liu (Tencent)]
For affine prediction, it is proposed to constrain the MVs of four 4x4 sub-blocks within one 8x8 block
such that the max difference between integer parts of the four 4x4 sub-block MVs is no more than 1 pixel.
All Intra Main 10 - Over VTM-3.0 Random Access Main 10 - Over VTM-3.0
Test # Y U V EncT DecT Y U V EncT DecT
1.1.1 -0.59% -0.44% -0.47% 112% 103% -0.29% -0.31% -0.15% 102% 103%
1.1.2 -0.46% -0.34% -0.34% 112% 104% -0.24% -0.28% -0.18% 102% 102%
1.2.1 -1.36% -1.02% -1.01% 153% 105% -0.85% -0.92% -0.98% 112% 99%
1.2.2 -0.95% -0.42% -0.46% 153% 101% -0.57% -0.73% -0.83% 110% 98%
1.3 -0.08% -0.14% -0.11% 106% 100% -0.04% 0.00% -0.01% 101% 100%
It was noted that 1.1.x shows more gain on Class F (SCC content) than on other content, and Class F is
not included in the average. A non-proponent participant focused on implementation issues indicated that
they had analysed it and found it acceptable for implementation.
This was further discussed in the plenary on Sunday 13 January, and it was agreed to revisit this topic
since it has an aspect of reversal of decoding order which may cause it to be difficult to implement. This
aspect was studied further and was agreed to be removed. See the notes of the two plenary discussions of
this in section 10.1.
Decision: Adopt CE3-1.1.1 proposal (without the reverse coding order aspect); text was provided in a
revision of JVET-M0102.
Regarding 1.2.1: from the decoder perspective, this involves selecting a matrix of stored fixed values from
a set of such matrices, followed by a matrix multiplication applied to boundary sample values to generate
the prediction signal in the frequency domain; an inverse transform is then applied to generate a spatial-domain
prediction, followed by an ordinary residual difference.
It was noted that this has a bit more gain on RA than is usual for intra coding efficiency proposals
(0.85/1.36=0.625 versus the usual ~0.5).
This has some decoder runtime increase. For 1.2.1 there is an increase in computational operations.
It was commented that the need to include an inverse transform in the 1.2.1 variant is an additional
functional block unlike anything typically done for intra.
The amount of stored coefficient data is another issue, especially for the 1.2.1 variant (~300 kbytes). The
1.2.2 variant omits the inverse transform and has a (simple 2-tap one-dimensional average) downsampling
that reduces the size of the matrices (to about ~18 kbytes – a total of around 14,000 numbers of 10 bits
each), with a corresponding (bilinear) upsampling in the decoder. The proponent pointed to CPR as an
instance where added storage of a greater amount is needed (although, for screen content, that has quite
high gain).
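As a toy illustration of the 1.2.2 structure (2-tap boundary downsampling followed by a matrix multiplication), the core steps can be sketched as below. The matrix values and dimensions are placeholders, not the trained CE data, and the bilinear upsampling back to full block size is omitted for brevity:

```python
# Toy sketch of the test 1.2.2 prediction style discussed above: downsample
# boundary samples with a simple 2-tap average, then multiply by a stored
# matrix to obtain a reduced prediction block.

def downsample_boundary(samples):
    """2-tap averaging of the reference boundary (halves its length)."""
    return [(samples[2 * i] + samples[2 * i + 1]) // 2
            for i in range(len(samples) // 2)]

def matrix_predict(matrix, boundary):
    """Reduced prediction = matrix @ downsampled boundary (row-major lists)."""
    reduced = downsample_boundary(boundary)
    return [sum(m * b for m, b in zip(row, reduced)) for row in matrix]
```

The downsampling is what shrinks the stored matrices (to ~18 kbytes in 1.2.2, versus ~300 kbytes for 1.2.1).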
The VTM has 3 CCLM modes (CCLM, CCLM-above, and CCLM-left). The selection between these
modes is signalled, but the model parameters are not. The tests in the CE are to improve coding efficiency
by adding more models.
2.1 uses 3 columns to the left and 2 lines above (except at the CTU boundary)
2.2 uses 5 columns to the left and 4 lines above (except at the CTU boundary)
Between 2.1, 2.2, and 2.2.1, it was suggested to focus on the combination test 2.2.1, as it uses fewer
lines and cleaner signalling.
All Intra Main10 - Over VTM-3.0 Random Access Main10 - Over VTM-3.0
Test # Y U V EncT DecT Y U V EncT DecT
2.1 -0.25% -1.71% -2.19% 100% 100% -0.12% -1.36% -1.70% 99% 99%
2.2 -0.29% -1.94% -2.51% 100% 100% -0.15% -1.70% -1.99% 101% 100%
2.2.1 -0.27% -1.71% -2.26% 99% 99% -0.13% -1.30% -1.73% 99% 98%
2.3 -0.10% -0.86% -0.88% 102% 100% -0.03% -0.62% -0.66% 100% 100%
2.4 See below (*)
2.5.1 -0.09% -0.44% -0.47% 100% 99% -0.08% -0.31% -0.39% 100% 99%
2.5.2 -0.18% -0.88% -0.90% 100% 99% -0.11% -0.74% -0.84% 100% 99%
2.5.3 0.05% 0.45% 0.48% 100% 100% 0.02% 0.51% 0.51% 100% 99%
2.5.4 -0.14% -0.51% -0.51% 100% 101% -0.09% -0.41% -0.46% 100% 99%
2.6.1 0.09% 0.71% 0.79% 99% 97% 0.06% 0.88% 0.95% 100% 99%
2.6.2 0.03% 0.16% 0.21% 100% 97% 0.00% 0.35% 0.35% 100% 99%
All Intra Main10 - Over VTM-3.0 Random Access Main10 - Over VTM-3.0
Test #
Y U V EncT DecT Y U V EncT DecT
3.1.1 -0.11% -0.09% -0.08% 122% 103% -0.07% -0.06% -0.07% 104% 101%
3.1.2 -0.01% -0.05% -0.02% 122% 102% 0.01% 0.06% 0.04% 103% 100%
3.1.3 -0.12% -0.04% -0.05% 124% 104% -0.05% -0.01% 0.03% 103% 101%
3.1.4 -0.09% -0.09% -0.06% 113% 103% -0.05% 0.10% 0.01% 100% 100%
3.2 -0.03% -0.04% -0.05% 97% 100% 0.02% 0.14% 0.13% 99% 99%
3.3.1 -0.03% -0.88% -0.86% 100% 100% -0.01% -0.46% -0.50% 100% 100%
3.3.2 -0.02% -0.29% -0.20% 97% 100% 0.00% -0.12% -0.05% 99% 100%
3.4.1 -0.02% -0.96% -1.03% 101% 99% 0.04% -0.98% -1.04% 100% 100%
3.4.2 0.03% -0.34% -0.32% 97% 100% 0.10% -0.86% -0.92% 99% 100%
3.5 -0.01% -0.61% -0.64% 100% 100% 0.06% -0.68% -0.76% 100% 101%
For 3.1.x, the proposal is to add another mode in which the intra prediction mode is inferred by decoder
processing rather than signalled. This adds encoder (and some decoder) complexity, but the test results do
not show much benefit from this, so no action was taken on this.
For 3.2, the proposal is to restrict the number of selectable intra prediction modes to 32 of the 67. Some
gain is observed, but only a small amount, and it was commented that the restriction of what modes an
encoder would be allowed to select would restrict encoder freedom on how to make its mode decisions
(not allowing an encoder to choose a mode until after it is able to determine which 32 of the 67 are
allowed for selection), so no action was taken on this.
3.3.1, 3.3.2, 3.4.1, 3.4.2, and 3.5 are regarding chroma mode coding, which is dependent on luma mode.
3.3.1 and 3.4.1 would increase decoder complexity. One key aspect is how many luma points are
considered for deriving the chroma mode candidates (1 or 2). Another key aspect is how many chroma
mode candidates there are (3 or 5).
Schemes 3.3.2 and 3.4.2 reduce the number of candidates from 5 to 3. As tested, this reduced encoding
time since the encoder checked fewer modes, although the decoder complexity is higher than for the
current scheme (because it uses a 2-point check for direct mode selection). It was commented that the DC,
planar, horizontal and vertical modes are especially important for some encoder implementations. If these
are not always selectable, it would force a dependency between luma and chroma for encoding decisions.
These two schemes did not provide much coding gain, although in the way they were tested, they reduced
encoding time. The lack of significant coding gain, together with that dependency, did not appear to
justify action on those.
Results for an additional scheme called 3.5.1 (proposed in JVET-M0203) were included in the CE report.
This was a late addition that was not in the CE plan, so it was considered a non-CE proposal.
3.3.1 and 3.4.1 check two luma locations, whereas the VTM checks only 1 (the central position of the
luma block). The VTM sends a flag on whether to use that mode; if not, it sends a CCLM mode flag; if
not, it sends 2 bypass-coded bins to select between four modes. If the luma mode was not DC, planar,
horizontal or vertical, then those are the four modes; otherwise the luma mode is replaced with the
vertical diagonal mode to determine the four modes.
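The VTM four-mode derivation described above can be sketched as follows (the mode numbers are the usual VTM intra indices, assumed here for illustration: planar = 0, DC = 1, horizontal = 18, vertical = 50, vertical diagonal = 66):

```python
# Sketch of the VTM chroma candidate derivation described above: the four
# selectable modes are {planar, vertical, horizontal, DC}, except that when
# the luma mode is one of them it is replaced by the vertical diagonal mode.

PLANAR, DC, HOR, VER, VDIA = 0, 1, 18, 50, 66  # assumed VTM numbering

def chroma_mode_candidates(luma_mode):
    """Return the four selectable chroma modes given the co-located luma mode."""
    base = [PLANAR, VER, HOR, DC]
    if luma_mode in base:
        base[base.index(luma_mode)] = VDIA  # avoid duplicating the luma mode
    return base
```

With an angular luma mode (e.g. mode 2) the list is unchanged; a luma mode equal to one of the four is substituted by the vertical diagonal mode.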
3.3.1 and 3.4.1 check two luma locations and perform some comparison flowchart operations to
determine what is the primary selectable mode and what are the other four modes. The DC and planar
modes are always among the 5 selectable modes. It was noted that this forces a dependency between the
luma and chroma mode decisions unless the encoder only used DC and planar modes for chroma.
The possibility of supporting both the current scheme and the alternative was discussed. The gain seemed
insufficient to justify supporting two different schemes in the decoder.
3.5 checks one luma location (same as VTM); if the luma mode is DC or planar and the block shape is
vertical, then instead of the vertical diagonal mode being considered special, the horizontal diagonal
mode is considered special; if the luma mode is angular, the other selectable modes are determined by a
flowchart (but the horizontal and vertical modes are not always available). This has basically the same
forced cross-component dependency as 3.3.1 and 3.4.1.
Since the gain is relatively small and the 3.3.1, 3.3.2, 3.4.1, 3.4.2, and 3.5 proposals introduce an
undesirable cross-component dependency for encoders, no action was taken on these.
JVET-M0043 CE3: Affine linear weighted intra prediction (test 1.2.1, test 1.2.2) [J. Pfaff,
B. Stallenberger, M. Schäfer, P. Merkle, P. Helle, R. Rischke, H. Schwarz,
D. Marpe, T. Wiegand (HHI)]
JVET-M0102 CE3: Intra Sub-Partitions Coding Mode (Tests 1.1.1 and 1.1.2) [S. De-
Luxán-Hernández, V. George, J. Ma, T. Nguyen, H. Schwarz, D. Marpe,
T. Wiegand (HHI)]
JVET-M0142 CE3: Modified CCLM downsampling filter for “type 2” content (Test 2.4)
[P. Hanhart, Y. He (InterDigital)]
JVET-M0203 CE3: DM-based chroma intra prediction mode (Test 3.5) [N. Choi,
M. W. Park, K. Choi (Samsung)]
JVET-M0218 CE3: Simplified MDMS (test 3.3.1 and test 3.3.2) [J. Choi, J. Heo, S. Yoo,
L. Li, J. Choi, J. Lim, S. Kim (LGE)]
JVET-M0263 CE3: CCLM prediction with single-line neighbouring luma samples (Test
2.6.1 and Test 2.6.2) [K. Zhang, L. Zhang, H. Liu, J. Xu, Y. Wang, P. Zhao,
D. Hong (Bytedance)]
JVET-M0475 CE3: Multiple neighbour LM (Test 3.2.2) [H.-J. Jhu, Y.-J. Chang (Foxconn)]
JVET-M0503 CE3: Chroma intra prediction simplification (Test 3.4.1 and 3.4.2) [C.-H.
Yau, C.-C. Lin, C.-L. Lin (ITRI)]
JVET-M0504 CE3: adaptive multiple cross-component linear model (Test 3.2.3) [S.-P.
Wang, C.-H. Yau, C.-C. Lin, C.-L. Lin (ITRI)]
JVET-M0024 CE4: Summary report on inter prediction and motion vector coding
[H. Yang, S. Liu, K. Zhang]
This contribution provides a summary report of Core Experiment 4 on inter prediction and motion vector
coding. CE4 comprises 5 categories,
1) Merge mode simplification
2) Merge mode enhancement
3) Parallel processing for merge mode
4) Motion vector coding
5) Motion compensation constraints for complexity reduction
All techniques are implemented on top of, and tested against, VTM 3.0. Simulation results and crosschecking
reports for each test specified in this document are provided.
CE4.1: Merge mode simplification
4.1.2.a (JVET-M0126):
1) HMVP buffer size is increased from 6 to 10, with no pruning to update the buffer.
2) One out of every 3 HMVP candidates is picked to add to the merge list; at most 4 HMVP candidates.
3) Pairwise candidates are generated without using HMVP candidates.
4) The first 3 HMVP candidates are pruned against the left and above spatial candidates.
4.1.2.b (JVET-M0126): as 4.1.2.a aspects 1), 2), 3), plus 4) the first 2 HMVP candidates are pruned
against the left and above spatial candidates.
Regarding JVET-M0126, only aspects 1) and 2) were in the original proposal (JVET-L0401), whereas
aspects 3) and 4) were included in the first release of the software. It is suggested that the proponents also
provide separate results which show the benefit of combination 1/2, as well as 3-only and 4-only
separately.
Results were made available in JVET-M0126v8. Based on this analysis, it was found that only method
4.1.2.4 (the first 2 HMVP candidates are pruned against the left and above spatial candidates) is relevant.
This would be competing with 4.1.1.a. Both methods significantly reduce the number of pruning processes
compared to the current design, where 4.1.2.4 has even slightly less complexity, and the loss of compression
seems to be more homogeneous over RA and LB (approx. 0.04% luma) and over different sequences.
Decision: Adopt JVET-M0126 version 4.1.2.4 (text is available, but needs to be reduced to reflect that only
this aspect is changed).
Test#     Diff. in number of   Diff. in max number of   Others                                 RA       LB
          pruning stages       candidate comparisons
          (+/- xx)
4.1.1.a   -4                   unchanged                                                        0.00%    0.07%
4.1.1.b   -2                   unchanged                                                        0.00%    0.02%
4.1.1.c   -3                   unchanged                                                       -0.04%    0.05%
4.1.1.d   -2                   unchanged                                                       -0.02%    0.03%
4.1.2.a   -8                   -2                       HMVP table = 10                         0.03%    0.08%
4.1.2.b   -10                  -3                       HMVP table = 10; runtime saving of      0.03%    0.08%
                                                        HMVP functions is 59% (RA), 62% (LB)
4.1.5.a   +0                   +0                                                               0.00%   -0.01%
4.1.5.b   +6                   +1                       Discard pairwise calculation           -0.02%    0.02%
4.1.5.c   +6                   +1                       Discard pairwise calculation           -0.02%   -0.02%
It was remarked that the table above should be updated to get better understanding of the properties. 4.1.1
and 4.1.2 are modifying HMVP. In particular, the number of maximum pruning operations and
comparison operations necessary during table construction and during merge should be listed separately.
Number of cycles should also be calculated rather than the number of pruning stages. It is reported that 4.1.2
does not do any pruning during table construction (using a FIFO list, but with the length extended from 5 to
10) and also reduces the number of pruning operations in merge list construction. 4.1.1 keeps the
HMVP list construction unchanged, but reduces the pruning operations in merge list construction.
More analysis later became available (see under the discussion of the adoption of JVET-M0126 above).
Current VTM3: Merge list size 6, Max number of potential candidates 18, Max number of candidate
comparisons 15, Max number of pruning stages 8-9(?)
Test# | Merge list size | Max number of potential candidates | Max number of candidate comparisons | Max number of pruning stages | Max number of MV scaling | Others | RA | LB
4.2.1.a | 6 | +1 | 0 | 0 | 0 | | -0.09% | 0.00%
4.2.2.a | 6 | +1 | +4 | +1 | +0 | | -0.11% | 0.00%
4.2.2.b | 8 | +1 | +4 | +1 | +0 | | -0.17% | -0.06%
4.2.2.c | 6 | +2 | +9 | +2 | +0 | | -0.15% | 0.00%
4.2.2.d | 8 | +2 | +9 | +2 | +0 | | -0.23% | -0.06%
4.2.3.a | 6 | | | | | | 0.00% | 0.11%
4.2.4.a | 6 | | | | | | -0.03% | 0.16%
4.2.5.a | 6 | | | | | | 0.07% | 0.03%
4.2.5.b | 6 | | | | | | 0.02% | -0.07%
Similar concepts (shared merge of JVET-M0170, parallel merge of JVET-M0289) have been used in HEVC, where parallel merge is regarded as mostly beneficial for parallel processing at the encoder, whereas shared merge could also have some benefit for the decoder, as it is not necessary to generate merge lists separately for very small blocks (however, only if the sharing is made mandatory, which is not the case in HEVC). However, because VVC has more irregular (non-square) block structures, a consistent definition of the regions which use shared merge / parallel merge becomes more difficult. For example, if the regions are square, it may happen that a rectangular block is only partially included. 4.3.1 and 4.3.2 S3 try to solve that by introducing non-square regions below some parent node of the tree. Both approaches require normative changes. Making the decoder more complex for the benefit of a parallel encoder may not be as desirable as it was in HEVC, in particular as the partitioning is more irregular due to non-square blocks. Reducing decoder complexity by sharing merge lists between very small blocks (e.g. two adjacent 4x4 blocks using the same merge list) seems relatively simple to define and would be beneficial if it is normative. Results from test 4.3.1a above are from a version called “type1” in JVET-M0170, whereas
“type2” (that is also described in JVET-M0170) with threshold 32 would be the desirable solution. Type2 is understood such that if, after a split, any block is smaller than the threshold, all blocks of that split share the same merge list (e.g. 16x4 with ternary split or 8x8 with quad split). This needs to be mandatory. Reportedly, this comes with no loss for luma, and some negligible loss (0.03/0.06%) for chroma. This was being cross-checked in JVET-M0584 (“supplementary test D1”), which was not yet finished.
This was revisited after the cross-check was finished. Cross-checkers had been asked to investigate the
code and associated specification. The specification text should be simplified such that the threshold is
not signalled, but always used as 32. X. Li is also asked to inspect the code and text.
It was later confirmed (Tue 15 Jan afternoon) by the cross-checkers and X. Li that everything is
consistent as requested above. Text is available in v4 of JVET-M0170.
Decision: Adopt JVET-M0170 (type 2, draft text “…type2sharing” of Jan. 12)
Further study was encouraged on the aspects that simplify the encoder, in particular by investigating the potential of non-normative solutions. It is mentioned that at the time parallel merge was introduced in HEVC, a corresponding non-normative solution would have lost about 2% in compression performance.
CE4.4: Motion vector coding
Test# Source Description
4.4.1.a JVET-M0403 MVD is not signalled as x/y components but as layer and index: Two layer-groups.
4.4.1.b JVET-M0403 MVD is not signalled as x/y components but as layer and index: Four layer-groups.
4.4.3 JVET-M0481 Symmetrical MVD mode: BiDirPredFlag, RefIdxSymL0 and RefIdxSymL1 are derived at slice level; only mvp_l0_flag, mvp_l1_flag and MVD0 are explicitly signalled; MVD1 = -MVD0.
4.4.1: The claimed benefit is that the number of context-coded bins in MVD coding is reduced from 4 to 1. This is achieved by joint coding of the x and y differences, where the “layer” is the sum of x and y, and the index is an address within a layer. The index is coded as truncated binary (the number of indices depends on the layer). In total, the number of bits for expressing the layer and the index is not changed relative to independent coding of the MVD.
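As a rough illustration of the layer/index idea (not the normative JVET-M0403 derivation; the scan order of positions within a layer is an assumption made here for illustration), an MVD can be mapped to a layer and an index as follows:

```python
# Hypothetical sketch: map an MVD (x, y) to (layer, index), where the
# layer is |x| + |y| and the index enumerates the 4*layer positions on
# that layer. The enumeration order below is illustrative only.

def mvd_to_layer_index(x, y):
    layer = abs(x) + abs(y)
    if layer == 0:
        return 0, 0
    points = []
    for dx in range(-layer, layer + 1):
        dy = layer - abs(dx)
        points.append((dx, dy))
        if dy != 0:
            points.append((dx, -dy))   # 4*layer points in total
    # The index would be coded as truncated binary over len(points) symbols.
    return layer, points.index((x, y))
```

Layer 0 contains only the (0, 0) MVD, which matches the two-layer-group variant of test 4.4.1a where the first group is just the zero MVD.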
Only the MVD coding part was changed, motion estimation, RDO etc. were identical.
4.4.1a is the simpler solution, difference in compression performance is marginal.
It was initially planned in Track B to adopt JVET-M0403 (test 4.4.1a, 2 layer groups, where the first
group is just the 0,0 MVD).
Specification text was available.
In the Sunday plenary it was discussed that the benefit of the context coding reduction in MVD coding is rather marginal and might not justify changing a well-established scheme. Further, in the context of later discussion related to CE8, it was detected that this might cause coding efficiency problems with CPR. Furthermore, it would be more difficult to check in the layer/index representation whether the CPR range constraints are valid. It was therefore agreed to revert the decision above (Thu. 17 Jan. afternoon).
4.4.3: The compression gain of symmetric MVD coding is 0.16% (was approx. 0.6% before), possibly reduced by the fact that VTM3 includes MMVD, which also uses symmetric coding, as well as other elements of improved MV coding. Encoding time increases by 5%; it is very simple for the decoder.
Some concern is expressed that not all of the current gain may be retained when MMVD would be further
improved.
It is noted that the table above does not consider memory access patterns, only pixel-level access.
Further study on these aspects is needed. There are also non-CE proposals which use other methods (e.g.
shorter interpolation filters, integer pel, padding, …) which could be considered for reducing memory
access.
Establish BoG (K. Zhang) to review the CE4 related proposals, and suggest aspects to be studied in CE.
Consider either significant complexity reduction without losing compression performance, or proposals
with significant improvement of compression without increasing complexity. If there are proposals that
are closely related to proposals that were investigated in CE (e.g. some beneficial encoder optimization,
or minor syntax change with benefit, or further complexity reduction), these should be reported to Track
B for possible consideration for adoption.
For the spatial candidates, the first and second candidates in the current merge candidate list are used.
For the temporal candidate, the same position as the VTM / HEVC collocated position is used.
If three candidates whose reference indices are equal to zero are available, the following applies:
mvLX[0] = (mvLX_A[0] * 3 + mvLX_L[0] * 3 + mvLX_C[0] * 2 ) / 8
mvLX[1] = (mvLX_A[1] * 3 + mvLX_L[1] * 3 + mvLX_C[1] * 2 ) / 8
If two motion vectors whose reference indices are equal to zero are available, the following applies:
mvLX[0] = (mvLX_A[0] + mvLX_C[0] ) / 2
mvLX[1] = (mvLX_A[1] + mvLX_C[1] ) / 2
or
mvLX[0] = (mvLX_B[0] + mvLX_C[0] ) / 2
mvLX[1] = (mvLX_B[1] + mvLX_C[1] ) / 2
Note: If the temporal candidate is unavailable, the STMVP mode is off.
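The averaging rules above can be sketched as follows (a simplified illustration; the exact rounding of the divisions is omitted and integer floor division is used instead):

```python
# Sketch of the STMVP averaging rules above. mv_a / mv_l are the two
# spatial candidates and mv_c the temporal (collocated) candidate, each
# an (x, y) pair with reference index 0, or None if unavailable.

def stmvp_candidate(mv_a, mv_l, mv_c):
    if mv_c is None:
        return None                    # temporal unavailable -> STMVP off
    if mv_a is not None and mv_l is not None:
        # three candidates: weights 3/8, 3/8, 2/8
        return tuple((a * 3 + l * 3 + c * 2) // 8
                     for a, l, c in zip(mv_a, mv_l, mv_c))
    spatial = mv_a if mv_a is not None else mv_l
    if spatial is None:
        return None
    # two candidates: plain average of the spatial and temporal MVs
    return tuple((s + c) // 2 for s, c in zip(spatial, mv_c))
```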
JVET-M0060 CE4: Enhanced Merge with MVD (Test 4.4.4) [T. Hashimoto, E. Sasaki,
T. Ikai (Sharp)]
Directional candidates
Directional candidates are used in addition to horizontal / vertical candidates of VTM-3.0. The directional
candidates are shown as below.
Motion direction (proposal)
Direction IDX 000 001 010 011 100 101 110 111
x-axis +2 –2 0 0 +1 -1 -1 +1
y-axis 0 0 +2 –2 +1 -1 +1 -1
Note: the values of the x-axis and y-axis of the diagonal directions are half of those of the horizontal and vertical directions, respectively.
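The table can be read as a direct index-to-offset mapping, transcribed here for clarity (the dictionary name is illustrative):

```python
# Direction index (3 bits) -> (x, y) offset, transcribed from the table
# above. Diagonal offsets have half the magnitude of axis-aligned ones.
MMVD_DIRECTIONS = {
    0b000: (+2, 0),  0b001: (-2, 0),
    0b010: (0, +2),  0b011: (0, -2),
    0b100: (+1, +1), 0b101: (-1, -1),
    0b110: (-1, +1), 0b111: (+1, -1),
}
```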
JVET-M0106 CE4: STMVP without scaling (tests 4.2.2) [F. Le Léannec, T. Poirier,
F. Galpin (Technicolor)]
Test 4.2.2.a
The proposed STMVP technique is based on JVET-L0207. It constructs the STMVP merge candidate as follows.
One top-neighbouring motion vector (MV) of current CU is selected among spatial position (w/2,-1),
(2*w,-1) and (w-1,-1). The first found position where all available MVs use the reference picture
index 0 is selected.
One left-neighbouring motion vector (MV) of current CU is selected among spatial position (-1,h/2),
(-1,2*h) and (-1,h-1). The first found position where all available MVs use the reference picture index
0 is selected.
The same temporal MV predictor as in JVET-L0207 is used. The temporal candidate is set as always
available, given that the Right-Bottom position of the current block is inside the picture. The y-
position of the temporal candidate is clipped to ensure it lies inside the current CTU row.
The STMVP candidate is considered available if at least 2 spatial-temporal MV predictors are found. If only one neighbouring MV is retrieved, no STMVP candidate is included in the merge list.
No MV scaling is applied.
The STMVP candidate is computed as the average between the 2 or 3 retrieved spatial and temporal
MV predictors.
Test 4.2.2.c
JVET-M0170 CE4.3.1: Shared merging candidate list [C.-C. Chen, Y.-C. Lin, M.-S.
Chiang, C.-W. Hsu, T.-D. Chuang, C.-Y. Chen, Y.-W. Huang, S.-M. Lei
(MediaTek)]
It is proposed to share the same merging candidate list for all leaf CUs of one ancestor node in the CU
split tree for enabling parallel processing of small skip/merge-coded CUs. The ancestor node is named
merge sharing node. The shared merging candidate list is generated at the merge sharing node pretending
the merge sharing node is a leaf CU.
There are 2 types of size threshold definitions, which are denoted as Type-1 and Type-2 definitions. For
Type-1 definition, the merge sharing node will be decided for each CU inside a CTU during parsing stage
of decoding; moreover, the merge sharing node is the largest ancestor node among all the ancestor nodes
of the leaf CUs satisfying the following two criteria.
1) The merge sharing node size is equal to or smaller than the size threshold
2) No samples of the merge sharing node are outside the picture boundary.
For Type-2 definition, the merge sharing node will be decided for each CU inside a CTU during parsing
stage of decoding; moreover, the merge sharing node is an ancestor node of leaf CU which must satisfy
the following 2 criteria:
1) The merge sharing node size is equal to or larger than the size threshold
2) In the merge sharing node, one of the child CU size is smaller than the size threshold
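A minimal sketch of the Type-2 criteria, assuming a node is characterized by its size in samples and the sizes of its child CUs after the split (this representation and the function name are illustrative assumptions, not the spec text):

```python
def is_type2_merge_sharing_node(node_size, child_sizes, threshold=32):
    """Type-2 rule: the node is at least the size threshold, and at least
    one child CU produced by the split falls below the threshold."""
    return (node_size >= threshold
            and any(c < threshold for c in child_sizes))
```

With threshold 32 (the value fixed in the adopted type-2 variant), a ternary split of a 16x4 node or a quad split of an 8x8 node would make that node a merge sharing node, since it produces child CUs smaller than 32 samples.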
The proposed shared merging candidate list algorithm supports translational merge (including merge
mode and triangle merge mode, history-based candidate is also supported) and subblock-based merge
mode. For all kinds of merge mode, the behavior of shared merging candidate list algorithm looks
basically the same, and it just generates candidates at the merge sharing node pretending the merge
sharing node is a leaf CU.
Besides, it is proposed to add new syntax elements to sequence parameter set (SPS) and picture parameter
set (PPS).
Notes from the BoG report review: JVET-M0507 has two aspects, where the aspect of removing the
clipping for the shared merge list might also be beneficial on top of JVET-M0170. It is reported to come
at no coding loss. The proponents of JVET-M0507 discussed with the proponents of JVET-M0170 that the check for one of the subblocks being outside of the picture could be removed, whereas another check whether the CU centre is still inside needed to be added, to make it consistent with other boundary check conditions in VVC. Proponents of JVET-M0170 were to make an update on this aspect.
JVET-M0221 CE4: STMVP simplification (test 4.2.3a) [Y.-H. Chao, Y. Han, W.-J. Chien,
M. Karczewicz (Qualcomm)]
In this proposal, four modifications are proposed to reduce the complexity of the non-sub-PU STMVP in JVET-L0399, as follows:
1. No MV scaling for the two spatial neighbours.
JVET-M0281 CE4: Inter motion predictor pruning (test 4.1.5) [A. Robert, F. Le Léannec,
T. Poirier, F. Galpin (Technicolor)]
In AMVP, a motion predictor is a motion vector dedicated to a particular reference picture, and a list must
contain 2 motion predictors. The process is repeated for each reference frame of each reference frame list.
In VTM3, when performing the AMVR rounding, the ¼-pel rounding should be performed at the same
time. This rounding step takes place just before the existing pruning. But for motion predictors that do not
use AMVR, the needed ¼-pel rounding is performed at the end of the list construction.
This contribution then proposes to perform all the rounding operations before pruning even if AMVR is
off, i.e. either ¼-pel or AMVR and ¼-pel rounding is done before any pruning.
In Merge mode, a motion predictor is one or two motion vector(s) with its associated reference picture,
and a list must contain 6 motion predictors.
The pruning process of the spatial candidates has been simplified, and that of the temporal candidate has been removed. But for pair-wise candidates, no pruning process exists.
This contribution then proposes to perform an early and simple pruning of the pair-wise candidates. For each used pair (A, B), the pair-wise candidate is not calculated:
If both A and B are bi-directional and both motion vectors are equal,
If both A and B are uni-directional, or only A is bi-directional, and the motion vectors of the common list are equal,
If only B is bi-directional and the motion vectors and reference frame of the common list are equal.
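The three conditions can be sketched as follows, modelling a merge candidate as a dict from list id (0/1) to an (mv, ref_idx) tuple; this representation and the helper name are illustrative assumptions:

```python
def skip_pairwise(a, b):
    """Return True if the pair-wise candidate for (a, b) is not calculated."""
    if len(a) == 2 and len(b) == 2:
        # both bi-directional: both motion vectors equal
        return a[0][0] == b[0][0] and a[1][0] == b[1][0]
    common = set(a) & set(b)
    if not common:
        return False
    lst = common.pop()
    if len(b) == 2:
        # only B bi-directional: MV and reference of the common list equal
        return a[lst] == b[lst]
    # both uni-directional, or only A bi-directional:
    # motion vectors of the common list equal
    return a[lst][0] == b[lst][0]
```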
JVET-M0289 CE4: Parallel Merge Estimation for VVC (Test 4.3.2) [H. Gao, S. Esenlik,
B. Wang, A. M. Kotra, J. Chen (Huawei)]
Variant 1
In the variant 1, the MER is defined as a fixed non-overlapped square grid.
According to the proposal the rule for setting a neighbour unavailable for merge list construction is
changed as follows.
New parallel merge estimation rule: “A neighbour coding block is marked as unavailable if its bottom
right coordinate falls in the extended merge estimation region of the current block”. The extended MER is
depicted in the following figure:
JVET-M0312 CE4: MMVD improvement (test 4.4.5) [J. Li, R.-L. Liao, C. S. Lim
(Panasonic)]
An adaptive distance table is proposed to improve the coding gain of MMVD.
Test 1
Use adaptive distance table based on occurrence-based distance table reordering.
1. An optimal reordered distance table is determined after coding each inter picture. If current
picture is intra picture, base distance table is used as optimal reordered distance table.
2. The optimal reordered distance table of reference picture in list 0 and reference index 0 is used as
distance table for current inter picture.
3. A weighted table is used for determining the optimal reordered distance table, and the weighted table is determined using a gain table.
gain_table[8][8] = {
{ 0, -1, -2, -3, -4, -5, -6, -6},
{ 2, 1, 0, -1, -2, -3, -4, -4},
{ 4, 3, 2, 1, 0, -1, -2, -2},
{ 6, 5, 4, 3, 2, 1, 0, 0},
{ 8, 7, 6, 5, 4, 3, 2, 2},
{10, 9, 8, 7, 6, 5, 4, 4},
{12, 11, 10, 9, 8, 7, 6, 6},
{14, 13, 12, 11, 10, 9, 8, 8} };
gain_table_4K[8][8] = {
{ 4, 3, 2, 1, 0, -1, -2, -2},
{ 6, 5, 4, 3, 2, 1, 0, 0},
{ 8, 7, 6, 5, 4, 3, 2, 2},
{10, 9, 8, 7, 6, 5, 4, 4},
{12, 11, 10, 9, 8, 7, 6, 6},
{14, 13, 12, 11, 10, 9, 8, 8},
{16, 15, 14, 13, 12, 11, 10, 10},
{18, 17, 16, 15, 14, 13, 12, 12} };
Test 2
Use an adaptive distance table based on picture resolution, i.e., if the picture resolution is not larger than 2K (i.e., 1920×1080), the table below is used as the base distance table:
MMVD distance table candidate
Distance IDX 0 1 2 3 4 5 6 7
Pixel distance 1/4-pel 1/2-pel 1-pel 2-pel 4-pel 8-pel 16-pel 32-pel
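The base table follows powers of two, so in quarter-pel units it can be transcribed as:

```python
# Distance index i -> pixel distance of 2**i quarter-pel steps,
# i.e. 1/4-pel, 1/2-pel, 1-pel, ..., 32-pel for i = 0..7.
BASE_DISTANCE_QPEL = [1 << i for i in range(8)]
```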
JVET-M0313 CE4: Motion compensation constraints for complexity reduction (test 4.5.1
and test 4.5.2) [R.-L. Liao, J. Li, C. S. Lim (Panasonic)]
To guarantee that the worst-case memory bandwidth does not exceed that of 8×8 bi-prediction, bi-prediction is disabled for small CUs. For a CU coded in merge mode, the motion vector from the L0 direction of a bi-prediction motion vector candidate is used to predict the CU, which is the same as in HEVC. For a CU coded in inter mode, only uni-prediction is allowed, and the first bin of the inter direction, which indicates uni-prediction or bi-prediction, is not signalled.
JVET-M0403 CE4: Generic Vector Coding of Motion Vector Difference (Tests 4.4.1.a and
4.4.1.b) [S. Paluri, M. Salehifar, S. Kim (LGE)]
Modifications to the algorithm have been carried out to reduce the worst case use of context coded bins
from 24 to 6 by coding MVDx and MVDy components together. MVD (x,y) is coded using a
combination of Layer and index information, wherein the index identifies the MVD(x,y) combination
within a group of MVDs (i.e., a Layer). The two tests differ in their groupings of the layers. In the former, all layers greater than 0 are grouped together, while in the latter, layers greater than or equal to 3 are grouped together and layers 1 and 2 are grouped separately.
JVET-M0481 CE4: Symmetrical MVD mode (Test 4.4.3) [H. Chen, T. Solovyev, H. Yang,
J. Chen (Huawei)]
At the slice level, the variables BiDirPredFlag, RefIdxSymL0 and RefIdxSymL1 are derived by searching the reference pictures in List 0 and List 1.
At the CU level, a symmetrical mode flag indicating whether symmetrical mode is used is explicitly signalled if the prediction direction for the CU is bi-prediction and BiDirPredFlag is equal to 1.
When the flag is true, only mvp_l0_flag, mvp_l1_flag and MVD0 are explicitly signalled. The reference
indices are set equal to RefIdxSymL0, RefIdxSymL1 for list 0 and list 1, respectively. MVD1 is just set
equal to –MVD0. The final motion vectors are derived as follows:
( mvx0, mvy0 ) = ( mvpx0 + mvdx0, mvpy0 + mvdy0 )
( mvx1, mvy1 ) = ( mvpx1 − mvdx0, mvpy1 − mvdy0 )
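The symmetric reconstruction can be sketched as follows (a simplified illustration of the MVD1 = -MVD0 relation, with hypothetical variable names):

```python
def symmetric_mvd_reconstruct(mvp0, mvp1, mvd0):
    """Only MVD0 is signalled; MVD1 is inferred as -MVD0."""
    mv0 = (mvp0[0] + mvd0[0], mvp0[1] + mvd0[1])
    mv1 = (mvp1[0] - mvd0[0], mvp1[1] - mvd0[1])
    return mv0, mv1
```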
It was reported that some of the encoding/decoding time in the table may not be reliable. It was
commented that these tests have similar encoding/decoding time, and variation is in the noise range.
The CABAC engine and the initialization states are tested together.
VTM 3 uses HEVC CABAC engine with the following characteristics:
1 log. state, 7 bits, fixed window size
Coding interval subdivision of 64x4x8 bits LUT
Throughput-optimized software implementations of the configurations in CE5.1 are tested in terms of the
achievable throughput in the decoder. Two types of bin sequences are used for the evaluation:
VTM3: Bin sequences extracted from VTM-3.0 CTC bitstreams
RND: Randomly generated bin sequences
Three contributions report results for CE5.2:
JVET-M0453 (Sharp)
JVET-M0463 (Qualcomm)
JVET-M0762 (HHI)
It was commented that JVET-M0453 and JVET-M0463 reported consistent trends in terms of throughput
numbers for the CE tests. JVET-M0762 was a late contribution that was updated shortly before the CE
review started. It was agreed to review JVET-M0762 for further discussion of this CE. See notes under
JVET-M0762.
JVET-M0172 CE5.1.9: CABAC engine with simplified range sub-interval derivation [T.-D.
Chuang, C.-Y. Chen, Y.-W. Huang, S.-M. Lei (MediaTek)]
JVET-M0413 CE5: Per-context CABAC initialization with single window (Test 5.1.4)
[A. Said, J. Dong, H. Egilmez, Y.-H. Chao, M. Karczewicz, V. Seregin
(Qualcomm)]
JVET-M0453 CE5 on arithmetic coding: experiments 5.1.1, 5.1.2, 5.1.3, 5.1.4, 5.1.5, 5.1.6,
5.1.7, 5.1.8, 5.1.10, 5.1.11, 5.1.12, 5.1.13, 5.2, and more [F. Bossen (Sharp)]
This report provides results for the following CE5 experiments: 5.1.1, 5.1.2, 5.1.3, 5.1.4, 5.1.5, 5.1.6,
5.1.7, 5.1.8, 5.1.10, 5.1.11, 5.1.12, 5.1.13 and 5.2. Additional variants of these experiments are considered
where constants, such as initial state values and shift amounts, are modified.
Version 2 of the document provides additional results for experiment 5.2 on throughput, as well as BD-
rate results for a configuration that yields no encoder run time increase.
The following table reports the decoder throughput from an actual bin stream (first 10 million context-
coded bins from ParkRunning, RA configuration, QP22). Three compilers have been tested: clang 10.0
(Apple LLVM version 10.0.0), gcc 7.4 (Homebrew GCC 7.4.0) and gcc 8.2 (Homebrew GCC 8.2.0). The
test bed was configured with the macro NOBRANCH_MPS either on (left half of table) or off (right half
of table). Results were obtained on an Intel® Xeon® W-2140B CPU @ 3.20GHz.
In almost all cases the combination of clang and NOBRANCH_MPS enabled yielded the highest
throughput. This configuration is thus considered for comparing the various engines.
The rightmost column in the table below reports the decoder throughput for a different bin stream (first 10
million context-coded bins from BQTerrace, RA configuration, QP22, clang, NOBRANCH_MPS
enabled).
Experiments were run a second time, where each test is done 7 times and the highest throughput value is
recorded. While the numbers are slightly higher, the same trends persist.
The table above was copied from JVET-M0453 and then updated in group discussion to form the modified table below. Each category in the table was reviewed, and the tests that provided the highest coding efficiency in each category with good throughput numbers were highlighted for further
focused discussion. The properties, similarities and differences between the tested methods were
discussed.
The following table was obtained by updating the table above as follows: 1) adding 5.1.9, 2) replacing
throughput numbers with the second throughput number table, and 3) keeping only rows that pertain to
the CE tests.
Engine | BD rate: AI, RA, LB, LP | EncT/DecT: AI, RA, LB | Thruput
VTM3 w/ HEVC engine 0 0 0 0 100/100 100/100 100/100 137.83
2 states, fixed window size
5.1.1 -0.85% -0.74% -0.56% -0.56% 106/103 103/102 101/100 124.17
5.1.5 -0.83% -0.64% -0.62% -0.60% 109/107 105/103 103/103 115.84
5.1.8 -0.91% -0.69% -0.48% -0.50% 106/102 103/101 100/98 122.01
5.1.9 -0.79% -0.57% -0.50% N/A 128.67
5.1.10 -0.83% -0.70% -0.55% -0.61% 108/104 104/102 103/101 128.67
5.1.10* + new init from 5.1.2 -0.84% -0.73% -0.51% -0.53% 109/107 105/104 105/104 128.67
2 states, variable window size
5.1.2 -1.00% -0.97% -0.82% -0.76% 109/105 104/103 101/101 116.69
5.1.3 clz + 8x8 table (config. 2) -1.03% -0.96% -0.74% -0.83% 107/108 104/104 104/104 107.96
5.1.3 4x5 mult (config. 1) -0.99% -0.91% -0.71% -0.83% 109/105 104/102 103/102 128.51
5.1.6 8+12 bit state -0.90% -0.76% -0.72% -0.69% 111/105 105/103 105/102 112.69
5.1.6 10+14 bit state -0.97% -0.82% -0.67% -0.73% 112/105 105/102 105/102 112.69*
5.1.11 -0.97% -0.93% -0.75% -0.76% 108/103 103/101 102/101 124.55
5.1.11* + new init from 5.1.2 -1.00% -0.98% -0.77% -0.75% 110/105 104/102 104/102 124.55
5.1.13 -0.95% -0.91% -0.73% -0.74% 109/103 104/101 101/98 128.51
5.1.13* +new init from 5.1.2 -0.98% -0.96% -0.75% -0.73% 110/105 104/103 106/103 128.51
It was suggested to look at the hardware aspect of these different engines, which was provided as part of
JVET-M0025 subtest 3.
It was agreed that no hardware problem had been identified for any of the CE tests, and that such a
hardware problem could be fixed if/when it was identified.
When looking at different categories, it was remarked that the “1 state, variable window” category has the
highest throughput (closest to HEVC engine), and the “2 state, variable window” category has the highest
coding performance. Regarding the “2 state, fixed window” category, it was remarked that this category
does not need custom window size parameters for different context models, but needs re-training of
initialization parameters.
Regarding custom window size parameters, it was remarked that these do not seem to increase complexity to such a degree that it justifies going for a simpler solution (i.e. fixed window size).
It was remarked that some of the coding efficiency gain from the tests that use custom window size
parameters may have come from training of custom window size parameters based on the test set.
It was remarked that the initialization parameters were also trained on the test set, which also may have
provided some of the coding efficiency gain.
In terms of coding efficiency, the options “5.1.3 4x5 mult (config. 1)”, “5.1.13” and “5.1.13* +new init
from 5.1.2” are the most attractive options. Between “5.1.3” and “5.1.13,” the latter has a slight advantage
for hardware implementations due to needing to support fewer shift values. And the difference between
“5.1.13” and “5.1.13* +new init from 5.1.2” is purely due to training of initialization parameters, with the
new initialization parameters providing better coding efficiency.
Decision: Adopt “5.1.13* +new init from 5.1.2”.
It was commented that these graphs contained outliers. It was also commented that the graphs were
uploaded very late (~20 minutes before the CE5 discussion started). It was then agreed to review JVET-
M0453 instead, which provided throughput numbers for each CE test, for the further discussion of this
CE. See notes under JVET-M0453.
Sony
Technicolor
6-1.2a | JVET-M0200 | JVET-L0060: Unified matrix for transform | K. Choi (Samsung) | X. Zhao (Tencent)
6-1.3a | JVET-M0244 | JVET-L0353: MTS using DST-4 and transposed DCT-2 | Y. Lin (HiSilicon) | H. Egilmez (Qualcomm)
6-1.3b | | MTS using DCT-2 like transforms | Y. Lin (HiSilicon) | H. Egilmez (Qualcomm)
6-1.4a | JVET-M0538 | JVET-L0386, JVET-L0682: TAF for 32-pt MTS | A. Said (Qualcomm) | A. Karabutov (Huawei), X. Zhao (Tencent)
6-1.4b | | JVET-L0386, JVET-L0682: TAF for 32-pt and 64-pt MTS | A. Said (Qualcomm) | A. Karabutov (Huawei)
6-1.4c | | JVET-L0386, JVET-L0682: TAF for 16-pt and 32-pt MTS | A. Said (Qualcomm) | A. Karabutov (Huawei), X. Zhao (Tencent)
6-1.4d | | JVET-L0386, JVET-L0682: TAF for 16-pt, 32-pt and 64-pt MTS | A. Said (Qualcomm) | A. Karabutov (Huawei)
6-1.5a | JVET-M0080 | JVET-L0135: TAF simplification for 32-pt MTS | P. Philippe (Orange) | LGE
6-1.5b | | JVET-L0135: TAF simplification for 32-pt and 64-pt MTS | P. Philippe (Orange) | LGE
6-1.5c | | JVET-L0135: TAF simplification for 16-pt and 32-pt MTS | P. Philippe (Orange) | LGE
6-1.5d | | JVET-L0135: TAF simplification for 16-pt, 32-pt and 64-pt MTS | P. Philippe (Orange) | LGE
6-1.6a | JVET-M0521 | JVET-L0395: 4-pt DST-4 and DCT-4 replacing DST-7 and DCT-8 used in MTS | H. Egilmez (Qualcomm) | Y. Lin (HiSilicon)
6-1.6b | | JVET-L0395: 4-pt and 8-pt DST-4 and DCT-4 replacing DST-7 and DCT-8 used in MTS | H. Egilmez (Qualcomm) | Y. Lin (HiSilicon)
6-1.6c | | JVET-L0395: 4-pt, 8-pt and 16-pt DST-4 and DCT-4 replacing DST-7 and DCT-8 used in MTS | H. Egilmez (Qualcomm) | Y. Lin (HiSilicon)
AI RA LB
Test # Doc. # Y U V EncT DecT Y U V EncT DecT Y U V EncT DecT
CE6-1.1a JVET-M0496 -0.17% -0.14% -0.11% 101% 99% -0.06% 0.05% 0.04% 101% 99% 0.02% 0.30% 0.13% 100% 98%
CE6-1.1b JVET-M0496 -0.10% -0.17% -0.18% 102% 99% -0.01% 0.14% 0.05% 100% 100% 0.05% 0.00% 0.01% 100% 97%
CE6-1.1c JVET-M0496 -0.03% -0.09% -0.09% 101% 100% 0.07% 0.12% 0.11% 101% 100% 0.15% 0.20% -0.42% 99% 98%
CE6-1.1d JVET-M0496 -0.04% -0.11% -0.13% 96% 91% 0.06% 0.08% 0.04% 99% 99% 0.10% 0.27% -0.05% 100% 100%
CE6-1.1e JVET-M0084 0.23% 0.03% 0.07% 103% 98% 0.16% 0.30% 0.26% 101% 101% 0.04% 0.22% -0.17% 101% 100%
CE6-1.2a JVET-M0200 -0.06% -0.14% -0.12% 101% 99% 0.07% 0.14% 0.11% 99% 99% 0.17% 0.32% 0.18% 97% 95%
CE6-1.3a JVET-M0244 0.08% 0.11% 0.10% 98% 97% 0.07% 0.25% 0.19% 100% 99% -0.02% 0.32% 0.33% 100% 101%
CE6-1.3b JVET-M0244 0.22% 0.34% 0.32% 97% 93% 0.17% 0.39% 0.36% 99% 99% 0.03% 0.34% 0.10% 100% 101%
CE6-1.4a JVET-M0538 0.00% 0.07% 0.05% 98% 94% -0.01% 0.08% 0.13% 100% 99% -0.02% 0.09% 0.11% 99% 99%
CE6-1.4b JVET-M0538 -0.07% 0.01% 0.00% 98% 94% -0.12% -0.12% -0.22% 100% 98% 0.01% 0.28% 0.01% 99% 97%
CE6-1.4c JVET-M0538 0.12% 0.17% 0.15% 95% 91% 0.04% 0.09% 0.18% 99% 99% 0.04% 0.20% 0.19% 99% 100%
CE6-1.4d JVET-M0538 0.05% 0.09% 0.08% 95% 90% -0.06% -0.11% -0.14% 99% 98% 0.01% -0.03% 0.01% 98% 96%
CE6-1.5a JVET-M0080 0.07% 0.06% 0.06% 96% 88% 0.06% 0.11% 0.11% 99% 99% 0.03% 0.18% 0.16% 100% 100%
CE6-1.5b JVET-M0080 0.02% -0.03% -0.02% 98% 88% -0.03% -0.14% -0.12% 101% 99% -0.01% 0.04% -0.15% 101% 100%
CE6-1.5c JVET-M0080 0.15% 0.09% 0.09% 94% 85% 0.09% 0.19% 0.25% 99% 98% 0.06% 0.16% 0.14% 100% 100%
CE6-1.5d JVET-M0080 0.09% 0.01% 0.02% 96% 85% 0.01% -0.07% -0.07% 101% 98% 0.07% 0.04% 0.05% 101% 100%
CE6-1.6a JVET-M0521 -0.17% -0.14% -0.11% 99% 103% -0.06% 0.05% 0.04% 100% 101% 0.02% 0.30% 0.13% 100% 99%
CE6-1.6b JVET-M0521 -0.12% -0.18% -0.16% 99% 101% -0.02% 0.16% 0.02% 100% 100% 0.02% 0.11% -0.07% 100% 100%
CE6-1.6c JVET-M0521 0.08% -0.06% -0.04% 101% 101% 0.09% 0.14% 0.18% 100% 100% 0.07% 0.00% 0.18% 100% 100%
For low QP
The following table summarizes the results for CE6-1 using low QP configuration and VTM-3.0 as
anchor.
AI RA LB
Test # Doc. # Y U V EncT DecT Y U V EncT DecT Y U V EncT DecT
CE6-1.1a JVET-M0496 0.00% -0.06% -0.06% 99% 100% 0.00% -0.02% -0.01% 100% 100% -0.01% 0.00% -0.01% 101% 101%
CE6-1.1b JVET-M0496 0.24% 0.00% 0.01% 101% 101% 0.06% -0.03% -0.01% 100% 101% 0.02% -0.01% 0.00% 100% 101%
CE6-1.1c JVET-M0496 0.26% 0.03% 0.04% 102% 101% 0.12% 0.04% 0.06% 100% 101% 0.05% 0.06% 0.07% 99% 101%
CE6-1.1d JVET-M0496 0.28% 0.04% 0.04% 99% 100% 0.13% 0.05% 0.06% 100% 100% 0.05% 0.08% 0.06% 100% 102%
CE6-1.1e JVET-M0084 0.41% 0.20% 0.20% 101% 101% 0.11% 0.05% 0.08% 100% 100% 0.04% 0.03% 0.03% 100% 100%
CE6-1.2a JVET-M0200
CE6-1.3a JVET-M0244 0.20% 0.17% 0.18% 99% 100%
CE6-1.3b JVET-M0244 0.21% 0.18% 0.18% 99% 99%
CE6-1.4a JVET-M0538 0.03% 0.04% 0.05% 87% 101% 0.01% 0.04% 0.04% 95% 104% 0.00% 0.01% 0.01% 91% 102%
CE6-1.4b JVET-M0538 0.03% 0.04% 0.05% 87% 102% 0.01% 0.04% 0.05% 95% 105% 0.00% 0.01% 0.01% 91% 101%
CE6-1.4c JVET-M0538 0.13% 0.18% 0.18% 85% 100% 0.05% 0.09% 0.10% 94% 103% 0.02% 0.05% 0.04% 90% 101%
CE6-1.4d JVET-M0538 0.13% 0.18% 0.18% 85% 101% 0.05% 0.09% 0.09% 95% 104% 0.02% 0.04% 0.03% 90% 101%
CE6-1.5a JVET-M0080 0.05% 0.07% 0.07% 98% 100% 0.02% 0.08% 0.07% 99% 100%
CE6-1.5b JVET-M0080 0.05% 0.07% 0.07% 99% 100% 0.02% 0.07% 0.06% 100% 100%
CE6-1.5c JVET-M0080 0.23% 0.28% 0.28% 96% 100% 0.10% 0.16% 0.19% 99% 100%
CE6-1.5d JVET-M0080 0.23% 0.28% 0.28% 97% 100% 0.10% 0.16% 0.19% 99% 100%
CE6-1.6a JVET-M0521 0.00% -0.06% -0.06% 94% 97% 0.00% -0.02% -0.01% 99% 105% -0.01% 0.00% -0.01% 96% 102%
CE6-1.6b JVET-M0521
CE6-1.6c JVET-M0521
Variant 1.6c also tried replacing the 8-pt and 16-pt in addition to the 4-pt and showed worse performance than the anchor. Variants 1.1b and 1.6b also tried replacing the 8-pt in addition to the 4-pt and did not perform as well as replacing only the 4-pt. It was suggested to focus on the “a” variants of 1.1 and 1.6. Variants 1.1a and 1.6a are actually the same as each other, replacing only the 4-pt DST-7/DCT-8 by DST-4/DCT-4; there is a little gain, but it is small (0.06% in RA, 0.17% in AI, mostly for lower resolutions). It was commented that introducing a design inconsistency was undesirable. It was remarked that the DST-4 and DCT-4 are subsets of the 8-point DCT-2, which could save some memory (16 bytes) if implemented to take advantage of that. In terms of the amount of computation, there is no difference – it is just a matter of which numbers are used in a matrix multiply.
A complexity reduction proposal, 1.1e, proposes replacing all sizes (8, 16, and 32 as well as 4). It has a loss of 0.23% for AI and 0.16% for RA (more loss on class A: 0.6% for AI and 0.2-0.3% for RA). Its complexity benefit is reduced storage for the smaller block sizes, since the matrix elements for the DST-4 and DCT-4 of smaller block sizes become subsets of those of larger block sizes of a DCT-2. The implementation is a matrix multiply, so it does not affect cycle counts. The loss was considered excessive.
In Track A, it was initially planned to adopt 1.1a/1.6a, pending review of other things in CE6. It was later
agreed in the plenary on Sunday 13 January not to take this action (see the notes in section 10.1).
The following table summarizes the results for CE6-2 using CTC configuration and VTM-3.0 as anchor.
AI RA LB
Test # Doc. # Y U V EncT DecT Y U V EncT DecT Y U V EncT DecT
CE6-2.1a JVET-M0288 0.00% 0.02% 0.00% 96% 90% -0.01% 0.01% 0.07% 99% 98% 0.02% 0.26% -0.06% 100% 99%
CE6-2.2a JVET-M0372 0.08% 0.05% 0.04% 118% 112% 0.01% 0.07% 0.12% 116% 105% 0.04% 0.22% -0.17% 101% 100%
CE6-2.3a JVET-M0497 0.01% -0.01% 0.01% 96% 91% -0.01% -0.03% -0.02% 100% 100% 0.00% 0.22% -0.22% 99% 101%
The following table summarizes the results for CE6-2 using low QP configuration and VTM-3.0 as
anchor.
AI RA LB
Test # Doc. # Y U V EncT DecT Y U V EncT DecT Y U V EncT DecT
CE6-2.1a JVET-M0288 0.08% 0.08% 0.09% 98% 99%
CE6-2.2a JVET-M0372
CE6-2.3a JVET-M0497 0.02% 0.00% 0.00% 98% 99% 0.01% 0.00% 0.01% 99% 99% 0.00% -0.01% 0.00% 100% 100%
The following table summarizes the results for CE6-2 using CTC w/ Inter MTS configuration and VTM-
3.0 as anchor.
AI RA LB
Test # Doc. # Y U V EncT DecT Y U V EncT DecT Y U V EncT DecT
CE6-2.1a JVET-M0288 0.00% 0.02% 0.00% 96% 90% -0.01% -0.03% 0.03% 97% 98% 0.01% 0.11% -0.19% 97% 96%
CE6-2.2a JVET-M0372 0.08% 0.05% 0.04% 118% 112% 0.03% 0.00% 0.12% 0.07% 0.13% -0.18%
CE6-2.3a JVET-M0497 0.01% -0.01% 0.01% 96% 91% 0.00% -0.04% -0.04% 98% 100% 0.00% 0.18% -0.37% 97% 97%
The following table summarizes the results for CE6-3 using CTC configuration and VTM-3.0 as anchor.
The following table summarizes the results for CE6-3.1 using CTC w/ MTS = 0 configuration for both
test and the VTM-3.0 anchor.
AI RA LB
Test # Doc. # Y U V EncT DecT Y U V EncT DecT Y U V EncT DecT
CE6-3.1a JVET-M0303 -1.66% -2.31% -2.28% 99% 119% -0.75% -0.87% -0.96% 100% 108% -0.09% -0.18% -0.33% 99% 102%
CE6-3.1b JVET-M0303 -1.61% -2.11% -2.12% 101% 111% -0.71% -0.81% -0.84% 100% 102% -0.14% -0.02% -0.33% 100% 100%
6-3.1 is motivated by improving coding efficiency (especially for the case with MTS off). The others in this
category aim to reduce the maximum MTS transform size (eliminating the 32-point transform).
6-3.1 provides substantial coding gain when MTS is disabled.
When MTS is disabled, or when it is enabled but the low-level MTS flag is 0:
The CE6-3.1b variant always uses DCT-2 when the length of the transform is 32 (or 64).
The CE6-3.1a variant uses 2 transforms of length 32 (but only DCT-2 for length 64).
When the low-level MTS flag is 1, the current 4 combinations are selectable (including for length 32).
Decision: Adopt CE6-3.1b, but with an extra high-level flag to use DCT2 always.
See also the notes of the further discussion of this topic in the plenary of Sunday 13 January in section
10.1.
CE6-3.2b/c, CE6-3.7a, and CE6-3.8a/b limit the maximum non-DCT-2 transform size to 16. All of these have a loss of
more than 0.5% for AI in class A, which seemed unacceptable, so no action was taken on these.
CE6-3.2a and CE6-3.3a/b use transform type signalling on only one side when one side is of length 32 or
smaller but the other side is of length 64. If both sides are small, signalling is used in both dimensions. The gain
for these is negligible (no gain for AI, since that case cannot occur there; 0.04% in RA when
compared to the current MTS, 0.03% benefit in LB when compared to the current MTS), and the encoder
runtime is increased by the consideration of these additional cases. So no action was taken on these.
(CE6-3.8a/b also has some usage of signalling for only one direction, but is motivated by trying to
eliminate the long non-DCT2 transforms, as discussed above – it has some compression loss.)
CE6-3.3b has gain from two changes: 1) increasing the maximum BT/TT size for I pictures from 32x32
to 64x64, and 2) the signalling scheme from CE6-3.3a. One of these is just an encoder configuration
setting. The encoding runtime increased by 54% for AI, with a coding efficiency benefit of 0.65%. The
benefit reported for the second modification when using the first modification as an anchor was
0.08% for AI. Since this affects only a case we chose not to use in the CTC and provides only a
small benefit, no action was taken on it.
CE6.4: Sub-block transform:
6-4.1a JVET-M0140: Sub-block Transform (SBT) for inter blocks
  - 1-D split (symmetric or 1/4)
  - if symmetric, signal which half; otherwise the 1/4 part is used
  - transform type of residual TU inferred
  Tester: Y. Zhao (Huawei); Cross-checker: X. Zhao (Tencent)
6-4.1b JVET-M0140: Sub-block Transform (SBT) for inter blocks
  - transform type of residual TU always DCT-2
  Tester: Y. Zhao (Huawei); Cross-checker: C.-M. Tsai (MediaTek)
6-4.1c JVET-M0140: Sub-block Transform (SBT) for inter blocks
  - transform type of residual TU signalled (two transform candidates per TU)
  Tester: Y. Zhao (Huawei); Cross-checker: K. Choi (Samsung)
6-4.1d JVET-M0140: Sub-block Transform (SBT) for inter blocks
  - transform type of residual TU signalled (four transform candidates per TU, like inter MTS but with more complex context modelling)
  Tester: Y. Zhao (Huawei); Cross-checker: M. Ikeda (Sony)
6-4.1e JVET-M0140: Sub-block Transform (SBT) for inter blocks
  - splits of 6-4.1a are allowed, plus quad-tree split (max TU depth still 1)
  - transform type of residual TU inferred
  Tester: Y. Zhao (Huawei); Cross-checker: X. Zhao (Tencent)
6-4.2a JVET-M0523: RQT-like concept of transform sub-block splitting
  - one 4-way QT split
  Tester: Qualcomm; Cross-checker: Y. Zhao (Huawei)
6-4.3a JVET-M0499: RQT-like concept of transform sub-block splitting
  - square blocks have one 4-way QT split
  - rectangular blocks have one long-dimension binary split
  Tester: X. Zhao (Tencent); Cross-checker: Y. Zhao (Huawei)
6-4.4a JVET-M0141: RQT-like concept of transform sub-block splitting
  - 4 choices: binary or ternary split (horizontally or vertically)
  Tester: Y. Zhao (Huawei); Cross-checker: H. Egilmez (Qualcomm)
The following table summarizes the results for CE6-4 using CTC configuration and VTM-3.0 as anchor.
AI RA LB
Test # Doc. # Y U V EncT DecT Y U V EncT DecT Y U V EncT DecT
CE6-4.1a JVET-M0140 -0.47% -0.16% 0.00% 108% 101% -0.83% -0.98% -0.06% 113% 102%
CE6-4.1b JVET-M0140 -0.25% -0.16% -0.07% 107% 101% -0.42% -0.89% -0.04% 111% 101%
CE6-4.1c JVET-M0140 -0.54% -0.20% 0.08% 110% 101% -0.94% -0.75% 0.23% 116% 102%
CE6-4.1d JVET-M0140 -0.52% -0.10% 0.14% 114% 101% -0.90% -0.70% 0.25% 121% 102%
CE6-4.1e JVET-M0140 -0.55% -0.26% 0.00% 109% 101% -0.93% -0.91% 0.07% 115% 102%
CE6-4.2a JVET-M0523 -0.36% 0.09% 0.35% 122% 102% -0.59% -0.65% 0.47% 129% 101%
CE6-4.2a+InterMTS JVET-M0523 -0.62% 0.61% 0.80% 146% 105% -0.94% 0.68% 1.41% 161% 107%
CE6-4.3a JVET-M0499 -0.29% -0.16% -0.17% 192% 107% -0.42% -0.30% -0.10% 143% 103% -0.45% -0.64% -0.01% 142% 104%
CE6-4.4a JVET-M0141 -0.59% -0.35% 0.10% 120% 102% -1.01% -1.77% -0.19% 127% 103%
The following table summarizes the results for CE6-4 using CTC w/ Inter MTS configuration and VTM-
3.0 as anchor.
AI RA LB
Test # Doc. # Y U V EncT DecT Y U V EncT DecT Y U V EncT DecT
CE6-4.1a JVET-M0140 -0.28% -0.21% -0.06% 98% 101% -0.43% -0.72% -0.26% 96% 100%
CE6-4.1b JVET-M0140 -0.12% -0.15% -0.06% 97% 101% -0.20% -0.67% -0.21% 95% 101%
CE6-4.1c JVET-M0140 -0.33% -0.25% 0.03% 100% 100% -0.53% -0.88% -0.28% 98% 101%
CE6-4.1d JVET-M0140 -0.29% -0.14% 0.03% 103% 100% -0.47% -0.64% -0.06% 102% 100%
CE6-4.1e JVET-M0140 -0.34% -0.23% -0.06% 99% 101% -0.55% -0.73% -0.21% 97% 101%
CE6-4.2a JVET-M0523 -0.25% 0.06% 0.24% 120% 103% -0.40% -0.23% 0.31% 124% 101%
CE6-4.3a JVET-M0499
CE6-4.4a JVET-M0141 -0.30% -0.29% 0.01% 107% 101% -0.52% -1.32% -0.54% 105% 102%
In JVET-M0292, test results of CE6-5.1 on reduced secondary transform (RST) are reported. According
to the request of the previous (12th) JVET meeting, all tests were performed without normative change of
the MTS signalling. The tests are defined as follows:
1) Test 1: (A),
2) Test 2: (A) + (B),
3) Test 3: (A) + (B) + (C),
4) Test 4: (A) + (B) + (D), where
Feature  Description
(A)  4 transform sets (instead of 35), 2 transforms per set
(B)  Secondary transform uses at most 8 multiplications/sample
(C)  Secondary transform is disabled for 4x4 TUs
(D)  16x48 matrices are employed instead of 16x64 ones
The transform set is determined from the intra prediction mode, then a syntax flag is sent to select which
kernel in that set is to be applied. The “secondary” (inverse) transform is applied first in the decoding
process and then the ordinary (inverse) transform is applied.
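The selection and ordering described above can be sketched as follows. This is an illustrative decoder-side flow only: the mode-to-set mapping, the kernel contents, and the 16-coefficient size are placeholders, not the actual tables of JVET-M0292.

```python
# Illustrative sketch of the RST decoding flow (feature (A): 4 transform sets,
# 2 kernels per set). The mode grouping below is a hypothetical placeholder.

def rst_set_index(intra_mode):
    # Hypothetical grouping of intra modes into 4 transform sets.
    if intra_mode <= 1:
        return 0            # planar / DC
    elif intra_mode <= 23:
        return 1
    elif intra_mode <= 44:
        return 2
    return 3

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def inverse_transforms(coeffs16, intra_mode, rst_kernel_flag, kernels, primary_inverse):
    """Decoder order: the inverse secondary transform is applied first,
    then the ordinary (primary) inverse transform."""
    set_idx = rst_set_index(intra_mode)          # set derived from intra mode
    kernel = kernels[set_idx][rst_kernel_flag]   # kernel selected by signalled flag
    intermediate = matvec(kernel, coeffs16)      # inverse secondary transform
    return primary_inverse(intermediate)         # inverse primary transform

# Toy check with identity kernels and an identity primary transform.
identity16 = [[1 if i == j else 0 for j in range(16)] for i in range(16)]
kernels = [[identity16, identity16] for _ in range(4)]
coeffs = list(range(16))
out = inverse_transforms(coeffs, intra_mode=30, rst_kernel_flag=1,
                         kernels=kernels, primary_inverse=lambda v: v)
assert out == coeffs
```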
The following table summarizes the results for CE6-5 using CTC configuration and VTM-3.0 as anchor.
AI RA LB
Test # Doc. # Y U V EncT DecT Y U V EncT DecT Y U V EncT DecT
CE6-5.1 JVET-M0292 -1.59% -2.75% -3.22% 128% 98% -0.88% -1.88% -2.27% 108% 100% -0.24% -0.83% -1.19% 104% 101%
(Test 1)
CE6-5.1 JVET-M0292 -1.40% -2.60% -3.09% 128% 97% -0.76% -1.83% -2.25% 108% 100% -0.21% -0.78% -0.85% 104% 98%
(Test 2)
CE6-5.1 JVET-M0292 -1.25% -2.48% -2.88% 125% 97% -0.69% -1.59% -2.03% 107% 100% -0.18% -0.68% -1.11% 103% 100%
(Test 3)
CE6-5.1 JVET-M0292 -1.34% -2.50% -2.95% 129% 97% -0.71% -1.60% -2.02% 107% 100% -0.18% -0.35% -1.00% 104% 99%
(Test 4)
The following table summarizes the results for CE6-5 using CTC w/ Inter MTS configuration and VTM-
3.0 as anchor.
AI RA LB
Test # Doc. # Y U V EncT DecT Y U V EncT DecT Y U V EncT DecT
CE6-5.1 JVET-M0292 -0.91% -1.94% -2.34% 106% 100% -0.29% -0.84% -1.37% 103% 100%
(Test 1)
CE6-5.1 JVET-M0292 -0.80% -1.89% -2.27% 106% 99% -0.24% -0.66% -1.21% 103% 98%
(Test 2)
CE6-5.1 JVET-M0292 -0.74% -1.61% -2.03% 105% 99% -0.22% -0.64% -1.21% 102% 100%
(Test 3)
CE6-5.1 JVET-M0292
(Test 4)
A cross-checker said the gains are about the same regardless of the test sequence class. Chroma gain as
well as luma gain was observed.
JVET-M0079 CE6: MTS size restriction to 16 (test 3.7) [P. Philippe (bcom Orange)]
JVET-M0080 CE6: MTS simplification with TAF (tests 1.5a-d) [P. Philippe (bcom
Orange)]
JVET-M0084 CE6: JVET-L0262: Replacing all DST-7 / DCT-8 by DST-4 / DCT-4 used in
MTS (test 6.1.1e) [K. Abe, T. Toma (Panasonic), M. Ikeda, T. Tsukuba
(Sony), K. Naser, F. Le Léannec, E. François (Technicolor)]
JVET-M0140 CE6: Sub-block transform for inter blocks (Test 6.4.1) [Y. Zhao, H. Gao,
H. Yang, J. Chen (Huawei)]
JVET-M0141 CE6: RQT-like sub-block transform for inter blocks (Test 6.4.4) [Y. Zhao,
H. Gao, H. Yang, J. Chen (Huawei)]
JVET-M0244 CE6: MTS using DST-4 and transposed DCT-2 (test 6-1.3) [Y. Lin, J. Zheng,
Q. Yu, N. Zhang (HiSilicon), C. Zhu (UESTC)]
JVET-M0303 CE6: Shape adaptive transform selection (Test 3.1) [J. Lainema (Nokia)]
JVET-M0319 CE6: MTS for non-square CUs (test 6.3.3) [J. Jung, D. Kim, G. Ko, J. Son,
J. Kwak (Wilus)]
JVET-M0497 CE6: Fast DST-7/DCT-8 with dual implementation support (Test 6.2.3)
[X. Zhao, X. Li, Y. Luo, S. Liu (Tencent)]
JVET-M0498 CE6: MTS up to 16-length (Test 6.3.8) [J. Jung, D. Kim, G. Ko, J.-H. Son,
J. S. Kwak (Wilus), X. Zhao, X. Li, S. Liu (Tencent)]
JVET-M0499 CE6: RQT-like transform sub-block splitting (Test 6.4.3) [X. Zhao, X. Li,
S. Liu (Tencent)]
JVET-M0523 CE6: RQT-like transform partitioning for inter blocks (Test 6.4.2)
[H. Egilmez, V. Seregin, A. Said, M. Karczewicz (Qualcomm)]
In JVET-L0145, three methods to constrain the usage of context-coded bins were proposed. The first
constraint depends on the colour component and the coefficient sub-block size. The second method
relaxes the constraint values according to the last significant sub-block position to improve coding
efficiency. The third modification moves the greater-than-2 flag into the first pass to further improve
the decoding throughput. The details are described as follows.
[3] Include the greater-than-2 flag in the first coding pass
The greater-than-2 flag is proposed to be moved to the first coding pass, after the parity bit. The parsed
greater-than-2 flag is used to calculate locSumAbsPass1 in the context modelling of sig_coeff_flag,
par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag. The number of context-coded bins reserved for the
greater-than-2 flag in the second coding pass is also merged into the number of context-coded bins for the
first coding pass. Moving the greater-than-2 flag before the parity bit will also be evaluated.
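The structure of a budget-constrained first pass can be sketched as below. This is only an illustration of the mechanism (four context-coded flags per coefficient in the first pass, shared bin budget, bypass fallback once the budget is exhausted); the flag derivations and the budget value are simplified placeholders, not the exact VVC semantics.

```python
# Sketch of a first coefficient-coding pass with the greater-than-2 flag moved
# into it, under a shared context-coded-bin budget. Flag thresholds below are
# illustrative; the exact derivations in the VVC draft differ in detail.

def first_pass(levels, ctx_bin_budget):
    ctx_bins_used = 0
    flags = []
    for level in levels:
        # Up to four context-coded bins per coefficient in the first pass:
        # sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag.
        if ctx_bins_used + 4 <= ctx_bin_budget:
            sig = 1 if level != 0 else 0
            ctx_bins_used += 1                      # sig_coeff_flag
            par = gt1 = gt2 = 0
            if sig:
                par = (abs(level) - 1) & 1          # parity bit
                gt1 = 1 if abs(level) > 2 else 0    # "greater than 1" flag
                gt2 = 1 if abs(level) > 4 else 0    # moved into the first pass
                ctx_bins_used += 3
            flags.append((sig, par, gt1, gt2))
        else:
            # Budget exhausted: remaining levels are bypass-coded directly.
            flags.append(None)
    return flags, ctx_bins_used

flags, used = first_pass([0, 3, 1], ctx_bin_budget=100)
assert used == 9 and flags[1] == (1, 0, 1, 0)
```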
Specification of tests:
Test 7.1: Aspect [1] (reduces worst-case context-coded bins from 2 per coefficient to ~1.6)
Test 7.2: Aspects [1] + [2] (improves coding efficiency relative to 7.1)
Test 7.3: Aspects [1] + [2] + [3] (reduces the scan passes relative to 7.2)
Test 7.4: Aspect [3] (reduces the scan passes without including other changes)
Average test results for CE7: the table shows results for a high-complexity encoder configuration (same
as CTC) as well as a low-complexity encoder configuration (dependent quantization and RDOQ
disabled). All results are relative to VTM-3.0 (using the same configuration as the tested approach).
JVET-M0173 CE7 (Tests 7.1, 7.2, 7.3, and 7.4): Constraints on context-coded bins for
coefficient coding [T.-D. Chuang, S.-T. Hsiang, Z.-Y. Lin, C.-Y. Chen, Y.-W.
Huang, S.-M. Lei (MediaTek)]
JVET-M0028 CE8: Summary Report on Screen Content Coding [X. Xu, Y.-C. Chao, Y.-C.
Sun, J. Xu]
Subtests:
8.1: CPR related
8.2: Palette related
8.3: Block-based DPCM
8.1.1a JVET-M0332 (J. Nam, LGE): Block vector prediction for merge mode and AMVP mode using default positions. Cross-checker: X. Xu (Tencent)
8.1.1b JVET-M0332 (J. Nam, LGE): Block vector prediction for merge mode and AMVP mode; consider alternative candidates with various positions. Cross-checker: A. Karabutov (Huawei)
8.1.3 JVET-M0474 (L. P. Van, Qualcomm): CPR using extended search range with line buffers from the top of the CTU and the left columns of the current CTU
8.2.1 JVET-M0050 (Y.-H. Chao, Qualcomm; Y.-C. Sun, Alibaba): Palette mode as in HEVC SCC, CE base software. Cross-checker: R. Chernyak (Huawei)
8.2.2 JVET-M0051 (Y.-C. Sun, Alibaba): Palette mode and intra mode combination. Cross-checker: Y.-W. Chen (Kwai)
8.2.4 JVET-M0456 (J. Ye, Tencent): Apply palette mode on a separate chroma CU only when the corresponding luma samples are all coded in palette mode. Cross-checker: Y.-C. Sun (Alibaba)
8.2.5a JVET-M0052 (R. Chernyak, Huawei; Y.-H. Chao, Qualcomm; Y.-C. Sun, Alibaba): Separate palette coding for luma and chroma. Sub test 1: palette coding is applied separately on luma and chroma components in both intra and inter slices when dual tree is disabled in configuration (SPS). Cross-checker: J. Ye (Tencent)
8.2.5b JVET-M0052 (R. Chernyak, Huawei; Y.-H. Chao, Qualcomm; Y.-C. Sun, Alibaba): Separate palette coding for luma and chroma. Sub test 2: palette coding is applied separately on luma and chroma components in both intra and inter slices when dual tree is enabled in configuration (SPS). Cross-checker: J. Ye (Tencent)
8.2.6 JVET-M0457 (J. Ye, Tencent): Palette predictor list enhancement by using palette predictors from previously coded blocks. Cross-checker: Y.-C. Sun (Alibaba)
8.3.1b JVET-M0057 (F. Henry, Orange): Same as 8.3.1a with vertical and horizontal predictors (in a way similar to RDPCM), allowing the reconstruction to proceed line by line / column by column and thus increase throughput. Cross-checker: B. Bross (HHI)
Results compared to CTC (without CPR), all CE results have CPR enabled
AI RA
Test# Y U V EncT DecT Y U V EncT DecT
CTC overall:
VTM+CPR -0.27% -0.40% -0.33% 137% 100% 0.09% -0.01% 0.05% 100% 100%
CE8.1.1a -0.34% -0.43% -0.41% 144% 104% 0.05% -0.06% 0.00% 101% 101%
CE8.1.1b -0.34% -0.41% -0.37% 144% 104% 0.05% 0.00% 0.05% 101% 102%
CE8.1.2a -0.35% -0.46% -0.42% 142% 102% 0.05% -0.04% -0.03% 100% 99%
CE8.1.2b -0.31% -0.41% -0.35% 143% 98% 0.07% -0.04% 0.04% 99% 98%
CE8.1.2c -0.22% -0.30% -0.30% 139% 100% 0.11% -0.07% 0.08% 99% 99%
CE8.1.2d -0.39% -0.48% -0.48% 143% 99% 0.05% 0.01% 0.02% 99% 99%
CE8.1.3 -0.29% -0.37% -0.34% 137% 103% 0.08% 0.01% 0.02% 100% 100%
CE8.2.1 -0.19% -0.26% -0.26% 146% 105% 0.20% 0.07% 0.20% 108% 104%
CE8.2.1* -0.36% -0.65% -0.81% 142% 100% 0.16% -0.30% -0.22% 107% 102%
CE8.2.2 -0.19% -0.24% -0.26% 147% 105% 0.22% 0.07% 0.18% 106% 103%
CE8.2.3 -0.19% -0.26% -0.25% 142% 104% 0.20% 0.05% 0.15% 100% 99%
CE8.2.4 -0.21% -0.30% -0.26% 141% 105% 0.20% 0.10% 0.22% 101% 99%
CE8.2.5a -0.36% -0.63% -0.81% 142% 100% 0.15% -0.32% -0.24% 106% 101%
CE8.2.5b -0.19% -0.26% -0.26% 146% 105% 0.20% 0.05% 0.13% 105% 103%
CE8.2.6 -0.19% -0.27% -0.26% 143% 104% 0.20% 0.08% 0.18% 101% 100%
CE8.3.1a -0.24% -0.42% -0.36% 134% 104% 0.11% 0.03% 0.09% 101% 102%
CE8.3.1b -0.34% -0.37% -0.32% 137% 106% 0.07% -0.01% 0.09% 102% 102%
CE8.3.2 -0.34% -0.37% -0.33% 137% 106% 0.07% -0.01% 0.08% 102% 102%
Class F:
VTM+CPR -12.09% -12.04% -12.10% 158% 99% -9.89% -9.92% -9.95% 107% 99%
CE8.1.1a -12.42% -12.24% -12.27% 161% 104% -10.10% -9.99% -10.12% 108% 102%
CE8.1.1b -12.41% -12.37% -12.34% 162% 106% -10.07% -9.94% -10.16% 108% 102%
CE8.1.2a -14.47% -14.32% -14.26% 175% 103% -11.62% -11.63% -11.49% 109% 98%
CE8.1.2b -13.66% -13.52% -13.49% 172% 97% -11.04% -11.05% -11.07% 109% 97%
CE8.1.2c -12.11% -12.04% -11.94% 164% 97% -9.88% -9.89% -10.04% 108% 98%
CE8.1.2d -15.34% -15.20% -15.15% 178% 96% -12.34% -12.28% -12.22% 109% 97%
CE8.1.3 -12.21% -12.14% -12.19% 159% 104% -9.96% -10.10% -9.99% 106% 101%
CE8.2.1 -15.51% -14.05% -14.13% 174% 97% -11.90% -11.77% -11.99% 119% 103%
CE8.2.1* -13.16% -13.63% -13.62% 159% 96% -10.48% -11.17% -11.20% 118% 102%
CE8.2.2 -15.53% -14.03% -14.11% 174% 97% -11.94% -11.72% -11.94% 119% 103%
CE8.2.3 -15.50% -14.03% -14.09% 173% 100% -11.90% -11.72% -11.92% 110% 98%
Results compared to CTC (without CPR), all CE results have CPR+PLT enabled
AI (over VTM-3.0) RA (over VTM-3.0)
Test# Y U V EncT DecT Y U V EncT DecT
CTC overall:
VTM+CPR+PLT -0.19% -0.26% -0.26% 145% 103% 0.20% 0.07% 0.20% 105% 102%
CE8.1.1a -0.25% -0.32% -0.36% 151% 104% 0.17% 0.06% 0.05% 106% 102%
CE8.1.1b -0.24% -0.30% -0.30% 151% 105% 0.18% 0.09% 0.04% 105% 101%
CE8.1.2a -0.26% -0.33% -0.37% 149% 101% 0.19% 0.11% 0.17% 104% 99%
CE8.1.2b -0.22% -0.31% -0.30% 152% 104% 0.21% 0.05% 0.15% 106% 102%
CE8.1.2c -0.14% -0.19% -0.21% 150% 105% 0.23% 0.07% 0.17% 106% 102%
CE8.1.2d -0.30% -0.37% -0.38% 152% 105% 0.18% 0.08% 0.12% 105% 102%
CE8.1.3 -0.19% -0.31% -0.25% 144% 103% 0.21% 0.05% 0.16% 104% 100%
CE8.2.1 -0.19% -0.26% -0.26% 146% 105% 0.20% 0.07% 0.20% 108% 104%
CE8.2.2 -0.19% -0.24% -0.26% 147% 105% 0.22% 0.07% 0.18% 106% 103%
CE8.2.3 -0.19% -0.26% -0.25% 142% 104% 0.20% 0.05% 0.15% 100% 99%
CE8.2.4 -0.21% -0.30% -0.26% 141% 105% 0.20% 0.10% 0.22% 101% 99%
CE8.2.5a -0.36% -0.63% -0.81% 142% 100% 0.15% -0.32% -0.24% 106% 101%
CE8.2.5b -0.19% -0.26% -0.26% 146% 105% 0.20% 0.05% 0.13% 105% 103%
CE8.2.6 -0.19% -0.27% -0.26% 143% 104% 0.20% 0.08% 0.18% 101% 100%
CE8.3.1a -0.16% -0.30% -0.32% 170% 116% 0.24% 0.07% 0.17% 141% 122%
CE8.3.1b -0.27% -0.29% -0.24% 141% 109% 0.18% 0.14% 0.24% 104% 105%
CE8.3.2 -0.27% -0.28% -0.24% 141% 109% 0.18% 0.15% 0.26% 105% 106%
Results compared to CTC (without CPR), all CE results have PLT enabled
AI (over VTM-3.0) RA (over VTM-3.0)
Test# Y U V EncT DecT Y U V EncT DecT
CTC overall:
VTM+PLT 0.09% 0.10% 0.06% 107% 101% 0.14% 0.20% 0.09% 106% 102%
CE8.2.1 0.09% 0.10% 0.06% 107% 101% 0.14% 0.20% 0.09% 106% 102%
CE8.2.1* 0.07% 0.07% 0.01% 108% 101% 0.14% 0.09% 0.08% 106% 101%
CE8.2.2 0.08% 0.09% 0.11% 108% 103% 0.14% 0.18% 0.10% 106% 102%
CE8.2.3 0.08% 0.10% 0.07% 106% 101% 0.12% 0.17% 0.13% 100% 98%
CE8.2.4 0.06% 0.07% 0.05% 106% 102% 0.13% 0.16% 0.10% 100% 98%
CE8.2.5a 0.07% 0.09% 0.03% 104% 98% 0.13% 0.08% 0.10% 104% 99%
8.1.x
The following question was discussed: should CTC enable CPR for class F? This would be realistic for the
case where the encoder knows whether the content is screen content or natural content.
The methods of CE8.1.2 provide an additional gain of >3% for class F and 8% for the TGM class. These
re-use existing data from the previous CTU (assuming 2 or 4 buffers of size 64x64 each, depending on the
version). This was generally agreed to be practical and to give good benefit. It could, however, be
complicated to specify as an encoder/bitstream restriction that the limits of CPR vectors are valid.
Shan Liu and other interested experts were asked to inspect the specification text (all 4 versions a…d),
and it was reported back that version a was the most practical solution.
Decision: Adopt JVET-M0407 (variant a).
The methods of CE8.1.1 provide additional gain (0.4% for class F, 0.9% for class TGM), but modify the AMVP
and merge list construction. The question was raised whether VVC would need to use exactly the same
principle for CPR and normal MV coding. This would probably be acceptable if it does not deviate too much and
does not require much additional processing. In HEVC SCC, it was required to have exactly the same
process for CPR and MV coding, which may not be the case for VVC.
See the notes of further discussion in the JVET Sunday plenary about the general design limitations that
would be imposed on CPR vector coding.
8.1.3 uses the line/column buffers at the CTU boundary (which are already there for intra prediction) to extend
CPR by one line/column across the CTU boundary. It gives 0.1% for class F and 0.2% for TGM. Part of
that gain (as far as the left boundary is concerned) probably overlaps with 8.1.2.
A combination on top of 8.1.2d was reported in JVET-M0878. The additional benefit seemed
very small, so no action was taken on it.
8.2.x
Current VVC does not include a palette mode; however, a kind of “baseline” exists, which is the HEVC palette
plus dual tree. This provides roughly 3%/7.5% gain over VVC+CPR for classes F/TGM. Results from the CE
also indicate that this gain drops to 2.5%/4.5% when combined with the improved CPR from 8.1.2. It
may be even less when other aspects such as transform skip come into play. Such a relatively low gain
might not justify adding the tool.
8.3.x
These approaches perform sample-wise DPCM as an additional concept that demonstrates some benefit
for screen content types. The most viable approach (according to its proponents) is 8.3.2, which can best be
optimized in terms of throughput and also shows the best compression. The gain over VTM+CPR is
3.7%/4.9% for classes F/TGM, and the gain over VTM+CPR+PLT is 1.3%/1.4%, respectively. The encoder
runtimes are significantly faster when these methods are used, since an early termination approach is
employed (not searching other intra modes when the prediction works well).
The current method uses a maximum of 12 context coded bins per sample, which is much too large.
Furthermore, this would be yet another alternative prediction method (and building block) which needs to
be implemented in parallel with existing ones, which needs some justification in terms of compression
performance to be included.
Could the residual coding be unified with the existing approach of VVC? In that case, the method could
just be seen as another intra prediction mode with transform skip.
It was also asked how the current residual coding method would perform in the low QP range.
Further study was considered necessary.
The subsequent notes contain descriptions of technology which were copied from JVET-M0028. Actions
taken are noted above.
JVET-M0051 CE8: Palette Mode and Intra Mode Combination (test 8.2.2) [Y.-C. Sun,
J. Lou (Alibaba)]
In this test, a method combining palette mode and intra prediction is tested. The decoder first derives
the prediction block based on the intra prediction information. Then, the decoder decodes a palette and an
index map. Using the decoded palette information, the decoder refines the prediction block and
reconstructs the block.
JVET-M0052 CE8: Separate Palette Coding for Luma and Chroma (test 8.2.5) [Y.-C. Sun,
J. Lou (Alibaba), Y.-H. Chao, H. Wang, V. Seregin, M. Karczewicz
(Qualcomm), R. Chernyak, S. Ikonin, J. Chen (Huawei)]
In the palette anchor (test CE8.2.1), when dual tree is enabled in the configuration (SPS), palette coding is
applied separately on luma tree and chroma tree in intra slices and jointly on luma/chroma in inter slices.
When dual tree is disabled in the configuration, palette mode is applied jointly for both luma and chroma
in all slice types.
In this test, separate palette coding for the luma/chroma components is investigated, based on the same
palette coding functions as in the anchor software:
Sub test 1: palette coding is applied separately on luma and chroma components in both intra and inter
slices when dual tree is disabled in configuration (SPS).
Sub test 2: palette coding is applied separately on luma and chroma components in both intra and inter
slices when dual tree is enabled in configuration (SPS).
JVET-M0056 CE8: BDPCM with LOCO-I and independently decodable areas (test 8.3.1a)
[F. Henry, A. Mohsen (Orange), P. Philippe, G. Clare (B-com)]
This contribution proposes to use a classical DPCM approach at the block level.
A bdpcm_flag is transmitted at the CU level whenever the CU is a luma intra CU with each dimension
smaller than or equal to 32. This flag indicates whether regular intra coding or DPCM is used, and it is
encoded using a single CABAC context.
Block DPCM uses the Median Edge Detector of LOCO-I. For a current pixel X with pixel A as its left
neighbour, pixel B as its top neighbour, and C as its top-left neighbour, the prediction P(X) is determined by
P(X) = min(A, B)   if C ≥ max(A, B)
P(X) = max(A, B)   if C ≤ min(A, B)
P(X) = A + B − C   otherwise
The predictor uses unfiltered reference pixels when predicting the top row and left column of the CU. The
predictor then uses reconstructed pixels for the rest of the CU. Pixels are processed in raster-scan order
inside the CU. The prediction error is quantized in the spatial domain, after rescaling, in a way identical to
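The Median Edge Detector predictor described above translates directly into code; this sketch follows the three-case definition of LOCO-I, with the sample values in the example chosen arbitrarily.

```python
def med_predict(a, b, c):
    """LOCO-I Median Edge Detector: a = left, b = top, c = top-left neighbour."""
    if c >= max(a, b):
        return min(a, b)   # c at or above both: predict the smaller neighbour
    if c <= min(a, b):
        return max(a, b)   # c at or below both: predict the larger neighbour
    return a + b - c       # smooth region: planar extrapolation

# A vertical edge (dark top-left/top, bright left) picks the left neighbour:
assert med_predict(100, 20, 20) == 100
# In a smooth region, the planar term a + b - c is used:
assert med_predict(12, 14, 13) == 13
```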
JVET-M0332 CE8: Block vector prediction for CPR (test 8.1.1a and test 8.1.1b) [J. Nam,
J. Lim, S. Kim (LGE)]
In this test, alternative candidates for CPR are proposed, added in both AMVP and merge mode. For
AMVP mode, when the reference picture of the current coding block (indicated by the reference index) is the same as the
current picture and the number of candidates in the constructed list is smaller than the maximum number
of candidates, default candidates are inserted into the MVP candidate list. For merge mode, when the current
picture exists in the reference picture list and the number of candidates in the constructed list is smaller
than the maximum number of candidates, default candidates are added to the merge candidate list. Various
positions for the alternative candidates will be tested.
In subtest 1, (-2W, 0) and (0, -2H) are tested, where (W, H) is the size of the current coding block.
In subtest 2, (-mid, 0) and (0, -mid) are tested, which are the middle positions between the CTU boundary
and the current block position.
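The list-filling step for subtest 1 can be sketched as follows; the default positions (-2W, 0) and (0, -2H) come from the description above, while the duplicate pruning is an assumption of this sketch.

```python
def add_default_bv_candidates(cand_list, max_cands, w, h):
    """Append default block-vector candidates when the constructed list is
    short. Positions follow subtest 1: (-2W, 0) and (0, -2H). Pruning against
    already-present candidates is an assumption, not stated in the test."""
    for default in [(-2 * w, 0), (0, -2 * h)]:
        if len(cand_list) >= max_cands:
            break
        if default not in cand_list:
            cand_list.append(default)
    return cand_list

# One slot left, (-2W, 0) already present: only (0, -2H) is added.
assert add_default_bv_candidates([(-8, 0)], 2, 4, 4) == [(-8, 0), (0, -8)]
```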
JVET-M0407 CE8: CPR reference memory reuse without increasing memory requirement
(CE8.1.2a and CE8.1.2d) [X. Xu, X. Li, S. Liu (Tencent), E. Chai (Ubilinx)]
See under JVET-M0408.
JVET-M0408 CE8: CPR reference memory reuse with reduced memory requirement
(CE8.1.2b and CE8.1.2c) [X. Xu, X. Li, S. Liu (Tencent), E. Chai (Ubilinx)]
[Figure: diagrams illustrating which 64x64 reference regions (X) are available for predicting the current region (Curr).]
Currently, the search range of CPR mode is constrained to be within the current CTU. The effective
memory requirement to store reference samples for CPR mode is 1 CTU size of samples. Considering the
2:) Similar to 1:), but the required reference sample memory size is reduced. For example, in addition
to the 64x64 memory for storing the reconstructed samples of the current 64x64 region, an additional 2 (or
1) 64x64 memories can be used to store previously coded regions. The total required reference sample
memory is thereby reduced from four 64x64 buffers to three (or two).
3:) Similar to 1:), but the update process is done on a CU basis. The reference samples in the left CTU
can be used to predict a coding block in the current CTU with CPR mode until the block in the same
location of the current CTU is being coded or has been coded.
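The buffer rotation of variant 2:) can be sketched as below. The region identifiers and the oldest-first eviction order are assumptions of this sketch; only the buffer counts (current region plus two or one previous regions) follow the description.

```python
from collections import deque

class CprRefMemory:
    """Sketch of variant 2:): the current 64x64 region plus N previously coded
    64x64 regions are kept. N = 2 gives three buffers in total, N = 1 gives
    two. Eviction of the oldest region is an assumption of this sketch."""
    def __init__(self, num_prev_regions=2):
        self.prev = deque(maxlen=num_prev_regions)  # oldest region drops out
        self.current = None

    def start_region(self, region_id):
        # A new 64x64 region begins: the finished one joins the history.
        if self.current is not None:
            self.prev.append(self.current)
        self.current = region_id

    def is_referenceable(self, region_id):
        return region_id == self.current or region_id in self.prev

mem = CprRefMemory(num_prev_regions=2)
for r in ["A", "B", "C", "D"]:
    mem.start_region(r)
# With two history buffers, region "A" has been evicted; "B" and "C" remain.
assert not mem.is_referenceable("A")
assert all(mem.is_referenceable(r) for r in ["B", "C", "D"])
```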
JVET-M0456 CE8: palette mode when dual-tree is enabled (Test 8.2.4) [J. Ye, X. Xu, X. Li,
S. Liu (Tencent)]
In this test, palette mode with the dual tree coding structure is investigated.
Sub test b: apply palette mode to the luma plane when dual tree is enabled. When coding the chroma plane, if the
co-located luma blocks are all coded in palette mode, the chroma block also has the flexibility to use palette
mode. A flag is signalled to indicate whether the chroma block uses palette mode; if it does, the
corresponding palette mode syntax is signalled.
JVET-M0457 CE8: Palette predictor list enhancement (Test 8.2.6) [J. Ye, X. Xu, M. Xu,
X. Li, S. Liu (Tencent)]
In this test, the palette predictor is derived from previously palette-coded coding blocks.
1. Derive the spatial palette predictor:
To derive the spatial palette predictor, both adjacent and non-adjacent neighbouring candidates are
checked, from the neighbouring blocks close to the current block to the blocks farther away.
The left block (Ai) and above block (Bi) are checked. The non-adjacent neighbouring candidates are in a
virtual box surrounding the current block. The virtual block size and position are illustrated in Fig. 1.
In the current implementation, gridX is the block width and gridY is the block height. The number of search
rounds, the pruning of palette entries, and the order for checking each candidate will be tested.
2. Combine the spatial palette predictor and the HEVC SCC palette predictor:
After deriving the spatial palette predictor, it is inserted first into the palette
predictor list for the current block. If the size of the palette predictor list for the current block does not
exceed the maximum palette predictor size, the HEVC SCC palette predictor is inserted into the palette
predictor list. If the list size exceeds the maximum palette predictor size, the remaining palette entries are
discarded. The combined palette predictor is used to code the palette table entries of the current block.
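The combination step can be sketched as follows: spatial entries are inserted first, the SCC-style predictor fills any remaining slots, and entries beyond the maximum size are discarded. Pruning of duplicate entries is mentioned above only as a tested option and is included here as an assumption.

```python
def combine_palette_predictors(spatial, scc, max_size):
    """Build the palette predictor list: spatial predictor entries first, then
    the HEVC-SCC-style predictor; entries beyond max_size are discarded.
    Duplicate pruning is an assumption of this sketch."""
    combined = []
    for entry in spatial + scc:
        if len(combined) >= max_size:
            break                      # remaining entries are discarded
        if entry not in combined:      # pruning of repeated palette entries
            combined.append(entry)
    return combined

# Spatial entries take priority; the SCC predictor fills the remaining slots.
assert combine_palette_predictors([(255, 0, 0), (0, 255, 0)],
                                  [(255, 0, 0), (0, 0, 255), (1, 1, 1)],
                                  3) == [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
```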
JVET-M0474 CE8.1.3: Extended CPR reference with 1 buffer line [L. Pham Van,
V. Seregin, W.-J. Chien, T. Hsieh, M. Karczewicz (Qualcomm)]
The current CPR reference area is limited to the reconstructed samples of the current CTU. However,
the neighbouring samples around a CTU are already required for intra prediction, so those samples can be
made available for CPR reference as well. In the CE, the following test is performed:
The search range is the current CTU plus one line above and one column to the left of the current CTU.
JVET-M0029 CE9: Summary report on decoder side motion vector derivation [X. Xiu,
S. Esenlik]
The core experiment summary report is organized into 2 sub-tests as follows:
CE9.1: BDOF design (3 tests)
CE9.2: DMVR design (24 tests)
Test Document Crosschecker | VTM: Y U V EncT DecT | Cross-check: EncT DecT
9.1.1 a JVET-M0487 H. Liu -0.01% 0.03% 0.02% 103% 102% 105% 104%
9.1.1 b H. Liu 0.05% 0.02% 0.02% 99% 98% 99% 100%
9.1.1 c H. Liu 0.10% 0.03% 0.01% 99% 98% 100% 100%
9.1.1a does not simplify; it replaces the bilinear filters in the extended region by DCTIF.
9.1.1b simplifies by using no interpolation in the extended region (just integer positions).
9.1.1c changes the gradient calculation at the boundaries and no longer needs the extended region; it
simplifies, but the design becomes less unified.
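The idea behind 9.1.1b can be sketched as below: for the one-sample extended region that the BDOF gradients need around a block, reference samples are fetched at integer positions instead of being interpolated. The MV truncation (rather than rounding) and the clamp-to-edge border handling are assumptions of this sketch, not details stated in the test description.

```python
def int_pos(ref, x, y):
    """Clamp-to-edge fetch from a 2-D reference picture (list of rows)."""
    h, w = len(ref), len(ref[0])
    return ref[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

def bdof_extended_sample(ref, x, y, mv_x, mv_y):
    # MVs in 1/16-sample units; keep only the integer part, so no
    # interpolation filter runs for extended-region samples (9.1.1b idea).
    return int_pos(ref, x + (mv_x >> 4), y + (mv_y >> 4))

ref = [[10 * r + c for c in range(8)] for r in range(8)]
# A fractional MV of (1.5, 0.25) samples -> integer parts (1, 0).
assert bdof_extended_sample(ref, 2, 3, 24, 4) == ref[3][3]
```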
Several experts supported 9.1.1.b as the best simplified design approach.
Decision: Adopt JVET-M0487 (solution 9.1.1.b)
Specification text was made available, to be confirmed to be consistent with software by cross-checkers,
and reviewed by spec editors.
It was noted that disallowing DMVR blocks for intra prediction may not be that relevant in practical
pipeline implementations: typically, within a CTU, all inter blocks would be reconstructed first and the
intra-coded blocks last.
Early termination approaches do not seem to be very effective in terms of runtime reduction, and they are also
not beneficial for hardware.
Generally, the investigation on DMVR has reached a point where it might be manageable
implementation-wise (not low-complexity, but still giving around 1% gain).
JVET-M0076 CE9: Block size restriction for DMVR (test 9.2.6) [K. Unno, K. Kawamura,
S. Naito (KDDI)]
In this contribution, the coding performance with a block size restriction for decoder-side
motion vector refinement (DMVR) is tested. A threshold of 4096 pixels is used for the block size restriction in the
proposed method, whereas a threshold of 1024 pixels is used in the CE9.2 base software (same as CE9.2.1a).
The BD-rate for luma is -0.89% with threshold 4096, compared with VTM-3.0.
JVET-M0147 CE9: Results of DMVR related Tests CE9.2.1 and CE9.2.2 [S. Sethuraman
(Ittiam)]
In this proposal, the results of sub-tests CE9.2.1 and 9.2.2 are summarized. In CE9.2.1b0, a simplified
base with refinement disabled for coding units with luma sample counts larger than 1024, integer grid
samples based refinement, SAD as the cost function, and use of refined MVs for only the MC of the
current CU is considered. The progressive impact of the key elements of DMVR design such as forced
partitioning of large CUs into sub-CUs that have a constraint on their maximum width and height, type of
interpolation done for refinement, mean-removed SAD as cost function, and use of refined MVs for
purposes other than MC for current CU are studied in the other sub-tests of CE9.2.1. The results indicate
that DMVR can provide average BDRATE gains of up to -1.13%, -1.33%, -1.44% over VTM3 for the
combination of (a) sub-CUs of maximum width and height of 16 luma samples, (b) bilinear interpolation
for refinement, (c) block mean removed SAD as cost function, and (d) use of refined MVs for de-
blocking, temporal MV prediction, and spatial MV prediction from top and top-left CTU neighbours. The
average encoding and decoding time ratios have increased to up to 103% and 118% respectively due to
performing motion compensation and BDOF at sub-CU granularity. The results also indicate that use of
refined MVs for purposes other than MC provides more than one-third of the coding gain offered by
DMVR.
In CE9.2.2, an alternative cost function that replaces block level mean removed SAD cost function with a
row-level mean removed SAD cost function is studied at two different sub-PU level early exit thresholds.
The results indicate that half of the BDRATE gain provided by block-level MR-SAD over SAD can be
achieved using a row-level mean-removed SAD cost function. Further study may be required on
variants of such cost functions and suitable early exit thresholds that can provide a larger reduction in
average decoding time increase at minimal impact to the coding gains. One result is also provided when
BIO is disabled to show an average BDRATE gain of -1.51%, -1.58%, -1.59% over VTM3 with BIO
disabled.
Complexity analysis for aspects such as pre-fetch cache accesses, internal memory requirements, and
worst-case operation count are provided for the various choices. The analysis results indicate that sub-
CUs of maximum width and height of 16 luma samples provide the least internal memory requirements
while not increasing the pre-fetch cache accesses in the worst-case. The average luma BDRATE drop for
SAD as a cost function when compared to MR-SAD as the cost function is seen to be only ~0.11%.
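The cost functions compared in these tests can be sketched as follows (a sketch on small integer blocks; the integer rounding of the mean is illustrative). Row-level mean removal subtracts a separate mean difference per row instead of one mean difference per block:

```python
def sad(a, b):
    """Plain sum of absolute differences between two equal-size blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def mrsad_block(a, b):
    """Block-level mean-removed SAD: one mean difference per block."""
    n = len(a) * len(a[0])
    mean_diff = sum(x - y for ra, rb in zip(a, b) for x, y in zip(ra, rb)) // n
    return sum(abs(x - y - mean_diff) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def mrsad_row(a, b):
    """Row-level mean-removed SAD: a separate mean difference per row."""
    total = 0
    for ra, rb in zip(a, b):
        mean_diff = sum(x - y for x, y in zip(ra, rb)) // len(ra)
        total += sum(abs(x - y - mean_diff) for x, y in zip(ra, rb))
    return total
```

For an illumination offset that varies from row to row, the row-level variant removes it completely while the block-level variant cannot, which illustrates why it can recover part of the MR-SAD gain at lower decision cost.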
JVET-M0287 CE9: Integer DMVR (Test 9.2.7) [S. Esenlik, H. Gao, A. M. Kotra, B. Wang,
J. Chen (Huawei)]
This contribution document reports the results of the core experiment CE9.2.7. In the test the initial
motion vectors are rounded to integer precision before the application of motion vector refinement in
order to reduce the computational complexity of DMVR. The test CE9.2.7 is designed to show the impact
of the application of rounding of the initial motion vectors to integer precision in isolation. Therefore the
proposed method is implemented on top of the DMVR Base Software (CE9.2.Base) and no other
modification is included in the test.
Simulation results show 0.13% luma BD-rate increase and 5% decoding time reduction compared to the
DMVR Base Software.
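The rounding step can be sketched as follows (a sketch assuming 1/16-pel MV storage as in VVC; the sign-aware round-to-nearest convention is illustrative, not necessarily the one used in CE9.2.7):

```python
MV_FRAC_BITS = 4           # VVC stores MV components at 1/16-pel precision
MV_FRAC_STEP = 1 << MV_FRAC_BITS

def round_mv_to_integer(mv: int) -> int:
    """Round a 1/16-pel MV component to the nearest integer-pel position,
    keeping the 1/16-pel representation (illustrative rounding)."""
    offset = MV_FRAC_STEP >> 1
    if mv >= 0:
        return ((mv + offset) >> MV_FRAC_BITS) << MV_FRAC_BITS
    return -(((-mv + offset) >> MV_FRAC_BITS) << MV_FRAC_BITS)
```

Refinement then searches only integer-grid positions around the rounded vector, which avoids fractional interpolation inside the refinement loop.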
JVET-M0447 CE9: Constrained intra prediction with DMVR (test 9.2.4) [M. Xu, X. Li,
S. Liu (Tencent)]
This contribution presents DMVR simplifications based on VTM3.0: first, only a 4-position integer-
precision surround check under certain conditions; second, MRSAD mean value calculation by a
sampling process; third, replacing the MRSAD calculation by SAD. The proposed technologies
reportedly give 0.71%, 0.68%, and 0.60% gain, respectively, compared to the VTM3.0 anchor.
According to the CE description, for investigating the impact of multi-hypothesis inter prediction on
cache-related aspects, the cache model compiled into the reference decoder software by setting #define
JVET_J0090_MEMORY_BANDWITH_MEASURE to 1 is used, in conjunction with the cache config
file provided in JVET-K0451. For each bit stream, the decoder outputs a total hit ratio when using the
cache model. The following table compares the average hit ratios (in percentage) of the tests with those of
VTM-3.0:
Data quoted from JVET-M0176:
Random Access Main 10
Class                 VTM-3.0 (%)   CE10.1.1.b (%)
A1                    99.4735       99.4663
A2                    99.5509       99.5461
B                     99.5308       99.5249
C                     99.3467       99.3435
E
Overall (mandatory)   99.4755       99.4702
D                     99.3402       99.3401
F                     98.8734       98.8607
Low Delay B
Class     VTM-3.0   CE10.1.2.a  CE10.1.2.b  CE10.1.2.c  CE10.1.2.d
B         99.6547   99.6462     99.6531     99.6539     99.6567
C         99.5650   99.5548     99.5594     99.5656     99.5628
E         99.5242   99.5147     99.5187     99.5224     99.5225
Overall   99.5813   99.5719     99.5771     99.5806     99.5807
D         99.5715   99.5657     99.5683     99.5727     99.5737
F         99.1095   99.0934     99.0978     99.1081     99.1034
1.1.a uses two hypotheses in uni-prediction and merge. Basically, the same prediction could be invoked
by using bi-prediction; however, only one AMVP list is generated, which may save some signalling.
Encoder time increases by 7%, with no gain in LB.
Are gains of 1.1.a/b additive? They are said to have been almost additive before this CE cycle, but in the
current CE the combination was not tested.
The worst-case memory bandwidth of 1.1.a is uncritical; that of 1.1.b was analysed as roughly 80% of VTM.
A combination of 1.1.a/b would be somewhat similar to 1.2.a, which uses multi-hypothesis prediction for
both merge and AMVP and additionally allows a kind of different weighting of the hypotheses. 1.2.a is
the “full set” of functionality of this proposal, whereas b..d are simplifications, mainly for the benefit of
saving local storage (b) or memory bandwidth (c, d). d still uses up to 4 hypotheses (2 each in
bi-prediction with a >=8x16 block size restriction), but only for luma, which is the reason for the worse
performance in chroma. It is reported that the worst-case memory BW is 0.97%.
It is also mentioned that the gains of 1.1.a might potentially add up with 1.2.x.
Generally, the gain of all 1.1.x and 1.2.x proposals is significantly less than it was over VTM2 (cut by
half or even more). In particular, the 1.1.b and 1.2.x proposals add the need for more building blocks.
1.1.a is relatively simple and can re-use existing memory access structures and MC logic, but for a gain
of 0.17% for RA (no gain for LB), the encoder runtime also increases by 7%.
By doing more encoder checks and increasing runtime by 7%, it should probably be possible to get a
similar gain without a syntax change.
CE10.2.1 (JVET-M0178): applied at CU boundaries (2: width < 8 (left) / height < 8 (top); 4: otherwise);
applies OBMC to uni-prediction blocks and uses uni-prediction to generate the OBMC region;
simplifications: 1. reuse L-shape buffer, 2. apply CU size constraints, 3. apply MV removal; CTU row buffer.
All proposals impose a CU size constraint >= 64 samples. It is requested to provide a more detailed
analysis of the memory bandwidth. Likely, for processing on-the-fly, the worst case would be that the
current block is 8x8, and all of the top and left neighbours are 4x4. Another option would be to store
samples from the current block that were already fetched to perform interpolation in the neighbour
blocks. In that case, it should be reported how large the additional local buffer would be, for all four
methods.
Additional analysis was shown Saturday afternoon for 10.2.1 and 10.2.2 (to be provided in updated
versions of JVET-M0178/9). This analysis indicates that for those two approaches the worst-case memory
bandwidth of VVC is not increased for on-the-fly fetch (and is even a little lower for 10.2.2); if the
pre-generation method is used, the memory BW is even lower, but 1.96/1.28 kByte is necessary as local buffers.
In terms of processing, it is reported that the worst-case number of sample interpolations is not increased
relative to the current bi-prediction case. Method 10.2.1 uses uni-prediction in an 8x8 block and
additionally needs to interpolate four 4x4 areas with other vectors for OBMC. Method 10.2.2 does not use
any interpolations for OBMC. For 10.2.1, the weighted superposition requires at most 4 shifts and 2 adds
per sample. Furthermore, since the operations are locally varying, some additional logic is necessary. For
10.2.2, the latter numbers double. Furthermore, 10.2.2 is more challenging in terms of local memory
access, as 6 different sources need to be blended.
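The shift/add nature of the weighted superposition can be illustrated with a minimal per-sample blend (a sketch; the actual OBMC weights of these proposals are not reproduced here, and the 3/4 vs. 1/4 weighting is purely illustrative):

```python
def obmc_blend(cur: int, nbr: int) -> int:
    """Blend the current prediction with a neighbour-MV prediction as
    (3*cur + nbr + 2) >> 2, using only shifts and adds:
    3*cur is computed as (cur << 1) + cur."""
    return ((cur << 1) + cur + nbr + 2) >> 2
```

In an actual OBMC design the weight varies with the distance from the block boundary, which is part of the "locally varying operations" noted above.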
10.2.1 seems manageable from complexity perspective (but definitely adds some complexity).
As a general note, the gain in compression performance is lower than it was with VTM2 (approximately
half). However, the results show that OBMC gives even more gain for LB than for RA, and since the
overall performance of VVC for LB is still worse than for RA, this is assessed to be valuable enough.
There was some support, and no opposition was raised in Track B against adopting it.
It was initially agreed in Track B to adopt JVET-M0178. Specification text was available. This decision
was later reverted in the JVET Sunday plenary (see the notes in section 10.1).
This would have a high-level flag for disabling it.
From the results (low gain versus a high increase in encoder run time), this was not considered worthwhile.
In CE10.4, the goal is to test predictions combined using filtering, where three types of filters are used:
two directional filters and one uniform filter. The filters are FIR filters applied on the prediction signal,
not iterative. The 1D filters are 9-tap and symmetric, using only 1 multiplication and otherwise shifts.
The 2D filter has a 5-tap diamond shape, with only shift/add operations. At the boundaries, pixel
replication is used.
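The shift/add structure of such filters can be sketched for the 1D case (a sketch with an illustrative 5-tap power-of-two kernel; the actual CE10.4 1D filters are 9-tap and their coefficients are not reproduced here). Boundary handling uses pixel replication as described:

```python
def smooth_1d(pred):
    """Apply a symmetric shift/add low-pass kernel [1 2 2 2 1]/8 along one
    row of the prediction signal, with pixel replication at the boundaries."""
    n = len(pred)
    def p(i):
        # pixel replication outside the block
        return pred[min(max(i, 0), n - 1)]
    out = []
    for i in range(n):
        acc = p(i - 2) + (p(i - 1) << 1) + (p(i) << 1) + (p(i + 1) << 1) + p(i + 2)
        out.append((acc + 4) >> 3)   # normalize by the kernel sum of 8
    return out
```

Since all coefficients are powers of two and the kernel sum is 8, the filter runs with shifts and adds only, matching the complexity claim made for these filters.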
Generally, it was agreed that this method gives interesting gain, and is straightforward to implement at the
decoder. Two concerns are raised:
- As it needs to be run after the prediction signal is generated, it produces additional delay in the intra
prediction loop.
- For inter blocks, it can be used for a 128x128 CU, which would break the 64x64 VPDU concept.
It is requested to provide additional results without using the method in intra prediction (and the intra part
of CIIP), and restrict the largest CU size to 64x64.
It is also reported that a late contribution (JVET-M0848) provides new results for the same method of
10.4.2 with some (small) encoder speedup and slightly increased performance (-0.44% for luma).
Furthermore, it would be desirable to use only the prediction samples (no reconstructed samples from
current picture). A version which did that was shown in the previous CE.
The method was revisited (Track B Thu 17 Jan 1630) after new results were made available in JVET-M0042v3. The
restrictions were implemented as requested, including not using reconstructed samples from neighbour
blocks. Results are as follows:
The method of 10.5.2 is mainly for the benefit of encoders and does not solve the latency issue that LIC
imposes on the decoder pipeline. The problem is that a current block needs to wait for the reconstruction
of its neighbours, and then a certain number of cycles is required until the parameters of the linear model
are computed. Of particular concern is the fact that many decoder implementations target processing
inter- and intra-coded blocks independently in order to make best use of parallel processing; typically,
inter-coded blocks of a CTU are decoded first. For this reason, it has a non-negligible complexity impact
(in particular for parallel processing) if LIC of an inter-coded block uses the reconstruction of intra-coded
neighbours.
If such an approach (disabling LIC from intra-coded neighbours, including CPR) were combined with
the approach of 10.5.3 (only using reconstructed inter-coded samples from the current CTU), the situation
would be better; however, it would still mean that the inter reconstruction within a CTU would need to be
sequential (similar to the situation in intra). Regardless of that, LIC has some additional processing
complexity (which doubles in the case of bi-prediction) which needs to be justified by performance. The
processing should also be aligned with the 64x64 VPDU concept. (See further notes on these aspects
under JVET-M0873.)
Further study was recommended on these aspects.
A BoG (coordinated by C.-W. Hsu and M. Winken) was established to review CE10 related proposals,
and suggest candidates for further study in a CE. See the further notes for the discussion of the BoG
report JVET-M0873.
The subsequent notes only contain abstracts copied from the documents. Actions taken are noted above
under JVET-M0029.
JVET-M0042 CE10: Uniform Directional Diffusion Filters for Video Coding [J. Rasch,
A. Henkel, J. Pfaff, H. Schwarz, D. Marpe, T. Wiegand (HHI)]
An encoder speedup for the Uniform Diffusion filters described in JVET-L0157 is investigated. The filter
masks are simplified and no iterations are used. There are three types of filters, two directional filters and
one weighted interpolation of the directional filters. Further, a restriction of the Diffusion filters is
investigated.
JVET-M0087 CE10: Low pipeline latency LIC (test 10.5.2) [K. Abe, T. Toma, J. Li
(Panasonic)]
This contribution provides test results of the low-pipeline-latency LIC described in CE10.5.2. The
proposed LIC is based on the existing LIC implemented in JEM, but it removes all encoder and decoder
LIC processes other than the reconstruction stage. With this modification, the feedback loop of
neighbouring image references in a hardware pipeline can be closed at the reconstruction stage only.
Additionally, it is proposed to modify the bS calculation of the DBF for LIC boundaries. Simulation
results reportedly show that the proposed method provides 0.56% BD-rate gain for RA and 0.51%
BD-rate gain for LDB.
JVET-M0112 CE10: LIC confined within current CTU (test 10.5.3) [P. Bordes, T. Poirier,
F. Le Léannec (Technicolor)]
This contribution describes test CE10.5.3 corresponding to Local Illumination Compensation (LIC) with
reduced memory bandwidth, where the pipelining dependency between the reconstruction of the area to
the left and the current region is confined within the current CTU only.
The proposed process has been implemented in the JVET VTM3.0 on top of regular LIC (CE10.5.2). The
reported simulation results show an average luma BD-rate gain of -0.44% in RA configuration and of -
0.39% in LDB configuration. The encoding and decoding times stay identical to the regular LIC.
JVET-M0189 CE10.3.1: AMVP mode for triangle prediction [Y. Ahn, D. Sim (Digital
Insights)]
This contribution provides the simulation results of CE10.3.1. In this test, AMVP mode for triangle
prediction is applied. The test 10.3.1 can achieve 0.06% and 0.07% BD-rate gain in random access and
low-delay B configuration, respectively.
Test CE11.1
Tests Luma modified (Y/N) Chroma modified (Y/N)
CE11.1.1 Y Y
CE11.1.2 Y Y
CE11.1.3 Y N
CE11.1.4 Y N
CE11.1.5 Y N
CE11.1.6 Y Y
CE11.1.7 Y Y
CE11.1.8 Y Y
11.1.6-8 are the same (combined from 4 previous proposals) with a different number of line buffers (8/6/4)
at the CTU boundary.
11.1.5 and 11.1.6-8 use sample-based adaptation of the clipping range.
It was suggested to translate the maximum number of operations per line into the maximum number of
operations per sample (also considering that the long filters are only applied for large blocks), and to
resolve the value “M” of 11.1.1 into a concrete number. Some of the statements in the table above (worst
case increased Y/N) may not be true. More detailed analysis is needed, but it appears that if proposals
increase worst-case complexity in terms of number of operations, it would not be by a large factor.
This analysis became available in JVET-M0031v4 and was presented in Track B on Friday 18 January at
0800 (JRO).
Beyond the number of operations, the number of line buffers is another complexity issue. VTM uses 4
lines for luma and 2 lines for chroma, which is duplicated by some proposals.
Parallel processing capability is given by all proposals at some sufficiently small granularity (though
larger than current VVC) – see table below.
(Table: smallest unit size needed per test to perform the proposed filtering)
LD-B LD-P
Test Y U V EncT DecT Y U V EncT DecT
CE11.1.1 0.40% 0.20% 0.15% 101% 102% 0.40% 0.22% 0.05% 101% 102%
CE11.1.2 0.03% 0.53% 0.23% 100% 102% -0.01% 0.15% -0.32% 100% 102%
CE11.1.3 0.01% -0.02% 0.00% 100% 103% -0.02% 0.13% 0.03% 100% 102%
CE11.1.4 0.13% -0.03% -0.09% 99% 103% 0.06% 0.21% -0.10% 101% 104%
CE11.1.5 0.13% 0.15% 0.07% 98% 99% 0.02% 0.17% -0.09% 98% 96%
CE11.1.6 0.13% -1.32% -1.58% 100% 102% -0.03% -1.54% -1.88% 100% 102%
CE11.1.7 0.12% -1.38% -1.48% 102% 103% -0.02% -1.49% -1.86% 101% 102%
CE11.1.8 0.08% -1.15% -1.42% 99% 101% -0.02% -1.21% -1.62% 101% 100%
CE11.2
Test CE11.2.1 (JVET-M0299): Proponents: Kenneth Andersson (kenneth.r.andersson@ericsson.com),
Anand Meher Kotra (Anand.meher.kotra@huawei.com), Chia-Ming Tsai (chia-ming.tsai@mediatek.com);
Cross-checker: Hyeongmun Jang (hm.jang@lge.com)
Test CE11.2.2 (JVET-M0337): Proponent: Hyeongmun Jang (hm.jang@lge.com);
Cross-checker: Kenneth Andersson (kenneth.r.andersson@ericsson.com)
Complexity for luma (table columns per test): samples from block boundary modified; samples from
block boundary used for the deblocking decision; max. number of operations for filtering per line
(add/mult/compare/shift); max. number of operations for the decision per line (add/mult/compare/shift);
number of line buffers; worst-case complexity increased (Y/N).
LB LP
Y U V EncT DecT Y U V EncT DecT
CE11.2.1 -0.14% -0.17% -0.01% 98% 98% -0.18% -0.06% -0.20% 98% 98%
CE11.2.2 0.00% 0.00% 0.06% 100% 99% 0.00% -0.02% -0.33% 100% 100%
SNR based results (ALF off)
AI RA
Y U V EncT DecT Y U V EncT DecT
CE11.2.1 -0.05% -0.01% 0.00% 99% 99% -0.23% -0.02% -0.04% 98% 98%
CE11.2.2 -0.03% -0.22% -0.41% 100% 100% 0.08% 0.18% 0.15% 100% 100%
LB LP
Y U V EncT DecT Y U V EncT DecT
CE11.2.1 -0.21% 0.04% -0.23% 98% 98% -0.36% 0.14% -0.16% 97% 98%
CE11.2.2 0.01% 0.01% -0.33% 100% 100% 0.09% -0.11% -0.15% 100% 100%
Subjective testing is to be done. It is suggested to use an “AB” comparison where each proposal is
compared against VTM3 at the same QP.
From the test plan, there would be 96 test cases (2 QPs 34/39, 5 sequences, 8 proposals for 11.1; 2 QPs
30/34, 4 sequences, 2 proposals for 11.2). It would be highly desirable to test both ALF-on and ALF-off
cases. If only one of these cases is possible for logistical reasons, the ALF-off case should be used (as
planned in the original CE description). The ALF-on bitstreams for subjective testing still need a
crosscheck.
[Figure] Example illustrating the scoring for a sequence: Test A wins (WIN: +1), Test B ties (TIE: 0),
Test C loses (LOSS: -1).
With this approach, it was found that in all cases the proposals were either better or equal; no case was
found worse than the anchors. All tests were performed against the “ALF off” anchor. When “ALF on”
was compared to “ALF off”, it was also found to be visually better than the anchor in 6 out of 10 test
cases, whereas some proposals are better than the anchor in 9 of the test cases, as shown by the following
table:
Listed below are the total scores of the CE proposals in CE11.1; the attached Excel file contains more
detailed results. CE11.1: Long Filters Tests (highest possible total score per QP is +5, lowest possible
score per QP is -5):
Proposal   Total score QP34   Total score QP39
CE11.1.1 2 4
CE11.1.2 3 4
CE11.1.3 1 2
CE11.1.4 2 2
CE11.1.5 3 4
CE11.1.6 4 5
CE11.1.7 2 5
CE11.1.8 4 5
One conclusion that can be drawn is that longer-tap deblocking helps visual quality at large CU
boundaries. However, it cannot directly be concluded that the effect would still be visible with similar
clarity when ALF is on. From the scores of individual sequences, it can be seen that some of the new
deblocking methods with ALF off still work better than the existing deblocking with ALF on in
approximately half of the cases, such that it can be concluded that the new deblocking methods give a
benefit in subjective quality that is additive to the benefit of ALF. Therefore, a conclusion might be
drawn that when both new deblocking and ALF are enabled, a visual benefit would still be visible. To
investigate whether this is true, additional viewing should be done with a selected “best performing”
proposal (from the table above, 11.1.8 seems to be the best candidate, as it has good performance and
does not increase the need for line buffering). Additional viewing is to be performed to confirm that
11.1.8 with ALF on still gives a benefit compared to VTM3 with ALF on.
Decision: Adopt JVET-M0471, version 11.1.8 (specification text available in the v2 upload, but needing
another small modification for the restriction of the line buffer; it was briefly reviewed in Track B Thu
1330), pending confirmation from the viewing and the more detailed report on the complexity impact.
Informal viewing (3 sessions, around 20 participants, 12 of whom were not involved in this CE) was
conducted Thu 17 Jan evening. It is reported that, also in comparison to the ALF-on anchor, differences
are clearly visible, in particular for the sequences Campfire and Redkayak, with slight improvement also
for Foodmarket and KristenSara.
The complexity analysis in JVET-M0031v4 was presented Fri 18 Jan in Track B. For the adopted
proposal, the worst-case number of operations is not increased for luma, and the number of line buffers is
kept the same. Additional complexity is the need for switching to another filter mode in the case of large
blocks.
Compared to that, ALF on versus ALF off has a score of 3 and 2 for QP30 and QP34, respectively. This
indicates that ALF has a clear benefit over any of the proposals, and it can hardly be concluded that they
would further improve the quality when combined. At least in this range of bit rates, deblocking on an
aligned 4x4 grid does not seem to improve the visual quality. No action is necessary from these results.
This could be due to a flaw in the design of the CE, namely selecting a too-low QP range. It might be
worthwhile to continue the study, now in combination with longer filters and comparing against an
ALF-on anchor.
The subsequent notes only contain abstracts copied from the documents. Actions taken are noted above
under JVET-M0031 and JVET-M0906.
JVET-M0092 CE11: Very strong deblocking with conditional activation signalling (Test
11.1.1) [C. Helmrich, B. Bross (HHI)]
This contribution describes the configuration of the authors’ conditionally signalled very strong
deblocking approach, as previously presented in JVET-L0523, in the VTM software for the Core
Experiment (CE) on improved deblocking filtering for the Versatile Video Coding (VVC) standard. The
integration of this proposal into VTM 3.0 reportedly results in negligible luma BD-rate changes while
providing subjective gains. It is kindly requested to adopt one of the proposed two variants (differing in
algorithmic complexity) of the very strong deblocking method in the next revision of the VVC
specification and VTM reference software.
JVET-M0186 CE11.1.3: Long deblocking filters [C.-M. Tsai, T.-D. Chuang, C.-W. Hsu, C.-
Y. Chen, Y.-W. Huang, S.-M. Lei (MediaTek)]
This contribution proposes to add three long strong filter sets in addition to the HEVC strong filter set for
deblocking. It is reported that the proposed long strong filtering leads to negligible changes in BD-rate
and encoding time, and decoding time increase is 3% for VTM3.0 CTC and for CTC with ALF off. It
is asserted that the subjective quality at high QPs is improved because of the long strong filter sets.
JVET-M0298 CE11: Longer tap deblocking filter (test 11.1.5) [A. M. Kotra, B. Wang,
S. Esenlik, H. Gao, J. Chen (Huawei)]
This contribution reports the results of CE 11 test 11.1.5 which uses a longer tap deblocking filter to
reduce blocking artefacts which are observed at large block boundaries. Furthermore, to reduce line
buffer requirements for the “longer tap” filter: for the horizontal edges which overlap with the CTU
boundaries, the maximum number of samples used in filter decision and the maximum number of samples
used in filter modification from the top block are restricted to be the same as in VTM 3.0 deblocking
filter.
Objective results are as follows:
Over VTM 3.0 Anchor with ALF OFF (AI, RA, LDB, LDP): Luma BD-Rate of -0.01%, 0.00%, 0.09%,
-0.06% is achieved without any increase in EncT and DecT.
Over VTM 3.0 Anchor with ALF ON (AI, RA, LDB, LDP): Luma BD-Rate of 0.00%, 0.02%, 0.13%,
0.02% is achieved without any increase in EncT and DecT.
JVET-M0299 CE11: Deblocking for 4 x N, N x 4 blocks and 8 x N, N x 8 blocks that are not
aligned with 8 x 8 sample grid (test 11.2.1) [K. Andersson, Z. Zhang,
R. Sjöberg (Ericsson), A. M. Kotra, J. Chen, S. Esenlik, B. Wang, H. Gao
(Huawei), C.-M. Tsai, C.-W. Hsu, T.-D. Chuang, C.-Y. Chen, Y.-W. Huang,
S.-M. Lei (MediaTek)]
This contribution document reports the results of the core experiment CE11.2.1, which is a joint proposal
from Ericsson, Huawei and MediaTek Inc. VTM-3.0 does not apply the deblocking filter to some edges
belonging to blocks whose size is Nx4 or 4xN, and also to some blocks whose size is Nx8 or 8xN, where
N can be up to 64 samples. This test applies deblocking on a 4x4 grid, allowing deblocking not only of
Nx4, 4xN, Nx8 and 8xN blocks aligned with the current 8x8 sample grid, but also of Nx4, 4xN, Nx8 and
8xN blocks that are not aligned with the current 8x8 sample grid. The number of samples read and
modified during deblocking filtering is limited to allow for parallel-friendly processing. For a given edge,
if at least one of the blocks sharing the edge has a size of 4 samples (orthogonally), then the VTM-3.0
weak deblocking filter is applied. Furthermore, the weak filter is constrained to modify only one sample
on either side of the edge.
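The one-sample constraint can be sketched as follows (a sketch of an HEVC-style weak filter delta; the decision thresholds are omitted and the 8-bit clipping is illustrative):

```python
def weak_filter_one_sample(p0: int, p1: int, q0: int, q1: int, tc: int):
    """Apply an HEVC-style weak deblocking delta across an edge between
    samples p0 and q0, constrained to modify only one sample per side."""
    delta = (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4
    delta = max(-tc, min(tc, delta))           # clip delta to [-tc, tc]
    clip = lambda v: max(0, min(255, v))       # 8-bit sample clip (illustrative)
    return clip(p0 + delta), clip(q0 - delta)  # p1 and q1 stay untouched
```

Leaving p1 and q1 unmodified is what keeps the filter parallel-friendly on a 4x4 grid: two adjacent 4-sample-wide blocks can be filtered independently without read/write overlap.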
Subjective analysis shows that when compared to the Anchor (VTM-3.0), the proposed test improves the
subjective quality of sequences.
Objective results show luma BD-Rate gain with negligible run-time changes.
Over VTM 3.0 Anchor with ALF OFF (AI, RA, LDB, LDP): Luma BD-Rate of -0.05%, -0.23%, -0.21%,
-0.36% is achieved with negligible run-time changes.
Over VTM 3.0 Anchor with ALF ON (AI, RA, LDB, LDP): Luma BD-Rate of 0.02%, -0.12%, -0.14%,
-0.18% is achieved with negligible run-time changes.
JVET-M0471 CE11.1.6, CE11.1.7 and CE11.1.8: Joint proposals for long deblocking from
Sony, Qualcomm, Sharp, Ericsson [M. Ikeda, T. Suzuki (Sony),
D. Rusanovskyy, M. Karczewicz (Qualcomm), W. Zhu, K. Misra, P. Cowan,
A. Segall (Sharp Labs of America), K. Andersson, J. Enhorn, Z. Zhang,
R. Sjöberg (Ericsson)]
This contribution describes a parallel friendly long filter design for deblocking of large blocks common
for CE11.1.6, CE11.1.7 and CE11.1.8. It further describes two variants of CTU line buffer reduction in
CE11.1.7 and CE11.1.8 respectively. The CE tests are based on technologies that scored highly in
subjective assessment carried out for CE11 before Macao meeting. It is asserted that all three tests give
better subjective quality than the VTM-3.0 anchor.
The BD-rate numbers (Y/U/V) for CE11.1.6 reportedly are 0.0%/-0.5%/-0.5% for AI, 0.0%/-0.8%/-0.8%
for RA, 0.1%/-1.3%/-1.6% for LDB and 0.0%/-1.5%/-1.9% for LDP over VTM-3.0 with ALF switched
on. When ALF is switched off, the BD-rate results (Y/U/V) reportedly are 0.0%/-0.7%/-0.6% for AI,
0.0%/-0.9%/-0.9% for RA, 0.0%/-1.1%/-1.7% for LDB and 0.0%/-1.8%/-1.9% for LDP.
The BD-rate numbers (Y/U/V) for CE11.1.7 reportedly are 0.0%/-0.5%/-0.5% for AI, 0.0%/-0.8%/-0.8%
for RA, 0.1%/-1.4%/-1.5% for LDB and 0.0%/-1.5%/-1.9% for LDP over VTM-3.0 with ALF switched
on. When ALF is switched off, the BD-rate results (Y/U/V) reportedly are 0.0%/-0.7%/-0.6% for AI,
0.0%/-0.9%/-0.9% for RA, 0.0%/-1.3%/-1.6% for LDB and -0.1%/-1.9%/-2.0% for LDP.
The BD-rate numbers (Y/U/V) for CE11.1.8 reportedly are 0.0%/-0.4%/-0.4% for AI, 0.0%/-0.7%/-0.7%
for RA, 0.1%/-1.2%/-1.4% for LDB and 0.0%/-1.2%/-1.6% for LDP over VTM-3.0 with ALF switched
on. When ALF is switched off, the BD-rate results (Y/U/V) reportedly are 0.0%/-0.6%/-0.5% for AI,
0.0%/-0.8%/-0.8% for RA, 0.1%/-1.1%/-1.5% for LDB and 0.0%/-1.6%/-1.7% for LDP.
CE12-2: In-loop reshaping for SDR, implementation investigation
Test: CE12-2 (JVET-L0247); Proponent: Dolby; Cross-checkers: J. Xu (ByteDance), E. François
(Technicolor)
Description: Test of in-loop reshaping for SDR, focused on implementation investigation of the
pipelining of the block-wise prediction loop, such as piece-wise linear interpolation replacing LUT-based
mapping and disabling of intra blocks in inter slices.
In-loop luma reshaping: CE12-2 tests a lower complexity pipeline that also eliminates decoding latency
for block-wise intra prediction in inter slice reconstruction. Intra prediction is performed in reshaped
domain for both inter and intra slices.
CE12-2 also tests 16-piece PWL models for luma and chroma residue scaling instead of the 32-piece
PWL models of CE12-1.
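Replacing a full per-sample LUT by on-the-fly piece-wise linear interpolation can be sketched as follows (a sketch assuming 16 equal-width pieces over a 10-bit range; the names and pivot layout are illustrative, not the CE12-2 syntax):

```python
PIECES = 16
BITDEPTH = 10
PIECE_LOG2 = BITDEPTH - 4     # 64 input values per piece for 16 pieces

def pwl_map(x: int, pivots):
    """Map sample x through a piece-wise linear model defined by
    PIECES + 1 output pivot values, interpolating within each piece
    instead of looking up a full 2^BITDEPTH-entry LUT."""
    idx = x >> PIECE_LOG2                  # which piece x falls into
    frac = x - (idx << PIECE_LOG2)         # offset within the piece
    y0, y1 = pivots[idx], pivots[idx + 1]
    return y0 + ((y1 - y0) * frac >> PIECE_LOG2)
```

Only the 17 pivot values need to be stored and signalled, while a LUT would require one entry per input code value; this is the storage/latency trade-off the test investigates.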
[Figure] Inter slice reconstruction with the in-loop luma reshaper in CE12-2 (light-green shaded blocks
indicate signals in the reshaped domain: luma residue, intra luma prediction, and intra luma
reconstruction).
Luma-dependent chroma residue scaling is a multiplicative process implemented with fixed-point integer
operations. Chroma residue scaling compensates for the interaction of the luma signal with the chroma
signal. Chroma residue scaling is applied at the TU level.
For intra the reconstructed luma is averaged.
For inter, the luma prediction is averaged.
The average is used to identify an index in a PWL model. The index identifies a scaling factor cScaleInv.
The chroma residual is multiplied by that number.
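The steps just described can be sketched as follows (a sketch; the fixed-point precision, piece boundaries, and inverse-scale values are illustrative placeholders, not the CE12-2 parameters):

```python
CSCALE_FP_PREC = 11   # illustrative fixed-point precision of cScaleInv

def scale_chroma_residual(chroma_res, avg_luma, piece_bounds, cscale_inv):
    """Scale one chroma residual sample by the inverse scale factor of the
    PWL piece that the average luma value falls into (TU-level process).

    piece_bounds: ascending luma boundaries of the PWL pieces
    cscale_inv:   per-piece fixed-point inverse scale factors
    """
    idx = 0
    while idx < len(cscale_inv) - 1 and avg_luma >= piece_bounds[idx + 1]:
        idx += 1                              # find the PWL piece index
    # fixed-point multiply with rounding, as a multiplicative integer process
    return (chroma_res * cscale_inv[idx] + (1 << (CSCALE_FP_PREC - 1))) >> CSCALE_FP_PREC
```

The average luma (reconstructed for intra, predicted for inter) selects the piece index; the chroma residual is then multiplied by that piece's cScaleInv.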
The parameters are (currently) sent in the tile group header (similar to ALF). These reportedly take 40-
100 bits.
Suggestions for easing implementation were:
- Disabling the chroma residual scaling for separate-tree operation
- Disabling the chroma residual scaling for 2x2 chroma
- Using the prediction signal rather than the reconstruction signal for intra as well as inter
The proponent indicated that these suggestions had previously been tested and should not affect the
performance, and said test results would be provided in a revision of their contribution JVET-M0427.
It was noted that in the current CTC, separate trees are used for intra slices.
It was noted that some loss was observed for the chroma, although the luma gain was enough to more
than compensate for that.
A participant commented that there could be rate allocation effects that could be achieved by local delta
QP or setting of QP based on temporal layer. It was remarked that some tests along those lines were
JVET-M0427 CE12: Mapping functions (test CE12-1 and CE12-2) [T. Lu, F. Pu, P. Yin,
W. Husak, S. McCarthy, T. Chen (Dolby)]
JVET-M0874 BoG report on CE13 and CE13 related 360° video coding [J. Boyce]
This BoG report was reviewed in Track A 1600-1700 Thursday 17 January (GJS).
The BoG met on 13 January 2019 from 1800 to 2030, on 14 January 2019 from 1800 to 1900, and on 16
January from 1700 to 1800.
The BoG recommended to adopt JVET-M0892 for disabling of in-loop filters (deblocking, SAO, and
ALF) at vertical and horizontal boundaries signalled in the SPS at MinCbSizeY granularity.
It was noted that the change that was requested was not specific to 360° video.
It was noted that this proposed change is for entire columns / entire rows, not line segments that do not
cut through the entire picture.
For a conventional cubemap, the filter would be disabled for one horizontal line in the middle of the
picture.
It was discussed what sort of limit there would be for how many of these cuts would be allowed. One
suggestion was a limit of 3 cuts in each direction.
The granularity was also discussed – whether it was really necessary to have 4x4 granularity.
It was discussed how this would be implemented in a real decoder. Checking a long list of positions
would not be reasonable.
Due to a desire for further study of this before making a decision, no action was taken on this.
Further study was also recommended to consider more flexible in-loop filter disabling patterns, use cases,
and HW implementation complexity.
The BoG endorsed the recommendation of the HLS BoG to change sps_ref_wraparound_offset to
sps_ref_wraparound_offset_minus1 and to change the units to MinCbSizeY, as in option 1.
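The endorsed change can be illustrated by the derivation of the wrap-around offset in luma samples (a sketch; the _minus1 coding makes a zero offset unrepresentable, and the unit becomes MinCbSizeY):

```python
def ref_wraparound_offset(sps_ref_wraparound_offset_minus1: int,
                          min_cb_size_y: int) -> int:
    """Derive the horizontal reference wrap-around offset in luma samples
    from the _minus1 syntax element coded in units of MinCbSizeY."""
    return (sps_ref_wraparound_offset_minus1 + 1) * min_cb_size_y
```

For example, with MinCbSizeY = 4, a coded value of 239 yields a 960-sample offset, matching a 960-wide packed face arrangement.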
The BoG recommended, and Track A endorsed, the following for the 360Lib software:
JVET-M0033 CE13: Summary report on coding tools for 360° omnidirectional video
[P. Hanhart, J.-L. Lin, C. Pujara]
JVET-M0143 CE13: Face row based geometry padding using projection with bilinear
interpolation based on test 1.1.a (Test 2.1.b) [P. Hanhart, Y. He
(InterDigital)]
JVET-M0144 CE13: Adaptive frame packing based on test 1.1.a (Test 4.1) [P. Hanhart,
Y. He (InterDigital)]
JVET-M0235 CE13: HEC with Pre-rotation based on test 1.1a and 1.1b (Test 4.2)
[C. Pujara, A. Konda, A. Singh, R. Gadde, W. Choi, K. Choi, K. P. Choi
(Samsung)] [late]
JVET-M0320 CE13: HEC with deblocking using spherical neighbours, SAO and ALF
disabled across face discontinuities (Test 1.4) [X. Huangfu, Y. Sun, L. Yu
(Zhejiang Univ.)] [late]
JVET-M0321 CE13: Post-filtering of seam artefacts based on test 1.1.a (Test 3.1)
[X. Huangfu, Y. Sun, L. Yu (Zhejiang Univ.)] [late]
JVET-M0362 CE13: In-loop filters disabled across face discontinuities (Test 1.1.a and
Test1.1.b) [S.-Y. Lin, L. Liu, J.-L. Lin, Y.-C. Chang, C.-C. Ju (MediaTek),
P. Hanhart, Y. He (InterDigital)]
JVET-M0363 CE13: HEC with in-loop filters using spherical neighbours (Test 1.3) [S.-Y.
Lin, L. Liu, J.-L. Lin, Y.-C. Chang, C.-C. Ju (MediaTek)]
JVET-M0367 CE13: Face row based geometry padding of reference pictures and in-loop
filters using spherical neighbours (Test 5.1) [C.-H. Shih, S.-Y. Lin, L. Liu, J.-
L. Lin, Y.-C. Chang, C.-C. Ju (MediaTek)]
JVET-M0168 CE2-related: Simplifications for inherited affine candidates [Y.-L. Hsiao, T.-
D. Chuang, C.-W. Hsu, C.-Y. Chen, Y.-W. Huang, S.-M. Lei (MediaTek)]
JVET-M0228 CE2-related: Affine mode simplifications [Y.-W. Chen, X. Wang (Kwai Inc.)]
JVET-M0247 CE2 related: Joint test of AMVR for Affine Inter mode (Test 2.1.1 and Test
2.1.2) [H. Liu, K. Zhang, L. Zhang, J. Xu (Bytedance), D. Luo, Y. He, X. Xiu
(InterDigital)]
JVET-M0310 CE2-related: Using shorter-tap filter for 4x4 sized partition [J. Li, R.-L.
Liao, C. S. Lim (Panasonic)]
JVET-M0311 CE2-related: Memory bandwidth reduction for affine mode with less
dependency [J. Li, R.-L. Liao, C. S. Lim (Panasonic)]
This was presented in Track B Tue 15 January 1210.
This contribution is a further simplification of JVET-M0309. In this contribution, the first control point is
used, instead of the first sub-block's motion vector, as the center of the motion vector constrained region
of affine mode.
The BD-rate difference is 0.03% in RA and 0.01% in LD-B, respectively.
Further study.
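The constrained-region idea summarized above can be sketched as follows (a hedged illustration of JVET-M0311's concept; the region half-size and function names are assumptions for illustration):

```python
# Hedged sketch: sub-block MVs of an affine CU are clipped to a region
# centred on the first control-point MV (rather than on the first
# sub-block's MV), which reduces worst-case memory bandwidth.

RANGE = 32  # assumed half-width of the constrained region, in MV units

def clip_subblock_mv(sub_mv, cpmv0):
    """Clamp each MV component to [cpmv0 - RANGE, cpmv0 + RANGE]."""
    return tuple(
        max(c0 - RANGE, min(c0 + RANGE, v))
        for v, c0 in zip(sub_mv, cpmv0)
    )
```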
JVET-M0406 CE2/4-related: Unified merge list size for block and sub-block merge modes
[X. Xu, X. Li, S. Liu (Tencent)]
JVET-M0576 Crosscheck of JVET-M0406 (CE2/4-related: Unified merge list size for block
and sub-block merge modes) [C.-Y. Lai (MediaTek)] [late]
JVET-M0462 CE2-related: 4x4 chroma affine motion compensation and motion vector
rounding unification [L. Pham Van, W.-J. Chien, H. Huang, V. Seregin,
M. Karczewicz (Qualcomm)]
JVET-M0702 CE2-related: Adaptive sub-block MV clip for affine blocks [X. Li, M. Gao,
S. Liu (Tencent)] [late]
JVET-M0839 CE2-related: On number of fast merge candidates for Affine Merge mode
[A. Robert, F. Le Léannec, F. Galpin (Technicolor)] [late]
See also JVET-M0203, which contains a new proposal, called 3.5.1, that was not part of the CE.
See also section 7, which discusses related complexity analysis and reduction proposals.
JVET-M0857 BoG report on CE3-related intra prediction and mode coding [G. Van der
Auwera]
This BoG report was discussed in Track A on Tuesday 1430–1630 (GJS).
The BoG reviewed related input contributions to Core Experiment 3 on intra prediction and mode coding,
and formulated recommendations for consideration by the track (A).
The CE3-related documents were categorized as follows:
Cross-component prediction (14)
JVET-M0047 Non-CE3: Intra Angular Prediction and Modified PDPC Based on Two
Reference Lines [D. Jiang, J. Lin, F. Zeng, C. Fang (Dahua)]
JVET-M0108 CE3-related: Reducing the number of reference samples and table size in
LM Chroma process [E. François, T. Poirier, F. Le Léannec (Technicolor)]
JVET-M0211 CE3-related: Fixed Reference Samples Design for CCLM [J.-Y. Huo, X.-W.
Li, J.-L. Wang, Y.-Z. Ma, F.-Z. Yang (Xidian Univ.), S. Wan (NPU), Y.-F.
Yu, Y. Liu (Oppo)]
JVET-M0212 CE3-related: Improved reference samples range for MDLM [S. Wan (NPU),
Q.-H.Ran, X.-W. Li, Y.-Z. Ma, J.-Y. Huo, F.-Z. Yang (Xidian Univ.), Y.-F.
Yu, Y. Liu (Oppo)]
JVET-M0274 CE3-related: Modified linear model derivation for CCLM modes [M. Wang,
K. Zhang, L. Zhang, H. Liu, J. Xu, S. Wang (Bytedance), J. Li, S. Wang,
W. Gao (Peking Univ.)]
JVET-M0365 Non-CE3: modified PDPC for horizontal and vertical modes [A. Filippov,
V. Rufitskiy, J. Chen (Huawei)]
JVET-M0383 Non-CE3: Table size reduction and bit width limitation for CCLM
implementation [P. Onno, C. Gisquet, G. Laroche, J. Taquet (Canon)]
JVET-M0493 CE3-related: Simplified look-up table for CCLM mode [L. Zhao, X. Zhao,
X. Li, S. Liu (Tencent)]
JVET-M0528 Non-CE3: A unified luma intra mode list construction process [F. Bossen,
K. Misra (Sharp Labs of America)]
JVET-M0832 Non-CE3: On block size restrictions for PDPC with disabled linear filtering
for PDPC in the case of skew non-diagonal modes [A. Filippov, V. Rufitskiy,
J. Chen (Huawei), J. Lee, J. Kang (ETRI)] [late]
6.4 CE4 related – Inter prediction and motion vector coding (51)
Contributions in this category were first discussed in a BoG (see report JVET-M0843) unless otherwise
noted.
See also section 7, which discusses related complexity analysis and reduction proposals.
JVET-M0843 BoG report on CE4 related inter prediction and motion vector coding
contributions [K. Zhang]
This report was reviewed in Track B Tue 15 Jan 1215-1330 and 1435-1800.
Five BoG sessions were held, 1540 ~ 2020 on Jan. 11, 0900 ~ 1045 on Jan. 12, 1830 ~ 2000 on Jan. 12,
1945~2300 on Jan. 13, 2130~2230 on Jan. 14 for discussing 47 technical contributions in five categories:
1) Merge Modifications (19)
2) MMVD Modifications (12)
3) AMVP Modifications (7)
4) Weighted-Prediction Modifications (2)
5) Complexity Reduction (7)
Adoptions recommended by the BoG were as follows (see the notes for each individual contribution
regarding specifics for each document):
Normative changes
Bug Fix/Cleanup/Harmonization:
– JVET-M0436 AHG2: Regarding HMVP Table Size
– JVET-M0264 Non-CE4: Harmonization between HMVP and GBi
– JVET-M0068 Non-CE4: MMVD scaling fix
– JVET-M0171 CE4-related: MMVD cleanups
– JVET-M0111 AHG13: On bi-prediction with weighted averaging and weighted prediction
– JVET-M0479 Non-CE4: On clipping of scaled motion vectors
Coding Efficiency:
– JVET-M0255 AHG11: MMVD without Fractional Distances for SCC
– JVET-M0444 CE4-related: Simplified symmetric MVD based on CE4.4.3
– JVET-M0502 CE4-related: Improved context for prediction mode flag
Decision: all the suggested adoptions were confirmed by Track B.
Proposals suggested for study in upcoming CE (not all of which were later endorsed):
Merge-related
o Syntax modification
– JVET-M0069 Non-CE4: Syntax change of MMVD
– JVET-M0231 CE4-related: Regular merge flag coding
– JVET-M0359 Non-CE4: Modification of merge data syntax
– JVET-M0369 CE4-related: Syntax changes of merge data
It was noted in the Track B discussion that the benefit of some of the last three methods
is low, in particular if the separation of MMVD from merge (JVET-M0069) would be
implemented. Therefore, also combination of JVET-M0069 and each of the other three
should be tested.
o Merge list simplification
– JVET-M0405 CE4-related: Simplified merge candidate list for small blocks
– JVET-M0433 CE4-related: Constraint on GBi index inheritance in Merge Mode
STMVP
– JVET-M0518 CE4-related: Supplemental results on STMVP design of CE4.2.3.a and
combination with methods of JVET-M0126 (CE4.1.2.a) and JVET-M0127
– JVET-M0713 CE4-related: simplification of CE4.2.2
MMVD-related
From Track B discussion: This part should be combined with the Sub-CE on MMVD mode signalling
above, in particular combination with JVET-M0069 should be investigated.
– JVET-M0206 CE4-related: MMVD improvements
– JVET-M0267 Non-CE4: Harmonization of MMVD and AMVR
– JVET-M0307 CE4-related: Candidates optimization on MMVD
– JVET-M0308 Non-CE4: MMVD simplification
– JVET-M0314 CE4-related: MMVD improving with signalling distance table
– JVET-M0315 Non-CE4: MMVD scaling simplification
– JVET-M0435 CE4-related: MMVD offset table signalling
TMVP Storage Reduction
From discussion in Track B: This sub-CE is not needed, as the JVET-M0512 solution was adopted.
– JVET-M0230 CE4-related: Temporal MV buffer reduction
– JVET-M0346 CE4-related: Non-square compression grid for temporal motion data storage
– JVET-M0512 Non-CE4: On Temporal Motion Buffer Compression
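The general idea behind this group of temporal motion buffer compression proposals can be sketched as follows (a hedged illustration; the grid size of 8 and function name are assumptions, not the adopted JVET-M0512 design):

```python
# Hedged sketch: instead of storing one temporal MV per minimal block,
# only one MV per N x N grid cell is kept, and any TMVP lookup is
# redirected to the cell's representative (here top-left) position.

GRID = 8  # assumed compression grid size in luma samples

def compressed_mv_position(x, y):
    """Map a luma position to the top-left of its storage grid cell."""
    return (x & ~(GRID - 1), y & ~(GRID - 1))
```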
Open questions reported by the BoG (see the notes for each contribution):
Depending on CE decision
o Pending on JVET-M0170 (which was adopted)
– JVET-M0272 CE4-related: Restrictions on History-based Motion Vector Prediction
– JVET-M0345 CE4-related: Remove redundancy between TMVP and ATMVP
– JVET-M0473 Simplified HMVP
– JVET-M0350 CE4-related: Quadtree-based Merge Estimation Region for VVC
o Related to JVET-M0281
– JVET-M0081 Non-CE4: Simplification of AMVP list generation in AMVR
o Related to JVET-M0403
– JVET-M0422 CE4-related: Simplified MVD coding
– JVET-M0406 CE2/4-related: Unified merge list size for block and sub-block merge
modes
– JVET-M0661 AHG-13: On Merge List Size
– JVET-M0330 CE4-related: Simplification of MMVD scheme
JVET-M0117 CE4-related: On MVP candidate list generation for AMVP [R. Yu, D. Liu,
K. Andersson, P. Wennersten, J. Ström, R. Sjöberg (Ericsson)]
Notes from BoG report review: 0.01%/0.01%, pruning number 10->1, MV rounding 13->3.
JVET-M0171 CE4-related: MMVD cleanups [C.-Y. Lai, T.-D. Chuang, Y.-L. Hsiao, C.-Y.
Chen, Y.-W. Huang, S.-M. Lei (MediaTek)]
Notes from BoG report review: -0.03%/0.02%, JVET-M0068+Forbid 4*4 bi + align SW with WD.
JVET-M0231 CE4-related: Regular merge flag coding [X. Wang, Y.-W. Chen (Kwai Inc.)]
Notes from BoG report review: 0.00% (100%, 96%)/-0.23% (102%, 92%). The codeword length for
regular merge mode becomes the shortest one.
JVET-M0255 AHG11: MMVD without Fractional Distances for SCC [H. Liu, L. Zhang,
K. Zhang, J. Xu, Y. Wang, P. Zhao, D. Hong (Bytedance)]
In the merge with motion vector difference (MMVD) mode, fractional distances including 1/4-pel and
1/2-pel are included in the distance table. However, fractional distances may be inefficient for screen
content. To address this issue, this contribution proposes to disable fractional distances adaptively in
MMVD mode. When fractional distances are disabled, distances in the default table are all multiplied by
4. Simulation results reportedly show 0.76% and 1.66% luma BD-rate saving on SCC sequences in
Random Access configuration and Low Delay B configuration respectively. When CPR is off, 1.58% and
2.74% luma BD-rate saving on SCC sequences are achieved in Random Access configuration and Low
Delay B configuration respectively.
More related to MMVD (not SCC). A similar approach was investigated in CE 4.4.5b (see remarks there),
where switching to an integer table was proposed for UHD resolution.
This contribution was initially discussed Friday 11 January afternoon in Track B; discussion was then
moved to CE4 BoG (see JVET-M0843).
Notes from BoG report review: -1.58%/-2.74% in SCC (TGM class) tests with CPR off, -0.76%/-1.66%
with CPR on, one slice level flag indicates whether distance is full-pel. It is further pointed out in the
Track B discussion that it was demonstrated to provide benefit for UHD when such an option is available
(see under CE 4.4.5). A version shall be used that just multiplies the current MVD distances by 4.
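The full-pel switching described above can be sketched as follows (a hedged illustration; the default table values reflect the VVC draft of the time, and the flag name is illustrative):

```python
# Hedged sketch: when a slice-level flag indicates full-pel MMVD
# distances, every entry of the default distance table is multiplied by 4,
# removing the 1/4-pel and 1/2-pel offsets that are inefficient for
# screen content.

# Default MMVD distances in quarter-pel units (1/4-pel ... 32-pel).
DEFAULT_MMVD_DISTANCES_QPEL = [1, 2, 4, 8, 16, 32, 64, 128]

def mmvd_distance_table(slice_fpel_mmvd_enabled_flag):
    if slice_fpel_mmvd_enabled_flag:
        # Full-pel mode: 1-pel ... 128-pel, no fractional offsets.
        return [4 * d for d in DEFAULT_MMVD_DISTANCES_QPEL]
    return list(DEFAULT_MMVD_DISTANCES_QPEL)
```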
JVET-M0264 Non-CE4: Harmonization between HMVP and GBi [J. Li, S. Wang, W. Gao
(Peking Univ.), L. Zhang, K. Zhang, H. Liu, J. Xu (Bytedance), X. Xiu,
D. Luo, Y. He (InterDigital)]
Notes from BoG report review: Harmonization such that the GBi weight is also stored in HMVP. Can be
seen as a bug fix – gives very small improvement -0.01%/-0.01%.
JVET-M0300 CE4-related: HMVP and parallel processing with tiles and tile groups
[A. M. Kotra, J. Chen, B. Wang, S. Esenlik, H. Gao (Huawei)]
Notes from BoG report review: This resets the history table at the beginning of each tile, however under
the assumption that tile boundaries coincide with CTU boundaries, which may not be the case with
flexible tile concepts. It was suggested that the proponents should communicate with tile experts and see
whether there is a misalignment with concepts that will be put into VVC draft 4. It was later confirmed by
the text editor that only a small (kind of editorial) change would be required (depending on further
discussion on tile concepts, some more alignments may be necessary).
JVET-M0314 CE4-related: MMVD improving with signalling distance table [J. Li, R.-L.
Liao, C. S. Lim (Panasonic)]
Notes from BoG report review: -0.20% (103%, 97%)/0.02% (104%, 100%). Signalling the distance table
in the slice header.
From Track B: Part of the gain comes from the encoder optimization “4.4.5*”. So, the benefit of
signalling as such seems to be low, in particular considering the increased encoder runtime. Not
worthwhile, not in CE.
JVET-M0346 CE4-related: Non-square compression grid for temporal motion data storage
[S. H. Wang (Peking Univ.), X. Zheng (DJI), S. S. Wang, S. W. Ma (Peking
Univ.)]
Notes from BoG report review: 0.06% (100%, 100%)/ 0.23% (100%, 100%). MV stored in grid.
JVET-M0405 CE4-related: Simplified merge candidate list for small blocks [X. Xu, X. Li,
S. Liu (Tencent)]
Notes from BoG report review: 0.09% (106%, 104%)/0.14% (108%, 107%). Only keep one
spatial/temporal candidate without pruning when W*H <=32.
From Track B discussion: the loss does not justify the simplification, as there is no real complexity
problem at the decoder. No value was seen for investigating this in a CE.
JVET-M0406 CE2/4-related: Unified merge list size for block and sub-block merge modes
[X. Xu, X. Li, S. Liu (Tencent)]
Notes from BoG report discussion: 0.06% (101%, 101%)/ 0.10% (100%, 103%), unified merge list size =
5; -0.02% (101%, 101%)/ 0.00% (100%, 103%), merge list size = 6
See also JVET-M0661.
From the discussion in Track B: VTM3 has the choice of signalling merge list size. This is inherited from
HEVC, where the design choice was to give an encoder the option to check fewer merge candidates;
however, it imposes some burden on decoders, as the parsing and signalling of merge candidates depend
on the selected maximum number. It was discussed whether such encoder choice is still relevant for
VVC.
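The parsing dependency at issue can be sketched as follows (a hedged illustration; `read_bin` is a stand-in for a CABAC bin decoder, and the truncated-unary structure mirrors the HEVC-style merge index binarization):

```python
# Hedged sketch: merge_idx uses a truncated code whose maximum value
# depends on the signalled maximum number of merge candidates, so a
# decoder cannot parse the index without knowing the encoder's list size.

def parse_merge_idx(read_bin, max_num_merge_cand):
    c_max = max_num_merge_cand - 1
    idx = 0
    # Keep reading continuation bins until a 0 or until c_max is reached.
    while idx < c_max and read_bin():
        idx += 1
    return idx
```

For example, with list size 5 the bin string "1110" decodes to index 3, and "1111" terminates at index 4 without reading a further bin.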
JVET-M0435 CE4-related: MMVD offset table signalling [G. Li, X. Xu, X. Li, S. Liu
(Tencent)]
Notes from BoG report review: -0.19% (100%, 99%)/0.01% (103%, 97%). Like JVET-M0314 but with
differential coding.
The same comment applies as above for JVET-M0314.
JVET-M0484 Non-CE4: Line buffer size reduction method for generalized bi prediction
[T. Solovyev, H. Gao, S. Esenlik, S. Ikonin, J. Chen (Huawei)]
JVET-M0502 CE4-related: Improved context for prediction mode flag [X. Zhao, X. Li,
S. Liu (Tencent)]
Notes from BoG report review: -0.09%/-0.02%, add one context to code pred_mode_flag, no change of
run time.
JVET-M0661 AHG-13: On Merge List Size [X. Li, X. Xu, S. Liu (Tencent)] [late]
Notes from BoG report discussion: the CTC uses a unified merge list size = 5.
See also JVET-M0406.
From the discussion in Track B: VTM3 has the choice of signalling merge list size. This is inherited from
HEVC, where the design choice was to give an encoder the option to check fewer merge candidates;
however, it imposes some burden on decoders, as the parsing and signalling of merge candidates depend
on the selected maximum number. It was discussed whether such encoder choice is still relevant for
VVC.
JVET-M0344 Crosscheck of JVET-M0089 (Non-CE5: CABAC skip mode for super low
delay) [R. Hashimoto (Renesas)]
JVET-M0772 CE5-related: Clean up of the context model initialization process for CE5.1.5
and CE5.1.6 [J. Stegemann, H. Kirchhoffer, D. Marpe, H. Schwarz,
T. Wiegand (HHI)] [late]
Detailed presentation of this was not requested by the presenter, due to actions taken earlier at the
meeting.
JVET-M0269 Non-CE6: Extension of transform skip block size to 8x8 [S. Yoo, J. Choi,
J. Heo, J. Choi, L. Li, J. Lim, S. Kim (LGE)]
JVET-M0280 CE6-related: Context selection for entropy coding the MTS flag [S.-T.
Hsiang, S.-M. Lei (MediaTek)]
JVET-M0354 CE6-related: MTS with Haar transform for Screen Contents Coding
[K. Naser, F. Galpin, T. Poirier (Technicolor)]
JVET-M0396 CE6-related: MTS kernel derivation for efficient memory usage [S. Shrestha,
A. Kumar, B. Lee (Chosun Univ), Y. Lee, J. Park (Humax)] [late]
Initial upload rejected as a placeholder.
JVET-M0501 CE6 related: Unification of Transform Skip mode and MTS [X. Zhao, X. Li,
S. Liu (Tencent)]
JVET-M0891 BoG report on CE7 related quantization and coefficient coding contributions
[Y. Ye]
This report was discussed in Track A Thursday 17 January 1700-1800 (GJS).
There were 13 technical contributions in the CE7 related category. These contributions are classified into
the following three categories:
Rice parameter related
Complexity reduction and simplification
Coding efficiency
The first BoG session was held 1900–2345 on January 14 2019 in Saba.
The second BoG session was held 1300–1400 on January 17 2019 in Saba.
Section 1 of this document summarizes the BoG’s recommendations. Section 2 of this document contains
detailed notes on BoG discussion.
Recommended adoptions:
JVET-M0470, CE7-related: Golomb-Rice/exponential Golomb coding for abs_remainder and
dec_abs_level syntax elements
o In VTM 3.0, the worst case is 33 bits for abs_gt3_level_flag and 35 bits for dec_abs_level for
escape codes of coefficient levels, in comparison with 32 bits for both syntax elements in
HEVC.
o Two aspects in this proposal: the first aspect uses a constant transition prefix code length of
6, and eliminates a LUT; the second aspect uses HEVC RExt extended precision scheme to
limit worst case code length to 32 bits
o Performance: 0.00%/0.01%/0.02% for AI/RA/LB
o The proposed aspects seemed to be straightforward and aligned with HEVC. It would also solve a
software issue that currently exists in VTM 3 that could cause a decoder crash.
Decision (BF): Adopt JVET-M0470.
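The bin-length cap described in the bullets above can be sketched as follows (a simplified, hedged illustration; the exact HEVC-RExt extended-precision escape arithmetic is not reproduced, only the bounding effect):

```python
# Hedged sketch: a Golomb-Rice prefix is truncated at a constant maximum
# length (6), and values exceeding it escape to a fixed-length suffix
# sized so that the whole codeword never exceeds 32 bins.

MAX_PREFIX_LEN = 6
MAX_CODE_LEN = 32

def binarize_remainder_length(value, rice_param):
    """Codeword length in bins for one remainder value."""
    prefix = value >> rice_param
    if prefix < MAX_PREFIX_LEN:
        # Normal Rice codeword: unary prefix + terminator + suffix bits.
        return prefix + 1 + rice_param
    # Escape: 6 prefix ones plus a fixed-length suffix filling up to 32 bins.
    return MAX_PREFIX_LEN + (MAX_CODE_LEN - MAX_PREFIX_LEN)
```

Every codeword length is thus bounded by 32 bins, unlike the 33/35-bin worst case reported for VTM 3.0.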
JVET-M0251 (Non-CE7: Last position coding for large block-size transforms) and JVET-M0257
(CE7-related: coefficient scanning and last position coding for TUs of greater than 32 width or
height)
o Three contributions in this BoG tried to address the coefficient coding issues due to high
frequency zeroing: JVET-M0250, JVET-M0251 (which is a superset of JVET-M0250), and
JVET-M0257
o Propose to scan only the portions of the transform blocks that have not been zeroed out
o JVET-M0257 and JVET-M0251 are identical from spec text point of view, but simulation
results are slightly different due to some difference in encoder implementation
o M0257: 0.00% AI, -0.01% RA, -0.01% LB
o M0251: 0.01% AI, 0.01% RA, full set of LB results not yet available at time of discussion
Overall Y U V Ave-UV
AI -0.29% -0.79% -2.16% -1.48%
RA -0.19% -1.44% -2.00% -1.72%
LD-B -0.08% -2.38% -4.83% -3.61%
LD-P -0.11% -2.46% -5.08% -3.77%
o This method is simpler than an LM chroma mode that predicts Cr from Cb that was
previously proposed, and achieves similar coding performance as that previous LM chroma
mode. The complexity reduction of this proposed method compared to the previous LM chroma
chroma mode could be especially beneficial for the decoder.
o It seems that such coding gain was usually not available from coefficient coding methods.
o Low QP results (provided in a v3 of the contribution) were discussed at the second BoG
session. The chroma gains are reported to be somewhat lower than those of CTC, but the
gains in luma are similar to (AI case) or higher than (RA and LB cases) those of CTC.
o In Track A, a participant wondered whether there could be subjective artefacts.
o A participant asked how many blocks used this, and it was said that around half of the time
that both channels had a residual, it would use this mode.
o The proponent said it seems to also work for HDR.
Further study in a CE was planned for this.
BoG recommendations for CE testing:
CE on coefficient coding (cleanup):
o JVET-M0198, CE7-related: Unified Rice parameter derivation for coefficient level coding
(some more computes but hardware friendly, removing one variation)
o No test of this one (some loss): JVET-M0469, CE7-related: Unified Rice parameter
derivation for coefficient coding
o JVET-M0558: CE7-related: Template-based Rice parameter derivation (some more storage
of reconstructed values and computes but hardware friendly, removing one variation)
o No test of this one (don’t worry about number of contexts at this point): JVET-M0107: CE7-
related: Reduced local neighbourhood usage for transform coefficients coding
o No test of this one (don’t worry about number of contexts at this point): JVET-M0489: CE7-
related: Reduced context models for transform coefficients coding
JVET-M0198 CE7-related: Unified Rice parameter derivation for coefficient level coding
[Y. Piao, K. Choi (Samsung)]
JVET-M0250 Non-CE7: Simplified CSBF coding for large block-size transforms [J. Choi,
J. Heo, S. Yoo, J. Choi, L. Li, J. Lim, S. Kim (LGE)]
JVET-M0251 Non-CE7: Last position coding for large block-size transforms [J. Choi,
J. Heo, S. Yoo, J. Choi, L. Li, J. Lim, S. Kim (LGE)]
JVET-M0646 Crosscheck of JVET-M0251 (Non-CE7: Last position coding for large block-
size transforms) [H. Schwarz (Fraunhofer HHI)] [late]
JVET-M0279 Non-CE7: Sign coding for transform skip [S. Yoo, J. Choi, J. Heo, J. Choi,
L. Li, J. Lim, S. Kim (LGE)]
JVET-M0151 CE8-related: Virtual search area for current picture referencing (CPR)
[L. Pham Van, T. Hsieh, W.-J. Chien, V. Seregin, H. Wang, M. Karczewicz
(Qualcomm)]
This contribution proposes a method to extend the CPR search area by padding the area using the line
buffer and the column to the left of the current CTU. Without any additional memory cost, the luma
BD-rate changes for [CTC, Class F, class SCC 1080p] with the use of the proposed technique are
reported for test 1 as follows:
CPR on and PLT off:
AI: (0.02%, -0.39%, -1.23%) over VTM3+CPR on, resp.
RA: (0.02%, -0.29%, -0.60%) over VTM3+CPR on, resp.
LB: (-0.03%, -0.20%, -0.35%) over VTM3+CPR on, resp.
CPR on and PLT on:
JVET-M0254 Non-CE8: Subblock Operation Removal for Chroma CPR [J. Xu, K. Zhang,
L. Zhang, H. Liu, Y. Wang, P. Zhao, D. Hong (Bytedance)]
In the current design of CPR, when separate tree is applied, chroma blocks inherit collocated luma
blocks’ block vectors, which requires sub-block by sub-block operation. This contribution proposes to
remove such a subblock-wise operation and reports a 0.26% coding gain for class TGM420.
Similar to JVET-M0174 (see notes on that contribution).
JVET-M0255 AHG11: MMVD without Fractional Distances for SCC [H. Liu, L. Zhang,
K. Zhang, J. Xu, Y. Wang, P. Zhao, D. Hong (Bytedance)]
In the merge with motion vector difference (MMVD) mode, fractional distances including 1/4-pel and
1/2-pel are included in the distance table. However, fractional distances may be inefficient for screen
content. To address this issue, this contribution proposes to disable fractional distances adaptively in
MMVD mode. When fractional distances are disabled, distances in the default table are all multiplied by
4. Simulation results reportedly show 0.76% and 1.66% luma BD-rate saving on SCC sequences in
Random Access configuration and Low Delay B configuration respectively. When CPR is off, 1.58% and
2.74% luma BD-rate saving on SCC sequences are achieved in Random Access configuration and Low
Delay B configuration respectively.
More related to MMVD (not SCC). A similar approach was investigated in CE 4.4.5b (see remarks there),
where switching to an integer table was proposed for UHD resolution.
Initially discussed Friday 11 January afternoon in Track B; discussion moved to CE4 BoG (see JVET-
M0843).
From the discussion, it is generally agreed that it would be much more desirable to handle CPR as a
separate mode, and remove the signalling via reference picture list, and not use a P picture for that
purpose. For VVC, there is no need to design it the same way as in HEVC (in HEVC, P picture was re-
used, as CPR was defined in a later stage).
This would also resolve the problem of prohibiting combination with other MC tools.
It is however desirable to re-use building blocks from MC.
JVET-M0327 does not come with a syntax proposal.
JVET-M0483 goes a similar direction. See further notes on that contribution.
JVET-M0333 Non-CE8: Coding on block vector difference [J. Nam, J. Lim, S. Kim (LGE)]
This contribution proposes a block vector difference coding for CPR mode. Considering that the absolute
value of the block vector difference (i.e., abs_mvd_minus2) for CPR mode is larger than that of a
conventional motion vector difference, a higher-order Exp-Golomb code (K equal to 3) is applied. From
experiment results, the following luma BD-rate changes are observed over VTM-3.0 with CPR:
AI: -0.02% (CTC), -0.19% (Class F), -0.41% (SCC)
RA: 0.00% (CTC), -0.27% (Class F), -0.10% (SCC)
LD: 0.00% (CTC), -0.15% (Class F), -0.76% (SCC)
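The effect of the higher code order can be illustrated as follows (a hedged sketch using the standard order-k Exp-Golomb length formula; the comparison values are illustrative):

```python
# Hedged sketch: codeword length of an order-k Exp-Golomb code. A larger
# k shortens codewords for large magnitudes (typical of CPR block vector
# differences) at the cost of small ones, which is the rationale for k = 3.

def exp_golomb_length(value, k):
    """Bit length of a non-negative value in order-k Exp-Golomb."""
    code = value + (1 << k)
    n = code.bit_length() - 1   # floor(log2(code))
    return 2 * n + 1 - k        # unary prefix + info bits
```

For example, value 100 costs 13 bits at order 0 but only 10 bits at order 3, while value 0 costs 1 bit at order 0 and 4 bits at order 3.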
This would require a different binarization in the context of MV coding. During discussion in a JVET
plenary about the importance of consistency between CPR and MV coding, and whether such low-level
changes should be encouraged, it was agreed that they could be done if they provide sufficient benefit.
This proposal should be studied in a CE to exercise such a case, though it remains TBD what “sufficient”
means in the case of screen content.
It is noted that the same statement applies to methods that had been investigated in CE8.1, provided they
deliver similar gain.
JVET-M0393 Non-CE8: chroma block vector initialization for CPR in dual tree
[T. Poirier, F. Le Léannec, F. Galpin (Technicolor)]
This contribution describes a method to initialize chroma block vectors when dual tree is enabled and
collocated luma block uses CPR partially. The method is implemented on top of VTM-3.0 with chroma
subpel interpolation activated. The method is reported to result in PSNR-Y, U, V BD-rate variations of
-0.01%, 0.00%, -0.01% with 100% and 99% encoding and decoding times in AI configuration, and 0%,
-0.01%, -0.03% with 100% and 100% encoding and decoding times in RA configuration.
Further results in the document show somewhat higher gains for class TGM.
There are two aspects: subpel chroma (4-tap) and determination of a chroma CPR vector in case of dual
tree when no luma vector is available.
With dual tree enabled (which is CTC and has also been known to be most beneficial in combination with
CPR), it gives 0.3% for class TGM and 0.1% for class F. This is rather low and does not justify the
additional complexity (e.g. determining and checking the derived vector, interpolation).
JVET-M0410 Non-CE8: CPR flag signalling at slice level [X. Xu, X. Li, S. Liu (Tencent)]
This contribution proposes to control the usage of CPR at slice level, in addition to the current CPR SPS
flag. That means that when the CPR SPS flag is true, an encoder can choose, for each slice, to turn the
usage of CPR on or off. On top of this change, an encoder algorithm is provided such that when the hash hit
rate of a picture is below a certain threshold, the slice level CPR is turned off. The intent of the suggested
algorithm is that CPR mode is turned off for slices where CPR may not contribute much to the BD rate
reduction while CPR mode is turned on for screen content materials. Simulation results report that:
The BD rate/runtime changes for CTC average are -0.01%/-0.01%/0.13% and 103%/101%/103% for
AI/RA/LB, separately, when compared to VTM-3.0+CPR=0 anchor;
The BD rate/runtime changes for Class F average are 0.13%/0.07%/0.03% and 96%/100%/99% for
AI/RA/LB, separately, when compared to VTM-3.0+CPR=1 anchor
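The encoder heuristic described above can be sketched as follows (a hedged illustration; the threshold value and function names are assumptions, not taken from the contribution):

```python
# Hedged sketch: slice-level CPR is switched off when the fraction of
# blocks whose hash matches earlier content in the picture falls below a
# threshold, i.e. when the content does not look like screen content that
# CPR could exploit.

HASH_HIT_RATE_THRESHOLD = 0.3  # assumed value for illustration

def slice_cpr_enabled(sps_cpr_flag, num_hash_hits, num_blocks):
    if not sps_cpr_flag:
        return False  # SPS flag gates the slice-level decision
    hit_rate = num_hash_hits / max(num_blocks, 1)
    return hit_rate >= HASH_HIT_RATE_THRESHOLD
```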
JVET-M0411 Non-CE8: Inter mode related flag signalling when current picture is the only
reference picture [X. Xu, X. Li, S. Liu (Tencent)]
This contribution proposes not to signal the following inter coding related flags in condition that the
current picture is the only reference picture for a slice. Specifically, sub-block merge flag, affine flag,
mmvd flag and intra-inter flag are not signalled. Simulation results report that BD rate changes are
negligible (within +-0.1%) when compared to VTM-3.0 anchor (CPR=1).
No need for further discussion – see notes under JVET-M0175.
JVET-M0417 CE8-related: Combination test of CE8.2.2 and CE8.2.5 [Y.-C. Sun, J. Lou
(Alibaba)]
This document reports the results of the combination of CE8.2.2 (palette mode and intra mode
combination) and CE8.2.5 (separated palette design for the joint CUs containing both luma and chroma
CBs). The tests are performed on top of CE8.2.1 (JVET-M0050). Compared with CE8.2.1, when the
method is tested with CPR turned off in the setting, the results of the combination show that:
1) For 4:2:0 TGM sequences, the proposed method reportedly provides 2.5%, 2.8%, and 3.0% BD-rate
gains for luma with 92%, 99% and 98% for encoding time, and 101%, 97% and 92% in decoding
time under AI, RA, and LD configurations, respectively.
2) For Class F sequences, the proposed method reportedly provides 0.3%, 0.5%, and 1.5% BD-rate gains
for luma, with 100%, 99% and 99% for encoding time, and 101%, 98% and 97% in decoding time
under AI, RA, and LD configurations, respectively.
When the method was tested with CPR turned on in the setting,
1) For 4:2:0 TGM sequences, the proposed method reportedly provides 0.9%, 1.3%, and 2.4% BD-rate
gains for luma with 95%, 99% and 100% for encoding time, and 100%, 98% and 100% in decoding
time under AI, RA, and LD configurations, respectively.
2) For Class F sequences, the proposed method reportedly provides 0.0%, 0.2%, and 0.9% BD-rate gains
for luma, with 99%, 100% and 103% for encoding time, and 100%, 101% and 98% in decoding time
under AI, RA, and LD configurations, respectively.
It’s worthwhile to note that, compared with the results of CE8.2.2 and CE8.2.5 individually, the results of
the proposed combination show further synergy between CE8.2.2 and CE8.2.5.
The contribution shows that palette has some subjective benefit.
Goal is to further improve palette over the basis configuration. Further study.
JVET-M0419 CE8-related: Context modeling on palette mode flag [Y.-C. Sun, J. Lou
(Alibaba)]
This document modifies context modeling for palette flag signalling of CE8.2.1. Compared with CE8.2.1,
the results of the proposed method show that, under AI configurations, the proposed method provides
0.1% and 0.1% BD-rate gains for luma for 4:2:0 TGM when the method is tested with CPR turned off and
on, respectively.
Relatively low gain in terms of TGM.
JVET-M0464 Non-CE8: Unified Transform Type Signalling and Residual Coding for
Transform Skip [B. Bross, T. Nguyen, P. Keydel, H. Schwarz, D. Marpe,
T. Wiegand (HHI)]
This contribution proposes two modifications related to transform type signalling and residual coding for
transform skip. The first modification includes aligning the cases in which transform skip (TS) and
multiple transform selection (MTS) apply. This limits TS to luma transform blocks as well as extends it to
transform block sizes up to 32x32. The second modification consists of a modified transform
coefficient level coding for the TS residual. Relative to the regular residual coding case, the residual
coding for TS includes no signalling of the last x/y position, coded_sub_block_flag coded for every
subblock, sig_coeff_flag context modelling with reduced template, a single context model for
abs_level_gt1_flag and par_level_flag, context modeling for the sign flag, additional greater than 5, 7, 9
flags, modified Rice parameter derivation for the remainder binarization and a limit for the number of
context coded bins per sample. The proposed joint signalling (first modification) provides on average
(BD-rate Y, enc. time, dec.time):
Natural: 0.01%, 103%, 101% (AI), -0.02%, 98%, 100% (RA), -0.07%, 102%, 102% (LB)
Class F: -1.96%, 103%, 97% (AI), -2.14%, 98%, 100% (RA), -2.79%, 103%, 99% (LB)
TGM: -8.04%, 106%, 88% (AI), -8.74%, 106%, 96% (RA), -9.21%, 112%, 97% (LB)
Additionally changing the residual coding for transform skip (first modification and second modification)
results on average in (BD-rate Y, enc. time, dec.time):
Natural: -0.16%, 103%, 100% (AI), -0.07%, 98%, 101% (RA), -0.07%, 101%, 101% (LB)
Class F: -7.16%, 104%, 94% (AI), -5.76%, 98%, 100% (RA), -5.85%, 102%, 99% (LB)
TGM: -21.03%, 108%, 82% (AI), -15.57%, 105%, 94% (RA), -14.01%, 111%, 95% (LB)
v2 provides updated full frame results, details about the combined TS/MTS encoder search with results
for different encoder operation points as well as draft text for unified TS/MTS syntax.
v3 corrects wrongly pasted results to match with the cross-check provided in JVET-M0708.
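The limit on context-coded bins per sample mentioned above can be illustrated with a toy budget model. This is a sketch under assumed per-level bin counts (sig_coeff_flag, gt1, parity), not the draft-text algorithm: once the block's budget is exhausted, further bins would be bypass-coded.

```python
def code_ts_levels(levels, max_ctx_bins_per_sample=2):
    """Toy model of a context-coded bin budget for transform-skip
    residual coding. Returns (context_bins, bypass_bins) used.
    Illustrative only, not the actual VVC binarization."""
    budget = max_ctx_bins_per_sample * len(levels)
    ctx_bins = bypass_bins = 0
    for lvl in levels:
        # up to 3 candidate context bins per level in this toy model:
        # sig_coeff_flag, abs_level_gt1_flag, par_level_flag
        needed = 1 + min(abs(lvl), 2)
        take = min(needed, budget)
        ctx_bins += take
        budget -= take
        bypass_bins += needed - take
        # the remainder is always bypass-coded (Rice), not counted here
    return ctx_bins, bypass_bins
```

With a budget of 2 bins per sample, a small block may code all flags in context mode; with a tighter budget, the later flags fall back to bypass coding, which bounds worst-case throughput.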
JVET-M0483 CE8-related: CPR mode signalling and interaction with inter coding tools
[W.-J. Chien, V. Seregin, M. Karczewicz (Qualcomm)]
This contribution proposes two schemes to align the interaction between CPR mode with all other inter
coding tools. The method is implemented on top of VTM-3.0 with CPR enabled. The first scheme
removes CPR dependency in motion derivation process. It results in negligible BD-rate difference for all
test configurations on regular or screen content materials. The second scheme uses CPR mode as a third
mode other than intra or inter modes. Motion predictor for CPR mode and inter mode would be derived
mutually exclusively from each other. The simulations show 1.1% in the RA configuration and 1.5% in the
LDB configuration on screen content coding materials.
(see also discussion under JVET-M0327)
In Track B, the general consensus is that it is the right direction to signal CPR as a separate mode rather
than using a special reference picture index. As this is a fundamental conceptual decision, this was later
discussed and decided in the JVET plenary.
JVET-M0541 Non-CE8: Combination of MMVD and CPR mode [Y. Li, Z. Chen (Wuhan
Univ.), X. Xu, S. Liu (Tencent)] [late]
This document describes unification methods of CPR merge mode with MMVD expansions. In the
proposed methods, merge candidates with default merge type (MRG_TYPE_DEFAULT_N) and CPR
merge type (MRG_TYPE_CPR) may both exist in the MMVD candidate list. Thus CPR will not be
excluded when MMVD is used. It is reported that the BD rate can be improved by enabling CPR with
MMVD expansions.
Several methods are proposed, such as clipping the fractional pel offset into integer values. The average
BD rate gain is around 0.3% for TGM, and 0.1% for class F in AI. Similar gain was reported in JVET-
M0341.
Further study.
JVET-M0542 Non-CE8: Combination of Multi Hypothesis Intra and CPR mode [Y. Li,
Z. Chen (Wuhan Univ.), X. Xu, S. Liu (Tencent)] [late]
This document proposes to unify the CPR mode operation with multi hypothesis intra mode. In the
proposed method, a multi-hypothesis can be a combination of intra mode and CPR merge mode. Thus
CPR mode is unified with inter mode coding when multi hypothesis intra mode tool is used. It is reported
that the BD rate changes by enabling CPR with multi-hypothesis intra mode are negligible.
The contribution shows that it would not be a problem to release the current bitstream constraint of
disallowing combination of CPR and CIIP. On the other hand, results show that it obviously does not give
benefit in terms of compression performance.
May become obsolete if CPR becomes a separate mode.
JVET-M0544 Non-CE8: CPR with chroma 4x4 sub-block size when dual-tree is on [X. Xu,
X. Li, S. Liu (Tencent)] [late]
This contribution proposes to unify the handling of chroma sub-block size for CPR mode to that of inter
sub-block mode (4x4), when dual-tree is enabled for CPR. For each group of four 2x2 chroma sub-blocks
in a chroma CU under dual-tree condition, the collocated luma block vector of the top-left 2x2 chroma
sub-block is used to derive the chroma block vector for all four sub-blocks. Simulation results report that:
The BD rate changes for CTC average are 0.00%/0.00%/0.00% when compared to VTM-3.0+CPR=1
anchor
The BD rate changes for Class F average are 0.06%/0.11%/0.07% when compared to VTM-
3.0+CPR=1 anchor
The BD rate changes for Class SCC 1080p average are 0.57%/0.25%/-0.08% when compared to
VTM-3.0+CPR=1 anchor
This does not have an advantage for CPR: due to the integer precision of displacement vectors, the 2x2
subblocks are not a problem there. The method causes loss, and there may even be a possibility of
luma/chroma mismatch, e.g. at edges.
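The sub-block grouping proposed in JVET-M0544 can be sketched as follows. `luma_bv` is a hypothetical helper returning the collocated luma block vector at a luma position; 4:2:0 sampling is assumed, so chroma position (x, y) maps to luma (2x, 2y).

```python
def derive_chroma_bvs(luma_bv, cu_w, cu_h):
    """For each 4x4 chroma sub-block (a group of four 2x2 sub-blocks),
    take the block vector of the luma block collocated with the
    top-left 2x2 chroma sub-block and apply it to all four 2x2
    sub-blocks. Illustrative sketch; cu_w/cu_h are chroma dimensions."""
    bvs = {}
    for y in range(0, cu_h, 4):          # 4x4 chroma grid
        for x in range(0, cu_w, 4):
            bv = luma_bv(2 * x, 2 * y)   # collocated luma of top-left 2x2
            for sy in (0, 2):
                for sx in (0, 2):
                    bvs[(x + sx, y + sy)] = bv
    return bvs
```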
JVET-M0669 Crosscheck of JVET-M0544 (Non-CE8: CPR with chroma 4x4 sub-block size
when dual-tree is on) [J. Nam (LGE)]
JVET-M0634 Affine motion mode in intra coding [S. Cao, H. Han, J. Wang, F. Liang,
Y. Yu, Y. Liu] [late]
This was discussed in Track A Thursday night 1115-1130 (GJS & F. Bossen) – picked up by Track A to
ensure presentation before the meeting ended.
In the contribution, the results of affine motion mode in intra coding are reported. This proposed
technique combines the affine motion mode with current picture referencing (CPRAffine). The use of
CPRAffine is signalled by using a reference picture index pointing to this current reference picture. In this
way, the syntax structure and decoding process of CPRAffine mode are aligned with the regular intra
mode. The proposed test reportedly shows BD-rate gains under AI over VTM anchors of:
-0.08%/-0.06%/-0.00% for the Y, U, and V components of Class F, respectively.
The runtime impact was reported as 213%. This is clearly not a good tradeoff, but it was commented that
there might be some issue with how this is implemented. Further study was suggested to determine
whether there is some problem with the implementation.
JVET-M0822 Non-CE8: Encoder optimization for palette mode [H. Wang, Y.-H. Chao,
V. Seregin, M. Karczewicz (Qualcomm)] [late]
This contribution proposes an encoder optimization for palette mode. The results show more than 20%
encoding time reduction for AI for screen content sequences. When tested with CPR enabled, the partial
results show that the encoding time is reduced from 157% to 115% with only 0.14% (Y) BD rate loss for
TGM420 sequences, and the encoding time is reduced from 174% to 151% with 0.11% (Y) BD rate loss
for class F.
Early termination criteria are used to avoid full checking of all intra modes if palette mode shows good
performance (i.e. testing palette before intra modes but after CPR). Further, termination criteria for not
testing palette are applied for areas that are too small (which also avoids checks in partitioning).
In the setup of the CE, this might be considered as becoming part of the CE software (for the palette basis
configuration 8.2.1).
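A minimal sketch of the search order and early-termination idea described above. The threshold and cost model are illustrative assumptions, not the proposal's actual criteria; `costs` maps mode names to RD-cost callables.

```python
def choose_intra_family_mode(costs, min_area=64, area=256):
    """Toy mode-decision order from the notes: test CPR first, then
    palette, and skip the full intra search when palette already looks
    good; skip palette entirely for areas that are too small."""
    if area < min_area:
        candidates = ["cpr", "intra"]          # no palette on small areas
    else:
        candidates = ["cpr", "palette", "intra"]
    best_mode, best_cost = None, float("inf")
    for mode in candidates:
        if (mode == "intra" and best_mode == "palette"
                and best_cost < 0.9 * costs["cpr"]()):
            continue  # early termination: palette clearly wins, skip intra
        c = costs[mode]()
        if c < best_cost:
            best_mode, best_cost = mode, c
    return best_mode, best_cost
```

The encoding-time savings reported in the contribution come from skipping the full intra search in exactly this kind of situation.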
JVET-M0890 CE9-related: BDOF buffer reduction and enabling VPDU based application
[H. Chen, X. Ma, S. Esenlik, H. Yang, J. Chen (Huawei), K. Kondo,
M. Ikeda, T. Suzuki] [late]
See notes under JVET-M0858.
JVET-M0115 CE10-related: pipeline reduction for LIC and GBI [P. Bordes, F. Galpin,
T. Poirier (Technicolor)]
JVET-M0183 CE10-related: Simplification of MPM generation for CIIP [M.-S. Chiang, C.-
W. Hsu, Y.-W. Huang, S.-M. Lei (MediaTek)]
JVET-M0276 CE10-related: MPM list alignment between CIIP and intra mode [J. Li,
S. Wang, W. Gao (Peking Univ.), L. Zhang, K. Zhang, H. Liu, J. Xu
(Bytedance)]
JVET-M0329 CE10-related: Modified enabling condition for triangle prediction unit mode
[F. Chen, L. Wang (Hikvision)]
JVET-M0450 CE10-related: LIC inheritance restrictions and interaction with GBI [M. Xu,
X. Li, X. Xu, S. Liu (Tencent)]
JVET-M0848 CE10 related Document: Speedups for Uniform Directional Diffusion Filters
For Video Coding (JVET-M0042) [J. Rasch, A. Henkel, J. Pfaff, H. Schwarz,
D. Marpe, T. Wiegand (HHI)] [late]
JVET-M0851 CE10-related: Using inter merge list derivation for triangle mode [H. Wang,
W.-J. Chien, V. Seregin, Y.-H. Chao, H. Huang, M. Karczewicz (Qualcomm),
X. Wang, Y.-W. Chen (Kwai), T.-D. Chuang, C.-Y. Chen, Y.-W. Huang, S.-
M. Lei (MediaTek), A. Tamse, M. W. Park, S. Jeong, M. Park, K. Choi
(Samsung)] [late]
JVET-M0883 CE10-related: Using regular merge index signalling for triangle mode
[H. Wang, W.-J. Chien, V. Seregin, Y.-H. Chao, H. Huang, M. Karczewicz
(Qualcomm), X. Wang, Y.-W. Chen (Kwai), T. Solovyev, S. Esenlik,
S. Ikonin, J. Chen (Huawei), M. Xu, X. Li, S. Liu (Tencent)] [late]
JVET-M0187 CE11-related: Long deblocking filters with reduced line buffer requirement
and enhanced parallel processing accessibility [C.-M. Tsai, C.-W. Hsu, Y.-W.
Huang, S.-M. Lei (MediaTek)]
This contribution proposes two aspects of modifications on top of the CE11.1.3 long deblocking filters in
JVET-M0186. In both VTM3.0 and CE11.1.3, deblocking can be performed at 8x8 grids and is skipped at
non-8x8 grids. The maximum numbers of to-be-read luma samples on one side of an edge during
deblocking are four and eight for VTM3.0 and CE11.1.3, respectively. The maximum numbers of to-be-
modified luma samples on one side of an edge during deblocking are three and seven for VTM3.0 and
6.13 CE13 related – Coding tools for 360° omnidirectional video (8)
Contributions in this category were considered in a BoG reported in JVET-M0874. See section 5.13.
Contributions in this category were discussed XXday X Jan. XXXX–XXXX (chaired by XXX).
JVET-M0368 AHG8: 360Lib support for chroma sample location in PHEC blending
process [C.-H. Shih, Y.-H. Lee, J.-L. Lin, Y.-C. Chang, C.-C. Ju (MediaTek)]
JVET-M0547 360° coding tools using uncoded areas [J. Sauer, M. Bläser (RWTH Aachen
Univ.)] [late]
JVET-M0892 CE-13 related: Loop filter disabled across virtual boundaries [S.-Y. Lin,
L. Liu, J.-L. Lin, Y.-C. Chang, C.-C. Ju (MediaTek), P. Hanhart, Y. He
(InterDigital)] [late]
JVET-M0162 Adaptive loop filter with a maximum number of luma filters per slice
constraint [C.-Y. Chen, Z.-Y. Lin, C.-Y. Lai, Y.-W. Huang, S.-M. Lei
(MediaTek)]
This contribution proposes a “maximum number of luma filters per slice” constraint to adaptive loop filter
(ALF) in order to reduce the storage of filter coefficients in the on-chip memory. In the current ALF
design, up to 25 luma filters can be signalled and used in one slice, which requires one on-chip memory
with size equal to 25 filters x 13 coefficients per filter x 8 bits per coefficient = 2600 bits. The proposed
method is to constrain the number of luma filters per slice. When the maximum number of luma filters is
reduced from 25 to 16, the BD-rates for AI, RA, and LB are 0.00%, 0.00%, and 0.00%, respectively and
36% on-chip memory of luma filter coefficients is saved. When the maximum number of luma filters is
JVET-M0163 Adaptive loop filter with history filters [C.-Y. Chen, Z.-Y. Lin, C.-Y. Lai, Y.-
W. Huang, S.-M. Lei (MediaTek)]
This contribution proposes history filters in adaptive loop filter (ALF). The concept of history filters is to
allow using history filters for the current slice to increase coding efficiency, where the history filters are
decoded filters from previously decoded slices. When the maximum number of history filter set is five,
where one filter set contains all signalled filters of one slice, BD-rate savings are 0.16% and 0.34% for
RA and LB, respectively, and the required memory of history filter set storage is 1660 bytes (5 sets * 25
luma filters per set * 13 coefficients per luma filter * 1 byte per coefficient + 5 sets * 1 chroma filter per
set * 7 coefficients per chroma filter * 1 byte per coefficient). To reduce the memory requirement, it is
further proposed to apply history filters with the “maximum number of luma filters per slice” constraint in
JVET-M0162. When the maximum number of luma filters per slice is reduced from 25 to 16, BD-rate
savings are 0.16% and 0.33% for RA and LB, respectively, and the required memory size is 1075 bytes
(65% of no constraint). When the maximum number of luma filters per slice is 12, BD-rate savings are
0.15% and 0.33% for RA and LB, respectively, and the required memory size is 815 bytes (49% of no
constraint).
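The storage figures quoted above can be reproduced directly from the stated dimensions (13 coefficients per luma filter for the 7x7 diamond with symmetry, 7 per chroma filter for the 5x5 diamond, one byte per coefficient):

```python
def history_filter_memory(num_sets=5, luma_filters_per_set=25,
                          luma_coeffs=13, chroma_filters_per_set=1,
                          chroma_coeffs=7, bytes_per_coeff=1):
    """Reproduce the storage figures quoted in JVET-M0163."""
    luma = num_sets * luma_filters_per_set * luma_coeffs
    chroma = num_sets * chroma_filters_per_set * chroma_coeffs
    return (luma + chroma) * bytes_per_coeff

# 25 luma filters per set -> 1660 bytes, 16 -> 1075, 12 -> 815,
# matching the numbers in the contribution.
```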
The proposal targets compression improvement by re-using filters from previous pictures. The argument
of saving on-chip memory (as under JVET-M0162) is not fully understood, as it would be external
memory that is needed to store the coefficients along with the reference pictures.
The reduction in bit rate is relatively low, and the approach would introduce some additional dependency
between parameters of pictures. The current design of coding ALF parameters independently for every
picture is quite clean.
It was also asked how it is determined which filter to use. This is based on matching with the covariance
matrix.
No action.
JVET-M0619 Crosscheck of JVET-M0163 (Adaptive loop filter with history filters) [Y.-W.
Chen (Kwai Inc.)] [late]
JVET-M0164 Adaptive loop filter with virtual boundary processing [C.-Y. Chen, T.-D.
Chuang, Z.-Y. Lin, C.-Y. Lai, Y.-W. Huang, S.-M. Lei (MediaTek)]
In the adaptive loop filter (ALF) of VTM3.0, 7x7 diamond filters with 4x4 block-based classification are
used for luma, and a 5x5 diamond filter without classification is used for chroma, which induces seven
luma sample line buffers and four chroma sample line buffers in ALF implementation at the decoder. In
order to reduce the line buffer requirement, ALF with virtual boundary (VB) processing is proposed:
when one sample located at one side of a VB is filtered, accessing samples located at the other side of the
VB is forbidden. The originally required samples at the other side of the VB are replaced with padded
samples. To accommodate deblocking filter (DF) and sample adaptive offset (SAO) in VTM3.0, the VBs
are set as four luma lines and two chroma lines above CTU row boundaries. It is reported that ALF with
JVET-M0301 Non-CE: Loop filter line buffer reduction [A. M. Kotra, S. Esenlik, B. Wang,
H. Gao, J. Chen (Huawei)]
The current contribution proposes a mechanism of reducing the line buffer requirement of ALF (adaptive
loop filter). The contribution uses the concept of virtual boundaries (VBs), which are horizontal CTU
boundaries shifted upward by “N” samples. Modified ALF block classification and modified ALF
filtering are applied for the samples which are near the virtual boundary to reduce the number of line
buffers required. Modified ALF block classification only uses the samples which are above the VB to
classify the given 4 x 4 block which is above VB. Similarly for the classification of the 4 x 4 block below
VB, samples belonging to the lines below the VB are used. Modified ALF filtering uses a combination of
conditional disabling and truncated versions of the original ALF filter.
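The VB behaviour can be illustrated with a simplified one-dimensional sketch over row indices. The actual modified classification and truncated filters in the proposal are more involved; this only shows the core rule that samples on the far side of the VB are never read.

```python
def clip_row_at_vb(r, cur_row, vb_row):
    """Return the row actually read when the filter wants row r while
    filtering cur_row, given a virtual boundary above row vb_row.
    Rows on the far side of the VB are replaced by the nearest row on
    the near side (padding). Simplified sketch of the VB concept."""
    if cur_row < vb_row <= r:      # current sample above VB, r below it
        return vb_row - 1
    if r < vb_row <= cur_row:      # current sample below VB, r above it
        return vb_row
    return r
```

Because no row beyond the VB is ever accessed, the decoder does not need to keep those rows in a line buffer while waiting for deblocking and SAO of the next CTU row to complete.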
JVET-M0553 Crosscheck of JVET-M0301 (Non-CE: Loop filter line buffer reduction) [C.-
M. Tsai (MediaTek)] [late]
JVET-M0428 Encoder optimization with deblocking filter [N. Hu, V. Seregin, W.-J. Chien,
M. Karczewicz (Qualcomm)]
Deblocking filter is included in VTM-3.0 to apply to reconstructed pixels in order to reduce the blocking
artefacts between blocks. However, the encoder of VTM-3.0 doesn’t apply the deblocking filter in rate
distortion optimization (RDO). In this contribution, to enhance the coding performance, deblocking filter
is applied during RDO, such that distortion is calculated between filtered reconstructed pixels and original
ones. Test results reportedly show 0.58%, 0.71% and 0.66% luma gain with similar encoding and
decoding time, in AI, RA and LDB configuration respectively over VTM-3.0 anchor.
Decision (SW): Adopt JVET-M0428, not for CTC.
Some CE might use this for additional comparisons exercising normative vs. non-normative
optimizations.
However, when VVC performance is compared e.g. against HEVC, such tricks should not be used, or the
HM should use such an option as well.
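The encoder change in JVET-M0428 amounts to measuring distortion after in-loop filtering inside the RD loop. A minimal sketch follows; the SSD cost model and list-of-samples representation are illustrative, and only the encoder's decision changes (nothing normative).

```python
def rd_cost(orig, recon, rate, lam, deblock=None):
    """Rate-distortion cost with an optional in-loop filter applied
    before measuring distortion: distortion is computed between the
    (deblock-)filtered reconstruction and the original samples."""
    if deblock is not None:
        recon = deblock(recon)
    dist = sum((o - r) ** 2 for o, r in zip(orig, recon))  # SSD
    return dist + lam * rate
```

A candidate whose blocking artefacts are removed by the filter is no longer penalized for them, which is the source of the reported ~0.6% luma gain.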
JVET-M0429 Coding tree block based adaptive loop filter [N. Hu, V. Seregin, H. Egilmez,
M. Karczewicz (Qualcomm)]
JVET-M0243 Cross-check of JVET-M0429 (Coding tree block based adaptive loop filter)
[S.-C. Lim, J. Kang, H. Lee, J. Lee (ETRI)] [late]
In this proposal, the coding tree block (CTB) based ALF scheme proposed in JVET-K0382 and JVET-
L0391 was tested in VTM-3.0. Test results reportedly show 0.16%, 0.51% and 0.67% luma gain with
similar encoding and decoding time, in AI, RA and LDB configuration respectively over VTM-3.0
anchor.
Each CTU can select one out of 22 filter sets (16 fixed, 5 temporal, 1 picture optimized). The 16 fixed
filters were optimized from HM-encoded content over a range of qualities as in the common test conditions.
As it was said before (see comments under JVET-M0163), the usage of temporal filter sets might be
undesirable, as it introduces further dependency between pictures.
(It is noted that depending on possible alternatives how the filter parameters would be signalled in HLS,
the above comment might need to be revised.)
It is also asked how the performance would be in non-CTC QP ranges, as the fixed sets were optimized
specifically for the CTC QPs.
Compared to previous proposals on this issue, the gain is similar, but the encoding complexity is not
increased.
Study in CE, also considering the questions above, performance without temporal filters, lower QP range.
It is pointed out that the filtering should also be disabled for CIIP and IBC, as this would have the same
latency problem as intra prediction. Whereas it was said at the last meeting that another proposal (bilateral
filter) had a similar latency problem for inter and only 0.4% gain for RA, it is interesting to note that the
post-filtering gain still seems to be preserved in VTM3. It is also verbally reported that for the bilateral
filter a new reduced complexity exists (at last meeting, BF was more complex than Hadamard domain
filter). The proponents of BF announce that they intend to submit a late contribution on these changes.
It is discussed what the relation with other approaches, particularly the diffusion filter, would be. Each of
those would add another stage in the pipeline of prediction, residual generation by inverse transform, etc.,
where one expert argues that this pipeline should not be extended by too many steps. Complexity-wise,
the diffusion filter is simpler in terms of the number of operations per sample (however, it has one
multiplication), and a decision on that is still to be made.
During the discussion, it is questioned whether it would be better to enable switching at block level.
Further study in CE for Test 3 with constraint on CIIP and IBC, also version with block-level flag. It
should also be tested how the performance is when it is outside of the loop, i.e. not used for predicting
intra blocks (in which it could be applied somewhere before deblocking).
JVET-M0222 Context Reduction for CABAC in VVC [Y.-H. Chao, A. Said, V. Seregin,
J. Dong, M. Karczewicz (Qualcomm)]
Discussed Thu 8pm. Chaired by FJB
No planned CE on context reduction.
Encouraged to resubmit in future meetings when cleaning up spec.
JVET-M0904 BoG report on neural networks for video coding [Y. Li, S. Liu]
This BoG report was presented in Track B on Thursday 17 January 1500-1815 (JRO).
This contribution provides the report of the BoG on neural networks (NN) for video coding, especially for
neural network-based loop filtering. An information report (JVET-M0691) was discussed in this BoG,
and then a CE plan.
The BoG recommended the following plan of a CE:
Divide the 6 NN based loop filter methods into two categories and build two sub CE tests.
JVET-M0159 AHG9: Convolutional neural network loop filter [Y.-L. Hsiao, C.-Y. Chen,
T.-D. Chuang, C.-W. Hsu, Y.-W. Huang, S.-M. Lei (MediaTek)]
This document presents two modifications of convolution neural network loop filter (CNNLF) introduced
in JVET-K0222. The first modification is to reduce two 4-layer networks separate for luma and chroma to
only one 3-layer network shared by luma and chroma. The second modification is to conditionally signal
the CNNLF parameters in the I-slice header. Compared with VTM3.0, the proposed CNNLF reportedly
achieves -1.23% (Y), -10.11% (Cb), and -9.96% (Cr) BD-rates with 42% decoding time increase in
random access (RA) condition without using any GPUs. After shifting coding gain from chroma to luma
by increasing chroma quantization parameter (QP) offsets, the BD-rates are -2.47% (Y), +2.90% (Cb),
and +3.01% (Cr). It is shown that Class C (small resolution, 832x480) has no coding gain because of the
relatively “expensive” side information bits of CNNLF parameters while Class A (large resolution,
3840x2160) has higher coding gains (-3.7% luma BD-rate and +3% chroma BD-rate). Further research on
CNNLF to reduce complexity and enhance training for improving coding efficiency is suggested.
It is assumed that the CNN parameters are offline trained per RA period.
Results with chroma QP offset are non-CTC; it is difficult to draw conclusions from them.
It is reported that the gain becomes lower for low resolution sequences such as class C, due to the higher
relative amount of network parameters (which is about 10 kbit/s).
Software would be available.
JVET-M0351 Convolutional Neural Network Filter (CNNF) for Intra Frame [C. Lin,
J. Yao, L. Wang (Hikvision)] [late]
Initial upload rejected as a placeholder.
This contribution provides a convolutional neural network filter (CNNF) for intra frames. In the current
VTM, multiple filters, i.e., deblocking filter (DF) and sample adaptive offset (SAO) are used to remove
artefacts or improve performance. CNNF is motivated by the latest advances in deep learning and is
proposed as a single type of filter to replace multiple filters in intra frames. Simulation results report
-4.94%, -7.07%, and -8.17% BD-rate savings for the luma and two chroma components for VTM-3.0 with
the AI configuration.
Same method was proposed in JVET-I0022 (by that time run on top of JEM). Similar gain.
Software was already released by that time.
JVET-M0508 AHG9: Test Results of Dense Residual Convolutional Neural Network based
In-Loop Filter [Y. Wang, Z. Chen, Y. Li (Wuhan Univ.), L. Zhao, S. Liu,
X. Li (Tencent)]
This contribution reports the test results for dense residual convolutional neural network based in-loop
filter (DRNLF) JVET-L0242 according to the methodology in JVET-L1006. The proposed DRNLF is
implemented on VTM 3.0. Simulation results report -2.17%, -1.47%, and -1.48% BD-rate savings for the
luma and two chroma components compared with VTM 3.0 under the AI configuration on a CPU-only
platform, and
-2.15%, -3.04%, -1.96% for RA configuration, and -2.06%, -3.73%, -2.86% for LDB configuration.
Operated between deblocking and SAO.
Different networks were trained specifically for the different QP values of CTC.
It is noted that it would be interesting to investigate how large the loss would be when used for another
QP value.
Gain over VTM3 is slightly lower than it was for VTM2.
JVET-M0510 AHG9: CNN-based in-loop filter proposed by USTC [Y. Dai, D. Liu, Y. Li,
F. Wu (USTC)] [late]
This contribution presents the simulation results of an efficient network for loop filtering. To reduce storage
space and complexity, two lightweight deep convolutional neural networks are built by reducing the number
of network parameters. Simulation results report -0.96%, -0.32%, and -0.45% BD-rate savings for the Y, Cb, and Cr
components compared with VTM3.0 under AI configuration, -0.61%, -0.25%, -0.26% for RA
configuration, and -0.76%, -0.56%, -0.69% for LDB configuration.
Operated between deblocking and SAO.
For AI: decoding time is 21x/13x for the two versions used; for LB: 3-4x. Encoding time increases by
approximately 25%.
Can be enabled/disabled at CTU level.
Question: Does this produce visual artefacts?
JVET-M0566 Adaptive convolutional neural network loop filter [H. Yin, R. Yang, X. Fang,
S. Ma, Y. Yu (Intel)] [late]
This document proposes an adaptive convolution neural network loop filter (ACNNLF) design. In this
design, 3 CNN based loop filters are adaptively trained for luma and chroma respectively from the current
video sequence. Each filter is a small 2-layer CNN with 692 parameters in total. The encoder selects one of
the three ACNNLFs for luma and chroma respectively for each CTB block during encoding. The weights
of the trained set of ACNNLFs are signalled in the slice header of I frames and the index of selected
ACNNLF is signalled for each CTB.
Compared with VTM-3.0-RA, the proposed ACNNLF achieves -2.37%, -1.34%, and -2.77% BD-rates for
Y, U, and V, respectively, for Class A1 video sequences; -0.45%, -10.92%, and -6.19% BD-rates for Y,
U, and V, respectively, for Class A2 video sequences; -0.49%, -11.29%, and -10.73% BD-rates for Y, U,
and V, respectively, for Class B video sequences; and 0.12%, -3.31%, and -1.62% BD-rates for Y, U, and
V, respectively, for Class C video sequences.
Operated after ALF (as the last loop filter).
Two different CNN for luma and chroma. At CTU level, it can be decided which out of three filters to use
(or no filter). An additional weighting is signalled at slice level, which weights the residual that is
generated by the network and superimposed.
The gain comes mainly from a few sequences (Campfire has the biggest gain).
JVET-M0691 AHG9: Complexity analysis about neural network video coding tools [Y. Li,
Z. Chen (Wuhan Univ.), S. Liu (Tencent)] [late]
Reviewed in BoG JVET-M0904.
The BoG also met 15 Jan from 1800 to 1930, with the notes reflected in the -v3 version of this document.
This was discussed in Track A Wednesday 16 January 1400-1445.
At the 16 January meeting, the BoG also made the following further recommendation, to which Track A
agreed:
Decision: Adopt an adaptation parameter set (APS) to carry ALF parameters. The tile group header
contains an aps_id which is conditionally present when ALF is enabled. The APS contains an aps_id
and the ALF parameters. A new NUT value is assigned for APS (from JVET-M0132). For the CTC,
we will just use aps_id = 0 and send the APS with each picture. For now, the range of APS ID values
will be 0..31 and APSs can be shared across pictures (and can be different in different tile groups
within a picture). The ID value should be fixed-length coded when present. ID values cannot be re-
used with different content within the same picture.
Further study was encouraged for making use of shared values across pictures and different APSs in the
same picture. Further study was also encouraged on whether to relax the constraint on re-using ID values.
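The adopted mechanism can be sketched as a table of parameter sets keyed by aps_id. Names and structure here are illustrative, not the draft-text syntax; the sketch only models the ID range, the conditional reference from the tile group header, and the rule that an ID may not carry different content within one picture.

```python
class ApsPool:
    """Sketch of the adopted APS mechanism for carrying ALF parameters."""
    MAX_APS_ID = 31                      # agreed range 0..31

    def __init__(self):
        self.store = {}                  # aps_id -> ALF parameters
        self.sent_in_picture = {}        # ids seen in the current picture

    def new_picture(self):
        self.sent_in_picture = {}

    def send_aps(self, aps_id, alf_params):
        assert 0 <= aps_id <= self.MAX_APS_ID
        prev = self.sent_in_picture.get(aps_id)
        # an id may not be re-used with different content in one picture
        assert prev is None or prev == alf_params
        self.sent_in_picture[aps_id] = alf_params
        self.store[aps_id] = alf_params

    def tile_group_alf_params(self, alf_enabled, aps_id=None):
        """aps_id is conditionally present: only when ALF is enabled."""
        if not alf_enabled:
            return None
        return self.store[aps_id]
```

Under the CTC, this degenerates to always sending an APS with aps_id = 0 with each picture, as noted in the decision.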
JVET-M0120 Proposed NAL Unit Header Design Principles [S. Wenger, B. Choi, S. Liu
(Tencent)]
JVET-M0131 AHG17: On NAL unit types for IRAP pictures and leading pictures [Y.-K.
Wang, Hendry (Huawei)]
JVET-M0152 AHG17: On random access point for VVC [B. Choi, S. Wenger, S. Liu
(Tencent)]
JVET-M0153 AHG17: On leading picture for VVC [B. Choi, S. Wenger, S. Liu (Tencent)]
JVET-M0156 AHG17: On component type indication for VVC [B. Choi, S. Wenger, S. Liu
(Tencent)] [late]
JVET-M0157 AHG17: On picture order count for VVC [B. Choi, S. Wenger, S. Liu
(Tencent)] [late]
JVET-M0161 AHG17: Signalling random access properties in the NAL unit header
[L. Chen, C.-W. Hsu, Y.-W. Huang, S.-M. Lei (MediaTek)]
JVET-M0520 AHG17: On NAL unit header design for VVC [S. Wenger, B. Choi, S. Liu
(Tencent)]
JVET-M0537 AHG17: On tile group signalling in NAL unit header and as non-VCL NAL
unit [E. Thomas, A. Gabriel (TNO)] [late]
JVET-M0128 AHG17: On reference picture management for VVC [Y.-K. Wang, Hendry
(Huawei), S. Deshpande (Sharp), M. M. Hannuksela (Nokia), G. Ryu, W. Choi
(Samsung), X. Wang, Y.-W. Chen (Kwai), L. Zhang (Bytedance), P. Wu,
M. Li (ZTE), S.-H. Kim (LG), J. Boyce (Intel), A. M. Tourapis, D. Singer
(Apple), F. Edouard, P. Andrivon (Technicolor), Y.-W. Huang, C.-W. Hsu,
C.-Y. Chen, T.-D. Chuang, L. Chen (MediaTek), K. Kawamura (KDDI), Y.-
C. Sun, J. Lou (Alibaba)]
After BoG discussion, this was discussed Sunday 13 January 1700 (GJS).
This contribution proposes a reference picture management approach for VVC based on direct signalling
and derivation of reference picture lists 0 and 1, without use of reference picture set (RPS) as in HEVC or
the sliding window plus memory management control operation (MMCO) process as in AVC.
It is asserted that the proposed approach is significantly simpler compared to the approaches in HEVC
and AVC.
The proposed direct-RPL-based reference picture management approach is summarized as follows:
Two reference picture lists, list 0 and list 1, are directly signalled and derived. They are not based on RPS
as in HEVC or the sliding window plus MMCO process as in AVC.
Reference picture marking is directly based on reference picture lists 0 and 1, utilizing both active and
inactive entries in the reference picture lists, while only active entries may be used as reference indices in
inter prediction of CTUs.
Information for derivation of the two reference picture lists is signalled by syntax elements and syntax
structures in the SPS, the PPS, and the slice header. Predefined RPL structures are signalled in the SPS,
for use by referencing in the slice header.
The two reference picture lists are generated for all types of slices, i.e., B, P, and I slices.
The two reference picture lists are constructed without using a reference picture list initialization process
or a reference picture list modification process.
Long-term reference pictures (LTRPs) are identified by POC LSBs. When needed, additional POC LSBs
are signalled for LTRPs, determined on a picture-by-picture basis.
In the discussion, a problem was identified with the POC MSB handling when considering random access
that is not at the start of the first CVS in the “original” bitstream. Instead of signalling MSBs directly, it
was suggested to use the scheme of HEVC with delta_poc_msb_present_flag[ i ] and
delta_poc_msb_cycle_lt[ i ] (and remove the proposed additional MSB length syntax in the PPS).
In the discussion, it was noted that there should be a POC wrap-around prevention constraint for short-
term pictures (see HEVC C.4 item 8). It was agreed that this is needed.
Decision: Adopt with modifications as described above.
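The HEVC-style POC MSB derivation that the discussion suggested reusing (cf. HEVC clause 8.3.1) can be sketched as follows; an 8-bit POC LSB is assumed for illustration.

```python
def derive_poc(poc_lsb, prev_poc_lsb, prev_poc_msb, log2_max_poc_lsb=8):
    """HEVC-style picture order count derivation: reconstruct the full
    POC from its signalled LSBs and the previous reference picture's
    LSB/MSB, detecting LSB wrap-around in either direction."""
    max_lsb = 1 << log2_max_poc_lsb
    if poc_lsb < prev_poc_lsb and prev_poc_lsb - poc_lsb >= max_lsb // 2:
        msb = prev_poc_msb + max_lsb     # LSB wrapped around upward
    elif poc_lsb > prev_poc_lsb and poc_lsb - prev_poc_lsb > max_lsb // 2:
        msb = prev_poc_msb - max_lsb     # apparent backward wrap
    else:
        msb = prev_poc_msb
    return msb + poc_lsb
```

The wrap-around prevention constraint agreed above ensures that short-term pictures in the lists stay within half the LSB range, so this derivation remains unambiguous.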
JVET-M0132 AHG17: On header parameter set (HPS) [Y.-K. Wang, Hendry, J. Chen
(Huawei)]
JVET-M0260 AHG17: Carriage of tile group header parameters in higher level structures
[M. M. Hannuksela (Nokia)]
The following tile group picture was drawn for illustration and discussion purposes regarding the above
constraints (an accidental homage to Piet Mondrian).
The BoG recommended to adopt the software implementation in JVET-M0445 into the VTM.
Decision (SW): Adopted.
The software implementation in JVET-M0445 implements encoder motion constraints for MCTSs.
Add loop_filter_across_tile_group_enabled_flag to the PPS, with syntax as proposed in JVET-
M0160, and semantics as follows:
loop_filter_across_tile_groups_enabled_flag equal to 1 specifies that in-loop filtering operations
may be performed across tile group boundaries in pictures referring to the PPS.
loop_filter_across_tile_group_enabled_flag equal to 0 specifies that in-loop filtering operations are
not performed across tile group boundaries in pictures referring to the PPS. The in-loop filtering
JVET-M0527 raised some concerns related to hardware implementation. It was commented that, in
addition to the issues mentioned in JVET-M0527, this design also causes issues with line buffering, as
well as with VPDU rate.
It was commented that a benefit of the design is that it enables turning off loop filtering across CMP
faces. It was suggested that enabling loop filtering to be turned off within a tile would be a lighter
solution than this design. It was commented that there is an ongoing CE on this.
Discussion resumed 1920 Monday 14 January (GJS).
It was thus suggested that these two topics should be discussed and decided together.
The latest HEVC text specifies, as part of the semantics of an SEI message, the MCTS sub-bitstream
extraction process and requires the extracted sub-bitstream to be a conforming bitstream. For VVC,
should we do more than this way of defining conformance for MCTSs? For example, should a
normative decoding process be specified for decoding of MCTSs instead of just specifying SEI
messages for indication of the encoder motion constraints? There were several related questions: for
example, whether MCTS sequences are signalled in parameter sets or whether an MCTS sequence
would have its own associated parameter sets; whether to provide a level definition at the MCTS
sequence level; and whether to define a normative decoder interface and conforming decoding behaviour
for MCTS-specific decoding. Further study is highly encouraged.
On whether to allow treating boundaries of MCTS / sub-picture sequences as picture boundaries (i.e.,
to do padding).
Allowing this is expected to help in coding efficiency. However, it would impose some burden for
hardware implementations. Tests showing coding efficiency gain numbers and an analysis on hardware
implementation complexity are needed for making a decision on this. Note that JVET-M0445 provides a
software implementation that could be used as the basis and anchor for such coding efficiency
comparison.
Interested parties were requested to work offline to prepare the test conditions. This effort was
coordinated by M. Coban; the proposed test conditions were included in JVET-M0870.
It was thus suggested that JVET-M0870 be reviewed.
Whether an interface for inputting/outputting reference pictures to/from the decoder could be
acceptable in decoder implementation architectures.
This relates to allowing empty tiles and applying geometry padding to fill in the empty tiles in an external
process as proposed in JVET-M0547. The contribution was also discussed in the 360° video BoG (see that
BoG report).
It was commented that there could be other ways to indicate uncoded areas in the picture than using
empty tiles.
6.19.3.2 Tiling allowing tile size unit less than CTU size (5)
JVET-M0875 Request for flexible unit size tile with implementation friendly restriction
[T. Ikai, Y. Yasugi (Sharp), G. Bang (ETRI), Y.-W. Chen, X. Wang (Kwai
Inc.), M. Coban (Qualcomm), C.-C. Lin (ITRI), P.-H. Lin (Foxconn),
A. Ichigaya (NHK), K. Kawamura (KDDI), K. Kazui (Fujitsu), R. Sjöberg
(Ericsson), R. Skupin, K. Sühring, Y. Sanchez, T. Schierl (HHI), L. Zhang
(Bytedance)] [late]
This contribution proposes a flexible tile size unit with certain restrictions, allowing tiling with a tile size unit
that is not a multiple of the CTU size. In the tile BoG, the implementation difficulty for tile units smaller than
the CTU size and loop filter control for 360° video had been discussed. In this contribution, it is asserted that
CTU alignment is too restrictive to enable important tile use cases. It is also argued that the proposed specific
restrictions alleviate the concerns about implementation cost increase.
In JVET-M0066, the main part of this proposal had reportedly been implemented, tested, and cross-
checked. The software and working draft had been uploaded on 2018-12-28 at 05:28:56, and this was
announced on the JVET main reflector on 2018-12-29.
The contribution reported the following points as the main difficulties for hardware implementation which
could affect cost:
1) Throughput capability, i.e., partial-size CTUs could need the same processing time as full-size
CTUs in pipeline processing.
2) Memory bandwidth / compression, i.e. temporal motion vector or other information needs to be
transferred between internal memory and external memory via burst transfer.
3) Address generator, i.e. address generator needs more flexible calculation which depends on tile
position.
The contribution proposed that the minimum unit granularity (TileUnitSizeY) be required to be 32 or
larger.
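A minimal sketch of the proposed constraint (the function name and the representation of tile boundaries as luma-sample positions are assumptions for illustration):

```python
def tile_boundaries_valid(boundary_positions, tile_unit_size_y=32):
    # Illustrative check of the proposed restriction (not the proposed
    # spec text): the tile unit granularity TileUnitSizeY must be 32
    # luma samples or larger, and every tile boundary position must be
    # aligned to that granularity.
    if tile_unit_size_y < 32:
        return False
    return all(pos % tile_unit_size_y == 0 for pos in boundary_positions)
```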
A participant commented that if the 32x32 restriction is imposed, the desired functionality can be
obtained with no special support in the standard, at a small loss of coding efficiency – by using 32x32 as
the CTU size.
It was also commented that the 32x32 granularity is still restrictive and may not align with natural content
boundaries.
The proponent showed some test results that indicated that the penalty of using 32x32 CTUs was large.
Other participants questioned whether that penalty could really be that large. The data had not been cross-
checked.
A participant commented that pipelining implementations require fetching and processing data in large
chunks with a regular structure, which this proposal would interfere with.
There were strong concerns about this expressed by some participants. It was said that the 32x32 CTU
case is not as difficult because these come in strings.
Further study was encouraged. It was planned to ask AHG13 to measure the effect of CTU size on coding
efficiency.
JVET-M0160 AHG17: Flexible tile grouping for VVC [L. Chen, T.-D. Chuang, Y.-W.
Huang, S.-M. Lei (MediaTek)]
JVET-M0134 AHG12: On explicit signalling of tile IDs [Hendry, Y.-K. Wang, J. Chen,
M. Sychev (Huawei)]
JVET-M0155 AHG12: On tile group identification for VVC [B. Choi, S. Wenger, S. Liu
(Tencent)] [late]
JVET-M0430 AHG12: On Tiles and Tile Groups for VVC [R. Skupin, K. Sühring,
Y. Sanchez, T. Schierl (HHI)]
JVET-M0536 AHG12: On picture-level tiles and sequence-level tiles for VVC [E. Thomas,
A. Gabriel (TNO)] [late]
JVET-M0870 AHG12: Proposed JVET common test conditions and evaluation procedures
for MCTS and sub-pictures with boundary padding [M. Coban (Qualcomm),
R. Skupin (HHI)] [late]
Discussed Monday 1940 (GJS).
This document proposes common test conditions (CTC), conversion practices, and software reference
configurations to be used in evaluation of MCTS and sub-picture coding schemes.
This is for coding efficiency testing. The suggested method is to encode and decode MCTSs separately
and to compute and subtract the duplicated header overhead data quantity to measure results. Software
decoder runtimes might not be properly measured that way, since that is not likely to match how the
feature would be implemented. The tested use case is a cubemap projection 360° video source.
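The accounting idea can be sketched as follows (the function and parameter names are assumptions; JVET-M0870 defines the actual procedure):

```python
def effective_mcts_bits(per_mcts_bits, duplicated_header_bits):
    # Illustrative only: sum the sizes of the separately encoded MCTS
    # bitstreams and subtract the header data duplicated across them,
    # so the total is comparable against a single-bitstream anchor.
    return sum(per_mcts_bits) - duplicated_header_bits
```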
In Track A, it was suggested to make this a CE, since it was a plan for one specific test of a particular
technology (see CE12).
JVET-M0136 AHG12: Treating tile and tile group boundaries as picture boundaries
[J. Chen, Y.-K. Wang, Hendry, M. Sychev (Huawei)]
It was commented that it may be undesirable to depend on requiring decoders to be able to use parallel
processing, since some (e.g., software) decoders may not be able to do that.
It was agreed to consider these approaches in CE3.
JVET-M0169 CE3-related: Shared reference samples for multiple chroma intra CBs [Z.-Y.
Lin, T.-D. Chuang, C.-Y. Chen, Y.-W. Huang, S.-M. Lei (MediaTek)]
JVET-M0248 AHG16: Motion compensation with padded samples for small coding units
[H. Liu, J. Chon, H.-C. Chuang, L. Zhang, K. Zhang, J. Xu (Bytedance)]
JVET-M0814 Non-CE3: block size restriction on PDPC [L. Li, J. Heo, J. Choi, S. Yoo,
J. Choi, J. Lim, S. Kim (LGE)] [late]
JVET-M0511 Bug fix for rate control under all-intra [Y. Li, D. Liu, Z. Chen (USTC)] [late]
Presented Thu 8:30pm. Chaired by FJB
JVET-M0600 AHG10: Quality dependency factor based rate control for VVC [Z. Liu,
Z. Chen, Y. Li (Wuhan Univ.), Y. Wu, S. Liu (Tencent)] [late]
Presented Thursday 17 January 2030 (chaired by F. Bossen).
This contribution presents some improvements based on the current rate control scheme proposed in
JVET-K0390. With the proposed quality dependency factor based bit allocation algorithm, when using
the anchor bit rate of VTM 3.0 as the target, there are 0.34%/3.45%/3.02% for Y/U/V coding efficiency
improvements in random access (RA) configuration when compared with the rate control algorithm in
JVET-K0390.
Decision (SW): Adopt.
              All Intra Main 10 - Over VTM-3.0          Random Access Main 10 - Over VTM-3.0
Test #    Y        U        V        EncT   DecT        Y        U        V        EncT   DecT
1.1.1     -0.59%   -0.44%   -0.47%   112%   103%        -0.29%   -0.31%   -0.15%   102%   103%
Engine                           AI       RA       LB       LP       AI EncT/DecT   RA EncT/DecT   LB EncT/DecT
5.1.13* + new init from 5.1.2    -0.98%   -0.96%   -0.75%   -0.73%   110/105        104/103        106/103
                    All Intra (AI)                          Random Access (RA)                      Low Delay B (LB)
Test # / Doc. #     Y        U        V        EncT  DecT   Y        U       V       EncT  DecT    Y       U       V       EncT  DecT
CE6-1.1a/1.6a       -0.17%   -0.14%   -0.11%   101%  99%    -0.06%   0.05%   0.04%   101%  99%     0.02%   0.30%   0.13%   100%  98%
This has very small benefit, but no real impact on complexity. From a spec perspective, it is just a
matter of the values of numbers in tables. It increases the number of types of transforms being used in
the design and it was commented that having something different just for 4x4 seems conceptually
inconsistent. It was noted that DST4 is part of DCT2. It was also noted that the impact on LB is
negative. Some participants commented that the gain is too small to justify a change of the spec. The
proponent commented that some gain was shown for LDB with inter-MTS on (0.07% with low-delay
B). No gain was shown for low QP. It was agreed in the plenary not to adopt this change.
CE6-2.3a DST-7/DCT-8 with dual implementation support (no coding efficiency impact)
In the AI configuration, this provides a reported 9% speed-up for the decoder, 4% for the encoder (as
tested).
CE6-3.1b Block shape adaptive transform selection, but with an extra high-level flag to use DCT2
always (benefit relative to an anchor that is not using MTS: AI 1.61%, RA 0.71%, LB 0.14%)
It was asked how much benefit this has relative to an encoder that, for rectangular blocks, chooses a
fixed combination using the MTS syntax. The particular transform combination that this proposal
uses for rectangular blocks has a DCT2 in one direction and a non-DCT2 transform in the other
direction, which is not a combination supported in the MTS syntax, so that combination could not be
selected in the suggested alternative low-complexity approach. It was commented that this may call
into question the way MTS has been designed. Further study of these issues was encouraged.
                         All Intra (AI)                         Random Access (RA)
Test #      Doc. #       Y        U        V       EncT  DecT   Y        U        V        EncT  DecT
CE6-4.1a    JVET-M0140   -0.47%   -0.16%   0.00%   108%  101%   -0.83%   -0.98%   -0.06%   113%  102%
CE7.4: In transform coefficient coding, the greater than 2 flag is moved to the first coding pass after
the parity bit and the number of scans is reduced from 3 to 2 (very small coding efficiency
improvement)
In-loop “reshaping” seemed likely to be adopted.
Track B:
[Clean up the relationship between this section and the related notes elsewhere, avoiding duplication and
adding cross-references]
As a general remark, it was established in Track B that “further study” means that the technology should be
studied in the next CE on the subject area, whereas if such a remark is missing, it implicitly means that it shall
not be studied in a CE. If further study in an AHG is expected, that would be explicitly expressed.
Furthermore, the issue was raised that many of (or most of) the CE proposals had come without
specification text. It was agreed that in future CEs, the text should be available by the time of the
document deadline. Furthermore, CE contribution documents should be complete and not make it
necessary to open old documents to understand the technology.
JVET-M0464 Unified Transform Type Signalling and Residual Coding for Transform Skip
Signals TS before MTS (and therefore also enables it for blocks up to 32x32), but also changes the MTS
binarization
CTC: 0.01%, 103%, 101% (AI), -0.02%, 98%, 100% (RA), -0.07%, 102%, 102% (LB)
Class F: -1.96%, 103%, 97% (AI), -2.14%, 98%, 100% (RA), -2.79%, 103%, 99% (LB)
TGM: -8.04%, 106%, 88% (AI), -8.74%, 106%, 96% (RA), -9.21%, 112%, 97% (LB)
Additionally changing the residual coding for transform skip (first modification and second modification)
results on average in (BD-rate Y, enc. time, dec.time):
CTC: -0.16%, 103%, 100% (AI), -0.07%, 98%, 101% (RA), -0.07%, 101%, 101% (LB)
Class F: -7.16%, 104%, 94% (AI), -5.76%, 98%, 100% (RA), -5.85%, 102%, 99% (LB)
TGM: -21.03%, 108%, 82% (AI), -15.57%, 105%, 94% (RA), -14.01%, 111%, 95% (LB)
CE10.2: OBMC
Among the four proposals, version 10.2.1 seems to be the only one which is manageable from a complexity
perspective (but it definitely adds some complexity). The test results are summarized as follows:
CE11: Deblocking
Sub-tests
1) long-tap deblocking filters (11.1)
2) deblocking at 4x4 block boundaries (11.2).
Viewing in CE11 to assess necessity and benefit still to be done (being prepared).
M46578 On the decoding interface for immersive media [Emmanuel Thomas (TNO), Rob
Koenen (Tiledmedia), Thomas Stockhammer (Qualcomm)]
This has been called “Immersive media access and delivery” in some MPEG work, and there was an
MPEG output document N 18071 at the October 2018 MPEG meeting.
This is for a scenario with additional processing that takes place after decoding; not a 1:1 mapping
between output of decoder and display of the decoded video by the decoding system.
Example: 360° video tiled streaming using a cubemap, with each face of the cubemap segmented further
into tiles, and using viewport-dependent streaming to serve only the tiles needed for viewing. Problems
mentioned:
Some systems having limits on the number of decoder instantiations.
Need for systems coordination of timing.
An example approach is rewriting the bitstream to produce a “packed picture” that is decoded.
Track A:
Reference picture management in high-level syntax per JVET-M0128 (modified as noted)
Small bug fixes from JVET-M0265
o Fix the software to match the WD to remove clipping of luma MVs before deriving chroma
MVs.
o Adopt rounding away from zero for MV averages to remove inconsistency for similar
averages: Offset = 1 << (F - 1); M = (S >= 0) ? (S + Offset) >> F : -((-S + Offset) >> F).
Flexible rectangular tile groups per JVET-M0853-v2 (constrained as noted), with software in JVET-
M0445 and with loop_filter_across_tile_groups_enabled_flag from JVET-M0160 (confirmed in plenary)
High-level syntax actions noted in discussion of JVET-M0816 BoG report.
Simplification of division operation used in CCLM modelling from JVET-M0064.
Simplifying PDPC linear interpolation to use nearest neighbour on secondary boundary for adjacent
angular modes
Bug fix in spec text related to CBF signalling identified in JVET-M0361.
Reduce complexity of 32-length DST-7/DCT-8 using zero-out approach of JVET-M0297 Test 2.
Enable transform skip up to 32x32 block size, with associated syntax approach in JVET-M0464 using
tu_mts_idx (substantial gain for Class F / SCC).
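The rounding rule adopted above for MV averages can be transcribed directly (a sketch of the listed formula, with S the value to be shifted and F the shift amount):

```python
def round_away_from_zero(S, F):
    # Direct transcription of the adopted rule:
    # Offset = 1 << (F-1); M = S >= 0 ? (S+Offset) >> F : -((-S+Offset) >> F)
    offset = 1 << (F - 1)
    return (S + offset) >> F if S >= 0 else -((-S + offset) >> F)
```

For example, with F = 1, both S = 3 and S = -3 (values of 1.5 and -1.5 before rounding) map symmetrically to 2 and -2, which removes the inconsistency for similar averages.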
The reverse coding order part of CE3-1.1.1 intra sub-partitions coding mode and text was discussed in the
plenary. It was reported that there was only a 0.04% penalty for not doing the reverse coding order, so it
was agreed to adopt the proposed scheme without that aspect; text was made available in a revision of
JVET-M0102. Further study was suggested for limiting the sub-partition width to be greater than or equal
to 4.
Track A action item: Avoiding 32-point DST (with 64-length DCT2 on the other side) in CE6-4.1a
[0.01% penalty reported in plenary Thu and avoiding all DST combined with 64-length DCT2 has 0.02%
penalty, so no DST (including size 32, 16, 8 and 4) combined with 64-length DCT2]
Track A action item: CE12-2 in-loop remapping function (adoption action likely) – experiment results
were discussed in a plenary Thu 17th; there was no significant penalty for the additional restrictions.
The encoder algorithm was discussed. It was described in JVET-M0427.
At a previous meeting, a curve-crossing problem had been observed and it had been suggested to do
something about low-QP operation. There was said to be a very large amount of code in the encoder
optimization, with resolution dependency and a smoothness measure and various thresholds and checks.
Some of that code was reportedly related to a different variant (CE12-1) and can be removed. Some of it
was for HDR, which was not measured in this test (but is also in-scope for VVC). It was discussed
whether the code would be difficult to maintain and might have excessive tuning within the code. It was
commented that the complexity had been reduced from previous versions. This has been tested in
multiple rounds of CE and appears to provide significant gain if an adequate encoding method is used.
Track A had also recommended discussing the enabling of CPR in the CTC, at least for Class F.
Track B:
Reviewed all CE related BoGs (CEs 2, 4, 9, 10), CE8 & CE11 related had been reviewed in track
CE11 viewing ready, not reviewed, no conclusion yet
BoG on NN technology not reviewed yet
Various revisits still open pending on availability of more information. Most relevant are on CPR/IBC,
deblocking, diffusion filters
The BoG also met 15 Jan from 1800 to 1930, with the notes reflected in the -v3 version of this document.
This was discussed in Track A Wednesday 16 January 1400-1445.
At the 16 January meeting, the BoG also recommended, and Track A agreed, to the following further
recommendation:
Decision: Adopt an adaptation parameter set (APS) to carry ALF parameters. The tile group header
contains an aps_id which is conditionally present when ALF is enabled. The APS contains an aps_id
and the ALF parameters. A new NUT value is assigned for APS (from M0132). For the CTC, we will
just use aps_id = 0 and send the APS with each picture. For now, the range of APS ID values will be
0..31 and APSs can be shared across pictures (and can be different in different tile groups within a
picture). The ID value should be fixed-length coded when present. ID values cannot be re-used with
different content within the same picture.
Further study was encouraged for making use of shared values across pictures and different APSs in the
same picture. Further study was also encouraged on whether to relax the constraint on re-using ID values.
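The adopted APS ID rules can be modelled with a small sketch (the class and method names are hypothetical, for illustration only):

```python
class ApsBuffer:
    # Toy model of the adopted rules (not normative): APS IDs range over
    # 0..31 and are fixed-length coded; an APS may be shared across
    # pictures, but an ID must not be re-used with different content
    # within the same picture.
    def __init__(self):
        self.stored = {}             # aps_id -> ALF parameters
        self.seen_this_picture = {}  # aps_id -> content seen this picture

    def start_new_picture(self):
        self.seen_this_picture = {}

    def receive_aps(self, aps_id, alf_params):
        assert 0 <= aps_id <= 31, "aps_id is fixed-length coded in 0..31"
        previous = self.seen_this_picture.get(aps_id)
        if previous is not None and previous != alf_params:
            raise ValueError("aps_id re-used with different content "
                             "within the same picture")
        self.seen_this_picture[aps_id] = alf_params
        self.stored[aps_id] = alf_params

    def alf_params_for(self, aps_id):
        # a tile group header referencing aps_id resolves to these params
        return self.stored[aps_id]
```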
See section 6.19.1.1.
JVET-M0843 BoG report on CE4 related inter prediction and motion vector coding
contributions [K. Zhang]
See section 6.4. This report was reviewed in Track B Tue 15 Jan 1215-1330 and 1435-1800.
Five sessions were held, 1540 ~ 2020 on Jan. 11, 0900 ~ 1045 on Jan. 12, 1830 ~ 2000 on Jan. 12,
1945~2300 on Jan. 13, 2130~2230 on Jan. 14 for discussing 47 technical contributions in five categories:
MMVD-related
From Track B discussion: This part should be combined with the Sub-CE on MMVD mode signalling
above, in particular combination with JVET-M0069 should be investigated.
JVET-M0206 CE4-related: MMVD improvements
-0.10% (102%, 103%)/0.01% (100%, 101%). Change the binarization method for
MMVD distance index
JVET-M0267 Non-CE4: Harmonization of MMVD and AMVR
-0.14% (100%, 100%)/0.00% (100%, 100%). Distance based on signalled AMVR
From Track B discussion: It should be clarified how this relates to the adoption of JVET-M0255, and
different combinations tested (e.g. enabling only one, different ways of interpreting them together)
JVET-M0307 CE4-related: Candidates optimization on MMVD
-0.08% (96%, 99%)/-0.02% (99%, 101%). Reduce distance candidates and introduce
distance refinement
JVET-M0308 Non-CE4: MMVD simplification
-0.01% (97%, 98%)/0.08% (100%, 99%). Remove MVD scaling in MMVD
From Track B: Not worthwhile, not in CE.
JVET-M0314 CE4-related: MMVD improving with signalling distance table
-0.20% (103%, 97%)/0.02% (104%, 100%). Signalling the distance table in slice
header
JVET-M0507 has two aspects, where the aspect of removing the clipping for the shared merge list might
also be beneficial on top of JVET-M0170. It is reported to come at no coding loss. The proponents of
M0507 discussed with proponents of M0170 that the check for one of the subblocks being outside of the
picture could be removed, whereas another check, on whether the CU center is still inside the picture,
needed to be added, to make it consistent with other boundary check conditions in VVC. Proponents of M0170 were to
make an update on this aspect.
Related to JVET-M0281
JVET-M0081 Non-CE4: Simplification of AMVP list generation in AMVR
0.01% (101%, 99%)/0.03% (101%, 99%), Remove all intermediate rounding for
AMVR mode, only keep the final rounding
From discussion in Track B: The unification of where the rounding is done was achieved by adoption of
JVET-M0281, no need to change that again.
Related to JVET-M0403
JVET-M0422 CE4-related: Simplified MVD coding
0.03% (100%, 100%)/0.02% (99%, 103%), bypass code coding
abs_mvd_greater1_flag
From discussion in Track B: Saving context coded bins in MV coding does not have high relevance, as it
hardly influences the worst case of CABAC throughput. Getting it at expense of loss is not desirable.
JVET-M0406 CE2/4-related: Unified merge list size for block and sub-block merge modes
0.06% (101%, 101%)/ 0.10% (100%, 103%), unified merge list size = 5; -0.02%
(101%, 101%)/ 0.00% (100%, 103%), merge list size = 6
JVET-M0661 AHG-13: On Merge List Size
From discussion in Track B: Reduction of MMVD candidates gives a loss and is mainly to the benefit of
the encoder. At the last meeting, when MMVD was adopted, different numbers of candidates were
considered and it was finally decided to use two, based on a performance/complexity tradeoff. As the gap
between using one and two candidates may have become smaller, it could be worthwhile to consider this
aspect again on top of VTM4. Test this as part of the MMVD sub-CE, in combination with other
proposals.
Changing the sequence of candidates based on block shape introduces some irregularity in merge list
construction, which is not desirable (same proposal was investigated in earlier CE but not considered).
JVET-M0857 BoG report on CE3-related intra prediction and mode coding [G. Van der
Auwera]
(Track A Tuesday 1430–1630 GJS)
The BoG reviewed related input contributions to Core Experiment 3 on intra prediction and mode coding,
and formulated recommendations for consideration by the track (A).
The CE3-related documents were categorized as follows:
Cross-component prediction (14)
Luma intra mode coding (8)
Chroma intra mode coding (6)
Interpolation of intra reference samples (4)
PDPC-related (7)
Various (7)
The BoG met on 11 Jan. 2019 from 9:00am–1:00pm, from 3pm–7pm, and on 12 Jan. 2019, from 9am–
1:00pm, 3pm–6pm. [Incorporate into section 2]
Open issues from BoG, where additional presentation and discussion was performed in Track B:
JVET-M0268, Non-CE2: Interweaved Prediction for Affine Motion Compensation
Additional results were presented in Track B which observe the VPDU constraint (denoted as test3/test4 in r2).
These show that the gain of 0.27% (RA) is retained, while the current worst-case memory bandwidth of subblocks
(relating to 4x4 bi-prediction) is not exceeded; the number of interpolations is also not higher, and the additional
superposition is cheap. However, as the goal is to establish more restrictions on the memory bandwidth of affine
subblocks (see notes on the planned CE above), it would depend on how the method of M0268 performs in
combination with those. Study this aspect in the CE.
JVET-M0432, CE2-related: Combination of CE2.2.3.d and affine inheritance from motion data line buffer
CE2.2.3.d replaces the local buffer for CPMVs with a history-based approach, thereby reducing local memory
storage, but came with a 0.07%/0.09% loss for RA/LB. This method re-invokes the use of the CPMV candidate
from the normal MV buffer at the upper CTU boundary; the loss compared to VTM3 is now 0.03%/0.07%.
Further study in CE.
JVET-M0343, Non-CE2: Simplified subblock motion derivation for SbTMVP
The complexity reduction is negligible, but small loss – no action.
M0311 was reviewed and agreed to become part of the CE on affine memory bandwidth reduction.
See section 6.2.
CIIP
JVET-M0874 BoG report on CE13 and CE13 related 360° video coding [J. Boyce]
See section 5.13. (Track A 1600-1700 Thursday 17 January, GJS)
The BoG met on 13 Jan 2019 from 1800 to 2030. The BoG met on 14 Jan 2019 from 1800 to 1900. The
BoG met again on 16 Jan from 1700 to 1800.
The BoG recommended to adopt JVET-M0892 for disabling of in-loop filters (deblocking, SAO, and
ALF) at vertical and horizontal boundaries signalled in the SPS at MinCbSizeY granularity.
It was noted that the change that was requested was not specific to 360° video.
It was noted that this proposed change is for entire columns / entire rows, not line segments that do not
cut through the entire picture.
For a conventional cubemap, the filter would be disabled for one horizontal line in the middle of the
picture.
It was discussed what sort of limit there would be for how many of these cuts would be allowed. One
suggestion was a limit of 3 cuts in each direction.
The granularity was also discussed – whether it was really necessary to have 4x4 granularity.
It was discussed how this would be implemented in a real decoder. Checking a long list of positions
would not be reasonable.
Due to a desire for further study of this before making a decision, no action was taken on this.
Further study was also recommended to consider more flexible in-loop filter disabling patterns, use cases,
and HW implementation complexity.
Open issues:
JVET-M0891 BoG report on CE7 related quantization and coefficient coding contributions
[Y. Ye]
This report was discussed in Track A Thursday 17 January 1700-1800 (GJS).
Overall Y U V Ave-UV
AI -0.29% -0.79% -2.16% -1.48%
RA -0.19% -1.44% -2.00% -1.72%
LD-B -0.08% -2.38% -4.83% -3.61%
LD-P -0.11% -2.46% -5.08% -3.77%
o This proposal achieves similar gains for all sequence classes and all coding configurations.
o This method is simpler than a previously proposed LM chroma mode that predicts Cr from
Cb, and achieves similar coding performance to that previous LM chroma mode. The
complexity reduction of this proposed method compared to the previous LM chroma mode
could be especially beneficial for the decoder.
o It seems that such coding gain was usually not available from coefficient coding methods.
o Low QP results (provided in a v3 of the contribution) were discussed at the second BoG
session. The chroma gains are reported to be somewhat lower than those of CTC, but the
gains in luma are similar to (AI case) or higher than (RA and LB cases) those of CTC.
o In Track A, a participant wondered whether there could be subjective artefacts.
o A participant asked how many blocks used this, and it was said that around half of the time
that both channels had a residual, it would use this mode.
o The proponent said it seems to also work for HDR.
Further study in a CE was planned for this.
BoG recommendations for CE testing:
CE on coefficient coding (cleanup):
o JVET-M0198, CE7-related: Unified Rice parameter derivation for coefficient level coding
(some more computes but hardware friendly, removing one variation)
o No test of this one (some loss): JVET-M0469, CE7-related: Unified Rice parameter
derivation for coefficient coding
o JVET-M0558: CE7-related: Template-based Rice parameter derivation (some more storage
of reconstructed values and computes but hardware friendly, removing one variation)
o No test of this one (don’t worry about number of contexts at this point): JVET-M0107: CE7-
related: Reduced local neighbourhood usage for transform coefficients coding
o No test of this one (don’t worry about number of contexts at this point): JVET-M0489: CE7-
related: Reduced context models for transform coefficients coding
o JVET-M0491: CE7-related: Reduced maximum number of context-coded bins for transform
coefficient coding
CE on screen content coding (coding efficiency)
o JVET-M0278: Non-CE7: Residual rearrangement for transform skipped blocks
o JVET-M0279: Non-CE7: Sign coding for transform skip
Regarding JVET-M0198, JVET-M0469, and JVET-M0558, during the BoG discussion the pros and cons
of the proposed approaches vs. the current design in VVC draft 3 were discussed.
JVET-M0685: Prediction of the QP value.
HEVC had a different QP prediction operation for when WPP is on and off. When WPP is off, any QP
change will propagate to all subsequently coded blocks in the slice.
In the current draft the QP is predicted as the average of the QP above and to the left. If either of those is
outside the CTU, the previous one in decoding order is used as the predictor.
This is a problem for parallel encoding, e.g., when encoding a CTU at the left edge, the QP predictor
(which is from the right edge) may not be known.
M0685 proposed storing the QP values of the bottom of the above CTU and considering those
“available”. If both are available, an average would be used; if one is unavailable, the other would be
used.
In the discussion of the contribution, another method was discussed, which was to only have the QP of
the above CTU available if the current CTU is the first CTU of the row.
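The M0685-style rule can be captured in a short sketch (the availability flags and the averaging rounding are assumptions for illustration; the draft text defines availability precisely):

```python
def qp_pred_m0685(qp_above, qp_left, above_available, left_available,
                  qp_prev):
    # Illustrative sketch of JVET-M0685-style prediction: bottom-row QPs
    # of the above CTU are treated as available; average when both
    # neighbours are available, use the available one when only one is,
    # else fall back to the previous QP in decoding order.
    if above_available and left_available:
        return (qp_above + qp_left + 1) >> 1  # assumed rounding
    if above_available:
        return qp_above
    if left_available:
        return qp_left
    return qp_prev
```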
It was commented that it may be undesirable to depend on requiring decoders to be able to use parallel
processing, since some (e.g., software) decoders may not be able to do that.
It was agreed to consider these approaches in CE3.
M0248 was reviewed in the context of CE2 rather than in this BoG.
M0265 was not reviewed in the BoG. See notes for that.
See section 7.
JVET-M0904 BoG report on neural networks for video coding [Y. Li, S. Liu]
This BoG report was presented in Track B on Thursday 17 Jan. 1500-XXXX.
This contribution provides the report of the BoG on neural networks (NN) for video coding, especially for
neural network based loop filter. An information report (JVET-M0691) is discussed in this BoG, and then
the CE plan.
10.7 List of actions taken affecting Draft 3 of VVC, VTM 3, and 360Lib
The following is a summary, in the form of a brief list, of the actions taken at the meeting that affect the
text of the VVC draft text, VTM or 360Lib description. Both technical and editorial issues are included.
This list is provided only as a summary – details of specific actions are noted elsewhere in this report and
the list provided here may not be complete and correct. The listing of a document number only indicates
that the document is related, not that it was adopted in whole or in part.
Category | Motivation | Modification | AI BD-R Y | RA BD-R Y | Document | Decision
In-loop filters
ALF | Fix | pcm_loop_filter_disabled_flag for ALF | 0.0% | 0.0% | JVET-M0277 | Decision (BF/text): Include disabling ALF as a third loop filter when the pcm_loop_filter_disabled_flag is set, as suggested in JVET-M0277.
Deblocking | Subjective quality | Long deblocking | 0.1% | 0.0% | JVET-M0471 | Decision: Adopt JVET-M0471, version 11.1.8 (specification text available in v2 upload, but needs another small modification for restriction of line buffer; was briefly reviewed in Track B Thu 1330), pending confirmation from the viewing and the more detailed report on complexity impact.
Deblocking | Subjective quality | Deblocking of CIIP boundaries | | | JVET-M0908 | Combination of JVET-M0103 and JVET-M0294.
Reconstruction | Coding efficiency | Picture reconstruction with mapping | -0.9% | -1.3% | JVET-M0427 | Decision: Adopt (modified as noted).
Intra
Prediction mode | Coding efficiency | Intra subpartitions | -0.6% | -0.3% | JVET-M0102 | Decision: Adopt 1.1.1 proposal, pending the provision of text and its review.
CCLM (HDR) | Coding efficiency | Modified CCLM downsampling filter | 0.0% | 0.0% | JVET-M0142 | Decision: Adopt 2.4.c with a high-level flag to switch between two chroma format type optimizations (pending test results for applying the type 2 scheme to type 0 content).
CCLM | Simplification | Table reduction in CCLM modelling | 0.0% | 0.0% | JVET-M0064 | Decision: Adopted.
Prediction | Cleanup | Harmonize the ref sample filtering | | | JVET-M0095 | Editorial action item: Agreed.
PDPC | Simplification | Simplified linear interpolation | 0.0% | 0.0% | JVET-M0238 | Decision: Adopted.
CPR (SCC) | Coding efficiency | Reference sample memory reuse | | | JVET-M0407 | Decision: Adopt JVET-M0407 (variant a).
Draft specification text shall be provided with CE input documents. Availability of specification text is important for developing a detailed understanding of the technology and for judging its impact on the complexity of the specification. There must also be sufficient time to study it in detail. CE contributions without sufficiently mature draft specification text in the CE input document should not be considered for adoption.
Plans for the CEs to be conducted were established Thursday 17 January (GJS); CE plan documents were reviewed Friday 18 January (GJS & JRO).
Lists of participants in CE documents should be pruned to include only the active participants. Read
access to software will be available to all members.
Draft text and test model algorithm description editing (AHG2) [B. Bross, J. Chen (co-chairs), J. Boyce, S. Kim, S. Liu, Y. Ye (vice-chairs)] (jvet@lists.rwth-aachen.de) (N)
- Produce and finalize JVET-M1001 VVC text specification draft 4.
- Produce and finalize JVET-M1002 VVC Test Model 4 (VTM 4) Algorithm and Encoder Description.
- Gather and address comments for refinement of these documents.
- Coordinate with the test model software development AhG to address issues relating to mismatches between software and text.
13 Output documents
The following documents were agreed to be produced or endorsed as outputs of the meeting. Names
recorded below indicate the editors responsible for the document production. Where applicable, dates of
planned finalization and corresponding parent-body document numbers are also noted.
It was noted as a reminder that in cases where a JVET document is also made available as an MPEG output document, a separate version under the MPEG document header should be generated. This version should be sent to GJS and JRO for upload.
JVET-M1000 Meeting Report of the 13th JVET Meeting [G. J. Sullivan, J.-R. Ohm] (2019-
03-08, near next meeting)
Initial versions of the meeting notes (d0 … d8) were made available on a daily basis during the meeting.
JVET-M1001 Versatile Video Coding (Draft 4) [B. Bross, J. Chen, S. Liu] [WG 11 N 18274]
(2019-03-08)
(Initial version planned to be made available by 2019-02-01.)
See the list of elements under section 10.7, as agreed by the 18 January plenary.
JVET-M1002 Algorithm description for Versatile Video Coding and Test Model 4 (VTM 4)
[J. Chen, Y. Ye, S. Kim] [WG 11 N 18725] (2019-03-08)
(Initial version planned to be made available by 2019-02-15.)
See the list of elements under section 10.7, as agreed by the 18 January plenary.
Remains valid – not updated: JVET-L1005 Methodology and reporting template for coding
tool testing [W.-J. Chien and J. Boyce] (2018-10-26)
JVET-M1006 Methodology and reporting template for neural network coding tool testing
[Y. Li, S. Liu, K. Kawamura] (2019-02-01)
This output was produced to capture aspects specific to enabling the study of neural network coding techniques.
JVET-M1010 JVET common test conditions and software reference configurations for
SDR video [F. Bossen, J. Boyce, X. Li, V. Seregin, K. Sühring] (2019-02-01)
Updated regarding CPR and hash search, used only for class F.
Enable inter MTS for lower resolutions? Perhaps in a CE, but not in the CTC.
Remains valid – not updated: JVET-L1011 JVET common test conditions and evaluation
procedures for HDR/WCG video [A. Segall, E. François, S. Iwamura,
D. Rusanovskyy] (2018-10-26)
Remains valid – not updated: JVET-L1012 JVET common test conditions and evaluation
procedures for 360° video [P. Hanhart, J. Boyce, K. Choi, J.-L. Lin] (2018-
10-26)
JVET-M1023 Description of Core Experiment 3 (CE3): Intra prediction and mode coding
[G. Van der Auwera, L. Li, A. Filippov]
JVET-M1024 Description of Core Experiment 4 (CE4): Inter prediction and motion vector
coding [H. Yang, G. Li, K. Zhang]
Potentially obsolete notes – new CEs that may fill gaps in the above numbering:
Adaptive loop filter [V. Seregin, …]
Post prediction/reconstruction filtering (include BF, HF, LIC, DIF) [J. Ström, S. Ikonin, …]
Neural network based loop filters [Y. Li, …]
The following list gives, for each contribution: JVET document number (MPEG document number): title [authors] (created; first upload; last upload).

JVET-M0001 (m45352): JVET AHG report: Project management (AHG1) [J.-R. Ohm, G. J. Sullivan] (created 2018-12-28 17:24:14; first upload 2019-01-09 00:06:28; last upload 2019-01-09 00:06:28)
JVET-M0002 (m45400): JVET AHG report: Draft text and test model algorithm description editing (AHG2) [B. Bross, J. Chen, J. Boyce, S. Kim, S. Liu, Y. Ye] (created 2019-01-01 03:28:59; first upload 2019-01-09 09:23:09; last upload 2019-01-10 10:58:16)
JVET-M0003 (m45401): JVET AHG report: Test model software development (AHG3) [F. Bossen, X. Li, K. Sühring] (created 2019-01-01 03:36:29; first upload 2019-01-09 09:47:41; last upload 2019-01-12 16:17:20)
JVET-M0004 (m45328): JVET AHG report: Test material and visual assessment (AHG4) [T. Suzuki, V. Baroncini, R. Chernyak, P. Hanhart, A. Norkin, J. Ye] (created 2018-12-28 06:58:26; first upload 2019-01-07 15:13:49; last upload 2019-01-07 15:13:49)
JVET-M0005 (m45684): JVET AHG report: Memory bandwidth consumption of coding tools (AHG5) [R. Hashimoto, T. Ikai, X. Li, D. Luo, H. Yang, M. Zhou] (created 2019-01-02 19:20:11; first upload 2019-01-09 09:02:58; last upload 2019-01-09 09:19:06)
JVET-M0006 (m45742): JVET AHG report: 360 video conversion software development (AHG6) [Y. He, K. Choi] (created 2019-01-03 01:00:15; first upload 2019-01-07 02:12:34; last upload 2019-01-07 02:12:34)
JVET-M0007 (m46280): JVET AHG report: Coding of HDR/WCG material (AHG7) [A. Segall, E. François, D. Rusanovskyy] (created 2019-01-08 19:10:24; first upload 2019-01-09 07:47:34; last upload 2019-01-10 19:11:28)
JVET-M0008 (m46264): JVET AHG report: 360° video coding tools and test conditions (AHG8) [J. Boyce, K. Choi, P. Hanhart, J.-L. Lin] (created 2019-01-08 14:58:16; first upload 2019-01-08 23:16:33; last upload 2019-01-08 23:16:33)
JVET-M0009 (m46196): JVET AHG report: Neural Networks in Video Coding (AHG9) [S. Liu, B. Choi, K. Kawamura, Y. Li, L. Wang, P. Wu, H. Yang] (created 2019-01-07 22:44:22; first upload 2019-01-07 22:46:23; last upload 2019-01-07 22:46:23)
JVET-M0010 (m46297): JVET AHG report: Encoding algorithm optimizations (AHG10) [A. M. Tourapis, A. Duenas, C. Helmrich, S. Ikonin, A. Norkin, R. Sjöberg] (created 2019-01-09 08:31:12; first upload 2019-01-09 08:31:36; last upload 2019-01-09 08:31:36)
JVET-M0011 (m45915): JVET AHG report: Screen Content Coding (AHG11) [S. Liu, J. Boyce, A. Filippov, Y.-C. Sun, J. Xu, M. Zhou] (created 2019-01-05 09:37:49; first upload 2019-01-07 22:16:08; last upload 2019-01-07 22:16:08)
JVET-M0012 (m45897): JVET AHG report: High-level parallelism and coded picture regions (AHG12) [T. Ikai, M. M. Hannuksela, R. Sjöberg, R. Skupin, W. Wan, Y.-K. Wang, S. Wenger] (created 2019-01-05 02:37:00; first upload 2019-01-09 08:24:15; last upload 2019-01-09 12:59:56)
JVET-M0013 (m46281): JVET AHG report: Tool reporting procedure (AHG13) [W.-J. Chien, J. Boyce, R. Chernyak, R. Hashimoto, Y.-W. Huang, S. Liu, D. Luo] (created 2019-01-08 20:39:28; first upload 2019-01-08 20:47:52; last upload 2019-01-08 20:47:52)
JVET-M0014 (m46269): JVET AHG report: Progressive intra refresh (AHG14) [J.-M. Thiesse, A. Duenas, K. Kazui, A. Tourapis] (created 2019-01-08 16:13:04; first upload 2019-01-08 23:36:53; last upload 2019-01-08 23:36:53)
JVET-M0015 (m46265): JVET AHG report: Bitstream decoding properties signalling (AHG15) [J. Boyce, J. Chen, S. Deshpande, M. Karczewicz, A. Tourapis, Y.-K. Wang, S. Wenger] (created 2019-01-08 15:00:58; first upload 2019-01-13 15:33:27; last upload 2019-01-13 15:33:27)
JVET-M0016 (m45673): JVET AHG report: Implementation studies (AHG16) [M. Zhou, J. An, E. Chai, K. Choi, S. Sethuraman, T. Hsieh, X. Xiu] (created 2019-01-02 18:37:14; first upload 2019-01-07 03:02:57; last upload 2019-01-31 05:35:36)
JVET-M0017 (m45872): JVET AHG report: High-level syntax (AHG17) [R. Sjöberg, S. Deshpande, M. M. Hannuksela, R. Skupin, Y.-K. Wang, S. Wenger] (created 2019-01-04 13:58:52; first upload 2019-01-09 12:42:32; last upload 2019-01-09 12:42:32)
JVET-M0021 (m45873): CE1: Summary report on partitioning [J. Ma, F. Le Léannec, M. W. Park] (created 2019-01-04 14:00:45; first upload 2019-01-07 07:17:44; last upload 2019-01-10 09:11:34)
JVET-M0022 (m45863): CE2: Summary report on sub-block based motion prediction [Y. He, C.-Y. Chen, C.-C. Chen] (created 2019-01-04 07:30:26; first upload 2019-01-09 00:31:59; last upload 2019-01-09 15:41:29)
JVET-M0023 (m45355): CE3: Summary report on intra prediction and mode coding [G. Van der Auwera, J. Heo, A. Filippov] (created 2018-12-28 18:55:18; first upload 2019-01-06 04:12:23; last upload 2019-01-09 15:38:42)
JVET-M0024 (m45900): CE4: Summary report on inter prediction and motion vector coding [H. Yang, S. Liu, K. Zhang] (created 2019-01-05 03:06:52; first upload 2019-01-08 04:33:43; last upload 2019-01-10 10:50:25)
JVET-M0025 (m45878): CE5: Summary report on the Arithmetic Coding Engine [H. Kirchhoffer, A. Said] (created 2019-01-04 15:48:58; first upload 2019-01-09 15:53:12; last upload 2019-01-10 16:07:15)
JVET-M0026 (m45809): CE6: Summary Report on Transforms and Transform Signalling [A. Said, X. Zhao] (created 2019-01-03 09:21:03; first upload 2019-01-09 01:39:16; last upload 2019-01-10 19:01:12)