\label{quantmatrices}

This annex specifies the default quantisation matrices to be used
in the low-delay syntax and provides an informative description of quantisation
matrix design principles and of quantiser selection in both the core
and low-delay syntaxes.

\subsection{Quantisation matrices (low-delay syntax)}
\label{defaultquantmatrices}

This section defines default quantisation matrices to be used
for the quantisation of slice coefficients in the low-delay syntax.
The following tables define matrices for $\TransformDepth\leq 4$.
Values of $\TransformDepth$ not present in the tables
in this section shall require a custom matrix to be encoded,
as per Section \ref{sliceparams}. Informative advice for
constructing quantisation matrices based on noise power
conservation and perceptual weighting is given in
Annex \ref{custommatrices}.
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 5 & 5 & 5 & 5\\
\hline
1 & HL, LH, HH & - & 3, 3, 0 & 3, 3, 0 & 3, 3, 0 & 3, 3, 0 \\
\hline
2 & HL, LH, HH & - & - & 4, 4, 1 & 4, 4, 1 & 4, 4, 1 \\
\hline
3 & HL, LH, HH & - & - & - & 5, 5, 2 & 5, 5, 2 \\
\hline
4 & HL, LH, HH & - & - & - & - & 6, 6, 3 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==0$ (Deslauriers-Dubuc (9,7))
\label{table:qm0}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 4 & 4 & 4 & 4\\
\hline
1 & HL, LH, HH & - & 2, 2, 0 & 2, 2, 0 & 2, 2, 0 & 2, 2, 0 \\
\hline
2 & HL, LH, HH & - & - & 4, 4, 2 & 4, 4, 2 & 4, 4, 2 \\
\hline
3 & HL, LH, HH & - & - & - & 5, 5, 3 & 5, 5, 3 \\
\hline
4 & HL, LH, HH & - & - & - & - & 7, 7, 5 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==1$ (LeGall (5,3))
\label{table:qm1}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 5 & 5 & 5 & 5\\
\hline
1 & HL, LH, HH & - & 3, 3, 0 & 3, 3, 0 & 3, 3, 0 & 3, 3, 0 \\
\hline
2 & HL, LH, HH & - & - & 4, 4, 1 & 4, 4, 1 & 4, 4, 1 \\
\hline
3 & HL, LH, HH & - & - & - & 5, 5, 2 & 5, 5, 2 \\
\hline
4 & HL, LH, HH & - & - & - & - & 6, 6, 3 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==2$ (Deslauriers-Dubuc (13,7))
\label{table:qm2}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 8 & 12 & 16 & 20\\
\hline
1 & HL, LH, HH & - & 4, 4, 0 & 8, 8, 4 & 12, 12, 8 & 16, 16, 12 \\
\hline
2 & HL, LH, HH & - & - & 4, 4, 0 & 8, 8, 4 & 12, 12, 8 \\
\hline
3 & HL, LH, HH & - & - & - & 4, 4, 0 & 8, 8, 4 \\
\hline
4 & HL, LH, HH & - & - & - & - & 4, 4, 0 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==3$ (Haar with no shift)
\label{table:qm3}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 8 & 8 & 8 & 8\\
\hline
1 & HL, LH, HH & - & 4, 4, 0 & 4, 4, 0 & 4, 4, 0 & 4, 4, 0 \\
\hline
2 & HL, LH, HH & - & - & 4, 4, 0 & 4, 4, 0 & 4, 4, 0 \\
\hline
3 & HL, LH, HH & - & - & - & 4, 4, 0 & 4, 4, 0 \\
\hline
4 & HL, LH, HH & - & - & - & - & 4, 4, 0 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==4$ (Haar with single shift per level)
\label{table:qm4}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 0 & 0 & 0 & 0\\
\hline
1 & HL, LH, HH & - & 4, 4, 8 & 4, 4, 8 & 4, 4, 8 & 4, 4, 8 \\
\hline
2 & HL, LH, HH & - & - & 8, 8, 12 & 8, 8, 12 & 8, 8, 12 \\
\hline
3 & HL, LH, HH & - & - & - & 13, 13, 17 & 13, 13, 17 \\
\hline
4 & HL, LH, HH & - & - & - & - & 17, 17, 21 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==5$ (Fidelity)
\label{table:qm6}}
\end{table}
\begin{table}[!ht]
\centering
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{\cellcolor[gray]{0.75}}& \multicolumn{5}{|c|}{\cellcolor[gray]{0.75}{$\TransformDepth$}} \\
\hline
Level & Orientation & 0 & 1 & 2 & 3 & 4 \\
\hline
0 & LL & 0 & 3 & 3 & 3 & 3\\
\hline
1 & HL, LH, HH & - & 1, 1, 0 & 1, 1, 0 & 1, 1, 0 & 1, 1, 0 \\
\hline
2 & HL, LH, HH & - & - & 4, 4, 2 & 4, 4, 2 & 4, 4, 2 \\
\hline
3 & HL, LH, HH & - & - & - & 6, 6, 5 & 6, 6, 5 \\
\hline
4 & HL, LH, HH & - & - & - & - & 9, 9, 7 \\
\hline
\end{tabular}
\caption{Default quantisation matrices for $\WaveletIndex==6$ (Daubechies (9,7))
\label{table:qm7}}
\end{table}

\clearpage
\begin{informative*}
\subsection{Quantisation matrix design and quantiser selection (Informative)}
\label{qmatrixdesign}

This section provides an informative guide to the principles used to design the default
quantisation matrices.

\subsubsection{Noise power normalisation}
\label{noisenorm}

The quantisation matrices defined in the preceding section are designed to counteract the
differential power gain of the various wavelet filters, so that quantisation noise from
each subband is weighted equally in terms of its contribution to noise power when transformed
back into the picture domain. Let $\alpha$ and $\beta$ represent the noise gain factors of
the low-pass and high-pass wavelet filters used in wavelet decomposition. In a single level of
wavelet decomposition, quantisation noise in each of the four subbands is therefore weighted
by the factors shown in Figure \ref{fig:onelevelweight}.
\end{informative*}
\setlength{\unitlength}{1em}
\begin{figure}[!h]
\centering
\begin{picture}(20,27)
\put(0,5){\line(1,0){20}}
\put(0,5){\line(0,1){20}}
\put(20,5){\line(0,1){20}}
\put(20,25){\line(-1,0){20}}

\put(10,5){\line(0,1){20}}
\put(0,15){\line(1,0){20}}

\put(3,19.5){\text{\Large LL -- $\alpha^2$}}
\put(3,9.5){\text{\Large LH -- $\alpha\beta$}}
\put(13,19.5){\text{\Large HL -- $\alpha\beta$}}
\put(13,9.5){\text{\Large HH -- $\beta^2$}}
\end{picture}
\caption{Subband weights for a 1-level decomposition}\label{fig:onelevelweight}
\end{figure}
\begin{informative*}

For higher levels of decomposition, these subband weighting factors iterate
in the same manner as the wavelet transform itself. For example, with a two-level
decomposition, the first-level LL band, with weight $\alpha^2$, is further decomposed
to give four more bands with weights as for the 1-level decomposition, but multiplied
by $\alpha^2$. This yields the weights shown in Figure \ref{fig:twolevelweight}.
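
As an illustrative sketch only, the same iteration can be written out for an arbitrary
decomposition depth. The symbols $N$ (the depth), $k$ (the level) and $w(\cdot)$ (the subband
weight) are introduced here for the example and are not used elsewhere in this specification.
Levels are assumed to be numbered as in the tables of the preceding section, so that level 1
holds the lowest-frequency HL, LH and HH bands and level $N$ the highest-frequency ones.
Under that assumption, for $k=1,\ldots,N$:
\begin{eqnarray*}
w(\mathrm{LL}) & = & \alpha^{2N} \\
w(\mathrm{HL}_k) = w(\mathrm{LH}_k) & = & \alpha^{2(N-k)}\,\alpha\beta \\
w(\mathrm{HH}_k) & = & \alpha^{2(N-k)}\,\beta^2
\end{eqnarray*}
Setting $N=2$ recovers the two-level weights: $\alpha^4$ for LL, $\alpha^3\beta$ and
$\alpha^2\beta^2$ at level 1, and $\alpha\beta$ and $\beta^2$ at level 2.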
\end{informative*}
\setlength{\unitlength}{1em}
\begin{figure}[!h]
\centering
\begin{picture}(30,40)
\put(0,5){\line(1,0){30}}
\put(0,5){\line(0,1){30}}
\put(30,5){\line(0,1){30}}
\put(30,35){\line(-1,0){30}}

\put(15,5){\line(0,1){30}}
\put(0,20){\line(1,0){30}}

\put(5.5,12){\text{\Large LH -- $\alpha\beta$}}
\put(20.5,27){\text{\Large HL -- $\alpha\beta$}}
\put(20.5,12){\text{\Large HH -- $\beta^2$}}

\put(7.5,20){\line(0,1){15}}
\put(0,27.5){\line(1,0){15}}

\put(2,31){\text{\Large LL -- $\alpha^4$}}
\put(2,23.5){\text{\Large LH -- $\alpha^3\beta$}}
\put(9,31){\text{\Large HL -- $\alpha^3\beta$}}
\put(9,23.5){\text{\Large HH -- $\alpha^2\beta^2$}}

\end{picture}
\caption{Subband weights for a 2-level decomposition}\label{fig:twolevelweight}
\end{figure}
\begin{informative*}

In this specification, wavelet synthesis filters have been defined in terms of lifting stages,
which are filters operating on subsampled data. Wavelet filters are more traditionally
represented in terms of an iterated binary polyphase filter bank: the relationship between
these representations is described in Annex \ref{lifting}. The factors $\alpha$ and $\beta$
are most easily computed from the filter bank representation. In this case $\alpha$ is either
the RMS power gain of the low-pass synthesis filter, or the {\em reciprocal} of the RMS power
gain of the low-pass analysis filter; and $\beta$ is either the RMS power gain of the high-pass
synthesis filter, or the reciprocal of the RMS power gain of the high-pass analysis filter.

Thus, in the terminology of Annex \ref{lifting},
$\alpha=\dfrac{1}{(\sum_n h(n)^2)^{\frac{1}{2}}}$ or
$\alpha=(\sum_n \tilde{h}(n)^2)^{\frac{1}{2}}$, and

$\beta=\dfrac{1}{(\sum_n g(n)^2)^{\frac{1}{2}}}$ or
$\beta=(\sum_n \tilde{g}(n)^2)^{\frac{1}{2}}$.

These alternative definitions arise because the wavelet filters defined in this specification
are not orthogonal, but technically {\em biorthogonal} and so, strictly speaking, there is
not power addition of the quantisation noise in each subband. The values used for quantisation
matrices have been computed from the analysis rather than the synthesis filters, as this yields
better compression results in practice.

Note that these factors must also take into account the shift factors used to add accuracy
bits prior to each wavelet decomposition stage. For a filter shift of $d$, $\alpha$ and
$\beta$ are each multiplied by $2^{-d/2}$.

Given a subband weighting factor $w$, a quantisation offset for that subband may be defined
as $4*\log_2(w)$ rounded to the nearest integer. These offsets are then normalised so as
to be non-negative, to produce the tables of the preceding section.
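
As a worked illustration of this procedure (a sketch, not a normative derivation), consider
the Haar filter with no shift. The analysis filters are assumed here to be
$h=(\frac{1}{2},\frac{1}{2})$ and $g=(-1,1)$; this is an assumption made for the example, and
Annex \ref{lifting} remains the reference for the filter definitions. Then
\begin{eqnarray*}
\alpha & = & \dfrac{1}{(\frac{1}{4}+\frac{1}{4})^{\frac{1}{2}}} = \sqrt{2} \\
\beta & = & \dfrac{1}{(1+1)^{\frac{1}{2}}} = \dfrac{1}{\sqrt{2}}
\end{eqnarray*}
For a 1-level decomposition the subband weights are $\alpha^2=2$ (LL), $\alpha\beta=1$ (HL and LH)
and $\beta^2=\frac{1}{2}$ (HH), so that $4*\log_2(w)$ gives offsets of $4$, $0$ and $-4$.
Adding $4$ to make the smallest offset zero gives $8$, $4$ and $0$, which agrees with the
$\TransformDepth=1$ column of Table \ref{table:qm3}.

Under the same assumption, a single shift ($d=1$) scales both factors by $2^{-1/2}$, giving
$\alpha=1$ and $\beta=\frac{1}{2}$. The weights become $1$, $\frac{1}{2}$ and $\frac{1}{4}$,
the offsets $0$, $-4$ and $-8$, and the normalised values $8$, $4$ and $0$, in agreement with
Table \ref{table:qm4}.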

\subsubsection{Custom quantisation matrices}
\label{custommatrices}

Custom matrices may be defined that take into account not only noise power normalisation
but also perceptual weighting based on spatial frequency. Additional multiplicative factors
may be computed for each subband, which produce a matrix of quantisation offsets which may
then be added to the default unweighted quantisation matrices to produce a weighted quantisation
matrix.

An example perceptual weighting may be constructed from the CCIR 959 Contrast Sensitivity
Function (CSF). This is a function $csf(s)$ which produces a value representing the
sensitivity to detail at a given normalised spatial frequency $s$. For luminance, it is defined as
\[csf(s)=0.255*(1+0.2561*s^2)^{-0.75}\]

Assuming an isotropic response, we may form a 2-d perceptual weighting function on
horizontal and vertical spatial frequencies $x_s,y_s$ by
\begin{eqnarray*}
c(x_s,y_s) & = & \dfrac{1}{csf((x_s^2+y_s^2)^{\frac{1}{2}})} \\
 & = & \dfrac{1}{0.255}*(1+0.2561*(x_s^2+y_s^2))^{0.75}
\end{eqnarray*}

Each subband in a wavelet decomposition represents a subset of spatial frequencies according
to level and orientation, partitioning the spatial frequency domain as per Figure \ref{fig:orientlevel}.
Note that this partitioning is un-normalised, since output pictures (and their compression artefacts) may
be viewed at a range of distances.

Accordingly we may pick a representative, un-normalised horizontal and vertical spatial frequency
$(f_x(b),f_y(b))$ for each subband $b$, for example the middle frequency of the band. An LH band $b$
at level 1 in a 1-level decomposition will have mid frequency at $(pw/4,3*ph/4)$, where $pw$ and $ph$
are the padded width and height of the picture (Section \ref{subbandwidthheight}). This may be turned
into a true spatial frequency by normalising by the number of horizontal and vertical cycles per degree
the output pictures will subtend at the target viewing distance and aspect ratio:
\[ (f_x(b)/cpd_x,f_y(b)/cpd_y)\]
and this value may be fed into the weighting function to get a value $c(b)$. The appropriate
quantisation offset for that subband is then $4*\log_2(c(b))$, which may be used to define a modified
quantisation matrix.
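
As a sketch of how this offset varies with frequency, write $s(b)$ for the normalised radial
frequency of subband $b$, that is
$s(b)=((f_x(b)/cpd_x)^2+(f_y(b)/cpd_y)^2)^{\frac{1}{2}}$. The symbol $s(b)$ is introduced here
for illustration only. Substituting the luminance CSF into $c=1/csf$ gives
\begin{eqnarray*}
4*\log_2(c(b)) & = & 4*\log_2\left(\dfrac{(1+0.2561*s(b)^2)^{0.75}}{0.255}\right) \\
 & = & 3*\log_2(1+0.2561*s(b)^2) - 4*\log_2(0.255)
\end{eqnarray*}
The second term is the same for every subband, so only the first term alters the quantisation
of one subband relative to another. For example, a subband for which $0.2561*s(b)^2=1$
(that is, $s(b)\approx 1.98$) receives an offset $3*\log_2(2)=3$ greater than a subband at $s(b)=0$.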

\end{informative*}