From 1d9b3f7f053e666b4f21cccc08e05cbc7847ae59 Mon Sep 17 00:00:00 2001
From: Thea Aarrestad <thea.aarrestad@cern.ch>
Date: Wed, 6 Dec 2023 13:28:55 +0100
Subject: [PATCH] changing figure definition for Git rendering

---
 part2/part2_compression.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/part2/part2_compression.ipynb b/part2/part2_compression.ipynb
index 09525c5..a8e5ef4 100644
--- a/part2/part2_compression.ipynb
+++ b/part2/part2_compression.ipynb
@@ -27,7 +27,7 @@
  "\n",
  "Below you can see an example of a tensor with a (symmetric) dynamic range of $x_{f}$ [-amax, amax] mapped through quantization to a an 8 bit integer, $2^8=256$ discrete values in the interval [-128, 127] (32-bit floating-point can represent ~4B numbers in the interval [-3.4e38, 3.40e38]).\n",
  "\n",
- "<img src=\"images/8-bit-signed-integer-quantization.png\" width=\"800\">\n",
+ "<img src=\"images/8-bit-signed-integer-quantization.png\" width=\"800\"/>\n",
  "\n",
  "Quantization of floating point numbers can be achieved using the quantization operation\n",
  "$$x_{q} = Clip(Round(x_{f}/scale))$$\n",
@@ -35,7 +35,7 @@
  "\n",
  "On FPGA, we do not use int8 quantization, but fixed-point quantization, bu the idea is similar. Fixed-point representation is a way to *express fractions with integers* and offers more control over precision and range. We can split the $W$-bits making up an integer (in our case $W=8$) to represent the integer part of a number and the fractional part of the number. We usually reserve 1-bit representing the sign of the digit. The radix splits the remaining $W-$1 bits to $$I most significant bits representing the integer value and $F$ least significant bits representing the fraction. We write this as $<W,I>$, where $F=W-1-I$. Here is an example for an unsigned $<8,3>$:\n",
  "\n",
- "<img src=\"images/fixedpoint.png\" width=\"400\">\n",
+ "<img src=\"images/fixedpoint.png\" width=\"400\"/>\n",
  "\n",
  "This fixed point number corresponds to $2^4\\cdot0+2^3\\cdot0+2^2\\cdot0+2^1\\cdot1+2^0\\cdot0+2^{-1}\\cdot1+2^{-2}\\cdot1+2^{-3}\\cdot0=2.75$.\n",
  "\n",
-- 
GitLab