FPGA-based and ASIC-based implementations generally use fixed-point arithmetic, whereas the majority of dedicated controllers use floating point. When an FPGA has to be interfaced to a microcontroller, a fixed-point to floating-point conversion unit is therefore required. Before the floating-point architectures are discussed, the scheme for converting from fixed point to floating point is described first.
The steps for converting a fixed-point number to a floating-point number are:
- Invert the number (take its two's complement to obtain the magnitude) if the number is negative, that is, if the MSB is logic 1.
- Count the leading zeros present in the number.
- Compute the mantissa by left shifting the number by the leading-zero count.
- Subtract the leading-zero count from (m-1), where m is the number of bits reserved for the integer part, including the MSB.
- Add the result to the bias value to get the exponent.
- The sign of the fixed-point number is the sign of the floating-point number.
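The steps above can be sketched in software as follows. This is a minimal behavioural sketch, not the hardware architecture itself: the function names `fixed_to_float` and `pack` are hypothetical, the defaults (16-bit word, m = 6, 4-bit exponent, 11-bit mantissa, bias of 7) are taken from the worked example in this section, and the handling of a zero input is an assumption because the text does not define it. The sketch also does not guard against the exponent leaving the 4-bit range, which happens for inputs with more than 12 leading zeros.

```python
def fixed_to_float(raw, width=16, m=6, man_bits=11, bias=7):
    """Return (sign, exponent, mantissa) for a two's-complement fixed-point word."""
    mask = (1 << width) - 1
    raw &= mask
    sign = raw >> (width - 1)                # the MSB is the sign bit
    mag = (-raw) & mask if sign else raw     # negate negative inputs
    if mag == 0:
        return sign, 0, 0                    # assumption: zero maps to all-zero fields
    lzc = width - mag.bit_length()           # leading-zero count
    shifted = (mag << lzc) & mask            # the leading one now sits in the MSB
    # Drop the leading one and keep the next man_bits bits (truncation).
    mantissa = (shifted >> (width - 1 - man_bits)) & ((1 << man_bits) - 1)
    exponent = (m - 1) - lzc + bias          # may fall below 0 for very small inputs
    return sign, exponent, mantissa


def pack(sign, exponent, mantissa, exp_bits=4, man_bits=11):
    """Concatenate the three fields into one word: sign | exponent | mantissa."""
    return (sign << (exp_bits + man_bits)) | (exponent << man_bits) | mantissa


fields = fixed_to_float(0b001000_0010000000)   # 8.125 with m = 6
print(fields, format(pack(*fields), "016b"))   # (0, 10, 32) 0101000000100000
```

The leading-zero count is computed here with Python's `int.bit_length()`; in hardware the same role is played by a dedicated leading zero counter.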
The following example illustrates the fixed-point to floating-point conversion for a 16-bit data width.
- The input fixed-point number is a = 001000_0010000000, whose decimal value is 8.125 for m = 6 bits.
- As this number is positive, no inversion is required.
- There are two leading zeros in the number, so it is left shifted by two bits. The result of the shift is 100000_1000000000.
- The mantissa is M=00000_100000, obtained by taking the 11 bits that follow the MSB (the MSB itself is excluded).
- The leading-zero count is subtracted from (m-1), and the result is (5 - 2) = 3.
- The result is added to the bias to get the exponent (E = (7+3) = 10).
- Thus the floating point number is 0_1010_00000100000.
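The result can be checked by decoding the word back to a real value. The sketch below assumes the layout implied by the example (1 sign bit, 4 exponent bits, 11 mantissa bits, bias of 7) and an implicit leading one in front of the mantissa, since the MSB was excluded when the mantissa was formed; `decode_float16` is a hypothetical helper name.

```python
def decode_float16(word, exp_bits=4, man_bits=11, bias=7):
    """Decode a sign | exponent | mantissa word, assuming an implicit leading one."""
    sign = (word >> (exp_bits + man_bits)) & 1
    exponent = (word >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = word & ((1 << man_bits) - 1)
    significand = 1 + mantissa / (1 << man_bits)   # 1.M
    value = significand * 2.0 ** (exponent - bias)
    return -value if sign else value


print(decode_float16(0b0_1010_00000100000))  # 8.125
```

Here the significand is 1 + 32/2048 = 1.015625 and the unbiased exponent is 10 - 7 = 3, so the value is 1.015625 × 8 = 8.125, which matches the fixed-point input.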
A simple scheme for fixed-point to floating-point conversion is shown in Figure 1. Two 4-bit adder/subtractor blocks are used to determine the exponent, and one 16-bit adder/subtractor performs the inversion when necessary. The VLSH block performs a variable left shift according to the leading-zero count, which is produced by a block called the leading zero counter. The operation of the architecture follows the steps listed above. In the mantissa computation path, 11 bits are taken as the mantissa to form a 16-bit floating-point output. This 16-bit floating-point representation cannot represent all the numbers that can be represented in the 16-bit fixed-point representation. For example, in the 16-bit format the numbers which are less than can not be represented for exponent value of 4.
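One source of this loss can be seen by modelling the mantissa path. After normalization, 15 bits follow the leading one but only 11 of them are kept, so the four least significant bits of the shifted word are discarded. The sketch below reports how many fixed-point LSBs a positive input loses. It assumes plain truncation (no rounding) and the 16-bit word with 11 mantissa bits used in this section; `dropped_lsbs` is a hypothetical helper name.

```python
WIDTH = 16      # fixed-point word length
MAN_BITS = 11   # mantissa bits kept in the floating-point word


def dropped_lsbs(raw):
    """Return the part of a positive fixed-point word lost by mantissa truncation."""
    mag = raw & 0x7FFF                         # positive inputs only (sign bit is 0)
    if mag == 0:
        return 0
    lzc = WIDTH - mag.bit_length()             # leading-zero count
    shifted = (mag << lzc) & 0xFFFF            # output of the variable left shift
    discarded = shifted & ((1 << (WIDTH - 1 - MAN_BITS)) - 1)  # low 4 bits
    return discarded >> lzc                    # back in units of one fixed-point LSB


print(dropped_lsbs(0b010000_0000000001))  # 1
print(dropped_lsbs(0b001000_0010000000))  # 0
```

The first input (16 plus one LSB) has only one leading zero, so its lowest bit falls outside the 11-bit mantissa and is lost. The second input (8.125) survives unchanged. Inputs with four or more leading zeros lose nothing in this path, because the shift fills the discarded positions with zeros.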