A half-adder (HA) is the basic circuit that accumulates two single bits, and a full-adder (FA) accumulates three bits. An HA is therefore called a 2:2 counter and an FA a 3:2 counter. An HA is more hardware-efficient than an FA. Carry-save adders (CSAs) operate on multiple operands, and a basic CSA is equivalent to an FA block. Several other counters also exist that can be used to design fast accumulation circuits. The main objective is to minimize the number of basic counters and thereby reduce hardware complexity, so a suitable arrangement of the partial products is important. An example of the partial products of a multiplier is shown in Figure 1 (a). Figure 1 (b) shows how the partial products can be reorganized to reduce the number of counters.
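The counter naming can be checked with a short behavioural model. The sketch below is only an illustration in Python, not a hardware description; the function names `half_adder` and `full_adder` are chosen here for the example. Each counter takes its input bits and returns a (sum, carry) pair whose weighted value equals the number of input bits that are 1.

```python
from itertools import product


def half_adder(a, b):
    """2:2 counter: two input bits -> (sum, carry)."""
    return a ^ b, a & b


def full_adder(a, b, c):
    """3:2 counter: three input bits -> (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)  # majority of the three inputs
    return s, carry


# Exhaustive check: the 2-bit output (carry, sum) counts the ones at the input.
for a, b in product((0, 1), repeat=2):
    s, carry = half_adder(a, b)
    assert 2 * carry + s == a + b

for a, b, c in product((0, 1), repeat=3):
    s, carry = full_adder(a, b, c)
    assert 2 * carry + s == a + b + c
```

Both counters produce two output bits, which is why an HA (two inputs) does not reduce the bit count on its own, while an FA turns three bits into two.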

Once the partial products are reorganized, the carry-save operation can be performed. Basic 2:2 or 3:2 counters are applied wherever possible. Figure 2 (a) shows that 3 HAs and 8 FAs are used at level 1. The result of level 1 is shown in Figure 2 (b). At level 2, 3 HAs and 4 FAs are sufficient, as shown in Figure 2 (c). Similarly, the last carry-save addition is performed at level 3, as shown in Figure 2 (d). A carry propagation adder (CPA) is needed at the final stage to obtain the final result. Compared with the array multiplication scheme, this technique reduces the number of addition levels from six to four and also reduces the number of counters (HAs and FAs).
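The level structure can be sketched at word level in Python. This is a simplified model under stated assumptions, not the exact bit arrangement of Figure 2: each carry-save step compresses three whole rows into a sum row and a carry row, rows that do not fill a group of three pass through unchanged, and the 6-bit operand width used in the example is an assumption made here. The names `csa`, `reduce_rows` and `multiply` are hypothetical.

```python
def csa(x, y, z):
    """Carry-save add three non-negative integers into (sum, carry) words."""
    s = x ^ y ^ z                           # bitwise sum without carries
    c = ((x & y) | (y & z) | (x & z)) << 1  # carries, moved to the next weight
    return s, c


def reduce_rows(rows):
    """Compress rows level by level until at most two remain.

    Returns (remaining_rows, number_of_carry_save_levels).
    """
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3    # rows that form complete triples
        nxt = []
        for i in range(0, full, 3):
            nxt.extend(csa(rows[i], rows[i + 1], rows[i + 2]))
        nxt.extend(rows[full:])             # leftover rows pass through
        rows = nxt
        levels += 1
    return rows, levels


def multiply(a, b, width):
    """Multiply two unsigned `width`-bit integers via carry-save reduction."""
    rows = [(a << i) if (b >> i) & 1 else 0 for i in range(width)]
    rows, levels = reduce_rows(rows)
    return sum(rows), levels                # the final sum plays the role of the CPA


product_value, levels = multiply(45, 27, 6)
assert product_value == 45 * 27
assert levels == 3  # three carry-save levels, followed by one CPA
```

With six partial-product rows the row count goes 6, 4, 3, 2, so three carry-save levels plus the CPA give four addition levels in this model.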

The number of counters can be reduced further by reducing the number of bits in each column only to the closest element of the set {3, 4, 6, 9, 13, 19, ...}. This idea is illustrated in Figure 3 for the same example shown in Figure 2. This technique uses a total of 5 HAs and 15 FAs, whereas the previous scheme uses a total of 9 HAs and 16 FAs. The savings in counters are substantial for multipliers with larger operand widths.
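The column-height rule can be turned into a small counting sketch. The Python model below only tracks how many bits sit in each column; it does not model the bit values or the wiring of Figure 3. It assumes an unsigned n×n multiplier whose column j holds min(j + 1, 2n - 1 - j) partial-product bits, and it appends the final target height 2 (the two rows fed to the CPA) to the sequence. The operand width n = 6 is an assumption, since the width is not stated in the text; the names `dadda_targets` and `dadda_counts` are hypothetical.

```python
def dadda_targets(max_height):
    """Target heights 2, 3, 4, 6, 9, 13, 19, ... below max_height, largest first."""
    targets = [2]
    while targets[-1] * 3 // 2 < max_height:
        targets.append(targets[-1] * 3 // 2)
    return targets[::-1]


def dadda_counts(n):
    """Return (half_adders, full_adders) used to reduce an n x n bit array."""
    # Column heights of the partial-product array, plus one spare top column.
    heights = [min(j + 1, 2 * n - 1 - j) for j in range(2 * n - 1)] + [0]
    ha = fa = 0
    for target in dadda_targets(n):
        carries_in = 0
        for j in range(len(heights)):
            h = heights[j] + carries_in    # bits in column j after this stage
            carries_out = 0
            while h > target:
                if h == target + 1:
                    ha += 1                # HA removes one bit from the column
                    h -= 1
                else:
                    fa += 1                # FA removes two bits from the column
                    h -= 2
                carries_out += 1           # each counter sends a carry upward
            heights[j] = h
            carries_in = carries_out
    return ha, fa


assert dadda_targets(6) == [4, 3, 2]
assert dadda_counts(6) == (5, 15)
```

Under the 6×6 assumption the sketch gives 5 HAs and 15 FAs, the same totals quoted above for this technique. Counters are applied only where a column would otherwise exceed the stage target, which is where the saving over the previous scheme comes from.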