Double dabble

In computer science, the double dabble algorithm is used to convert binary numbers into binary-coded decimal (BCD) notation.^[1]^[2] It is also known as the Shift-and-add-3 algorithm, and can be implemented using a small number of gates in computer hardware, but at the expense of high latency.^[3]

Algorithm

The algorithm operates as follows:

Suppose the original number to be converted is stored in a register that is n bits wide. Reserve a scratch space wide enough to hold both the original number and its BCD representation; $n + 4\times ceil (n /3)$ bits will be enough. It takes a maximum of 4 bits in binary to store each decimal digit.

Then partition the scratch space into BCD digits (on the left) and the original register (on the right). For example, if the original number to be converted is eight bits wide, the scratch space would be partitioned as follows:

100s Tens Ones   Original
0010 0100 0011   11110011

The diagram above shows the binary representation of 243₁₀ in the original register, and the BCD representation of 243 on the left.

The scratch space is initialized to all zeros, and then the value to be converted is copied into the "original register" space on the right.

0000 0000 0000   11110011

The algorithm then iterates n times. On each iteration, the entire scratch space is left-shifted one bit. However, before the left-shift is done, any BCD digit which is greater than 4 is incremented by 3. The increment ensures that a value of 5, incremented and left-shifted, becomes 16, thus correctly "carrying" into the next BCD digit.

The double-dabble algorithm, performed on the value 243₁₀, looks like this:

0000 0000 0000   11110011   Initialization
0000 0000 0001   11100110   Shift
0000 0000 0011   11001100   Shift
0000 0000 0111   10011000   Shift
0000 0000 1010   10011000   Add 3 to ONES, since it was 7
0000 0001 0101   00110000   Shift
0000 0001 1000   00110000   Add 3 to ONES, since it was 5
0000 0011 0000   01100000   Shift
0000 0110 0000   11000000   Shift
0000 1001 0000   11000000   Add 3 to TENS, since it was 6
0001 0010 0001   10000000   Shift
0010 0100 0011   00000000   Shift
   2    4    3
       BCD

Now eight shifts have been performed, so the algorithm terminates. The BCD digits to the left of the "original register" space display the BCD encoding of the original value 243.

Another example for the double dabble algorithm - value 65244₁₀.

 10⁴  10³  10²   10¹  10⁰    Original binary
0000 0000 0000 0000 0000   1111111011011100   Initialization
0000 0000 0000 0000 0001   1111110110111000   Shift left (1st)
0000 0000 0000 0000 0011   1111101101110000   Shift left (2nd)
0000 0000 0000 0000 0111   1111011011100000   Shift left (3rd)
0000 0000 0000 0000 1010   1111011011100000   Add 3 to 10⁰, since it was 7
0000 0000 0000 0001 0101   1110110111000000   Shift left (4th)
0000 0000 0000 0001 1000   1110110111000000   Add 3 to 10⁰, since it was 5
0000 0000 0000 0011 0001   1101101110000000   Shift left (5th)
0000 0000 0000 0110 0011   1011011100000000   Shift left (6th)
0000 0000 0000 1001 0011   1011011100000000   Add 3 to 10¹, since it was 6
0000 0000 0001 0010 0111   0110111000000000   Shift left (7th)
0000 0000 0001 0010 1010   0110111000000000   Add 3 to 10⁰, since it was 7
0000 0000 0010 0101 0100   1101110000000000   Shift left (8th)
0000 0000 0010 1000 0100   1101110000000000   Add 3 to 10¹, since it was 5
0000 0000 0101 0000 1001   1011100000000000   Shift left (9th)
0000 0000 1000 0000 1001   1011100000000000   Add 3 to 10², since it was 5
0000 0000 1000 0000 1100   1011100000000000   Add 3 to 10⁰, since it was 9
0000 0001 0000 0001 1001   0111000000000000   Shift left (10th)
0000 0001 0000 0001 1100   0111000000000000   Add 3 to 10⁰, since it was 9
0000 0010 0000 0011 1000   1110000000000000   Shift left (11th)
0000 0010 0000 0011 1011   1110000000000000   Add 3 to 10⁰, since it was 8
0000 0100 0000 0111 0111   1100000000000000   Shift left (12th)
0000 0100 0000 1010 0111   1100000000000000   Add 3 to 10¹, since it was 7
0000 0100 0000 1010 1010   1100000000000000   Add 3 to 10⁰, since it was 7
0000 1000 0001 0101 0101   1000000000000000   Shift left (13th)
0000 1011 0001 0101 0101   1000000000000000   Add 3 to 10³, since it was 8
0000 1011 0001 1000 0101   1000000000000000   Add 3 to 10¹, since it was 5
0000 1011 0001 1000 1000   1000000000000000   Add 3 to 10⁰, since it was 5
0001 0110 0011 0001 0001   0000000000000000   Shift left (14th)
0001 1001 0011 0001 0001   0000000000000000   Add 3 to 10³, since it was 6
0011 0010 0110 0010 0010   0000000000000000   Shift left (15th)
0011 0010 1001 0010 0010   0000000000000000   Add 3 to 10², since it was 6
0110 0101 0010 0100 0100   0000000000000000   Shift left (16th)
   6    5    2    4    4
            BCD

Sixteen shifts have been performed, so the algorithm terminates. The BCD digits is: 6*10⁴ + 5*10³ + 2*10² + 4*10¹ + 4*10⁰ = 65244.

C implementation

The double dabble algorithm might look like this when implemented in C. Notice that this implementation is designed to convert an "input register" of any width, by taking an array as its parameter and returning a dynamically allocated string. Also notice that this implementation does not store an explicit copy of the input register in its scratch space, as the description of the algorithm did; copying the input register into the scratch space was just a pedagogical device.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
   This function takes an array of n unsigned integers,
   each holding a value in the range [0, 65535],
   representing a number in the range [0, 2**(16n)-1].
   arr[0] is the most significant "digit".
   This function returns a new array containing the given
   number as a string of decimal digits.

   For the sake of brevity, this example assumes that
   calloc and realloc will never fail.
*/
void double_dabble(int n, const unsigned int *arr, char **result)
{
    int nbits = 16*n;         /* length of arr in bits */
    int nscratch = nbits/3;   /* length of scratch in bytes */
    char *scratch = calloc(1 + nscratch, sizeof *scratch);
    int i, j, k;
    int smin = nscratch-2;    /* speed optimization */

    for (i=0; i < n; ++i) {
        for (j=0; j < 16; ++j) {
            /* This bit will be shifted in on the right. */
            int shifted_in = (arr[i] & (1 << (15-j)))? 1: 0;

            /* Add 3 everywhere that scratch[k] >= 5. */
            for (k=smin; k < nscratch; ++k)
              scratch[k] += (scratch[k] >= 5)? 3: 0;

            /* Shift scratch to the left by one position. */
            if (scratch[smin] >= 8)
              smin -= 1;
            for (k=smin; k < nscratch-1; ++k) {
                scratch[k] <<= 1;
                scratch[k] &= 0xF;
                scratch[k] |= (scratch[k+1] >= 8);
            }

            /* Shift in the new bit from arr. */
            scratch[nscratch-1] <<= 1;
            scratch[nscratch-1] &= 0xF;
            scratch[nscratch-1] |= shifted_in;
        }
    }

    /* Remove leading zeros from the scratch space. */
    for (k=0; k < nscratch-1; ++k)
      if (scratch[k] != 0) break;
    nscratch -= k;
    memmove(scratch, scratch+k, nscratch+1);

    /* Convert the scratch space from BCD digits to ASCII. */
    for (k=0; k < nscratch; ++k)
      scratch[k] += '0';

    /* Resize and return the resulting string. */
    *result = realloc(scratch, nscratch+1);
    return;
}

/*
   This test driver should print the following decimal values:
   246
   16170604
   1059756703745
*/
int main(void)
{
    unsigned int arr[] = { 246, 48748, 1 };
    char *text = NULL;
    int i;
    for (i=0; i < 3; ++i) {
        double_dabble(i+1, arr, &text);
        printf("%s\n", text);
        free(text);
    }
    return 0;
}

VHDL implementation

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.all;


entity bin2bcd_12bit is
    Port ( binIN : in  STD_LOGIC_VECTOR (11 downto 0);
           ones : out  STD_LOGIC_VECTOR (3 downto 0);
           tens : out  STD_LOGIC_VECTOR (3 downto 0);
           hundreds : out  STD_LOGIC_VECTOR (3 downto 0);
           thousands : out  STD_LOGIC_VECTOR (3 downto 0)
          );
end bin2bcd_12bit;

architecture Behavioral of bin2bcd_12bit is

begin

bcd1: process(binIN)

  -- temporary variable
  variable temp : STD_LOGIC_VECTOR (11 downto 0);
  
  -- variable to store the output BCD number
  -- organized as follows
  -- thousands = bcd(15 downto 12)
  -- hundreds = bcd(11 downto 8)
  -- tens = bcd(7 downto 4)
  -- units = bcd(3 downto 0)
  variable bcd : UNSIGNED (15 downto 0) := (others => '0');

  -- by
  -- https://en.wikipedia.org/wiki/Double_dabble
  
  begin
    -- zero the bcd variable
    bcd := (others => '0');
    
    -- read input into temp variable
    temp(11 downto 0) := binIN;
    
    -- cycle 12 times as we have 12 input bits
    -- this could be optimized, we dont need to check and add 3 for the 
    -- first 3 iterations as the number can never be >4
    for i in 0 to 11 loop
    
      if bcd(3 downto 0) > 4 then 
        bcd(3 downto 0) := bcd(3 downto 0) + 3;
      end if;
      
      if bcd(7 downto 4) > 4 then 
        bcd(7 downto 4) := bcd(7 downto 4) + 3;
      end if;
    
      if bcd(11 downto 8) > 4 then  
        bcd(11 downto 8) := bcd(11 downto 8) + 3;
      end if;
    
      -- thousands can't be >4 for a 12-bit input number
      -- so don't need to do anything to upper 4 bits of bcd
    
      -- shift bcd left by 1 bit, copy MSB of temp into LSB of bcd
      bcd := bcd(14 downto 0) & temp(11);
    
      -- shift temp left by 1 bit
      temp := temp(10 downto 0) & '0';
    
    end loop;
 
    -- set outputs
    ones <= STD_LOGIC_VECTOR(bcd(3 downto 0));
    tens <= STD_LOGIC_VECTOR(bcd(7 downto 4));
    hundreds <= STD_LOGIC_VECTOR(bcd(11 downto 8));
    thousands <= STD_LOGIC_VECTOR(bcd(15 downto 12));
  
  end process bcd1;            
  
end Behavioral;

VHDL testbench

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
 
 
ENTITY bin2bcd_12bit_test_file IS
END bin2bcd_12bit_test_file;
 
ARCHITECTURE behavior OF bin2bcd_12bit_test_file IS 
 
    -- Component Declaration for the Unit Under Test (UUT)
 
    COMPONENT bin2bcd_12bit
    PORT(
         binIN : IN  std_logic_vector(11 downto 0);
         ones : OUT  std_logic_vector(3 downto 0);
         tens : OUT  std_logic_vector(3 downto 0);
         hundreds : OUT  std_logic_vector(3 downto 0);
	 thousands : OUT  std_logic_vector(3 downto 0)
        );
    END COMPONENT;
    
  -- WARNING: Please, notice that there is no need for a clock signal in the testbench, since the design is strictly
  --    combinational (or concurrent, in contrast to the C implementation which is sequential).
  -- This clock is here just for simulation; you can omit all clock references and process, and use "wait for ... ns"
  --    statements instead.

   --Inputs
   signal binIN : std_logic_vector(11 downto 0) := (others => '0');
   signal clk : std_logic := '0';  -- can be omitted

 	--Outputs
   signal ones : std_logic_vector(3 downto 0);
   signal tenths : std_logic_vector(3 downto 0);
   signal hunderths : std_logic_vector(3 downto 0);
   signal thousands : std_logic_vector(3 downto 0);

   -- Clock period definitions
   constant clk_period : time := 10 ns;  -- can be omitted

   -- Miscellaneous
   signal full_number : std_logic_vector(15 downto 0);

BEGIN
 
	-- Instantiate the Unit Under Test (UUT)
   uut: bin2bcd_12bit PORT MAP (
          binIN => binIN,
          ones => ones,
          tens => tenths,
          hundreds => hunderths,
	  thousands => thousands
        );

   -- Clock process definitions  -- the whole process can be omitted
   clk_process :process
   begin
		clk <= '0';
		wait for clk_period/2;
		clk <= '1';
		wait for clk_period/2;
   end process;
 
   -- Combine signals for full number
   full_number <= thousands & hunderths & tenths & ones;

   -- Stimulus process
   stim_proc: process
   begin		
      -- hold reset state for 100 ns.
      wait for 100 ns;	

      wait for clk_period*10;

      -- insert stimulus here 
		-- should return 4095
		binIN <= X"FFF";
		wait for clk_period*10;  assert full_number = x"4095" severity error;  -- use "wait for ... ns;"

		-- should return 0
		binIN <= X"000";
		wait for clk_period*10;  assert full_number = x"0000" severity error;

		-- should return 2748
		binIN <= X"ABC";
		wait for clk_period*10;  assert full_number = x"2748" severity error;
		
		
      wait;
   end process;

END;

Historical

In the 1960s, the term double dabble was also used for a different mental algorithm, used by programmers to convert a binary number to decimal. It is performed by reading the binary number from left to right, doubling if the next bit is zero, and doubling and adding one if the next bit is one.^[4] In the example above, 11110011, the thought process would be: "one, three, seven, fifteen, thirty, sixty, one hundred twenty-one, two hundred forty-three," the same result as that obtained above.

References

↑ Gao, Shuli; Al-Khalili, D.; Chabini, N. (June 2012), "An improved BCD adder using 6-LUT FPGAs", IEEE 10th International New Circuits and Systems Conference (NEWCAS 2012), pp. 13–16, doi:10.1109/NEWCAS.2012.6328944 .
↑ Binary-to-BCD Converter: “Double-Dabble Binary-to-BCD Conversion Algorithm,” originally at http://edda.csie.dyu.edu.tw/course/fpga/Binary2BCD.pdf, as cited by Gao & Al-KhaliliChabini (2012). Archived from original, January 31, 2012.
↑ Véstias, M.P.; Neto, H.C. (March 2010), "Parallel decimal multipliers using binary multipliers", VI Southern Programmable Logic Conference (SPL 2010), pp. 73–78, doi:10.1109/SPL.2010.5483001 .
↑ Godse, D. A.; Godse, A. P. (2008), Digital Techniques, Technical Publications, p. 4, ISBN 9788184314014 .

This article is issued from Wikipedia - version of the 10/17/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.