
Basic Examples With Poorly Drawn Diagrams

Julian Kemmerer edited this page Oct 28, 2018 · 22 revisions

Floating Point Adder

Here is the PipelineC code that describes a pipelined floating point adder:

float main(float x, float y)
{
   return x + y;
}

diagram1

Binary Adder Tree

Here is the PipelineC code that describes a pipelined binary tree summation of 8 values by instantiating 7 floating point adders:

float main(float x0, float x1, float x2, float x3,
           float x4, float x5, float x6, float x7)
{
   // First layer of 4 sums in parallel
   float sum0;
   float sum1;
   float sum2;
   float sum3;
   sum0 = x0 + x1;
   sum1 = x2 + x3;
   sum2 = x4 + x5;
   sum3 = x6 + x7;
   
   // Next layer of two sums in parallel
   float sum4;
   float sum5;
   sum4 = sum0 + sum1;
   sum5 = sum2 + sum3;
   
   // Final layer of a single sum
   float sum6;
   sum6 = sum4 + sum5;
   
   return sum6;
}

diagram2
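
As a rough software cross-check of the tree above, here is a minimal sketch in plain C (not PipelineC) that sums a power-of-two number of values layer by layer and counts the additions it performs. The function name `tree_sum`, the `adds_used` output, and the 64-element limit are assumptions made for this illustration only; they are not part of PipelineC.

```c
#include <assert.h>
#include <stddef.h>

/* Plain C reference model of a pairwise (binary tree) summation.
 * Hypothetical helper for illustration; not a PipelineC built-in.
 * n must be a power of two and at most 64 (limit chosen for this sketch).
 * If adds_used is non-NULL it receives the number of additions performed. */
float tree_sum(const float *x, size_t n, size_t *adds_used)
{
    assert(n >= 1 && n <= 64 && (n & (n - 1)) == 0);

    float layer[64];
    for (size_t i = 0; i < n; i++) {
        layer[i] = x[i];
    }

    size_t adds = 0;
    /* Each pass halves the number of live values, like one layer of the tree */
    for (size_t width = n; width > 1; width /= 2) {
        for (size_t i = 0; i < width / 2; i++) {
            layer[i] = layer[2 * i] + layer[2 * i + 1];
            adds++;
        }
    }

    if (adds_used != NULL) {
        *adds_used = adds;
    }
    return layer[0];
}
```

For 8 inputs the loop runs three passes of 4, 2, and 1 additions, which matches the 7 adders in the PipelineC version. In general a tree over N values needs N-1 adders.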

Low latency, high resource usage matrix multiply

This code instantiates:

  • N^3 floating point multipliers
  • N^2 binary tree summations of N elements ('float_array_sumN'), each of which uses N-1 floating point adders
// Lowest latency, most resource usage
// Resource usage grows O(N^3)

#include "uintN_t.h"
#define N 2
#define float_array_sumN float_array_sum2
#define iter_t uint1_t

typedef struct an_array_t
{
    float a[N][N];
} an_array_t;


an_array_t main(float mat1[N][N], float mat2[N][N])
{
    an_array_t res;
    
    iter_t i;
    iter_t j;
    iter_t k;
    for (i = 0; i < N; i = i + 1) 
    { 
        for (j = 0; j < N; j = j + 1) 
        { 
            float res_k[N];
            for (k = 0; k < N; k = k + 1)
            {
                res_k[k] = mat1[i][k] * mat2[k][j]; 
            }
            res.a[i][j] = float_array_sumN(res_k);
        } 
    }
    
    return res;
}
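
To sanity-check the resource counts, here is a minimal sketch in plain C (not PipelineC) that mirrors the loop structure above and tallies every multiply and add it performs. In hardware each of these operations would be its own instance, since the loops are fully unrolled. The names `op_count_t` and `matmul_model` are hypothetical, and the inner pairwise loop is only a stand-in for `float_array_sumN`, whose actual implementation is not shown on this page.

```c
#include <assert.h>
#include <stddef.h>

#define N 2

/* Tally of arithmetic operations; hypothetical type for this sketch */
typedef struct op_count_t {
    size_t multiplies;
    size_t adds;
} op_count_t;

/* Plain C model of the matrix multiply: computes res = mat1 * mat2 and
 * returns how many multiplies and adds were performed along the way. */
op_count_t matmul_model(float mat1[N][N], float mat2[N][N], float res[N][N])
{
    /* The halving loop below assumes N is a power of two */
    assert(N >= 1 && (N & (N - 1)) == 0);

    op_count_t count = {0, 0};
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++) {
            /* N products for this output element */
            float res_k[N];
            for (size_t k = 0; k < N; k++) {
                res_k[k] = mat1[i][k] * mat2[k][j];
                count.multiplies++;
            }
            /* Pairwise tree sum of the N products (stand-in for float_array_sumN) */
            for (size_t width = N; width > 1; width /= 2) {
                for (size_t k = 0; k < width / 2; k++) {
                    res_k[k] = res_k[2 * k] + res_k[2 * k + 1];
                    count.adds++;
                }
            }
            res[i][j] = res_k[0];
        }
    }
    return count;
}
```

With N = 2 this performs 8 multiplies (N^3) and 4 adds (N^2 tree sums of N-1 adds each), so both counts grow as O(N^3).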