Objective Real-Time Software on the ADSP21XX

Circular Buffer Objects on the ADSP 21XX

General

Circular buffers are the speciality of the Data Address Generator register file. These registers provide a way to create buffers that will loop around continuously, thereby providing some elasticity between producer/consumer processes. Circular buffers can be constructed using any processor architecture, but the ADSP family has special hardware to save some instructions and make circular buffers operate without overhead. In practice, one always needs some overhead to figure out when the buffer is full or empty so the speed gain is not all that significant, but in real-time systems, every instruction counts.

Producer/Consumer Problems

A real-time system frequently has to contend with continuous streams of data. These are the classic producer/consumer problems expounded in first year computer science, at which time few students actually comprehend what it is all about. These data streams are usually handled by a system as interrupt processes. With a very fast processor, it may be tempting to simply handle the data byte by byte, without worrying about buffering it up. In practice, this is usually not a good idea, since it may be easier to handle the data in blocks of some convenient size.

Think of a typical messaging protocol. A message may consist of ten or more bytes. While it would be possible to parse the message as it is received, the checksum is usually the very last byte and if the checksum proves to be wrong, the message should be disregarded. Therefore it would be wise to buffer it up until the whole message is proved to be correct. Signal processing is usually performed on a block (or vector) of data. Therefore it would be convenient to buffer the data until there is enough to perform a vector operation.
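As a rough illustration of why a message is buffered before it is acted upon, the sketch below checks a complete frame in one go. The ten byte frame, the simple 8 bit additive checksum in the last byte and the names MSG_LEN and msg_checksum_ok are all assumptions made for this example; a real protocol will define its own frame layout and checksum rule.

```c
#include <assert.h>
#include <stddef.h>

#define MSG_LEN 10   /* hypothetical frame: 9 payload bytes + 1 checksum byte */

/* Returns 1 when the last byte equals the 8 bit sum of the bytes before it. */
int msg_checksum_ok(const unsigned char *msg, size_t len)
{
   unsigned char sum = 0;
   size_t i;

   assert(msg != NULL && len >= 2);    /* at least one payload byte */

   for (i = 0; i < len - 1; i++)
      sum = (unsigned char)(sum + msg[i]);

   return sum == msg[len - 1];
}
```

Only once the whole frame has been received and this test passes would the message be handed on for parsing; otherwise the buffered bytes are simply discarded.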

The Case for Buffering

Buffers may be used for convenience, or to prevent the loss of data due to a bursty channel

The most compelling reason for buffering schemes though, is one of processing time. Some process in the system may take a long time to complete, or interrupts may occur in high speed bursts, followed by periods of inactivity. In these cases, buffering may be essential to avoid losing data.

General Purpose Circular Buffers

On a general purpose processor using a high level language, one can define a circular buffer object for use with an 8 bit serial port UART as follows:

Example 1. Circular Buffers in Flat C
#define SER_BUF_SIZE    0x0100
#define SER_BUF_MASK    0x00FF
#define SER_BUF_EMPTY   0xFFFF

unsigned char   ser_buffer[SER_BUF_SIZE];
unsigned int    ser_wr_index,
                ser_rd_index;


void ser_put(unsigned char data)
{
   ser_buffer[ser_wr_index] = data;
   ser_wr_index++;
   ser_wr_index &= SER_BUF_MASK;

   if(ser_rd_index == ser_wr_index)
   {
      disable();
      ser_rd_index++;
      ser_rd_index &= SER_BUF_MASK;
      enable();
   }
}


int ser_get(void)
{
   int data;

   if(ser_rd_index == ser_wr_index)
      return SER_BUF_EMPTY;

   data = (int)ser_buffer[ser_rd_index];
   ser_rd_index++;
   ser_rd_index &= SER_BUF_MASK;

   return data;
}

Note that these procedures are extremely small, to the point of being minimalist, with little error handling. Over the years I have coded quite a few circular buffers and finally came to the conclusion that these things are invariably used in cases where speed of processing is of the essence, overriding all other concerns.

The First Rule of Circular Buffers

Speed of processing is usually of the essence, overriding all other concerns

If you study these examples closely, you will notice a few small tricks. First of all, the buffer size is defined as a power of two, here 0x0100 = 256 bytes. This allows one to wrap the buffer index around with a simple bitwise AND operation. The buffer index is incremented after writing or reading and is always masked with the buffer size mask. Rather than attempting to determine whether the end of the buffer has been reached or not, it is more efficient to perform the simple AND operation every time regardless. To facilitate the use of bitwise operations, it is essential to declare the indexes as unsigned integers.
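The wrap-around trick can be tried in isolation. The following is a minimal sketch, with the names IDX_BUF_SIZE, IDX_BUF_MASK and index_next invented for the illustration; it assumes a C11 compiler for the compile time power-of-two check.

```c
#include <assert.h>

#define IDX_BUF_SIZE 0x0100u                 /* must be a power of two */
#define IDX_BUF_MASK (IDX_BUF_SIZE - 1u)     /* 0x00FF */

/* The mask only works when the size has exactly one bit set. */
static_assert((IDX_BUF_SIZE & IDX_BUF_MASK) == 0,
              "buffer size must be a power of two");

/* Advance an index by one and wrap it with a single AND, no compare needed. */
unsigned int index_next(unsigned int index)
{
   index++;
   index &= IDX_BUF_MASK;
   return index;
}
```

For a power-of-two size, the AND gives the same result as a modulo by the buffer size, but without a division or a conditional branch.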

The Second Rule of Circular Buffers

First read or write, then increment the index

The ser_put procedure has a simple error handler built in. When the write index overruns the read index, the read index is bumped up. This means that old data is lost. Some people may argue that this is a serious error condition. I beg to differ, since it is a very frequent occurrence when a system is started or stopped. Reporting these errors is just a waste of processing time and does not serve any useful purpose. In fact, if the buffer is overrunning because the system is too busy to handle the incoming data, trying to report the fact will just make matters worse...

The Third Rule of Circular Buffers

To wrap an index around, it is more efficient to perform the simple AND operation every time regardless of whether the end of the buffer has been reached

Note that it is important to somehow disable interrupts when the ser_put procedure touches the read index. This is to prevent the ser_get procedure from getting confused if it would be called by an interrupt process during this time. Also note that these routines are not re-entrant, since global variables are used.

The ser_get procedure tests the indexes before using them. If they are equal, the buffer is empty. In this case the value 0xFFFF is returned, which equals -1 on a compiler with 16 bit integers (with wider integers, compare against SER_BUF_EMPTY instead). Since the buffer is defined as a byte array, this value cannot be valid data. The calling procedure can therefore test for -1 to sense when the buffer is empty.

At startup, all variables should be initialized to zero by a memory initialization process. The two buffer indexes will therefore be equal, signifying an empty buffer to start with.
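To see the empty test and the overrun handling at work on a desktop machine, here is a small self-contained variant of the routines above. It is a sketch under assumptions: the buffer is shrunk to 8 bytes so that an overrun is easy to provoke, the demo_ names are invented, the empty value is a plain -1, and disable()/enable() are replaced by do-nothing stand-ins since there are no interrupts on the host. It assumes a C11 compiler for the compile time check.

```c
#include <assert.h>

#define DEMO_BUF_SIZE   8u                      /* power of two, small on purpose */
#define DEMO_BUF_MASK   (DEMO_BUF_SIZE - 1u)
#define DEMO_BUF_EMPTY  (-1)

static_assert((DEMO_BUF_SIZE & DEMO_BUF_MASK) == 0,
              "buffer size must be a power of two");

static unsigned char demo_buffer[DEMO_BUF_SIZE];
static unsigned int  demo_wr_index, demo_rd_index;   /* zero at startup */

/* Stand-ins for the platform's interrupt control; no-ops on a host. */
void disable(void) {}
void enable(void)  {}

void demo_put(unsigned char data)
{
   demo_buffer[demo_wr_index] = data;
   demo_wr_index = (demo_wr_index + 1u) & DEMO_BUF_MASK;

   if (demo_rd_index == demo_wr_index) {   /* overrun: drop the oldest byte */
      disable();
      demo_rd_index = (demo_rd_index + 1u) & DEMO_BUF_MASK;
      enable();
   }
}

int demo_get(void)
{
   int data;

   if (demo_rd_index == demo_wr_index)
      return DEMO_BUF_EMPTY;               /* nothing to read */

   data = demo_buffer[demo_rd_index];
   demo_rd_index = (demo_rd_index + 1u) & DEMO_BUF_MASK;
   return data;
}
```

Note that a buffer of N bytes holds at most N - 1 bytes with this scheme, since equal indexes are reserved to mean empty. Writing ten bytes into the 8 byte demo buffer therefore leaves only the newest seven to be read back.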

Circular Buffer Restrictions on the ADSP 21XX

The ADSP buffer loop mechanism relies on the use of the l (length) registers. The buffer length has to be defined as a power of 2, for instance 16, 32, 64 or 128 words. Furthermore, the buffer has to start on a paragraph boundary, that is, depending on the size of the buffer, the start address has to be a round hexadecimal number. This complexity is handled automatically by the linker, when a buffer is defined as circular.

The buffer length restrictions are due to the simple hardware used to wrap the pointers around, much the same way as in the C example above. The m (modify) registers can be used to step through the buffer in varying step sizes, although so far I haven't had a single application calling for anything but a step size of one.

The difference between the ADSP automatic circular buffers and the general purpose C example is that the ADSP directly manipulates the buffer pointers, which contain the actual memory addresses of the data, while in the C code, the indexes represent indirect addresses, which means that for every buffer access, the actual address has to be calculated from scratch. This is the reason behind the ADSP special architecture and represents a significant speed advantage.
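The pointer wrap can be imitated in C for experimentation. This is only a simplified host-side model, not a description of the actual address generator logic: it assumes a power-of-two length, a buffer that starts on a matching boundary and a positive step smaller than the length. The name dag_next and its parameters are invented here, with addr, step and len standing in for the i, m and l registers.

```c
#include <assert.h>

/*
 * Simplified model of a pointer that wraps inside an aligned,
 * power-of-two sized buffer.
 *   addr : current address (the role of an i register)
 *   step : increment       (the role of an m register)
 *   len  : buffer length   (the role of an l register)
 */
unsigned int dag_next(unsigned int addr, unsigned int step, unsigned int len)
{
   unsigned int mask = len - 1u;

   assert(len != 0 && (len & mask) == 0);   /* power of two only */
   assert(step < len);                      /* model covers small steps only */

   /* keep the base address bits, wrap only the offset bits */
   return (addr & ~mask) | ((addr + step) & mask);
}
```

The point of the model is that the address itself is carried from one access to the next, so no base plus index calculation is needed per access.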

Multi Dimensional Circular Buffers

Our good friend H.Acker used four circular buffers in the four channel data recorder. He did this by declaring four separate circular buffers and had four separate code threads to handle them. Size wise, this made for a very inefficient system, although execution wise, this is actually very efficient. Our concern here is however with programmer efficiency, which is much more expensive than processing cycles...

S.Ucker, tasked with modifying the code for a thirty two channel recorder, faced a major obstacle. He wanted to handle all channels with a single code thread, since it was clear to him that having to debug thirty two separate code threads would be an exorbitant waste of time, never mind the waste of code space.

The solution is to define all the buffer indexes in array variables, but how to dupe the linker into allocating the circular buffers correctly? This is actually a self correcting problem! If the circular buffers are declared as one huge circular buffer, the linker will start it on a paragraph boundary and since all the sub buffers are powers of two in size, every one of them will start on a paragraph boundary too. S.Ucker's problem then reduces to an initialization problem, where he has to create a routine to calculate the starting addresses of all the sub buffers, just to get things started on the right footing.
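The address arithmetic behind this can be sketched in C. The names SUB_CHAN_MAX, SUB_BUF_SIZE and sub_buffer_bases are invented for the illustration, and the base address passed in merely stands for whatever address the linker happens to pick for the big buffer.

```c
#include <assert.h>

#define SUB_CHAN_MAX  4u
#define SUB_BUF_SIZE  0x0020u      /* power of two */

/* Fill 'base' with the start address of every sub buffer. */
void sub_buffer_bases(unsigned int big_base, unsigned int base[SUB_CHAN_MAX])
{
   unsigned int chan;

   /* the big buffer must at least start on a sub buffer boundary */
   assert((big_base & (SUB_BUF_SIZE - 1u)) == 0);

   for (chan = 0; chan < SUB_CHAN_MAX; chan++)
      base[chan] = big_base + chan * SUB_BUF_SIZE;
}
```

Because every start address is the aligned base plus a multiple of the power-of-two sub buffer size, the low address bits of each start address are zero, which is exactly what the wrap-around hardware needs.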

Multi Dimensional Circular Buffers

If the circular buffers are declared as one huge circular buffer, the linker will start it on a paragraph boundary and since all the sub buffers are powers of two in size, every one of them will start on a paragraph boundary too

Circular Buffer Objects in Assembler

The following example shows the circular buffer object definitions for use with a four channel system:

Example 2. Circular Buffer Objects
/* Resolve public definitions */
#undef   PUBLIC
#undef   PROTOTYPE
#ifdef   IO
#define  PUBLIC      GLOBAL
#define  PROTOTYPE   ENTRY
#else
#define  PUBLIC      EXTERNAL
#define  PROTOTYPE   EXTERNAL
#endif

/***************************** Literals *****************************/
#define IO_BYTE_MASK    0x00FF
#define IO_EMPTY_MASK   0xFFFF

#define IO_CHAN_MAX     0x0004
#define IO_BUF_SIZE     0x0020
#define IO_BUF_MASK     0x001F
#define IO_BUF_MAX      (IO_BUF_SIZE * IO_CHAN_MAX)


/***************************** Variables ****************************/
#ifdef IO
.VAR/DM/RAM/CIRC/SEG=INT_DM
   io_data_buf[IO_BUF_MAX];

.VAR/DM/RAM/SEG=INT_DM
   io_data_rd[IO_CHAN_MAX],
   io_data_wr[IO_CHAN_MAX];
#endif

/***************************** Prototypes ***************************/
.PROTOTYPE
   io_init_buf,
   io_get_buf,
   io_put_buf;

Instead of defining four separate circular buffers, we declare one contiguous buffer io_data_buf and initialize the individual buffer pointers ourselves. This allows us to easily handle multi dimensional buffer objects.

Example 3. Buffer Object Initialization Method
/*********************************************************************
* Name:        io_init_buf
* Description: Initialize the circular buffer objects
* Constraints: none
* Tested OK
*********************************************************************/
io_init_buf:
   MAC_ENTER

io_init_buf_enter:
   /*
    * Initialize the circular data buffer pointers
    */
   ay0 = 0;
   ay1 = ^io_data_buf;        /* buffer base address */
   cntr = IO_CHAN_MAX;
   do io_init_buf_loop until ce;
      /* init the rd/wr pointers */
      MAC_WR_DM(ay1, ^io_data_rd, ay0)
      MAC_WR_DM(ay1, ^io_data_wr, ay0)

      ar = IO_BUF_SIZE;       /* next buffer address */
      ar = ar + ay1;
      ay1 = ar;

      ar = ay0 + 1;           /* next pointer */
      ay0 = ar;
io_init_buf_loop: nop;
    
   MAC_EXIT
   rts;

Circular Buffer Management

The Circular Buffer Objects are managed with a complementary pair of methods. One procedure is used to save data in the buffer and uses a write pointer to keep track of the next data write position in the buffer. The other procedure is used to extract data from the buffer and uses a read pointer to keep track of the next data to read. The buffer is empty when the two pointers are equal and is overflowing when the write pointer overtakes the read pointer from behind. Note that the buffer write procedure is minimalist, with no overflow checking. One can do this if one is sure that the processor is fast enough and will always keep up with the incoming data stream, thus making excessive error handling redundant.

Once we have the buffers initialized, they can be accessed in a loop, using the procedures in Example 4. The loop is used to successively select the next pair of read/write pointers from the read/write pointer arrays. S.Ucker can use these routines to handle all 32 buffers of his new data logger all in one go, simply by upping the literal definition IO_CHAN_MAX to 32!

Example 4. Buffer Object Methods
/*********************************************************************
* Name:        io_put_buf
* Description: write a byte to a circular buffer
* Constraints: ar = data
*              ay0 = port number
* NOTE: Protection against interrupts can be removed if the routine
*       is always called from an ISR and nesting is disabled.
*       For simplicity, there is no protection against overruns.
*       Overruns will only occur if there is a SW error, in which
*       case the system is bust and needs a reset.
*********************************************************************/
io_put_buf:
   MAC_ENTER

io_put_buf_enter:
   MAC_RD_DM(ax0, ^io_data_wr, ay0)    /* get the wr pointer */
   i3 = ax0;

   dis ints;
   m3 = 1;                             /* 1 byte at a time */
   l3 = IO_BUF_SIZE;                   /* circ buffer size */
   dm(i3, m3) = ar;                    /* put the data */
   l3 = 0;                             /* change back to linear mode */
   ena ints;
   
   ar = i3;
   dis ints;
   MAC_WR_DM(ar, ^io_data_wr, ay0)     /* save the new wr pointer */
   ena ints;

io_put_buf_exit:   
   MAC_EXIT
   rts;

/*********************************************************************
* Name:        io_get_buf
* Description: read a byte from a circular buffer
* Constraints: ar = port number
*              returns ar = data byte
*              returns ar = -1 = FFFFH when empty
*              buffer is empty when rd and wr pointers are equal
* NOTE: Protection against interrupts can be removed if the routine
*       is always called from an ISR and nesting is disabled.
*********************************************************************/
io_get_buf:
   MAC_ENTER

io_get_buf_enter:   
   ay0 = ar;                           /* save port number */
   MAC_RD_DM(ay1, ^io_data_rd, ay0)    /* get the rd pointer */
   MAC_RD_DM(ar, ^io_data_wr, ay0)     /* get the wr pointer */

   ar = ar - ay1;                      /* (wr - rd)  */
   if eq jump io_get_buf_empty;        /* empty when equal */

   dis ints;
   i3 = ay1;                           /* set up rd pointer */
   m3 = 1;                             /* 1 byte at a time */
   l3 = IO_BUF_SIZE;                   /* circ buffer size */
   ar = dm(i3, m3);                    /* get the data */
   af = pass ar;                       /* and save it in af */
   l3 = 0;                             /* change back to linear mode */
   ena ints;
   
   ar = i3;
   dis ints;
   MAC_WR_DM(ar, ^io_data_rd, ay0)     /* save the new rd pointer */
   ena ints;

   ax1 = IO_BYTE_MASK;                 /* return data in ar */
   ar = ax1 AND af;                    /* filter out garbage bits */
   jump io_get_buf_exit;

io_get_buf_empty:
   ar = IO_EMPTY_MASK;                 /* returns -1 when empty */

io_get_buf_exit:   
   MAC_EXIT
   rts;
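
For readers who want to experiment without the target hardware, the same buffer objects can be modelled in C. This is a host-side sketch and not a translation of the assembler: positions inside io_data_buf take the place of the memory addresses, the helper io_wrap is invented to imitate the pointer wrap, the channel number is passed as an ordinary argument and interrupt protection is left out. Like the assembler version, the write routine has no overrun protection.

```c
#include <assert.h>

#define IO_CHAN_MAX   4u
#define IO_BUF_SIZE   0x0020u
#define IO_BUF_MASK   (IO_BUF_SIZE - 1u)
#define IO_BUF_MAX    (IO_BUF_SIZE * IO_CHAN_MAX)
#define IO_EMPTY      (-1)

static unsigned char io_data_buf[IO_BUF_MAX];   /* one contiguous buffer */
static unsigned int  io_data_rd[IO_CHAN_MAX];   /* read position per channel */
static unsigned int  io_data_wr[IO_CHAN_MAX];   /* write position per channel */

/* Step a position by one, wrapping inside its own sub buffer. */
static unsigned int io_wrap(unsigned int pos)
{
   return (pos & ~IO_BUF_MASK) | ((pos + 1u) & IO_BUF_MASK);
}

/* Point every rd/wr pair at the start of its sub buffer. */
void io_init_buf(void)
{
   unsigned int chan;

   for (chan = 0; chan < IO_CHAN_MAX; chan++)
      io_data_rd[chan] = io_data_wr[chan] = chan * IO_BUF_SIZE;
}

/* Write a byte to the buffer of one channel. */
void io_put_buf(unsigned int chan, unsigned char data)
{
   assert(chan < IO_CHAN_MAX);
   io_data_buf[io_data_wr[chan]] = data;
   io_data_wr[chan] = io_wrap(io_data_wr[chan]);
}

/* Read a byte from the buffer of one channel, or IO_EMPTY if none. */
int io_get_buf(unsigned int chan)
{
   int data;

   assert(chan < IO_CHAN_MAX);
   if (io_data_rd[chan] == io_data_wr[chan])
      return IO_EMPTY;                  /* empty when rd and wr are equal */

   data = io_data_buf[io_data_rd[chan]];
   io_data_rd[chan] = io_wrap(io_data_rd[chan]);
   return data;
}
```

Changing IO_CHAN_MAX is all it takes to handle more channels with the same three routines, which is the whole point of keeping the pointers in arrays.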

Circular Buffer use in an ISR

The use of these buffer objects is best illustrated with an example interrupt service routine. The following ISR will test a hardware interrupt status register, read four CODEC ports and put the data into their respective circular buffers. The foreground task process can read the four buffers in a similar fashion and process all ports with a single code execution thread. The routine io_read_aud_port can be based upon the indirect I/O routines discussed in another chapter.

Example 5. Interrupt Service Routine
/*********************************************************************
* Name:        int_audin
* Description: Audio input interrupt routine
* Constraints: none
* Execution time = 5.7us (per received PCM byte, on a 2185 at 16MHz)
*********************************************************************/
int_audin:
   MAC_ISR_ENTER

   /*
    * Loop until no more interrupts pending
    * Interrupts are falling edge triggered
    * Since the 4 channels are independent, there may be more than
    * one interrupt pending.
    */
int_audin_loop:
   ar = INT_AUD_ENABLE;
   af = pass ar;
   ar = IO(INT_STAT_REG);              /* get interrupts pending */
   ar = ar AND af;
   if eq jump int_audin_exit;          /* no more interrupts */
   ax1 = ar;                           /* ax1 is pending ints */

   /*
    * Handle four ports A..D
    */
   ay0 = 0;                            /* port counter */
   ax0 = IO_AUD_REG_A;                 /* CODEC port base address */
   cntr = 4;
   do int_audin_read until ce;
      /*
       * make a port bit mask in ay1
       */
      se = ay0;                     
      ar = 1;
      sr = LSHIFT ar (lo);
      ay1 = sr0;                       /* port mask */
      
      /*
       * get data if any, from CODECs and save in buffer
       */
      ar = ax1 AND ay1;
      if eq jump int_audin_next;
      
      ar = ax0;
      call io_read_aud_port;           /* indirect IO port read */
      call io_put_buf;                 /* save the data */

      /*
       * increment the port counter
       * and port address
       */
int_audin_next:
      ar = ay0 + 1;                    /* inc port counter */
      ay0 = ar;

      ar = ax0 + 1;                    /* inc port address */
      ax0 = ar;

      
int_audin_read: nop;
   jump int_audin_loop;
   
int_audin_exit:   

   MAC_ISR_EXIT
   rti;
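
The inner loop of the ISR, which turns the port counter into a bit mask and services only the ports with a pending interrupt, can be modelled in C as follows. This is a sketch for experimentation only: the name audin_service, the function pointer types and the two stub routines are invented, the stubs stand in for io_read_aud_port and io_put_buf, and the outer loop that re-reads the interrupt status register is left out.

```c
#include <assert.h>

#define AUD_PORT_MAX 4u

typedef unsigned char (*port_read_fn)(unsigned int port);
typedef void (*buf_put_fn)(unsigned int port, unsigned char data);

/* Service every port whose bit is set in 'pending'; returns ports handled. */
unsigned int audin_service(unsigned int pending,
                           port_read_fn read_port, buf_put_fn put_buf)
{
   unsigned int port, handled = 0;

   assert(read_port != 0 && put_buf != 0);

   for (port = 0; port < AUD_PORT_MAX; port++) {
      unsigned int mask = 1u << port;         /* port bit mask */

      if (pending & mask) {                   /* data waiting on this port? */
         put_buf(port, read_port(port));      /* read it and save it */
         handled++;
      }
   }
   return handled;
}

/* Host-side stubs: a fake port read and a put that records the last byte. */
unsigned char stub_last_data[AUD_PORT_MAX];

unsigned char stub_read_port(unsigned int port)
{
   return (unsigned char)(0xA0u + port);
}

void stub_put_buf(unsigned int port, unsigned char data)
{
   stub_last_data[port] = data;
}
```

With a pending mask of 0x5, for example, only ports 0 and 2 are read and stored, while ports 1 and 3 are skipped.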



Copyright © 1996-2008, Aerospace Software Ltd., GPL.