Re: Version of dpi_pipe_c_send() that can handle messages of unlimited size

From: John Stickley <john_stickley_at_.....>
Date: Thu Mar 30 2006 - 20:47:54 PST

Per,

I took a look at this model and took a slightly
different approach that I think gets around some
of the tricky parts you were dealing with.

I've attached the new version (see attached file).

Basically what I do is put in a small optimization
where if the bytes_per_element is a multiple
of sizeof(svBitVecVal) it means that all elements
are word aligned and you can do it slightly more
efficiently.

But if that is not the case, then the code deals
with the data transfer at byte granularity using
the svPut/GetPartselBit() helper functions.
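A minimal sketch of the two cases described above, written without svdpi.h so it can be compiled on its own. BitVecWord is a stand-in for svBitVecVal (assumed to be a 32-bit word), and get_byte(), put_byte() and copy_bytes_realigned() are hypothetical helpers that play the role of 8-bit svGetPartselBit()/svPutPartselBit() calls. They are not part of the attached file.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for svBitVecVal from svdpi.h (assumed to be a 32-bit word).
typedef uint32_t BitVecWord;

// True when every element starts on a word boundary, so a segment can be
// sent straight out of the caller's array by advancing the word pointer.
inline bool elements_word_aligned(int bytes_per_element) {
    return bytes_per_element % static_cast<int>(sizeof(BitVecWord)) == 0;
}

// Hypothetical helper: read byte `index` of a packed word array, where
// byte 0 is bits 7:0 of word 0 (same as a part-select at bit index*8).
inline uint8_t get_byte(const BitVecWord *src, int index) {
    return static_cast<uint8_t>(src[index / 4] >> (8 * (index % 4)));
}

// Hypothetical helper: overwrite byte `index` of a packed word array.
inline void put_byte(BitVecWord *dst, int index, uint8_t value) {
    int shift = 8 * (index % 4);
    dst[index / 4] = (dst[index / 4] & ~(0xFFu << shift))
                   | (static_cast<uint32_t>(value) << shift);
}

// Byte-granular fallback: copy num_bytes starting at byte src_byte of src
// to the start of dst, so the segment in dst begins on a word boundary.
inline void copy_bytes_realigned(BitVecWord *dst, const BitVecWord *src,
                                 int src_byte, int num_bytes) {
    assert(src_byte >= 0 && num_bytes >= 0);
    for (int i = 0; i < num_bytes; ++i)
        put_byte(dst, i, get_byte(src, src_byte + i));
}
```

With 3-byte elements the second element starts at byte 3, in the middle of word 0, which is why the fallback has to copy rather than just move a pointer.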

Furthermore, I modified dpi_pipe_c_send() directly
to handle unlimited buffer sizes. But it calls
an internal dpi_pipe_c_send_to_fit() which is
a verbatim copy of the original dpi_pipe_c_send()
that I had.
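The split into buffer-sized pieces can be sketched as below. Segment and plan_segments() are hypothetical names used only for illustration. The sketch assumes, as the comments in Per's quoted code suggest, that the eom flag should accompany only the piece that ends the message.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one chunk handed to the "to fit" sender.
struct Segment {
    int first_element;  // index of the first element in the chunk
    int num_elements;   // chunk length, never more than pipe_depth
    bool eom;           // true only on the chunk that ends the message
};

// Split a message of num_elements into chunks of at most pipe_depth
// elements each. The caller's eom flag is kept for the final chunk only.
std::vector<Segment> plan_segments(int num_elements, int pipe_depth, bool eom) {
    assert(pipe_depth > 0 && num_elements >= 0);
    std::vector<Segment> plan;
    int first = 0;
    while (first < num_elements) {
        int remaining = num_elements - first;
        int count = remaining < pipe_depth ? remaining : pipe_depth;
        bool last = (first + count == num_elements);
        Segment s = { first, count, eom && last };
        plan.push_back(s);
        first += count;
    }
    return plan;
}
```

A 10-element message through a pipe of depth 4 gives chunks of 4, 4 and 2 elements. A message that is an exact multiple of the depth still gets eom on its last full chunk.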

Symmetric changes can be made for dpi_pipe_c_receive().

I also agree with your comments about not needing
to support unlimited buffer size on the HDL side
due to practical considerations.

I verified that this compiles but have not yet had
a chance to test it. Testing should be fairly easy.

I also was thinking about another approach of putting
byte level support inside the non-blocking call
itself. The idea is that the non-blocking call would
be given a byte index into the svBitVecVal array that
it gets, and return the number of bytes it successfully
did transfer. This returned value could then be
used by the caller to bump the byte index for the
next call in the loop, without changing the base
svBitVecVal[] data pointer.

This would allow for a more efficient implementation
that would potentially reduce the amount of data copying
required.

The new prototype for the call would look something
like this:

int dpi_pipe_c_try_send(
     void *pipe_handle,       // input: pipe handle
     int byte_offset,         // input: byte offset within data array
     int bytes_per_element,   // input: #bytes/element
     int num_elements,        // input: #elements to be written
     const svBitVecVal *data, // input: data
     svBit eom );             // input: end-of-message marker flag

Instead of returning success (1) or failure (0), it would
return the number of elements transferred (which on a full
pipe would still be 0, as before).
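One way to picture the proposed return value is with a small in-memory stand-in for the pipe. MockPipe and mock_try_send() are hypothetical and exist only for this sketch. BitVecWord stands in for svBitVecVal, and recording eom only when the whole remainder is accepted is an assumption, since the proposal above does not spell that out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef uint32_t BitVecWord;  // stand-in for svBitVecVal

// Hypothetical in-memory pipe: a byte FIFO with a fixed capacity.
struct MockPipe {
    std::size_t capacity_bytes;
    std::vector<uint8_t> bytes;
    bool eom_seen;
    explicit MockPipe(std::size_t cap) : capacity_bytes(cap), eom_seen(false) {}
};

// Sketch of the proposed semantics: accept as many whole elements as fit,
// reading them from `data` starting at byte_offset, and return how many
// elements were accepted (0 when the pipe is full).
int mock_try_send(MockPipe &pipe, int byte_offset, int bytes_per_element,
                  int num_elements, const BitVecWord *data, bool eom) {
    assert(bytes_per_element > 0 && num_elements >= 0 && byte_offset >= 0);
    std::size_t free_bytes = pipe.capacity_bytes - pipe.bytes.size();
    int room = static_cast<int>(free_bytes / bytes_per_element);
    int accepted = std::min(room, num_elements);
    for (int i = 0; i < accepted * bytes_per_element; ++i) {
        int b = byte_offset + i;  // byte position in the caller's array
        pipe.bytes.push_back(
            static_cast<uint8_t>(data[b / 4] >> (8 * (b % 4))));
    }
    // Assumption: eom is recorded only once the whole remainder went in.
    if (eom && accepted == num_elements)
        pipe.eom_seen = true;
    return accepted;
}
```

The caller never changes the data pointer. It only moves byte_offset forward by the returned count times bytes_per_element.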

The revised dpi_pipe_c_send() would then look something
like this:

void dpi_pipe_c_send(
     void *pipe_handle,         // input: pipe handle
     int bytes_per_element,     // input: #bytes/element
     int num_elements,          // input: #elements to be written
     const svBitVecVal *data,   // input: data
     svBit eom )                // input: end-of-message marker flag
{
     int byte_offset = 0, elements_sent;

     while( num_elements ){
         // dpi_pipe_c_try_send() already returns a count of elements
         elements_sent =
             dpi_pipe_c_try_send(
                 pipe_handle, byte_offset,
                 bytes_per_element, num_elements, data, eom );
         // if( pipe is full ) wait until OK to send more
         if( elements_sent == 0 ){
             sc_event *ok_to_send = (sc_event *)
                 dpi_pipe_get_notify_context( pipe_handle );

             // if( notify ok_to_send context has not yet been set up ) ...
             if( ok_to_send == NULL ){
                 ok_to_send = new sc_event;
                 dpi_pipe_set_notify_callback(
                     pipe_handle, notify_ok_to_send_or_receive, ok_to_send );
             }
             wait( *ok_to_send );
         }
         else {
             byte_offset += elements_sent * bytes_per_element;
             num_elements -= elements_sent;
         }
     }
}

Almost as concise as the original, but now with unlimited size!

This is a little cleaner because now at the level of dpi_pipe_c_send(),
you don't even care about querying buffer size. It's all handled
within the NB API.
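The control flow of that loop can be exercised without SystemC by passing the non-blocking call and the wait step in as callables. send_all() and its template parameters are hypothetical names for this sketch. wait_for_room stands in for wait( *ok_to_send ), and BitVecWord for svBitVecVal.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t BitVecWord;  // stand-in for svBitVecVal

// TrySend:     (byte_offset, bytes_per_element, num_elements, data, eom)
//              -> number of elements accepted, 0 when the pipe is full.
// WaitForRoom: () -> void, stands in for waiting on the ok_to_send event.
// Returns how many times the loop had to wait.
template <typename TrySend, typename WaitForRoom>
int send_all(TrySend try_send, WaitForRoom wait_for_room,
             int bytes_per_element, int num_elements,
             const BitVecWord *data, bool eom) {
    assert(bytes_per_element > 0 && num_elements >= 0);
    int byte_offset = 0;
    int waits = 0;
    while (num_elements > 0) {
        int elements_sent = try_send(byte_offset, bytes_per_element,
                                     num_elements, data, eom);
        if (elements_sent == 0) {
            // Pipe is full: block until the consumer has made room.
            wait_for_room();
            ++waits;
        } else {
            // Partial progress: bump the offset, keep the base pointer.
            byte_offset += elements_sent * bytes_per_element;
            num_elements -= elements_sent;
        }
    }
    return waits;
}
```

Nothing in the loop asks for the buffer size. The only thing it learns from the pipe is how many elements the last call accepted.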

I'll work this a little more and send out an update.

-- johnS

Per Bojsen wrote:
> Hi,
> 
> I have enclosed an example implementation of a version of
> dpi_pipe_c_send() that can handle messages of unlimited size.
> By unlimited I refer to messages that are larger than the
> current pipe buffer size.  For now I called the function
> dpi_pipe_c_send_nl().  `nl' is for `no limit'.  It has the
> exact same prototype as dpi_pipe_c_send() and can thus be
> used as a direct replacement.  Internally, I chose to use
> John's dpi_pipe_c_send() to transfer the individual segments
> of the message.  The code also shows some possible optimizations
> when one can exploit that the segment size is an integer
> number of svBitVecVal words.  This happens when number of
> elements in the segment multiplied by number of bytes per
> element is divisible by 4 (4 bytes per svBitVecVal word).
> It turns out that if the pipe depth is 4 elements or more
> one can always find such a nice segment size.  When the segment
> size has this property, copying and extracting bit fields
> out of the original svBitVecVal data buffer can be avoided.
> 
> The code for dpi_pipe_c_receive_nl() would follow a similar
> pattern.  I can create this if there is any interest.
> 
> Note, that my code demonstrates that the functionality I
> was calling for can be easily created at the application
> layer.  Even though this is the case, I believe the interface
> should support this at least for the blocking API since it is
> fairly straightforward and it makes the interface more
> intuitive.  This is also better aligned to the goal of
> handling variable length messages.
> 
> I can live with the non-blocking API having messages limited
> to the buffer size, although it would be preferable to have
> it support segmentation/reassembly of messages as well.
> 
> I believe unlimited message size should only be supported
> on the software side.  The reason is that on the hardware
> side messages or message fragments are read into bit
> vectors passed to the pipe functions.  These vectors are
> fixed hardware resources that must be allocated.  Typically
> they get implemented as flip flops and are thus expensive.
> Typically much more expensive than the pipe buffers for the
> same number of bits.  Therefore it is reasonable to assume
> that most practical uses of the pipes API will use fairly
> narrow data vectors (compared to average message size)
> and it is not unreasonable to require pipe buffers to be
> larger than the largest data vector used to access that
> particular pipe.  In other words, it is not unreasonable
> to require that pipe buffers are always large enough to
> accommodate any calls from the hardware side and
> segmentation/reassembly is thus not required on the hardware
> side.
> 
> Brian can you add the proposal to change the current blocking
> API to allow unlimited message sizes as an IM?  This may be
> related to Shabtay's issue e.
> 
> Per
> 
> 
> 
> ------------------------------------------------------------------------
> 
> /*
> ** DPI blocking pipes for messages of arbitrary size.
> **
> ** Copyright (C) 2006 Zaiq Technologies, Inc.  All Rights Reserved.
> **
> ** Author: Per Bojsen <per.bojsen@comcast.net>
> **
> ** Filename: dpi_pipes_nobuflimit_sysc.cxx
> **
> ** Created: Tue Mar 21 21:19:19 EST 2006
> **
> ** Note: This code is an example of DPI blocking pipes implementation
> **   that allows messages to be any size, i.e., not limited by the
> **   buffer size in the underlying non-blocking calls.
> **
> ** $Id$
> **
> */
> 
> #include "svdpi.h"
> #include "dpi_pipes.h"
> 
> // NOTE: This is a newly proposed function.  It is not yet included in
> // dpi_pipes.h.
> extern "C" int dpi_pipe_c_get_depth(void *pipe_handle);
> 
> //
> // dpi_pipe_c_send_nl() is based on John Stickley's version.  The comment
> // below is John's.
> //
> // This is the no limit blocking send function for the C endpoint of a
> // transaction input pipe.  It handles messages of any size regardless
> // of the pipe buffer depth setting.  Messages larger than the pipe
> // buffer can handle will be segmented into chunks that are less than
> // or equal to the pipe buffer size.  Each of these segments will be
> // sent with the dpi_pipe_c_send() function.
> //
> // This code shows how this function can be implemented at the application
> // layer assuming we adopt the dpi_pipe_c_get_depth() function.  Even though
> // this is possible this does not necessarily imply that it is not desirable
> // to include this functionality in the interface.
> //
> void
> dpi_pipe_c_send_nl(void              *pipe_handle,       // in: pipe handle
> 		   int                bytes_per_element, // in: #bytes/element
> 		   int                num_elements,      // in: #elems to write
> 		   const svBitVecVal *data,              // in: data
> 		   svBit              eom)               // in: end-of-message
> {
>   unsigned pipeDepth         = dpi_pipe_c_get_depth(pipe_handle);
>   unsigned elementsRemaining = num_elements;
> 
>   if (elementsRemaining <= pipeDepth)
>   {
>     // If transfer size is less than pipe depth
>     // dpi_pipe_c_send() can handle it directly.
>     dpi_pipe_c_send(pipe_handle, bytes_per_element, num_elements, data, eom);
>   }
>   else if (pipeDepth >= 4)
>   {
>     // If pipe depth is greater than or equal to 4 elements we can
>     // always come up with a transfer size that is an integer multiple
>     // of 4 elements.  Integer multiples of 4 elements have the property
>     // that they will always consist of an integer number of svBitVecVal
>     // words, i.e., one can trivially get a pointer to the next segment of
>     // data to be transferred.
>     const svBitVecVal *dataP = data;
>     unsigned           pipeDepthAdj = pipeDepth & ~0x3; // Round down to nearest
>     						        // multiple of 4.
>     unsigned           bitVecValWordsPerSegment = (pipeDepthAdj / 4) *
> 						  bytes_per_element;
> 
>     // Transfer segments of the message until it is complete.  Each
>     // segment is less than or equal to the pipe depth in size.  Hence
>     // dpi_pipe_c_send() can handle each segment directly.
>     while (elementsRemaining > 0)
>     {
>       int elemsToTransfer = elementsRemaining < pipeDepthAdj ?
> 			    elementsRemaining : pipeDepthAdj;
>       int lastSegment     = elementsRemaining <= pipeDepthAdj ? 1 : 0;
> 
>       // Transfer segment.  Take care to ensure the eom flag is only
>       // valid on the last transfer as it is tied to the last element
>       // of the message.
>       dpi_pipe_c_send(pipe_handle, bytes_per_element, elemsToTransfer, dataP,
> 		      lastSegment ? eom : 0);
> 
>       // Advance data pointer to the start of the next segment.
>       dataP += bitVecValWordsPerSegment;
>       elementsRemaining -= elemsToTransfer;
>     }
>   }
>   else
>   {
>     // This case is nasty.  Since the pipe depth is less than 4
>     // elements we cannot guarantee that we can find a segment size
>     // that consists of an integer multiple of svBitVecVal words.  We
>     // transfer segments of maximum size, i.e., pipe depth.  The data
>     // is copied to a locally allocated buffer and realigned using the
>     // svGetPartSelBit() function.
> 
>     // Bytes per segment.
>     unsigned bytesPerSegment = pipeDepth * bytes_per_element;
> 
>     // Number of svBitVecVal words per segment.  Note that we round up
>     // here as the last svBitVecVal word may be fractional.
>     unsigned bitVecValWordsPerSegment =
>       (bytesPerSegment + sizeof(svBitVecVal)) / sizeof(svBitVecVal);
>     unsigned bytesInLastWord = bytesPerSegment % sizeof(svBitVecVal);
> 
>     // Allocate buffer for segment data.  Note in a real
>     // implementation this should be done somewhere else as part of
>     // initialization rather than on the fly.  One good place to do
>     // this is in the data structures stored off of the pipe handle.
>     svBitVecVal *segmentData = new svBitVecVal[bitVecValWordsPerSegment];
>     svBitVecVal *segmentDataP = segmentData;
> 
>     // Current bit offset in data buffer.  Used by svGetPartSelBit()
>     // function.
>     unsigned bitOffset = 0;
> 
>     // Transfer segments of the message until it is complete.  Each
>     // segment is less than or equal to the pipe depth in size.  Hence
>     // dpi_pipe_c_send() can handle each segment directly.
>     while (elementsRemaining > 0)
>     {
>       int elemsToTransfer = elementsRemaining < pipeDepth ?
> 			    elementsRemaining : pipeDepth;
>       int lastSegment     = elementsRemaining <= pipeDepth ? 1 : 0;
>       unsigned i;
> 
>       // Refill the local buffer from its start for every segment.
>       segmentDataP = segmentData;
> 
>       // Recalculate bitVecValWordsPerSegment and bytesInLastWord for
>       // the last segment.  Since this is the last segment and the
>       // loop will exit after this, we do not need to retain the
>       // original values of these variables.
>       if (elemsToTransfer < (int) pipeDepth)
>       {
> 	bytesPerSegment = elemsToTransfer * bytes_per_element;
> 	bitVecValWordsPerSegment =
> 	  (bytesPerSegment + sizeof(svBitVecVal)) / sizeof(svBitVecVal);
> 	bytesInLastWord = bytesPerSegment % sizeof(svBitVecVal);
>       }
> 
>       // Copy all words of segment data except the last one which may
>       // be partial.
>       for (i = 0;
> 	   i < bitVecValWordsPerSegment - 1;
> 	   i++, bitOffset += 8 * sizeof(svBitVecVal))
>       {
> 	svGetPartselBit(segmentDataP++, data, bitOffset,
> 			8 * sizeof (svBitVecVal));
>       }
> 
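>       // Handle last word of segment data here, then advance past it so
>       // the next segment starts at the right bit offset.
>       svGetPartselBit(segmentDataP, data, bitOffset, 8 * bytesInLastWord);
>       bitOffset += 8 * bytesInLastWord;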
> 
>       // Transfer segment.  Take care to ensure the eom flag is only
>       // valid on the last transfer as it is tied to the last element
>       // of the message.
>       dpi_pipe_c_send(pipe_handle, bytes_per_element, elemsToTransfer,
> 		      segmentData, lastSegment ? eom : 0);
> 
>       // Keep track of elements transferred.
>       elementsRemaining -= elemsToTransfer;
>     }
> 
>     delete [] segmentData;
>   }
> } /* dpi_pipe_c_send_nl */
> 
> /*
> ** Local Variables:
> ** tab-width: 8
> ** End:
> */
> 
> /*
> ** End of file dpi_pipes_nobuflimit_sysc.cxx.
> */

-- 
John Stickley
Mgr., Acceleration Methodologies
Mentor Graphics - MED
17 E. Cedar Place
Ramsey, NJ  07446
john_stickley@mentor.com
Phone: (201) 818-2585

//===========================================================================
// @(#) $Id: dpi_pipes_sysc.cxx,v 1.1 2006/02/21 04:20:06 jstickle Exp $
//===========================================================================

//---------------------------------------------------------------------------
//   Mentor Graphics, Corp.
//
//   (C) Copyright, Mentor Graphics, Corp. 2003-2006
//   All Rights Reserved
//   Licensed Materials - Property of Mentor Graphics, Corp.
//
//   No part of this file may be reproduced, stored in a retrieval system,
//   or transmitted in any form or by any means --- electronic, mechanical,
//   photocopying, recording, or otherwise --- without prior written permission
//   of Mentor Graphics, Corp.
//
//   WARRANTY:
//   Use all material in this file at your own risk.  Mentor Graphics, Corp.
//   makes no claims about any material contained in this file.
//---------------------------------------------------------------------------

#include <vector>
#include <systemc.h>
#include "dpi_pipes.h"
#include "svdpi.h"

//---------------------------------------------------------------------------
// notify_ok_to_send_or_receive()                               johnS 2-12-06
//
// This is a callback function that notifies the application that there
// is room for at least one data element in an input pipe or at least one
// data element available in an output pipe.
//
// This callback function assumes it has been given a context object
// that is an sc_event that can be directly posted to.
//---------------------------------------------------------------------------

static void notify_ok_to_send_or_receive(
    void *context ){           // input: notify context
    sc_event *me = (sc_event *)context;
    me->notify();
}

//---------------------------------------------------------------------------
// dpi_pipe_c_send_to_fit()                                     johnS 1-26-06
//
// This is the basic blocking send function for the C endpoint of a
// transaction input pipe. It first calls the non-blocking send function
// (dpi_pipe_c_try_send()). If it succeeds in sending on the first try,
// it returns happily.
//
// Otherwise it continues to wait on an sc_event until there is room for at
// least the required number of data elements in the pipe, at which point the
// call to dpi_pipe_c_try_send() will succeed.
//---------------------------------------------------------------------------

void dpi_pipe_c_send_to_fit(
    void *pipe_handle,       // input: pipe handle
    int bytes_per_element,   // input: #bytes/element
    int num_elements,        // input: #elements to be written
    const svBitVecVal *data, // input: data
    svBit eom )              // input: end-of-message marker flag
{
    if( !dpi_pipe_c_try_send( pipe_handle,
            bytes_per_element, num_elements, data, eom ) ) {

        sc_event *ok_to_send = (sc_event *)dpi_pipe_get_notify_context(
            pipe_handle );

        // if( notify ok_to_send context has not yet been set up ) ...
        if( ok_to_send == NULL ){
            ok_to_send = new sc_event;
            dpi_pipe_set_notify_callback(
                pipe_handle, notify_ok_to_send_or_receive, ok_to_send );
        }

        while( !dpi_pipe_c_try_send( pipe_handle,
                bytes_per_element, num_elements, data, eom ) )
            wait( *ok_to_send );
    }
}

void dpi_pipe_c_send(
    void *pipe_handle,       // input: pipe handle
    int bytes_per_element,   // input: #bytes/element
    int num_elements,        // input: #elements to be written
    const svBitVecVal *data, // input: data
    svBit eom )              // input: end-of-message marker flag
{
    // dpi_pipe_c_get_depth() will always return a buffer size of at
    // least one element of size bytes_per_element.
    int pipe_depth = dpi_pipe_c_get_depth( pipe_handle, bytes_per_element );

    // Operate more efficiently on whole words if bytes_per_element is a
    // multiple of 4 bytes.
    if( (bytes_per_element&3) == 0 ){
        int elements_remaining = num_elements;
        while( elements_remaining ){
            num_elements = elements_remaining > pipe_depth ? pipe_depth
                : elements_remaining;
            elements_remaining -= num_elements;

            // Pass eom only with the final segment of the message.
            dpi_pipe_c_send_to_fit( pipe_handle,
                bytes_per_element, num_elements, data,
                elements_remaining ? 0 : eom );

            // Advance by the number of whole words just sent.
            data += num_elements * (bytes_per_element/4);
        }
    }

    // Otherwise use less efficient copying at byte granularity using
    // SV DPI helper functions.
    else {
        int bytes_per_buffer = pipe_depth * bytes_per_element;
        int bytes_remaining = num_elements * bytes_per_element;
        int i, num_bytes;
        int byte_index = 0;

        // Create a holder for a buffer's worth of elements rounded up
        // to the next word. This guarantees proper alignment of data
        // for each call to dpi_pipe_c_send_to_fit().
        svBitVecVal *holder = new svBitVecVal[ (bytes_per_buffer-1)/4+1 ];
        while( bytes_remaining ){

            num_bytes = bytes_remaining > bytes_per_buffer ? bytes_per_buffer
                : bytes_remaining;

            // Copy source byte (byte_index+i) into holder byte i.
            for( i=0; i<num_bytes; i++ ){
                svBitVecVal byte;
                svGetPartselBit( &byte, data, (byte_index+i)*8, 8 );
                svPutPartselBit( holder, byte, i*8, 8 );
            }
            bytes_remaining -= num_bytes;

            // Pass eom only with the final segment of the message.
            dpi_pipe_c_send_to_fit( pipe_handle,
                bytes_per_element, num_bytes/bytes_per_element, holder,
                bytes_remaining ? 0 : eom );
            byte_index += num_bytes;
        }
        delete [] holder;
    }
}

//---------------------------------------------------------------------------
// dpi_pipe_c_receive()                                         johnS 1-26-06
//
// This is the basic blocking receive function for the C endpoint of a
// transaction output pipe. It first calls the non-blocking receive function
// (dpi_pipe_c_try_receive()). If it succeeds in receiving on the first try,
// it returns happily.
//
// Otherwise it continues to wait on an sc_event until there is at least the
// required number of data elements in the pipe at which point the call
// to dpi_pipe_c_try_receive() will succeed.
//---------------------------------------------------------------------------

void dpi_pipe_c_receive(
    void *pipe_handle,      // input: pipe handle
    int bytes_per_element,  // input: #bytes/element
    int num_elements,       // input: #elements to be read
    int *num_elements_read, // output: #elements actually read
    svBitVecVal *data,      // output: data
    svBit *eom )            // output: end-of-message marker flag
{
    if( !dpi_pipe_c_try_receive( pipe_handle,
            bytes_per_element, num_elements, num_elements_read, data, eom ) ) {

        sc_event *ok_to_receive = (sc_event *)dpi_pipe_get_notify_context(
            pipe_handle );

        // if( notify ok_to_receive context has not yet been set up ) ...
        if( ok_to_receive == NULL ){
            ok_to_receive = new sc_event;
            dpi_pipe_set_notify_callback(
                pipe_handle, notify_ok_to_send_or_receive, ok_to_receive );
        }

        while( !dpi_pipe_c_try_receive( pipe_handle,
                bytes_per_element, num_elements, num_elements_read,
                data, eom ) )
            wait( *ok_to_receive );
    }
}

//---------------------------------------------------------------------------
// dpi_pipe_c_flush()                                           johnS 1-26-06
//---------------------------------------------------------------------------

void dpi_pipe_c_flush(
    void *pipe_handle )      // input: pipe handle
{
    sc_event *ok_to_send = (sc_event *)dpi_pipe_get_notify_context(
        pipe_handle );

    if( ok_to_send == NULL )
        return;

    while( !dpi_pipe_c_try_flush(pipe_handle) )
        wait( *ok_to_send );
}