Shabtay’s take 04/28 - This is mostly addressed with automatic flush-on-eom mode and is quite high level. Committee agrees.
Streaming/Reactivity
a) Is it a goal to create a standard that allows the IP provider to create transactors that do not need to know whether the system or the particular channel serving the transactor is streaming/reactive?
b) How will the user control reactivity/streaming?
** This is a consolidated proposal from JohnS dealing with IMs 208-211
As per my action item (AI) this week, I would like to make the following proposal to try to solidify the semantics of pipes. As I said in last week's meeting, this is largely a consolidation of some of the clarifications in Per's e-mails to questions raised by Shabtay concerning pipes.
Let me try to lay out all the issues here, and if we can all agree to these, perhaps we can close out IMs 208-211 and go a long way toward coming to agreement on the transaction pipe proposal.
Requirements for transaction pipes:
1. [This addresses IM 210] Determinism is a "must" requirement: consumption of data from a receive pipe on the H/W side, or production of data to a send pipe, will always occur on the same clock cycles from one simulation to another.
2. [This partially addresses IM 208 - user vs. vendor control of optimization] It is possible to implement pipes as a reference model of source code built over basic DPI function calls. As such they can be made to run on any DPI-compliant software simulator. Such a reference model would provide a reactive implementation of pipes which could be used as the basis for more optimized "built-in" implementations that might deploy batching, streaming, and concurrency optimizations (a minimal sketch of this layering appears after this list).
It is an absolute requirement, however, that such optimizations do not change the functional and deterministic behavior of a design that runs on the basic reactive reference model implementation of pipes described above. In other words, code using a pipe interface must behave identically whether running over the reactive "reference" implementation or over an optimized custom implementation. Within this constraint, vendors are free to perform any optimizations of pipes that are appropriate to their platform.
3. [This addresses IMs 208, 209] Buffer depth is implementation specified. This allows vendors to choose the buffer depth that is optimal for their platform. The flush mechanism is what gives the user the chance to specify a "synchronization point" to the infrastructure, indicating that an HVL thread is switching from a "streaming mode" where it is doing pipe operations to a "reactive mode" where it is doing conventional reactive DPI function call interactions. As we said in the meeting last week, this is also the point at which queries of the H/W simulator time will make sense.
4. [This addresses IM 211] Operation of pipes is identical whether successive access ops (sends or receives) are done in 0-time or over user clock time, i.e. one access per clock. Whether 0-time ops are supported or not is strictly a function of the modeling subset, but the pipe interface itself does not preclude this support.
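To make requirements 2 and 3 a bit more concrete, here is a minimal sketch in C of how a send pipe with an implementation-chosen buffer depth and a user-level flush could be layered over a single blocking call that stands in for a basic DPI function. All names (ref_pipe, ref_pipe_send, ref_pipe_flush, dpi_blocking_transport) and the depth of 8 are hypothetical, not the proposed API; the point is only that the element stream the consumer observes is independent of the chosen depth, with flush marking the synchronization point.

    /* Hypothetical reference-model send pipe layered over one blocking call.
     * dpi_blocking_transport() merely stands in for a basic DPI function
     * crossing to the H/W side; nothing here is the proposed SCE-MI API. */

    #include <stdio.h>

    #define PIPE_DEPTH 8                 /* implementation-chosen buffer depth */

    typedef struct {
        int buf[PIPE_DEPTH];             /* elements awaiting delivery   */
        int count;                       /* number of buffered elements  */
    } ref_pipe;

    /* Stand-in for the underlying blocking DPI call; a real reference model
     * would invoke an imported/exported DPI function here. */
    static void dpi_blocking_transport(const int *elems, int n)
    {
        printf("deliver %d element(s), first = %d\n", n, n ? elems[0] : 0);
    }

    /* Blocking send: buffer the element and deliver a batch when the buffer
     * fills.  The element stream seen by the consumer is the same whatever
     * PIPE_DEPTH the implementation picks; only the batching differs. */
    static void ref_pipe_send(ref_pipe *p, int elem)
    {
        p->buf[p->count++] = elem;
        if (p->count == PIPE_DEPTH) {
            dpi_blocking_transport(p->buf, p->count);
            p->count = 0;
        }
    }

    /* Flush: the user-controlled synchronization point.  Remaining buffered
     * elements are pushed through the blocking call so the HVL thread can
     * safely switch back to reactive DPI interaction afterwards. */
    static void ref_pipe_flush(ref_pipe *p)
    {
        if (p->count > 0) {
            dpi_blocking_transport(p->buf, p->count);
            p->count = 0;
        }
    }

    int main(void)
    {
        ref_pipe p = { {0}, 0 };
        for (int i = 0; i < 20; i++)
            ref_pipe_send(&p, i);        /* "streaming mode"      */
        ref_pipe_flush(&p);              /* synchronization point */
        return 0;
    }

An optimized implementation could replace the body of dpi_blocking_transport() with batched or concurrent transport without changing what the consumer observes, which is exactly the constraint requirement 2 imposes.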
It is useful to compare and contrast the semantics of pipes with those of fifos. I think the reason we often stumble when discussing issues like user vs. implementation specified buffer depth, its effect on determinism, etc. is that people are thinking of a fifo model rather than a pipe model. Both pipes and fifos are deterministic and have similar functions in terms of providing buffered data throughput capability, but they have different basic semantics. Here is a small listing that tries to compare and contrast the semantics of fifos vs. pipes:
Fifos
- Follow the classical OSCI-TLM-like FIFO model
- User specified, fixed-size buffer depth
- Automatic synchronization
- Support blocking and non-blocking put/get ops
- "Under the hood" optimizations possible - batching
- No notion of a flush
Pipes
- Follow the Unix stream model (future/past/present semantics)
- Implementation specified buffer depth
- User controlled synchronization
- Make concurrency optimization more straightforward
- Support only blocking ops (for determinism)
- "Under the hood" optimizations possible - batching, concurrency
- More naturally support data shaping, variable length messaging (vlm), end-of-message (eom) marking, and flushing
One could argue that we may wish to entertain the notion of a "dpi_fifo" reference library to augment the "dpi_pipe" reference library currently proposed, and thus provide two alternative DPI extension libraries as part of the SCE-MI II proposal that address different sets of user needs. But it is useful to make a clear distinction between fifos and pipes and, for now, at least converge on the semantics of the proposed pipes and make sure they address the original requirements of variable length messaging.
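Since variable length messaging was the original requirement driving pipes, here is a small illustrative sketch of how a message of arbitrary byte length can be carried over a fixed-width pipe as a sequence of elements, with an end-of-message (eom) flag on the last one. The element width of 4 bytes and the function pipe_send() are hypothetical stand-ins chosen for the example, not the proposed SCE-MI API.

    /* Variable length messaging over a fixed-width pipe: chunk the message
     * into fixed-size elements, pad the last one, and flag it with eom so
     * the receiver knows where the message ends.  pipe_send() is only a
     * printing stand-in for a real pipe send call. */

    #include <stdio.h>
    #include <string.h>

    #define ELEM_BYTES 4                      /* example fixed element width */

    static void pipe_send(const unsigned char *elem, int eom)
    {
        printf("send element %02x%02x%02x%02x%s\n",
               elem[0], elem[1], elem[2], elem[3], eom ? " [eom]" : "");
    }

    static void send_message(const unsigned char *msg, int len)
    {
        unsigned char elem[ELEM_BYTES];
        for (int off = 0; off < len; off += ELEM_BYTES) {
            int n = (len - off < ELEM_BYTES) ? (len - off) : ELEM_BYTES;
            memset(elem, 0, ELEM_BYTES);               /* pad a short final element */
            memcpy(elem, msg + off, n);
            pipe_send(elem, off + ELEM_BYTES >= len);  /* eom on last element */
        }
    }

    int main(void)
    {
        const unsigned char msg[] = "variable length payload";
        send_message(msg, (int)sizeof msg);
        return 0;
    }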
Just to augment what I've said above, I would like to recap some of Per's and other earlier comments regarding pipes. I've re-stated these here so it is all in one convenient place. The following text is verbose, so read it as desired. The main part of my proposal is the text above; this is just supportive text reiterating its main points.
------------------------------------------
This ties in with and clarifies #2 above:
Per Bojsen wrote:
> Note that the text said that concurrency could be introduced by the
> implementation as long as it does not alter behavior. So we've
> established and all agreed upon that the new DPI/function based subset
> of SCE-MI 2.0 is a system that uses alternating execution. This
> follows directly from the DPI definition. However, this applies
> only to the behavior of the system, not necessarily to what is actually
> going on under the hood. There are plenty of opportunities to
> optimize the transport and execution that do not change the behavior.
> This includes introducing some degree of concurrent operation. Do
> you agree that it does not matter that there is some degree of
> concurrent operation as long as it behaves exactly like a purely
> alternating system would? SCE-MI 2.0 will describe the semantics
> of the DPI/function based interface in terms of alternating execution.
> The implementation is compliant as long as it preserves these semantics.
> It does not matter one bit how the implementation achieves this
> under the hood, agreed?
------------------------------------------
This ties in with and clarifies #3 above:
Per Bojsen wrote:
> > IM209 - We had some discussion about setting buffer depth for pipes.
> > It was my understanding that buffer depth would be set by the
> > infrastructure and not by the user. Is this correct?
>
> This is my understanding as well. If the system is deterministic and
> observes alternating semantics, then I do not see any need for a user
> setting of buffer depth. This is because the buffer depth setting
> would have no observable impact on the behavior of the system. There
> are other problems with a user settable buffer depth: it is unlikely
> that a given buffer depth setting would achieve optimum performance
> in all implementations. Note, I am saying that I do not think a user
> setting for buffer depth should be in the SCE-MI 2.0 standard. However,
> any implementation is free to provide its own performance optimization
> knobs outside of the standard, which could include buffer depth setting.
> I do not see such features as leading to a non-compliant implementation
> (necessarily).
------------------------------------------
This ties in with the comments about pipes vs. fifos semantics, or even vs. use of plain DPI calls, and when a user would want one over another:
Per Bojsen wrote:
> It is my understanding that pipes are intended for streaming, batching,
> variable length messages, and potentially can be used even for more
> exotic purposes if the modeling subset allows it. Given that pipes
> can be implemented at the application layer, the choice between using
> pipes and DPI is one of convenience in many cases. However, since an
> implementation can choose to provide an optimized version of the pipes,
> this would be a factor as well in the choice to use them.
------------------------------------------
This ties in with #3 above:
John Stickley wrote:
> johnS:
> I think the main point here is that it does not matter who sets the
> delay [buffer depth] or what the delay is, so long as there is a
> mechanism to re-synchronize the times of the pipe producer and pipe
> consumer if it becomes necessary to enter back into a mode of reactive
> (alternating) interactions. This is the purpose of the flush call - to
> provide this re-synchronization.
>
> For example, a pipe producer thread can be sitting there jamming
> transactions into a pipe to its heart's content. The consumer,
> meanwhile, is only consuming transactions which the producer had
> written well into the past.
>
> So in this scenario, at any given point in the consumer's time, the
> producer is well into the future - how far into it, we don't care.
>
> Or, put differently, at any given point in the producer's time, the
> consumer is well into the past. How far into it, we don't care.
>
> But suppose producer and consumer now want to interact reactively, say
> with plain DPI function call interactions. They must synchronize,
> i.e. the producer's present must become one and the same as the
> consumer's present. To do this, the producer issues a flush. This
> guarantees that the producer thread blocks until all the future
> transactions have dissipated to the consumer, and now the two are
> synchronized in time. At this point in time, the two have a common
> present and are free to communicate reactively. And all this can be
> done deterministically, where interactions take place on the same
> clocks on the timed side regardless of how much buffering an
> implementation provided or how much concurrency it chose to use.
** End of JohnS proposal
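To illustrate the producer/consumer scenario John describes above, here is a small self-contained sketch using POSIX threads. The pipe, its depth, and every name in it are hypothetical stand-ins rather than the proposed SCE-MI interface; it only shows how a producer can stream ahead of the consumer and how a flush blocks the producer until the buffered ("future") transactions have dissipated, giving the two sides a common present before any reactive interaction.

    /* Hypothetical pipe with blocking send/receive and a flush that waits
     * for the consumer to drain it.  Compile with -pthread. */

    #include <pthread.h>
    #include <stdio.h>

    #define DEPTH 4                          /* implementation-chosen depth */

    static int buf[DEPTH], head, tail, count;
    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
    static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
    static pthread_cond_t  drained   = PTHREAD_COND_INITIALIZER;

    static void pipe_send(int v)             /* blocking send into the pipe */
    {
        pthread_mutex_lock(&m);
        while (count == DEPTH) pthread_cond_wait(&not_full, &m);
        buf[tail] = v; tail = (tail + 1) % DEPTH; count++;
        pthread_cond_signal(&not_empty);
        pthread_mutex_unlock(&m);
    }

    static int pipe_receive(void)            /* blocking receive */
    {
        pthread_mutex_lock(&m);
        while (count == 0) pthread_cond_wait(&not_empty, &m);
        int v = buf[head]; head = (head + 1) % DEPTH; count--;
        pthread_cond_signal(&not_full);
        if (count == 0) pthread_cond_signal(&drained);
        pthread_mutex_unlock(&m);
        return v;
    }

    static void pipe_flush(void)             /* block until consumer catches up */
    {
        pthread_mutex_lock(&m);
        while (count > 0) pthread_cond_wait(&drained, &m);
        pthread_mutex_unlock(&m);
    }

    static void *consumer(void *arg)         /* consumes at its own pace */
    {
        for (int i = 0; i < 16; i++)
            printf("consumer got %d\n", pipe_receive());
        return arg;
    }

    int main(void)
    {
        pthread_t c;
        pthread_create(&c, NULL, consumer, NULL);
        for (int i = 0; i < 16; i++)         /* producer streams ahead */
            pipe_send(i);
        pipe_flush();                        /* producer and consumer share a present */
        printf("producer: flushed, safe to interact reactively\n");
        pthread_join(c, NULL);
        return 0;
    }

The flush returns only once the buffer count has reached zero, which in this sketch is exactly the point at which a conventional reactive DPI-style call would be safe; how far the producer had run ahead before the flush does not matter.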
Pure DPI is alternating. 1.1 has no preference and could be either alternating or concurrent. "Reactive" and "alternating" are being used to mean the same thing.
Function calls are blocking. The call itself occurs in zero time, even though time may be consumed in the function called.
A 2.0 implementation must support concurrency if 1.1 models require it. Streaming is only supported by models that are pure sources or pure sinks. Any concurrency can be added to an alternating system so long as it does not alter behavior; in these cases the alternating behavior is the benchmark. These are all viewed as implementation optimizations and should not be specified or mentioned in the specification.
Shabtay> We need to evaluate whether concurrency introduces a lack of compliance among the various simulation and emulation environments. I don't see how the rate at which messages are sourced or sunk could be the same when concurrency is used. Can it be?
Per> Note that the text said that concurrency could be introduced by the implementation as long as it does not alter behavior. So we've established and all agreed upon that the new DPI/function based subset of SCE-MI 2.0 is a system that uses alternating execution. This follows directly from the DPI definition. However, this applies only to the behavior of the system, not necessarily to what is actually going on under the hood. There are plenty of opportunities to optimize the transport and execution that do not change the behavior. This includes introducing some degree of concurrent operation. Do you agree that it does not matter that there is some degree of concurrent operation as long as it behaves exactly like a purely alternating system would? SCE-MI 2.0 will describe the semantics of the DPI/function based interface in terms of alternating execution. The implementation is compliant as long as it preserves these semantics. It does not matter one bit how the implementation achieves this under the hood, agreed?
Shabtay>> What is done under the hood is left to the implementers. What I care about is maintaining determinism when using all engines, including simulation. Let's simply table that.
JohnS>> Yes. The idea is to maintain determinism, and therefore consistent behavior, regardless of optimizations put "under the hood".
Batching is the aggregation of messages to improve communications.
Streaming will not be required by the spec.