Clarification Questions, RE: Mentor-Proposed SCE-MI 2.0

From: Matt Kopser <mkopser_at_.....> Date: Fri Jun 17 2005 - 09:02:49 PDT

     Clarification questions in the Mentor proposal for SCE-MI 2
        (questions refer to version 1.1 of Mentor's proposal)

Key Questions:
=============

1) Compliance, IP Verification

Mentor's SCE-MI 2.0 proposal is based on a subset of DPI.  How will an
IP vendor verify that their IP complies with the recommended DPI subset
on a simulator that provides full DPI support?

Does Mentor propose implementation of a reference DPI implementation
(that restricts usage to the suggested subset) for verifying interface
compliance?

Some aspects of the restricted subset can be verified statically (for
example, adherence to restricted type set), but recommended limitations
on function call stack, for example, can only be verified dynamically.
How are these limitations to be verified by IP providers?

2) Standard DPI

DPI being an Accellera-approved SystemVerilog standard implies that it
should be attempted in its entirety on all simulation/acceleration
platforms.  Does the Mentor proposal entail that vendors should support
two implementations on acceleration/emulation (one based on the Mentor
proposal and one based on the complete Accellera standard), or that the
DPI implementation on acceleration/emulation will exclusively be based
on the Mentor proposal?

3) DPI Types, VHDL and pure Verilog hardware side

The proposal specifies a suggested subset of SystemVerilog types and
their associated mapping to 'C' types on the software side.  What is
suggested for type support in pure Verilog and VHDL?

How would transactors that are written in VHDL, for example, be tested
in simulation mode, given that the Mentor-proposed extensions are not
approved for VHDL simulation?

4) Streaming/Reactivity

What will happen if an IP provider creates IP for simulation that is
limited to the Mentor data types but does not use transaction pipes?
Can such transactors be used in streaming and reactive mode at the
user's control?

Alternatively, can an IP provider create IP that will work in both a
streaming and non-streaming (reactive) use model, using the proposed
pipe mechanism at the user's control?

5) Threading Requirements

In an environment utilizing the proposed pipe mechanism, it appears that
the proposal requires that the user's (test) C code be threaded.  Is
this the case?

If not, then how is the pipe flush mechanism supposed to 'yield' control
to the C side?  (In pure DPI, if the user's C code is running -- thus
giving the user the opportunity to call the pipe flush -- the
SystemVerilog side of the system has already yielded control to the C
side.  The only means for any other C code to run is through the use of
threads.)

6) Time, Cycle Stamping

How is the passage of 'time' to be tracked on the hardware side?  Since
there is no explicit mention of time in the proposal, does this mean
that there must be an implicit 1/1 (SCE-MI 1.1) controlled clock, that
increments the cycle count?

How are messages/transactions stamped with cycle count?  Is this left
up to each transactor?  Or, is there a proposed standard way of doing
so?
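
As an illustration of the "left up to each transactor" option, here is a
minimal sketch in C.  Every name in it (transactor_model, stamped_msg,
on_controlled_clock, make_msg) is hypothetical and does not come from the
Mentor proposal or from SCE-MI 1.1; it only assumes that something
increments a counter once per controlled clock and that each message
carries a copy of that counter.

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical message layout: a payload plus the cycle at which it was
 * created. Not taken from any proposal. */
typedef struct {
    uint64_t cycle;
    uint32_t payload;
} stamped_msg;

/* Hypothetical per-transactor state holding the cycle counter. */
typedef struct {
    uint64_t cycle_count;
} transactor_model;

/* Assumed to be called once per controlled-clock edge. */
static void on_controlled_clock(transactor_model *t)
{
    assert(t != 0);
    t->cycle_count++;
}

/* Stamp a message with the transactor's current cycle count. */
static stamped_msg make_msg(const transactor_model *t, uint32_t payload)
{
    stamped_msg m;
    assert(t != 0);
    m.cycle = t->cycle_count;
    m.payload = payload;
    return m;
}
```

If every transactor keeps its own counter like this, nothing guarantees
that two transactors agree on what "cycle" means, which is why a standard
stamping mechanism may be preferable.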

Detailed Questions:
==================

7) Context Handling

How is context handling performed in a pure Verilog or VHDL environment?
Does the user have to purchase and/or license a SystemVerilog compliant
system in order to utilize the svGetScope and svGetUserData capability?

Does the proposal need to be enhanced to add a SCE-MI 2.0 API to provide
hardware-side-neutral calls to context functions?  Or, will the user
need to call hardware-side-specific (and in some cases, vendor-specific)
functions to access context information?

Is the proposal recommending (requiring?) that IP be written with the
use of the 'context' specification for all imported functions, or is the
choice left to the IP provider?

8) Exported Tasks

The proposal does not clearly indicate whether exported tasks should be
supported.  What is the recommendation for exported task support and why?

9) Multiple function calls in zero time

Is this a usable mechanism?  Doesn't this encourage a low-performance
use model ('ping-pong'-ing between the software side and the accelerator
for each function call)?  In the SCE-MI 1.1 use model, and Cadence's
proposal, multiple message transfers are achieved by simply
instantiating multiple message port macros in parallel -- this approach
does not require multiple function calls to the software side.

10) Pipe IDs

Unique pipe IDs are the Mentor-proposal equivalent of message port macro
instance names, correct?  (The instance names of message port macros
implicitly differentiate message 'channels', but in this proposal,
integer IDs are necessary to 'uniquify'.)

How is determinism ensured when using the pipe mechanism -- if the
depth of pipe fifos is left up to the vendor to decide?
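
To make the concern concrete, here is a small sketch in C of a producer
that polls with non-blocking sends.  It is a toy occupancy model, not the
proposed pipe API: accepted_sends and its parameters are hypothetical,
and the consumer is assumed to remove one element after every
drain_period attempts.

```c
#include <assert.h>

/* Toy model: returns how many of `attempts` non-blocking sends succeed
 * when the FIFO holds at most `depth` elements and the consumer removes
 * one element after every `drain_period` attempts.
 * All names are hypothetical. */
static unsigned accepted_sends(unsigned depth, unsigned attempts,
                               unsigned drain_period)
{
    unsigned occupancy = 0;
    unsigned accepted = 0;
    unsigned i;

    for (i = 1; i <= attempts; i++) {
        if (occupancy < depth) {      /* room available: send succeeds */
            occupancy++;
            accepted++;
        }                             /* otherwise the send is refused */
        if (drain_period != 0 && i % drain_period == 0 && occupancy > 0)
            occupancy--;              /* consumer takes one element */
    }
    return accepted;
}
```

With the same producer and consumer behavior, accepted_sends(2, 10, 3)
and accepted_sends(8, 10, 3) return different counts, so any model whose
behavior depends on whether a poll succeeds would behave differently on
platforms that choose different depths.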

11) Transaction Pipes

What is a 'full pipe'?  The prototypes for the pipe mechanism functions
imply infinitely-long pipes.  Why would a pipe ever be 'full'?
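
For reference, a 'full' condition only arises if the pipe is bounded.  The
sketch below, in C, models one such bounded pipe; pipe_model, PIPE_DEPTH
and the pipe_try_* functions are hypothetical names chosen for
illustration and are not the prototypes in the Mentor proposal.

```c
#include <stddef.h>
#include <string.h>
#include <assert.h>

/* Hypothetical stand-in for a vendor-chosen FIFO depth. */
#define PIPE_DEPTH 4

typedef struct {
    unsigned elems[PIPE_DEPTH];
    size_t head;    /* index of the oldest element */
    size_t count;   /* number of elements currently stored */
} pipe_model;

static void pipe_init(pipe_model *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
}

/* The pipe is 'full' when every slot is occupied. */
static int pipe_is_full(const pipe_model *p)
{
    return p->count == PIPE_DEPTH;
}

/* Non-blocking send: returns 1 on success, 0 if the pipe is full. */
static int pipe_try_send(pipe_model *p, unsigned value)
{
    if (pipe_is_full(p))
        return 0;
    p->elems[(p->head + p->count) % PIPE_DEPTH] = value;
    p->count++;
    return 1;
}

/* Non-blocking receive: returns 1 and stores the oldest element in
 * *out on success, 0 if the pipe is empty. */
static int pipe_try_receive(pipe_model *p, unsigned *out)
{
    if (p->count == 0)
        return 0;
    *out = p->elems[p->head];
    p->head = (p->head + 1) % PIPE_DEPTH;
    p->count--;
    return 1;
}
```

An interface with truly unbounded pipes would never need pipe_is_full or
a failing send, which is the point of the question above.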

12) Dynamic / Variable pipe data sizes

The header files supplied in the examples indicate a static value for
the DPI_PIPE_MAX_BITS value.  This value is used to size the target data
locations in the example BFMs.  How does the IP provider size the HDL
data locations to the appropriate size for the application?

Does it limit the size of arguments defined by the modeler when calling
the send and receive methods?

What happens if DPI_PIPE_MAX_BITS is less than the value of
8 * bytes_per_element * num_elements?
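
The comparison in question can be written out as a short C sketch.  The
limit used here, EXAMPLE_PIPE_MAX_BITS, is an assumed value picked for
illustration and is not the value in Mentor's example headers;
payload_bits and payload_fits are likewise hypothetical helper names.

```c
#include <stddef.h>
#include <assert.h>

/* Assumed limit for illustration only; the real headers define their
 * own static DPI_PIPE_MAX_BITS. */
#define EXAMPLE_PIPE_MAX_BITS 1024

/* Number of bits a transfer would need:
 * 8 * bytes_per_element * num_elements. */
static size_t payload_bits(size_t bytes_per_element, size_t num_elements)
{
    return (size_t)8 * bytes_per_element * num_elements;
}

/* Returns 1 if the transfer fits within the assumed limit, 0 otherwise. */
static int payload_fits(size_t bytes_per_element, size_t num_elements)
{
    return payload_bits(bytes_per_element, num_elements)
           <= (size_t)EXAMPLE_PIPE_MAX_BITS;
}
```

Under this assumed limit, 32 elements of 4 bytes need exactly 1024 bits
and fit, while 33 such elements need 1056 bits and do not; the proposal
does not say what should happen in the second case.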

The underlying atomic data size is one [...] to non-deterministic behavior.
SCE-MI 1.1 is very prone to this vulnerability.

[Matt] John, you state that, with the Mentor-proposed pipe mechanism,
'you have to be very careful...'.  And you state that SCE-MI 1.1 is
'very prone to...'  Are you saying that, in both cases, deterministic
behavior is up to the user and/or IP modeler to ensure?

This said, polling semantics on the SystemC side of say a TLM-DPI
conduit appears to be quite feasible (i.e. tlm_put_if::nb_put(),
tlm_put_if::nb_can_put(), tlm_get_if::nb_get(),
tlm_get_if::nb_can_get()).

Incidentally, one of the things that is not yet clear to us in the
Cadence proposal is how polling on their SceMiVarMessageInPort's can be
done deterministically, especially if streaming is enabled.
If reactivity controls are enforced, then it would be deterministic, but
if streaming and overlapped execution between S/W and H/W are enabled,
it would seem to us that determinism is lost.

[Matt] This scenario (hardware-side-polling on a SceMiVarMessageInPort,
in streaming mode) is analogous to (as you stated above), hardware-side
polling over the proposed transaction pipes.  In both cases, the IP
provider and/or user must ultimately ensure determinism.

[Matt] If the Cadence-proposed macros are used in reactive mode, then
deterministic behavior is achieved.  With the Mentor-proposed pipe
mechanism, it appears that it is up to the IP provider and/or end
user to explicitly flush pipes on the software side to ensure reactive/
deterministic behavior.  (Cadence has asked for clarification on how
the flush mechanism and threading considerations are expected to work
in the Mentor proposal -- clarification of these questions would help
in this area.)
> 
> * Loss in term of modeling capabilities or facility: currently the use
> of control clock mechanism allows us to process data within the 
> transactor before or after the message passing whatever the number of 
> clock cycles which are required by the implementation. This is simply 
> due to the fact that in the transactor control we can decide to stop 
> the cclock if necessary for processing the data. If the zero-delay 
> operations are now only attached to data transfers and no more to a 
> transactor control, then this restrains all the architectures which 
> can be used when data processing is to be performed within a 
> transactor. This will not simplify at all the modeling and can also 
> prevent some data processing modeling. This is another reason why the
> compatibility with scemi 1.x is also an essential requirement.
> Allowing both a control + data synchronization should be the best in 
> term of modeling capabilities for the new SCEMI interface. Note that 
> the control clock + PCI like interface was doing that in SCEMI 1.x.

johnS:

I think you are talking about basic support for 0-time operations on
messages between clocks.

I have 2 comments about this.

1. As you say, for specialized cases that require this using an
    uncontrolled time based approach, you can always write a
    SCE-MI 1.1 transactor.

2. Beyond this, if you're writing the transactors only in controlled
    time as is the basic use model and intent of the SCE-MI 2
    requirements, to really have this discussion we believe you have to
    consider modeling subset. Consider the problem you're trying to
    solve. As a user, you want to deal only with controlled time, yet
    you're trying to perform some iterative sequential operation in
    0-time - say between clocks.

    This is fundamentally difficult to do in synchronous RTL that only
    uses controlled clocks with either proposal.

    However, if the modeling subset supports something like data
    dependent loops in 0-time - which goes beyond RTL, there is nothing
    in the DPI interface itself that prevents this type of use of the
    interface.

    By contrast, using a macro based approach where the macros are
    clocked by controlled clocks, it is fundamentally impossible
    to do 0-time message post-processing in the same clock cycle
    in which the message is received. Post processing would have to
    be deferred until the next clock at the earliest.

[Matt] You mean to say that using a macro-based approach together with
a restricted, finite-state-machine modeling style makes it nearly
impossible to do 0-time message post-processing in the same clock
cycle..., right?  Cadence has touched on this with regard to the
contents of Appendix A of the Mentor proposal -- there is an assumption
that the DPI-based approach allows use of an extended modeling style
whereas the macro-based SCE-MI 1.1 (and Cadence-proposed 2.0) do not.
This is a mischaracterization of the facts, IMO.

A related topic to this is multiple messages in 0-time. This has some of
the same characteristics as 0-time message post-processing.
Again, it is really more an issue of what modeling subset the implementor
wishes to support. There is nothing in the current Mentor DPI interface
proposal that precludes multiple message transfers in 0-time so long as
the modeling subset supports it. This is in stark contrast to a macro
based approach that drives macros with a controlled clock. It is
fundamentally impossible to perform multiple message transfers in 0-time
with this approach.

Please see section 3.4 of my latest proposal revision update for a more
detailed discussion of this.

Another way to think of this is to ask yourself the question, "how would
I do this in a S/W simulation interface ?". Is there a reason we need
"uncontrolled time" in S/W simulators ? No, there never has been. There
is a notion of "delta cycles", but no uncontrolled time. And even delta
cycles are largely hidden from user consciousness. So, similarly we
should provide an interface where, assuming the modeling subset supports
it, it is possible to do multiple operations in 0-time without the user
having to think about "stopping the clock".

Realistically speaking, in the shorter term you may not see
implementations supporting modeling subsets that support 0-time ops on
emulation platforms.  However, that is not to say that it cannot happen
in the future.

[Matt] Agreed.  And this extended subset support, if/when supported by
accelerators/emulators, could be used in conjunction with both DPI-
based models and models which instantiate SCE-MI macros.

And if it does, the SCE-MI 2 interface definition should scale to allow
for the possibility of that type of operation.

In conclusion, I think this issue really has more to do with supported
modeling subsets than with the API itself. But, I reiterate, let us not
design the API in such a way that will fundamentally prevent scaling to
more sophisticated modeling subsets in the future.

[Matt] Agreed.

Matt

<eom - Matt>

-- johnS
<eom>

> 
> Best regards,
> 
> Joseph BULONE
> HW Emulation manager
> 
> PS: if I cannot join you during the meetings, please do not hesitate to
> send me any questions.
> 
> -----Original Message-----
> From: owner-itc@eda.org [mailto:owner-itc@eda.org] On Behalf Of 
> Deneault, Damian
> Sent: Wednesday, June 01, 2005 4:47 PM
> To: 'itc@eda.org'
> Subject: ITC Meeting Minutes for May 26th
> 
> 
> 
> ITC Meeting Minutes for May 26th
> 
> Attendees
> 
> Duaine Pryor - Mentor
> John Stickley - Mentor
> Jeff Evans - Mentor
> Matt Kopser - Cadence
> Richard Newell - Aptix
> Damian Deneault - Zaiq
> Per Bojsen - Zaiq
> Tom Peng - Zaiq
> Jason Rothfuss - Cadence
> Russ Vreeland - Broadcom
> Sanjay Sawant - Tharas
> Donald Cramb - Tharas
> Bryan Sniderman - ATI
> Edmund Fong - ATI
> 
> Activity
> 
> The Mentor proposal was discussed further, both addressing individual 
> questions raised and exploring it in general. Topics included:
> - the mixing of old (SCEMI 1.x) models with DPI based models, how to 
> understand any interaction between them, what interaction is or isn't 
> allowed, and whether reducing allowed interaction would be a good 
> simplification. John S promised written document or example clarifying
> interaction and the retained elements of SCEMI 1.x
> - existence of uncontrolled time with controlled time, how to handle
> it in the Mentor proposal, and whether the correlating questions exist
> with the Cadence proposal
> - the current precision of the combination DPI spec plus Mentor 
> proposal for non System Verilog language/RTL combinations
> - a previous request was repeated for a complete example
> 
> Cadence plans to deliver a written document to explain or rebut some 
> comments on their proposal. This should be posted or emailed and then 
> followed up with discussion.
> 
> The idea of a compromise or merged proposal was raised by Richard N, 
> and anyone is invited to come in with ideas for that at the next
> meeting.
> 
> In order to plan the evaluation and decision process, Damian requested
> that all participants come to the next meeting with their input on 
> what specifically they feel should be discussed, explored or clarified
> before we can come to a decision or vote. We will try to lay out the 
> remaining decision process and timeline at the next meeting.
> 
> The next meeting is Thursday 6/2 at the usual 12pm/9am time.
> 
> Action Items
> 
> 1. Cadence (Matt K) will post a written response document, before the 
> 6/2 meeting.
> 
> 2. Mentor (John S) will provide a document clarifying the portions of 
> SCEMI 1.x that are retained as part of the DPI based proposal, prior 
> to the 6/2 meeting.
> 
> 3. Mentor (John S) will provide a complete example.
> 
> 4. All participants come prepared to identify remaining discussion or 
> evaluation tasks at the 6/2 meeting.
> 
> 
> 
> 

-- 

This email may contain material that is confidential, privileged and/or
attorney work product for the sole use of the intended recipient.  Any
review, reliance or distribution by others or
forwarding without express permission        /\
is strictly prohibited. If you are     /\   |  \
not the intended recipient please     |  \ /   |
contact the sender and delete        /    \     \
all copies.                      /\_/  K2  \_    \_
______________________________/\/            \     \
John Stickley                   \             \     \
Mgr., Acceleration Methodologies \             \________________
Mentor Graphics - MED             \_
17 E. Cedar Place                   \   john_stickley@mentor.com
Ramsey, NJ  07446                    \     Phone: (201) 818-2585
________________________________________________________________