Subject: RE: ISSUE:DirectC:DirectC i/f should support mechanism for calling Verilog task/function from a DirectC application
From: Stuart Swan (stuart@cadence.com)
Date: Wed Oct 23 2002 - 16:37:48 PDT
It's nice to see some "out of the box" technical discussion related to
the C/C++ integration issues, though it concerns me that there seems
to be a recurring warning from some that "We can't think about more
general approaches, since that will put our tight schedule at risk."
It seems to me that this kind of approach will only result in a lousy
standard.

I'm in general agreement with the comments from John Stickley about what
C/C++ modeling is good for, and what sort of performance can be obtained.
Certainly at levels of abstraction above the RTL level (but oftentimes
still at the cycle accurate level) very high performance can be achieved.
For example, a number of companies today are performing cycle accurate
modeling of platforms consisting of processors (ISS models), busses,
peripherals, etc., in SystemC and getting 50K - 100K cycles per second
on typical PCs.
Don't believe me? See:
-----Original Message-----
From: Stickley, John [mailto:john_stickley@mentorg.com]
Sent: Wednesday, October 23, 2002 12:35 PM
To: Kevin Cameron x3251
Cc: Stickley, John; sv-cc@eda.org
Subject: Re: ISSUE:DirectC:DirectC i/f should support mechanism for calling Verilog task/function from a DirectC application
Kevin,
Interesting discussion. See embedded comments.
Kevin Cameron x3251 wrote:
> From john_stickley@mentorg.com Wed Oct 23 10:37:06 2002
>
> Kevin,
>
> Kevin Cameron x3251 wrote:
>
> > > From: "Swapnajit Mittra" <mittra@juno.com>
...
> > That's why it's better to write your functions in SV rather than
> > calling external C routines :-)
> >
>
> johnS:
>
> Kevin,
>
> I think the issue is more than just context switching.
> In fact, that's not the difficulty. That by itself is a pretty routine
> operation in all the new C++ kernels - TestBuilder, SystemC
> and even non-C++ testbench modeling environments such as Verisity.
>
> In fact, from what I've seen, for pure untimed multi-threaded
> testbenches, these environments are pretty efficient (have
> a look at systemc.org posting on some benchmarking I did:
> http://www.systemc.org/hypermail/systemc-forum/1282.html)
Throughput depends on the ratio of context switching time to actual
model evaluation. Most of the C/C++ activity is currently at the
testbench level (and cycle based); there aren't that many threads and
the models are complex. As a general approach to simulation it doesn't
work, because a lot of the code is too short (from @/# -> @/#). Also,
if you don't know the call depth for the C/C++ then you have to
allocate a "big enough" stack for each thread, which is inefficient
in memory (you'd blow your 2G Linux limit fairly easily).
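To put rough numbers on the stack-size point, here is a minimal sketch in
Python. Only the 2G limit comes from the text above; the per-thread stack
sizes and the helper names are hypothetical, chosen purely for illustration.

```python
# Only the 2G figure comes from the thread; the stack sizes are assumptions.
ADDRESS_SPACE_LIMIT = 2 * 1024**3  # the "2G Linux limit", in bytes


def max_threads(stack_bytes, limit=ADDRESS_SPACE_LIMIT):
    """Upper bound on thread count if each thread reserves a fixed stack."""
    return limit // stack_bytes


def stack_reservation(num_threads, stack_bytes):
    """Total bytes reserved for stacks alone (ignores heap, code, etc.)."""
    return num_threads * stack_bytes


for stack_kib in (16, 64, 256, 1024):
    bound = max_threads(stack_kib * 1024)
    print(f"{stack_kib:>5} KiB per stack -> at most {bound:,} threads")
```

With a hypothetical 64 KiB "big enough" stack the bound is 32,768 threads,
and that is before any memory is counted for the models themselves.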
johnS: I guess it comes down to whether your design is composed mainly
of a small number of long algorithmic sequential blocks (which tend to
use native machine data types) vs a large number of tiny concurrent
blocks (which tend to use bit_vector data types).

SystemC is real good at the former and real bad at the latter.
We've never seen any problems with large call depth, but we have
seen C++ simulators bog down when there's lots of fine grained
concurrency. However, our position is that this has never really been
SystemC's sweet spot anyway. It is much better at algorithmic/behavioral
system level modeling (which, regrettably, HDL environments, at least
those prior to SystemVerilog, are not).
What's interesting about the case I cited above is that it is a pipeline
of large algorithmic cores surrounded by thin timing shells (characterized
by cycle accurate activity). The computations in the cores dwarf the
cycle based activity in the shells, which is why it performs so much
better in the SystemC behavioral version than the VHDL RTL version
(8 1/2 minutes vs 9 weeks!). In the RTL version the entire design is
cycle based activity.
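For a sense of scale, the two run times quoted above can be turned into a
ratio. This is only a back-of-the-envelope sketch in Python, and it assumes
the 9 week figure is continuous wall-clock time, which the message does not
state.

```python
# Figures quoted in the message: 8 1/2 minutes (SystemC behavioral)
# versus 9 weeks (VHDL RTL).
MINUTES_PER_WEEK = 7 * 24 * 60


def speedup(rtl_weeks, behavioral_minutes):
    """Ratio of RTL run time to behavioral run time.

    Assumes the RTL figure is continuous wall-clock time.
    """
    return (rtl_weeks * MINUTES_PER_WEEK) / behavioral_minutes


ratio = speedup(rtl_weeks=9, behavioral_minutes=8.5)
print(f"RTL run: {9 * MINUTES_PER_WEEK} minutes; ratio: about {ratio:,.0f}x")
```

Under that assumption the behavioral version comes out roughly four orders
of magnitude faster.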
But generally from my point of view, we see the whole spectrum of
testbenches. We see testbenches with lots of tiny concurrent processes,
we see testbenches with few highly sequential processes. In the latter
case, thread context switch overhead does not amount to much - in the
former case it does.
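The contrast between the two kinds of testbench can be sketched with a very
simple cost model in Python. It assumes every process activation pays one
context switch plus its own evaluation work; the cost values are
hypothetical, in arbitrary time units, and are not measurements of any
simulator.

```python
def switch_overhead_fraction(switch_cost, eval_cost):
    """Fraction of run time spent switching, assuming each activation
    pays one context switch (switch_cost) plus its work (eval_cost)."""
    return switch_cost / (switch_cost + eval_cost)


# Hypothetical costs in arbitrary time units; not measurements.
SWITCH_COST = 1.0
cases = (
    ("few long sequential processes", 1000.0),
    ("many tiny concurrent processes", 0.5),
)
for label, eval_cost in cases:
    frac = switch_overhead_fraction(SWITCH_COST, eval_cost)
    print(f"{label}: {frac:.1%} of time spent switching")
```

In this model the switch cost is negligible when each activation does a lot
of work and dominates when each activation does very little.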
> They all seem to use a package called QuickThreads which is a very
> efficient non-preemptive threading package. The place where
> SystemC performance breaks down is when processing
> bit_vectors rather than native machine types such as int or double.
>
> But the 0-time issue is more than just context switching. It tends
> to make it difficult to keep coherency between the time state
> of the C++ kernel and that of the HDL kernel. By establishing
> 0-time "synchronization points" you can easily avoid this inconsistency.
I don't think the DirectC approach requires a separate C++ kernel, so
it's less of a problem.
johnS:
True, it doesn't require it, but it should probably also accommodate it.
TestBuilder and SystemC are good examples of independent kernels,
though they could be tightly bundled into a given vendor's HDL kernel
at the discretion of that vendor - or not.

I would like to make sure we design the interface in such a way that it
does not prevent (or at least discourage) either approach.
I'd prefer to leave the topic of handling genuine parallelism and multiple
stacks etc. to the Enhancement Committee for a later rev of SV.
johnS:
But at least if we can provide some simple hooks into the interface
to accommodate these environments that would be nice. I'm in the process
of putting forth a simple proposal for how we might do that. I'll include
an example with it also.
-- johnS
> > Kev.
> >
> > > Doug,
> > >
> > > I understand your concern. But cannot these C models
> > > be written as cmodules (or be embedded within cmodules)?
> > >
> > > I think (Joao, correct me if I am wrong) a DirectC
> > > external C function and a cmodule are in some sense
> > > analogous to a Verilog function and task respectively.
> > > External C functions are 0-time activities whereas
> > > cmodules are not.
> > >
> > > I would not mind if we change that proposition (that DirectC
> > > external C functions should be 0-time activity), but my
> > > concerns there are:
> > >
> > > o Whether that will break any other basic structure of the
> > >   whole scheme.
> > >
> > > o Whether we have sufficient time to undertake this type of
> > >   fundamental change. Personally I think we should rather be
> > >   late than produce things that are half cooked, but I know
> > >   we are working under some tight schedule here.
> > > --
> > > Swapnajit Mittra
> > > Project VeriPage ::: http://www.angelfire.com/ca/verilog
John Stickley
Principal Engineer
Verification Solutions Group
Mentor Graphics Corp. - MED
17 E. Cedar Place
Ramsey, NJ 07446
mailto:John_Stickley@mentor.com
Phone: (201)818-2585
This archive was generated by hypermail 2b28 : Wed Oct 23 2002 - 16:41:52 PDT