RE: Process Control Extensions - kill/reset/throw scheduling, exception handling

From: Bishnupriya Bhattacharya <bpriya@cadence.com>
Date: Wed Sep 01 2010 - 10:13:33 PDT

Philipp, Tor, John, All,

One comment on the issue that Tor raised: he pointed out that the current statement in the spec, "No other processes shall execute between the time the kill() of a target process is ordered and the target obliges", is impossible to satisfy in a multi-core implementation.

Here I do think Tor has a valid point. Under the immediate semantics of kill/reset/throw_it, even if we switch in the target process on the same core where the perpetrator is running, other processes will still be executing in parallel on other cores, which defeats the above statement.

Now suppose we do away with this particular statement from the spec, but still maintain the immediate semantics of kill et al. Can anything go wrong? It seems to me that nothing more can go wrong than what can already go wrong in a normal SC simulation without process control constructs. For example, suppose a process running on another core had obtained the handle of the soon-to-be-killed target process and tested that it was alive before doing something:

if (!h.zombie()) {
  do_something();
}

Now the target may be murdered in the middle of do_something(), but note that the same situation can arise without kill() if the target is executing in parallel and simply returns from its function, i.e. if the target dies a natural death.

Similarly, suppose the target, while getting killed, catches the kill exception and does some cleanup of member variables in the module, and suppose another process running in parallel is simultaneously accessing the same module member variables. Again, the same situation of two processes accessing the same module member variables can arise without kill().
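
The catch-and-clean-up pattern mentioned above can be sketched in plain C++ without any SystemC dependency. This is only an illustration under stated assumptions: unwind_exception_sketch is a hypothetical stand-in for the proposal's sc_unwind_exception, and module_state, target_body and run_target are invented names that model a module, a thread process body and the kernel side respectively. The kill is simulated by throwing from inside the body.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the proposal's sc_unwind_exception
// (NOT the real SystemC class).
struct unwind_exception_sketch {
    bool is_reset;  // true would model reset(), false models kill()
};

// Hypothetical module member variables shared with other processes.
struct module_state {
    bool resource_held = false;
    std::vector<std::string> log;
};

// Models a thread process body. 'kill_requested' simulates the kernel
// throwing the unwind exception into the target at its suspension point.
void target_body(module_state& m, bool kill_requested) {
    m.resource_held = true;
    try {
        if (kill_requested) {
            throw unwind_exception_sketch{false};
        }
        m.log.push_back("ran to completion");
    } catch (const unwind_exception_sketch&) {
        // Clean up module state, then rethrow so unwinding continues.
        m.resource_held = false;
        m.log.push_back("cleaned up");
        throw;
    }
    m.resource_held = false;
}

// Models the kernel side: returns true if the target was unwound by a kill.
bool run_target(module_state& m, bool kill_requested) {
    try {
        target_body(m, kill_requested);
    } catch (const unwind_exception_sketch&) {
        return true;
    }
    return false;
}
```

Calling run_target(m, true) leaves m.resource_held false and records "cleaned up". Nothing in the handler protects the member variables against a second process touching them at the same time, which is the point made above: that hazard exists with or without kill().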

The only situation that can be attributed specifically to process control constructs is the one where processes executing in parallel both try to issue a process control construct with immediate semantics on the same target, as in the example where two perpetrators issue a throw_it() on the same target. But here also, the multi-core situation is no different from other normal cases, e.g. two processes try to notify the same event: which one wins?

When defining a multi-core solution for SystemC, a key exercise will be to identify the constructs that need to be made MT-safe by putting them in critical sections, e.g. immediate event notification. The process control constructs with immediate semantics will likely fall in that category also. Designs with such "races" are non-deterministic on multi-cores, but they are equally non-deterministic on a single core, since the process execution order is already allowed to be non-deterministic.
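
A minimal sketch of what "putting them in critical sections" could look like, using only the C++ standard library. It is not taken from any SystemC implementation: target_sketch and race_two_perpetrators are hypothetical names, and the throw_it() member here merely records the request instead of unwinding anything. The assumption is that a mutex serialises the two perpetrators, so each request is handled atomically while the order between them stays unspecified.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical model of a target process; not a SystemC class.
class target_sketch {
public:
    // Models an immediate throw_it(): the request is handled completely
    // before the call returns. The mutex is the critical section.
    void throw_it(const std::string& perpetrator) {
        std::lock_guard<std::mutex> lock(mutex_);
        handled_.push_back(perpetrator);
    }

    // Returns a copy of the requests in the order they were handled.
    std::vector<std::string> handled() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return handled_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::string> handled_;
};

// Two perpetrators on different threads issue throw_it() on one target.
std::vector<std::string> race_two_perpetrators() {
    target_sketch target;
    std::thread a([&target] { target.throw_it("A"); });
    std::thread b([&target] { target.throw_it("B"); });
    a.join();
    b.join();
    return target.handled();
}
```

Both requests are always handled, but whether "A" or "B" comes first may differ from run to run. That matches the argument above: the critical section removes the data race, not the non-determinism, which a single-core scheduler with unspecified process order already has.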

In summary, at this time, I do not see that any special treatment of process control constructs is necessary for multi-core other than to relax/remove that one line in the spec that Tor points out. As John puts it, radical changes will be necessary to make SC be multi-core; IMO the process control constructs by themselves do not pose much of an additional headache.

I'll comment on Philipp's points directly below in a separate email.

Thanks,
-Bishnupriya

-----Original Message-----
From: Philipp A. Hartmann [mailto:philipp.hartmann@offis.de]
Sent: Wednesday, September 01, 2010 7:02 PM
To: john.aynsley@doulos.com
Cc: Bishnupriya Bhattacharya; Jeremiassen, Tor; Stuart Swan; owner-systemc-p1666-technical@eda.org; SystemC P1666 Technical
Subject: Re: Process Control Extensions - kill/reset/throw scheduling, exception handling

All,

I agree with John that it may be too late in the P1666 process to change the immediate semantics of kill/throw/reset. I just wanted to explain my original expectations in that regard.

Nevertheless, I think the open issues mentioned in my last mail need to be addressed:

- Should an implementation be obliged to detect when an application
  calls the scheduler (wait, etc.) during unwinding, or even when it
  swallows kill/reset requests completely? (IMHO, yes)

  The current proposal states that these cases "shall be illegal",
  but does not explicitly require an implementation to react to them.

- Is a kill() issued from a method process meant to be immediate (i.e.
  blocking) in the same sense? If we keep the immediate semantics, it
  should be the case, since otherwise we need to handle concurrent
  requests again.

  This is not implemented in the OSCI PoC simulator, and an
  implementation of this seems to be quite difficult, from what I can
  see.

- Can we add a "bool sc_is_unwinding()", and a
  "bool sc_process_handle::is_unwinding()" to the proposal?
  (for a rationale, please see last mail)
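
To illustrate the kind of check a destructor would make with such a function, here is a sketch in standard C++ only. It assumes C++17 and uses std::uncaught_exceptions() as a stand-in for the proposed sc_is_unwinding(); protocol_proxy_sketch, unwind_exception_sketch, process_body and run_both are hypothetical names, and the "handshake" is just a log entry where a real proxy might call wait().

```cpp
#include <cassert>
#include <exception>
#include <string>
#include <vector>

// Hypothetical stand-in for the exception thrown on kill/reset.
struct unwind_exception_sketch {};

// Hypothetical proxy object that does protocol work on destruction.
class protocol_proxy_sketch {
public:
    explicit protocol_proxy_sketch(std::vector<std::string>& log)
        : log_(log), exceptions_at_entry_(std::uncaught_exceptions()) {}

    ~protocol_proxy_sketch() {
        // More exceptions in flight than at construction means this
        // destructor is running as part of stack unwinding.
        if (std::uncaught_exceptions() > exceptions_at_entry_) {
            log_.push_back("unwinding: skip handshake");
        } else {
            // A real proxy might call into the scheduler here.
            log_.push_back("normal exit: handshake");
        }
    }

private:
    std::vector<std::string>& log_;
    int exceptions_at_entry_;
};

// Models a process body; 'killed' simulates a kill at this point.
void process_body(std::vector<std::string>& log, bool killed) {
    protocol_proxy_sketch proxy(log);
    if (killed) {
        throw unwind_exception_sketch{};
    }
}

// Runs the body once normally and once with a simulated kill.
std::vector<std::string> run_both() {
    std::vector<std::string> log;
    process_body(log, false);
    try {
        process_body(log, true);
    } catch (const unwind_exception_sketch&) {
        // The kernel side would absorb the exception here.
    }
    return log;
}
```

The standard-library check cannot tell a kill/reset unwind apart from any other exception in flight, which is one reason a dedicated query on the process handle would still be useful.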

Thanks,
Philipp

On 01/09/10 14:22, john.aynsley@doulos.com wrote:
> All,
>
> My opinion on this recent debate is pragmatic. The process control
> extension proposal has been around for a few years now and has been
> prototyped by Cadence and within the OSCI POC sim, and I assume it is
> fit-for-purpose. I am opposed to trying to re-engineer a "better"
> solution within the P1666 Working Group along the lines Philipp
> suggests (regardless of whether Philipp's arguments are valid).
>
> For what it is worth, I did find the immediate semantics of kill/reset
> to be "surprising", and would have expected them to work more like
> Philipp suggests.
>
> I take seriously the possibility of re-engineering SystemC in the
> future to exploit many-core processing. However, IMHO the necessary
> changes will be so radical that the inclusion of the proposed process
> control extensions will make little difference.
>
> John A
>
>
>
>
> From:
> "Philipp A. Hartmann" <philipp.hartmann@offis.de>
> To:
> Stuart Swan <stuart@cadence.com>
> Cc:
> "Jeremiassen, Tor" <tor@ti.com>, Bishnupriya Bhattacharya
> <bpriya@cadence.com>, SystemC P1666 Technical
> <systemc-p1666-technical@eda.org>
> Date:
> 27/08/2010 22:37
> Subject:
> Re: Process Control Extensions - kill/reset/throw scheduling,
> exception handling
> Sent by:
> owner-systemc-p1666-technical@eda.org
>
>
>
> Stuart, Bishnupriya, All,
>
> Sorry for chiming back in so late in the discussion, but I have been
> quite busy and out of office in the last couple of days.
>
> Please find some comments interspersed below.
>
> On 25/08/10 23:05, Stuart Swan wrote:
>
>> Again, I think the "immediate semantics" for kill/reset/throw as specified
>> in the Cadence proposal is the least surprising from the POV of the user.
>> I haven't heard anyone disagree with this point. Immediate semantics
>> are also (most likely) consistent with modeling languages such as UML
>> state diagrams, etc.
>
> I agree that users (including me) expect a kill/throw/reset to have
> an "immediate" effect on the target. But I have indeed been quite
> surprised that this extends to these operations being "blocking", i.e.
> that they are obliged to be completed when the calls return.
>
> - SystemC has simultaneity in terms of simulation time (and deltas)
> and is scheduled explicitly with a few modelling constructs. Why
> add implicit exceptions for kill/throw/reset?
>
> - sc_spawn is not "immediate" in this strong sense. The new process
> is only scheduled to run in the current evaluation cycle, rather
> than having already started when sc_spawn returns.
>
> - Usually, synchronous cancellation has to be set up explicitly
> (cf. pthread_cancel).
>
> In short: I would have expected an effect in the same evaluation
> cycle without further guarantees. If a user is interested in the
> completion of a kill request, why not use a wait(h.terminated_event()) for this?
> Maybe a reset_event() could be useful as well.
>
> Moreover, in the currently available OSCI-internal implementation,
> kills issued from within a method process are not synchronised and
> return before the kill/reset/throw has had any effect (see attached
> example). Otherwise, a method process could suddenly be blocked in
> the middle of its execution, which is certainly more surprising. So
> that's an inconsistency (or a bug in the implementation).
>
>> I can think of numerous semantic issues if we don't have immediate
>> semantics - e.g. what happens if two perpetrators try to throw an
>> exception in a victim process in the same evaluation cycle. Do we
>> guarantee which throw wins and which one loses? Do we somehow
>> notify the loser that he has "lost"?
>
> Since the order of process evaluation is unspecified as of now,
> there is no need for any further guarantee here. My proposal would be
> to execute the requests in any order and react accordingly as
> currently defined (e.g. the second of two kills is a no-op).
> The only difference implementation-wise would be that a queue (or
> any other container) of pending requests would be needed. Since these
> cases are quite likely modelling errors, we could consider marking
> them as errors as well, though.
>
> Because method processes have their kills delayed (see above), we
> might already need to handle these corner cases anyhow. Otherwise we
> would have to deal with preempted method processes, which I rather
> dislike and which may be difficult to implement.
>
> [snip]
>>> |-----Original Message-----
>>> |From: Bishnupriya Bhattacharya
> [snip]
>
>>> |In internal discussions with Stuart and Mac, we concluded that the
>>> |existing mechanism whereby the target has the provision of catching
>>> |sc_unwind_exception (previously called sc_kill_exception) and
>>> |performing any necessary clean-up, is adequate to address the
>>> |requirement of the target being able to leave things in a clean state.
>
> Agreed. Two points on this:
>
> Is an implementation obliged to detect whether an application does
> not conform to the "rethrow and no scheduling" requirement? This is
> currently not defined in the proposal.
>
> Secondly, due to the obligation to avoid any scheduling during
> unwinding caused by a kill or reset, there's the need to detect this
> case from within destructors.
>
> As an example, we have proxy objects doing protocol/synchronisation
> work during construction and destruction. So there are cases where
> SystemC scheduler calls are made inside a destructor. To avoid this
> during unwinding, something like
>
> bool sc_is_unwinding() {
>   return sc_get_current_process_handle().is_unwinding();
> }
>
> would be needed, since one can't catch the current exception within
> the destructor during stack unwinding. With std::uncaught_exception a
> similar thing exists in the C++ standard library for related situations.
>
> Thanks,
> Philipp
>

--
Philipp A. Hartmann
Hardware/Software Design Methodology Group
OFFIS
R&D Division Transportation | FuE-Bereich Verkehr
Escherweg 2 * 26121 Oldenburg * Germany
Phone/Fax: +49-441-9722-420/282 * PGP: 0x9161A5C0 * http://www.offis.de/
Received on Wed Sep 1 10:14:08 2010