Archive for the ‘Concepts’ Category
An Interrupt Service Routine (ISR) executes in reaction to an asynchronous hardware request, interrupting the ongoing computation in the CPU.
As an example, in an Arduino, whenever the USART subsystem receives a byte from the serial line, the CPU execution is redirected to the “USART_RX interrupt vector”, which is a predefined memory address containing the ISR to handle the byte received.
Only after the ISR returns does the interrupted computation resume.
ISRs are often associated with a high-priority functionality that cannot wait long.
Continuing the USART example, if the execution of the ISR is delayed too much, some received bytes can be lost.
Likewise, the execution of an ISR should never take long, because other interrupts will not trigger in the meantime (although it is possible to nest ISRs).
For this reason, a typical USART ISR simply stores received bytes in a buffer so that the program can handle them afterwards.
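The buffering idea can be sketched in plain C as a small circular buffer. This is only a host-side illustration under my own assumptions: the names `RX_SIZE`, `rx_isr_store` and `rx_read` are hypothetical, and on a real microcontroller `rx_isr_store` would be called from the receive ISR while `rx_read` would run with interrupts briefly disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_SIZE 8u                     /* hypothetical capacity; one slot stays empty */

static volatile uint8_t rx_buf[RX_SIZE];
static volatile uint8_t rx_get = 0;    /* index of the oldest stored byte */
static volatile uint8_t rx_put = 0;    /* index of the next free slot */

/* Intended for the ISR: store the byte, or drop it when the buffer is full. */
bool rx_isr_store(uint8_t c) {
    uint8_t next = (uint8_t)((rx_put + 1u) % RX_SIZE);
    if (next == rx_get) {
        return false;                  /* full: this byte is lost */
    }
    rx_buf[rx_put] = c;
    rx_put = next;
    return true;
}

/* Intended for the main program: fetch the oldest byte, if there is one. */
bool rx_read(uint8_t *out) {
    assert(out != NULL);
    if (rx_get == rx_put) {
        return false;                  /* empty */
    }
    /* On real hardware, interrupts would be disabled around these two lines. */
    *out = rx_buf[rx_get];
    rx_get = (uint8_t)((rx_get + 1u) % RX_SIZE);
    return true;
}
```

The ISR side does a constant amount of work, which is the point: the slow part (interpreting the bytes) is left to whoever calls `rx_read` later.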
ISRs in Céu:
Céu has primitive support for ISRs, which are declared similarly to functions.
However, instead of a name identifier, an ISR declaration requires a number that refers to the index in the interrupt vector for the specific platform.
When an interrupt occurs, not only does the ISR execute, but Céu also enqueues the predefined event OS_INTERRUPT, passing the ISR index.
This mechanism allows the time-critical operation associated with the interrupt to be handled in the ISR, but encourages non-critical operations to be postponed and to respect the event queue, which might already be holding events that occurred before the interrupt.
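To make the queueing behaviour concrete, here is a rough C sketch of the pattern, not of the actual Céu runtime. Everything in it (`event_t`, `EV_TIMER`, `EV_INTERRUPT`, `enqueue`, `dequeue`, `isr`) is a name I made up for the illustration. The stand-in ISR does its urgent work immediately and only enqueues a notification, so an event that was already waiting is still handled first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SIZE 16                      /* hypothetical queue capacity */

enum { EV_TIMER = 1, EV_INTERRUPT = 2 };   /* hypothetical event kinds */

typedef struct {
    int kind;                              /* which event occurred */
    int payload;                           /* for EV_INTERRUPT: the vector index */
} event_t;

static event_t queue[QUEUE_SIZE];
static int q_head = 0;
static int q_count = 0;
static int urgent_work_done = 0;

/* Append an event at the tail; fails when the queue is full. */
bool enqueue(int kind, int payload) {
    if (q_count == QUEUE_SIZE) return false;
    queue[(q_head + q_count) % QUEUE_SIZE] = (event_t){ kind, payload };
    q_count++;
    return true;
}

/* Remove the oldest event; fails when the queue is empty. */
bool dequeue(event_t *ev) {
    assert(ev != NULL);
    if (q_count == 0) return false;
    *ev = queue[q_head];
    q_head = (q_head + 1) % QUEUE_SIZE;
    q_count--;
    return true;
}

/* Stand-in for an ISR at vector `index`: urgent part now, the rest deferred. */
void isr(int index) {
    urgent_work_done++;                    /* time-critical part runs immediately */
    (void)enqueue(EV_INTERRUPT, index);    /* non-critical part waits its turn */
}
```

If a timer event is enqueued and then `isr(20)` fires, the main loop still dequeues the timer event before the interrupt notification.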
The code snippet that follows is part of a USART driver for the Arduino.
The driver emits a READ output event to signal a received byte to other applications (i.e. they are awaiting READ).
The ISR just holds incoming bytes in a queue, while the main body is responsible for signaling each byte to all applications (at a lower priority).
    /* variables to manage the buffer */
    var byte[SZ] rxs;       // buffer to hold received bytes
    var u8 rx_get;          // position to get the oldest byte
    var u8 rx_put;          // position to put the newest byte
    atomic do
        rx_get = 0;         // initialize get/put
        rx_put = 0;         // the `atomic` block disables interrupts
    end

    /* ISR for receiving byte (index "20" in the manual) */
    function isr do
        var u8 put = (rx_put + 1) % SZ;     // next position
        var byte c = _UDR0;                 // receive the byte
        if put != rx_get then               // check buffer space
            rxs[rx_put] = c;                // save the received byte
            rx_put = put;                   // update to the next position
        end
    end

    /* DRIVER body: receive bytes in a loop */
    output byte READ;       // the driver outputs received bytes to applications
    loop do
        var int idx = await OS_INTERRUPT
                      until idx==20;        // USART0_RX_vect
        var byte c;                         // hold the received byte
        ...
        atomic do                           // protect the buffer manipulation from new interrupts
            c = rxs[rx_get];                // get the next byte
            rx_get = (rx_get + 1) % SZ;     // update to the next position
        end
        emit READ => c;                     // signal other applications
        ...
    end
Note how the real-time/high-priority code to store received bytes in the buffer runs in the ISR, while the code that processes the buffer and signals other applications runs in the body of the driver after every occurrence of OS_INTERRUPT.
Given that ISRs share data with and abruptly interrupt the normal execution body, some sort of synchronization between them is necessary.
As a matter of fact, Céu tracks all variables that ISRs access and enforces that all other accesses (outside them, in the normal execution body) are protected with atomic blocks.
Céu provides primitive support for handling interrupt requests:
- An ISR is declared similarly to a function, but specifies the interrupt vector index associated with it.
- An ISR should only execute hard real-time operations, leaving lower priority operations to be handled in reaction to the associated OS_INTERRUPT event.
- The static analysis enforces the use of atomic blocks for memory shared between ISRs and the normal execution body.
This is the follow up to the previous post on creating dynamic applications in Céu.
As introduced there, organisms are the main abstraction mechanism in Céu, reconciling objects and lines of execution in a single construct. The example below, extracted from the previous post, creates two instances of a class that prints the “Hello World!” message every second:
    class HelloWorld with
        var int id;
    do
        every 1s do
            _printf("[%d] Hello world!\n", this.id);
        end
    end

    var HelloWorld hello1, hello2;
    hello1.id = 1;
    hello2.id = 2;
    await FOREVER;

One thing that stands out in the code is how the organisms hello1 and hello2 are declared as local variables, using the same syntax as a normal variable declaration (e.g. “var int x=0”). This implies that an organism follows the standard scoping rules of conventional imperative languages, i.e., its memory is reclaimed when its enclosing block goes out of scope. Additionally, all of its lines of execution are seamlessly killed.
Céu supports three different ways to manage the life cycle of organisms automatically: local variables, anonymous allocations, and named allocations.
The simplest way to manage organisms is to declare them as local variables, as shown in the previous example. Since the two organisms in that example never go out of scope, let’s add an explicit block that declares hello2 so that it goes out of scope after 5 seconds:

    var HelloWorld hello1;
    hello1.id = 1;
    do
        var HelloWorld hello2;
        hello2.id = 2;
        await 5s;
    end
    await FOREVER;

The organism hello2 runs in parallel with the do-end construct and is active while the block is in scope. After 5 seconds, the block goes out of scope and kills hello2, also reclaiming all its memory. As an outcome, the message “[2] Hello world!” stops showing up in the video.
The order of execution between blocks and organisms is deterministic: code inside a block executes before the organisms declared inside it. This way, the do-end has priority over hello1 (they are both in the top-level block), while await 5s has priority over hello2 (they are both inside the do-end). The order in which the messages appear in the video reflects this: hello2 always awakes before hello1. Also, hello2 is killed before printing for the 5th time, because the await 5s has higher priority and terminates the block before hello2 has the chance to execute one last time.
Regarding memory usage, a local organism has an associated slot in the “stack” of its enclosing block, which is calculated at compile time. Blocks in sequence can share memory slots. Local organisms do not incur overhead for calls to malloc/free.
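As a loose analogy in C (this is not what the Céu compiler actually emits), slot sharing between blocks in sequence behaves like a union, while blocks that are alive at the same time need a struct holding all their slots. The record types and function names below are hypothetical, and the parallel case is my own contrast, not something stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state records for organisms declared in two different blocks. */
typedef struct { int id; int counter; } hello_state_t;
typedef struct { int id; double position[4]; } sprite_state_t;

/* Blocks in sequence are never alive together, so their slots may overlap. */
typedef union {
    hello_state_t  first_block;
    sprite_state_t second_block;
} sequential_slots_t;

/* Blocks alive at the same time need both slots at once (assumed contrast). */
typedef struct {
    hello_state_t  left;
    sprite_state_t right;
} parallel_slots_t;

/* Both sizes are compile-time constants: no malloc/free is involved. */
size_t sequential_size(void) { return sizeof(sequential_slots_t); }
size_t parallel_size(void)   { return sizeof(parallel_slots_t); }

/* Bytes saved by overlapping the two slots instead of laying them side by side. */
size_t bytes_saved(void) {
    assert(parallel_size() >= sequential_size());
    return parallel_size() - sequential_size();
}
```

The useful property is that both numbers are known when the program is compiled, which is what lets local organisms avoid dynamic allocation altogether.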
True dynamic applications often need to create an arbitrary number of entities while the application is running. Céu supports a spawn command to dynamically create and execute a new organism:
    do
        var int i = 1;
        every 1s do
            spawn HelloWorld with
                this.id = i;
            end;
            i = i + 1;
        end
    end
We now spawn a new organism every second passing a different id in the constructor (the code between with-end). We can see in the video that every second a new organism starts printing its message on the screen. Again, the printing order is deterministic and never changes.
Note that the spawned organisms are anonymous, because there’s no way to refer to them after they are created. Anonymous organisms are useful when the interaction with the block that creates them happens only in the constructor, as in the example above.
Note also that we use an enclosing do-end that apparently serves no purpose in the code. However, in order to also provide seamless memory reclamation for anonymous organisms, a spawn statement must be enclosed by a do-end that defines the scope of its instances. This way, when the do-end block goes out of scope, all organisms are reclaimed in the same way local organisms are.
In the next example, we surround the previous code with a par/or that restarts the outer loop after 5 seconds:
    loop do
        par/or do
            await 5s;
        with
            do
                var int i = 1;
                every 1s do
                    spawn HelloWorld with
                        this.id = i;
                    end;
                    i = i + 1;
                end
            end
        end
    end

Now, after creating new instances for 5 seconds, the par/or terminates the do-end scope and all organisms are killed. The loop makes this pattern execute continuously.
An anonymous organism can also be safely reclaimed when its body terminates, given that no one can refer to it or access its fields.
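The two reclamation rules for anonymous organisms (when the enclosing block ends, or earlier when the body terminates) can be mimicked in C with a scope that owns a list of instances. This is only a sketch of the idea with made-up names (`scope_t`, `scope_spawn`, `scope_terminate`, `scope_exit`); I am not claiming the Céu runtime is implemented this way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct instance {
    int id;
    struct instance *next;
} instance_t;

typedef struct {
    instance_t *head;   /* instances spawned inside this scope */
    int live;           /* how many of them are still allocated */
} scope_t;

/* spawn: allocate an instance and tie its lifetime to `scope`.
   The returned pointer is for the runtime's bookkeeping only. */
instance_t *scope_spawn(scope_t *scope, int id) {
    assert(scope != NULL);
    instance_t *inst = malloc(sizeof *inst);
    if (inst == NULL) return NULL;
    inst->id = id;
    inst->next = scope->head;
    scope->head = inst;
    scope->live++;
    return inst;
}

/* An instance whose body has terminated can be reclaimed right away. */
void scope_terminate(scope_t *scope, instance_t *inst) {
    assert(scope != NULL);
    for (instance_t **p = &scope->head; *p != NULL; p = &(*p)->next) {
        if (*p == inst) {
            *p = inst->next;    /* unlink, then free */
            free(inst);
            scope->live--;
            return;
        }
    }
}

/* Leaving the block reclaims every instance that is still alive. */
void scope_exit(scope_t *scope) {
    assert(scope != NULL);
    while (scope->head != NULL) {
        instance_t *next = scope->head->next;
        free(scope->head);
        scope->head = next;
    }
    scope->live = 0;
}
```

In Céu none of these calls are written by hand: the do-end plays the role of `scope_exit`, which is exactly why the enclosing block is mandatory.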
The most flexible way to deal with dynamic organisms is through the new statement, which not only spawns a new organism but also returns a reference to it:
    do
        var HelloWorld* hello = new HelloWorld;
        hello:id = 1;
        ...     // some more code
    end
In the example, the returned reference is assigned to the variable hello, which is of type HelloWorld* (a pointer to the class). The organism can be manipulated through the colon operator (:), which is equivalent to the arrow operator in C (->).
A named organism is automatically reclaimed when the block holding the pointer it was first assigned to goes out of scope. In the example, when the do-end block in which the hello pointer is declared goes out of scope, the referred instance is reclaimed.
For safety reasons, Céu does not allow a pointer to “escape” to an outer block. Without this precaution, a reference could last longer than the organism it points to, yielding a dangling pointer in the program. In the following example, both the assignment to outer and the call to _cfunc are refused, given that their scopes are broader than that of the variable hello:

    var HelloWorld* outer;
    do
        var HelloWorld* hello = new HelloWorld;
        hello:id = 1;
        outer = hello;      // this assignment is refused at compile time
        _cfunc(hello);      // this call is refused at compile time
        ...                 // some more code
    end
In order to compile this code, we need to include finalizers to properly handle the organism going out of scope:
    var HelloWorld* outer;
    do
        var HelloWorld* hello = new HelloWorld;
        hello:id = 1;
        finalize
            outer = hello;      // outer > hello
        with
            ...                 // this code is executed just before do-end goes out of scope
        end
        _cfunc(hello)           // _cfunc > hello (_cfunc is a global function)
            finalize with
                ...             // this code is executed just before do-end goes out of scope
            end;
    end
A finalize block is tied to the scope of the dangerous pointer and gets executed automatically just before the associated block terminates. This way, programmers have means to handle the organism being reclaimed in a safe way.
Céu supports three different ways to deal with dynamic allocation of organisms:
- Local organisms should be used when the number of instances is known a priori.
- Anonymous allocations should be used when the number of instances is arbitrary and the program only interacts with them at instantiation time.
- Named allocations are the most flexible, but should only be used when the first two methods don’t apply.
In all cases, the reclamation of memory and trails is handled seamlessly, without any programming effort.
In practice, given that organisms have lines of execution and can react to the environment by themselves, anonymous organisms should be preferred over named organisms in order to avoid dealing with references explicitly.
In the next post, I’ll show a simple evaluation of the runtime costs of organisms in Céu.
The basic prerequisite to build dynamic applications is language support to deal with abstractions and code reuse. Programming languages provide a multitude of abstraction mechanisms, from simple abstract data types, to OO classes. Regarding an abstraction, an effective mechanism should provide means to deal with at least the following points:
- Hide its internal implementation details.
- Expose a uniform programming interface to manipulate it.
- Control its life cycle.
As an example, to build an ADT in C, one can define a struct, hide it with a typedef, expose functions to manipulate it, and control instances with local variables or malloc/free. Classes extend ADTs with richer mechanisms such as inheritance and polymorphism. Furthermore, the life cycle of an object is typically controlled automatically through a garbage collector.
Abstractions in Céu are created through organisms, which basically reconcile threads and objects into a single concept:
- An organism has intrinsic execution, being able to react to the environment on its own.
- An organism exposes properties and actions in order to interact with other organisms during its life cycle.
Like an object, an organism exposes properties and methods (events in Céu) that can be accessed and invoked (emitted in Céu) by other instances. Like a thread, an organism has its own line(s) of execution, with persistent local variables and execution state.
In contrast, an object method call typically shares the same execution context with its calling method. Likewise, a thread does not expose fields or methods.
The program below defines the class HelloWorld and executes two instances of it:
    class HelloWorld with
        var int id;         // organism interface
    do                      // organism body
        every 1s do
            _printf("[%d] Hello world!\n", this.id);
        end
    end

    var HelloWorld hello1, hello2;
    hello1.id = 1;
    hello2.id = 2;
    await FOREVER;
The behavior can be visualized in the video on the right. The top-level code creates two instances of the class HelloWorld, initializes the exposed id fields, and then awaits forever. As organisms have “life”, the two instances react to the environment autonomously, printing the “Hello world!” message every second.
Note in the example that organisms are simply declared as normal variables, and the language runtime automatically spawns them to execute in parallel with their enclosing block.
In the following variation, we add the event stop in the class interface and include another line of execution in the organism body:
    class HelloWorld with
        var int id;
        event void stop;
    do
        par/or do
            every 1s do
                _printf("[%d] Hello world!\n", this.id);
            end
        with
            await this.stop;
        end
    end

    var HelloWorld hello1, hello2;
    hello1.id = 1;
    hello2.id = 2;

    await 3s500ms;
    emit hello1.stop;
    hello2.id = 5;

    await 2s;
    emit hello2.stop;

    await FOREVER;

Now, besides printing the message every second, each organism also waits for the event stop in parallel. The par/or construct splits the running line of execution in two, rejoining when either of them terminates. (Céu also provides the par/and construct.)
After the top-level code instantiates the two organisms, it waits 3s500ms before taking the actions in sequence. At this point, the program has 5 active lines of execution: 1 in the top-level and 2 for each of the instances. Each organism prints its message 3 times before the top-level awakes from 3s500ms.
Then, the top-level emits the stop event to the first organism, which awakes and terminates. It also changes the id of the second organism and waits 2s more. During this period the second organism prints its message 2 more times (now with the id 5).
Note that although the first organism terminated its body, its reference hello1 is still visible. This way, the organism is still alive and its fields can be accessed normally (but now resembling a “dead” C struct).
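For readers who want to check the counts, the timeline of this example can be replayed in plain C on a simulated clock. This is a hand-written simulation under my own assumptions, not a translation of how Céu schedules trails: `hello_t`, `hello_tick`, `hello_stop` and `run_script` are hypothetical names, and the order of the two prints within the same second is an arbitrary choice of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    int id;          /* exposed field, like `var int id` */
    bool running;    /* becomes false once `stop` is received */
    int printed;     /* bookkeeping for this sketch only */
} hello_t;

/* Reaction to a 1s tick: counterpart of the `every 1s` trail. */
static void hello_tick(hello_t *h) {
    if (h->running) {
        printf("[%d] Hello world!\n", h->id);
        h->printed++;
    }
}

/* Reaction to `stop`: the par/or rejoins, so the body is finished. */
static void hello_stop(hello_t *h) {
    h->running = false;
}

/* Replays the top-level script, advancing simulated time in 500 ms steps. */
void run_script(hello_t *hello1, hello_t *hello2) {
    assert(hello1 != NULL && hello2 != NULL);
    for (int now = 500; now <= 6000; now += 500) {
        if (now % 1000 == 0) {            /* every 1s */
            hello_tick(hello1);
            hello_tick(hello2);
        }
        if (now == 3500) {                /* await 3s500ms */
            hello_stop(hello1);
            hello2->id = 5;
        }
        if (now == 5500) {                /* await 2s more */
            hello_stop(hello2);
        }
    }
}
```

Starting from `{1, true, 0}` and `{2, true, 0}`, the first record ends with 3 prints and the second with 5 (three with id 2, two with id 5). The `hello_t` records stay readable after `hello_stop`, much like the “dead” C struct mentioned above.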
Lines of execution in Céu are known as trails and differ from threads in the very fundamental characteristic of how they are scheduled.
Céu is a synchronous language based on Esterel, in which lines of execution advance together with a unique global notion of time.
In practical terms, this means that Céu can provide seamless lock-free shared-memory concurrency. It also means that programs are deterministic and have reproducible execution. As a tradeoff, concurrency in Céu is not suitable for algorithmic-intensive activities as there is no automatic preemption among trails.
In contrast, asynchronous models have time independence among lines of execution, but either require synchronization primitives to acquire shared resources (e.g. locks and semaphores in pthreads), or completely forbid shared access in favor of message passing (e.g. processes and channels in actor-based languages). In both cases, ensuring deterministic execution requires considerable programming effort.
The post entitled “The case for synchronous concurrency” illustrates these differences in practical terms with an example.
Céu organisms reconcile objects and threads in a single abstraction mechanism.
Classes specify the behavior of organisms, hiding implementation details and exposing an interface in which they can be manipulated by other organisms.
In the next post, I’ll show how Céu can control the life cycle of organisms with lexical scope in three different ways: local variables, named allocation, and anonymous allocation.
I’m about to release version 0.4 of Céu (already available on GitHub and in a dedicated VM).
The language has had many improvements, but the addition of a “class” system was by far the most significant.
A powerful abstraction functionality is a must to augment the scope of the language from constrained embedded systems to mobile and desktop platforms.
I’ve been playing with Céu + SDL for a while and came up with this idea of mixing objects (interface+explicit state) with trails (subprograms+implicit state) into a single functionality (an “organism”).
The videos that follow go through the most important concepts of Céu, starting from the basics of synchronous parallel compositions, up to the class system with scoped objects that are automatically reclaimed by the language run-time.
Céu with SDL
Céu dynamic organisms
 http://www.ceu-lang.org/wiki/index.php?title=C%C3%A9u_in_a_Box (see HISTORY)