The Synchronous Blog

A blog about reactive programming languages.

Posts Tagged ‘embedded-systems

A Synchronous Micro Kernel for Embedded Systems

with 2 comments

I’m working on a new operating system that follows the synchronous execution model of Céu.
In reality, the term micro kernel would be more accurate: I’m only concerned with scheduling and inter-process communication mechanisms.

Why a new micro kernel?

Many reasons. :)

General frustrations

(Accumulated over the years of using UNIX-based systems.)


We are still using select or threads to communicate with multiple applications at the same time:

  • select: multiplex all communication in a single point and deal with state explicitly (e.g., switch (state) {…}).
  • threads: decentralize the communication in multiple threads and deal with synchronization explicitly (e.g., locks, semaphores).

Both approaches are complex and error prone (threads, select/events).


It is difficult, if not impossible, to reproduce the behavior of a set of interacting applications (even of a single application):

  • Preemptive schedulers are highly sensitive to small timing variations.
  • Most programming languages offer non-deterministic concurrency models only (e.g. threads, actors).

OS/Language integration:

I’m not sure if the “everything is a file”  philosophy of UNIX-based systems is too generalist and language agnostic.
Maybe operating systems and programming languages should be designed in conjunction, with the latter offering first-class OS-level IPC mechanisms.

Particular demands

I also have particular demands related to ongoing projects in our research group.

Micro platforms:

We work with highly-constrained embedded platforms (e.g., 16MHz, 64Kb FLASH, 4Kb RAM).

With such little resources, we can take advantage of the synchronous concurrency model to be more economic:

  • Processes share the same event queue.
  • Processes share the same stack.
  • Cooperative scheduling instead of preemptive (together with language guarantees that processes do cooperate).

Distributed WSNs:

Wireless Sensor Networks demand more functionality from operating systems:

  • Processes that communicate might not be on the same network node.
  • Applications need to be replaced remotely.


The video that follows shows a working demo of the OS:

(To see the texts in the video, please use a resolution of at least 480p in the “settings” icon.)

Together with the kernel, the microcontroller is preloaded with three processes:

  • An USART driver.
  • A shell that allows us to start/stop applications.
  • An USART<=>shell bridge that stores received bytes from the serial line until the command is complete.

These processes have no special rights, they are only preloaded so that we can command the microcontroller remotely.

In the first demo (0s-35s), after booting the microcontroller, we issue a sequence of commands through the terminal:

  • Upload a blinking application to address 0xF000.
  • Load address 0xF000 as process 0.
  • Start process 0.
  • Stop process 0 after some time.

Then, with the OS still up and running, we upload a similar application (it only blinks a different LED) to the same address and repeat the sequence of commands. The visual result is to see the blinking LED to change from red to yellow.

The second demo (35s-1min14s) issues another sequence of commands to upload and start three other applications:

  • A GPIO driver.
  • Two blinking applications [code1,code2].

Now, while the first application executes and blinks the yellow LED, we upload another application to blink the red LED.

Both applications rely on the the GPIO driver and avoid to duplicate the GPIO functions (as happens in the first demo).

The link command connects the applications with the GPIO driver. For example, to link the output call PIN_MODE of application 1 with the input call PIN_MODE of the driver (application 0), we write <link 1 1 0 243> (i.e., link app 1 / output 1 => app 0 / input 243).

(Sorry, event identifiers must be “hardcoded” as numbers instead of names.)

Check also the code for the USART driver, shell, USART<=>shell bridge, and GPIO driver.

Details in another post to come.


Written by francisco

March 11, 2014 at 2:50 pm

Dynamic Memory Management in Embedded Systems

with 4 comments

Dynamic functionality in embedded systems is usually discouraged due to resource constraints.
However, some types of applications inherently require memory allocation.

As an example, protocols in sensor networks typically forward messages through nodes at a non-deterministic rate, given that the number of neighbors and transmission periods can vary.
Hence, many protocols require dynamic memory management to hold receiving messages until they are successfully forwarded.

A simple FIFO queue might not be always optimal because forwarding a message may involve multiple steps with delays (e.g. transmission acknowledgments).
In such scenario, the protocol would rather handle multiple messages at the same time, raising the possibility of a message received later be discarded first.

Unfortunately, out-of-the-box dynamic memory schemes, such as malloc/free, are not suitable for embedded systems which have quite different requirements in comparison to standard desktop systems.

Follows a list of issues concerning memory management schemes in embedded systems:

  1. Memory corruption
    Many embedded systems lack memory protection, and continuous allocations in the heap may end up corrupting the stack (and vice versa).
  2. Run-time overhead
    Memory management requires extra run-time bookkeeping. Also, in the context of embedded systems, a predictable execution model can be even more important than the fastest scheme on the average.
  3. Metadata overhead
    Metadata used by the memory manager can spend precious bytes (e.g. linked lists of free blocks).
  4. Memory fragmentation
    For constrained memory platforms, unusable holes between and inside allocated blocks (external and internal fragmentation, respectively) can waste a big percentage of available memory.
  5. Unreproducible execution
    Successive executions of the same program may allocate memory in different ways, possibly leading to different outcomes (e.g. an allocation fail).
  6. Deallocation hazards
    Properly deallocating memory is far from trivial. A missed deallocation leads to a memory leak that wastes memory, while deallocating a memory block still in use leads to a dangling pointer that will eventually crash the application.

As the C standard is loose about these issues, out-of-the-box malloc/free can perform bad in all items.
Furthermore, deallocation hazards are inherent in schemes that require an explicit free operation.

Garbage collected systems eliminate deallocation hazards, but may incur unacceptable run-time overheads.

Memory Pools

Embedded systems usually rely on memory pools to manage dynamic memory.
In the context of sensor networks, both TinyOS and Coniki OSes offer and promote the use of memory pools (through Pool and MEMB, respectively).

A memory pool allocates N predefined fixed-sized blocks of memory that can be used by the application.
Most of the raised concerns are alleviated with this scheme:

  1. Because the allocation is static, the maximum amount of memory is known at compile time, reducing considerably the risk of memory corruption.
  2. The run-time overhead is minimal as implementations use simple arrays to hold the memory blocks. For instance, in the TinyOS implementation both allocation and deallocation are O(1).
  3. TinyOS’ Pools use an auxiliary vector of size N to hold pointers for free blocks.
  4. Regardless of different allocation patterns in applications, memory pools will always guarantee the minimal N of memory blocks. Hence, external fragmentation is non-existent. Internal fragmentation, however, can be an issue and is discussed below.
  5. Given that the memory operations are simple and handle fixed-size blocks, the execution is always deterministic and predictable.
  6. Memory pools are still manipulated through malloc/free-like operations. Hence, all challenges to properly deallocate memory still hold.

Internal fragmentation occurs when an allocated memory block is bigger than the requested size.
Given that memory pools can only handle fixed-size blocks, any allocation that requests smaller blocks will contain internal fragmentation (allocation of bigger blocks always fail).
That said, embedded applications are usually simple and contain only one or two different object units that require dynamic allocation.
This way, an application will use a different memory pool for each kind of object, thus also eliminating internal fragmentation.


Memory pools are the way to go for dynamic allocation in embedded systems.

They offer memory compactness, efficient and small operations, and predictable execution.

However, programming dynamic applications is still hard and error prone, given that missing and wrong deallocations may lead to memory leaks and subtle crashes.

In the next post I will show how Céu offers safer and higher level mechanisms for dynamic applications, while using memory pools transparently under the hoods.

Written by francisco

May 20, 2013 at 7:45 pm

“Céu: Embedded, Safe, and Reactive Programming”

with one comment

We have published a technical report entitled “Céu: Embedded, Safe, and Reactive Programming”.

Enjoy the reading!


Céu is a programming language that unifies the features found in dataflow and imperative synchronous reactive languages, offering a high-level and safe alternative to event-driven and multithreaded systems for embedded systems.

Céu supports concurrent lines of execution that run in time steps and are allowed to share variables. However, the synchronous and static nature of Céu enables a compile time analysis that can enforce deterministic and memory-safe programs.

Céu also introduces first-class support for “wall-clock” time (i.e. time from the real world), and offers seamless integration with C and simulation of programs in the language itself.

The Céu compiler generates single-threaded code comparable to handcrafted C programs in terms of size and portability.

Table of Contents:

  1. Introduction
  2. The Language Céu
    1. Parallel compositions
    2. Internal events & Dataflow support
    3. Wall-clock time
    4. Integration with C
    5. Bounded execution
    6. Determinism
    7. Asynchronous execution
    8. Simulation in Céu
    9. GALS execution
  3. Demo applications
    1. WSN ring
    2. Arduino ship game
    3. SDL game simulation
  4. Implementation of Céu
    1. Temporal analysis
    2. Memory layout
    3. Gate allocation
    4. Code generation
    5. Reactive execution
    6. Evaluation
  5. Related work
    1. Synchronous model
    2. Asynchronous model
  6. Conclusion

[ANN] Céu in a Box

leave a comment »

Céu in a Box (CiB) is a distribution of Xubuntu 12.04 [1] that comes pre-installed with the Céu Programming Language environment [2].

The distribution contains the compiler together with bindings for the Arduino and TinyOS target platforms.

CiB is distributed as a single .ova file to be used wit VirtualBox [3].

Written by francisco

July 9, 2012 at 11:54 am

Céu + Arduino

leave a comment »

A couple of months ago I acquired an Arduino Uno board [1] and started working on integrating it with Céu [2].

In this blog post, I depict a demo game that uses an LCD shield on the Arduino.

In the game, the player controls a “ship” that moves on screen and has to avoid collisions with meteors.

The video shows the game executing:

The game gets faster and more meteors appear on each phase. When the ship hits a meteor, the game restarts.

The code

The outermost loop of the game runs forever and restarts the game on every new phase or on “game over”:

    1: loop do
 2-12:     // CODE 1: set game attributes
   14:     _map_generate();
   15:     _redraw(step, ship, points);
   16:     await Key; // starting key
   18:     win =
19-45:         // CODE 2: the central loop
47-60:     // CODE 3: game over
   61: end


We split the remaining code in three blocks (to be expanded further):

  • CODE 1: sets (or resets) the game attributes, such as the ship position, speed, and game points
  • CODE 2: executes the main logic of the game, such as moving the ship and detecting collisions
  • CODE 3: executes the after game code, based on the result of “CODE 2” (win or !win)

Every time the outermost loop is executed (lines 1-61), it resets the game attributes (CODE 1, lines 2-12), generates a new map, and redraws it on screen (lines 14-15).
Then, it waits for a starting key (line 16), and executes the main logic of the game in the central loop (CODE 2, lines 18-45) until the ship reaches the finish line or collides with a meteor. Based on the return status (line 18), the “game over” code (CODE 3, lines 47-60) may display an animation before restarting the game.

The game attributes (CODE 1) change depending on the result of the previous iteration of the outermost loop:

     // CODE 1: set game attributes
 2:  ship = 0;        // 1st LCD row
 3:  if !win then
 4:      dt = 500;    // game speed (500ms/step)
 5:      step = 0;    // current step
 6:      points = 0;  // number of steps alive
 7:  else
 8:      step = 0;
 9:      if dt > 100 then
10:          dt = dt - 50;
11:      end
12:  end


For the first game execution and whenever the ship collides with a meteor, variable `win´ is set to 0 (we omitted global declarations), hence, the attributes are reset to their initial values (lines 4-6). Otherwise, if the player reached the finish line (win=1), then the game gets faster, keeping the current points (lines 8-11).

The central loop of the game (CODE 2) is responsible for moving the ship as time elapses and for checking whether the ship reached the finish line or collided with a meteor:

     // CODE 2: the central loop
19:  par do
20:     loop do
21:        await(dt)ms;
22:        step = step + 1;
23:        _redraw(step, ship, points);
25:        if _MAP[ship][step] == '#' then
26:           return 0;  // a collision
27:        end
29:        if step == _FINISH then
30:           return 1;  // finish line
31:        end
33:        points = points + 1;
34:     end
35:  with
36:     loop do
37:        int key = await Key;
38:        if key == _KEY_UP then
39:           ship = 0;
40:        end
41:        if key == _KEY_DOWN then
42:           ship = 1;
43:        end
44:     end
45:  end;


The central loop is actually split in two other loops in parallel: one to run the game steps (lines 20-34), and the other to handle input from the player (lines 36-44).

The game steps run periodically, depending on the current speed of the game (line 21). For each loop iteration, the step is incremented and the screen is redrawn (lines 22-23).
Then, the ship is checked for collision with meteors (lines 25-27), and with the finish line (lines 29-31).
Céu supports returning from blocks with an assignment, hence, lines 26 and 30 escape the whole par and assign to the `win´ variable in the outer loop (line 18).
The points are incremented before each iteration of the loop (line 33).

To handle input events, we wait for key presses in a loop (line 37) and change the ship position accordingly (lines 39, 42).
Note that there are no possible race conditions on variable `ship´ because the two loops in the par statement react to different events (i.e. wall-clock time and keys).

After returning from the central loop, we run the code for the “game over” behavior, which executes an animation if the ship collided with a meteor:

     // CODE 3: game over
47:  par/or do
48:     await Key;
49:  with
50:     if !win then
51:        loop do
52:           await 100ms;
53:           _lcd.setCursor(0, ship);
54:           _lcd.write('');
58:        end
59:     end
60:  end


The animation loop (lines 51-58) continuously displays the ship in the two directions, suggesting that it has hit a meteor.
The animation is interrupted when the player presses a key (line 48), proceeding to the game restart.

Finally, we need to generate the key events to the program itself. As we use a third-party push-button component, the default Arduino binding does not provide event handling for it.
We place the whole program in parallel with the input key generator:

   0:  par do
1-61:     // CODE FOR THE GAME
  62:  with
  63:     int key = _KEY_NONE;
  64:     loop do
  65:        int read1 = _analog2key(_analogRead(0));
  66:        await 50ms;
  67:        int read2 = _analog2key(_analogRead(0));
  68:        if read1==read2 && key!=read1 then
  69:           key = read1;
  70:           if key != _KEY_NONE then
  71:              async do
  72:                 emit Key = read1;
  73:              end
  74:           end
  75:        end
  76:     end
  77:  end


The code samples data of an analog port with a delay of 50ms to avoid bouncing (lines 65-67).
If two consecutive reads point to the same key and they are different from the previous change (line 68), then we change the key (line 69) and generate a new event (in the case of a key press, lines 70-74).
The `async´ block is mandatory for generating input events to the program.


Arduino is an interesting target platform for Céu, given its memory constraints and lack of a high-level programming alternative.

The demo shows how complementary activities can be written in separate to run in parallel, reducing complexity.
The code does not become polluted with tons of globals for controlling state, easing the code maintenance.

The complete source code is around 170 lines and also includes all C definitions to generate the map, redraw the scene on the LCD, etc.



Written by francisco

July 8, 2012 at 10:25 pm

Posted in Examples

Tagged with , , ,

Céu + Wireless Sensor Networks

with one comment

Wireless Sensor Networks (WSNs) are composed of tiny devices (known as “motes”) that sense the environment and communicate via radio. The image on the right shows two “micaz” motes.

The restricted resources of motes and the unlimited possibilities for WSNs applications made this platform an interesting target for Céu [1]. We integrated Céu with TinyOS [2] in order to use the abstracted radio services the operating system already provides.

The following demo uses a fixed ring topology with three motes placed side by side within their radio ranges.
The motes should follow the same behavior: receive a message with an integer counter, show it on the leds, wait for 1 second, increment the counter, and forward it to the mote on its right.

As the topology constitutes a ring, the counter will be incremented forever while traversing the three motes. If a mote does not receive a message within 5 seconds, it should blink the red led every 500 milliseconds until a new message is received.
The mote with id=0 is responsible for initiating the process at boot time, and also when the network is down, after 10 seconds of inactivity.

Follows the video for the application executing:


The code

Let’s start with the code that receives the message and forwards it to the next mote:

 1:  loop do
 2:      _message_t* msg = await Radio_receive;
 3:      int* cnt = _Radio_getPayload(msg);
 4:      _Leds_set(*cnt);
 5:      await 1s;
 6:      *cnt = *cnt + 1;
 7:      _Radio_setDestination(msg, (_TOS_NODE_ID+1)%3);
 8:      emit Radio_send(msg);
 9:  end


The code is an endless loop that first awaits a radio message (line 2), gets a pointer to its data region (line 3), shows the received counter on the leds (line 4), and then awaits 1s (line 5) before incrementing the counter in the message (line 6) and forwarding it to the next mote (lines 7-8).
Because this code does not handle failures, it is straight to the point and easy to follow.
Actually, this is the final code for this task, as the handler for errors is placed in a parallel trail.

To handle failures, we use a monitoring trail in parallel with the communicating trail, as shown in the following code:

 0:  par do
1-9:    // COMMUNICATING TRAIL (previous code)
10:  with
11:     loop do
12:        par/or do
13:           await 5s;
14:           par do
15:              loop do
16:                 emit retry;
17:                 await 10s;
18:              end
19:           with
20:              _Leds_set(0);
21:              loop do
22:                 _Leds_led0Toggle();
23:                await 500ms;
24:              end
25:           end
26:        with
27:           await Radio_receive;
28:        end
29:     end
30:  end


The network-down behavior constitutes the lines 13 to 25. After 5 seconds of inactivity is detected (line 13), two new activities run parallel: one that retries the communication every 10 seconds by signaling the internal event retry (lines 15-18); and another that blinks the red led every 500 milliseconds (lines 20-24).

The trick to restore the normal behavior of the network is to await the Radio_receive event (line 27) in a par/or (line 12) with the network-down behavior to kill it whenever the network link is restored. By surrounding everything with a loop (line 11), we ensure that the error detection is continuous.

Finally, we need to code the initiating/retrying process that sends the first message from the mote with id=0. As expected we place the code in parallel with the other activities, as the follows:

 0:  par do
10:   with
30:  with
31:     if _TOS_NODE_ID == 0 then
32:        loop do
33:           _message_t msg;
34:           int* cnt = _Radio_getPayload(&msg);
35:           *cnt = 1;
36:           _Radio_setDestination(&msg, 1);
37:           _Radio_setPayloadLength(&msg, sizeof);
38:           emit Radio_send(&msg);
39:           await retry;
40:        end
41:     else
42:        await Forever;
43:     end
44:  end


We start by checking if the mote has id=0 (line 31). If this is not the case, we simply await forever on this trail (line 42).
Otherwise, the loop (lines 32-40) sends the first message as soon as the mote is turned on (line 36-38).
It then waits for a retry emit (line 39) to loop and resend the initial message.


This demo shows how complementary activities in an application can be written in separate and need not to be mixed in the code.
The activities are then combined together through parallel compositions and communication via internal events to achieve the intended behavior.
The complete source code is less than 70 lines and includes all definitions and code to initialize the radio.
The generated code consumes 13.5 Kbytes of ROM and 450 bytes of SRAM.



Written by francisco

July 4, 2012 at 4:33 pm

Posted in Examples

Tagged with , , , , ,

The case for synchronous concurrency

with 4 comments

The point of this post is to show that it is wrong to rely on timing issues in preemptive multithreading for activites that require a synchronized behavior. I compare the same program specification in three different implementations.

The example program is supposed to blink two leds with different frequencies (400ms and 1000ms) on an Arduino board. They must blink together every 2 seconds.

The first two implementations use different RTOSes with preemptive multithreading. The third implementation uses the synchronous language Céu.

(UPDATE) The complete source files can be found here.


The first implementation uses the ChibiOS RTOS. I omitted all code related to creating and starting the threads (which run with the same priority).

Follows the code and video for the implementation in ChibiOS:

static msg_t Thread1(void *arg) {
    while (TRUE) {
        digitalWrite(11, HIGH);
        digitalWrite(11, LOW);
static msg_t Thread2(void *arg) {
    while (TRUE) {
        digitalWrite(12, HIGH);
        digitalWrite(12, LOW);

You can see that around 1:00 the leds loose synchronism among them, and also with the metronome.


The second implementation uses the DuinOS RTOS. I also omitted all code related to creating and starting the threads (which run with the same priority).

In this example the leds were well synchronized, so I included another task that uses the serial port with a different frequency.

Follows the code and video for the implementation in DuinOS:

taskLoop (T1) {
    digitalWrite(11, HIGH);
    digitalWrite(11, LOW);

taskLoop (T2) {
    digitalWrite(12, HIGH);
    digitalWrite(12, LOW);

int c = 0;
taskLoop (T3) {

Since the beginning you can see that the leds loose synchronism among them, and also with the metronome.


The third implementation uses the synchronous language Céu.

In this example the leds were well synchronized, even with the third activity that uses the serial port.

Follows the code and video for the implementation in Céu:

par do
    loop do
        _digitalWrite(11, _HIGH);
        await 400ms;
        _digitalWrite(11, _LOW);
        await 400ms;
    loop do
        _digitalWrite(12, _HIGH);
        await 1s;
        _digitalWrite(12, _LOW);
        await 1s;
    int c = 0;
    loop do
        await 77ms;
        c = c + 1;



The execution model of preemptive multithreading does not ensure implicit synchronization among threads.

There’s nothing wrong with the RTOSes and the example implementations: the behavior shown in the videos is perfectly valid.

The problem is that usually the very first examples for these systems use blinking leds (supposely synchronized) to show how easy is to write multithreaded code. This is not true!

Preemptive multithreading should not be used as is to write this kind of highly synchronized applications. Adding semaphores or other synchronization primitives to these codes won’t help alone, they require a single thread to handle timers that is responsible to dispatching others.

I used timers in the example, but any kind of high frequency input would also behave nondeterministically in multithreaded systems.

In synchronous languages like Céu, the execution model enforces that all activities are synchronized all the time.

Written by francisco

March 23, 2012 at 10:20 pm