Developing Real-Time Systems on Application Processors

Application processor usage continues to broaden. Systems-on-Chip (SoCs), often powered by Arm® Cortex-A cores, are taking over several spaces where small Arm® Cortex-M and other microcontroller devices have traditionally dominated. This trend is driven by several factors, such as:

The strong requirements for connectivity, often related to IoT, which affect not only hardware but also software, protocols and security
The need for highly interactive interfaces such as multi-touch, high-resolution screens and elaborate graphical user interfaces
The decreasing price of SoCs, as a consequence of growing volumes and new production capabilities

We see this every day in customers who start a product redesign by upgrading from a microcontroller to a microprocessor. This move brings new challenges, as the hardware design is more complicated and the operating system abstraction layer is much more complex. The difficulty of hardware design with an application processor is overcome by using reference designs and off-the-shelf alternatives such as Computer on Modules/System on Modules or single board computers. On the operating system layer, embedded Linux distributions are widespread in the industry. An immense world of open source tools is available, simplifying the development of complex and feature-rich embedded systems that would be very complicated and time-consuming to build on microcontrollers. Despite all these benefits, the use of an operating system like Linux still raises many questions and considerable distrust when determinism and real-time control are required.

A common approach adopted by developers is the strategy of separating time-critical tasks and regular tasks onto different processors. Hence, a Cortex-A processor, or similar, is typically selected for multimedia and connectivity features while a microcontroller is still employed to handle real-time, determinism-critical tasks. The aim of this blog post is to present some options developers may consider when developing real-time systems with application processors. We present three possible solutions to provide real-time capability to application processor-based designs.

Testing Real-Time Performance

There are several benchmark tools intended to evaluate the real-time performance of software systems. Nevertheless, we wanted a quick way to check whether the approaches presented below truly improve system behavior. We therefore opted to measure the jitter of a square wave that the embedded system under test generates on a standard GPIO. This simple measurement gives an initial indication of real-time performance and of the potential for optimization. We developed a simple application which toggles a GPIO at 2.5 kHz (200 µs high / 200 µs low, a 400 µs period). The GPIO output is connected to a scope, where we measure the resulting square wave and evaluate the real output timings.
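As a rough illustration of such a toggling loop, the sketch below sleeps until absolute deadlines on CLOCK_MONOTONIC so that wake-up latency does not accumulate from one edge to the next. It is not the application used for the measurements: the names toggle_periodic, gpio_write_fn and record_pin are hypothetical, and record_pin only records the requested level instead of driving real hardware.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

#define HALF_PERIOD_NS 200000L      /* 200 us high, 200 us low -> 2.5 kHz */
#define NSEC_PER_SEC   1000000000L

/* Advance an absolute deadline by ns nanoseconds, keeping tv_nsec normalized. */
static void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec += 1;
    }
}

/* Hypothetical pin writer: a real application would set or clear the GPIO here. */
typedef void (*gpio_write_fn)(int level);

/* Stand-in writer (assumption): records the last level and counts the edges. */
static int last_level;
static unsigned long edge_count;

static void record_pin(int level)
{
    last_level = level;
    edge_count++;
}

/* Produce `edges` level changes, one every HALF_PERIOD_NS.
   Returns 0 on success, -1 if the clock cannot be read,
   or the error number reported by clock_nanosleep(). */
static int toggle_periodic(gpio_write_fn write_pin, unsigned long edges)
{
    struct timespec next;
    int level = 0;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (unsigned long i = 0; i < edges; i++) {
        int rc;

        timespec_add_ns(&next, HALF_PERIOD_NS);
        do {
            /* Absolute sleep: a late wake-up does not shift later deadlines. */
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return rc;

        level = !level;
        write_pin(level);
    }
    return 0;
}
```

Calling toggle_periodic(record_pin, 4) produces four edges roughly 200 µs apart. Replacing record_pin with a function that writes the GPIO would give a waveform comparable in shape to the one described above, with jitter that depends on the kernel configuration.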

Image 1: Jitter Measurement

The results observed on a standard Linux distribution are presented in Image 2. The task responsible for the GPIO toggling was configured as a real-time task (SCHED_RR) and the kernel was configured with Voluntary Kernel Preemption (CONFIG_PREEMPT_VOLUNTARY).
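For readers unfamiliar with SCHED_RR, a minimal sketch of how a task can request that policy on Linux follows. The helper names are hypothetical and the priority passed in is an assumption, since the priority used for the measurements is not stated. The request normally needs root privileges or CAP_SYS_NICE, so the helper returns the error number instead of aborting.

```c
#include <errno.h>
#include <sched.h>

/* Clamp a requested priority into the range the system allows for SCHED_RR. */
static int clamp_rr_priority(int requested)
{
    int lo = sched_get_priority_min(SCHED_RR);
    int hi = sched_get_priority_max(SCHED_RR);

    if (requested < lo)
        return lo;
    if (requested > hi)
        return hi;
    return requested;
}

/* Ask the kernel to schedule the caller with SCHED_RR at the given priority.
   Returns 0 on success or the errno value (typically EPERM when the
   process lacks the required privileges). */
static int make_self_round_robin(int priority)
{
    struct sched_param sp = { .sched_priority = clamp_rr_priority(priority) };

    if (sched_setscheduler(0, SCHED_RR, &sp) != 0)
        return errno;
    return 0;
}
```

A task would typically call make_self_round_robin() once at start-up, before entering its time-critical loop, and fall back to normal scheduling (or report the problem) if the call returns a non-zero value.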

Image 2: Histogram of the square wave generated using the Standard Linux Kernel Configuration

The measurement distribution shows that only 92% of the samples fall within ±10% of the period. Furthermore, the worst case measured shows a delay greater than 15 ms, an error of more than 3700% of the 400 µs period.
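To make the arithmetic behind these two figures concrete, the sketch below computes them from a list of measured periods. It assumes the error is the absolute deviation from the nominal period expressed as a percentage of that period; the exact metric used for the histograms is not spelled out, and the function and type names are hypothetical.

```c
#include <stddef.h>

struct jitter_stats {
    double within_fraction;  /* share of samples inside the tolerance band */
    double worst_error_pct;  /* largest deviation, in percent of the nominal period */
};

/* Summarize measured periods (in microseconds) against a nominal period. */
static struct jitter_stats jitter_summarize(const double *periods_us, size_t n,
                                            double nominal_us, double tolerance_pct)
{
    struct jitter_stats s = { 0.0, 0.0 };
    size_t inside = 0;

    for (size_t i = 0; i < n; i++) {
        double err = periods_us[i] - nominal_us;
        if (err < 0.0)
            err = -err;

        double pct = 100.0 * err / nominal_us;
        if (pct <= tolerance_pct)
            inside++;
        if (pct > s.worst_error_pct)
            s.worst_error_pct = pct;
    }
    if (n > 0)
        s.within_fraction = (double)inside / (double)n;
    return s;
}
```

With a nominal period of 400 µs, a single sample that arrives 15 ms late (15000 µs of deviation) yields 100 × 15000 / 400 = 3750%, which is consistent with the "worse than 3700%" figure quoted above.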

Real-Time Linux

The first approach we present in this article is software-related. Linux is not a real-time operating system, but there are some initiatives which have greatly improved the determinism and timeliness of Linux. One of these efforts is the Real-Time Linux project. Real-Time Linux is a series of patches (PREEMPT_RT) aimed at adding new preemption options to the Linux kernel, along with other features and tools to improve its suitability for real-time tasks. You can find documentation on applying the PREEMPT_RT patch to the Linux kernel and developing applications for it at the official Real-Time Linux Wiki.

We did the proposed tests using the PREEMPT_RT patches on a Colibri iMX6DL to exemplify the improvement in real-time performance. The documentation on preparing the Toradex Linux image to deploy the PREEMPT_RT patch is available at the Toradex Developer Center.
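After deploying a patched image, it can be useful to confirm from within an application that the running kernel was really built with PREEMPT_RT. The sketch below relies on an assumption: that the kernel version string reported by uname() contains "PREEMPT_RT" or "PREEMPT RT", which is how patched kernels commonly identify themselves. The function names are hypothetical.

```c
#include <string.h>
#include <sys/utsname.h>

/* Return 1 if a kernel version string advertises PREEMPT_RT, 0 otherwise.
   Both spellings are checked because the marker differs between kernel versions. */
static int version_mentions_preempt_rt(const char *version)
{
    return version != NULL &&
           (strstr(version, "PREEMPT_RT") != NULL ||
            strstr(version, "PREEMPT RT") != NULL);
}

/* Return 1 if the running kernel looks like a PREEMPT_RT build,
   0 if it does not, and -1 if uname() fails. */
static int running_kernel_is_rt(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return version_mentions_preempt_rt(u.version);
}
```

An application with hard timing requirements could log a warning at start-up when running_kernel_is_rt() returns 0, since the figures measured on a patched kernel would not apply.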

The histogram below (Image 3) shows the tests on a PREEMPT_RT patched Linux kernel configured for real-time preemption. The tests show that only 0.002% of the samples fall outside the ±10% error band relative to the period. Furthermore, the worst case (0.106 ms) is only about a quarter of the period, a great improvement compared to the standard Linux configuration (Image 2).

Image 3: Histogram of the square wave generated using the Preempt-RT kernel

An example software system using the PREEMPT_RT patch is provided by Codesys Solutions. They rely on the Real-Time Linux kernel, together with the OSADL, to deploy their software PLC solution which is already widespread throughout the automation industry across thousands of devices. You can find more information about Codesys running on Toradex Computer on Modules at this link, including an executable demo.

Xenomai

Xenomai is another popular framework for making Linux a real-time system. Xenomai achieves this by adding a co-kernel alongside the Linux kernel. The co-kernel handles time-critical operations and has higher priority than the standard kernel. You can learn more here. To use the real-time capabilities of Xenomai, user-space applications must use the real-time APIs (libcobalt) to interface with the Cobalt core, which is responsible for ensuring real-time performance.

Image 4: Dual Core Xenomai Configuration, source: https://xenomai.org/start-here/

Documentation on how to install Xenomai on your target device can be found at the Xenomai website: www.xenomai.org. Additionally, a wide range of embedded hardware is known to work; see the hardware reference list (including the whole NXP® i.MX SoC series).

To validate Xenomai on the i.MX6 SoC we again performed the simple square wave test. The target device was the Colibri iMX6DL by Toradex. We used the same test approach as described above for the Real-Time Linux extension. Some parts of the application code used to implement the test are presented below to highlight the use of Xenomai APIs.

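#include <stdio.h>          /* getchar() */
#include <alchemy/task.h>   /* Xenomai 3 Alchemy API (assumed; Xenomai 2 uses <native/task.h>) */

/* TIMESLEEP (the task period) and SET_G35 / CLR_G35 (GPIO set and clear
   macros) are defined elsewhere in the application and are not shown here. */

static RT_TASK blink_task;

void blink(void *arg __attribute__((__unused__)))
{
    int iomask = 0;

    /* Make the calling task periodic, starting immediately. */
    rt_task_set_periodic(NULL, TM_NOW, TIMESLEEP);

    while (1) {
        /* Block until the next period begins. */
        rt_task_wait_period(NULL);

        if (iomask)
            SET_G35;
        else
            CLR_G35;

        iomask = 1 - iomask;
    }
}

int main(void)
{
    /* Task creation: default stack size, priority 99, default mode */
    rt_task_create(&blink_task, "blinkLed", 0, 99, 0);
    rt_task_start(&blink_task, &blink, NULL);

    /* Keep the real-time task running until a key is pressed. */
    getchar();
    rt_task_delete(&blink_task);

    return 0;
}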

The results of the tests performed on Xenomai are presented in the chart below (Image 5). Once again, the real-time solution provides a clear advantage over the time response of the standard Linux kernel. Notice that this time, the worst cases are inside the ±10% error area.

Image 5: Histogram of the square wave generated using Xenomai

Heterogeneous Multicore Processing

The Heterogeneous Multicore Processing (HMP) approach is a hardware solution. Application processors like the NXP i.MX7 series, the NXP i.MX6SoloX and the upcoming NXP i.MX8 series combine a variety of cores with different purposes. The i.MX7S, for example, is a dual-core processor that pairs a Cortex-A7 at 800 MHz with a Cortex-M4 at 200 MHz. The basic idea is that the user interface and high-speed connectivity run on an abstracted OS like Linux on the Cortex-A core while, independently and in parallel, control tasks run on a real-time OS like FreeRTOS on the Cortex-M core. Both cores can share access to memory and peripherals, allowing flexibility and freedom when defining which tasks are allocated to each core/OS. Refer to Image 6.

Image 6: i.MX7 Block diagram from datasheet at: https://www.nxp.com/docs/en/data-sheet/IMX7SCEC.pdf
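Because both cores can reach the same memory, one simple way to picture data exchange between them is a single-producer, single-consumer ring buffer placed in a shared region. The sketch below is only an illustration of that idea using C11 atomics. It is not the messaging mechanism shipped with any vendor BSP, all names are hypothetical, and a real design would also have to place the structure in memory that both cores see coherently.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SLOTS 16u   /* must be a power of two */

/* One core only calls ring_put(), the other only calls ring_get(). */
struct ring {
    _Atomic uint32_t head;       /* next slot to write, advanced by the producer */
    _Atomic uint32_t tail;       /* next slot to read, advanced by the consumer */
    uint32_t slot[RING_SLOTS];
};

/* Producer side: returns 1 if the value was queued, 0 if the ring is full. */
static int ring_put(struct ring *r, uint32_t value)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SLOTS)
        return 0;

    r->slot[head % RING_SLOTS] = value;
    /* Release: the slot contents become visible before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 1;
}

/* Consumer side: returns 1 and stores the oldest value, or 0 if the ring is empty. */
static int ring_get(struct ring *r, uint32_t *value)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return 0;

    *value = r->slot[tail % RING_SLOTS];
    /* Release: the slot is free for reuse only after it has been read. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 1;
}
```

In such a scheme the control core could push sensor readings with ring_put() while the Linux side drains them with ring_get(), and neither side ever blocks the other.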

Some of the advantages of using the HMP approach are:

Legacy software from microcontrollers can be more easily reused
Firmware update (M4 core) is simplified, as the firmware is a file in the filesystem of the Cortex-A OS
Increased flexibility in choosing which peripherals will be handled by each core. Since this is software defined, future changes can be made without changing the hardware design

More information on developing applications for HMP-based processors is available at the following locations:

Toradex, Antmicro and The Qt Company collaboratively built a robot showcasing this concept. The robot, named TAQ, is an inverted pendulum balancing robot designed with the Toradex Computer on Module Colibri iMX7. The user interface is built upon Linux with the Qt framework running on the Cortex-A7, and the balancing/motor control is deployed on the Cortex-M4. Inter-core communication is used to remote control the robot and animate the robot face.

Image 7 presents the test results on a Colibri iMX7. This time, the square wave is generated by FreeRTOS running on the M4 core of the i.MX7 SoC. As expected, these results surpass those of all the other approaches.

Image 7: Histogram of the square wave generated using the Heterogeneous Multicore Architecture.
The square wave is generated using FreeRTOS on the M4 core

Conclusion

This article presents a brief overview of some solutions available to develop real-time systems on application processors running Linux as the target operating system. This is a starting point for developers who are aiming to use microprocessors and are concerned about real-time control and determinism.

We presented one hardware-based approach, using Heterogeneous Multicore Processing SoCs, and two software-based approaches: the PREEMPT_RT patch and Xenomai. The results presented are not intended to compare operating systems or real-time techniques. Each approach has pros and cons, and may be more or less suitable depending on the use case.

The primary takeaway is that several feasible solutions exist for utilizing Linux with application processors in reliable real-time applications.

Authors: Guilherme Fernandes, CEO, Toradex Brasil
Raul Rosetto Muñoz, Application Engineer, Toradex Brasil
Leonardo Veiga, Toradex Brasil
Brandon Shibley, Senior Solutions Architect, Toradex Inc.

2 comments

Han Ooi - 7 years 2 months

Appreciate the writeup! It seems like embedded heterogeneous compute is not a very well understood field. Where can I find developers who are competent at this type of development? Also, what software tools would you recommend?

Guilherme Fernandes - 7 years 2 months

Hi Han, happy you liked the article.
Please feel free to reach out to Toradex if you have any questions regarding HMP. We are working on simplifying the use of HMP and are creating documentation and tutorials to make it easier to use.

http://developer.toradex.com/knowledge-base/freertos-on-the-cortex-m4-of-a-colibri-imx7
http://developer.toradex.com/knowledge-base/using-arm-ds-5-ide-with-cortex-m4-of-a-colibri-imx7
https://www.toradex.com/community

Webinars:
https://www.toradex.com/pt_br/webinars/introducing-i-mx-7-system-on-chip-toradex-nxp
https://www.toradex.com/webinars/developing-embedded-systems-with-heterogeneous-multicore-soc
https://www.toradex.com/pt_br/webinars/developing-embedded-systems-using-colibri-imx7-system-on-module

Also keep an eye on our upcoming webinars (https://www.toradex.com/webinars) as we plan to have more focusing on HMP.

Gomatheeeshwari - 5 years 8 months

Hi, the article is helpful. I need to know how the real-time tasks are scheduled in the i.MX 7 HMP system for a real-time application. Is it possible to run my own scheduler on the i.MX 7 board?

Toradex - 5 years 8 months

Hello,
Answering the first part of your question, for this blog post the Cortex-M4 of the i.MX 7 ran a single, standalone application. We used a bare metal approach instead of an RTOS-based one.

About the second part, yes you can run your own scheduler. You can either run a bare metal application as we did and write your own task scheduler or use an RTOS of your choice. Toradex currently supports FreeRTOS on the ARM Cortex-M4 of the Colibri iMX7, therefore if you are interested in using it please have a look at our documentation at https://developer.toradex.com/software/freertos and FreeRTOS documentation at https://www.freertos.org/implementation/main.html
Best regards,
Toradex
