DMA Survival guide – Linux kernel

Hi.

I recently gave a talk at the KernelTLV Meetup.
The talk was uploaded to YouTube, so if you’re interested, go ahead and watch it.

The talk is in Hebrew, but the slides are in English.

The slides can be found here:


Linux Serviio start up script

I created my own Serviio service script, and I’m sharing it here for reference and free use.

If you don’t know Serviio and you need a media server for your home network, it is your best choice. Check it out here.

Actually, you only need to edit “/etc/default/serviio”, setting your username and the path to your Serviio installation.

When you’re done with the scripts, add the service to rc using:

sudo update-rc.d serviio defaults

/etc/default/serviio


NAME="Serviio Media Server"
DAEMON="/opt/serviio/bin/serviio.sh"    ## Update this to point at serviio_root/bin/serviio.sh
SERVICE_ACCOUNT="nativeguru"            ## change to an appropriate username, DON'T RUN UNDER ROOT!

/etc/init.d/serviio

#! /bin/sh
### BEGIN INIT INFO
# Provides:          serviio
# Required-Start:    $local_fs $remote_fs $network $syslog 
# Required-Stop:     $local_fs $remote_fs $network $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Serviio daemon script
# Description:       This file is a daemon script for Serviio to be
#                    placed in /etc/init.d.
### END INIT INFO

# Author: Ramon Fried <ramon.fried at gmail dot com>

# Do NOT "set -e"

# PATH should only include /usr/* if it runs after the mountnfs.sh script
PATH=/sbin:/usr/sbin:/bin:/usr/bin
DESC="Serviio Daemon"
NAME=serviio
DAEMON=/opt/serviio/bin/serviio.sh
DAEMON_ARGS=""
PIDFILE=/var/run/$NAME.pid
SCRIPTNAME=/etc/init.d/$NAME

# Exit if the package is not installed
[ -x "$DAEMON" ] || exit 0

# Exit if the default config file is missing
#[ -x /etc/default/$NAME ] || exit 0

# Read configuration variable file if it is present
[ -r /etc/default/$NAME ] && . /etc/default/$NAME

# Load the VERBOSE setting and other rcS variables
. /lib/init/vars.sh

# Define LSB log_* functions.
# Depend on lsb-base (>= 3.2-14) to ensure that this file is present
# and status_of_proc is working.
. /lib/lsb/init-functions

#
# Function that starts the daemon/service
#
do_start()
{
	# Return
	#   0 if daemon has been started
	#   1 if daemon was already running
	#   2 if daemon could not be started
	start-stop-daemon --start --quiet --pidfile $PIDFILE -c "${SERVICE_ACCOUNT}" --exec $DAEMON --test > /dev/null \
		|| return 1
	start-stop-daemon --start --quiet --pidfile $PIDFILE -m -b -c  "${SERVICE_ACCOUNT}" --exec $DAEMON -- \
		$DAEMON_ARGS \
		|| return 2
}

#
# Function that stops the daemon/service
#
do_stop()
{
	# Return
	#   0 if daemon has been stopped
	#   1 if daemon was already stopped
	#   2 if daemon could not be stopped
	#   other if a failure occurred
	start-stop-daemon --stop --quiet --retry=TERM/30/KILL/5 --pidfile $PIDFILE -c "${SERVICE_ACCOUNT}"
	RETVAL="$?"
	[ "$RETVAL" = 2 ] && return 2
	# Wait for children to finish too if this is a daemon that forks
	# and if the daemon is only ever run from this initscript.
	# If the above conditions are not satisfied then add some other code
	# that waits for the process to drop all resources that could be
	# needed by services started subsequently.  A last resort is to
	# sleep for some time.
	start-stop-daemon --stop --quiet --oknodo --retry=0/30/KILL/5 -c "${SERVICE_ACCOUNT}" --exec $DAEMON
	[ "$?" = 2 ] && return 2
	# Many daemons don't delete their pidfiles when they exit.
	rm -f $PIDFILE
	return "$RETVAL"
}

case "$1" in
  start)
	[ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC" "$NAME"
	do_start
	case "$?" in
		0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
		2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
	esac
	;;
  stop)
	[ "$VERBOSE" != no ] && log_daemon_msg "Stopping $DESC" "$NAME"
	do_stop
	case "$?" in
		0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
		2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
	esac
	;;
  status)
	status_of_proc -p $PIDFILE "$DAEMON" "$NAME" && exit 0 || exit $?
	;;
  restart)
	log_daemon_msg "Restarting $DESC" "$NAME"
	do_stop
	case "$?" in
	  0|1)
		do_start
		case "$?" in
			0) log_end_msg 0 ;;
			1) log_end_msg 1 ;; # Old process is still running
			*) log_end_msg 1 ;; # Failed to start
		esac
		;;
	  *)
		# Failed to stop
		log_end_msg 1
		;;
	esac
	;;
  *)
	echo "Usage: $SCRIPTNAME {start|stop|status|restart}" >&2
	exit 3
	;;
esac

:


Why you should avoid using SIGALRM for timer

I came across a very interesting bug in a Linux software stack that had been ported, back in the day, from some kind of embedded processor.

That processor probably had a single timer interrupt, so the software team decided to implement a series of SW timers on top of it.

They did something quite usual in the embedded world: they created an array of active timers and always set the HW timer interrupt to fire when the nearest SW timer expires.

This requires a bit of bookkeeping, as the HW timer must be dynamically updated whenever a new “closer” timer is scheduled. This happens every time a timer is created, stopped or expires.
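
The bookkeeping step can be sketched roughly as follows. This is only an illustration of the idea, not the original code: the struct layout, the millisecond time base and the name next_hw_timeout() are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One software timer slot. The layout is hypothetical, for illustration. */
struct sw_timer {
    bool     active;      /* is this slot in use? */
    uint64_t expires_ms;  /* absolute expiry time, in milliseconds */
};

/*
 * Return the delay (in ms, relative to now_ms) until the nearest active
 * software timer expires, or -1 if no timer is active. Timers that are
 * already past due yield 0, so the caller can fire them right away.
 */
int64_t next_hw_timeout(const struct sw_timer *timers, size_t n,
                        uint64_t now_ms)
{
    int64_t best = -1;

    for (size_t i = 0; i < n; i++) {
        if (!timers[i].active)
            continue;

        int64_t delta = timers[i].expires_ms > now_ms
                      ? (int64_t)(timers[i].expires_ms - now_ms)
                      : 0;

        if (best < 0 || delta < best)
            best = delta;
    }
    return best;
}
```

The result would be what gets programmed into the single HW timer (or its emulation) each time a SW timer is created, stopped or expires.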

Of course, this is a multi-tasking system in which tasks run asynchronously, so a mutex was used to synchronize all of this.

This worked great on the embedded platform. But when the software was ported to Linux, they didn’t want to change the architecture, so they emulated the HW timer interrupt using the alarm signal (SIGALRM).

It worked, and it was kept like this for a few years, until a customer reported a bug: once in a while (it could take days) the system became unresponsive and required a reboot.

I asked the customer to crash the application on purpose using SIGABRT the next time the problem occurred, and to send me the core dump.

The precise commands I sent to the customer were:

$ ulimit -c unlimited
$ kill -6 $(pidof -s APPNAME)

Once I had the core dump, I analyzed it with gdb and found that there was a blocked call to setitimer().

In the call stack I saw that a thread was requesting a new timer, which in turn called the bookkeeping function that updates the “HW interrupt” to the closest timeout.

I opened LXR and searched for the implementation of the setitimer syscall in the Linux kernel.

Here’s the code for the function:

int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue)
{
        struct task_struct *tsk = current;
        struct hrtimer *timer;
        ktime_t expires;

        /*
         * Validate the timevals in value.
         */
        if (!timeval_valid(&value->it_value) ||
            !timeval_valid(&value->it_interval))
                return -EINVAL;

        switch (which) {
        case ITIMER_REAL:
again:
                spin_lock_irq(&tsk->sighand->siglock);
                timer = &tsk->signal->real_timer;
                if (ovalue) {
                        ovalue->it_value = itimer_get_remtime(timer);
                        ovalue->it_interval
                                = ktime_to_timeval(tsk->signal->it_real_incr);
                }
                /* We are sharing ->siglock with it_real_fn() */
                if (hrtimer_try_to_cancel(timer) < 0) {
                        spin_unlock_irq(&tsk->sighand->siglock);
                        goto again;
                }
                expires = timeval_to_ktime(value->it_value);
                if (expires.tv64 != 0) {
                        tsk->signal->it_real_incr =
                                timeval_to_ktime(value->it_interval);
                        hrtimer_start(timer, expires, HRTIMER_MODE_REL);
                } else
                        tsk->signal->it_real_incr.tv64 = 0;

                trace_itimer_state(ITIMER_REAL, value, 0);
                spin_unlock_irq(&tsk->sighand->siglock);
                break;
        case ITIMER_VIRTUAL:
                set_cpu_itimer(tsk, CPUCLOCK_VIRT, value, ovalue);
                break;
        case ITIMER_PROF:
                set_cpu_itimer(tsk, CPUCLOCK_PROF, value, ovalue);
                break;
        default:
                return -EINVAL;
        }
        return 0;
}

Take a close look at line 17, the call to spin_lock_irq().
This is the only place where we can get stuck in this syscall. But why are we blocked? Why can’t we take siglock?

I grepped the kernel code for siglock to see who else takes it, and it appears that it is used when a signal is delivered to a user process.
It appears to be kept locked until the user signal handler function returns (you can read the code here).

OK, so it appears that a signal handler was running alongside the setitimer() function call.
Let’s review the code:

void sig_timer_handler(int signo) {
  pthread_mutex_lock(&g_timer_mutex);
  ..
  ..
  pthread_mutex_unlock(&g_timer_mutex);
}

void do_timer_bookkeeping(void) {
  pthread_mutex_lock(&g_timer_mutex);
  ..
  setitimer(...);
  ..
  pthread_mutex_unlock(&g_timer_mutex);
}

Do you see the problem?

The problem occurs when the signal handler is triggered just before we call setitimer().
The signal handler blocks, because the mutex is currently held by do_timer_bookkeeping(), and that is the deadlock.
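
To make the situation concrete, here is a small sketch that probes the mutex from inside a handler instead of blocking on it. It is not the original code: probe_handler() and probe_lock_from_handler() are hypothetical names, and pthread_mutex_trylock() is used only so the demo returns instead of hanging. The signal is raised synchronously with raise(), so the handler runs on the very thread that already holds the mutex.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>

static pthread_mutex_t g_timer_mutex = PTHREAD_MUTEX_INITIALIZER;
static volatile sig_atomic_t g_trylock_result = -1;

/* Hypothetical handler: records whether the mutex could be taken. */
static void probe_handler(int signo)
{
    (void)signo;
    int rc = pthread_mutex_trylock(&g_timer_mutex);
    if (rc == 0)
        pthread_mutex_unlock(&g_timer_mutex);
    g_trylock_result = rc;
}

/*
 * Take the mutex the way the bookkeeping function does, then deliver
 * SIGALRM to this same thread. Returns what trylock reported inside the
 * handler, or -1 if the handler could not be installed.
 */
int probe_lock_from_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = probe_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    pthread_mutex_lock(&g_timer_mutex);
    raise(SIGALRM);                 /* handler runs before raise() returns */
    pthread_mutex_unlock(&g_timer_mutex);

    return g_trylock_result;
}
```

With a default (non-recursive) mutex, the handler sees EBUSY here. A plain pthread_mutex_lock() at the same spot would wait for an unlock that cannot happen until the handler itself returns. On older glibc versions this needs to be built with -pthread.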

OK.

So what’s wrong here, besides using signals in the first place? (I think I heard someone say signals are evil.)
What’s wrong is that only a limited set of functions may be called from within a signal handler, and pthread_mutex_lock() is not on that list.
Whenever you call a function that is not on the list (you can find the list here), the outcome is undefined and depends on the implementation.
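
For comparison, a handler that stays within the rules usually does almost nothing. The sketch below shows one common pattern, assuming the real work can be moved to normal (non-signal) context: the handler only sets a flag of type volatile sig_atomic_t, and regular code polls it. The names install_timer_handler() and timer_fired_and_clear() are made up for this example, and this is not the fix that was applied to the system described here.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Written by the handler, read by normal code. */
static volatile sig_atomic_t g_timer_fired = 0;

/* The handler takes no locks and calls no library functions. */
static void sig_timer_handler(int signo)
{
    (void)signo;
    g_timer_fired = 1;
}

/* Install the handler for SIGALRM. Returns 0 on success, -1 on error. */
int install_timer_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sig_timer_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/*
 * Called from normal context (for example a main loop): reports whether
 * the signal arrived since the last call and clears the flag. Any work
 * that needs the mutex would be done by the caller, outside the handler.
 */
int timer_fired_and_clear(void)
{
    int fired = g_timer_fired;
    g_timer_fired = 0;
    return fired;
}
```

Signals that arrive between two polls are coalesced into a single "fired" report, which is acceptable when the caller recomputes the nearest timeout anyway.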

To fix it, I removed all the SIGALRM code and replaced it with a POSIX timer (timer_create() with SIGEV_THREAD notification).
I set the previous SIGALRM handler as the function run by the thread that is spawned whenever the timer expires.

This allowed me to carry over most of the code intact without introducing more bugs into the system.
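
The actual fix code is not shown here, so the following is only a sketch of how expiry handling can run in a thread rather than in signal context, using the POSIX timer_create() API with SIGEV_THREAD notification. The names timer_expired(), start_oneshot_timer() and read_expirations(), and the counter standing in for the real bookkeeping, are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static pthread_mutex_t g_timer_mutex = PTHREAD_MUTEX_INITIALIZER;
static int g_expirations = 0;   /* hypothetical stand-in for real work */

/*
 * Runs in a thread started by the C library on expiry, not in signal
 * context, so taking a mutex here is allowed.
 */
static void timer_expired(union sigval sv)
{
    (void)sv;
    pthread_mutex_lock(&g_timer_mutex);
    g_expirations++;
    pthread_mutex_unlock(&g_timer_mutex);
}

/*
 * Arm a one-shot timer that fires after delay_ms milliseconds.
 * delay_ms must be greater than zero (a zero value disarms the timer).
 * Returns 0 on success, -1 on error.
 */
int start_oneshot_timer(timer_t *out, long delay_ms)
{
    struct sigevent sev;
    struct itimerspec its;

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_THREAD;
    sev.sigev_notify_function = timer_expired;

    if (timer_create(CLOCK_MONOTONIC, &sev, out) != 0)
        return -1;

    memset(&its, 0, sizeof(its));
    its.it_value.tv_sec = delay_ms / 1000;
    its.it_value.tv_nsec = (delay_ms % 1000) * 1000000L;

    if (timer_settime(*out, 0, &its, NULL) != 0) {
        timer_delete(*out);
        return -1;
    }
    return 0;
}

/* Read the expiry counter under the same mutex the callback uses. */
int read_expirations(void)
{
    pthread_mutex_lock(&g_timer_mutex);
    int n = g_expirations;
    pthread_mutex_unlock(&g_timer_mutex);
    return n;
}
```

Because the callback is an ordinary thread function, the body of an old signal handler can be moved into it largely unchanged. On older glibc versions this has to be linked with -lrt and -pthread.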
