• Using strace with multiple PIDs

    When debugging, it is sometimes necessary to trace several PIDs at the same time with strace.

    Take a simple example: PHP-FPM. PHP-FPM spawns several worker processes depending on its needs, so when you debug it you cannot easily tell what each process is doing. To trace all the PIDs belonging to php-fpm at once, you can use the following command:

    strace -tt -T $(pidof 'php-fpm: pool www' | sed 's/\([0-9]*\)/\-p \1/g')
    

    In this command, you can see:

    • “-tt” option: displays a more precise time on each line (with microseconds)
    • “-T” option: shows the time spent in each call
    • “pidof ‘php-fpm: pool www’”: retrieves all the PIDs of processes called “php-fpm: pool www” (adapt it to your process name)
    • the sed expression: prefixes each PID with “-p” so that strace attaches to all of them

    With this command, you get the strace output for all your PHP-FPM processes at once (you can filter it later using the PID displayed at the beginning of each line).
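
    As a hedged alternative to the sed one-liner, the sketch below builds the “-p” arguments with a small shell function and filters a saved trace by PID. The function names pids_to_strace_args and filter_trace_by_pid and the file /tmp/php-fpm.trace are hypothetical. The filter assumes the trace was written to a file with strace's -o option, where each line is expected to start with the bare PID.

```shell
#!/bin/sh
# Turn a whitespace-separated list of PIDs into "-p PID -p PID ..." arguments.
pids_to_strace_args() {
    args=""
    for pid in $1; do
        args="$args -p $pid"
    done
    # Drop the leading space left by the first iteration.
    printf '%s\n' "${args# }"
}

# Keep only the lines of a saved trace that belong to one PID.
# $1 = PID, $2 = trace file (assumed to start each line with the PID).
filter_trace_by_pid() {
    awk -v pid="$1" '$1 == pid' "$2"
}

# Usage (as root, with PHP-FPM running; the output path is only an example):
#   strace -tt -T -o /tmp/php-fpm.trace $(pids_to_strace_args "$(pidof 'php-fpm: pool www')")
#   filter_trace_by_pid 547934 /tmp/php-fpm.trace
```

    Writing the trace to a file first keeps the capture and the per-process analysis separate, which is convenient when many workers are traced together.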

  • Generating core dumps for PHP-FPM

    When PHP-FPM logs errors like “signal 11 (core dumped)”, you may need to generate core dumps to understand what is happening.

    Install packages

    You first need to install the packages required to generate and analyze dumps:

    apt-get install gdb php5-dbg

    System core updates

    You then need to update some kernel parameters. These commands must be run as root.
    You can change the directory in which the core dumps are stored to suit your configuration (here /opt/core/ is used):

    echo '/opt/core/core-%e.%p' > /proc/sys/kernel/core_pattern
    echo 0 > /proc/sys/kernel/core_uses_pid
    ulimit -c unlimited
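
    The kernel does not create the target directory itself: it has to exist and be writable by the user the crashing process runs as. Note also that “ulimit -c unlimited” only affects the current shell and the processes started from it. The sketch below derives the directory from a pattern so it can be prepared beforehand. The helper name core_pattern_dir is hypothetical, it assumes the pattern is an absolute path (not a pipe to a program), and www-data is taken from the example listing later in this article.

```shell
#!/bin/sh
# Print the directory part of a core_pattern value.
# Assumes the pattern is an absolute file path, not a "|program" pipe.
core_pattern_dir() {
    dirname "$1"
}

# Usage (as root; adapt the owner to the user your PHP-FPM pool runs as):
#   dir=$(core_pattern_dir "$(cat /proc/sys/kernel/core_pattern)")
#   mkdir -p "$dir" && chown www-data "$dir"
#   cat /proc/sys/kernel/core_pattern   # confirm the value was applied
```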
    

    You can use several specifiers when naming your core dump files:

    %% a single % character
    %p PID of dumped process
    %u (numeric) real UID of dumped process
    %g (numeric) real GID of dumped process
    %s number of signal causing dump
    %t time of dump, expressed as seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC)
    %h hostname (same as nodename returned by uname(2))
    %e executable filename (without path prefix)
    %E pathname of executable, with slashes ('/') replaced by exclamation marks ('!').
    %c core file size soft resource limit of crashing process (since Linux 2.6.24)
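
    To preview the file name a pattern would produce, a minimal sketch such as the following can help. The function expand_core_pattern is hypothetical and only substitutes %e and %p; the other specifiers listed above are left untouched, and the executable name is assumed not to contain characters that are special to sed (such as “/” or “&”).

```shell
#!/bin/sh
# Preview the file name produced by a core_pattern.
# $1 = pattern, $2 = executable name (%e), $3 = PID (%p).
# Only %e and %p are handled; other specifiers are left as-is.
expand_core_pattern() {
    printf '%s\n' "$1" | sed -e "s/%e/$2/g" -e "s/%p/$3/g"
}

# Usage:
#   expand_core_pattern '/opt/core/core-%e.%p' php5-fpm 547934
#   # prints /opt/core/core-php5-fpm.547934
```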
    

    Update PHP-FPM config

    Once the system configuration is done, you need to update the php-fpm configuration as well.
    Edit the file /etc/php5/fpm/php-fpm.conf (or the specific pool configuration file under the pool.d directory) and uncomment the following line:

    rlimit_core = unlimited

    Once done, restart the php5-fpm service:

    /etc/init.d/php5-fpm restart

    From now on, each new core dump will be written to the directory you configured at the beginning.
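
    Before waiting for the next crash, it can be worth checking that the setting is really active. The sketch below is one possible way to do so: has_rlimit_core and core_limit_of are hypothetical helper names, the first assumes commented lines start with “;” as in the stock PHP-FPM configuration, and the second reads the standard /proc/<pid>/limits file on Linux.

```shell
#!/bin/sh
# Succeed if the given config file has an active (uncommented) rlimit_core line.
has_rlimit_core() {
    grep -Eq '^[[:space:]]*rlimit_core[[:space:]]*=' "$1"
}

# Print the core file size limit of a running process.
# $1 = PID
core_limit_of() {
    grep 'Max core file size' "/proc/$1/limits"
}

# Usage (as root; replace 547934 with the PID of one of your workers):
#   has_rlimit_core /etc/php5/fpm/php-fpm.conf && echo "rlimit_core is set"
#   core_limit_of 547934
```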

    Check your core dumps

    Check your PHP-FPM logs, and if you see something like:

    [01-Jan-2015 05:30:15] WARNING: [pool www] child 547934 exited on signal 11 (SIGSEGV - core dumped) after 410.674135 seconds from start
    

    Go to the folder you chose for storing core dumps and you will see your core dump files:

    # ls -l /opt/core/*
    -rw------- 1 www-data www-data 124512498 Jan  1 05:30 /opt/core/core-php5-fpm.547934
    

    Analyze a core dump

    Once a core dump has been generated, you need to analyze the file to find out why the process crashed.
    For that, use the standard debugger gdb.

    Run this command with the newly generated file to launch a debug shell and start the analysis:

    gdb /usr/sbin/php5-fpm /opt/core/core-php5-fpm.547934

    Once the shell is launched, you can use commands such as:

    • backtrace: displays the simple backtrace of the core dump
      (gdb) bt
    • backtrace full: displays the full detailed backtrace, including local variables
      (gdb) bt full

    You will thus get the backtrace of the code that triggered the core dump, which makes debugging the application much easier.
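
    When several dumps have to be inspected, the same backtrace can be collected without an interactive session by using gdb's batch mode. The following is a minimal sketch: the helpers bt_report_name and save_backtrace, and the “.bt.txt” suffix, are hypothetical conventions chosen for illustration.

```shell
#!/bin/sh
# Name of the text report for a given core file (hypothetical convention).
bt_report_name() {
    printf '%s.bt.txt\n' "$1"
}

# Write the full backtrace of a core file to a text report.
# $1 = binary that crashed, $2 = core file
save_backtrace() {
    gdb -batch -ex 'bt full' "$1" "$2" > "$(bt_report_name "$2")" 2>&1
}

# Usage:
#   save_backtrace /usr/sbin/php5-fpm /opt/core/core-php5-fpm.547934
#   less /opt/core/core-php5-fpm.547934.bt.txt
```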

    WARNING: Core dumps can be quite big and will consume disk space very quickly if many of them are generated.
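
    To keep disk usage under control, old dumps can be listed and reviewed from time to time. The sketch below assumes the “core-” file name prefix used in this article; the function name old_core_dumps and the 7-day threshold are only examples.

```shell
#!/bin/sh
# List core dump files older than a number of days.
# $1 = directory, $2 = age in days
old_core_dumps() {
    find "$1" -type f -name 'core-*' -mtime +"$2"
}

# Usage (review the list before deleting anything):
#   du -sh /opt/core
#   old_core_dumps /opt/core 7
```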