
GLib: Detecting output from a Python script


Objective: Spawn a Python script from C/GLib, and detect when Python outputs a string.

The following Python script slow_output.py repeatedly prints a string to STDOUT.

import time
for x in range(10):
    print("Eat imagine you chiefly few end ferrars compass. Be visitor am ferrars inquiry. Latter law remark two lively thrown. Spot set they know rest its. Raptures law diverted believed jennings consider children the see. Had invited beloved carried the colonel. Occasional principles discretion it as he unpleasing boisterous. She sing dear now son half.")
    time.sleep(5)

I want to launch this script from C/GLib, and then run a callback when the script prints output. I set up a g_io_add_watch() on the pipe connected to the Python process's STDOUT, watching for condition G_IO_IN. Unfortunately, the callback emit() associated with this condition is never run. Why not?

#include <glib.h>
#include <stdlib.h> /* exit() */
/* gcc -g -Wall `pkg-config --cflags glib-2.0` main.c `pkg-config --libs glib-2.0` */


/* The following callback runs correctly when the Python process ends. */
static void child_ended (GPid pid, gint status, gpointer user_data) {
    g_print("Child ended\n");
    g_spawn_close_pid (pid);
    exit(-1);
}


/* This callback is never run. */
gboolean emit(GIOChannel *source, GIOCondition condition, gpointer data) {
    g_print("Condition detected\n");
    return FALSE;
}

int main() {
    gchar **argv;
    gint argc;
    GPid child_pid;
    gint output_fd;
    GError *error = NULL;

    /* Set up the command to spawn the Python script. */
    gboolean test = g_shell_parse_argv("/usr/bin/python3 /home/joe/slow_output.py", &argc, &argv, &error);

    /* Spawn the script. */
    test = g_spawn_async_with_pipes(
        "/tmp/",
        argv,
        NULL,
        G_SPAWN_DO_NOT_REAP_CHILD,
        NULL,
        NULL,
        &child_pid,
        NULL,
        &output_fd,
        NULL,
        &error);

    /* Instantiate a main loop. */
    GMainLoop *mainloop = g_main_loop_new(NULL,FALSE);

    /* Create a GIOChannel for the script's file descriptor. */
    GIOChannel *channel = g_io_channel_unix_new(output_fd);

    /* Watch for the script's termination. */
    g_child_watch_add (child_pid, child_ended, NULL);

    /* Watch for output emitted by the script. Even though I specify G_IO_IN,
       and the script is outputting text, the callback emit() never runs. */
    guint watch = g_io_add_watch(
        channel,
        G_IO_IN,
        (GIOFunc)emit,
        NULL);

    g_main_loop_run(mainloop);

    return 0;
}

Solution

  • Python, much like C stdio, uses buffered output. You don't notice it because it switches to a different buffering mode when it detects that STDOUT or sys.stdout is connected to an interactive terminal.

    When sys.stdout is connected to a terminal, Python uses line buffering: the trailing \n written by print() flushes the output buffer immediately. For non-terminal outputs (such as pipes or files) line buffering is disabled, so all output is held until the buffer fills up or the program flushes it explicitly.
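
    The terminal detection can be observed directly. The following is a minimal sketch, assuming a Python 3.7+ interpreter reachable through sys.executable; the helper name probe_child_stdout is hypothetical. It starts a child interpreter whose STDOUT is a pipe (the same situation g_spawn_async_with_pipes() creates) and has the child report what it sees.

```python
import subprocess
import sys

# One-line child program: report whether its stdout is a terminal and
# whether the text layer is line-buffered.
CHILD_CODE = "import sys; print(sys.stdout.isatty(), sys.stdout.line_buffering)"


def probe_child_stdout():
    """Run the child with stdout attached to a pipe and return its report."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Through a pipe, both values are expected to be False.
    print(probe_child_stdout())
```

    Running the same one-liner directly in an interactive terminal should report True for both values instead, which is why the delay is not visible when the script is tested by hand.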

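    With your example string (about 350 bytes per line), Python needs at least 24 lines to fill its default 8 kB buffer before flushing the contents to the OS as one large chunk. Your script prints only 10 lines, so nothing reaches the pipe until the script exits, roughly 50 seconds after it starts. If you want to see output earlier, you must do one of the following:

      • pass flush=True to each print() call;
      • call sys.stdout.flush() after printing;
      • run the interpreter unbuffered, with python3 -u or with PYTHONUNBUFFERED=1 in the child's environment.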

    If the child program were written in C, you would use setlinebuf(…) or fflush(…) for the same thing. (Similarly, g_io_channel_set_buffered(…) or g_io_channel_flush() for GLib.)
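
    The effect of flushing can be measured from the parent side. This is a rough sketch rather than a drop-in fix: the child is a shortened stand-in for slow_output.py (two prints, a 1 second pause instead of 5), and the names CHILD_TEMPLATE and first_line_delay are hypothetical. It times how long the parent waits before the first line becomes readable on the pipe, with and without flush=True.

```python
import os
import subprocess
import sys
import time

# Stand-in for slow_output.py: two prints separated by a short pause.
# {flush} is filled in with True or False below.
CHILD_TEMPLATE = (
    "import time\n"
    "print('first line', flush={flush})\n"
    "time.sleep(1.0)\n"
    "print('second line', flush={flush})\n"
)


def first_line_delay(flush):
    """Return seconds from spawning the child until its first line is readable."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # keep Python's default buffering
    start = time.monotonic()
    with subprocess.Popen(
        [sys.executable, "-c", CHILD_TEMPLATE.format(flush=flush)],
        stdout=subprocess.PIPE,
        env=env,
    ) as child:
        child.stdout.readline()  # blocks until output actually reaches the pipe
        delay = time.monotonic() - start
        child.stdout.read()  # drain the rest so the child can exit cleanly
    return delay


if __name__ == "__main__":
    print(f"without flush: {first_line_delay(False):.2f} s")
    print(f"with flush:    {first_line_delay(True):.2f} s")
```

    Without flushing, the first line should only arrive once the child exits, so the delay includes the full pause. With flush=True it should arrive almost as soon as the interpreter starts, and that is the point at which a G_IO_IN watch on the GLib side can fire.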