How to parallelize processes in PHP

Today we will explain how to fork a process in our PHP scripts so that we can parallelize tasks that put a heavy load on the processor, or that can simply run in parallel because they do not depend on each other and each one solves part of a common task.

Why do we need to parallelize processes?

When we develop an algorithm, depending on the language we use to program it, we can solve its parts sequentially or in parallel. If one section of the algorithm has a heavy processing load and takes a long time to complete, we can solve other parts while the slowest part is still running. The total running time of the algorithm then goes from the sum of all the parts to, at best, the time of the slowest part. It is true that we can only take advantage of parallelization when we have redundant resources (e.g. processors), but most servers today have multiple cores, so it is well worth learning to parallelize our processes.
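As a rough illustration of that claim, the sketch below compares the two running times for three independent parts. The durations are made-up values chosen only for this example, and the calculation ignores the cost of creating processes and assumes there is a free core for every part.

```php
<?php
// Hypothetical durations (in seconds) of three independent parts of a task.
$durations = array(4.0, 1.5, 2.5);

// Run one after another: the total is the sum of the parts.
$sequential = array_sum($durations);

// Run at the same time on enough free cores: the total is the slowest part.
$parallel = max($durations);

echo "Sequential: $sequential s, parallel: $parallel s\n";
```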

The theory behind parallelization tells us that when a parent process launches a child process, a new process is created in memory that is identical to the parent but has its own process ID (pid), different from the parent's, and it continues executing from the statement that follows the one that created it.
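A minimal sketch of this behaviour in PHP might look like the following. It assumes the pcntl extension is installed and that the script is run from the command line; the messages and variable names are only illustrative.

```php
<?php
// Minimal sketch of fork semantics; assumes the pcntl extension and the CLI.
$pid = pcntl_fork();

if ($pid === -1) {
    // The fork failed and no child was created.
    echo "Could not fork\n";
    exit(1);
} elseif ($pid === 0) {
    // Child: a copy of the parent that resumes right after pcntl_fork().
    echo "Child running with pid " . getmypid() . "\n";
    exit(0);
}

// Parent: $pid holds the child's pid. Wait for that child to finish.
pcntl_waitpid($pid, $status);
$exit_code = pcntl_wexitstatus($status);
echo "Parent " . getmypid() . " saw child $pid exit with code $exit_code\n";
```

Both processes execute the code that follows the call to pcntl_fork; the return value is what lets each one know which role it has.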

Forking a process in PHP

First of all, in order to parallelize processes we need to have the PHP process control extensions installed (http://www.php.net/manual/en/refs.fileprocess.process.php). It is important to note that pcntl is meant to be used only from the CLI (Command Line Interface). Within this group of extensions, the ones that allow us to parallelize our processes are Process Control itself (PHP PCNTL http://www.php.net/manual/en/book.pcntl.php) and the extension for sharing memory between processes (PHP shared memory http://www.php.net/manual/en/book.shmop.php). To explain the parallelization process we will use a (rather simple) example in which a parent process launches 10 child processes in parallel, each of which generates a random number. Each child places its number in an area of memory shared with its parent; the parent picks up each number generated by its children and returns the sum of all of them.
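Before running the example it can help to confirm that these requirements are met. The following is a small sketch of such a check; the messages and the $problems variable are only illustrative and are not part of the example itself.

```php
<?php
// Collect anything that would prevent the fork example from running.
$problems = array();

// pcntl is meant for the command line, so check how PHP was started.
if (PHP_SAPI !== 'cli') {
    $problems[] = 'this script must be run from the command line';
}

// Both extensions used in this article must be loaded.
foreach (array('pcntl', 'shmop') as $extension) {
    if (!extension_loaded($extension)) {
        $problems[] = "the $extension extension is not loaded";
    }
}

if (count($problems) > 0) {
    echo "Cannot run the example: " . implode('; ', $problems) . "\n";
} else {
    echo "Ready to fork\n";
}
```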

Let’s see the code:

<?php
function multiple_forks()
{
    $array_pids = array();
    $sum = 0;
    //Store the parent's process id
    $parent_pid = getmypid();
    for($i=0;$i<10;$i++)
    {
        if(getmypid() == $parent_pid)
        {//We are in the parent process, so we launch a child and store its pid
            $pid = pcntl_fork(); //pcntl_fork creates a child process
            if($pid > 0) $array_pids[] = $pid; //Only the parent gets a pid > 0; -1 means the fork failed
        }
    }
    //Once we have launched the 10 children we generate the random numbers (in the child processes)
    //or we sum them all if we are in the parent process
    if(getmypid() == $parent_pid)
    {//We are in the parent process
        while(count($array_pids) > 0)
        {//While there are children being executed, we wait for them to finish
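            $pid = pcntl_waitpid(-1,$status);
            if($pid <= 0) break; //There are no more children to wait for
            //Open the shared memory block created by our child $pid
            $shared_id = @shmop_open($pid,"a",0,0);
            if($shared_id)
            {//The block exists, so the child managed to write its number
                $share_data = shmop_read($shared_id,0,shmop_size($shared_id));
                $sum += (int)$share_data;
                //Mark the memory block to be deleted and close it
                shmop_delete($shared_id);
                shmop_close($shared_id);
            }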
            //Delete the process from the children queue
            foreach($array_pids as $key => $child)
            {
                if($pid == $child) unset($array_pids[$key]);
            }
        }
    }
    else
    {//We are in the child process
        $num = rand(0,100);
        $shared_id = shmop_open(getmypid(),"c",0644,strlen($num));
        if(!$shared_id)
        {//It was impossible to create the shared memory
            echo "There was an error while trying to create the shared memory in the child process ".getmypid()."\n";
        }
        else
        {
            if(strlen($num) != shmop_write($shared_id,$num,0))
            {
                echo "There was an error while trying to write the number $num in the child ".getmypid()."\n";
            }
            else
            {
                shmop_close($shared_id);
            }
         }
         //Exit telling the parent that everything was fine
         exit(0);
    }
    return $sum;
}
?>

Most likely the code can be optimized further, but that is not the purpose of this article. Let’s take a look at the process control and shared memory functions used in the code:

  • pcntl_fork. It allows us to launch a child from a parent process. In the parent it returns the child's pid, in the child it returns 0, and on failure it returns -1. (pcntl_fork in PHP)
  • pcntl_waitpid. It puts the parent process in “waiting mode” until a child finishes its execution. The parameter -1 indicates that it waits for any child, whichever finishes first. (pcntl_waitpid in PHP)
  • shmop_open. It creates or opens a shared memory block. The first parameter is the key that identifies the block, and nothing works better here than the pid of the child: the parent knows the identifier the block was created with and can access the shared data. (shmop_open in PHP)
  • shmop_read. It reads data from a shared memory block. (shmop_read in PHP)
  • shmop_delete. It marks the memory block for deletion. The system releases the block automatically once all the processes attached to it have detached. (shmop_delete in PHP)
  • shmop_close. It closes a memory block, telling the system that the process detaches from it. As of PHP 8.0 it is deprecated and has no effect. (shmop_close in PHP)
  • shmop_size. It returns the size of a shared memory block. (shmop_size in PHP)
  • shmop_write. It writes data to a shared memory block. (shmop_write in PHP)
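
To see the shared memory functions in isolation, here is a minimal sketch that writes a value and reads it back within a single process. It assumes the shmop extension is loaded; the value "42" is arbitrary, and the current pid is reused as the key only to mirror what the article does with the child's pid.

```php
<?php
// Minimal sketch: write and read a shared memory block within one process.
$key = getmypid();   // key chosen for this example
$value = "42";       // arbitrary data to store

// "c" creates the block (or opens it if it already exists) with these permissions and size.
$block = shmop_open($key, "c", 0644, strlen($value));
if ($block === false) {
    exit("Could not create the shared memory block\n");
}

// shmop_write returns the number of bytes written.
$written = shmop_write($block, $value, 0);

// shmop_size gives the block size, so shmop_read can fetch all of it.
$size = shmop_size($block);
$read = shmop_read($block, 0, $size);

// Mark the block for removal once every process has detached from it.
shmop_delete($block);

echo "Wrote $written bytes, read back \"$read\"\n";
```

In the article's example the write happens in the child and the read in the parent, but the sequence of calls is the same.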

With this, we have the basic knowledge needed to create parallel processes in our controllers and to launch as many child processes as we have processors in our servers.
