Parallelize a Bash FOR Loop
I have been trying to parallelize the following script, specifically each of the three FOR loop iterations, using GNU Parallel, but haven't been able to. The four commands inside the FOR loop run in series, and each iteration takes around 10 minutes.



#!/bin/bash

kar='KAR5'
runList='run2 run3 run4'
mkdir normFunc

for run in $runList
do
    fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
    fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear

    rm -f *.mat
done

shell-script gnu-parallel

asked Dec 5 '13 at 21:04 by Ravnoor S Gill (edited Nov 27 '16 at 14:42 by Jeff Schaller)


10 Answers


















          70














          Why don't you just fork (aka. background) them?



foo () {
    local run=$1
    fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
    fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear
}

for run in $runList; do foo "$run" & done


          In case that's not clear, the significant part is here:



for run in $runList; do foo "$run" & done
                                   ^ (the ampersand)


The & causes the function to be executed in a forked shell in the background. That's parallel.
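A minimal self-contained sketch of the same pattern, with a placeholder function standing in for the real FSL pipeline (process_run is hypothetical; the trailing wait, suggested in the comments below, keeps the script from exiting before the background jobs finish):

#!/bin/bash
# process_run is a hypothetical stand-in for the four fsl5.0 commands
process_run() {
    echo "processing $1"
    sleep 2   # simulate the ~10 minutes of real work
}

for run in run2 run3 run4; do
    process_run "$run" &   # fork each iteration into the background
done
wait   # block until every background job has exited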






answered Dec 5 '13 at 21:11 by goldilocks (edited Dec 5 '13 at 21:18 by jordanm)



















          • 5




            That worked like a charm. Thank you. Such a simple implementation (Makes me feel so stupid now!).
            – Ravnoor S Gill
            Dec 5 '13 at 21:24






          • 7




            In case I had 8 files to run in parallel but only 4 cores, could that be integrated in such a setting or would that require a Job Scheduler?
            – Ravnoor S Gill
            Dec 5 '13 at 21:27






          • 5




It doesn't really matter in this context; it's normal for the system to have more active processes than cores. If you have many short tasks, ideally you would feed a queue serviced by a number of worker threads < the number of cores. I don't know how often that is really done with shell scripting (in which case, they wouldn't be threads, they'd be independent processes) but with relatively few long tasks it would be pointless. The OS scheduler will take care of them.
            – goldilocks
            Dec 5 '13 at 21:50








          • 12




            You also might want to add a wait command at the end so the master script does not exit until all of the background jobs do.
            – psusi
            Nov 19 '15 at 0:22






          • 1




I would also find it useful to limit the number of concurrent processes: my processes each use 100% of a core's time for about 25 minutes. This is on a shared server with 16 cores, where many people are running jobs. I need to run 23 copies of the script. If I run them all concurrently, then I swamp the server, and make it useless for everyone else for an hour or two (load goes up to 30, everything else slows way down). I guess it could be done with nice, but then I don't know if it'd ever finish...
            – naught101
            Nov 26 '15 at 23:00



















          106














          Sample task



task(){
    sleep 0.5; echo "$1";
}


          Sequential runs



for thing in a b c d e f g; do
    task "$thing"
done


          Parallel runs



for thing in a b c d e f g; do
    task "$thing" &
done


          Parallel runs in N-process batches



N=4
(
for thing in a b c d e f g; do
    ((i=i%N)); ((i++==0)) && wait
    task "$thing" &
done
)


It's also possible to use FIFOs as semaphores and use them to ensure that new processes are spawned as soon as possible and that no more than N processes run at the same time. But it requires more code.



          N processes with a FIFO-based semaphore:



open_sem(){
    mkfifo pipe-$$
    exec 3<>pipe-$$
    rm pipe-$$
    local i=$1
    for((;i>0;i--)); do
        printf %s 000 >&3
    done
}
run_with_lock(){
    local x
    read -u 3 -n 3 x && ((0==x)) || exit $x
    (
        ( "$@"; )
        printf '%.3d' $? >&3
    )&
}

N=4
open_sem $N
for thing in {a..g}; do
    run_with_lock task $thing
done





answered Jul 16 '15 at 14:05 by PSkocik (edited Dec 17 at 10:28)



















          • 3




            The line with wait in it basically lets all processes run, until it hits the nth process, then waits for all of the others to finish running, is that right?
            – naught101
            Nov 26 '15 at 23:03












          • If i is zero, call wait. Increment i after the zero test.
            – PSkocik
            Nov 26 '15 at 23:08








          • 1




            Love the n parallel runs! Thank you.
            – joshperry
            Sep 15 '16 at 16:31






          • 1




            @naught101 Yes. wait w/ no arg waits for all children. That makes it a little wasteful. The pipe-based-semaphore approach gives you more fluent concurrency (I've been using that in a custom shell based build system along with -nt/-ot checks successfully for a while now)
            – PSkocik
            Mar 10 at 20:02










          • what does "$1" mean here?
            – kyle
            Apr 8 at 0:01





















          56














for stuff in things
do
    ( something
      with
      stuff ) &
done
wait # for all the something with stuff


          Whether it actually works depends on your commands; I'm not familiar with them. The rm *.mat looks a bit prone to conflicts if it runs in parallel...
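A hedged sketch of the per-run cleanup the comments below settle on, replacing the global rm *.mat so that parallel iterations don't delete each other's matrices (the glob "$run".*.mat is an assumption based on the file names in the question):

for run in run2 run3 run4
do
    (
        # ... the four fsl5.0 commands for "$run" go here ...
        rm -f "$run".*.mat   # delete only this run's matrices
    ) &
done
wait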






answered Dec 5 '13 at 21:10 by frostschutz (edited Dec 5 '13 at 21:24)



















          • 2




            This runs perfectly as well. You are right I would have to change rm *.mat to something like rm $run".mat" to get it to work without one process interfering with the other. Thank you.
            – Ravnoor S Gill
            Dec 5 '13 at 21:38












          • @RavnoorSGill Welcome to Stack Exchange! If this answer solved your problem, please mark it as accepted by ticking the check mark next to it.
            – Gilles
            Dec 5 '13 at 23:54






          • 5




            +1 for wait, which I forgot.
            – goldilocks
            Dec 6 '13 at 12:13






          • 3




            If there are tons of 'things', won't this start tons of processes? It would be better to start only a sane number of processes simultaneously, right?
            – David Doria
            Mar 20 '15 at 15:17










          • @DavidDoria sure, this is meant for small scale. (The example in the question had only three items). I use this style for unlocking a dozen LUKS containers on bootup... if I had a lot more, I'd have to use some other method, but on a small scale this is simple enough.
            – frostschutz
            Mar 20 '15 at 16:41





















          24














for stuff in things
do
    sem -j+0 ( something
               with
               stuff )
done
sem --wait


          This will use semaphores, parallelizing as many iterations as the number of available cores (-j +0 means you will parallelize N+0 jobs, where N is the number of available cores).



sem --wait waits until all the iterations in the for loop have terminated before the subsequent lines of code execute.



          Note: you will need "parallel" from the GNU parallel project (sudo apt-get install parallel).
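Note that, as a later answer on this page points out, sem does not accept a bracketed block; passing the loop body as a single quoted string works instead (a sketch using the same placeholders):

for stuff in things
do
    sem -j +0 "something; with; stuff"
done
sem --wait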

























          • 1




            is it possible to go past 60? mine throws an error saying not enough file descriptors.
            – chovy
            Nov 27 '15 at 7:47



















          7














          One really easy way that I often use:



          cat "args" | xargs -P $NUM_PARALLEL command


This will run the command, passing in each line of the "args" file, in parallel, running at most $NUM_PARALLEL instances at the same time.



          You can also look into the -I option for xargs, if you need to substitute the input arguments in different places.
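For example, a hedged sketch (gzip --keep is just a stand-in workload, and the file args is assumed to hold one argument per line):

# run at most 4 jobs at once; -I {} substitutes each input line
# wherever {} appears in the command
xargs -P 4 -I {} gzip --keep {} < args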



































            6














It seems the fsl jobs depend on each other, so the 4 jobs cannot be run in parallel. The runs, however, can be run in parallel.



            Make a bash function running a single run and run that function in parallel:



#!/bin/bash

myfunc() {
    run=$1
    kar='KAR5'
    mkdir normFunc
    fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
    fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear
}

export -f myfunc
parallel myfunc ::: run2 run3 run4
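By default GNU parallel runs one job per CPU core; the -j option (covered in the tutorial linked below) caps it explicitly. A hedged usage sketch:

parallel -j 2 myfunc ::: run2 run3 run4   # at most 2 runs at a time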


To learn more, watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1 and spend an hour walking through the tutorial http://www.gnu.org/software/parallel/parallel_tutorial.html. Your command line will love you for it.



























            • If you're using a non-bash shell you'll need to also export SHELL=/bin/bash before running parallel. Otherwise you'll get an error like: Unknown command 'myfunc arg'
              – AndrewHarvey
              Jul 31 '15 at 3:39








            • 1




              @AndrewHarvey: isn't that what the shebang is for?
              – naught101
              Nov 26 '15 at 23:02



















            2














Parallel execution with at most N concurrent processes



#!/bin/bash

N=4

for i in {a..z}; do
    (
        # .. do your stuff here
        echo "starting task $i.."
        sleep $(( (RANDOM % 3) + 1))
    ) &

    # allow only up to $N jobs to execute in parallel
    if [[ $(jobs -r -p | wc -l) -gt $N ]]; then
        # wait -n (bash 4.3+) returns as soon as any one job finishes
        wait -n
    fi

done

# wait for pending jobs
wait

echo "all done"


































              0














I had trouble with @PSkocik's solution. My system does not have GNU Parallel available as a package, and sem threw an exception when I built and ran it manually. I then tried the FIFO semaphore example as well, which also threw some other errors regarding communication.



              @eyeApps suggested xargs but I didn't know how to make it work with my complex use case (examples would be welcome).



              Here is my solution for parallel jobs which process up to N jobs at a time as configured by _jobs_set_max_parallel:



              _lib_jobs.sh:



function _jobs_get_count_e {
    jobs -r | wc -l | tr -d " "
}

function _jobs_set_max_parallel {
    g_jobs_max_jobs=$1
}

function _jobs_get_max_parallel_e {
    # print the configured maximum, or default to 1 if unset
    [[ $g_jobs_max_jobs ]] && {
        echo $g_jobs_max_jobs
        return 0
    }

    echo 1
}

function _jobs_is_parallel_available_r() {
    (( $(_jobs_get_count_e) < $g_jobs_max_jobs )) &&
        return 0

    return 1
}

function _jobs_wait_parallel() {
    # Sleep between available jobs
    while true; do
        _jobs_is_parallel_available_r &&
            break

        sleep 0.1s
    done
}

function _jobs_wait() {
    wait
}


              Example usage:



#!/bin/bash

source "_lib_jobs.sh"

_jobs_set_max_parallel 3

# Run 10 jobs in parallel with varying amounts of work
for a in {1..10}; do
    _jobs_wait_parallel

    # Sleep between 1-2 seconds to simulate busy work
    sleep_delay=$(echo "scale=1; $(shuf -i 10-20 -n 1)/10" | bc -l)

    ( ### ASYNC
        echo $a
        sleep ${sleep_delay}s
    ) &
done

# Visualize jobs
while true; do
    n_jobs=$(_jobs_get_count_e)

    [[ $n_jobs = 0 ]] &&
        break

    sleep 0.1s
done




































                0














In my case, I can't use semaphore (I'm in git-bash on Windows), so I came up with a generic way to split the task among N workers before they begin.



                It works well if the tasks take roughly the same amount of time. The disadvantage is that, if one of the workers takes a long time to do its part of the job, the others that already finished won't help.



                Splitting the job among N workers (1 per core)



# array of assets, assuming at least 1 item exists
listAssets=( {a..z} ) # example: a b c d .. z
# listAssets=( ~/"path with spaces/"*.txt ) # could be file paths

# replace with your task
task() { # $1 = idWorker, $2 = asset
    echo "Worker $1: Asset '$2' START!"
    # simulating a task that randomly takes 3-6 seconds
    sleep $(( ($RANDOM % 4) + 3 ))
    echo " Worker $1: Asset '$2' OK!"
}

nVirtualCores=$(nproc --all)
nWorkers=$(( $nVirtualCores * 1 )) # I want 1 process per core

worker() { # $1 = idWorker
    echo "Worker $1 GO!"
    idAsset=0
    for asset in "${listAssets[@]}"; do
        # split assets among workers (using modulo); each worker will go through
        # the list and select the asset only if it belongs to that worker
        (( idAsset % nWorkers == $1 )) && task $1 "$asset"
        (( idAsset++ ))
    done
    echo " Worker $1 ALL DONE!"
}

for (( idWorker=0; idWorker<nWorkers; idWorker++ )); do
    # start workers in parallel, use 1 process for each
    worker $idWorker &
done
wait # until all workers are done


































                  0














I really like the answer from @lev as it provides control over the maximum number of processes in a very simple manner. However, as described in the manual, sem does not work with brackets.



for stuff in things
do
    sem -j +0 "something;
               with;
               stuff"
done
sem --wait


                  Does the job.




                  -j +N Add N to the number of CPU cores. Run up to this many jobs in parallel. For compute intensive jobs -j +0 is useful as it will run number-of-cpu-cores jobs simultaneously.



                  -j -N Subtract N from the number of CPU cores. Run up to this many jobs in parallel. If the evaluated number is less than 1 then 1 will be used. See also --use-cpus-instead-of-cores.
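A concrete hedged sketch of the quoted-string form (gzip is a placeholder workload, not from the original answer):

# queue one compression job per log file, at most one job per CPU core
for f in *.log; do
    sem -j +0 "gzip --keep '$f'"
done
sem --wait   # block until every queued job has finished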



























                    protected by Kusalananda Dec 17 at 10:34



                    Thank you for your interest in this question.
                    Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).



                    Would you like to answer one of these unanswered questions instead?














                    10 Answers
                    10






                    active

                    oldest

                    votes








                    10 Answers
                    10






                    active

                    oldest

                    votes









                    active

                    oldest

                    votes






                    active

                    oldest

                    votes









                    70














                    Why don't you just fork (aka. background) them?



                    foo () {
                    local run=$1
                    fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
                    fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
                    fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
                    fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear
                    }

                    for run in $runList; do foo "$run" & done


                    In case that's not clear, the significant part is here:



                    for run in $runList; do foo "$run" & done
                    ^


                    Causing the function to be executed in a forked shell in the background. That's parallel.






                    share|improve this answer



















                    • 5




                      That worked like a charm. Thank you. Such a simple implementation (Makes me feel so stupid now!).
                      – Ravnoor S Gill
                      Dec 5 '13 at 21:24






                    • 7




                      In case I had 8 files to run in parallel but only 4 cores, could that be integrated in such a setting or would that require a Job Scheduler?
                      – Ravnoor S Gill
                      Dec 5 '13 at 21:27






                    • 5




                      It doesn't really matter in this context; it's normal for the system to have more active processes than cores. If you have many short tasks, ideally you would feed a queue serviced by a number or worker threads < the number of cores. I don't know how often that is really done with shell scripting (in which case, they wouldn't be threads, they'd be independent processes) but with relatively few long tasks it would be pointless. The OS scheduler will take care of them.
                      – goldilocks
                      Dec 5 '13 at 21:50








                    • 12




                      You also might want to add a wait command at the end so the master script does not exit until all of the background jobs do.
                      – psusi
                      Nov 19 '15 at 0:22






                    • 1




                      I would also fine it useful to limit the number of concurrent processes: my processes each use 100% of a core's time for about 25 minutes. This is on a shared server with 16 cores, where many people are running jobs. I need to run 23 copies of the script. If I run them all concurrently, then I swamp the server, and make it useless for everyone else for an hour or two (load goes up to 30, everything else slows way down). I guess it could be done with nice, but then I don't know if it'd ever finish..
                      – naught101
                      Nov 26 '15 at 23:00
















                    70














                    Why don't you just fork (aka. background) them?



                    foo () {
                    local run=$1
                    fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
                    fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
                    fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
                    fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear
                    }

                    for run in $runList; do foo "$run" & done


                    In case that's not clear, the significant part is here:



                    for run in $runList; do foo "$run" & done
                    ^


                    Causing the function to be executed in a forked shell in the background. That's parallel.






                    share|improve this answer



















                    • 5




                      That worked like a charm. Thank you. Such a simple implementation (Makes me feel so stupid now!).
                      – Ravnoor S Gill
                      Dec 5 '13 at 21:24






                    • 7




                      In case I had 8 files to run in parallel but only 4 cores, could that be integrated in such a setting or would that require a Job Scheduler?
                      – Ravnoor S Gill
                      Dec 5 '13 at 21:27






                    • 5




                      It doesn't really matter in this context; it's normal for the system to have more active processes than cores. If you have many short tasks, ideally you would feed a queue serviced by a number or worker threads < the number of cores. I don't know how often that is really done with shell scripting (in which case, they wouldn't be threads, they'd be independent processes) but with relatively few long tasks it would be pointless. The OS scheduler will take care of them.
                      – goldilocks
                      Dec 5 '13 at 21:50








                    • 12




                      You also might want to add a wait command at the end so the master script does not exit until all of the background jobs do.
                      – psusi
                      Nov 19 '15 at 0:22






                    • 1




                      I would also fine it useful to limit the number of concurrent processes: my processes each use 100% of a core's time for about 25 minutes. This is on a shared server with 16 cores, where many people are running jobs. I need to run 23 copies of the script. If I run them all concurrently, then I swamp the server, and make it useless for everyone else for an hour or two (load goes up to 30, everything else slows way down). I guess it could be done with nice, but then I don't know if it'd ever finish..
                      – naught101
                      Nov 26 '15 at 23:00














                    70












                    70








                    70






                    Why don't you just fork (aka. background) them?



                    foo () {
                    local run=$1
                    fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
                    fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
                    fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
                    fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear
                    }

                    for run in $runList; do foo "$run" & done


                    In case that's not clear, the significant part is here:



                    for run in $runList; do foo "$run" & done
                    ^


                    Causing the function to be executed in a forked shell in the background. That's parallel.






                    share|improve this answer














                    Why don't you just fork (aka. background) them?



                    foo () {
                    local run=$1
                    fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
                    fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
                    fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
                    fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear
                    }

                    for run in $runList; do foo "$run" & done


                    In case that's not clear, the significant part is here:



                    for run in $runList; do foo "$run" & done
                    ^


                    Causing the function to be executed in a forked shell in the background. That's parallel.







                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited Dec 5 '13 at 21:18









                    jordanm

                    30.2k28292




                    30.2k28292










                    answered Dec 5 '13 at 21:11









                    goldilocks

                    61.5k13151207




                    61.5k13151207








                    • 5




                      That worked like a charm. Thank you. Such a simple implementation (Makes me feel so stupid now!).
                      – Ravnoor S Gill
                      Dec 5 '13 at 21:24






                    • 7




                      In case I had 8 files to run in parallel but only 4 cores, could that be integrated in such a setting or would that require a Job Scheduler?
                      – Ravnoor S Gill
                      Dec 5 '13 at 21:27






                    • 5




                      It doesn't really matter in this context; it's normal for the system to have more active processes than cores. If you have many short tasks, ideally you would feed a queue serviced by a number or worker threads < the number of cores. I don't know how often that is really done with shell scripting (in which case, they wouldn't be threads, they'd be independent processes) but with relatively few long tasks it would be pointless. The OS scheduler will take care of them.
                      – goldilocks
                      Dec 5 '13 at 21:50








                    • 12




                      You also might want to add a wait command at the end so the master script does not exit until all of the background jobs do.
                      – psusi
                      Nov 19 '15 at 0:22






                    • 1




                      I would also fine it useful to limit the number of concurrent processes: my processes each use 100% of a core's time for about 25 minutes. This is on a shared server with 16 cores, where many people are running jobs. I need to run 23 copies of the script. If I run them all concurrently, then I swamp the server, and make it useless for everyone else for an hour or two (load goes up to 30, everything else slows way down). I guess it could be done with nice, but then I don't know if it'd ever finish..
                      – naught101
                      Nov 26 '15 at 23:00














                    • 5




                      That worked like a charm. Thank you. Such a simple implementation (Makes me feel so stupid now!).
                      – Ravnoor S Gill
                      Dec 5 '13 at 21:24






                    • 7




                      In case I had 8 files to run in parallel but only 4 cores, could that be integrated in such a setting or would that require a Job Scheduler?
                      – Ravnoor S Gill
                      Dec 5 '13 at 21:27






                    • 5




                      It doesn't really matter in this context; it's normal for the system to have more active processes than cores. If you have many short tasks, ideally you would feed a queue serviced by a number or worker threads < the number of cores. I don't know how often that is really done with shell scripting (in which case, they wouldn't be threads, they'd be independent processes) but with relatively few long tasks it would be pointless. The OS scheduler will take care of them.
                      – goldilocks
                      Dec 5 '13 at 21:50








                    • 12




                      You also might want to add a wait command at the end so the master script does not exit until all of the background jobs do.
                      – psusi
                      Nov 19 '15 at 0:22






                    • 1




                      I would also fine it useful to limit the number of concurrent processes: my processes each use 100% of a core's time for about 25 minutes. This is on a shared server with 16 cores, where many people are running jobs. I need to run 23 copies of the script. If I run them all concurrently, then I swamp the server, and make it useless for everyone else for an hour or two (load goes up to 30, everything else slows way down). I guess it could be done with nice, but then I don't know if it'd ever finish..
                      – naught101
                      Nov 26 '15 at 23:00








                    5




                    5




                    That worked like a charm. Thank you. Such a simple implementation (Makes me feel so stupid now!).
                    – Ravnoor S Gill
                    Dec 5 '13 at 21:24




                    That worked like a charm. Thank you. Such a simple implementation (Makes me feel so stupid now!).
                    – Ravnoor S Gill
                    Dec 5 '13 at 21:24




                    7




                    7




                    In case I had 8 files to run in parallel but only 4 cores, could that be integrated in such a setting or would that require a Job Scheduler?
                    – Ravnoor S Gill
                    Dec 5 '13 at 21:27




                    In case I had 8 files to run in parallel but only 4 cores, could that be integrated in such a setting or would that require a Job Scheduler?
                    – Ravnoor S Gill
                    Dec 5 '13 at 21:27




                    5




                    5




                    It doesn't really matter in this context; it's normal for the system to have more active processes than cores. If you have many short tasks, ideally you would feed a queue serviced by a number or worker threads < the number of cores. I don't know how often that is really done with shell scripting (in which case, they wouldn't be threads, they'd be independent processes) but with relatively few long tasks it would be pointless. The OS scheduler will take care of them.
                    – goldilocks
                    Dec 5 '13 at 21:50






                    It doesn't really matter in this context; it's normal for the system to have more active processes than cores. If you have many short tasks, ideally you would feed a queue serviced by a number or worker threads < the number of cores. I don't know how often that is really done with shell scripting (in which case, they wouldn't be threads, they'd be independent processes) but with relatively few long tasks it would be pointless. The OS scheduler will take care of them.
                    – goldilocks
                    Dec 5 '13 at 21:50






                    12




                    12




                    You also might want to add a wait command at the end so the master script does not exit until all of the background jobs do.
                    – psusi
                    Nov 19 '15 at 0:22




                    You also might want to add a wait command at the end so the master script does not exit until all of the background jobs do.
                    – psusi
                    Nov 19 '15 at 0:22




                    1




                    1




                    I would also fine it useful to limit the number of concurrent processes: my processes each use 100% of a core's time for about 25 minutes. This is on a shared server with 16 cores, where many people are running jobs. I need to run 23 copies of the script. If I run them all concurrently, then I swamp the server, and make it useless for everyone else for an hour or two (load goes up to 30, everything else slows way down). I guess it could be done with nice, but then I don't know if it'd ever finish..
                    – naught101
                    Nov 26 '15 at 23:00




                    I would also fine it useful to limit the number of concurrent processes: my processes each use 100% of a core's time for about 25 minutes. This is on a shared server with 16 cores, where many people are running jobs. I need to run 23 copies of the script. If I run them all concurrently, then I swamp the server, and make it useless for everyone else for an hour or two (load goes up to 30, everything else slows way down). I guess it could be done with nice, but then I don't know if it'd ever finish..
                    – naught101
                    Nov 26 '15 at 23:00













                    106














                    Sample task



                    task(){
                    sleep 0.5; echo "$1";
                    }


                    Sequential runs



                    for thing in a b c d e f g; do 
                    task "$thing"
                    done


                    Parallel runs



                    for thing in a b c d e f g; do 
                    task "$thing" &
                    done


                    Parallel runs in N-process batches



                    N=4
                    (
                    for thing in a b c d e f g; do
                    ((i=i%N)); ((i++==0)) && wait
                    task "$thing" &
                    done
                    )


                    It's also possible to use FIFOs as semaphores and use them to ensure that new processes are spawned as soon as possible and that no more than N processes runs at the same time. But it requires more code.



                    N processes with a FIFO-based semaphore:



                    open_sem(){
                    mkfifo pipe-$$
                    exec 3<>pipe-$$
                    rm pipe-$$
                    local i=$1
                    for((;i>0;i--)); do
                    printf %s 000 >&3
                    done
                    }
                    run_with_lock(){
                    local x
                    read -u 3 -n 3 x && ((0==x)) || exit $x
                    (
                    ( "$@"; )
                    printf '%.3d' $? >&3
                    )&
                    }

                    N=4
                    open_sem $N
                    for thing in {a..g}; do
                    run_with_lock task $thing
                    done





                    share|improve this answer



















                    • 3




                      The line with wait in it basically lets all processes run, until it hits the nth process, then waits for all of the others to finish running, is that right?
                      – naught101
                      Nov 26 '15 at 23:03












                    • If i is zero, call wait. Increment i after the zero test.
                      – PSkocik
                      Nov 26 '15 at 23:08








                    • 1




                      Love the n parallel runs! Thank you.
                      – joshperry
                      Sep 15 '16 at 16:31






                    • 1




                      @naught101 Yes. wait w/ no arg waits for all children. That makes it a little wasteful. The pipe-based-semaphore approach gives you more fluent concurrency (I've been using that in a custom shell based build system along with -nt/-ot checks successfully for a while now)
                      – PSkocik
                      Mar 10 at 20:02










                    • what does "$1" mean here?
                      – kyle
                      Apr 8 at 0:01


















                    106














                    Sample task



                    task(){
                    sleep 0.5; echo "$1";
                    }


                    Sequential runs



                    for thing in a b c d e f g; do 
                    task "$thing"
                    done


                    Parallel runs



                    for thing in a b c d e f g; do 
                    task "$thing" &
                    done


                    Parallel runs in N-process batches



                    N=4
                    (
                    for thing in a b c d e f g; do
                    ((i=i%N)); ((i++==0)) && wait
                    task "$thing" &
                    done
                    )


                    It's also possible to use FIFOs as semaphores and use them to ensure that new processes are spawned as soon as possible and that no more than N processes runs at the same time. But it requires more code.



                    N processes with a FIFO-based semaphore:



                    open_sem(){
                    mkfifo pipe-$$
                    exec 3<>pipe-$$
                    rm pipe-$$
                    local i=$1
                    for((;i>0;i--)); do
                    printf %s 000 >&3
                    done
                    }
                    run_with_lock(){
                    local x
                    read -u 3 -n 3 x && ((0==x)) || exit $x
                    (
                    ( "$@"; )
                    printf '%.3d' $? >&3
                    )&
                    }

                    N=4
                    open_sem $N
                    for thing in {a..g}; do
                    run_with_lock task $thing
                    done





                    share|improve this answer



















                    • 3




                      The line with wait in it basically lets all processes run, until it hits the nth process, then waits for all of the others to finish running, is that right?
                      – naught101
                      Nov 26 '15 at 23:03












                    • If i is zero, call wait. Increment i after the zero test.
                      – PSkocik
                      Nov 26 '15 at 23:08








                    • 1




                      Love the n parallel runs! Thank you.
                      – joshperry
                      Sep 15 '16 at 16:31






                    • 1




                      @naught101 Yes. wait w/ no arg waits for all children. That makes it a little wasteful. The pipe-based-semaphore approach gives you more fluent concurrency (I've been using that in a custom shell based build system along with -nt/-ot checks successfully for a while now)
                      – PSkocik
                      Mar 10 at 20:02










                    • what does "$1" mean here?
                      – kyle
                      Apr 8 at 0:01
















                    106












                    106








                    106






                    Sample task



                    task(){
                    sleep 0.5; echo "$1";
                    }


                    Sequential runs



                    for thing in a b c d e f g; do 
                    task "$thing"
                    done


                    Parallel runs



                    for thing in a b c d e f g; do 
                    task "$thing" &
                    done


                    Parallel runs in N-process batches



                    N=4
                    (
                    for thing in a b c d e f g; do
                    ((i=i%N)); ((i++==0)) && wait
                    task "$thing" &
                    done
                    )


                    It's also possible to use FIFOs as semaphores and use them to ensure that new processes are spawned as soon as possible and that no more than N processes runs at the same time. But it requires more code.



                    N processes with a FIFO-based semaphore:



                    open_sem(){
                    mkfifo pipe-$$
                    exec 3<>pipe-$$
                    rm pipe-$$
                    local i=$1
                    for((;i>0;i--)); do
                    printf %s 000 >&3
                    done
                    }
                    run_with_lock(){
                    local x
                    read -u 3 -n 3 x && ((0==x)) || exit $x
                    (
                    ( "$@"; )
                    printf '%.3d' $? >&3
                    )&
                    }

                    N=4
                    open_sem $N
                    for thing in {a..g}; do
                    run_with_lock task $thing
                    done





                    share|improve this answer














                    Sample task



                    task(){
                    sleep 0.5; echo "$1";
                    }


                    Sequential runs



                    for thing in a b c d e f g; do 
                    task "$thing"
                    done


                    Parallel runs



                    for thing in a b c d e f g; do 
                    task "$thing" &
                    done


                    Parallel runs in N-process batches



                    N=4
                    (
                    for thing in a b c d e f g; do
                    ((i=i%N)); ((i++==0)) && wait
                    task "$thing" &
                    done
                    )


                    It's also possible to use FIFOs as semaphores and use them to ensure that new processes are spawned as soon as possible and that no more than N processes runs at the same time. But it requires more code.



                    N processes with a FIFO-based semaphore:



                    open_sem(){
                    mkfifo pipe-$$
                    exec 3<>pipe-$$
                    rm pipe-$$
                    local i=$1
                    for((;i>0;i--)); do
                    printf %s 000 >&3
                    done
                    }
                    run_with_lock(){
                    local x
                    read -u 3 -n 3 x && ((0==x)) || exit $x
                    (
                    ( "$@"; )
                    printf '%.3d' $? >&3
                    )&
                    }

                    N=4
                    open_sem $N
                    for thing in {a..g}; do
                    run_with_lock task $thing
                    done






                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited Dec 17 at 10:28

























                    answered Jul 16 '15 at 14:05









                    PSkocik

                    17.7k44994




                    17.7k44994








                    • 3




                      The line with wait in it basically lets all processes run, until it hits the nth process, then waits for all of the others to finish running, is that right?
                      – naught101
                      Nov 26 '15 at 23:03












                    • If i is zero, call wait. Increment i after the zero test.
                      – PSkocik
                      Nov 26 '15 at 23:08








                    • 1




                      Love the n parallel runs! Thank you.
                      – joshperry
                      Sep 15 '16 at 16:31






                    • 1




                      @naught101 Yes. wait w/ no arg waits for all children. That makes it a little wasteful. The pipe-based-semaphore approach gives you more fluent concurrency (I've been using that in a custom shell based build system along with -nt/-ot checks successfully for a while now)
                      – PSkocik
                      Mar 10 at 20:02










                    • what does "$1" mean here?
                      – kyle
                      Apr 8 at 0:01
















                    • 3




                      The line with wait in it basically lets all processes run, until it hits the nth process, then waits for all of the others to finish running, is that right?
                      – naught101
                      Nov 26 '15 at 23:03












                    • If i is zero, call wait. Increment i after the zero test.
                      – PSkocik
                      Nov 26 '15 at 23:08








                    • 1




                      Love the n parallel runs! Thank you.
                      – joshperry
                      Sep 15 '16 at 16:31






                    • 1




                      @naught101 Yes. wait w/ no arg waits for all children. That makes it a little wasteful. The pipe-based-semaphore approach gives you more fluent concurrency (I've been using that in a custom shell based build system along with -nt/-ot checks successfully for a while now)
                      – PSkocik
                      Mar 10 at 20:02










                    • what does "$1" mean here?
                      – kyle
                      Apr 8 at 0:01










                    3




                    3




                    The line with wait in it basically lets all processes run, until it hits the nth process, then waits for all of the others to finish running, is that right?
                    – naught101
                    Nov 26 '15 at 23:03






                    The line with wait in it basically lets all processes run, until it hits the nth process, then waits for all of the others to finish running, is that right?
                    – naught101
                    Nov 26 '15 at 23:03














                    If i is zero, call wait. Increment i after the zero test.
                    – PSkocik
                    Nov 26 '15 at 23:08






                    If i is zero, call wait. Increment i after the zero test.
                    – PSkocik
                    Nov 26 '15 at 23:08






                    1




                    1




                    Love the n parallel runs! Thank you.
                    – joshperry
                    Sep 15 '16 at 16:31




                    Love the n parallel runs! Thank you.
                    – joshperry
                    Sep 15 '16 at 16:31




                    1




                    1




                    @naught101 Yes. wait w/ no arg waits for all children. That makes it a little wasteful. The pipe-based-semaphore approach gives you more fluent concurrency (I've been using that in a custom shell based build system along with -nt/-ot checks successfully for a while now)
                    – PSkocik
                    Mar 10 at 20:02




                    @naught101 Yes. wait w/ no arg waits for all children. That makes it a little wasteful. The pipe-based-semaphore approach gives you more fluent concurrency (I've been using that in a custom shell based build system along with -nt/-ot checks successfully for a while now)
                    – PSkocik
                    Mar 10 at 20:02












                    what does "$1" mean here?
                    – kyle
                    Apr 8 at 0:01






                    what does "$1" mean here?
                    – kyle
                    Apr 8 at 0:01













                    56














                    for stuff in things
                    do
                    ( something
                    with
                    stuff ) &
                    done
                    wait # for all the something with stuff


                    Whether it actually works depends on your commands; I'm not familiar with them. The rm *.mat looks a bit prone to conflicts if it runs in parallel...






                    share|improve this answer



















                    • 2




                      This runs perfectly as well. You are right I would have to change rm *.mat to something like rm $run".mat" to get it to work without one process interfering with the other. Thank you.
                      – Ravnoor S Gill
                      Dec 5 '13 at 21:38












                    • @RavnoorSGill Welcome to Stack Exchange! If this answer solved your problem, please mark it as accepted by ticking the check mark next to it.
                      – Gilles
                      Dec 5 '13 at 23:54






                    • 5




                      +1 for wait, which I forgot.
                      – goldilocks
                      Dec 6 '13 at 12:13






                    • 3




                      If there are tons of 'things', won't this start tons of processes? It would be better to start only a sane number of processes simultaneously, right?
                      – David Doria
                      Mar 20 '15 at 15:17










                    • @DavidDoria sure, this is meant for small scale. (The example in the question had only three items). I use this style for unlocking a dozen LUKS containers on bootup... if I had a lot more, I'd have to use some other method, but on a small scale this is simple enough.
                      – frostschutz
                      Mar 20 '15 at 16:41
                    24
for stuff in things
do
  sem -j +0 "something;
  with;
  stuff"
done
sem --wait
This will use semaphores, parallelizing as many iterations as the number of available cores (-j +0 means you will parallelize N+0 jobs, where N is the number of available cores).

sem --wait tells it to wait until all the iterations in the for loop have terminated execution before executing the successive lines of code.

Note: you will need "parallel" from the GNU parallel project (sudo apt-get install parallel).
answered Jul 16 '15 at 13:34 by lev (edited Feb 9 '17 at 22:40 by Jonas Stein)
• 1  is it possible to go past 60? mine throws an error saying not enough file descriptors.
  – chovy
  Nov 27 '15 at 7:47

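If that error really does come from the file descriptors sem keeps open per job (an assumption, not verified here), raising the session's descriptor limit may help:

ulimit -n 4096   # default is often 1024; adjust to taste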
                    7
One really easy way that I often use:

xargs -d '\n' -n 1 -P "$NUM_PARALLEL" command < args

This will run command, passing in each line of the args file, in parallel, running at most $NUM_PARALLEL at the same time (-d '\n' keeps lines containing spaces intact; -n 1 passes one line per invocation).

You can also look into the -I option for xargs, if you need to substitute the input arguments in different places.
answered Jan 28 '17 at 7:05 by eyeApps LLC

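As a sketch of the -I form mentioned above ({} marks where each input line is substituted; files.txt and the cp job are placeholders): with -I, GNU xargs runs one invocation per input line, and -P fans those invocations out in parallel.

xargs -I {} -P 4 cp {} /backup/{}.bak < files.txt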
                            6
It seems the fsl jobs depend on each other, so the 4 jobs cannot be run in parallel. The runs, however, can be run in parallel.

Make a bash function that handles a single run, and run that function in parallel:

#!/bin/bash

myfunc() {
    run=$1
    kar='KAR5'
    mkdir -p normFunc   # -p: several runs may create it at once
    fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12
    fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear
}

export -f myfunc
parallel myfunc ::: run2 run3 run4


                            To learn more watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1 and spend an hour walking through the tutorial http://www.gnu.org/software/parallel/parallel_tutorial.html Your command line will love you for it.
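As a small usage sketch on top of that answer, the same call with two standard GNU parallel options: -j to cap the number of simultaneous runs and --joblog to record what ran and for how long.

parallel -j 4 --joblog runs.log myfunc ::: run2 run3 run4   # at most 4 runs at once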
answered Dec 6 '13 at 10:16 by Ole Tange
• If you're using a non-bash shell you'll need to also export SHELL=/bin/bash before running parallel. Otherwise you'll get an error like: Unknown command 'myfunc arg'
  – AndrewHarvey
  Jul 31 '15 at 3:39

• 1  @AndrewHarvey: isn't that what the shebang is for?
  – naught101
  Nov 26 '15 at 23:02

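To expand on the two comments above: the shebang only governs the script itself, while parallel picks the shell for its jobs from the environment (historically $SHELL), so the exported bash function needs bash to be selected there:

export SHELL=/bin/bash   # make parallel run its jobs under bash
parallel myfunc ::: run2 run3 run4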
                            2
Parallel execution with at most N processes running concurrently:

#!/bin/bash

N=4

for i in {a..z}; do
    (
        # .. do your stuff here
        echo "starting task $i.."
        sleep $(( (RANDOM % 3) + 1))
    ) &

    # allow only $N jobs to run in parallel
    if [[ $(jobs -r -p | wc -l) -ge $N ]]; then
        # wait for any one job to finish before starting the next
        wait -n
    fi
done

# wait for pending jobs
wait

echo "all done"
answered Apr 10 at 8:04 by Tomasz Hławiczka

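One portability note: wait -n (wait for any single job to finish) was added in bash 4.3, so a guard like this sketch protects users of older shells:

# refuse to run where 'wait -n' is unavailable (bash < 4.3)
if (( BASH_VERSINFO[0] < 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] < 3) )); then
    echo "this script needs bash >= 4.3 for 'wait -n'" >&2
    exit 1
fi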
                                    0
                                    I had trouble with @PSkocik's solution. My system does not have GNU Parallel available as a package and sem threw an exception when I built and ran it manually. I then tried the FIFO semaphore example as well which also threw some other errors regarding communication.



                                    @eyeApps suggested xargs but I didn't know how to make it work with my complex use case (examples would be welcome).
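For what it's worth, one hedged sketch of driving a multi-step job per input line with GNU xargs (step1 and step2 are placeholders for the real commands):

xargs -d '\n' -n 1 -P 4 bash -c 'step1 "$1" && step2 "$1"' _ < args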
Here is my solution for parallel jobs which process up to N jobs at a time as configured by _jobs_set_max_parallel:

_lib_jobs.sh:

function _jobs_get_count_e {
    jobs -r | wc -l | tr -d " "
}

function _jobs_set_max_parallel {
    g_jobs_max_jobs=$1
}

function _jobs_get_max_parallel_e {
    [[ $g_jobs_max_jobs ]] && {
        echo $g_jobs_max_jobs
        return 0
    }
    echo 1   # default when no maximum was configured
}

function _jobs_is_parallel_available_r() {
    (( $(_jobs_get_count_e) < $g_jobs_max_jobs )) && return 0
    return 1
}

function _jobs_wait_parallel() {
    # Sleep until a job slot becomes available
    while true; do
        _jobs_is_parallel_available_r && break
        sleep 0.1s
    done
}

function _jobs_wait() {
    wait
}


                                    Example usage:
#!/bin/bash

source "_lib_jobs.sh"

_jobs_set_max_parallel 3

# Run 10 jobs in parallel with varying amounts of work
for a in {1..10}; do
    _jobs_wait_parallel

    # Sleep between 1-2 seconds to simulate busy work
    sleep_delay=$(echo "scale=1; $(shuf -i 10-20 -n 1)/10" | bc -l)

    ( ### ASYNC
        echo $a
        sleep ${sleep_delay}s
    ) &
done

# Wait until every job has finished
while true; do
    n_jobs=$(_jobs_get_count_e)
    [[ $n_jobs = 0 ]] && break
    sleep 0.1s
done
answered Mar 16 '17 at 22:35 by Zhro (edited Jun 18 '17 at 4:26)


                                            0
In my case, I can't use semaphores (I'm in git-bash on Windows), so I came up with a generic way to split the task among N workers before they begin.

It works well if the tasks take roughly the same amount of time. The disadvantage is that, if one of the workers takes a long time to do its part of the job, the others that have already finished won't help.

Splitting the job among N workers (1 per core):

# array of assets, assuming at least 1 item exists
listAssets=( {a..z} ) # example: a b c d .. z
# listAssets=( ~/"path with spaces/"*.txt ) # could be file paths

# replace with your task
task() { # $1 = idWorker, $2 = asset
    echo "Worker $1: Asset '$2' START!"
    # simulating a task that randomly takes 3-6 seconds
    sleep $(( ($RANDOM % 4) + 3 ))
    echo "    Worker $1: Asset '$2' OK!"
}

nVirtualCores=$(nproc --all)
nWorkers=$(( $nVirtualCores * 1 )) # I want 1 process per core

worker() { # $1 = idWorker
    echo "Worker $1 GO!"
    idAsset=0
    for asset in "${listAssets[@]}"; do
        # split assets among workers (using modulo); each worker goes through
        # the list and picks an asset only if it belongs to that worker
        (( idAsset % nWorkers == $1 )) && task $1 "$asset"
        (( idAsset++ ))
    done
    echo "    Worker $1 ALL DONE!"
}

for (( idWorker=0; idWorker<nWorkers; idWorker++ )); do
    # start workers in parallel, one process each
    worker $idWorker &
done
wait # until all workers are done
answered May 26 at 19:14 by geekley


                                              0














                                              In my case, I can't use semaphore (I'm in git-bash on Windows), so I came up with a generic way to split the task among N workers, before they begin.



                                              It works well if the tasks take roughly the same amount of time. The disadvantage is that, if one of the workers takes a long time to do its part of the job, the others that already finished won't help.



                                              Splitting the job among N workers (1 per core)



                                              # array of assets, assuming at least 1 item exists
                                              listAssets=( {a..z} ) # example: a b c d .. z
                                              # listAssets=( ~/"path with spaces/"*.txt ) # could be file paths

                                              # replace with your task
                                              task() { # $1 = idWorker, $2 = asset
                                              echo "Worker $1: Asset '$2' START!"
                                              # simulating a task that randomly takes 3-6 seconds
                                              sleep $(( ($RANDOM % 4) + 3 ))
                                              echo " Worker $1: Asset '$2' OK!"
                                              }

                                              nVirtualCores=$(nproc --all)
                                              nWorkers=$(( $nVirtualCores * 1 )) # I want 1 process per core

                                              worker() { # $1 = idWorker
                                              echo "Worker $1 GO!"
                                              idAsset=0
                                              for asset in "${listAssets[@]}"; do
                                              # split assets among workers (using modulo); each worker will go through
                                              # the list and select the asset only if it belongs to that worker
                                              (( idAsset % nWorkers == $1 )) && task $1 "$asset"
                                              (( idAsset++ ))
                                              done
                                              echo " Worker $1 ALL DONE!"
                                              }

                                              for (( idWorker=0; idWorker<nWorkers; idWorker++ )); do
                                              # start workers in parallel, use 1 process for each
                                              worker $idWorker &
                                              done
                                              wait # until all workers are done





                                              share|improve this answer
























                                                0












                                                0








                                                0






                                                In my case, I can't use semaphore (I'm in git-bash on Windows), so I came up with a generic way to split the task among N workers, before they begin.



                                                It works well if the tasks take roughly the same amount of time. The disadvantage is that, if one of the workers takes a long time to do its part of the job, the others that already finished won't help.



                                                Splitting the job among N workers (1 per core)



                                                # array of assets, assuming at least 1 item exists
                                                listAssets=( {a..z} ) # example: a b c d .. z
                                                # listAssets=( ~/"path with spaces/"*.txt ) # could be file paths

                                                # replace with your task
                                                task() { # $1 = idWorker, $2 = asset
                                                echo "Worker $1: Asset '$2' START!"
                                                # simulating a task that randomly takes 3-6 seconds
                                                sleep $(( ($RANDOM % 4) + 3 ))
                                                echo " Worker $1: Asset '$2' OK!"
                                                }

                                                nVirtualCores=$(nproc --all)
                                                nWorkers=$(( $nVirtualCores * 1 )) # I want 1 process per core

                                                worker() { # $1 = idWorker
                                                echo "Worker $1 GO!"
                                                idAsset=0
                                                for asset in "${listAssets[@]}"; do
                                                # split assets among workers (using modulo); each worker will go through
                                                # the list and select the asset only if it belongs to that worker
                                                (( idAsset % nWorkers == $1 )) && task $1 "$asset"
                                                (( idAsset++ ))
                                                done
                                                echo " Worker $1 ALL DONE!"
                                                }

                                                for (( idWorker=0; idWorker<nWorkers; idWorker++ )); do
                                                # start workers in parallel, use 1 process for each
                                                worker $idWorker &
                                                done
                                                wait # until all workers are done





                                                share|improve this answer












                                                In my case, I can't use semaphore (I'm in git-bash on Windows), so I came up with a generic way to split the task among N workers, before they begin.



                                                It works well if the tasks take roughly the same amount of time. The disadvantage is that, if one of the workers takes a long time to do its part of the job, the others that already finished won't help.



                                                Splitting the job among N workers (1 per core)



                                                # array of assets, assuming at least 1 item exists
                                                listAssets=( {a..z} ) # example: a b c d .. z
                                                # listAssets=( ~/"path with spaces/"*.txt ) # could be file paths

                                                # replace with your task
                                                task() { # $1 = idWorker, $2 = asset
                                                echo "Worker $1: Asset '$2' START!"
                                                # simulating a task that randomly takes 3-6 seconds
                                                sleep $(( ($RANDOM % 4) + 3 ))
                                                echo " Worker $1: Asset '$2' OK!"
                                                }

                                                nVirtualCores=$(nproc --all)
                                                nWorkers=$(( $nVirtualCores * 1 )) # I want 1 process per core

                                                worker() { # $1 = idWorker
                                                echo "Worker $1 GO!"
                                                idAsset=0
                                                for asset in "${listAssets[@]}"; do
                                                # split assets among workers (using modulo); each worker will go through
                                                # the list and select the asset only if it belongs to that worker
                                                (( idAsset % nWorkers == $1 )) && task $1 "$asset"
                                                (( idAsset++ ))
                                                done
                                                echo " Worker $1 ALL DONE!"
                                                }

                                                for (( idWorker=0; idWorker<nWorkers; idWorker++ )); do
                                                # start workers in parallel, use 1 process for each
                                                worker $idWorker &
                                                done
                                                wait # until all workers are done






I really like the answer from @lev, since it gives control over the maximum number of processes in a very simple manner. However, as described in the manual, sem does not work with brackets, so the commands are passed as a single quoted string instead:



for stuff in things
do
    sem -j +0 "something;
               with;
               stuff"
done
sem --wait


This does the job.

From the GNU Parallel manual:

-j +N Add N to the number of CPU cores. Run up to this many jobs in parallel. For compute-intensive jobs, -j +0 is useful as it will run number-of-cpu-cores jobs simultaneously.

-j -N Subtract N from the number of CPU cores. Run up to this many jobs in parallel. If the evaluated number is less than 1, then 1 will be used. See also --use-cpus-instead-of-cores.
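
As a concrete usage illustration (my addition; the file set and the gzip command are hypothetical stand-ins for "something; with; stuff"), the same pattern compressing log files with at most one job per CPU core:

for f in *.log; do
    # queue one job per file; -j +0 caps concurrency at the core count
    sem -j +0 gzip -9 "$f"
done
sem --wait # block until every queued job has finished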






