The purpose of this page is to document how to run the original ANTS registration program, to define a reference point for producing a similar result with the ITKv4 antsRegistration program.
Failing Experiment 1:
This work is being led by Eun Young Kim
Tests on the failure case of ANTS
Test Case Experiment 2:
The set of experiments and test cases were prepared in the following experiment parent directory:
/hjohnson/HDNI/20120907_ANTS_COMPARISONS
In the "orig_data" sub directory we have two test cases:
 a.nii.gz  A very small brain
 b.nii.gz  A very large brain
Using BCD (BRAINSConstellationDetector), the following data are prepared from the original data:
 a_acpc.nii.gz  ACPC aligned version with neck chopped off
 b_acpc.nii.gz  ACPC aligned version with neck chopped off
BCD is used by the following python script:
 DoBCD.py  A quick script to run BCD for making a_acpc.nii.gz and b_acpc.nii.gz
Alternatively, a bash script could be used to run BCD instead of this python code; the results of that approach can be found in the "BCD_aligned_data" sub directory.
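Such a bash alternative might look like the sketch below. The BRAINSConstellationDetector option names (--inputVolume, --outputResampledVolume) are assumed from the usual BRAINSTools CLI and should be verified against this build; the helper only echoes the command (a dry run), so it can be inspected without the binaries installed.

```shell
#!/bin/bash
# Hypothetical bash replacement for DoBCD.py. The option names below are an
# assumption based on the standard BRAINSConstellationDetector CLI.
BCD=${BCD:-~/src/BSAclang31/bin/BRAINSConstellationDetector}

run_bcd() {
  # Dry run: echo the command instead of executing it.
  echo "$BCD" --inputVolume "$1" --outputResampledVolume "$2"
}

cmd_a=$(run_bcd ../orig_data/a.nii.gz a_acpc.nii.gz)
cmd_b=$(run_bcd ../orig_data/b.nii.gz b_acpc.nii.gz)
printf '%s\n%s\n' "$cmd_a" "$cmd_b"
```

Dropping the `echo` in `run_bcd` would execute the detector for real once the flag names are confirmed.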
The results of legacy ANTS program are presented in the "OrigAnts_results" sub directory:
 OrigAnts.sh  A version of ANTS that was intended to match the defaults as set out in the buildtemplateparallel.sh scripts.
In the "NewAnts_results" sub directory, we provide the results of the new "antsRegistration" program with a preliminary set of command line parameters, so the outputs can be compared with those of the previous ANTS program:
 NewAnts.sh  A new ITKv4 implementation of the ANTS registration tools, called antsRegistration.
The bash scripts for the legacy "ANTS" program and the new "antsRegistration" program are as follows:
#!/bin/bash
# \author Hans J. Johnson
# 20120907
#
# This is the original ANTS registration program. We want to find an equivalent
# version of command line options for the antsRegistration program.
#
PROGPATH=~/src/BSAclang31/bin
#This is stupid, but Slicer needs the binaries in the lib tree.
PROGPATH2=~/src/BSAclang31/lib

fi=../orig_data/a_acpc.nii.gz
mi=../orig_data/b_acpc.nii.gz

if [ ! -f Iteration01_tfmInverseWarp.nii.gz ]; then
time ${PROGPATH}/ANTS 3 \
  --MI-option 32x16000 \
  --image-metric CC[${fi},${mi},1,5] \
  --number-of-affine-iterations 10000x10000x10000x10000x10000 \
  --number-of-iterations 50x35x15 \
  --output-naming Iteration01_tfm \
  --regularization Gauss[3.0,0.0] \
  --transformation-model SyN[0.25] \
  --use-Histogram-Matching 1
fi

# I'_m(x) = I_m( Warp( Affine(x) ) )
time ${PROGPATH}/antsApplyTransforms -d 3 -r ${fi} -i ${mi} \
  -t [Iteration01_tfmWarp.nii.gz,0] -t [Iteration01_tfmAffine.txt,0] -o b2a.nii.gz
# I'_f(x) = I_f( Affine^-1( Warp^-1(x) ) )
time ${PROGPATH}/antsApplyTransforms -d 3 -r ${mi} -i ${fi} \
  -t [Iteration01_tfmAffine.txt,1] -t [Iteration01_tfmInverseWarp.nii.gz,0] -o a2b.nii.gz

time ${PROGPATH2}/ImageCalculator --sub --in b2a.nii.gz ${fi} --ofsqr --ofsqrt --out diff_a.nii.gz
time ${PROGPATH2}/ImageCalculator --sub --in a2b.nii.gz ${mi} --ofsqr --ofsqrt --out diff_b.nii.gz
One criterion for comparison is the similarity between the fixed image and the warped moving image, measured with the "MeasureImageSimilarity" program.
Our goal is for the final metric value produced by the antsRegistration program to be equivalent to that of the old ANTS program.
The old ANTS program generates a cross-correlation metric value of 0.892644 for our two test datasets:
/ipldev/scratch/aghayoor/ANTS/release/bin/MeasureImageSimilarity 3 1 ../orig_data/a_acpc.nii.gz b2a.nii.gz log test.nii.gz
../orig_data/a_acpc.nii.gz : b2a.nii.gz => CC 0.892644

/ipldev/scratch/aghayoor/ANTS/release/bin/MeasureImageSimilarity 3 2 ../orig_data/a_acpc.nii.gz b2a.nii.gz log test.nii.gz
../orig_data/a_acpc.nii.gz : b2a.nii.gz => MI 0.682343
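When many runs are compared, the metric value can be scraped from the MeasureImageSimilarity log line shown above. The small awk helper below is an illustrative assumption (not part of the ANTS distribution), operating on the literal "... => CC 0.892644" output format:

```shell
# Parse the final metric value out of a MeasureImageSimilarity log line.
# The line format is copied from the run shown above.
extract_metric() {
  awk '{print $NF}'   # the value is the last whitespace-separated field
}

line='../orig_data/a_acpc.nii.gz : b2a.nii.gz => CC 0.892644'
cc=$(printf '%s\n' "$line" | extract_metric)
echo "CC=$cc"
```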
Notice that the following command line is provided to compare the initial results of the "antsRegistration" program with the old "ANTS" program; its parameters are not necessarily optimized for efficiency.
#!/bin/bash
# \author Ali Ghayoor
# 20120908
# This is the new ITKv4 implementation of the ANTS registration tools,
# called antsRegistration.
# This script shows the command line options that produce a result similar to ANTS.
#
PROGPATH=~/src/BSAclang31/bin
#This is stupid, but Slicer needs the binaries in the lib tree.
PROGPATH2=~/src/BSAclang31/lib

fi=../orig_data/a_acpc.nii.gz
mi=../orig_data/b_acpc.nii.gz

if [ ! -f Iteration01_tfm1InverseWarp.nii.gz ]; then
time ${PROGPATH}/antsRegistration -d 3 \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.25]" \
  --convergence 10000x10000x10000x10000x10000 \
  --shrink-factors 5x4x3x2x1 \
  --smoothing-sigmas 0x0x0x0x0 \
  --metric CC[${fi},${mi},1,5] \
  --transform "SyN[0.25,3.0,0.0]" \
  --convergence 50x35x15 \
  --shrink-factors 3x2x1 \
  --smoothing-sigmas 0x0x0 \
  --use-histogram-matching 1 \
  --output Iteration01_tfm
fi

# I'_m(x) = I_m( Warp( Affine(x) ) )
time ${PROGPATH}/antsApplyTransforms -d 3 -r ${fi} -i ${mi} \
  -t [Iteration01_tfm1Warp.nii.gz,0] -t [Iteration01_tfm0Affine.txt,0] -o b2a.nii.gz
# I'_f(x) = I_f( Affine^-1( Warp^-1(x) ) )
time ${PROGPATH}/antsApplyTransforms -d 3 -r ${mi} -i ${fi} \
  -t [Iteration01_tfm0Affine.txt,1] -t [Iteration01_tfm1InverseWarp.nii.gz,0] -o a2b.nii.gz

time ${PROGPATH2}/ImageCalculator --sub --in b2a.nii.gz ${fi} --ofsqr --ofsqrt --out diff_a.nii.gz
time ${PROGPATH2}/ImageCalculator --sub --in a2b.nii.gz ${mi} --ofsqr --ofsqrt --out diff_b.nii.gz
The "Comparison.sh" script is written to evaluate the difference between the results of the above two programs:
#!/bin/bash
# This script compares the results of legacy ANTS with the results of the
# new antsRegistration program.
DATAPATH1=./OrigAnts_results
DATAPATH2=./NewAnts_results
PROGPATH=/ipldev/scratch/aghayoor/BS/release/bin

if [ -f ${DATAPATH1}/b2a.nii.gz -a -f ${DATAPATH2}/b2a.nii.gz ]; then
time ${PROGPATH}/ImageCalculator --sub --in ${DATAPATH1}/b2a.nii.gz ${DATAPATH2}/b2a.nii.gz \
  --ofsqr --ofsqrt --out diff_results.nii.gz
fi
if [ -f ${DATAPATH1}/a2b.nii.gz -a -f ${DATAPATH2}/a2b.nii.gz ]; then
time ${PROGPATH}/ImageCalculator --sub --in ${DATAPATH1}/a2b.nii.gz ${DATAPATH2}/a2b.nii.gz \
  --ofsqr --ofsqrt --out diff_Inv_results.nii.gz
fi
Here we present a slice view of the results:
a.nii.gz  A very small brain
b.nii.gz  A very large brain
a_acpc.nii.gz  The fixed image of the registration process
b_acpc.nii.gz  The moving image of the registration process
Results of registration "b" to "a" by old ANTS program
Results of registration "b" to "a" by new "antsRegistration" program
The magnified difference of above registration results
As the results show, we can visually verify that the outputs of these two ANTS programs are very close to each other.
The next step of this project is to explore the parameters of the new "antsRegistration" program to find the parameter set that runs the program most efficiently.
First, we break the problem into smaller parts and inspect the effect of each change. Therefore, at the first level, only Affine registration is used.
Note: the running time of the program is a very important criterion for comparing the effectiveness of changes; therefore, before comparing the different permutations of parameters, we need a reliable way to measure run time.
Using the "user" time makes sense because, by definition: "This is only actual CPU time used in executing the process. Other processes and time the process spends blocked do not count towards this figure."
I inspected the reliability of this measure by running the same program several times under different conditions:
We constrain the number of CPU cores used by defining the environment variable NSLOTS (set to 1 or 2).
Iteration 3: the program is run alone under the new constraint (NSLOTS = 2)
Iteration 4: the program is run in parallel with other processes under the new constraint (NSLOTS = 1)
Iteration 5: the program is run in parallel with other processes under the new constraint (NSLOTS = 2)
Despite the huge difference in "real" time between running the program "alone" and in "parallel", the "user" time (shown in red) remains essentially the same when we constrain the number of CPU cores used. Since a difference of a few seconds does not matter in our comparison, we use the "user" time as the run time for each experiment.
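The distinction between the two fields can be seen with a toy bash run: a sleeping process accumulates "real" (wall-clock) time but almost no "user" (CPU) time. This is only an illustration of the time fields, not one of the registration experiments.

```shell
# Contrast "real" and "user" time for a process that mostly waits.
TIMEFORMAT='%R %U'               # bash: print real seconds, then user seconds
{ time sleep 1; } 2> timing.txt  # the `time` report goes to stderr
read -r real_s user_s < timing.txt
echo "real=${real_s}s user=${user_s}s"
```

The real time is about one second, while the user time stays near zero, which is why user time is insensitive to other load on the machine.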

Based on the definition of "user" time, I expected the user time to equal the wall-clock time of a one-core CPU run without any waits or holds. However, I noticed that the "user" time takes largely different values for the same experiment when it is run on different systems. Unfortunately, I had run my previous set of experiments on different systems to speed up the comparison procedure, so some of my previous comparison results were not valid.
To investigate this variability further, I ran our first experiment, "Series1", on different systems. In this experiment the smoothing sigma is zero at all levels. The figure below shows the results.
It can be seen that systems with the same CPU specifications generate exactly the same results. Therefore, we have three groups of results, from {neuron, athena, wundt}, {dendrite, pandora} and {hera}; fortunately, however, the program ends with the same final result on all systems.
I inspected the code to find the reason for this difference in behavior. My guess was that a random number generator inside the program is not used at the full-resolution level; for example, that the "shrinkImageFilter" chooses the voxels of each sub-image randomly from the full-resolution image. That turned out not to be the case, and further searching did not reveal the reason.
Further inspection shows that, despite the slightly different behavior, the final results are the same. Also, when we repeat this experiment with smoothing sigmas, all the systems show exactly the same behavior. The figure below shows this for the "Series2" experiment, in which we used smoothing sigmas of "4x3x2x1x0" instead of "0x0x0x0x0".
I verified this observation with two other randomly picked experiments: "Series7" and "Series13".
The "Series2", "Series7" and "Series13" experiments have totally different parameter sets. Therefore, we can see that simply using a smoothing sigma increases the consistency of the results across systems with different CPU specifications.

Then I looked at the code to find out how the smoothing parameter (sigma) is applied inside the program. "antsRegistration" uses the "itkImageRegistrationMethodv4", which in turn uses the "itkDiscreteGaussianImageFilter"; the smoothing itself is done by the "itkGaussianOperator". The ANTS program passes "sigma^2" as the variance to the Gaussian filter, and this sigma is interpreted in physical units, not voxel units. This means the effective sigma in voxel units is:
 σ = ourSigmaValue / imageSpacing   (i.e., variance = (ourSigmaValue / imageSpacing)^2)
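As a quick numeric check of this conversion, the snippet below computes the voxel-units variance handed to the discrete Gaussian filter, assuming isotropic spacing and that the filter divides the physical variance (sigma²) by the squared spacing; the numbers are illustrative, with sigma matching the Gauss[3.0,0.0] setting used above.

```shell
# Convert a physical-units smoothing sigma to the voxel-units variance used
# by the discrete Gaussian filter. Isotropic spacing is assumed for this sketch.
sigma=3.0     # --smoothing-sigmas value, physical units (mm)
spacing=1.0   # voxel spacing (mm)
variance=$(awk -v s="$sigma" -v sp="$spacing" 'BEGIN { print (s/sp)^2 }')
echo "variance=$variance"
```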
Then, the Gaussian operator generates a 1D smoothing kernel for each direction:
 coeff[0]  coeff[1]  ...  coeff[i]  ...
where the truncated coefficients sum to less than 1:
 coeff[0] + coeff[1] + ... + coeff[i] + ... < 1
and each coefficient is computed as:
 coeff[i] = exp(-σ²) * I_i(σ²)
where I_i is the modified Bessel function of order i.
Note that the maximum kernel width for each 1D operator is set to 32.
To have reliable results from this program, we should not use a smoothing sigma larger than 11.
With a sigma of 12 or larger we get the warning: "itkGaussianOperator (0x7fb...): Kernel size has exceeded the specified maximum width of 32 and has been truncated to 33 elements. You can raise the maximum width using the SetMaximumKernelWidth method."
Also, with a smoothing sigma of 12 or higher, the metric value sometimes drops below -1, which is unacceptable!

Notice: for consistency in timing, I re-ran all the experiments on "wundt". They took 2 to 5 minutes, which is much less than on "dendrite" (about 12 to 15 minutes).

Here, I inspected how the metric value changes in response to different parameters when only the Affine transform is used. In this situation we can see the effects of the parameters without long run times.
Series 1
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration01_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.25]" \
  --convergence 100x100x100x100x100 \
  --shrink-factors 5x4x3x2x1 \
  --smoothing-sigmas 0x0x0x0x0 \
  --use-estimate-learning-rate-once

Series 2
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration02_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.25]" \
  --convergence 100x100x100x100x100 \
  --shrink-factors 5x4x3x2x1 \
  --smoothing-sigmas 4x3x2x1x0 \
  --use-estimate-learning-rate-once
In this experiment we want to see the effect of the smoothing parameters on the results. For "Series 1", we used the same parameter space as for the initial results above. Notice that we used 100 instead of 10000 as the number of iterations at each level, because the algorithm stops automatically at each level once the convergence value falls below the defined threshold (default 1e-6), and here this always happens in fewer than 100 iterations. The "Series 1" experiment stops after 84 iterations with the following run time: time : user 105m45.245s
In experiment "Series 2" the only change is in the "--smoothing-sigmas". Despite the initial jump in the similarity metric (which is expected, because smoothing increases the correlation between voxels), we see the same results at the last (full-resolution) level. Although the final metric value is the same for both experiments, "Series 2" stops after 90 iterations but in less time: time : user 95m5.061s
It also makes sense to use a smoothing sigma for robustness when we run the experiment on a large data set.
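The stopping rule described above (halt a level once the metric change, measured over a sliding convergence window, falls below the 1e-6 threshold) can be sketched with awk over a made-up metric trace. The windowed-average-slope formulation here is an assumption for illustration, not the exact ITK implementation.

```shell
# Toy convergence check: stop once the average per-iteration change of the
# metric over a window of 5 values drops below 1e-6. Metric values fabricated.
result=$(printf '%s\n' -0.40 -0.42 -0.43 -0.434 -0.434 -0.434 -0.434 -0.434 |
  awk -v win=5 -v tol=1e-6 '
    { v[NR] = $1
      if (NR >= win) {
        slope = (v[NR] - v[NR-win+1]) / (win - 1)  # mean change per iteration
        if (slope < 0) slope = -slope
        if (slope < tol) { print "converged at iteration " NR; exit }
      }
    }')
echo "$result"
```

On this trace the metric flattens out at the fifth value, so the window first reports convergence at iteration 8.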
Series 3
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration03_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.25]" \
  --convergence 100x100x100 \
  --shrink-factors 5x3x1 \
  --smoothing-sigmas 7x3x1 \
  --use-estimate-learning-rate-once

Series 4
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration04_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.25]" \
  --convergence [100x100x100,1e-6,5] \
  --shrink-factors 5x3x1 \
  --smoothing-sigmas 7x3x1 \
  --use-estimate-learning-rate-once

Series 5
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration05_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.75]" \
  --convergence [100x100x100,1e-6,5] \
  --shrink-factors 5x3x1 \
  --smoothing-sigmas 7x3x1 \
  --use-estimate-learning-rate-once

Series 6
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration06_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.75]" \
  --convergence [100x100x100,1e-6,5] \
  --shrink-factors 9x3x1 \
  --smoothing-sigmas 9x3x1 \
  --use-estimate-learning-rate-once
Series 7
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration07_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.75]" \
  --convergence [100x100x100x100,1e-6,5] \
  --shrink-factors 9x5x3x1 \
  --smoothing-sigmas 9x5x3x1 \
  --use-estimate-learning-rate-once

Now we want to use just 3 iteration levels and make some changes to the shrink factors and smoothing sigmas. In "Series 3", three levels of iterations are used. The running time for "Series 3" is: time : user 95m36.588s
Compared with "Series 2", we get much better results in approximately the same time (the metric value ends at 0.48 instead of 0.42). This improvement happens because we used a smoothing sigma of 1 at the full-resolution level. I think the larger sigma values are why the running time stays approximately the same despite far fewer iterations.
Next, in experiment "Series 4", I changed only the convergence window size, to 5; previously the default value (10) was used. Compared with "Series 3", we see the same results in much less time and fewer iterations: 43 iterations rather than 57. The running time of "Series 4" is: time : user 69m19.952s
Then, in experiment "Series 5", the "gradientStep" was changed to 0.75. The comparison shows that the running time decreased while we still get the same results. The running time of "Series 5" is: time : user 62m50.063s
At this step, in "Series 6", I increased the shrink factor at the first level. In this case we get the same results in fewer iterations but more time. The reason is obvious when we look at the diagram: at the last (full-resolution) level we have more iterations (14 rather than 9). The running time of "Series 6" is: time : user 104m22.675s
In experiment "Series 7", I used 4 shrinking levels to decrease the number of iterations at the last level, which avoids the pitfall of "Series 6". I used 9x5x3x1 as the shrink factors. Fortunately, we see the same results in much less time; with this technique we have just 6 iterations at the full-resolution level (compare to 9 iterations in Series 5 and 14 iterations in Series 6). The running time for "Series 7" is: time : user 44m25.746s
That is our best result so far!
Series 8
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration08_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[1.00]" \
  --convergence [100x100x100x100,1e-6,5] \
  --shrink-factors 9x5x3x1 \
  --smoothing-sigmas 9x5x3x1 \
  --use-estimate-learning-rate-once

Series 9
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration09_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.9]" \
  --convergence [100x100x100x100,1e-6,5] \
  --shrink-factors 9x5x3x1 \
  --smoothing-sigmas 9x5x3x1 \
  --use-estimate-learning-rate-once
Here, in "Series 8", I wanted to inspect the effect of the "gradientStep" on the results of "Series 7", so I increased the gradient step to 1. Despite the decrease in the total number of iterations, the running time increased. This happens because the first levels finish in fewer iterations while the last two levels each gained one iteration. The running time of "Series 8" is: time : user 85m3.091s
Now, in "Series 9", I chose a gradient step between the two previous experiments: 0.9. The running time of "Series 9" is: time : user 44m35.203s
The comparison shows almost the same results in the same time; however, a gradient step of 0.75 is more robust than 0.9. Therefore, we still take "Series 7" as our best result so far. Its running time (44m25.746s) is an improvement of about 50% over our initial result of 105m45.245s.
Series 10
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration10_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.75]" \
  --convergence [100x100,1e-6,5] \
  --shrink-factors 5x1 \
  --smoothing-sigmas 7x1 \
  --use-estimate-learning-rate-once

Series 11
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration11_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.75]" \
  --convergence [100x100x100x100,1e-6,5] \
  --shrink-factors 12x9x5x1 \
  --smoothing-sigmas 12x9x5x1 \
  --use-estimate-learning-rate-once

Series 12
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration12_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.75]" \
  --convergence [100x100x100x100x100,1e-6,5] \
  --shrink-factors 12x9x5x3x1 \
  --smoothing-sigmas 12x9x5x3x1 \
  --use-estimate-learning-rate-once
Now I take "Series 7" as our current best result, and try other shrink-factor schedules and numbers of levels. First, in "Series 10", I chose just two levels of shrinking. The running time is: time : user 66m25.429s
Although we get the same results in fewer iterations, the running time increased because of the larger number of iterations at the full-resolution level.
Then, in "Series 11", I used the same number of levels with different shrink factors: I increased the shrink factors to see whether the running time would decrease. However, the running time increased dramatically! time : user 148m32.839s
After that, in "Series 12", I added a level with shrink factor 12 to experiment "Series 7" to see whether it would decrease the number of iterations at the full-resolution level. However, both the number of iterations and the running time increased. The running time for "Series 12" is: time : user 62m15.214s
Notice that in experiments "Series11" and "Series12", using a smoothing sigma of 12 at the first level causes the metric values to cross the -1 line, and I have no idea what that means! Even using a smoothing sigma of 11 instead of 12 for these two experiments, the running times are "106m4.095s" and "79m28.401s", which shows no improvement.
 
Series 13
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration13_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.75]" \
  --convergence [100x100x100x100,1e-6,5] \
  --shrink-factors 9x5x3x1 \
  --smoothing-sigmas 8x4x2x0 \
  --use-estimate-learning-rate-once

Finally, in "Series 13", I considered the effect of the smoothing-sigma schedule. The final metric value changed to "0.42" because we do no smoothing at the last level. This causes the algorithm to stop in fewer iterations, but the total running time increased: time : user 50m40.258s
Series 14
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration15_tfm \
  --metric MI[${fi},${mi},1,16] \
  --transform "Affine[0.75]" \
  --convergence [100x100x100x100,1e-6,5] \
  --shrink-factors 9x5x3x1 \
  --smoothing-sigmas 9x5x3x1 \
  --use-estimate-learning-rate-once

Series 15
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration15_tfm \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.75]" \
  --convergence [100x100x100x100,1e-6,5] \
  --shrink-factors 8x4x2x1 \
  --smoothing-sigmas 8x4x2x1 \
  --use-estimate-learning-rate-once

Here, in "Series 14", I wanted to check the effect of the number of histogram bins on the registration process. Compared with Series 7, I used 16 histogram bins instead of 32. It gives worse results for both the metric value and the running time. The running time is: time : user 78m28.054s
Dr. Christensen suggested that using shrink factors that are powers of 2 may improve the running time by decreasing the cost of CPU and cache operations. Therefore, in "Series 15" I used shrink factors 8x4x2x1 instead of the 9x5x3x1 of Series 7. The comparison of results is shown. The running time of "Series 15" is: time : user 52m49.575s. As can be seen, the running time and the number of steps increased. Therefore, using larger shrink factors matters more than using powers of 2.

Here we start a new phase of experiments. These experiments show the effect of doing the registration procedure in several stages.
Series 16
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration16_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 8x4x2 \
  --smoothing-sigmas 0x0x0 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 0

Series 17
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration17_tfm \
  --transform "Affine[0.75]" \
  --metric MeanSquares[${fi},${mi},1] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 8x4x2 \
  --smoothing-sigmas 0x0x0 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 0

Series 18
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration18_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 0

Series 19
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration16_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 8x4x2 \
  --smoothing-sigmas 8x4x2 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 0
To start this phase of experiments, I began with "Series 16", which uses power-of-2 shrink factors and a smoothing sigma of zero for all iterations. Using a smoothing sigma of zero on a shrunk image is like using just a sample of the image to find the Affine parameters. Also, to speed up the program, the convergence window size is set to 2 instead of 5 for the first stage (notice that 2 is the minimum window size possible; using 1 returns an error). Unexpectedly, the results show an increase in the running time, most of which is dominated by the last stage. time : user 67m15.714s
In "Series 17" I used "MeanSquares" as the metric of the first stage. The running time got even worse: time : user 94m51.865s
After these unexpected results, caused by the huge number of iterations in the last stage, I ran just the second stage of the last two experiments as a one-stage registration in "Series 18". The registration process (in green) shows that although the first stage does not initialize the second stage, it helps the registration stop in fewer iterations, because at each iteration the metric value is computed from the results of both stages. time : user 202m7.946s
Then, to see the effect of the smoothing sigma on multi-stage processing, I ran "Series 19". The only difference between "Series 19" and "Series 16" is that I used smoothing sigmas in the first stage. This causes the second stage to run in far fewer iterations. time : user 43m2.444s

Series 20
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration20_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100x100x100x100,1e-6,2] \
  --shrink-factors 16x8x4x2 \
  --smoothing-sigmas 0x0x0x0 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 0

Here, in "Series 20", I wanted to inspect the effect of extra shrinking levels. I just added one more shrink level to the first stage of Series 16, so the shrink factors are "16x8x4x2" instead of "8x4x2". It is as if the whole diagram of Series 16 shifted one level to the left. The results and running time are unaffected. time : user 67m16.300s
Series 21
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration21_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1

So far, our best result was attained with "Series 7", which runs in just one stage. Here, in "Series 21", I ran the same procedure in two stages to see the results. They show that the registration stops in fewer iterations but in approximately the same time. time : user 42m15.854s
Series 22
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration21_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.5] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1

Series 23
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration21_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.5] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.25] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1

Series 24
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration21_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1

Series 25
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration21_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.05] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.05] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1

Series 26
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration21_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Random,0.05] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Random,0.05] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1

Series 27
NSLOTS=1; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration07_tfm \
  --metric MI[${fi},${mi},1,32,Regular,0.05] \
  --transform "Affine[0.75]" \
  --convergence [100x100x100x100,1e-6,5] \
  --shrink-factors 9x5x3x1 \
  --smoothing-sigmas 9x5x3x1
In the next experiments, I wanted to inspect the effect of the "samplingPercentage" on the metric value. The default is 1, as in "Series 21". In "Series 22", I changed the sampling percentage to 0.5 for the second stage. The results are shown by the red line: it generates the same results as "Series 21" in half the time. time : user 24m51.542s
In "Series 23", the sampling percentage of the first stage was changed to 0.5, and that of the second stage to 0.25. The results are shown in green. The registration stops in more iterations but half the time! time : user 16m25.414s
In "Series 24" the sampling percentage was set to 0.1 for both stages. Again, the same final results in half the time. time : user 8m10.764s
I continued these experiments by decreasing the sampling percentage to 0.05 for both stages in "Series 25". Again, the running time is half of the previous one. time : user 5m5.872s
Then, in "Series 26", I used the "Random" sampling strategy instead of "Regular". The running time did not change, but the registration stops in more iterations and converges to a lower value, so I prefer the "Regular" sampling strategy. time : user 5m44.255s
Finally, in "Series 27", I ran the two-stage process of Series 25 as a single stage to see the results. The running time did not change; the registration just stops in fewer iterations. time : user 5m59.654s
At least in the Affine case, we can say that multi-stage registration has no running-time benefit.
Setting the sampling percentage below 0.05 causes unreliable registration results, because the metric does not have enough valid samples to work with. I suggest choosing a sampling percentage of 0.1 for more robustness.
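To see why very small sampling percentages starve the metric, it helps to count how many voxels each setting actually leaves. The 256-voxels-per-side volume below is an assumed size for illustration only.

```shell
# Approximate sample counts for an assumed 256x256x256 volume at the
# sampling percentages tried above.
counts=$(for p in 0.5 0.25 0.1 0.05; do
  awk -v p="$p" 'BEGIN { printf "%s -> %d voxels\n", p, int(256^3 * p) }'
done)
echo "$counts"
```

Even at 0.05 there are still hundreds of thousands of samples for a whole-head volume, but the usable count shrinks further once background voxels and masks are accounted for, which is consistent with 0.1 being the safer choice.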
 
Now we consider the effects of the parameters on the full registration process.
Experiment 0
NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
  --metric MI[${fi},${mi},1,32] \
  --transform "Affine[0.25]" \
  --convergence 10000x10000x10000x10000x10000 \
  --shrink-factors 5x4x3x2x1 \
  --smoothing-sigmas 0x0x0x0x0 \
  --metric CC[${fi},${mi},1,5] \
  --transform "SyN[0.25,3.0,0.0]" \
  --convergence 50x35x15 \
  --shrink-factors 3x2x1 \
  --smoothing-sigmas 0x0x0 \
  --use-histogram-matching 1 \
  --output Iteration01_tfm

For "Ex0", we used the same parameter space as for the initial results: there is no smoothing sigma, and the shrink factors are low. We want to see how much improvement in running time we can gain by choosing the parameters properly. The final CC metric value is 0.474, and the running time is: time : user 1087m51.344s

Experiment 1
NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration01_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1 \
  --transform "SyN[0.9,3.0,0.0]" \
  --metric CC[${fi},${mi},1,5,Regular,0.1] \
  --convergence [75x45x15x10,1e-6,5] \
  --shrink-factors 9x5x3x1 \
  --smoothing-sigmas 9x5x3x1 \
  --use-histogram-matching 1

Experiment 2
NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration02_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1 \
  --transform "SyN[0.9,3.0,0.0]" \
  --metric CC[${fi},${mi},1,5,Regular,0.1] \
  --convergence [75x45x15,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --use-histogram-matching 1 \
  --transform "SyN[0.9,3.0,0.0]" \
  --metric CC[${fi},${mi},1,5,Regular,0.1] \
  --convergence [10,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1 \
  --use-histogram-matching 1

Experiment 3
NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration03_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1 \
  --transform "SyN[0.9,3.0,0.0]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [75x45x15,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --use-histogram-matching 1 \
  --transform "SyN[0.9,3.0,0.0]" \
  --metric CC[${fi},${mi},1,5,Regular,0.1] \
  --convergence [10,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1 \
  --use-histogram-matching 1

Experiment 4
NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
  --output Iteration03_tfm \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [100x100x100,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --transform "Affine[0.75]" \
  --metric MI[${fi},${mi},1,32,Regular,0.1] \
  --convergence [100,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1 \
  --transform "SyN[0.9,3.0,0.0]" \
  --metric MeanSquares[${fi},${mi},1,Regular,0.1] \
  --convergence [75x45x15,1e-6,2] \
  --shrink-factors 9x5x3 \
  --smoothing-sigmas 9x5x3 \
  --use-histogram-matching 1 \
  --transform "SyN[0.9,3.0,0.0]" \
  --metric CC[${fi},${mi},1,5,Regular,0.1] \
  --convergence [10,1e-6,5] \
  --shrink-factors 1 \
  --smoothing-sigmas 1 \
  --use-histogram-matching 1
 Here I start our first set of experiments to evaluate the whole registration process, including the SyN registration. In "Experiment 1", I chose my Affine parameters as in "Series24". The sampling percentage is set to 0.1 for robustness; however, only a one-stage SyN registration is used, with CC as the metric. Results are shown in blue. The algorithm converges to 0.532, and the running time is: time : user 741m46.763s  In "Experiment 2", I ran the SyN registration in two stages, with CC as the metric in both. Here the algorithm converges to 0.563. The results also show that simply running SyN in two stages yields a large improvement in running time. time : user 519m5.170s  In "Experiment 3", I repeated the previous experiment but used MI as the metric of the first SyN stage. Results are shown in green. The converged metric value is 0.527. The registration converges in fewer iterations; however, contrary to my expectation, the running time did not change. time : user 527m45.124s NOTE that we should allow a 5 to 10 percent tolerance in running time; therefore, experiments 2 and 3 are not distinguishable with respect to running time.  Finally, "Experiment 4" is the same as Ex3 except that "MeanSquares" is used as the metric of the first SyN stage. The metric converges to 0.529, and again the running time is unchanged. time : user 518m0.362s  
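The shrink factors in these command lines largely determine the per-level cost of each pyramid stage. As a rough illustration only, assuming a hypothetical isotropic 256^3 volume (the real antsRegistration cost also depends on the metric, sampling, and transform), the per-level voxel counts can be sketched as:

```python
# Rough per-level cost model for a multi-resolution pyramid.
# Illustrative only: assumes a hypothetical 256^3 volume and isotropic
# shrink factors; real cost also depends on metric and transform.

def voxels_per_level(volume_dim, shrink_factors):
    """Approximate number of voxels visited at each pyramid level."""
    return [(volume_dim // s) ** 3 for s in shrink_factors]

levels = voxels_per_level(256, [9, 5, 3, 1])
# The full-resolution level dominates: it visits hundreds of times more
# voxels than the coarsest level, which is why adding coarse levels is
# nearly free compared to a full-resolution iteration.
ratios = [levels[-1] // v for v in levels]
print(levels)
print(ratios)
```

This is why most of the experiments above spend their iterations at shrink factors 9, 5, and 3 and keep the full-resolution (shrink factor 1) stage short.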
Experiment 5
 NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
    --output Iteration05_tfm \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100x100x100,1e-6,2] \
    --shrink-factors 9x5x3 \
    --smoothing-sigmas 9x5x3 \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100,1e-6,5] \
    --shrink-factors 1 \
    --smoothing-sigmas 1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [75x45x15,1e-6,2] \
    --shrink-factors 9x5x3 \
    --smoothing-sigmas 9x5x3 \
    --use-histogram-matching 1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [10,1e-6,5] \
    --shrink-factors 1 \
    --smoothing-sigmas 1 \
    --use-histogram-matching 1
Experiment 6
 NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
    --output Iteration06_tfm \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100x100x100,1e-6,2] \
    --shrink-factors 9x5x3 \
    --smoothing-sigmas 9x5x3 \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100,1e-6,5] \
    --shrink-factors 1 \
    --smoothing-sigmas 1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [75x45x15x10,1e-6,5] \
    --shrink-factors 9x5x3x1 \
    --smoothing-sigmas 9x5x3x1 \
    --use-histogram-matching 1
Experiment 7
 NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
    --output Iteration07_tfm \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100x100x100x100,1e-6,5] \
    --shrink-factors 9x5x3x1 \
    --smoothing-sigmas 9x5x3x1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [75x45x15x10,1e-6,5] \
    --shrink-factors 9x5x3x1 \
    --smoothing-sigmas 9x5x3x1 \
    --use-histogram-matching 1
Experiment 8
 NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
    --output Iteration08_tfm \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100x100x100x100,1e-6,5] \
    --shrink-factors 9x5x3x1 \
    --smoothing-sigmas 9x5x3x1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [75x45x15,1e-6,2] \
    --shrink-factors 9x5x3 \
    --smoothing-sigmas 9x5x3 \
    --use-histogram-matching 1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [10,1e-6,5] \
    --shrink-factors 1 \
    --smoothing-sigmas 1 \
    --use-histogram-matching 1
 Now, in "Ex5", I decided to use "MI" as the metric for both stages of the SyN registration. The metric converges to 0.52, and we get a large improvement in running time. time : user 74m13.975s This improvement matters greatly when compared with our initial running time of 1087m.  We want the parameter set to be as simple as possible; therefore, in "Ex6", I ran SyN in just one stage. The running time is: time : user 75m55.143s Fortunately, we get the same running time with a simpler parameter set; however, the metric now converges to only 0.44.  To simplify the command line further, in "Ex7" I merged the two stages of the "Affine" registration into one. Again the running time did not change (note that we allow a 5 to 10 percent tolerance on running time). time : user 79m4.465s The metric converges to 0.44.  It seems we must compromise between the simplicity of the parameter set and the robustness of the results (my criterion for robustness is the final metric value when the algorithm stops). Therefore, in "Ex8", the "Affine" registration is done in one stage, and the "SyN" registration in two stages. The metric converges to 0.52, and the running time is: time : user 72m28.498s
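The convergence triples used above, e.g. [75x45x15,1e-6,2], pair per-level iteration counts with a threshold and a window size. The exact rule inside antsRegistration is more involved (it monitors the slope of the energy profile over the window); the simplified, hypothetical check below just averages the per-iteration improvement, which is enough to show why a larger window tolerates noisier metric values:

```python
# Simplified sketch of a windowed stopping rule behind a convergence
# triple like [75x45x15, 1e-6, 2].  Hypothetical: the real program fits
# the energy profile over the window rather than averaging differences.

def has_converged(metric_history, threshold, window):
    """True when the mean per-iteration improvement over the last
    `window` steps drops below `threshold` (smaller metric = better)."""
    if len(metric_history) < window + 1:
        return False
    recent = metric_history[-(window + 1):]
    mean_improvement = (recent[0] - recent[-1]) / window
    return mean_improvement < threshold

# Early on the metric still drops quickly; later it barely moves.
history = [0.90, 0.70, 0.60, 0.55, 0.55, 0.55]
print(has_converged(history[:4], 1e-6, 2))  # still improving
print(has_converged(history, 1e-6, 2))      # flat: stop this level
```

With window size 2 the check reacts quickly; window size 5 (as in the final stages above) demands a flat profile over more iterations before stopping.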
====================== Presenting the comparison results of all four experiments in a single picture makes it too busy and reduces readability; therefore, the comparison of these four experiments is presented in two pictures. The first picture compares experiments 5 and 8, in which the SyN registration is done in two stages. The second compares experiments 6 and 7, which include just one SyN stage. Comparing the two figures shows that the metric behaves more smoothly when the SyN registration is done in two stages, and the final convergence value is better (0.52 instead of 0.44). In conclusion, I prefer the parameter set of "experiment 8" because it is simpler than "experiment 5" with the same behavior and running time. 
Experiment 9
 NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
    --output Iteration10_tfm \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100x100x100x100,1e-6,5] \
    --shrink-factors 9x5x3x1 \
    --smoothing-sigmas 9x5x3x1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [75x45x15,1e-6,2] \
    --shrink-factors 9x5x3 \
    --smoothing-sigmas 9x5x3 \
    --use-histogram-matching 1 \
    --transform "SyN[0.7,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [10,1e-6,5] \
    --shrink-factors 1 \
    --smoothing-sigmas 1 \
    --use-histogram-matching 1
Experiment 10
 NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
    --output Iteration10_tfm \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100x100x100x100,1e-6,5] \
    --shrink-factors 9x5x3x1 \
    --smoothing-sigmas 9x5x3x1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [75x45x15,1e-6,2] \
    --shrink-factors 9x5x3 \
    --smoothing-sigmas 9x5x3 \
    --use-histogram-matching 1 \
    --transform "SyN[1.2,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [10,1e-6,5] \
    --shrink-factors 1 \
    --smoothing-sigmas 1 \
    --use-histogram-matching 1
Experiment 11
 NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
    --output Iteration10_tfm \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100x100x100x100,1e-6,5] \
    --shrink-factors 9x5x3x1 \
    --smoothing-sigmas 9x5x3x1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [75x45x15,1e-6,2] \
    --shrink-factors 9x5x3 \
    --smoothing-sigmas 9x5x3 \
    --use-histogram-matching 1 \
    --transform "SyN[1.5,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [10,1e-6,5] \
    --shrink-factors 1 \
    --smoothing-sigmas 1 \
    --use-histogram-matching 1
Experiment 12
 NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
    --output Iteration10_tfm \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100x100x100x100,1e-6,5] \
    --shrink-factors 9x5x3x1 \
    --smoothing-sigmas 9x5x3x1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [75x45x15,1e-6,2] \
    --shrink-factors 9x5x3 \
    --smoothing-sigmas 9x5x3 \
    --use-histogram-matching 1 \
    --transform "SyN[2,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [10,1e-6,5] \
    --shrink-factors 1 \
    --smoothing-sigmas 1 \
    --use-histogram-matching 1
 Previously, based on some experiments with a single-stage SyN registration, we had shown that 0.9 is the best choice for the gradient step value. These results are presented in Appendix A. Now, in the final set of experiments, we compare the effect of the gradient step on the last stage of the "SyN" registration. In "Ex9", I used 0.7 instead of 0.9. The convergence value and the running time are unchanged. time : user 71m58.214s  In "Ex10", the gradient step is changed to 1.2. We get a slight improvement in running time with no change in the convergence value. time : user 66m1.720s  In "Ex11", the gradient step is increased to 1.5. The running time does not change, and the convergence value decreases slightly (by about 0.05). time : user 66m34.805s  Finally, in "Ex12", the gradient step is increased to 2. The convergence value decreases to 0.49, and the running time increases. time : user 69m58.046s =========== The figure shows the comparison results. Based on our experiments, the best choice for the gradient step value is 1.2. 
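The pattern observed here (a moderately larger step helps, but too large a step hurts) is typical of gradient-driven optimization in general. As a toy 1-D analogue only (SyN's update is far richer than this; all names below are hypothetical), consider plain gradient descent on f(x) = x^2 / 2:

```python
# Toy 1-D analogue of the gradient-step experiments: on f(x) = x^2 / 2,
# gradient descent x -= step * x contracts by |1 - step| per iteration,
# so a moderate step converges fastest and an overly large step is slow.
# Illustration only; not a model of SyN itself.

def iterations_to_converge(step, x0=1.0, tol=1e-6, max_iter=1000):
    """Count iterations until |x| < tol under x -= step * grad(x)."""
    x = x0
    for i in range(max_iter):
        if abs(x) < tol:
            return i
        x -= step * x          # gradient of x^2 / 2 is x
    return max_iter

for step in (0.7, 0.9, 1.2, 1.5, 1.9):
    print(step, iterations_to_converge(step))
```

The analogy is loose, but it matches the trend in Ex9 through Ex12: stepping up from 0.9 toward 1.2 costs nothing, while pushing on to 2 degrades both convergence value and running time.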
==============================================================================================================================================
============================================================= APPENDIX A =====================================================================
==============================================================================================================================================
The following experiments were done independently; they show that for a one-stage "SyN" registration, 0.9 is the best choice for the gradient step.
Series 1
 time ${PROGPATH}/antsRegistration -d 3 \
    --metric MI[${fi},${mi},1,32] \
    --transform "Affine[0.75]" \
    --convergence 100x100x100 \
    --shrink-factors 3x2x1 \
    --smoothing-sigmas 2x1x0 \
    --metric CC[${fi},${mi},1,5] \
    --transform "SyN[0.25,3.0,0.0]" \
    --convergence 65x45x10 \
    --shrink-factors 3x2x1 \
    --smoothing-sigmas 2x1x0 \
    --use-histogram-matching 1 \
    --output Iteration03_tfm
Series 2
 time ${PROGPATH}/antsRegistration -d 3 \
    --metric MI[${fi},${mi},1,32] \
    --transform "Affine[0.75]" \
    --convergence 100x100x100 \
    --shrink-factors 3x2x1 \
    --smoothing-sigmas 2x1x0 \
    --metric CC[${fi},${mi},1,5] \
    --transform "SyN[0.65,3.0,0.0]" \
    --convergence 65x45x10 \
    --shrink-factors 3x2x1 \
    --smoothing-sigmas 2x1x0 \
    --use-histogram-matching 1 \
    --output Iteration04_tfm
Series 3
 time ${PROGPATH}/antsRegistration -d 3 \
    --metric MI[${fi},${mi},1,32] \
    --transform "Affine[0.75]" \
    --convergence 100x100x100 \
    --shrink-factors 3x2x1 \
    --smoothing-sigmas 2x1x0 \
    --metric CC[${fi},${mi},1,5] \
    --transform "SyN[0.9,3.0,0.0]" \
    --convergence 65x45x10 \
    --shrink-factors 3x2x1 \
    --smoothing-sigmas 2x1x0 \
    --use-histogram-matching 1 \
    --output Iteration05_tfm
Series 4
 time ${PROGPATH}/antsRegistration -d 3 \
    --metric MI[${fi},${mi},1,32] \
    --transform "Affine[0.75]" \
    --convergence 100x100x100 \
    --shrink-factors 3x2x1 \
    --smoothing-sigmas 2x1x0 \
    --metric CC[${fi},${mi},1,5] \
    --transform "SyN[1.2,3.0,0.0]" \
    --convergence 65x45x10 \
    --shrink-factors 3x2x1 \
    --smoothing-sigmas 2x1x0 \
    --use-histogram-matching 1 \
    --output Iteration06_tfm
For "SyN", a gradient step value of 0.9 gives the best results. time : user 635m35.323s 
==============================================================================================================================================
==============================================================================================================================================
==============================================================================================================================================
Conclusion
At this point I have run several experiments. We can finish this set of experiments when we meet two criteria:
1 The new "antsRegistration" program generates results better than or equivalent to the legacy "ANTS".
We can use the "MeasureImageSimilarity" program to judge the final results.
2 The new "antsRegistration" program runs in equivalent or less time than the legacy "ANTS" program.
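Both criteria lean on the CC similarity scores reported below. As a point of reference only, a global normalized cross-correlation can be sketched in a few lines of pure Python. Note that the CC metric used in these experiments is a neighborhood correlation (radius 5), not this global version; this sketch only illustrates why a value near 1 means the warped moving image closely matches the fixed one.

```python
# Minimal pure-Python sketch of *global* normalized cross-correlation.
# The ANTs CC metric is a neighborhood correlation; this simplified
# version only conveys what a score near 1 means.
import math

def normalized_cross_correlation(a, b):
    """Pearson correlation of two equally sized intensity lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den

fixed  = [10.0, 20.0, 30.0, 40.0]     # hypothetical intensities
warped = [11.0, 19.0, 31.0, 41.0]     # nearly identical up to noise
print(normalized_cross_correlation(fixed, warped))
```

A perfectly registered pair of identical images scores exactly 1.0; the values around 0.88-0.89 quoted below indicate a close but imperfect intensity match.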
As we showed, the CC similarity measure for the old ANTS program is: CC 0.892644;
The old ANTS running time is: real 80m54.880s
A slice view of the difference between the fixed image and the warped moving image is presented below:
Based on my inspection of the parameters, "Experiment 10" generates the fastest results. Its running time on the "wundt" system is: real 6m29.342s
The CC similarity measure for this experiment is: CC 0.882726
Also, here a slice of results is presented:
As you can see, "Experiment 10" is very fast (just 6 minutes on the same system), but based on analysis of the code, it does not seem to be generally robust.
Therefore, we tentatively settled on the following command line:
 ${PROGPATH}/antsRegistration -d 3 \
    --output [Iteration000_tfm,outputWarpedImage.nii.gz,outputInverseWarpedImage.nii.gz] \
    --write-composite-transform 1 \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,1] \
    --masks [${fim},${mim}] \
    --convergence [1000x1000x100,1e-6,10] \
    --shrink-factors 6x4x2 \
    --smoothing-sigmas 6x4x2 \
    --use-histogram-matching 1 \
    --transform "SyN[0.25,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,1] \
    --masks [${fim},${mim}] \
    --convergence [150x100x75,1e-6,10] \
    --shrink-factors 3x2x1 \
    --smoothing-sigmas 3x2x0 \
    --use-histogram-matching 1
The CC similarity metric is: CC 0.885629
The running time is: real 54m3.104s. A slice of its results is shown below.
Next, we look at the results of my initial guess of parameters; "Experiment 0" shows my command line.
The running time on "wundt" is: real 54m2.327s
Also, the CC similarity metric is: CC 0.890147
These are the closest results to the old ANTS program. A slice of the results:
My initial guess of parameters uses a smoothing sigma of zero at every level. Here, I modify that initial guess only by changing the smoothing sigmas.
 NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
    --metric MI[${fi},${mi},1,32] \
    --transform "Affine[0.25]" \
    --convergence 10000x10000x10000x10000x10000 \
    --shrink-factors 5x4x3x2x1 \
    --smoothing-sigmas 4x3x2x1x0 \
    --metric CC[${fi},${mi},1,5] \
    --transform "SyN[0.25,3.0,0.0]" \
    --convergence 50x35x15 \
    --shrink-factors 3x2x1 \
    --smoothing-sigmas 2x1x0 \
    --use-histogram-matching 1 \
    --output Iteration001_tfm
Running time is 54m44.540s. Also, the CC similarity metric is: CC 0.886948.
A slice of results:
All four experiments presented above generate results close to the legacy ANTS program. Three of them have about the same running time (roughly 50 minutes), and the fastest runs in about 6 minutes.
Based on three criteria (running time, final similarity metric, and robustness considerations), I recommend "Experiment 14" as my suggested parameter set for the new ants program.
Now we check whether we have met the goals and expectations of this project:
1 CC similarity metric for "Experiment 14": 0.886948
CC similarity metric for old ANTS: 0.892644
=> They generate equivalent results, since the metric values differ by only about 1%.
2 Running time of "Experiment 14": 54 minutes
Running time of old ANTS: 80 minutes
=> The new antsRegistration program runs in much less time.
Therefore, based on our original goals, this project is complete.
There are two more considerations about this project:
1 First, I am curious to inspect the metric sampling percentage parameter further. We rejected our fast parameter set (with about 6 minutes of running time) because of robustness issues, but I want to check whether we can find a better compromise between robustness and running time.
To expand on this, let us inspect "Experiment 10" in more detail. This experiment is much faster than the others (about 9 times faster).
 NSLOTS=2; time ${PROGPATH}/antsRegistration -d 3 \
    --output Iteration10_tfm \
    --transform "Affine[0.75]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [100x100x100x100,1e-6,5] \
    --shrink-factors 9x5x3x1 \
    --smoothing-sigmas 9x5x3x1 \
    --transform "SyN[0.9,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [75x45x15,1e-6,2] \
    --shrink-factors 9x5x3 \
    --smoothing-sigmas 9x5x3 \
    --use-histogram-matching 1 \
    --transform "SyN[1.2,3.0,0.0]" \
    --metric MI[${fi},${mi},1,32,Regular,0.1] \
    --convergence [10,1e-6,5] \
    --shrink-factors 1 \
    --smoothing-sigmas 1 \
    --use-histogram-matching 1
In this experiment, the registration is done in 3 stages: one affine stage and two SyN stages.
Reducing the metric sampling percentage to 0.1 is the main reason for the reduced running time, so let us inspect the effect of this parameter in more detail.
Assume we use a shrink factor of 1, so that the virtual domain is the same as the input moving image. A sampling percentage of 0.125 then means that we use 1/8 of the samples of the virtual domain, which is equivalent in sample count to using a shrink factor of 2 (since 2^3 = 8). The only difference is how the downsampling is performed.
A shrink factor of 2 downsamples the image uniformly in all directions (x, y, z); in contrast, a metric sampling percentage of 0.125 with the Regular strategy arranges all the samples in a vector and then chooses every 8th sample, which is not uniform across directions.
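To make the difference concrete, here is a small pure-Python sketch on a hypothetical 8x8x8 volume, contrasting Regular sampling at 0.125 with a shrink factor of 2 (both function names are invented for this illustration):

```python
# Contrast two 1/8-sampling strategies on a hypothetical 8x8x8 volume:
# "Regular" sampling at 0.125 takes every 8th voxel of the flattened
# index list, while a shrink factor of 2 keeps one voxel per 2x2x2
# block, uniformly in x, y, and z.

def regular_sample_indices(shape, fraction):
    """Every k-th flattened index, k = 1/fraction (Regular sampling)."""
    nx, ny, nz = shape
    step = round(1.0 / fraction)
    return [(i // (ny * nz), (i // nz) % ny, i % nz)
            for i in range(0, nx * ny * nz, step)]

def shrink_sample_indices(shape, factor):
    """One voxel per factor^3 block (uniform in every direction)."""
    nx, ny, nz = shape
    return [(x, y, z)
            for x in range(0, nx, factor)
            for y in range(0, ny, factor)
            for z in range(0, nz, factor)]

shape = (8, 8, 8)
reg = regular_sample_indices(shape, 0.125)
shr = shrink_sample_indices(shape, 2)
print(len(reg), len(shr))               # both keep 1/8 of the voxels
print(sorted({z for _, _, z in reg}))   # Regular only ever lands on z = 0 here
print(sorted({z for _, _, z in shr}))   # shrink grid covers all axes evenly
```

In this (admittedly worst-case) example, taking every 8th flattened index never leaves the z = 0 plane, while the shrink-factor grid samples every direction evenly, which illustrates the non-uniformity argued above.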
Although the two options above (metric sampling percentage or shrink factor) should generate similar results, I prefer the shrink factor. To clarify, consider these two parameter sets:
 Type    | Sampling percentage | Shrink factor | Effective samples of image in registration
 Regular | 0.125               | 1             | 1/8
 Regular | 1                   | 2             | 1/8
Note that with the second option we can still let every voxel of the image contribute by using a proper smoothing sigma, which increases the robustness of our method. Also, the shrink factor gives a uniformly distributed sampling over the image, which helps toward a better registration.
Finally, I should mention that when we use a sampling percentage of 1, it is better to use the dense (None) sampling strategy instead of Regular. This reduces computation time because we do not have to arrange the voxels in a vector and select a subset of them.
2 Some of our suggested parameter sets generate improper results on other data sets;
see: Tests on the failure case of ANTS.
As indicated in that link, while the registration is done properly by the old ANTS program and by BRAINSFit, the command lines recommended in experiments 10 and 13 cannot generate proper results.
We should be sure that our final command line is consistent across all data sets. Therefore, we use the cross-correlation (CC) similarity metric to compare the results of the recommended command line of the new "antsRegistration" program with the results of the old "ANTS" program and "BRAINSFit".
First, note that the CC metric between the fixed image and the initial moving image is:
Fixed (subject) image: "/paulsen/Experiments/20120801.SubjectOrganized_Results/PHD_024/0131/34285/TissueClassify/BABC/t1_average_BRAINSABC.nii.gz"
Initial template (moving) image: "/ipldev/scratch/eunyokim/src/BRAINS_TESTING_DEBUG/build/ReferenceAtlasbuild/Atlas/Atlas_20120830/template_t1.nii.gz"
=> CC 0.941673
 Experiment 0 (our initial guess of parameters for antsRegistration program)
 Fixed (subject) image: "/paulsen/Experiments/20120801.SubjectOrganized_Results/PHD_024/0131/34285/TissueClassify/BABC/t1_average_BRAINSABC.nii.gz"
 Warped moving image: "/hjohnson/HDNI/SynRegistrationTest/AliNewTest/b2a_test0.nii.gz"
=> CC similarity metric: 0.98105
 Experiment 14 (final recommended parameter set for antsRegistration program)
 Fixed (subject) image: "/paulsen/Experiments/20120801.SubjectOrganized_Results/PHD_024/0131/34285/TissueClassify/BABC/t1_average_BRAINSABC.nii.gz"
 Warped moving image: "/hjohnson/HDNI/SynRegistrationTest/AliNewTest/b2a_test1.nii.gz"
=> CC similarity metric: 0.982432
In both cases we obtain higher-quality results than the old ANTS and BRAINSFit:
 Old ANTS program:
 Fixed (subject) image: "/paulsen/Experiments/20120801.SubjectOrganized_Results/PHD_024/0131/34285/TissueClassify/BABC/t1_average_BRAINSABC.nii.gz"
 Warped moving image: "/hjohnson/HDNI/SynRegistrationTest/OldANTS_Trial/b2a.nii.gz"
=> CC similarity metric for old ANTS: CC 0.979647
 BRAINSFit:
 Fixed (subject) image: "/paulsen/Experiments/20120801.SubjectOrganized_Results/PHD_024/0131/34285/TissueClassify/BABC/t1_average_BRAINSABC.nii.gz"
 Warped moving image: /hjohnson/HDNI/SynRegistrationTest/BRAINSABC_Trial/atlas_to_subject_warpped.nii.gz
=> CC similarity metric for BRAINSFit: 0.961353

As a final conclusion, our recommended command line in "Experiment 14" generates results equivalent to the old ANTS program in less time.
To demonstrate the equivalence of the results, we used the CC similarity metric:
Data Set 1
Fixed image: /hjohnson/HDNI/20120907_ANTS_COMPARISONS/orig_data/a_acpc.nii.gz
Moving image: /hjohnson/HDNI/20120907_ANTS_COMPARISONS/orig_data/b_acpc.nii.gz
CC similarity metric measure for legacy "ANTS" program: 0.892644
CC similarity metric measure for new "antsRegistration" program: 0.886948
The results are presented in: "/hjohnson/HDNI/20120907_ANTS_COMPARISONS".
Data Set 2
Fixed image: /paulsen/Experiments/20120801.SubjectOrganized_Results/PHD_024/0131/34285/TissueClassify/BABC/t1_average_BRAINSABC.nii.gz
Moving image: /ipldev/scratch/eunyokim/src/BRAINS_TESTING_DEBUG/build/ReferenceAtlasbuild/Atlas/Atlas_20120830/template_t1.nii.gz
CC similarity metric measure for legacy "ANTS" program: 0.979647
CC similarity metric measure for new "antsRegistration" program: 0.982432
The results are presented in: "/hjohnson/HDNI/SynRegistrationTest".
Also, to compare the running times, we ran both programs on the same system (wundt) under the same circumstances. The real running time of the "antsRegistration" program is about 54 minutes, which is about 35 percent better than the running time of the old ANTS program (~80 minutes).