Mira scripts are very fast, especially when working with "big data" involving large arrays and images. To show what "fast" means for a scripted procedure, the table below lists benchmarks that mix high- and low-level scripting operations. These results suggest what you can expect from your numerically intensive scripts. Many benchmarks include the full script source code used.
In the benchmarks, the terms "image" and "array" may be used interchangeably, since an image is a numeric data array with metadata (also known as a "header") and an array operation works on the array data of an image. The benchmarks also often refer to a "table", as in t = {}, which is the syntax for creating an empty table named t. In the Lua language, a "table" is a collection of values that may be treated like a data structure or indexed as an array using integer indices, like t[3] or a[k].
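The two uses of a table described above can be sketched in standard Lua. This is a minimal illustration only; the field names width, height, and bits are hypothetical and are not taken from the Mira image header.

```lua
-- One Lua table type used two ways: as an array and as a data structure.

-- Array use: integer indices starting at 1.
local t = {}
for k = 1, 5 do
  t[k] = k * k            -- t[1]=1, t[2]=4, ..., t[5]=25
end

-- Data-structure use: named fields (hypothetical names, for illustration).
local header = { width = 1000, height = 1000 }
header.bits = 16          -- equivalent to header["bits"] = 16

print(#t, t[3])                        -- number of array elements, third element
print(header.width * header.height)    -- total pixel count for this header
```

The same indexing syntax, t[k], appears in the scripted-loop benchmarks below.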
Notice the speed difference between procedures that use high-level operations such as I:Set(t) and script-level procedures involving a loop (e.g., compare benchmarks 2 and 5). When working with large arrays, use a library call rather than a script loop whenever possible. This is why the Mira script language includes an extensive collection of optimized array methods.
NOTE: Some capabilities are available in the Mira MX x64 scripting language but not Mira Pro x64.
No. | Benchmark | Speed | |
1 | Loop overhead (nothing between "do" and "end"): for k=1,1000000 do end | 115 million loops / sec | |
2 | Create a 64-bit real image from an array of 1 million elements. Create a linear series of values from 1 to 1000000, then assign it to an image: t = TSeries(1000000) I:Set(t) where t is an array of 1 million values and I is a CImage class object. | 12.7 million elements / sec | |
3 | Create a 64-bit real image from an array of 250,000 elements using I:Set(t) where t is an array of 250,000 elements. | 13.8 million elements / sec | |
4 | Create a 64-bit real image from an array of 10,000 elements using I:Set(t) where t is an array of 10,000 elements. | 13.1 million elements / sec | |
5 | Set 1 million elements in an array using a scripted loop. First create an array, then run the assignment loop: t={} for k=1,1000000 do t[k]=k end This script illustrates loop processing overhead. | 15.1 million elements / sec | |
6 | Create and set 1 million elements in a local table using local t={}; for k=1,10000000 do t[k]=k end | 10.4 million elements / sec | |
7 | Perform 10 million multiplications using local values: local n=0; local m=0; for i=1,10000000 do k=n*m end | 32 million multiplications / sec | |
8 | Perform 10 million divides using local values: local n=0; local m=0; for i=1,10000000 do k=n/m end | 31 million divisions / sec | |
9 | Perform 10 million adds using local values: local m=0; local n=0; for i=1,10000000 do k=n+m end | 32 million adds / sec | |
10 | Perform 10 million additions using global values: n=0; m=0; for i=1,10000000 do k=n+1000 end | 11.1 million adds / sec | |
11 | Perform 10 million additions using global values: k=0; for i=1,10000000 do k=k+1 end | 12.4 million adds / sec | |
12 | Perform 10 million divides and save in a local array: local t={}; local m=3; for i=1,10000000 do t[i]=i/m end | 6.85 million / sec | |
13 | Least squares solution of 100 points with 4 parameters and 3 variables using a "hyperplane" basis function declared in the script | 24 fits / sec | |
14 | Least squares solution of 100 points with 4 parameters and 3 variables using internal "hyperplane" basis function | 1,800 fits / sec | |
15 | Least squares solution of 10 points with 4 parameters and 3 variables using internal "hyperplane" basis function | 18,000 fits / sec | |
16 | Least squares solution of 1000 points using a 3x2 (6 term) 2-D polynomial. | 1,250 fits / sec | |
17 | Least squares solution using CLsqFit class to fit 10 points with a 3x2 (6 parameter) 2-D polynomial. | 24,400 fits / sec | |
18 | Least squares solution using a 6 term polynomial to fit 1000 points. This example uses the built-in function TFit, although greater versatility is available by using the CLsqFit class directly. The data to be fit are in an array t. This function returns 4 results: the array of coefficients, the array of coefficient errors, the fit standard deviation, and the sample mean: t = {} c, e, s, m = TFit(t,6) | 900 fits / sec | |
19 | Create 1 million uniformly distributed random numbers. t = TRand(1000000) | 4.1 million numbers / sec | |
20 | Create 1 million Gaussian distributed random numbers. t = TGaussDev(1000000) | 862,000 numbers / sec | |
21 | Histogram of 1 million real numbers using 100 bins. Using non-default parameters requires a class method instead of THist: H = NewHist() H:SetBins(100) H:Calc() | 6 million numbers / sec | |
22 | Histogram of 1 million real numbers, pre-sorted. This uses the global THist function, which is the function that is benchmarked: t = TRand(1000000) TSort(t) THist(t) | 11 million numbers / sec | |
23 | Add two 1000x1000 64-bit real images | 236 images / sec | |
24 | Add two 1000x1000 32-bit real images | 500 images / sec | |
25 | Add two 1000x1000 16-bit integer images | 704 images / sec | |
26 | Add two 1000x1000 24-bit RGB images | 242 images / sec | |
27 | Add two 1200x960 48-bit URGB images (RGB with 16 bits per channel) | 225 images / sec | |
28 | Multiply 1000x1000 32-bit real images | 300 images / sec | |
29 | Multiply 1000x1000 32-bit real image by a number | 210 images / sec | |
30 | Divide two 1000x1000 32-bit real images | 106 images / sec | |
31 | Start with a 1000x1000 pixel 16-bit integer image. Convert it to "float" data type, add 1000.0 to each pixel, divide the two images, and then raise the resulting pixel values to a power. In this script, you may also think of the image simply as a data array: I[1]:SetDatatype("float") I[2] = I[1] + 1000 I[3] = I[1] / I[2] I[4] = I[1] ^ I[3] This script creates 2 intermediate images, preserving the original image, and creates a new final image. The last step raises I[1] to the power of image I[3], pixel by pixel (extremely CPU-intensive). The benchmark includes all 4 steps. | 10.3 million pixels / sec | |
32 | Same as above plus display all 4 final images in a new image window. This includes computation of an image histogram, transfer function, and palette mapping for each image. | 2.7 images / sec | |
33 | Load 1 Megapixel image of 16 bit pixels from hard drive, compute complete image histogram and auto-scale transfer function using gamma=0.6, then display in a new window. | 8 images / sec |
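Benchmarks 9 to 11 show additions on local values running roughly three times faster than additions on global values. The comparison can be tried with a sketch like the following, written in standard Lua with os.clock. It is not the script used for the table: the helper time_it, the loop count N, and the variable names are assumptions made for illustration, and the timings will differ from the figures listed above.

```lua
-- Compare additions on local and global variables (cf. benchmarks 9 to 11).
-- time_it and N are illustrative names, not part of the Mira library.
local N = 1000000

local function time_it(f)
  local t0 = os.clock()            -- CPU time in seconds
  f()
  return os.clock() - t0
end

local function add_locals()
  local n, m, k = 1, 2, 0
  for i = 1, N do k = n + m end
  return k
end

local function add_globals()
  gn, gm, gk = 1, 2, 0             -- deliberately global
  for i = 1, N do gk = gn + gm end
  return gk
end

local t_local  = time_it(add_locals)
local t_global = time_it(add_globals)
print(string.format("local:  %.3f s", t_local))
print(string.format("global: %.3f s", t_global))
```

Declaring loop variables with local is usually the simplest way to speed up a scripted loop that cannot be replaced by a library call.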
Testing Procedure and the Test Machine
The test machine was chosen to be representative of a modest PC available to Mira users. We did not want to present best-case results that many users would not be able to achieve. The test machine uses a 3.0 GHz Pentium Core-2 Duo E-6850 CPU with 4 GB of DDR-2 RAM. Most Mira users will have a more capable machine and can expect faster timings. To increase the significance of the benchmarks, the entire procedure, including loops, was repeated in a loop of 10 to 10000 cycles and the measured time was divided by the number of cycles.
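The repeat-and-divide procedure described above can be sketched in standard Lua as follows. This is an illustration under stated assumptions, not the harness used to produce the table: bench, fill_table, count, and the cycle count of 20 are hypothetical, and os.clock stands in for whatever timer the original tests used.

```lua
-- Repeat a procedure for a number of cycles and divide the total time.
-- bench, fill_table, and count are hypothetical names for this sketch.
local function bench(proc, cycles, ops_per_cycle)
  local t0 = os.clock()
  for c = 1, cycles do
    proc()
  end
  local elapsed = os.clock() - t0
  local per_cycle = elapsed / cycles            -- seconds per cycle
  -- Guard against a zero reading from a coarse clock.
  local rate = elapsed > 0 and (cycles * ops_per_cycle) / elapsed or nil
  return per_cycle, rate
end

-- Procedure under test: the scripted loop of benchmark 5, with a smaller count.
local count = 100000
local function fill_table()
  local t = {}
  for k = 1, count do t[k] = k end
  return t
end

local per_cycle, rate = bench(fill_table, 20, count)
print(string.format("%.6f s per cycle", per_cycle))
if rate then
  print(string.format("%.1f million elements / sec", rate / 1e6))
end
```

Repeating the whole procedure averages out timer resolution and one-time startup costs, which matters most for the shortest benchmarks.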