Working with Matrices
The provided CMatrix class manages 2-dimensional matrices. The members of a CMatrix may be numbers, as in the classical mathematical sense of a matrix, or they may be any other type of Lua object, such as strings or userdata (objects). This topic describes how to work with matrices using the CMatrix class.
The CMatrix consists of rows of CArray objects, with each CArray managing the values at column indices. Depending upon the situation, you can access the members of the matrix by subscripts in [a][b] form, or using one of the Get... methods in the CMatrix class. For example, suppose a CMatrix M exists and has an element at row j, column i. Then the value of the element at row j, column i can be retrieved using any of these 3 methods:
value = M[j][i].
value = M:Get(j,i)
row = M:GetRow(j) ; value = row:Get(i)
In method 3, above, notice the use of a semicolon (;) to separate two statements. The first statement obtains the row (which is a CArray object) at index j. The second statement then gets value from column i of that row. This code assumes that row j exists and has an element at index i. If using method 3, you might wish to test that row is not nil before fetching the value, for example, using if/then/end, like this:
row = M:GetRow(j) ; if (row) then value = row:Get(i) else value=0 end
In this version of method 3, the code returns value=0 if the row is empty (nil). You might choose some other special value other than 0 to signify an empty row.
A filled matrix has a value in every element. An image is a case of a packed matrix; every image pixel has a value.
Suppose we have a matrix like this, consisting of 4 rows of 6 elements each. We say this has 4 rows and 6 columns, making 24 elements in total:
1000 |
1400 |
2000 |
1405 |
1240 |
1204 |
1200 |
1305 |
1800 |
1399 |
1360 |
1330 |
1500 |
1655 |
1483 |
1340 |
1299 |
1283 |
1490 |
1705 |
1242 |
1200 |
1180 |
1164 |
Every element is used so, when performing most calculations, we loop over the matrix by accessing each column in every row. The matrix above requires memory space for 4 x 6 = 24 values. Suppose we had a 1000 x 1000 matrix like this: that would require memory space for 1 million values. Looping over the elements during a calculation on the full matrix would take 1 million loops.
Consider the following 4 x 6 matrix that has some elements undefined:
|
1400 |
|
1405 |
|
1204 |
1200 |
|
1800 |
|
|
|
|
|
1483 |
1340 |
|
|
|
|
1242 |
|
|
|
This matrix uses some of the same values as above but more than 1/2 the elements are missing. To store this in memory in the traditional packed matrix format, we still need space for 24 elements but 16 of them will have dummy values such as 0. Perhaps the 0 values do not affect the calculation. But to access the elements in a traditional protocol still means looping over 24 elements in 4 rows of 6 elements each. Notice that the value "1204" at column 6, row 1 causes the matrix to require 4x6=24 elements rather than 4x4=16 elements, which itself has a large effect on memory space and the performance of calculations. Now imagine the worst case scenario of a 1000 x 1000 matrix consisting of the above sub-matrix plus one element at coordinate (1000,1000). Not only does the image require space for 1 million values with 9 of them valid, but perhaps worse still is the fact that a calculation that uses these 9 elements still requires 1 million loops and the loading or testing of 999,991 possibly useless zero's. These are cases in which a matrix is full of zeros which are meaningful, but that is not the case we are considering. Here we are considering the case of a sparse matrix in which many, if not most of the available cells do not contain a meaningful value.
If we break from the concept of a matrix as a rectangular grid and, instead, view it as a collection of independent rows, then we can let each row contain however many values it needs, and the matrix can use only as many rows at it needs. Let's revisit the above example using the small submatrix plus an element at index (1000,1000). Clever data structuring would allow us to store such a sparse matrix in memory using only 5 rows, with the longest row containing only 3 elements! In total, we would need to store the address of 5 rows and then the values for 3, 2, 2, 1, and 1 elements respectively. In addition to the memory advantage, the gain in calculation performance would be tremendous. Fortunately, creating a vector or a matrix in Lua uses its table paradigm which stores pair values. Each pair consists of an index and a value. Therefore we make a matrix in Lua as a collection of tables, one table storing pairs for the row index and row address, and the other tables storing pairs of column index and value for each row. For a sparse matrix havingm columns and n rows, then there are then n+1 tables. One of the tables contains (row index, row address) pairs to access only the rows that are defined. All the other tables contain the (column index, value) pairs which may be different for each row. The CMatrix class uses this method. All matrices are treated as sparse, regardless of whether they are completely packed or extremely sparse. In fact, each matrix row is a CArray class which is a wrapper around a Lua table of (index,value) pairs. If the matrix is sparse, this provides the storage and performance gains described above. However if the matrix is packed and large, you might consider creating the matrix as a CImage and using the class methods to operate on it at the speed of optimized C++ code.
The code examples below illustrate ways of accessing a CMatrix. If every element of the matrix is important, then we need to use the full access method. This example shows how to loop over the rows and columns of a CMatrix M. In this example, we will consider a matrix to be an x,y grid of height values and use a function Volume() to compute the total volume of the solid object.
The example below shows two ways to access every matrix element in the traditional "rectangular grid" approach. If a matrix element is not initialized (its value has not been set), the CMatrix:Get method returns its value as 0, which does not affect the calculated sum but does require CPU time. Here, we use the simplest, most brute-force methods to loop over every matrix element. We use a naive indexing method involving M:Cols() and M:Rows() to get the dimensions of the matrix. These require recalculating the number of rows and columns each time the loop index is tested. In addition, the address of the value is calculated using M:Get(j,i) for every matrix element:
|
-- M is a CMatrix |
|
|
|
-- use every row, empty or not |
|
-- use every column, empty or not |
|
-- sum the value at (j,i) |
|
|
|
|
|
|
|
|
In a large loop involving tens of thousands of indices or more, we could improve the above function simply by moving the loop limits M:Cols() and M:Rows() out of the loop. This is a minor improvement but worth the effort for large loops:
|
-- M is a CMatrix |
|
-- calculate ahead of the loop |
|
-- calculate ahead of the loop |
|
|
|
-- every row, empty or not |
|
-- every column, empty or not |
|
-- sum the value at (j,i) |
|
|
|
|
|
|
|
|
Another problem is that the for loop tests every index between1 and M:Rows() and between 1 andM:Cols(), even if no value is defined at a given index. If no value exists, thenM:Get(j,i) returns 0. This does not change the outcome of the calculation but does use processing time to lookup the index and add 0 to the sum. We could also test every index and not add it to the sum. But if the matrix has many undefined values, Lua's built-in support of sparse matrices can provide a vast improvement in loop processing. This is described in the following section.
This example shows how to use the in next and in pair Lua syntax to loop over the sparse matrix, avoiding uninitialized rows and columns. Use this protocol if many of the indices i and j do not exist (i.e., are not initialized) in the matrix. In this protocol, not initialized elements are not used and require no CPU time to ignore.
|
-- M is a CMatrix |
|
|
|
-- for index j, Row is the next CArray of columns |
|
-- index i and value Val are returned if they exist |
|
-- avoid calculating a table index |
|
|
|
|
|
|
|
|
The nice thing about sparse access is that it works for both cases: when the matrix is full and when the matrix is sparse.
The above code works for fetching values from a sparse matrix. However, it does not work for setting values in a matrix. There are two reasons why it does not work: 1) The loops index over existing entries in the matrix. 2) A value fetched from the matrix becomes a local value so that changing it does not change the value inside the matrix. Let us look at how to change the values of existing entries in the matrix. Here we use each row reference to access the values inside it. Since each row is a CArray of column pair data, the CArray:Set method is used to change the value of row members:
|
|
|
-- get the pointer to row j |
|
-- access each element in the row |
|
-- use row array to set the value |
|
|
|
|
|
|
We should make one change to the code above. Since a CMatrix can contain any type of data—such as strings or objects, in addition to traditional numbers—we should check that each matrix member is a number before doing math on it. The following small change accomplishes this:
|
|
|
-- get the pointer to row j |
|
-- access each element in the row |
|
|
|
-- use the row array to set the value |
|
|
|
|
|
|
|
|
The example below shows one way to fill a rectangular matrix with the same value. Lua interprets [] following a table name as subscripts for indexing the table. This notation makes it handy to setup a matrix. The script fragment below shows a simple—if not optimal—way to setup a matrix having 4 rows of 6 columns each.
|
|
|
-- declare the matrix as a table |
|
-- loop over rows |
|
-- declare a table for this row |
|
--loop over columns |
|
-- assign a column member to the row table |
|
|
|
|
|
-- return a table reference to the caller |
|
|
Notice that the function Init takes 3 arguments: the number of rows, the number of columns, and the initial value. The matrix M is declared aslocal inside the function. But the function makes it visible to the world in a hidden way by returning a reference to it. The caller can assign the returned matrix to any name it chooses using a syntax like this:
The other alternative for working with matrices is to use the provided CMatrix class. That class provides rich functionality for working with 2-dimensional matrices using a class metaphor. The CMatrix class handles both sparse and packed matrices in an efficient way.
To realize the matrix initialization shown above, we would write the following lines of script:
|
-- construct a CMatrix object. |
|
-- initialize the 4 x 6 matrix |