In the toy car example, we may further wish to divide cars not only by colour, but also by size. For instance, we could still have red, green, blue and yellow toy cars, but perhaps we wish to further divide them into "large" and "small" ones (whatever these descriptors mean). Then we would have not four, but eight categories, as we would have large red cars, small red ones, large green ones, small green ones and so on. We could easily say that we will simply use vectors with eight entries, but in a sense, this would eliminate some of the structure we have; this is where matrices come in.
A matrix is simply a rectangular array of numbers. (Note that a vector is also such an array, with one of its dimensions being only one entry long.) Using the toy car example again, we could have:
| | Red | Green | Blue | Yellow |
|---|---|---|---|---|
| Large | 1 | 2 | 0 | 1 |
| Small | 2 | 2 | 1 | 0 |

Dropping the labels leaves the matrix itself:

$$
\begin{pmatrix} 1 & 2 & 0 & 1 \\ 2 & 2 & 1 & 0 \end{pmatrix}
$$
Conceptually, matrices are an extension of vectors, so we may be interested in how the properties of vectors translate to matrices. Suppose we have two matrices with the same dimensions (i.e. the same number of rows and the same number of columns). Then we can add them entry by entry, just as we added two vectors of the same size. And since we have (presumably) established in advance what each entry represents, such an addition makes sense. In the toy car example, buying a new car of each colour and size corresponds to adding a matrix with two rows and four columns whose entries are all 1s. Similarly, scalar multiplication extends naturally to matrices: multiplying a matrix A by some scalar k is the same as multiplying each entry of A by k.
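The addition and scalar multiplication described above can be sketched in plain Python, storing the matrix as a list of rows. The names `mat_add` and `scalar_mul` are illustrative choices made for this sketch, not functions from any library.

```python
# Toy car matrix: rows are (large, small), columns are (red, green, blue, yellow).
cars = [
    [1, 2, 0, 1],
    [2, 2, 1, 0],
]

def mat_add(a, b):
    """Add two matrices of the same dimensions, entry by entry."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same dimensions")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scalar_mul(k, a):
    """Multiply every entry of the matrix a by the scalar k."""
    return [[k * x for x in row] for row in a]

# Buying one new car of each colour and size: add a 2-by-4 matrix of 1s.
ones = [[1] * 4 for _ in range(2)]
after_purchase = mat_add(cars, ones)  # [[2, 3, 1, 2], [3, 3, 2, 1]]

# Doubling the whole collection: multiply every entry by 2.
doubled = scalar_mul(2, cars)         # [[2, 4, 0, 2], [4, 4, 2, 0]]
```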
Then, we have the idea of norms. As it happens, matrices also have norms, which work in much the same way as those for vectors. We could, for example, add the absolute values of all the entries of a matrix, just as we would with a vector. In essence, we can treat a matrix as a vector by simply ignoring the structure of rows and columns and using a vector norm. Additionally, there are other types of norms for matrices, which do take their structure into account. For instance, we could take the sum of absolute values for each column and say that the norm of the matrix is the largest of these sums. In the toy car example above, this corresponds to finding the colour with the most cars; the norm is the number of cars of that colour (here green, with 4 cars). As with vectors, we will not go into norms too deeply here, to keep things simple.
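As a rough illustration of the two norms just mentioned, the sketch below computes both for the toy car matrix in plain Python. The function names `entrywise_norm` and `max_column_sum_norm` are made up for this example.

```python
# Toy car matrix: rows are (large, small), columns are (red, green, blue, yellow).
cars = [
    [1, 2, 0, 1],
    [2, 2, 1, 0],
]

def entrywise_norm(a):
    """Sum of absolute values of all entries, ignoring the row/column structure."""
    return sum(abs(x) for row in a for x in row)

def max_column_sum_norm(a):
    """Largest sum of absolute values, taken column by column."""
    # zip(*a) iterates over the columns of a matrix stored as a list of rows.
    return max(sum(abs(x) for x in col) for col in zip(*a))

total_cars = entrywise_norm(cars)          # 9: the total number of cars
biggest_colour = max_column_sum_norm(cars) # 4: the green column has the most cars
```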
Finally, we can see how matrices interact with vectors. Matrices and vectors cannot be added together, since they are dissimilar structures, but perhaps we can define a type of multiplication that is valid for a matrix and a vector. To do so, we consider the following: suppose that, in the toy car example, we wish to find out how many large and how many small cars there are. We can do this very easily by adding the numbers along each row: there are 1 + 2 + 0 + 1 = 4 large cars and 2 + 2 + 1 + 0 = 5 small ones. But this is the same as taking the entries of each row, (w, x, y, z), multiplying each of them by 1 and then adding them, to obtain 1w + 1x + 1y + 1z. And this is precisely one type of multiplication between matrices and vectors. That is, we multiply our toy car matrix by the vector (1, 1, 1, 1), precisely by taking those sums.

This may seem like a silly example, but it can be extended. What if we really like green cars and hate yellow cars? We could assign a weight to each colour and get a vector, e.g. (1, 2, 1, 0). Now let's try multiplying our matrix by this vector, as we described. For the first row, we get 1*1 + 2*2 + 1*0 + 0*1 = 5 and for the second one, we get 1*2 + 2*2 + 1*1 + 0*0 = 7, giving the vector (5, 7). (We can think of this as the weighted vector of how much we like the large and small toy cars in the set, respectively.)
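The two calculations above (plain row counts and preference-weighted sums) can be reproduced with a few lines of plain Python. The helper name `weighted_row_sums` is a hypothetical choice for this sketch.

```python
# Toy car matrix: columns are (red, green, blue, yellow).
cars = [
    [1, 2, 0, 1],  # large
    [2, 2, 1, 0],  # small
]

def weighted_row_sums(matrix, weights):
    """For each row, multiply every entry by the matching weight and add the results."""
    return [sum(w * entry for w, entry in zip(weights, row)) for row in matrix]

# Weights of 1 for every colour simply count the cars in each row.
counts = weighted_row_sums(cars, [1, 1, 1, 1])       # [4, 5]

# Green counts double, yellow counts for nothing.
preferences = weighted_row_sums(cars, [1, 2, 1, 0])  # [5, 7]
```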
This is precisely how matrix-vector multiplication is performed. For each row of the matrix, we multiply each entry with the corresponding entry of the vector, then add these together. This gives us a new vector, which is the product of the matrix and vector we started with. Of course, for this to make sense, the dimensions need to match up. That is, to multiply the matrix A by the vector x, we need the vector x to have as many entries as A has columns. Then, the vector b = Ax will have as many entries as A has rows. Because x is written after A, this is called post-multiplying A by x.

We can also pre-multiply a matrix by a vector, writing the vector first. In this case, we use the columns of the matrix instead of its rows. So, to pre-multiply the matrix A by the vector p, p must have as many entries as A has rows. Then the resulting vector c = pA will have as many entries as A has columns. Furthermore, since a vector is simply a matrix with only one row (or column), this also allows us to multiply vectors together, as long as their dimensions match.
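Both products, b = Ax and c = pA, together with their dimension requirements, can be sketched as follows. This is a minimal plain-Python illustration; the names `mat_vec` and `vec_mat` are assumptions of this example, and the vector p = (1, 1) is chosen here only to show that pre-multiplying by all 1s gives the number of cars of each colour.

```python
def mat_vec(a, x):
    """Compute b = Ax: one entry of b per row of a."""
    if any(len(row) != len(x) for row in a):
        raise ValueError("x must have as many entries as a has columns")
    return [sum(entry * xi for entry, xi in zip(row, x)) for row in a]

def vec_mat(p, a):
    """Compute c = pA: one entry of c per column of a."""
    if len(p) != len(a):
        raise ValueError("p must have as many entries as a has rows")
    # zip(*a) iterates over the columns of a matrix stored as a list of rows.
    return [sum(pi * entry for pi, entry in zip(p, col)) for col in zip(*a)]

# Toy car matrix: rows are (large, small), columns are (red, green, blue, yellow).
cars = [
    [1, 2, 0, 1],
    [2, 2, 1, 0],
]

b = mat_vec(cars, [1, 2, 1, 0])  # [5, 7]: four weights in, two entries out
c = vec_mat([1, 1], cars)        # [3, 4, 1, 1]: two weights in, four entries out
```

Swapping the arguments' sizes, for example calling `mat_vec(cars, [1, 1])`, raises a `ValueError` in this sketch, which mirrors the requirement that the dimensions match up.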