Thursday, October 11, 2012

Mappings and Functions

This is a topic which is generally quite intuitive, but still gives trouble to many students, particularly since it is very rarely taught on its own.

We consider some sets A and B. We take each element of A and assign to it some subset of B. This assignment is called a mapping. More formally, if f is a mapping from A to B, then for each element a in A, there exists some subset B' of B, such that f(a) = B'. To illustrate, consider the set A as your 5 closest friends. Now let B be {tall, short, dark-haired, fair-haired, male, female}. Then you can have a mapping for each of your friends in A, which describes them in terms of the characteristics listed in B. For example, you might have some short, dark-haired female friend, let's call her Angela. Then if we say that f is the description mapping, f(Angela) = {short, dark-haired, female}.

As you can see, mappings are a general way of connecting the elements of one set with the elements of another set. These sets, of course, will generally not be friends and descriptors. In more general settings, we will have the sets belong to some number system. However, this does not make a difference in what the mappings themselves mean. That is, if we take your 5 closest friends and call them 1, 2, ... 5, then take the descriptors in B and call them, in order, 1, 2, ... 6, we can do the same thing as before. If, for example, Angela is friend 1, we have f(1) = {2, 3, 6}. This statement gives exactly the same information as the one before, i.e., that your friend Angela is short, dark-haired and female.

Naturally, not all mappings will be from or to finite sets. In fact, most interesting mappings only happen in infinite (countable or uncountable) sets. This makes no difference to the definition of the mapping. The idea remains the same: to each specific element of A, we assign some subset of B.

We now move to a specific type of mapping, called a function. A function from A to B is a mapping which assigns to each element of A a subset of B with a single element in it. In other words, it assigns one element of B to each element of A. For example, let's take the integers (... -1, 0, 1, 2, ...) and take the function f(x) = x * x. Then f(0) = 0, f(1) = 1 = f(-1), f(2) = 4 = f(-2) and so on. Note that different elements of A can be assigned the same element from B, but not the other way around. We call A the domain of the function and B the codomain of the function. Furthermore, the set of all elements of B which are assigned to some element of A by f is called the range of f. In the above example, f will never give a negative number. Its range is the set of perfect squares: 0, 1, 4, 9 and so on. In addition, we call f(x) the image of x under f.
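
To make this concrete, here is a small Python sketch of the squaring function on a slice of the integers (the code and names here are my own illustration; the post itself uses no code):

```python
# The squaring function f(x) = x * x, viewed on the integers -3..3.
def f(x):
    return x * x

domain = range(-3, 4)                 # -3, -2, ..., 3
images = {x: f(x) for x in domain}

# Different elements of the domain can share an image...
assert images[-2] == images[2] == 4

# ...and the range is the set of values actually reached, all of them
# perfect squares:
rng = set(images.values())
print(sorted(rng))  # [0, 1, 4, 9]
```

On this finite slice the range is {0, 1, 4, 9}; over all of the integers it would be every perfect square.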

The reason that functions are important is that, under certain conditions, they can be composed together. That is, if we have three sets A, B and C, and two functions f and g, such that f is a function from A to B and g is a function from B to C, we can define the function h = gf from A to C. As an example, consider the functions f(x) = x * x and g(y) = y + 1. Then we can compose them together to get g(f(x)) = x * x + 1.

Now the question is, under what conditions can we compose functions like that? The only real requirement is that the range of f is a subset of the domain of g. That is, we need that whatever f gives as its "output" can be plugged into g. If this is not the case, then there will be some f(x) to which g cannot assign any value. We then say that g cannot be composed with f.
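
A minimal sketch of composition, using the same f and g as above (the helper name `compose` is my own):

```python
def f(x):
    return x * x          # f: Z -> Z; its range (the squares) lies in Z

def g(y):
    return y + 1          # g: Z -> Z

def compose(g, f):
    """Return h = gf, the function with h(x) = g(f(x))."""
    return lambda x: g(f(x))

h = compose(g, f)
print(h(3))  # 10, i.e. 3 * 3 + 1
```

The composition is legitimate here because everything f can output is a valid input for g.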

We will now look at more specific families of functions. The first such family is the "one-to-one" functions, or "injections". These are the functions for which, if f(x) = f(y), then x = y. That is, if we take two distinct elements in the domain of f, their images in the range of f are also distinct. If we have such a function, then we can also talk about the idea of a pre-image. That is, if f(x) = y, we say that x is the pre-image of y. If f is not one-to-one, we cannot say that, as there might be some other element w, for which f(w) = y. But the restriction that at most one element can map into y means that the idea of a pre-image is possible.

The second family we will consider are the "onto" functions, or surjections. These are the functions for which the codomain is also the range. If we take f(x) = x * (x - 1) * (x + 1) as a function from the real numbers to the real numbers, it is an onto function. For any y you pick, there exists some x for which y = f(x). It is not a one-to-one function, since f(1) = f(0) = f(-1) = 0, but it is onto. Onto functions are important, because if a function f from A to B is onto, it can be composed with any function g with domain B.

Finally, we consider the functions which are both one-to-one and onto. These are known as bijections, or invertible functions. If f is a bijection, it means that every element in the domain, A, maps to exactly one element in the codomain, B, and every element of B is the image of exactly one element of A. This means that, going in reverse, we can assign to each element in B its preimage under f in A and make a new function g, which will be the inverse of f. For example, let us take the function f(x) = x + 1. It is both one-to-one and onto, so it is invertible. Since it maps x to x+1, we can simply reverse the mapping to get a function g, which maps x+1 to x. This function g will be the inverse of f. Clearly, this function, as expected, is g(x) = x - 1.
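
The inverse pair above can be checked mechanically; a quick sketch:

```python
def f(x):
    return x + 1

def g(x):
    return x - 1   # the inverse of f

# Composing a bijection with its inverse returns the original input,
# in either order:
for x in range(-5, 6):
    assert g(f(x)) == x
    assert f(g(x)) == x
```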

As you can see, even a function as simple as f(x) = x + 1, belongs to a very special family of functions. This is unsurprising. The functions which are the most powerful and have the most desirable characteristics are generally very restricted. Though we generally think of functions as simple things like x + 1, x/2 and so on, these are very well-behaved functions, and not representative of the majority of functions.

Saturday, August 25, 2012

Graph Theory (II)

It is now time to move away from the introductory posts, and delve deeper into specific topics. Being a graph theorist myself, I decided to make Graph Theory my first non-introductory post.
(For the introductory post to this field, please read this post.)

In the previous post on this topic, I finished with certain questions one can ask about graphs. The first of these is about the connectivity of graphs. Namely, the question is, given two vertices u and v in a graph, can we find some way to get from u to v, following edges in the graph? This is a very easy problem to visualise. If a graph is made up of two or more parts (known as components), then it is impossible to get from a vertex in one component to a vertex in another component. We say that the graph is disconnected. Now, let's say that for every pair of vertices u and v in a graph, you can find some path u, x1, x2, ... , v which moves through consecutively adjacent vertices x1, x2 and so on. Then there cannot be more than one component in the graph and we say the graph is connected.
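
The question "is there a path from u to v?" can be answered with a standard breadth-first search. A sketch in Python (the graph below is a made-up example with two components):

```python
from collections import deque

def connected(adj, u, v):
    """Breadth-first search: is there a path from u to v?
    `adj` maps each vertex to the list of its neighbours."""
    seen, queue = {u}, deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return True
        for y in adj.get(x, []):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return False

# Two components: {a, b, c} and {d, e}.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": ["e"], "e": ["d"]}
print(connected(adj, "a", "c"))  # True
print(connected(adj, "a", "d"))  # False
```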

This question of connectivity can be easily extended in two important ways. The first one deals with the best way to get from one vertex to another. Let's say that the vertices correspond to cities and the edges to roads connecting them. Now, for simplicity, let's say that each edge costs you 5 euros in gasoline. Then, to minimise your costs, you will want to find the path that uses the fewest edges. Going from Munich to Bologna and from there to Paris is possible, but it is generally much cheaper to go as directly as you can. This leads to the idea of the shortest path. That is, given vertices u and v in a graph, what is the shortest path between them? Clearly, for any two vertices, if you can find some path between them, a shortest path must also exist. This is because paths have non-negative length, so you cannot reduce path length indefinitely.

One of the most well-known methods for finding shortest paths is Dijkstra's Algorithm. It works by searching for the shortest path from the starting vertex u to every other vertex. This is done by splitting the graph into two parts: the explored vertices and the unexplored vertices. Additionally, for every vertex v, the algorithm maintains a tentative distance from u to v. Initially, only the starting vertex is explored. At each step, the algorithm picks the unexplored vertex with the shortest tentative distance and explores it. Whenever a vertex v is explored, the algorithm tries to shorten the tentative distances of the neighbours of v, by routing through v. If a path shorter than what was available before can be found this way, these neighbours are updated accordingly.
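
Here is a compact sketch of Dijkstra's Algorithm in Python; the three-city road map is a toy example of my own, with every edge costing 5 euros as above:

```python
import heapq

def dijkstra(graph, start):
    """Shortest distances from `start` to every reachable vertex.
    `graph` maps each vertex to a list of (neighbour, edge_cost) pairs."""
    dist = {start: 0}
    explored = set()
    heap = [(0, start)]                      # (tentative distance, vertex)
    while heap:
        d, v = heapq.heappop(heap)
        if v in explored:
            continue
        explored.add(v)                      # v's distance is now final
        for u, cost in graph.get(v, []):
            new = d + cost
            if new < dist.get(u, float("inf")):
                dist[u] = new                # a shorter path via v was found
                heapq.heappush(heap, (new, u))
    return dist

# A toy road map; every road costs 5.
roads = {
    "Munich":  [("Bologna", 5), ("Paris", 5)],
    "Bologna": [("Munich", 5), ("Paris", 5)],
    "Paris":   [("Munich", 5), ("Bologna", 5)],
}
print(dijkstra(roads, "Munich"))  # {'Munich': 0, 'Bologna': 5, 'Paris': 5}
```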

The reason this algorithm works is that the shortest path between two vertices is made up of other shortest paths. That is, if the shortest path from u to v goes through x, then it contains the shortest path from u to x, followed by the shortest path from x to v. The algorithm seeks all such shortest subpaths and picks the best available one.

The second way in which connectivity can be extended is with the notion of k-connectivity. Let's say there are several paths between two vertices, u and v. If we were to remove some vertices from the graph (and all the edges that go with them) would there still be a path between u and v? How many vertices make it impossible for us to find such a path? What about removing edges? The ideas of connectivity are important in various problems of resource management. Suppose you have to provide internet service to some locations. You would use various connecting stations, which would handle the traffic of information and route it from place to place. But what if some of these stations were to be damaged? Could your network cope? And how many stations can you safely shut down at a time for maintenance, without compromising the quality of your service? As you can see, the question of connectivity is very important for proper resource management.

Friday, August 24, 2012

Algorithms

The joke goes that when George W. Bush found out that colleges were teaching their Computer Science majors about Algorithms, he complained about unequal representation and asked that they also teach GeorgeBushisms as well. But what exactly are Algorithms? Why do computer scientists learn about them? What do they, in fact, do?

Imagine you are about to do something. This "something" could be a very simple task, like opening your internet browser, or something more complicated, like baking a cake. How do you go about doing it?
If you think about it, all processes have certain steps, which need to be carried out, usually in a precise sequence, in order to be completed. Opening your browser can, for example, be achieved by moving your mouse until the cursor is pointing to the relevant icon, then double-clicking. Performing these steps out of sequence probably won't get you very far, unless you're lucky.

Algorithms are nothing but step-by-step instructions for performing a task. We usually use the word to denote a mathematical or computational task but, in fact, the instructions for opening your internet browser are also an algorithm.

Now, let's take a different task. Take a piece of paper and a pen. Draw a straight line. Now draw a second line, which starts wherever the first one ended and goes in a different direction. Draw a third line, starting from the same point, but in a different direction. Now connect the remaining three end points with a curve.
Did you get the Mercedes-Benz logo? Probably not. The reason for this was that my instructions were extremely imprecise. I did not tell you how to orient the lines. I did not tell you what angles to make between them. I did not tell you how long to make them. I did not tell you to make the final curve a circle. All these omissions most likely made your drawing something completely arbitrary, not resembling the mentioned logo at all. An algorithm cannot be vague like that. Each instruction must be precise enough that anyone who attempts to perform the same task under the same conditions will get the same result. (With the exception of using randomness, like rolling a die.)

When it comes to computers, every piece of software is a sequence of instructions to the computer, telling it what to do, from moving numbers around, to accepting some sort of input, to controlling your enemies in a video game. These instructions are in the language of the computer itself, so there is no longer a problem of vague instructions. So, why is there a need to study algorithms? There are two main reasons. The first one is that, even though a set of instructions can be precise, this doesn't mean that it will do what is intended. The study of algorithms helps us understand ideas like data flow, prerequisites and programme correctness. The second reason is that the same task can often be performed in many different ways. Instead of using a mouse to double-click on the correct icon, you may also be able to open your browser by going into a command prompt and typing the appropriate commands to get to the correct folder, run the correct file and so on. Sometimes, one of the ways of performing a task is better (faster, more efficient, less prone to error and so on) than another. However, which method is better is not always clear. This is where algorithm analysis helps. We can analyse a given set of instructions in terms of the time it takes, the possible errors it may encounter and even its correctness, before even running the programme itself.

Wednesday, August 8, 2012

Game Theory

This topic is one of the main parts of what is known as Decision Mathematics. Naturally, the title Game Theory makes the field sound as if it were about people playing chess or poker; about recreational activities. This is actually not the case.

Game Theory is all about making choices, in given scenarios. What makes these scenarios special is that you are not the only one making choices; others are doing this as well and the outcome is affected by their choices as well as yours. One such example is that of conflict. In a battle between two opposing sides, each side may choose to fight or retreat (there are actually many more choices, but I am simplifying here for illustration purposes). If one side fights and the other retreats, the side that fought gains some advantage, e.g. some high ground. In game theory, we may assign a certain amount of points to this outcome, for example the attacking side gets 1 point, whilst the retreating side gets -1 point, to quantify the advantages gained and lost. On the other hand, if both sides fight, they will have a lot of casualties, so both sides get -2 points. Finally, if both sides retreat, they have not yielded anything to the opponent, so they both get 0 points.

Let us assume you are a military advisor of sorts, to one of the sides. What would be the best plan for this battle? Should you attack or retreat? If we look at the possible results, fighting will get you either 1 or -2 points, whilst retreating will get you 0 or -1. If you want to make sure that you do no worse than a score of -1, you should retreat. This is known as the maximin strategy, as it seeks to maximise the minimum possible result. On the other hand, attacking ensures that the enemy will certainly have a score of at most -1. This would be an aggressive tactic. Of course, if the opponent is known to be aggressive, attacking will result in the worst possible score, for both sides. On the other hand, if the enemy is known to retreat a lot, attacking will cause a gain in points for your side.
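
The maximin reasoning above can be written out directly. A sketch, using the point scores from the battle scenario (the dictionary layout and function name are my own):

```python
# Payoff matrix for "your" side: rows are your choice, columns the enemy's.
# Entries follow the battle example: fight/fight = -2, fight/retreat = 1,
# retreat/fight = -1, retreat/retreat = 0.
payoffs = {
    "fight":   {"fight": -2, "retreat": 1},
    "retreat": {"fight": -1, "retreat": 0},
}

def maximin(payoffs):
    """Pick the choice whose worst-case payoff is largest."""
    return max(payoffs, key=lambda choice: min(payoffs[choice].values()))

print(maximin(payoffs))  # retreat (worst case -1 beats worst case -2)
```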

As you can see, this sort of decision making is not always easy. In fact, depending on the scenario presented, it could be that every choice is a "bad" one. Or, alternatively, there can be an "obviously good" choice, which is always the best to choose. All these considerations are what Game Theorists study. From the above, it is obvious that the ideas of game theory have a clear practical application in such things as warfare (as well as politics, business management and so on). This may seem like a stretch, considering the simplicity of the scenario I presented, but the ideas were applied during the Cold War, amongst others. And since nuclear arms were already available by then, it can be seen how desirable it is that military advisors should be able to identify strategies such as the "maximin" as the best ones to take.

As an exercise, you can try to teach yourself some elementary decision-making. Make your own scenario for two sides in a game, each having two options and each pair of options resulting in a certain point score. It should be fairly easy to figure out what scores make a perfect strategy possible. What about more choices? Or more players? Or both? Is it always clear if a scenario has a perfect strategy or not?

Saturday, July 28, 2012

Infinity and Cardinality

Infinity is a topic which is generally rather hard to grasp. Our world does not readily provide anything infinite for us to study in the Sciences. Indeed, if some structure were infinite, we would not be able to study it fully; we would not even be able to find all its edges. It would have no shape, no measurable volume, no mass, energy or velocity which we would be in any way able to comprehend. And yet, in mathematics, the idea of infinity can actually have meaning, which shows that our imagination is not solely confined to the extent of our experiences. More than that, infinity can actually help us understand the finite physical, chemical, biological etc. structures with which scientists deal all the time. Infinity helps us understand sequences, which in turn helps in the understanding of limits, giving rise to the ideas of calculus, on which most contemporary Science and Engineering methods rely.

So, what is infinity? In general, we often treat it as a number. We may put it in the limits of a sum, a product or an integral, we may even give it as the result of these things. However, infinity is not, in fact, a number. (This assertion is true in the reals. There are formal systems, like the hyperreal numbers, where infinity is a number. These systems are not what I am talking about here.)

Probably the most important work on infinity was done by Georg Cantor, when he formalised the ideas of infinite sets. Indeed, the best way to think of infinity is as a measure of the elements of an infinite set. As for finding such an infinite set, this is simple. We can simply take the positive integers (also known as the natural numbers), 1, 2, 3... and so on. We call this set N and consider the number of elements it contains. Obviously, it cannot be empty, as it contains the number 1. It cannot have only 1 element, as it contains the numbers 1 and 2. Similarly, it cannot contain only 100 elements, as every number from 1 to 101 is in N. For any positive integer we think of, we can find that many elements in N, and more. This is the very idea of infinity: something that is not bounded, which is larger than any number.

As we can see, this set of the natural numbers is infinite, in that it has infinitely many elements. Naturally, this means that any set which contains the natural numbers and others as well (a "superset" of N, as it is called) would also be of infinite size. We can take, for example, the set of all integers, Z. This includes the natural numbers (since they are the positive integers), as well as the negative integers and zero. Intuitively we can see that this must be a bigger set than N. Unfortunately, our intuition is wrong. When it comes to infinite sets, the only way to decide if Z is larger than N is to examine whether a bijection (one-to-one and onto function) exists between the two sets. If such a function can be found, then automatically the sets must have the same "size". Note that the natural numbers are essentially the numbers we count with, 1, 2, 3 and so on. So, finding a bijection between Z and N is the same as finding some way of listing all the elements of Z, without missing any, in a certain order. If such a sequence exists for Z, then Z and N have the same "size". This sequence is extremely easy to come up with: 0, 1, -1, 2, -2... and so on. Obviously this ordering will go through every single integer, much in the same way that 1, 2, 3... goes through every natural number.
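
The listing 0, 1, -1, 2, -2... is easy to generate; a sketch:

```python
import itertools

def integers():
    """Yield all of Z in the order 0, 1, -1, 2, -2, ...; pairing the
    positions 1, 2, 3, ... with these values is exactly a bijection
    between N and Z."""
    yield 0
    n = 1
    while True:
        yield n
        yield -n
        n += 1

print(list(itertools.islice(integers(), 7)))  # [0, 1, -1, 2, -2, 3, -3]
```

Every integer appears exactly once, at a finite position, which is what the bijection requires.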

The above shows us that Z has, in a sense, the same number of elements as N. Except that, as they both have infinitely many elements, we are not really talking about a number of elements. Because of this, the idea of cardinality was formulated. Cardinality can be thought of as the size of infinite sets, like the naturals or the integers. So, our example above shows that in fact both N and Z have the same cardinality, even if one is a subset of the other. This is a very important concept; infinity does not behave in the same way that other numbers do. Cutting infinity in half or doubling it does not actually change its size at all.

At this point one might think that all infinite sets must have the same cardinality as the natural numbers. After all, it should be easy to find a sequence that lists all the elements in a set. This is often the case; sets like the rational numbers, the algebraic numbers, the set of polynomials with integer coefficients and so on are certainly all countable (i.e. have the same cardinality as N). But there are sets which are actually much larger. The set of real numbers is actually uncountable. Any attempt to make a comprehensive list of all the reals will result in failure. Indeed, any such list can be shown to be missing some real numbers (in fact, an infinite amount of them). The proof of this uses a rather elegant method devised by Cantor, known as Cantor's diagonal argument. I do not propose to present it here, as it is readily available online in other places. However, it should be clear from all this that the subject of cardinalities and the nature of infinity is one with a lot of substance.

Wednesday, July 25, 2012

Group Theory

This post will assume some fundamental knowledge of Set Theory, as well as of functions and mappings. Only the basic ideas, e.g. what a set is, what functions are, what a bijection is, and so on, are required.

Ideally, I should have titled this post "Algebra", however I hesitate to do this, as there are many misconceptions about what algebra actually is. In middle-school people learn it informally as equations with letters replacing some of the numbers. In fact, algebra is the study of relations between mathematical objects.

A group G is a set which has a built-in binary function, F(x, y). This function takes two elements from the set and returns one element from the set. This property is known as closure and is one of the requirements for a group. The other properties a group must have, with respect to its binary function are:

1) An identity element. This is an element e for which F(e, x) = F(x, e) = x. Think of this as the number 0, when we add integers. 0 + x = x + 0 = x, for any integer x. Note that the identity element is unique.

2) Inverses. Every element of the set must also have an inverse in the set. Two elements a and b are inverses of each other if F(a, b) = F(b, a) = e, where e is the identity. This is similar to the negative of a number. When adding integers, x + (-x) = (-x) + x = 0. Note that each element has a unique inverse. This includes the identity, which is its own inverse.

3) Associativity. Using the addition example, this means that (a + b) + c = a + (b + c). That is, the grouping of the additions (which one is carried out first) does not matter; the order of the elements being added, however, may.

The last part may seem strange, in light of the addition example. After all, 2 + 3 = 3 + 2 = 5. The order is irrelevant! This is, in fact, not the case with more general groups. For example, if we take the set of 2x2 matrices with non-zero determinant, under normal matrix multiplication, they form a group. But in this group, AB is not, in general, equal to BA. The order of the elements added (or multiplied) matters in many groups. The groups for which a + b = b + a are known as abelian groups; the integers under addition are such a group.
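
For finite sets, the group axioms can be checked by brute force. A sketch (the function name `is_group` is mine; this is an illustration, not an efficient test):

```python
from itertools import product

def is_group(elements, F):
    """Brute-force check of the group axioms for a finite set."""
    elements = list(elements)
    # Closure: F must map pairs of elements back into the set.
    if any(F(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Identity: exactly one e with F(e, x) = F(x, e) = x for all x.
    ids = [e for e in elements
           if all(F(e, x) == x == F(x, e) for x in elements)]
    if len(ids) != 1:
        return False
    e = ids[0]
    # Inverses: every a needs some b with F(a, b) = F(b, a) = e.
    if any(not any(F(a, b) == e == F(b, a) for b in elements)
           for a in elements):
        return False
    # Associativity: grouping must not matter.
    return all(F(F(a, b), c) == F(a, F(b, c))
               for a, b, c in product(elements, repeat=3))

# The integers modulo 5 under addition form a group...
print(is_group(range(5), lambda a, b: (a + b) % 5))   # True
# ...but under multiplication they do not: 0 has no inverse.
print(is_group(range(5), lambda a, b: (a * b) % 5))   # False
```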

Having read all these rather abstract notions which define a group, one may ask what the point of Group Theory is. Why should it be a field of study, when it seems so disconnected from anything which can relate to the experience of our senses? In fact, the same question could be asked of most high-level fields of Mathematics, and the answer will roughly be the same. There are in fact three reasons to study Group Theory, or indeed any sort of Mathematics. The first is because of the applications they have. With Group Theory these may not be readily apparent, however modern Physics, Chemistry and Cryptography rely on the study of Group Theory. The second reason is that we humans have a natural curiosity and a tendency to look for patterns. In many ways, the study of Mathematics in general and Algebra in particular is the study of patterns. The third reason is that mathematical study helps one develop a rational way of thinking. Abstracting problems into mathematical models is the only way which has been discovered so far, which is consistently successful in solving said problems.

Wednesday, June 13, 2012

Iteration and Chaos Theory

Obviously, this post will not be too involved, as the subject is an extremely deep one. In all probability, you have already heard about chaos theory, from some source or other. The idea, such as it was, may have been that, for example, a butterfly flapping its wings in the desert, can cause hurricanes, or something similar. As analogies go, it is a horrible one, since it is extremely obfuscated. The only way to understand what it is an analogy for, is to already know the thing it is an analogy for.

In fact, what the butterfly truism tells us is that, with some processes, a small change in the initial conditions can have a disproportionate, enormous effect on the results. It is not that a butterfly causes hurricanes; but if all the other necessary conditions are already there (within a specific distance of the equator, a sufficiently high temperature, a sufficiently low pressure, a large area of water), then a butterfly's flapping could be the catalyst which starts a process, which itself eventually pushes the system over the edge and causes the hurricane to emerge. In the same way, however, a butterfly's flapping wings could be the catalyst that slightly increases the air pressure and therefore stops a hurricane from appearing, instead.

The type of process studied by chaos theorists is known as a dynamic process. In general, one tries to find out the effect of taking a function and applying it to a successive sequence of inputs, each input generally being the previous output. As a very trivial example, we could take the function f(x) = 2x. If we start with some number for x, e.g. 3, and then each time we apply the function, we use the new result as our new x, we get the following sequence: 3, 6, 12, 24, 48, 96, 192 and so on. Clearly, the numbers in this sequence will grow without limit. That is, the sequence goes to infinity, or diverges. If, instead, we start with x = 0, our sequence becomes 0, 0, 0..., i.e. x never changes. The sequence is convergent; indeed, it is constant.

The process of applying the same function successively to its own outputs is known as iteration. The numbers 3 and 0, used as the initial values are known as the seeds for our iteration. 0 is also known as a fixed point of the function.

As we can easily see, unless x = 0, this function will diverge. This makes it simple to understand and therefore uninteresting. The same is true of a function like f(x) = x/2 where, for any seed, we get convergence to the fixed point 0. However, there are functions which produce surprisingly unpredictable results. These are known as chaotic functions. For example, we have the function f(x) = 2 - x^2. It has two fixed points, but beyond those, almost any seed between -2 and 2 will create a sequence that is simply unpredictable in its course. Naturally, the same is true of other types of functions.
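
Iteration itself is only a few lines of code. A sketch; the map f(x) = 2 - x^2 used below is a standard example of a chaotic quadratic on the interval [-2, 2], showing sensitive dependence on the seed:

```python
def iterate(f, seed, n):
    """Return the orbit seed, f(seed), f(f(seed)), ... (n terms)."""
    orbit = [seed]
    for _ in range(n - 1):
        orbit.append(f(orbit[-1]))
    return orbit

# The doubling map diverges from any non-zero seed:
print(iterate(lambda x: 2 * x, 3, 6))   # [3, 6, 12, 24, 48, 96]

# A chaotic map: two orbits from nearly identical seeds soon stop
# resembling each other (sensitive dependence on initial conditions).
f = lambda x: 2 - x * x
a = iterate(f, 0.5, 30)
b = iterate(f, 0.5001, 30)
print(abs(a[-1] - b[-1]))   # the tiny initial gap has been hugely amplified
```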

As I mentioned, a chaotic function is interesting, because it is not simple to predict how it will behave over iteration. For example, it is not a simple matter to predict stock market fluctuations. This is partly because of the huge number of parametres involved, but also because of the chaotic nature of the processes involved.

Saturday, May 26, 2012

Combinatorics

Combinatorics is usually one of the first non-introductory topics in discrete mathematics that students encounter in their mathematical studies. This also makes it one of the hardest topics to get to grips with, as it often requires a different way of thinking from other mathematical fields, such as calculus.

Combinatorics encompasses various subfields of study of discrete structures. In general, when given a problem in combinatorics, it will usually belong to one of three categories:

1) The existence problem. Given a problem, with specific characteristics and parametres, does a solution to this problem exist? For example, the dean's office at a college will want to assign time slots and classrooms to the various courses which will be taught in the semester. Such an assignment must not have two courses at the same time if the same professor is teaching them. (It is also usually desired that, if a lot of students are expected to take two courses concurrently, these courses not be assigned at the same time, if possible.) Furthermore, if two courses are at the same time, they should not be assigned to the same room. Finally, if certain courses are expected to have large numbers of students, they should be assigned to large classrooms, so that everyone can have a seat. Given a list of courses, time slots and classrooms, can such an arrangement be found? This is the existence problem. Sometimes there are no solutions to these problems. If, for example, there is only one professor teaching both calculus and algebra, both classes need the same single amphitheatre, and there is only one available time slot, the problem becomes impossible.

2) The counting problem. As mentioned before, some problems cannot be solved. Sometimes, on the other hand, a problem will have more than one possible solution. In our classroom assignment question, there could be several different assignments, which all work with the given parametres. The counting problem asks us to find how many different ways a problem can be solved. If there are two courses to be assigned to two different classrooms, at the same time, we can do this in two different ways; either have the first course in the first room and the second one in the second room, or vice versa.
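
The two-courses, two-rooms count can be reproduced in a few lines; a sketch of the counting problem in miniature (course and room names are made up):

```python
from itertools import permutations

courses = ["calculus", "algebra"]    # hypothetical course names
rooms = ["room A", "room B"]

# Each solution pairs the courses with a permutation of the rooms.
assignments = [dict(zip(courses, p)) for p in permutations(rooms)]
for a in assignments:
    print(a)
print(len(assignments))  # 2, exactly the two ways described above
```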

3) The optimisation problem. Having figured out the different ways in which we can solve a given problem, we may then be interested in the best (whatever this means) solution out of the possible ones. For example, in the classroom assignment problem, we may want to pick the assignment that uses the fewest time slots, or the one that uses the fewest classrooms, or even the one with the fewest clashes between courses with common students. The optimisation problem is probably the most important one of the three, in the sense of solving applied problems, but it is often impossible to get started on, without a thorough understanding of the other two.

I will close this post by mentioning that the field of combinatorics is probably the largest one in mathematics, as it encompasses theory which is needed for almost all other mathematical fields. It would be impossible for a simple blog to present it properly, but perhaps in future posts, I may be able to present some of the more fundamental ideas.

Sunday, April 29, 2012

Graph Theory

We come now to my own personal field of interest, the subject of graph theory. This has nothing to do with pie charts or bar graphs; these are a different kind of graph.

Formally, a graph is a special type of mathematical structure, composed of two parts. The first part is a set, a collection of distinct elements called vertices or nodes. The second is a collection of pairs of vertices. These pairs are the edges. Visually, we represent a graph as a network. The vertices are points and the edges are lines connecting these points. More specifically, we are usually interested in graphs with a finite vertex set and where the edges are distinct and connect distinct vertices. The networks we draw for these graphs will have at most one edge between two vertices and no loops on any single vertex.

Graph theorists are interested in many different characteristics of graphs. These characteristics are studied because there are many real-world problems which are more easily solved with graphs. For example, computer networks could not run smoothly and successfully without a proper understanding of graph theory, resource allocation problems would become much harder to solve, and so on and so forth.

The biggest problem for students learning graph theory is its apparent lack of connection to other mathematical topics. More often than not, students have a background that is (perhaps detrimentally) over-abundant in calculus (or rather, the techniques of calculus) and therefore lack a proper foundation in more fundamental branches of Mathematics, including set theory. This means that they will be bombarded with several new definitions, almost none of which have any presence in their mathematical background; they will have no previous knowledge to draw upon to help them understand these new topics. In this post (and hopefully future ones), I will attempt to give these definitions at a slower, more relaxed pace than a graph theory course would allow. Perhaps this method would be best suited to a two-semester graph theory course, but that is neither here nor there.

We first consider the idea of measuring things about a graph. How large or small a graph is can depend on what we are using it for. The two most general ways in which we measure graphs are the number of vertices a graph has (known as the order of the graph) and the number of edges it has (known as the size). For example, if you look at a map of, say, major motorways and represent these motorways as edges in a graph, the size of the graph will be the number of motorways you have. On the other hand, the order will be the number of destinations (cities, for example) that are connected by these motorways. The order and size of a graph give us an idea of how large our graph is. They also tell us other things, like how dense or sparse a graph is. For example, if a graph has 10 vertices, the largest number of edges it can possibly have is 45. A graph with 10 vertices and only 15 edges could be considered very sparse; if it instead had 40 edges, it would be very dense. The ideas of density and sparsity become more important when other measurements are made on graphs, but they are interesting enough on their own as well.
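The numbers above can be checked with a short sketch; the density measure used here is simply the size divided by the maximum possible size (other definitions exist):

```python
def max_edges(n):
    # A simple graph on n vertices has at most n*(n-1)/2 edges:
    # one for each unordered pair of distinct vertices.
    return n * (n - 1) // 2

def density(order, size):
    # The fraction of possible edges actually present.
    return size / max_edges(order)

print(max_edges(10))              # 45
print(round(density(10, 15), 2))  # 0.33: quite sparse
print(round(density(10, 40), 2))  # 0.89: quite dense
```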

The next simplest definition we can give for graphs is that of adjacency. We say that two vertices are adjacent (or neighbouring) if there is an edge between them. That is, if we have an edge connecting the vertices u and v, then u and v are adjacent. We also say that v belongs in the neighbourhood of u (and vice versa). In computer networking, if we take a graph of connections in a server-client model, we could say that a server is adjacent to the clients it is connected to.

It is often the case that we want to know how many neighbours a vertex has (usually because we want to minimise them or maximise them). The number of neighbours of a vertex u, or the number of vertices adjacent to u, is denoted as d(u) and is known as the degree of u. Vertices with many neighbours will have a high degree. In the server-client model, the degree of each server is the number of clients it connects to.
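A minimal sketch of neighbourhoods and degrees, using an invented server-client network:

```python
# Adjacency lists for an invented server-client network.
neighbours = {
    "server": {"client1", "client2", "client3"},
    "client1": {"server"},
    "client2": {"server"},
    "client3": {"server"},
}

def degree(u):
    # d(u): the number of vertices adjacent to u.
    return len(neighbours[u])

print(degree("server"))   # 3: the server connects to three clients
print(degree("client1"))  # 1
```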

Clearly, there are many more questions one can ask about graphs. (Is there a way to get from a certain vertex to another, following the edges between adjacent vertices? Is there a way to start at some vertex in a graph, and go through every edge exactly once, going sequentially through adjacent vertices? How many colours do we need to colour the countries of a map, so that no neighbouring countries share the same colour?) However, this post is a mere introduction to the idea of what a graph is, and will not delve any deeper into the theory. Perhaps in future posts, I will try to answer these and other questions about graphs.

Sunday, March 25, 2012

Finite State Machines

This topic will appear to many people not to be mathematical in nature, but that is a misconception. I suspect that the most common reason for this misconception, and other similar ones, is that many people confuse mathematics with arithmetic, statistics, geometry or trigonometry. It is true that these topics are part of mathematics, but they are by no means an exhaustive list of mathematical topics. In fact, most mathematical topics have very little to do with arithmetic, statistics and the rest. In this case, the field in question is the theory of computation and complexity, a part of mathematics which mostly has applications in computer science.

A finite state machine (FSM) is at the same time a mathematical structure, a simple type of computer and a language checker. Formally, it is composed of five parts. These are:
a) An alphabet. This is the collection of all symbols that can be passed into the machine. It can be the Latin alphabet, the numbers 0 and 1, or any other predetermined set of symbols.
b) A set of states. These are abstract states the machine can be in. Since these machines generally relate to a physical model, the states will have some meaning.
c) An initial state. This is how the machine starts out, before any input is given.
d) A transition function. This tells the machine how to change its state according to each new input.
e) A set of final states. When the input stops, if the machine is in a final state, we say that the input was accepted. If on the other hand, a certain input ends with the machine at a non-final state, we say that the input was rejected.

To illustrate all this, I will give two common examples. The first one is for an automatic door. Let's say you have an automatic door. For simplicity, its states will be "Closed" and "Open". The "alphabet" will be an input of "0" if no one is within a small area around the door and "1" if there is someone in this area. The initial state will be "Closed". The transition function is as follows: if the input is 0 and the door is open, close it; if the input is 1 and the door is closed, open it; otherwise, make no changes. Finally, we can say that both our states are acceptable final states, again for simplicity. As you can see, this FSM models our automatic door fairly well. The question arises: why do we need this model? The answer is that this is how the door can be made automatic. With this model, we can create the appropriate digital control system, which will make the door work in this way. In fact, the actual electronics that make the automatic door work are nothing more than a finite state machine, although a more complex one than our simplistic model here.
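Here is one way to sketch the door FSM in Python. The state and symbol names follow the description above, but the encoding itself is just an illustration:

```python
# The five parts of the door FSM described above.
alphabet = {"0", "1"}
states = {"Closed", "Open"}
initial = "Closed"
final_states = {"Closed", "Open"}  # both states are acceptable

def transition(state, symbol):
    if state == "Open" and symbol == "0":
        return "Closed"   # no one nearby: close the door
    if state == "Closed" and symbol == "1":
        return "Open"     # someone approaches: open the door
    return state          # otherwise, make no changes

state = initial
for symbol in "1100":     # someone arrives, lingers, then leaves
    assert symbol in alphabet
    state = transition(state, symbol)
print(state, state in final_states)  # Closed True
```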

The second example deals with numbers. We want to make a machine that takes in strings of zeroes and ones as input (I keep using this alphabet because it is essentially how computers work, with 0s and 1s) and accepts those strings that end in 1; this corresponds to accepting odd integers. To do this, we define our start state (0) and our final state (1). These are enough for the purpose of this simple machine. Our transition function will be: move to the state which is the same as the symbol we entered. That is, if the most recent symbol is a 0, go to state 0; if it is a 1, go to state 1. Now, if a string that ends in a 0 is given, our machine will finish in state 0, which is not a final state, so the string will be rejected. Our machine only accepts the strings that end in a 1.
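The odd-number checker is even shorter to sketch; the states are named after the symbols, exactly as the transition function prescribes:

```python
def accepts(string, start="0", finals=("1",)):
    # Transition: move to the state named after the most recent symbol.
    state = start
    for symbol in string:
        state = symbol
    return state in finals

print(accepts("1011"))  # True: the string ends in 1 (an odd number)
print(accepts("1010"))  # False: it ends in 0, so the machine rejects it
```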

You may think that the second example has little to do with a physical model, compared to the door example before it. This is only partly true. The second example helps us understand a very important concept in the fields of mathematics, computer science and linguistics: that of language and grammar. Formally, languages and grammars can be separated into groups according to their machine complexity. The grammar above ("all strings of 0s and 1s that end in 1") belongs to the group of languages which can be formally defined by a FSM. These are known as "regular" or Type-3 grammars. Other grammars require more complex machines in order to be defined. For example, we have the Type-2 or "context-free" grammars, which can be defined by a type of machine called a "push-down automaton" or PDA. This is a more complicated machine than the FSM, as it also involves an additional structure, known as a stack, which maintains some history of the input, whereas the FSM only knows its current state.

I would like to end by pointing something very important out. Everything I said above may sound extremely theoretical, but it is vital for the design of personal computers. As you are probably viewing this blog on some machine that can access the internet and decode hypertext into readable text, it is safe to assume that your machine contains any number of these FSMs, PDAs and other, even more complicated computational structures. Without these, it would be impossible for your computer to do all the amazing things it does.

Wednesday, March 14, 2012

Logical Fallacies

In my previous post I introduced some valid rules of inference. In this topic I would like to discuss invalid inference and logical fallacies. I will begin by way of an example:

My dog has four legs. Cats have four legs. Therefore, my dog is a cat.
The above syllogism is obviously wrong: although both the premises are correct, the conclusion is not, because the conclusion does not follow from the premises. If one of the premises had been "Everything which has four legs is a cat", the syllogism would be valid. The conclusion would still be false, since this new premise is not actually true, but the inference itself would have been correct. Indeed, if we imagine a world in which all four-legged creatures are cats and my dog were a four-legged creature, in that world he would be a cat. A valid conclusion from the given premises might instead be "At least one four-legged creature is not a cat". To reiterate, the fallacy in the example above lies in the fact that the conclusion does not follow from the premises. This type of fallacy is known as affirming the consequent and is very common in cases where unidirectional implication is not understood.

A similar fallacy is known as denying the antecedent. An example of that would be as follows:
Cats have four legs. My dog is not a cat. Therefore, my dog does not have four legs.
Once again, this syllogism fails to properly connect the premises to the conclusion logically, so it constitutes a fallacy. And it is also a fallacy created from a lack of understanding of the rules of implication, just like the previous one.

The third most common logical fallacy arises from a misunderstanding of the logical OR relation, also known as "inclusive or" (to distinguish it from XOR, the exclusive or). Although in spoken language we often use the word "or" as an exclusive disjunction, the same is not true in logic. In logic, "p OR q" is a statement which is true when at least one of p and q is true, or even both. A misunderstanding of disjunctions leads to the following fallacy:
My pet is a dog OR it is not a cat. My pet is a dog. Therefore, it is not true that my pet is not a cat, so my pet is a cat.
In fact, my pet can be a dog and not a cat at the same time; indeed, this is the case. The first premise allows for this situation and does not imply that only one of the two statements is true.
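Since the statements involved are simple propositions, we can confirm the fallacy by brute force: check every truth assignment and look for one that makes both premises true and the alleged conclusion false. In this sketch, p stands for "my pet is a dog" and q for "my pet is not a cat":

```python
from itertools import product

# Premises: (p or q) and p.  Alleged conclusion: not q.
# The pattern is fallacious iff some assignment makes the
# premises true while the conclusion is false.
counterexamples = [(p, q) for p, q in product([True, False], repeat=2)
                   if (p or q) and p and q]  # q true means "not q" is false
print(counterexamples)  # [(True, True)]: a dog which is not a cat
```

The single counterexample is exactly the situation described above: a pet that is a dog and is not a cat.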

The above are propositional fallacies and, as discussed, they arise from a misunderstanding of logical operators, like implication and disjunction. There are also other types of fallacies, which come about from misunderstanding the rules of inference themselves, the scope of the premises or the conclusion, the words and language used and so on. I will not discuss these more general fallacies for now, as they are abundantly available online and also because they are beyond the scope of what I am discussing in this post.

Saturday, March 10, 2012

Logic and Inference

This is a rather deep topic, which has been studied since ancient times. In simple terms, if a fact is known, logic and inference tell us what other facts are also known because of it. This is not, of course, a simple process. In my previous post, I discussed theorems, which are essentially statements of these facts. Some theorems are extremely complicated. Logic and inference are what we generally use in proofs to connect known facts with those we want to prove.

As an example, let us take the statement "All my friends are taller than I am". Then, let us take the statement "Ed is my friend". Inference tells us that "Ed is taller than I am". If both the first two statements (premises) are true, then so is the third statement (conclusion). If, however, at least one of the premises is false, then the conclusion need not be true. For example, it might not be the case that all my friends are taller than I am. Then Ed may or may not be taller than I am. The creation of this conclusion from the premises above is a valid rule of inference, known as Universal Instantiation. This rule takes a characteristic of a set and applies it to a specific element in the set.

There are various other rules of inference, most of them dealing with taking two premises that are somehow related and extracting from them a conclusion which is related to each of them. The following is a short list of some common rules, with examples. I will use the letters p, q and r as statements, ~ as the negation symbol "NOT", ^ as the conjunction symbol "AND", * as the disjunction symbol "OR" and => (with other similar arrows) as the implication symbols.

Modus ponens (affirming the antecedent). Premises: p, p => q. Conclusion: q.
Here, one of the statements implies the other. Whenever p is true, it makes q true as well. We also have that p is true. Therefore, q must be true as well.
Example: If there is cake, I will eat it. There is cake. So, I will eat it. Here, p is "there is cake" and q is "I will eat the cake".

Modus tollens (denying the consequent). Premises: p => q, ~q. Conclusion: ~p.
This, in a sense, is the reversal of the previous rule. Again, p implies q. If p were true, q would be true. But q is false. So p cannot be true either.
Example: If there is cake, I will eat it. I will not eat cake. Therefore, there is no cake. (Otherwise, I'd be eating it).

Reductio ad absurdum (reduction to the absurd). Premises: p => q, p => ~q. Conclusion: ~p.
This one is a bit of a tricky one at first. In essence, if some statement being true makes another statement both true and false, then the first statement must be false.
Example: If I have a first-born, it is a boy. If I have a first-born, it is a girl. So, I do not have a first-born. (If I did, it would have to be both a boy and a girl).

Disjunctive syllogism. Premises: p * q, ~p. Conclusion: q.
Here we are given that at least one of our premises must be true. We are also given that one of them is definitely false. Then, we know that the other premise has to be the true one.
Example: My wall is white, or the door is open. The door is closed. So, my wall is white.
Note that here, the word "or" does not mean that only one of the two statements is true. In logic, that would be known as an "exclusive or". What we have instead in this case is an inclusive or, where it is possible for both statements to be true and at least one of them is true.

Hypothetical syllogism. Premises: p => q, q => r. Conclusion: p => r
This can be seen as a sequence of implication. When p is true, it causes q to be true. When q is true, it causes r to be true. So, if p is true, it will also cause r to be true. Also, using the same reversals as before, if r is false, it causes q to be false, which in turn causes p to be false.
Example: If it rains, I will bring my umbrella. If I bring my umbrella, I will lose it. So, if it rains, I will lose my umbrella.
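All of the rules above can be verified mechanically: a rule of inference is valid exactly when the conclusion is true in every case where all the premises are true. A sketch, checking a few of the rules over all truth assignments:

```python
from itertools import product

def implies(a, b):
    # Material implication: a => b is false only when a is true and b is false.
    return (not a) or b

def valid(premises, conclusion, n):
    # A rule is valid iff the conclusion holds in every case
    # where all of the premises hold.
    return all(conclusion(*vals)
               for vals in product([True, False], repeat=n)
               if all(pr(*vals) for pr in premises))

# Modus ponens: p, p => q  |-  q
assert valid([lambda p, q: p, lambda p, q: implies(p, q)],
             lambda p, q: q, 2)
# Modus tollens: p => q, ~q  |-  ~p
assert valid([lambda p, q: implies(p, q), lambda p, q: not q],
             lambda p, q: not p, 2)
# Disjunctive syllogism: p * q, ~p  |-  q
assert valid([lambda p, q: p or q, lambda p, q: not p],
             lambda p, q: q, 2)
# Hypothetical syllogism: p => q, q => r  |-  p => r
assert valid([lambda p, q, r: implies(p, q), lambda p, q, r: implies(q, r)],
             lambda p, q, r: implies(p, r), 3)
print("all rules verified")
```

The same checker, run on the fallacies of the previous post, would find truth assignments where the premises hold but the conclusion fails.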

There are many more rules of inference, some of which involve more (and sometimes fewer) premises. In all of them, the common characteristic is this: the conclusion follows directly from the premises. All that is at work are the rules of logic and inference. These rules are important; as I mentioned earlier, they give us guidelines for finding new facts from previously known ones. This is not limited to Mathematics: Science, Law and Medicine all rely on the application of logic to their specific fields. A good understanding of logic is almost indispensable in most academic pursuits.

Of course, logic is often misused. There are many fallacies and errors which can be inadvertently committed by the misapplication of logical rules. I will be going over some of the more common errors of logic in my next post.

Tuesday, February 21, 2012

Mathematical Theorems

Mathematics is the only field which can give one knowledge that is correct and provably so. The Sciences deal in theories: a suggestion is made as to how the physical world works, and this suggestion is accepted as most likely correct as long as it makes correct predictions and does not make incorrect ones. Sometimes theories will survive for a long time before evidence against them (if any) is ever found. For example, Newton's theory of Gravitation was shown to be incorrect (or rather, only an approximation) and a new theory by Einstein replaced it. On the other hand, Darwin's theory of Evolution has not yet been shown to be wrong. Eventually, some species may be discovered that shows the Darwinian model to be wrong. Or perhaps not. Science is always in a state of uncertainty. This is not to say that Science does not give knowledge; only that any specific scientific theory can be discarded if it does not fit physical observations.

Mathematics on the other hand deals in absolute truths. It doesn't matter whether you do your calculations on Earth, in outer space, under the sea or in a theoretical virtual environment on your computer. As long as you perform the operations correctly, the result will always be the same. This is where the difference between scientific theories and mathematical theorems arises. Theories are rigorous and generally accepted attempts at explaining natural phenomena; theorems are descriptions of mathematical properties, which can be proved to be logically true.

A mathematical theorem essentially has three parts. The first is the statement that the theorem asserts. For example, Pythagoras' Theorem tells us that the sum of the squares of the legs of a right-angle triangle is equal to the square of the hypotenuse. That is its statement, because it asserts this property to be true for all such triangles (in an inner product space, where the idea of angle actually makes sense). The second part of the theorem is its proof. A theorem is useless unless it can be shown to be correct; the proof uses known theorems and logical connections to show that the theorem logically follows from what is known. For example, if we know that the law of cosines is true, we can use this knowledge to infer Pythagoras' Theorem from it. Furthermore, we can establish the law of cosines simply by using the definition of the cosine (we call this a proof from first principles). The third part of a theorem is its scope. This is actually wholly encompassed in the statement, but it is an extremely important part. The scope of a theorem is the collection of all the prerequisite conditions that make it true. If we take Pythagoras' Theorem as an example, what it asserts is not true for arbitrary triangles, so its scope is limited to a very specific family of triangles: those which have two perpendicular sides.

To make the above more obvious, we look at the following example of a Theorem:
Euclid's Theorem: There are infinitely many prime numbers.
Proof: Assume, for contradiction, that there are finitely many prime numbers, 2, 3, ..., p, where p is the largest prime. Then let S = (2)(3)...(p) + 1. Clearly, S is not divisible by any of the prime numbers up to p. Then, either S is prime, or there is some prime factor of S that is larger than p. Both of these cases contradict the assumption that p is the largest prime and therefore the assumption that there are finitely many primes. 
Q.E.D.

In this theorem, the sentence following "Euclid's Theorem:" is the statement and the paragraph following "Proof:" is the proof. The scope of the theorem is the prime numbers; the theorem does not tell us much about any other number sets (except indirectly). The letters Q.E.D. at the end stand for "Quod Erat Demonstrandum", which is Latin for "which was to be demonstrated". They are often used to mark the end of a proof.
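The proof can be illustrated numerically. This is an illustration of the argument, not a proof; the trial-division factoriser below is only a sketch:

```python
from math import prod

def smallest_prime_factor(n):
    # Trial division: the smallest non-trivial divisor of n is prime.
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # n itself is prime

# Pretend our finite list held every prime; form S = product + 1.
primes = [2, 3, 5, 7, 11, 13]
s = prod(primes) + 1          # 30031
p = smallest_prime_factor(s)
print(s, p, p in primes)      # 30031 59 False
```

Just as the proof predicts, S has a prime factor (here 59) that lies outside the supposedly complete list.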

Euclid's Theorem seems like a simple statement that might appear intuitively true and therefore not in need of any proof. This way of thinking gives rise to errors: many things that seem intuitively true can be shown to actually be false. A mathematician requires proof before accepting a new idea; and equally, if the proof is correct, the idea is never rejected. Euclid's Theorem has been true for millennia and has never been successfully contested, because its proof is logically valid and will continue to be so. Newton's Theory of Gravitation (as discussed before) has been shown to be false (although a very good approximation) and is therefore no longer used as a fundamental model of gravitation.

I would like to end this post with a few more terms that someone may encounter in connection with theorems and their proofs:

Proposition: A small, stand-alone proven statement, usually not as important or interesting as a theorem. Think of it as a baby theorem.

Lemma: A proven statement which is generally proved on its own and then used to prove a theorem. Think of it as a part of a theorem which deserves its own subsection but is not actually big enough to be a theorem in itself. (Note that this is not always the case; some Lemmas give information on very fundamental concepts and are much more interesting and important than most theorems.)

Corollary: A secondary proven statement that follows directly from a theorem. Usually this is a smaller result, which requires less work to obtain than the original theorem.

Conjecture: An unproven statement. This is where mathematicians allow some intuition; conjectures are usually open problems waiting for someone to prove or disprove them. They are not accepted as true. Fermat's Last Theorem was actually a conjecture until the mid-90's, when it was proved, even though it was called a theorem.

Contradiction: An impossibility, as it were. This is actually a proof technique, where we assume the opposite of what we want to prove. We then use the assumption and logic to arrive at an impossibility. This shows that the assumption must be wrong, which in turn means that what we want to prove (the opposite of the assumption) is true.

Counterexample: A situation where a statement is shown to be false. This is sometimes used with contradiction proofs, or to disprove proposed theorems. A counterexample to a theorem must conform to all its requirements, but fail to conform to its result. If there were a right-angle triangle for which Pythagoras' Theorem did not hold, that would be a counterexample which would disprove the theorem. Of course, no such triangle can be constructed in Euclidean space.

Induction: A proof technique that builds larger cases from smaller ones. A statement is shown to be true for some base case, and then a proof is given which shows that if the statement is true for simpler or smaller cases, then it is also true for larger or more complex ones.

Tuesday, February 7, 2012

Introduction

During my college years, I have come across a number of topics in many fields of Mathematics. Some of these are very well-explored topics, some are completely new, only now emerging in the literature.

I have decided to make this blog to record those topics that interest me, particularly those that have less exposure outside advanced college courses for students of Mathematics, as well as to give them my own personal perspective. It is my hope that this blog can be used by interested readers, who want to have a small taste of mathematics and learn more about this "mother of Sciences".

Obviously, this blog will not have topics from every mathematical field; only the ones I am personally interested in. I can of course discuss a certain topic I otherwise would not have, if requested, as long as I know enough about it and am sufficiently interested, so feel free to ask, if you have something specific you are interested in. In particular, if you would like some topic to be explored in more depth, or more examples to be given for illustration purposes, comment and let me know.

With that in mind, please make sure that any comments are civil and relevant.