Category Archives: math

understanding regular expressions to finite state machines

i recently came across a question on designing a state machine to detect a binary string sequence. there are two approaches to such a problem. the commonly followed one is to analyze the given sequence over and over, scribble state transitions, and test patterns against them to check whether the machine breaks. i believe this is how hardware engineers work it out, primarily because the time they spend on it is considerably less than what they spend on the other problems they analyse :) i say hi to the “theory of relativity” here..

the second approach is to use rules. such problems consist only of binary patterns (0s and 1s), and a rule-based solution works like a program, so you don’t have to do much testing or keep adjusting your state machine as you go. the theory that deals with this kind of pattern matching and sequence detection is the famous regular expression, or re. regular expressions can be analyzed as finite automata, better known to us as finite state machines.

so the best way to deal with such problems is to represent the sequence as a regular expression and then convert it to a finite state machine by following rules.

So, some re notation first:

a* : 0 or more occurrences of the letter “a”, followed by the rest of the pattern. for instance a*b matches “b”, “ab”, “aab” and “aaab”.

a+ : 1 or more occurrences of the letter “a”, followed by the rest of the pattern. for instance a+b matches “ab”, “aab” and “aaaab” but will not match just “b”.

a? : 0 or 1 occurrences of the letter “a”, followed by the rest of the pattern. for instance a?b matches “ab” or just “b” but will not match “aaab”.
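The three quantifiers are easy to try out. Here is a minimal sketch using Python's `re` module; the helper name `matches` is just something made up for this example, and `re.fullmatch` is used so that the whole string has to match the pattern.

```python
import re

def matches(pattern, text):
    """True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# a* : zero or more "a"s before the "b"
star = [matches(r"a*b", s) for s in ("b", "ab", "aab", "aaab")]

# a+ : at least one "a" before the "b"
plus = [matches(r"a+b", s) for s in ("b", "ab", "aab", "aaaab")]

# a? : at most one "a" before the "b"
optional = [matches(r"a?b", s) for s in ("b", "ab", "aaab")]

print(star)      # [True, True, True, True]
print(plus)      # [False, True, True, True]
print(optional)  # [True, True, False]
```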

the problem statement asked for detection of the sequence 101011, which also means that the state machine should accept 101010111 or 111101011 as a match, since both of these strings contain 101011.

Looking more closely at the sequence, there are a few points to notice.

a. allow any number of 1s before the sequence, so that we still find a match in 11101011, 11111111010111 or 11111111111111101011. in basic atom terminology, we match (1+)01011.

b. allow any number of 10s before the sequence, so that we still match 10101011 or 101010101011 … once we get past the first occurrence of 10, we can consume any number of further 10s. in basic atom terminology, we match (1+)0(10)+11

so that is pretty much how the problem translates into a regex string. The following slideshow tries to illustrate a method for translating a regular expression into a state machine. There may be mistakes in my understanding and representation. Comment or e-mail me if you think something is wrongly interpreted.
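One way to sanity-check the pattern from point b is to compare it with a plain substring test. The sketch below assumes Python's `re` module; the names `contains_target` and `regex_finds` are made up for this example. It tries the strings mentioned in this post and then every binary string of up to 10 bits.

```python
import re
from itertools import product

TARGET = "101011"
PATTERN = re.compile(r"1+0(10)+11")

def contains_target(bits):
    """Plain substring test, used as the reference answer."""
    return TARGET in bits

def regex_finds(bits):
    """True when the pattern matches anywhere inside bits."""
    return PATTERN.search(bits) is not None

samples = ["101011", "101010111", "111101011", "11101011",
           "10101011", "101010101011", "1010", "110011"]
for s in samples:
    print(s, contains_target(s), regex_finds(s))

# Exhaustive comparison over all binary strings of length 1..10.
disagreements = [
    "".join(p)
    for length in range(1, 11)
    for p in product("01", repeat=length)
    if contains_target("".join(p)) != regex_finds("".join(p))
]
print(len(disagreements))  # 0
```

The two tests agree because every match of the pattern ends in 101011, and 101011 itself is the shortest string the pattern can match.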
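For readers who want something to run, here is a rough sketch of a detector for 101011 as a table-driven state machine. It is not the method from the slideshow: it assumes a prefix-tracking construction (the same idea string-search automata use), where each state is the number of pattern bits matched so far. The names `build_dfa` and `detect` are made up for this example.

```python
def build_dfa(pattern, alphabet="01"):
    """table[state][symbol] -> next state; state = pattern bits matched so far."""
    n = len(pattern)
    table = []
    for state in range(n + 1):
        row = {}
        for sym in alphabet:
            seen = pattern[:state] + sym
            # Next state: the longest pattern prefix that is a suffix of seen.
            k = min(n, len(seen))
            while k > 0 and not seen.endswith(pattern[:k]):
                k -= 1
            row[sym] = k
        table.append(row)
    return table

def detect(bits, pattern="101011"):
    """True as soon as the pattern has been seen somewhere in bits."""
    table = build_dfa(pattern)
    state = 0
    for b in bits:
        state = table[state][b]
        if state == len(pattern):
            return True
    return False

table = build_dfa("101011")
for state, row in enumerate(table):
    print(state, row)

print(detect("101010111"), detect("111101011"), detect("1010"))
```

The interesting transition is state 5 on a 0: having seen 10101 and then a 0, the machine falls back to state 4 (1010) instead of starting over, which is what lets it keep consuming 10s.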

Thanks
Shyam


counting squares contd.

As a continuation of the previous post on counting squares, here is an update. Only two of my friends commented with their solutions to this simple problem, and here are the two approaches.

let's assume a board where n = 6. drawing the diagonal would make the board look like this.

+----+----+----+----+----+----+
| \\ |    |    |    |    |    |
+----+----+----+----+----+----+
|    | \\ |    |    |    |    |
+----+----+----+----+----+----+
|    |    | \\ |    |    |    |
+----+----+----+----+----+----+
|    |    |    | \\ |    |    |
+----+----+----+----+----+----+
|    |    |    |    | \\ |    |
+----+----+----+----+----+----+
|    |    |    |    |    | \\ |
+----+----+----+----+----+----+

what i think is the “indian” geeky way of looking at it is to eliminate the other half and visualize the same board as

+----+
| \\ | number of squares = 1
+----+----+
|    | \\ | number of squares = 2
+----+----+----+
|    |    | \\ | number of squares = 3
+----+----+----+----+
|    |    |    | \\ | number of squares = 4
+----+----+----+----+----+
|    |    |    |    | \\ | number of squares = 5
+----+----+----+----+----+----+
|    |    |    |    |    | \\ | number of squares = 6
+----+----+----+----+----+----+

and so people add up 1 + 2 + … + n and arrive at the standard n(n+1)/2, which they remember thanks to the caning by their high school math teacher for not remembering the sum of the natural numbers up to n :)

but there is a simpler way of looking at it
1. the number of squares slashed by the diagonal is n.
2. total number of squares = n^2
3. so remaining squares = (n^2) – n, which is equally divided between the two halves
4. so squares not slashed on one side = [(n^2) – n] / 2 = M
5. so if you want to include the squares slashed by the diagonal as well, then it's M + n = [(n^2) + n] / 2 = n(n+1)/2
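A quick check that the two approaches agree, as a minimal Python sketch. The function names `by_rows` and `by_halves` are made up for this example.

```python
def by_rows(n):
    """First approach: add the row counts 1 + 2 + ... + n."""
    return sum(range(1, n + 1))

def by_halves(n):
    """Second approach: split the unslashed squares in two, then add the diagonal."""
    slashed = n
    unslashed_one_side = (n * n - slashed) // 2   # this is M
    return unslashed_one_side + slashed           # M + n

for n in (1, 4, 6, 8):
    print(n, by_rows(n), by_halves(n), n * (n + 1) // 2)
```

For n = 6, the board drawn above, all three columns come out as 21.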

I took the first way to get the equation, but that is probably the harder way to do the math. a thoughtful layman would come up with the second method first, and it is backed by observation.

cheers
shyam

counting squares

while in the midst of trying to solve a problem, a question struck me. suppose there is a chess board of n checks per side [the normal chess board is 8×8, so n=8], and you draw a diagonal across the entire board from one corner to the opposite corner. what is the equation that gives the number of squares on one side of the diagonal? the count includes the squares across which the diagonal has been drawn. for instance, if a diagonal is drawn across a board with n=4, then the number is 10.

i spent a couple of minutes thinking about how to fit an equation to this, but i thought i should have got it faster. there is one approach i followed.. let me know what approach you guys follow!
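To make the counting rule concrete, here is a small brute-force sketch in Python. It assumes the diagonal runs from the top-left corner to the bottom-right corner and counts every square on or below it; the function name `count_on_or_below_diagonal` is made up for this example.

```python
def count_on_or_below_diagonal(n):
    """Count squares (row, col) of an n x n board with col <= row."""
    return sum(
        1
        for row in range(n)
        for col in range(n)
        if col <= row   # col == row is a square the diagonal passes through
    )

print(count_on_or_below_diagonal(4))  # 10
print(count_on_or_below_diagonal(8))  # 36
```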

cheers

shyam