CSFG
English Deutsch Beta Español Beta język polski (2.6.0)
Chapters Curriculum Guides Appendices

Formal Languages
15.2. Getting started

Formal Languages

  • 15.1. What's the big picture?
  • 15.2. Getting started
  • 15.3. Finite state automata
  • 15.4. Regular expressions
  • 15.5. Grammars and Parsing
  • 15.6. The whole story!
  • 15.7. Further reading

To give you a taste of what can be done, let's try searching for words that fit particular patterns. Suppose you're looking for words that contain the name "tim", type the word "tim" (or a few letters from your name), then press the "Filter words" button to find all words containing "tim".

Regular Expression Filter

Loading word file...
Regular Expression

That's a pretty simple search (though the results may have surprised you!). But now we introduce the wildcard code, which in this case is "." – this is a widely used convention in formal languages. This matches any character at all. So now you can do a search like

tim.b

and you will get any words that have "tim" followed by "b" with a single character – any character – in between. Are there any words that match "tim..b"? "tim...b"? You can specify any number of occurrences of a symbol by putting a '*' after it (again a widely used convention), so:

tim.*b

will match any words where "tim" is followed by "b", separated by any number of characters – including none.

Try the following search. What kind of words does it find?

x.*y.*z
Explanation of previous search Teacher Note

This code finds words that contain x, y and z in that order, but separated by 0 or more characters. There are 16 words is the data set which match this.

  • Can you find words that contain your name, or your initials?
  • What about words containing the letters from your name in the correct order?
  • Are there any words that contain all the vowels in order (a, e, i, o, u)?
Vowels solution Teacher Note

To find words with all the vowels in order, the code is simply "a.*e.*i.*o.*u", there are 47 matches.

Students may ask how to do more complex searches, like letters in any order. If they are interested they can explore this on their own, but this is just a warmup exercise. We'll be covering this more carefully in the section on regular expressions.

The code you've used above is a part of a formal language called a regular expression. Computer programs that accept typed input use regular expressions for checking things like dates, credit card numbers and product codes. They’re used extensively by programming language compilers and interpreters to make sense of the text that a programmer types in. We'll look at them in more detail in the section on regular expressions.

Next we examine a simple system for reading input called a finite state automaton, which – as we'll find out later – is closely related to regular expressions. Later we'll explore the idea of grammars, another kind of formal language that can deal with more complicated forms of input.

Klingon linguistics activity Teacher Note

For a fun discussion, you could have the students look at the Klingon Linguistics activity at CS4FN. This page introduces the fundamentals of languages – words (the alphabet) and grammar (the rules of syntax). It discusses why languages are translated and how meaning can be changed by translation. It also explains why computer languages need to be translated.

Previous:
What's the big picture?
Next:
Finite state automata

Looking for something for primary schools? Check out CS Unplugged.

The Computer Science Field Guide is an online interactive resource for high school students learning about computer science.

Useful Links

  • About
  • Chapters
  • Interactives
  • Curriculum Guides

Community

  • Twitter
  • YouTube
  • GitHub

Help

  • Search
  • Glossary
  • Feedback

Switch to student mode

English | Deutsch | Español | język polski (2.6.0)

The Computer Science Field Guide material is open source on GitHub, and this website's content is shared under a Creative Commons Attribution-ShareAlike 4.0 International license. The Computer Science Field Guide is a project by the Computer Science Education Research Group at the University of Canterbury, New Zealand. Icons provided generously by icons8.

3.16.0

This definition is not available in English, sorry!