CSFG
English Deutsch Beta Español Beta język polski (2.6.0)
Chapters Curriculum Guides Appendices

Formal Languages
15.2. Getting started

Formal Languages

  • 15.1. What's the big picture?
  • 15.2. Getting started
  • 15.3. Finite state automata
  • 15.4. Regular expressions
  • 15.5. Grammars and Parsing
  • 15.6. The whole story!
  • 15.7. Further reading

To give you a taste of what can be done, let's try searching for words that fit particular patterns. Suppose you're looking for words that contain the name "tim", type the word "tim" (or a few letters from your name), then press the "Filter words" button to find all words containing "tim".

Regular Expression Filter

Loading word file...
Regular Expression

That's a pretty simple search (though the results may have surprised you!). But now we introduce the wildcard code, which in this case is "." – this is a widely used convention in formal languages. This matches any character at all. So now you can do a search like

tim.b

and you will get any words that have "tim" followed by "b" with a single character – any character – in between. Are there any words that match "tim..b"? "tim...b"? You can specify any number of occurrences of a symbol by putting a '*' after it (again a widely used convention), so:

tim.*b

will match any words where "tim" is followed by "b", separated by any number of characters – including none.

Try the following search. What kind of words does it find?

x.*y.*z
  • Can you find words that contain your name, or your initials?
  • What about words containing the letters from your name in the correct order?
  • Are there any words that contain all the vowels in order (a, e, i, o, u)?

The code you've used above is a part of a formal language called a regular expression. Computer programs that accept typed input use regular expressions for checking things like dates, credit card numbers and product codes. They’re used extensively by programming language compilers and interpreters to make sense of the text that a programmer types in. We'll look at them in more detail in the section on regular expressions.

Next we examine a simple system for reading input called a finite state automaton, which – as we'll find out later – is closely related to regular expressions. Later we'll explore the idea of grammars, another kind of formal language that can deal with more complicated forms of input.

Previous:
What's the big picture?
Next:
Finite state automata

Looking for something for primary schools? Check out CS Unplugged.

The Computer Science Field Guide is an online interactive resource for high school students learning about computer science.

Useful Links

  • About
  • Chapters
  • Interactives
  • Curriculum Guides

Community

  • Twitter
  • YouTube
  • GitHub

Help

  • Search
  • Glossary
  • Feedback

Switch to teacher mode

English | Deutsch | Español | język polski (2.6.0)

The Computer Science Field Guide material is open source on GitHub, and this website's content is shared under a Creative Commons Attribution-ShareAlike 4.0 International license. The Computer Science Field Guide is a project by the Computer Science Education Research Group at the University of Canterbury, New Zealand. Icons provided generously by icons8.

3.12.6

This definition is not available in English, sorry!