Aesthetics

Aesthetics is a branch of philosophy that explores the nature of art, beauty, and taste, with the creation and appreciation of beauty.

ALC

Apple Lossless Coding, a lossless compression method for audio.

Algorithm

A step by step process that describes how to solve a problem and/or complete a task, which will always give a result.

Algorithm analysis

Algorithm analysis working out the complexity of an algorithm.

Algorithm complexity

Algorithm complexity how long the algorithm takes to run (or how much memory it uses). These are almost always specified in terms of the size of input.

Alphabet

In formal languages, a list of characters that may occur in a language, or more generally, a list of all possible inputs that might happen.

Artificial intelligence (AI)

The artificial ability to mimic parts of human intelligence.

In its broadest meaning; artificial intelligence is when something made by humans is able to do something the humans didn't tell it how to do.

ASCII

The commonly used code for representing characters as 8-bit numbers (although only 7 of the 8 bits are usually used).

Association Rule Learning

An analysis method used to find relationships and associations between variables in large datasets.

Asymptotic

A trend is asymptotic if it approaches a value (asymptote), but never reaches it. For example, the graph y = 1/x has asymptotes at y = 0 and x = 0.

Asymptotic complexity

The performance of an algorithm at the extremes, usually a really high input n.

Attack

Gaining access to or decrypting a file that is using encryption, without having the key. There are several types of attacks, some of which are also defined in the glossary.

Big O notation

A convention for showing a measure of complexity, describing how the cost of an algorithm increases with the size of the input.

Binary number system

The base 2 number system, i.e. numbers only made up of the digits "0" and "1". All numbers that can be represented in the decimal number system can be uniquely represented in the binary number system.

Binary search

Searching a sorted list by looking at the middle item, and then searching the appropriate half recursively (used for phone books, dictionaries and computer algorithms).

Bit

Bit short for "binary digit" - a digit that is either 0 or 1.

Bozo search

Searching a list by checking items at random. Items can be checked more than once so it is possible for this algorithm to never end!

Brooks' law

An observation that adding more people to a project that is running late may actually slow it down more.

Brute force attack

A type of decryption attack that is carried out by trying every possible key.

Bubble sort

A sorting algorithm based on swapping adjacent items that are out of order. It is not a good method, but serves as an example of a slow method in contrast to others like quicksort.

Bug

An error in software that causes it to fail or otherwise perform unexpectedly.

Byte

Byte a group of 8 bits, able to represent numbers from 0 to 255, can store one ASCII character (also known as an octet).

Caesar cipher

A very simple cipher that offsets each letter in the alphabet by a certain amount, specified by the key. It is no longer used in practice due to being very easy to attack.

Cartesian coordinate system

A system of plotting points on a graph by, for each point, specifying a distance from the origin along each axis. For example, [2, 3] specifies a point two units along the x axis, and 3 units along the y axis.

Central Processing Unit

The central processing unit (CPU) is a computer component, and is often called the "brains" of a computer. It is the piece of circuitry that executes the instructions in all computer programs and processes data.

Character

The collective name for any letter, number, or symbol used in text. For example: a, A, @, 3, +, and several symbols we don't see (like space and tab), are all unique characters.

Chatterbot

An AI system that has text conversations with the user, typically based on simple pattern matching.

Check digit

An extra digit that is added onto the end of a number such as an ISBN, credit card number, or barcode number. This digit is calculated using a formula based on the other digits in the number. Error detection works by using the check equation to determine whether or not the check digit is as expected.

Check equation

An equation that is used to check whether or not the check digit for a number is correct.

Chomsky hierarchy

A hierarchy of four classifications of formal languages, ranging from simple regular expressions to very flexible (but computationally difficult) grammars.

Cipher

An algorithm used to encrypt a piece of plain text.

Ciphertext

Text which has been encrypted.

Client

In computing, a client is a program that connects to a service made available on a server.

Compiler

A compiler translates an entire program written in a high level language to machine code. This machine code can then be used to run the program.

Complement

Something that completes. In numeracy, a number plus the complement of that number adds to a nice round value.

For example: in decimal, 2 + 8 = 10, so the complement of 2 is 8, and that of 8 is 2.

Note that a complement is different from a compliment, which is an expression of praise.

Complexity

How long it takes to solve a problem. A problem has an inherent complexity (minimum time needed to solve it); any algorithm to solve the problem will have a higher complexity (take at least that long).

Compression

Making a file smaller by removing redundant information (typically using standards like zip, jpeg, mpeg, mp3, etc).

Computer graphics

Either the technologies used to create images in computers, or the images themselves.

Computer program

A collection of instructions that can be executed by a computer.

The people who program programs are called programmers, and the act is called programming.

Computer vision

A scientific field about how computers can be made to emulate human vision; in areas like face recognition and spatial awareness.

Convention

A convention is a formal rule about a way in which something should be done. Unlike protocols, conventions don't have to be followed, but it always helps when they are.

Core

In computing a core, also called a CPU core or processor core, is the part of a computer which actually performs calculations and carries out instructions. Computer CPUs can contain multiple cores.

Correlation

A mutual relationship or connection between two or more variables. Two things are said to be correlated if changing the value or state of one also changes the value or state of the other.

Cost

A measure of how much resource is needed – time, money, material or anything else that is relevant for the context.

Data point

A data point is a single piece or unit of information. It could be, for example, a measurement of something, a time stamp, or a recording of a button click.

Data processing

Carrying out operations on data, for example retrieving it, making calculations with it, and making decisions based on it.

Decimal number system (Denary number system)

The standard base 10 number system that is used in everyday math, using the digits "0", "1", "2", "3", "4", "5", "6", "7", "8", and "9".

Decrypt

Also Decryption, Decipher

Getting the plain text for a piece of cipher text by either using the key or an attack.

Dictionary attack

A brute force attack on a password database that involves testing every common password as well as words from a dictionary and all combinations of characters up to a certain length.

Digital signature

An encryption system that allows the receiver to verify that a document was sent by the person who claims to have sent it.

Edge detection

The process of finding edges in images; areas with a very sharp change in image brightness or other variations. These edges could indicate object boundaries and so are very useful for image segmentation.

Encryption

Changing the representation of data so it can’t be read by an eavesdropper who doesn’t have the decryption key.

Encryption key

The password or secret code that will unlock an encrypted file.

Error correction

Correcting an error that has been detected in some data. This can be demonstrated in the Parity trick, where a person is able to flip the changed bit back over so it is correct again (after they have "detected" which bit was incorrect). Not all error control schemes are able to correct errors; some are only able to detect them.

Error detection

Detecting when an error has occurred in some data, such as a number getting typed incorrectly or a bit getting flipped. Some simple examples of this are parity bits or a check digit.

Extrapolation

Estimating values after some given values assuming they continue the trend. For example, if a sequence of 3 numbers is 1 2 3, we could continue the linear trend and extrapolate the values 4 5 6.

Feature

A function available on a digital device or software, such as copy/paste, autofocus, voice dialling or undo. Features are often used to sell a device, but having features (functionality) should not be confused with people being able to use the device effectively (usability).

Feedback

Responding to or acknowledging a user action. Users find the devices hard to use if the feedback is slow, confusing, or non-existent.

Finite state automaton

In formal languages, a finite state automaton (FSA) is a simple 'machine' that has states, and transitions from one state to another based on strings of input symbols.

The logic for FSAs and regular expressions are interchangeable – every FSA has an equivalent regular expression, and vice versa.

Finite state machine

Alternative name for a finite state automaton.

Formal language

A formal language follows a set of rules that is far more restrictive than any human-spoken language. As a result, it is not at all ambiguous when checking if some given input is valid according to the language.

Frequency analysis attack

An attack on substitution ciphers that takes advantage of the fact that some letters are generally more common than others in a piece of text (e.g. in English, the letter "e" is usually the most common letter) by looking at which letters appear the most in the cipher text and guessing that they must be the substitutions for the most common letters.

GIF

A lossless image compression system typically used for small images with few colours in them (in practice it can be lossy because it has only 256 colours, and if the original has more colours then some will be lost).

Gigabyte

About 1000 megabytes (1,000,000 kilobytes and 1,000,000,000 bytes). This is 8,000 million individual bits (i.e. 0’s and 1’s). Like a kilobyte, there are other definitions, such as 1024x1024x1024 bytes, but usually this level of accuracy isn’t important. Commonly referred to as a "GB".

Grammar

In formal languages, a set of grammars is a set of rules for specifying a formal language, for example, to specify syntax for programming languages.

Unlike regular expressions, grammars don't specify a sequential pattern the language needs to follow. Instead it defines a set of building blocks the language can use.

Graphics transform

A manipulation of a digital object: translation, rotation, scale and others.

Greedy algorithm

An algorithm where; for every choice, all future choices are ignored and the most immediately beneficial decision is made.

Coinage has been designed with this in mind – a greedy algorithm can find the fewest number of coins required for any given amount of money. However, in many other situations the greedy algorithm does not give the most optimal result.

Hash function

A function that maps a "key" of variable length to a seemingly random "hash" of fixed length.

In a hash function used for cryptography, it is easy to find the hash of a key (and the hash will always be the same), but effectively impossible to find what key was used to make the hash.

Heuristic

A heuristic is rule or guideline, usually devised from experience. The term is used in both HCI and algorithms. In HCI, heuristics are often used as a benchmark to evaluate interfaces – they aren’t strict rules, but usually highlight issues with designs. In algorithms, a heuristic is an approximate solution to a problem – intended as a quick, close enough calculation rather than a precise, impossibly long one.

The base 16 number system. Uses the digits "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", and "F". All numbers that can be represented in decimal can be uniquely represented in hexadecimal (just like binary). It is most often used as a shorthand notation for binary, by assigning 1 hexadecimal digit to each 4 bit pattern (the assigning is done in numeric order).

A representation for colours that tells the computer how much red, blue, and green light to display in a pixel (to make the desired colour). Uses 1 byte for each of these 3 primary colours, which is 3 bytes (24 bits) in total. These 24 bits are often written as 6 hexadecimal digits to make them easier for humans to read, which is why they are called "Hexadecimal colour codes". They are commonly encountered when specificying colours in HTML for web pages.

High level language

High level language a programming language that is designed for humans to read and write (e.g. Java, Python, C, C#, Basic, Scratch…) as opposed to machine languages.

Human computer interaction (HCI)

An area of computer science looking at how people interact with a digital device, with an emphasis on the quality of the experience to complete tasks.

Hypertext Transfer Protocol (HTTP)

A protocol for the distribution of Hypertext over the internet. It is the most common protocol in use for web browsing.

Image noise

Random variations in colour and brightness within images, usually caused by electrical interference.

Image segmentation

The process of splitting an image into multiple simpler parts. For example, software attempting to read text in an image would try to segment it into individual letters and analyse those.

Insertion sort

Start with an empty list, and insert each item in the correct place; this is a relatively slow method, usually between selection sort and quicksort in terms of speed.

Intelligent systems

An area of computer science that investigates ways to simulate or approximate human intelligence on computers; often referred to as artificial intelligence (AI).

Interface

The part of a computer, software, or electronic device that a human interacts with, whether this is by sight, hearing, or touch.

Interpolation

Estimating values between some given values assuming they follow a trend. For example, if a sequence of 5 numbers starts with 3 and finishes with 11, we might assume the trend is linear and interpolate the values 5, 7, 9 in between.

Interpreter

An interpreter runs a programming language by translating each line of code in sequence as it is executed.

ISBN

Stands for International Standard Book Number. Every published book has one of these numbers on the back of it. ISBN is significant to error control coding because it uses a check digit for error detection.

JPEG

A lossy image compression system typically used for photographs.

Key (in algorithms)

It is an item of data that is being searched for or sorted, and therefore will be compared with other data.

Key (in cryptography)

The password or secret value that is used to encrypt and decrypt an encrypted file (without having to use an "attack"). Some widely used methods have different keys for encryption and decryption.

Kilobyte

About 1000 bytes. This is 8,000 individual bits (i.e. 0’s and 1’s). We say "about" 1000 bytes because the term is ambiguous and it is often taken as 1024 bytes; however, rounding it to 1000 is close enough for most calculations. Commonly referred to as a "KB".

Known plaintext attack

Working out the key or method of encryption (cipher) based on having access to both the original plaintext and its encrypted form.

Language

In formal languages, a language is the set of all strings that the language accepts i.e. that are valid.

Lexical analysis

When compiling a computer program, working out what the components of the program are e.g. identifiers, keywords, integers.

Linear complexity

Linear complexity grows in proportion to the size of the problem - if the problem is twice as big, it will take roughly twice as long to solve.

Linear search (sequential search)

Searching a list by looking at each item in turn, stopping when the current item matches the one being looked for.

Logarithm

Logarithm is a very slow growing mathematical function written as $log n$. In computer science logarithms are usually in base 2, that is, $log_2 n$, which is the inverse of the incredibly fast growing exponent function $2^n$. Logarithms are not needed to understand the material in this book, but they are used a lot in computer science and are a useful concept to understand. Logarithms happen to come up a lot with algorithms, and the two words are often confused. The value $log_2 n$ is just the number of times you can halve n until you get down to 1; for example, $\log_2 32$ is 5, and $log_2 1024$ is 10. Binary search takes $log_2 n$ steps to search n items; storing the number n in binary takes $log_2 n$ bits.

Long tail marketing

A marketing strategy where rather than targeting a large number of customers at once with the same advertisements, many small niche groups of customers are specifically targeted.

Lossless

A compression method that does not cause any loss of data. This means that the uncompressed file will be identical to the original file that was compressed (which is important for text). In the case of images and sound, it means they will be of the same quality before and after compression. For example, ZIP and ALC use lossless compression.

Lossy

A compression method that trades off quality for file size. Lossy compression methods can make files smaller than lossless compression methods can, but the quality of the resulting file will be lower. For example, MP3 and JPEG use lossy compression.

Machine language

The native language for instructions for a computer, not very easy for humans to read and write.

Megabyte

About 1000 kilobytes (1,000,000 bytes). This is 8 million individual bits (i.e. 0’s and 1’s). [Like a kilobyte, there are other definitions, such as 1024x1024 bytes, but usually this level of accuracy isn’t important]. Commonly referred to as a "MB".

MP3

A lossy audio compression system.

Nibble

4 bits (half a byte), sometimes called a nybble.

Nielson’s heuristics

A widely used set of heuristics for evaluating computer interfaces that was devised by Jakob Nielson. The list of heuristics are available on the Nielsen Norman Group website.

Octal

The base 8 number system. Like hexadecimal, it is significant to computer scientists as it allows a shorthand notation for writing binary numbers. Octal assigns a digit to each possible 3 bit pattern.

Packet

In networking, a packet is a small piece of data routed across the internet. The packet includes control information (e.g. a source and destination, error control information), and a payload (the data being sent). The payload is almost always only a tiny part of a much larger file.

Parity

Adding an extra bit to a set of bits to make it so that there is an even number of 1’s. Storing the parity makes it possible to detect and correct errors later. This is known as an even parity bit; an odd parity bit is also possible where the extra bit ensures there is an odd number of 1’s.

Parse tree

The structure derived by parsing some input.

Parsing

Reading some input (typically a computer program) and making sense of it by breaking it into parts according to their function.

Pattern matching

In formal languages, finding text that matches a particular rule, typically using a regular expression to give the rule.

Permutation

One arrangement of items. For example, the permutations of 1 & 2 are [1,2] and [2,1].

Pivot

A randomly selected item in an unsorted list, used in quicksort.

Pixel

This term is an abbreviation of picture element, the name given to the tiny squares that make up a grid that is used to represent images on a computer.

Plaintext

Text in human-readable language, without any encryption.

PNG

A lossless image compression system typically used for small images with few colours in them.

Problem domain

Everything relevant to the problem that should be examined when tying to solve it. The goal of a problem domain is to exclude everything you don't need to think about and only focus on what's left.

Processor

An electronic circuit which performs computations and operations on data. All computers will contain at least one processor, and most contain multiple.

Programming language

A formal language that specifies what instructions can be given to a computer, and what output they return.

Protocol

A set of strict rules or instructions describing how something must be done.

Prototype

A model or test sample of a product, used to gather feedback or test data so the design can be improved upon.

Public key cryptography

A system where a public key can be safely broadcast to anyone, but messages encrypted with it can only be read by a secret private key. This is known as an asymmetric system, because you don't use the same key to encrypt and decrypt a message.

Similarly, messages decrypted with the public key could only have been encrypted with the private key. This can be used to prove a message was sent by the intended sender, and not a third party with malicious intent.

Quadratic complexity grows with the square of the size of the problem - if the problem is twice as big, it will take roughly 4 times as long to solve.

Quicksort

Order the items in a list so that each item is appropriately before or after a pivot. The pivot is now in the correct location. Repeat the process for the items before the pivot, and the items after the pivot.

Redundant bits

Extra bits that are not part of the actual data but instead have been added for error detection and possibly error correction.

Regular expression

A formula used to describe a pattern in a text that is to be matched or searched for. These are typically used for finding elements of a program (such as variable names) and checking input in forms (such as checking that an email address has the right format).

The logic for regular expressions and finite state automata (FSA) are interchangeable – every regular expression has an equivalent FSA, and vice versa.

Rotation

To change the orientation of an object in space.

Salt

Extra information appended to a password, before hashing. Even if passswords for two different users are the same, with different salts those passwords get hashed to different values.

The salt does not need to be kept secret, as knowing it does not help find the original password.

Scale

To change the size of an object in space.

Search

Find a key in a large amount of data.

Selection sort

A sorting algorithm; Select the smallest item, then the second smallest, and so on. This is not a very fast algorithm, but it’s not as bad as bubble sort, and provides a good contrast with quicksort.

Server

In computing, a server is a program or device that provides a service for one or more other programs or devices, which are referred to as clients. They are designed to process requests and deliver data to a client over a network.

Slope

This is a way of expressing the angle or gradient of a line. The slope is simply how far up the line goes for every unit we move to the right. For example, if we have a line with a slope of 2, then after moving 3 units to the right, it will have gone up 6 units. A line with a slope of 0 is horizontal. Normally the slope of a line is represented using the symbol $m$.

Software

The programs, images and other data processed by computer hardware.

Sort

Sort puts keys (numbers, names or other values) in order from smallest to largest (outside computer science this is usually called ordering).

Stakeholder

A stakeholder is any person/people who could affect or be affected by a project. For example if you were designing a bridge, the stakeholders would be:

• Yourself
• The organisation paying for the bridge
• The company building the bridge
• Those who will use the bridge
• People who live near the construction site
• Relevant government officials

String

A sequence of characters.

Substitution cipher

A type of cipher that works simply by replacing each letter or combination of letters in a plain text with a certain other letter or combination of letters to make up the cipher text.

Syntactically correct

A string is syntactically correct if it matches the specifications for a formal language. For example, the string "()(())" is correct for a grammar that gives the rules for balanced parentheses. In a computer program, a syntax error is when a character occurs in the input which isn’t allowed, and the program is therefore not syntactically correct.

Syntax

Rules about what text can appear in a programming language. A programmer needs to get the syntax right for a program to run.

Syntax diagram

Also known as railway (or railroad) diagrams, these are a graphical representation of a grammar using arrows (the "train tracks") to show the options for each component of a language.

Something a user might do with a piece of software or electronic device to achieve a goal. In the case of a cellphone this might be "send a text message" or in the case of a microwave it might be "heat up yesterday’s leftovers". Interfaces are best evaluated when considering how they help a user to perform a task.

Thresholding

A very simple method of image segmentation. For every pixel in an image; if it is above a certain threshold, the pixel is set to a certain colour (often white). Otherwise it is set to another certain colour (often black).

Time complexity

Time complexity the usual meaning of the complexity of an algorithm; this makes it clear that we’re talking about the time taken. Normally it’s expressed in terms of steps, not real time on a particular computer, as different computers are different speeds.

Tractable

A tractable problem is one that can be solved in a reasonable amount of time; usually the distinction between tractable and intractable is drawn at the boundary between problems that can be solved in an amount of time that is polynomial; those that require exponential time are regarded as intractable.

Transition

In a finite state automata, a transition is a link out of one state and into one state. The two states could even be the same state!

Translation

To change the position of an object in space.

Unicode

Unicode is an extension of ASCII; it supports characters from multiple languages, using 16 bits per character.

Usability

A measure of how easily or effectively a user can achieve specific objectives using a product or system. More information can be found on Wikipedia.

See 'Heuristic'

User

The human using the computer system or electronic device.

User Experience (UX)

A user's emotions and attitudes about a product, including how they interact and experience the product.

User story

An informal description of features in a system, written from the perspective of the user.

Visual computing

Visual computing designing systems that can perceive, process, understand and generate images. Images typically come from scanners and cameras, and may be displayed on monitors, head mounted displays, or as movies.