Aesthetics is a branch of philosophy that explores the nature of art, beauty, and taste, with the creation and appreciation of beauty.
Apple Lossless Coding, a lossless compression method for audio.
In computing, an algorithm is a set of computational instructions that solve a problem or achieve a particular outcome. The computational instructions are instructions that could be implemented in a typical programming language (typically involving input, output, sequence, selection, iteration and storage), although general algorithms are usually expressed in pseudo-code rather than in a programming language.
Algorithm analysis working out the complexity of an algorithm.
Algorithm complexity how long the algorithm takes to run (or how much memory it uses). These are almost always specified in terms of the size of input.
In formal languages, a list of characters that may occur in a language, or more generally, a list of all possible inputs that might happen.
The artificial ability to mimic parts of human intelligence.
In its broadest meaning; artificial intelligence is when something made by humans is able to do something the humans didn't tell it how to do.
The commonly used code for representing characters as 8-bit numbers (although only 7 of the 8 bits are usually used).
An analysis method used to find relationships and associations between variables in large datasets.
A trend is asymptotic if it approaches a value (asymptote), but never reaches it. For example, the graph y = 1/x has asymptotes at y = 0 and x = 0.
The performance of an algorithm at the extremes, usually a really high input n.
Gaining access to or decrypting a file that is using encryption, without having the key. There are several types of attacks, some of which are also defined in the glossary.
A convention for showing a measure of complexity, describing how the cost of an algorithm increases with the size of the input.
The base 2 number system, i.e. numbers only made up of the digits "0" and "1". All numbers that can be represented in the decimal number system can be uniquely represented in the binary number system.
Searching a sorted list by looking at the middle item, and then searching the appropriate half recursively (used for phone books, dictionaries and computer algorithms).
Bit short for "binary digit" - a digit that is either 0 or 1.
Searching a list by checking items at random. Items can be checked more than once so it is possible for this algorithm to never end!
An observation that adding more people to a project that is running late may actually slow it down more.
A type of decryption attack that is carried out by trying every possible key.
A sorting algorithm based on swapping adjacent items that are out of order. It is not a good method, but serves as an example of a slow method in contrast to others like quicksort.
An error in software that causes it to fail or otherwise perform unexpectedly.
Byte a group of 8 bits, able to represent numbers from 0 to 255, can store one ASCII character (also known as an octet).
A very simple cipher that offsets each letter in the alphabet by a certain amount, specified by the key. It is no longer used in practice due to being very easy to attack.
A system of plotting points on a graph by, for each point, specifying a distance from the origin along each axis. For example, [2, 3] specifies a point two units along the x axis, and 3 units along the y axis.
The central processing unit (CPU) is a computer component, and is often called the "brains" of a computer. It is the piece of circuitry that executes the instructions in all computer programs and processes data.
The collective name for any letter, number, or symbol used in text. For example: a, A, @, 3, +, and several symbols we don't see (like space and tab), are all unique characters.
An AI system that has text conversations with the user, typically based on simple pattern matching.
An extra digit that is added onto the end of a number such as an ISBN, credit card number, or barcode number. This digit is calculated using a formula based on the other digits in the number. Error detection works by using the check equation to determine whether or not the check digit is as expected.
An equation that is used to check whether or not the check digit for a number is correct.
A hierarchy of four classifications of formal languages, ranging from simple regular expressions to very flexible (but computationally difficult) grammars.
An algorithm used to encrypt a piece of plain text.
Text which has been encrypted.
In computing, a client is a program that connects to a service made available on a server.
A compiler translates an entire program written in a high level language to machine code. This machine code can then be used to run the program.
Something that completes. In numeracy, a number plus the complement of that number adds to a nice round value.
For example: in decimal, 2 + 8 = 10, so the complement of 2 is 8, and that of 8 is 2.
Note that a complement is different from a compliment, which is an expression of praise.
How long it takes to solve a problem. A problem has an inherent complexity (minimum time needed to solve it); any algorithm to solve the problem will have a higher complexity (take at least that long).
Making a file smaller by removing redundant information (typically using standards like zip
, jpeg
, mpeg
, mp3
, etc).
Either the technologies used to create images in computers, or the images themselves.
A collection of instructions that can be executed by a computer.
The people who program programs are called programmers, and the act is called programming.
A scientific field about how computers can be made to emulate human vision; in areas like face recognition and spatial awareness.
A convention is a formal rule about a way in which something should be done. Unlike protocols, conventions don't have to be followed, but it always helps when they are.
In computing a core, also called a CPU core or processor core, is the part of a computer which actually performs calculations and carries out instructions. Computer CPUs can contain multiple cores.
A mutual relationship or connection between two or more variables. Two things are said to be correlated if changing the value or state of one also changes the value or state of the other.
A measure of how much resource is needed – time, money, material or anything else that is relevant for the context.
A data point is a single piece or unit of information. It could be, for example, a measurement of something, a time stamp, or a recording of a button click.
Carrying out operations on data, for example retrieving it, making calculations with it, and making decisions based on it.
The standard base 10 number system that is used in everyday math, using the digits "0", "1", "2", "3", "4", "5", "6", "7", "8", and "9".
Also Decryption, Decipher
Getting the plain text for a piece of cipher text by either using the key or an attack.
A brute force attack on a password database that involves testing every common password as well as words from a dictionary and all combinations of characters up to a certain length.
An encryption system that allows the receiver to verify that a document was sent by the person who claims to have sent it.
The process of finding edges in images; areas with a very sharp change in image brightness or other variations. These edges could indicate object boundaries and so are very useful for image segmentation.
Changing the representation of data so it can’t be read by an eavesdropper who doesn’t have the decryption key.
The password or secret code that will unlock an encrypted file.
Correcting an error that has been detected in some data. This can be demonstrated in the Parity trick, where a person is able to flip the changed bit back over so it is correct again (after they have "detected" which bit was incorrect). Not all error control schemes are able to correct errors; some are only able to detect them.
Detecting when an error has occurred in some data, such as a number getting typed incorrectly or a bit getting flipped. Some simple examples of this are parity bits or a check digit.
Estimating values after some given values assuming they continue the trend. For example, if a sequence of 3 numbers is 1 2 3, we could continue the linear trend and extrapolate the values 4 5 6.
A function available on a digital device or software, such as copy/paste, autofocus, voice dialling or undo. Features are often used to sell a device, but having features (functionality) should not be confused with people being able to use the device effectively (usability).
Responding to or acknowledging a user action. Users find the devices hard to use if the feedback is slow, confusing, or non-existent.
In formal languages, a finite state automaton (FSA) is a simple 'machine' that has states, and transitions from one state to another based on strings of input symbols.
The logic for FSAs and regular expressions are interchangeable – every FSA has an equivalent regular expression, and vice versa.
Alternative name for a finite state automaton.
A formal language follows a set of rules that is far more restrictive than any human-spoken language. As a result, it is not at all ambiguous when checking if some given input is valid according to the language.
An attack on substitution ciphers that takes advantage of the fact that some letters are generally more common than others in a piece of text (e.g. in English, the letter "e" is usually the most common letter) by looking at which letters appear the most in the cipher text and guessing that they must be the substitutions for the most common letters.
A lossless image compression system typically used for small images with few colours in them (in practice it can be lossy because it has only 256 colours, and if the original has more colours then some will be lost).
About 1000 megabytes (1,000,000 kilobytes and 1,000,000,000 bytes). This is 8,000 million individual bits (i.e. 0’s and 1’s). Like a kilobyte, there are other definitions, such as 1024x1024x1024 bytes, but usually this level of accuracy isn’t important. Commonly referred to as a "GB".
In formal languages, a set of grammars is a set of rules for specifying a formal language, for example, to specify syntax for programming languages.
Unlike regular expressions, grammars don't specify a sequential pattern the language needs to follow. Instead it defines a set of building blocks the language can use.
A manipulation of a digital object: translation, rotation, scale and others.
An algorithm where; for every choice, all future choices are ignored and the most immediately beneficial decision is made.
Coinage has been designed with this in mind – a greedy algorithm can find the fewest number of coins required for any given amount of money. However, in many other situations the greedy algorithm does not give the most optimal result.
A function that maps a "key" of variable length to a seemingly random "hash" of fixed length.
In a hash function used for cryptography, it is easy to find the hash of a key (and the hash will always be the same), but effectively impossible to find what key was used to make the hash.
A heuristic is rule or guideline, usually devised from experience. The term is used in both HCI and algorithms. In HCI, heuristics are often used as a benchmark to evaluate interfaces – they aren’t strict rules, but usually highlight issues with designs. In algorithms, a heuristic is an approximate solution to a problem – intended as a quick, close enough calculation rather than a precise, impossibly long one.
The base 16 number system. Uses the digits "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", and "F". All numbers that can be represented in decimal can be uniquely represented in hexadecimal (just like binary). It is most often used as a shorthand notation for binary, by assigning 1 hexadecimal digit to each 4 bit pattern (the assigning is done in numeric order).
A representation for colours that tells the computer how much red, blue, and green light to display in a pixel (to make the desired colour). Uses 1 byte for each of these 3 primary colours, which is 3 bytes (24 bits) in total. These 24 bits are often written as 6 hexadecimal digits to make them easier for humans to read, which is why they are called "Hexadecimal colour codes". They are commonly encountered when specificying colours in HTML for web pages.
High level language a programming language that is designed for humans to read and write (e.g. Java, Python, C, C#, Basic, Scratch…) as opposed to machine languages.
An area of computer science looking at how people interact with a digital device, with an emphasis on the quality of the experience to complete tasks.
A protocol for the distribution of Hypertext over the internet. It is the most common protocol in use for web browsing.
Random variations in colour and brightness within images, usually caused by electrical interference.
The process of splitting an image into multiple simpler parts. For example, software attempting to read text in an image would try to segment it into individual letters and analyse those.
Start with an empty list, and insert each item in the correct place; this is a relatively slow method, usually between selection sort and quicksort in terms of speed.
An area of computer science that investigates ways to simulate or approximate human intelligence on computers; often referred to as artificial intelligence (AI).
The part of a computer, software, or electronic device that a human interacts with, whether this is by sight, hearing, or touch.
Estimating values between some given values assuming they follow a trend. For example, if a sequence of 5 numbers starts with 3 and finishes with 11, we might assume the trend is linear and interpolate the values 5, 7, 9 in between.
An interpreter runs a programming language by translating each line of code in sequence as it is executed.
Stands for International Standard Book Number. Every published book has one of these numbers on the back of it. ISBN is significant to error control coding because it uses a check digit for error detection.
A lossy image compression system typically used for photographs.
It is an item of data that is being searched for or sorted, and therefore will be compared with other data.
The password or secret value that is used to encrypt and decrypt an encrypted file (without having to use an "attack"). Some widely used methods have different keys for encryption and decryption.
About 1000 bytes. This is 8,000 individual bits (i.e. 0’s and 1’s). We say "about" 1000 bytes because the term is ambiguous and it is often taken as 1024 bytes; however, rounding it to 1000 is close enough for most calculations. Commonly referred to as a "KB".
Working out the key or method of encryption (cipher) based on having access to both the original plaintext and its encrypted form.
In formal languages, a language is the set of all strings that the language accepts i.e. that are valid.
When compiling a computer program, working out what the components of the program are e.g. identifiers, keywords, integers.
Linear complexity grows in proportion to the size of the problem - if the problem is twice as big, it will take roughly twice as long to solve.
Searching a list by looking at each item in turn, stopping when the current item matches the one being looked for.
Logarithm is a very slow growing mathematical function written as . In computer science logarithms are usually in base 2, that is, , which is the inverse of the incredibly fast growing exponent function . Logarithms are not needed to understand the material in this book, but they are used a lot in computer science and are a useful concept to understand. Logarithms happen to come up a lot with algorithms, and the two words are often confused. The value is just the number of times you can halve n until you get down to 1; for example, is 5, and is 10. Binary search takes steps to search n items; storing the number n in binary takes bits.
A marketing strategy where rather than targeting a large number of customers at once with the same advertisements, many small niche groups of customers are specifically targeted.
A compression method that does not cause any loss of data. This means that the uncompressed file will be identical to the original file that was compressed (which is important for text). In the case of images and sound, it means they will be of the same quality before and after compression. For example, ZIP and ALC use lossless compression.
A compression method that trades off quality for file size. Lossy compression methods can make files smaller than lossless compression methods can, but the quality of the resulting file will be lower. For example, MP3 and JPEG use lossy compression.
The native language for instructions for a computer, not very easy for humans to read and write.
About 1000 kilobytes (1,000,000 bytes). This is 8 million individual bits (i.e. 0’s and 1’s). [Like a kilobyte, there are other definitions, such as 1024x1024 bytes, but usually this level of accuracy isn’t important]. Commonly referred to as a "MB".
A lossy audio compression system.
4 bits (half a byte), sometimes called a nybble.
A widely used set of heuristics for evaluating computer interfaces that was devised by Jakob Nielson. The list of heuristics are available on the Nielsen Norman Group website.
In a grammar, a non-terminal is a symbol that needs to be replaced using a production in order to get the final string that the grammar is producing. Generally a non-terminal is written in upper case letters, but they can also be identified because they appear on the left side of a production in a context-free grammar. For example, in the production S -> aaBbb, the "a" and "b" are terminals, and the "S" and "B" are non-terminals. In a programming language, the non-terminal could be things like "IF_STMT" or ""COMPOUND_STMT", which would be replaced with the terminals for these statements (e.g. "IF_STMT" would eventually be replaced with terminals like "if" and "else").
The base 8 number system. Like hexadecimal, it is significant to computer scientists as it allows a shorthand notation for writing binary numbers. Octal assigns a digit to each possible 3 bit pattern.
In networking, a packet is a small piece of data routed across the internet. The packet includes control information (e.g. a source and destination, error control information), and a payload (the data being sent). The payload is almost always only a tiny part of a much larger file.
Adding an extra bit to a set of bits to make it so that there is an even number of 1’s. Storing the parity makes it possible to detect and correct errors later. This is known as an even parity bit; an odd parity bit is also possible where the extra bit ensures there is an odd number of 1’s.
A visual representation of the structure derived by parsing some input.
Reading some input (typically a computer program) and making sense of it by breaking it into parts according to their function.
In formal languages, finding text that matches a particular rule, typically using a regular expression to give the rule.
One arrangement of items. For example, the permutations of 1 & 2 are [1,2] and [2,1].
A randomly selected item in an unsorted list, used in quicksort.
This term is an abbreviation of picture element, the name given to the tiny squares that make up a grid that is used to represent images on a computer.
Text in human-readable language, without any encryption.
A lossless image compression system typically used for small images with few colours in them.
Everything relevant to the problem that should be examined when tying to solve it. The goal of a problem domain is to exclude everything you don't need to think about and only focus on what's left.
An electronic circuit which performs computations and operations on data. All computers will contain at least one processor, and most contain multiple.
A production (or production rule) in a formal languages is a rewrite rule, such as A -> aAb, which shows how the symbol on the left-hand side can be substituted as a step to generating a string of characters that belong to the language of the grammar; the grammar's language is defined by its production rules.
A formal language that specifies what instructions can be given to a computer, and what output they return.
A set of strict rules or instructions describing how something must be done.
A model or test sample of a product, used to gather feedback or test data so the design can be improved upon.
A system where a public key can be safely broadcast to anyone, but messages encrypted with it can only be read by a secret private key. This is known as an asymmetric system, because you don't use the same key to encrypt and decrypt a message.
Similarly, messages decrypted with the public key could only have been encrypted with the private key. This can be used to prove a message was sent by the intended sender, and not a third party with malicious intent.
Quadratic complexity grows with the square of the size of the problem - if the problem is twice as big, it will take roughly 4 times as long to solve.
Order the items in a list so that each item is appropriately before or after a pivot. The pivot is now in the correct location. Repeat the process for the items before the pivot, and the items after the pivot.
Extra bits that are not part of the actual data but instead have been added for error detection and possibly error correction.
A formula used to describe a pattern in a text that is to be matched or searched for. These are typically used for finding elements of a program (such as variable names) and checking input in forms (such as checking that an email address has the right format).
The logic for regular expressions and finite state automata (FSA) are interchangeable – every regular expression has an equivalent FSA, and vice versa.
To change the orientation of an object in space.
Extra information appended to a password, before hashing. Even if passswords for two different users are the same, with different salts those passwords get hashed to different values.
The salt does not need to be kept secret, as knowing it does not help find the original password.
To change the size of an object in space.
Find a key in a large amount of data.
A sorting algorithm; Select the smallest item, then the second smallest, and so on. This is not a very fast algorithm, but it’s not as bad as bubble sort, and provides a good contrast with quicksort.
In computing, a server is a program or device that provides a service for one or more other programs or devices, which are referred to as clients. They are designed to process requests and deliver data to a client over a network.
This is a way of expressing the angle or gradient of a line. The slope is simply how far up the line goes for every unit we move to the right. For example, if we have a line with a slope of 2, then after moving 3 units to the right, it will have gone up 6 units. A line with a slope of 0 is horizontal. Normally the slope of a line is represented using the symbol .
The programs, images and other data processed by computer hardware.
Sort puts keys (numbers, names or other values) in order from smallest to largest (outside computer science this is usually called ordering).
A stakeholder is any person/people who could affect or be affected by a project. For example if you were designing a bridge, the stakeholders would be:
A sequence of characters.
A type of cipher that works simply by replacing each letter or combination of letters in a plain text with a certain other letter or combination of letters to make up the cipher text.
A string is syntactically correct if it matches the specifications for a formal language. For example, the string "()(())" is correct for a grammar that gives the rules for balanced parentheses. In a computer program, a syntax error is when a character occurs in the input which isn’t allowed, and the program is therefore not syntactically correct.
Rules about what text can appear in a programming language. A programmer needs to get the syntax right for a program to run.
Also known as railway (or railroad) diagrams, these are a graphical representation of a grammar using arrows (the "train tracks") to show the options for each component of a language.
Something a user might do with a piece of software or electronic device to achieve a goal. In the case of a cellphone this might be "send a text message" or in the case of a microwave it might be "heat up yesterday’s leftovers". Interfaces are best evaluated when considering how they help a user to perform a task.
In a grammar, a terminal is a symbol that is in the final string that the grammar is producing. Generally a terminal is written in lower case letters, and non-terminals are written in capital letters. For example, in the production S -> aaBbb, the "a" and "b" are terminals, and the "S" and "B" are non-terminals.
In a programming language, the terminal could be a keyword, such as "if", "else" or "def".
A very simple method of image segmentation. For every pixel in an image; if it is above a certain threshold, the pixel is set to a certain colour (often white). Otherwise it is set to another certain colour (often black).
Time complexity the usual meaning of the complexity of an algorithm; this makes it clear that we’re talking about the time taken. Normally it’s expressed in terms of steps, not real time on a particular computer, as different computers are different speeds.
A tractable problem is one that can be solved in a reasonable amount of time; usually the distinction between tractable and intractable is drawn at the boundary between problems that can be solved in an amount of time that is polynomial; those that require exponential time are regarded as intractable.
In a finite state automata, a transition is a link out of one state and into one state. The two states could even be the same state!
To change the position of an object in space.
Unicode is an extension of ASCII; it supports characters from multiple languages, using 16 bits per character.
A measure of how easily or effectively a user can achieve specific objectives using a product or system. More information can be found on Wikipedia.
See 'Heuristic'
The human using the computer system or electronic device.
A user's emotions and attitudes about a product, including how they interact and experience the product.
An informal description of features in a system, written from the perspective of the user.
Visual computing designing systems that can perceive, process, understand and generate images. Images typically come from scanners and cameras, and may be displayed on monitors, head mounted displays, or as movies.