Data Types

Computer programming languages, including Lua, work with different types of data, and have different techniques for encoding different types of values as binary data.

Boolean

Boolean values represent a truth value with two possible states. A single bit of information in a computer can hold one of two values, which is enough to store a Boolean. Boolean values can be thought of as either true or false, or as 1 or 0.
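As a small illustration, the sketch below shows Lua's boolean type and its basic logical operators. The variable names are made up for the example. One caveat worth knowing: although Booleans can be thought of as 1 or 0, Lua itself does not treat the number 0 as false.

```lua
-- Lua has a dedicated boolean type with exactly two values.
local is_ready = true
local is_done = false

print(type(is_ready))        -- the type name is "boolean"
print(is_ready and is_done)  -- logical AND of the two values
print(is_ready or is_done)   -- logical OR of the two values
print(not is_ready)          -- logical negation

-- Only false and nil count as false in a Lua condition,
-- so the number 0 is treated as true here.
local zero_is_truthy = false
if 0 then
  zero_is_truthy = true
end
print(zero_is_truthy)
```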

Integer

Integer data types hold numeric values with no fractional part. In other words, they cannot contain anything to the right of the decimal point. They are like integers in algebra: they can be positive whole numbers, zero, or negative whole numbers.

Computers can also work with unsigned integers, which cannot hold negative numbers (numbers less than zero).

The range of numbers an integer variable can hold depends on its storage size. Modern computers typically use 32-bit or 64-bit integers. An unsigned integer can hold a larger maximum value than a signed integer of the same size, because a signed integer gives up one bit's worth of range to represent negative numbers. The actual technique used, two's complement, is not exactly equivalent to using that bit purely as a sign flag.
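The relationship between storage size and range can be sketched in a few lines of Lua. The helper integer_range below is hypothetical, written only for this illustration, and it assumes two's complement for the signed case. Because it computes with ordinary Lua numbers, it is only used here for sizes up to 32 bits.

```lua
-- Compute the smallest and largest value of an integer stored in `bits` bits.
-- Hypothetical helper for illustration only.
local function integer_range(bits, signed)
  if signed then
    -- Two's complement: there is one more negative value than positive values.
    return -(2 ^ (bits - 1)), 2 ^ (bits - 1) - 1
  end
  return 0, 2 ^ bits - 1
end

for _, bits in ipairs({ 8, 16, 32 }) do
  local smin, smax = integer_range(bits, true)
  local umin, umax = integer_range(bits, false)
  print(string.format("%2d-bit signed:   %.0f to %.0f", bits, smin, smax))
  print(string.format("%2d-bit unsigned: %.0f to %.0f", bits, umin, umax))
end
```

Note how the unsigned maximum is roughly twice the signed maximum for the same number of bits.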

Small integers, typically 8 bits (one byte), are also used to represent character data and to manipulate data stored in memory, which is addressed in bytes.
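A minimal sketch of the link between characters and small integers, using Lua's standard string.byte and string.char functions. It assumes an ASCII-compatible character set, which is the usual case.

```lua
local code = string.byte("A")      -- numeric value of the first byte
local letter = string.char(code)   -- back from the number to a 1-byte string
print(code, letter)

-- Arithmetic on the byte value gives a different character.
local next_letter = string.char(code + 1)
print(next_letter)
```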

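Range of 16-bit Integer Values

Signed: -32,768 to 32,767
Unsigned: 0 to 65,535

Range of 32-bit Integer Values

Signed: -2,147,483,648 to 2,147,483,647
Unsigned: 0 to 4,294,967,295

Range of 64-bit Integer Values

Signed: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
Unsigned: 0 to 18,446,744,073,709,551,615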

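Unlike most programming languages, Lua up to version 5.2 has no separate integer type. Instead, double-precision floating point numbers are used for all numeric values. (Lua 5.3 and later add a 64-bit integer subtype alongside floating point numbers.) A double-precision floating point number can hold every integer up to 2^53 exactly, which is a larger range than a 32-bit integer provides.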
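The limit of exact integers in a double can be checked directly. A double has enough mantissa bits to represent every integer up to 2^53 exactly; beyond that, adding 1 can be lost to rounding. The sketch below assumes numbers produced by the ^ operator are doubles, which holds in standard Lua builds.

```lua
local big = 2 ^ 53               -- 9007199254740992, stored as a double

print(2 ^ 32 + 1 == 2 ^ 32)      -- values just past 32 bits are still exact
print(big - 1 == big)            -- below 2^53 neighbouring integers differ
print(big + 1 == big)            -- past 2^53 the +1 is lost to rounding
```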

Floating Point

Floating point numbers can represent very small or very large values, as well as values with a fractional part (digits to the right of the decimal point).

Floating point arithmetic is approximate because there are only so many bits available for encoding values. Floating point numbers are encoded using three parts: a sign, a mantissa (also called the significand), and an exponent.

The actual value of a floating point number is the mantissa multiplied by a power of the base: mantissa × 2^exponent, since the base is 2 in a binary encoding.
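The formula can be tried out with a small helper. This is a sketch of the arithmetic only, not of how the bits are laid out in memory, and it assumes the three encoded parts are a sign, a mantissa, and an exponent. The function name float_value is made up for the example.

```lua
-- Hypothetical helper: rebuild a value from its three parts.
-- sign is 1 or -1, mantissa is a fraction such as 1.25,
-- exponent is a whole number.
local function float_value(sign, mantissa, exponent)
  return sign * mantissa * 2 ^ exponent
end

print(float_value(1, 1.25, -3))   -- 1.25 * 2^-3
print(float_value(-1, 1.5, 3))    -- -(1.5 * 2^3)
```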

Floating point numbers work much like scientific notation for decimal numbers, except that the base is 2 rather than 10, and the mantissa and exponent are stored as binary numbers.

Floating point numbers can represent a much wider range of values than integers, but long sequences of operations on them can accumulate significant rounding errors. Comparing floating point numbers for exact equality is also problematic. To get useful results, you typically check whether the difference between the two numbers is below some small cutoff value instead.
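A minimal sketch of comparing with a cutoff value. The helper nearly_equal and its default tolerance of 1e-9 are assumptions chosen for illustration; a suitable tolerance depends on the size of the numbers involved.

```lua
-- Hypothetical helper: treat two numbers as equal when they differ by
-- less than a small tolerance. The default tolerance is an arbitrary choice.
local function nearly_equal(a, b, tolerance)
  tolerance = tolerance or 1e-9
  return math.abs(a - b) < tolerance
end

local sum = 0.1 + 0.2
print(sum == 0.3)               -- exact comparison
print(nearly_equal(sum, 0.3))   -- comparison with a tolerance
```

The exact comparison fails because neither 0.1, 0.2, nor 0.3 can be represented exactly in binary, so the rounded sum differs slightly from the rounded 0.3.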

The exact specification for implementing floating point in modern computers is (usually) based on the IEEE 754 standard.

The most common floating point sizes are 32 bits (single precision) and 64 bits (double precision).

Strings

Strings are sequences of characters that represent text or other data. In Lua, strings are immutable: you cannot modify a string value, only create a new string with a different value. You can, however, change which string value a variable refers to.
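The difference between changing a string and changing a variable can be sketched as follows. The variable names are made up for the example.

```lua
local greeting = "hello"
local shouted = greeting:upper()   -- returns a new string

print(greeting)   -- the original value is unchanged
print(shouted)

-- Reassigning the variable makes it refer to a different string value.
greeting = greeting .. ", world"
print(greeting)
```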

Lua strings are sequences of 8-bit values (bytes). They can hold UTF-8 encoded Unicode text, but Lua's string functions operate on bytes rather than characters, so there is little built-in support for Unicode handling. You can also use Lua strings to hold data in other encodings, Unicode or otherwise.
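One practical consequence is that the length of a string is counted in bytes. The sketch below builds the UTF-8 text "café" using decimal byte escapes, so that it does not depend on how the source file itself is encoded.

```lua
-- "é" encoded as UTF-8 is the two bytes 195 and 169.
local e_acute = "\195\169"
local word = "caf" .. e_acute      -- the text "café" as UTF-8

print(#word)                       -- length in bytes, not in characters
print(string.byte(word, 1, -1))    -- the individual byte values
```

The text has four characters, but the length operator reports five because the last character occupies two bytes.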

Data Structures

A data structure is a way to build a complicated set of data out of small, individual data values. For example, a list of numbers would be stored in a data structure, as opposed to a single number.
Besides sequential lists, data structures can also represent other structures or relationships between items.
For example, a data structure can have named or indexed fields that represent different aspects of one thing. A data structure for storing someone's contact information might have separate fields for name, phone number, email, and address. You can build ever more complicated data structures by nesting them or connecting them together in different ways. For instance, you could create a list of contact data structures.
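As a preview, here is a sketch of a list of contact structures written with Lua tables, which are introduced later in this section. All names and contact details are invented for illustration.

```lua
-- Hypothetical contact data, invented for illustration.
local contacts = {
  { name = "Ada Example", phone = "555-0100",
    email = "ada@example.com", address = "1 First Street" },
  { name = "Bob Example", phone = "555-0101",
    email = "bob@example.com", address = "2 Second Street" },
}

-- Walk the list in order and read fields from each nested structure.
for position, contact in ipairs(contacts) do
  print(position, contact.name, contact.email)
end
```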

Programmers use lots of different kinds of data structures, but there are only a few they use very frequently.

Arrays

The simplest data structure to explain is an array. An array is basically a list of values, which are indexed by their position. In some computer languages the first element is at position 0, while in others it is at position 1.

Arrays Diagram

Tables

Tables are what you use in Lua to create data structures. This is different from most programming languages, which have two or more language mechanisms for creating data structures.

A Lua table is an associative array, which is sometimes also known as a dictionary. An associative array is a data structure that stores pairs of keys and values. Keys must be unique, and you can use a key to look up the value that makes up the other half of its key/value pair. You can also iterate over all the keys or all the pairs. You can insert new pairs, remove pairs, or replace the value assigned to a key with a new value. The key/value pairs in an associative array do not have any specific order assigned to them.
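These operations can be sketched with a small table. The table name ages and its contents are made up for the example.

```lua
local ages = {}

ages["alice"] = 30      -- insert a new pair
ages["bob"] = 25
ages["alice"] = 31      -- replace the value stored under an existing key
ages["bob"] = nil       -- remove a pair by assigning nil

print(ages["alice"])    -- look up a value by its key

-- Iterate over all pairs; the order of traversal is not specified.
local count = 0
for name, age in pairs(ages) do
  print(name, age)
  count = count + 1
end
```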

Tables Diagram

Since Lua only has tables for defining data structures, we actually use tables for arrays as well. We do this by using numbers starting with 1 for the keys and going up as far as necessary. Try running this short Lua program to see an example of an array.
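A minimal sketch of such a program, using made-up values:

```lua
local colors = { "red", "green", "blue" }   -- keys 1, 2 and 3 are implied

print(colors[1])    -- the first element is at index 1
print(#colors)      -- the length operator counts the elements

colors[#colors + 1] = "yellow"   -- append at the next free index

-- ipairs visits the numeric keys in order, starting at 1.
for index, color in ipairs(colors) do
  print(index, color)
end
```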

Lua also uses tables to represent what many programming languages call a record or structure. A record is a compound data type made of named fields, each of which holds another data type: either a simple one, like an integer, or another record.

Lua supports the . (dot) syntax for indexing into tables to make using tables for records easier syntactically. These two expressions are exactly the same in Lua:

my_table["myfield"]
my_table.myfield

These two are NOT the same, because the value of myvar will be used in the first, but the string "myvar" will be used in the second.

myvar = "myvalue"
my_table[myvar]  -- first: same as my_table["myvalue"]
my_table.myvar   -- second: same as my_table["myvar"]