When code is translated, it passes through a number of stages that convert the human-friendly high-level source code we write into machine code that the computer can run.
Compilation is divided into three distinct stages, during which the original source code is cleaned up and checked before it can run:
- Lexical Analysis
- Syntax Analysis
- Code Generation
Lexical analysis identifies the words and symbols in the source code that belong to the programming language, such as keywords, identifiers and operators. A lexeme is a single item of data within the source code.
Much like breaking down data in a database, the source code is broken down into atomic data, that is, the smallest meaningful items of data.
Once broken down, these lexemes are used to create tokens, which are saved in a token table along with an identifier for their purpose.
Once the tokens have been created, the final job of lexical analysis is to remove anything in the source code that is not needed in the machine code, such as comments and whitespace.
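The lexical analysis steps above can be sketched in a few lines of Python. This is only an illustration, assuming a tiny made-up language: the token categories in `TOKEN_SPEC` and the function name `tokenise` are hypothetical, and a real translator's lexer would recognise far more.

```python
import re

# Hypothetical token categories for a tiny illustrative language.
TOKEN_SPEC = [
    ("COMMENT",    r"#[^\n]*"),        # redundant: not needed in machine code
    ("WHITESPACE", r"\s+"),            # redundant: not needed in machine code
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[=+\-*/]"),
]
MASTER_PATTERN = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenise(source):
    """Break the source into lexemes and label each one with its purpose."""
    tokens = []
    position = 0
    while position < len(source):
        match = MASTER_PATTERN.match(source, position)
        if match is None:
            raise SyntaxError(
                f"Unrecognised character {source[position]!r} at position {position}"
            )
        kind = match.lastgroup  # name of the category that matched
        if kind not in ("COMMENT", "WHITESPACE"):
            tokens.append((kind, match.group()))  # comments and spaces are dropped
        position = match.end()
    return tokens

token_table = tokenise("x = 10  # starting value\ny = x + 2")
for kind, lexeme in token_table:
    print(kind, lexeme)
```

Each entry in `token_table` pairs a lexeme with an identifier for its purpose, and the comment and whitespace never reach the table.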
When we talk about the syntax of a written language like English, we are referring to its spelling and grammar. The syntax of a programming language is no different.
In the first stage of syntax analysis, the code statements are checked to ensure that they conform to the rules of the language. These rules are often defined using BNF (Backus-Naur Form) or syntax diagrams.
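As a rough illustration of checking statements against BNF rules, the sketch below tests a list of tokens against three made-up rules for an assignment statement. The rules, the token format and the function names are assumptions chosen for this example, not the grammar of any real language.

```python
# Hypothetical BNF rules being checked:
#   <assignment> ::= <identifier> "=" <expression>
#   <expression> ::= <term> | <term> <operator> <expression>
#   <term>       ::= <identifier> | <number>

ARITHMETIC_OPERATORS = {"+", "-", "*", "/"}

def is_term(token):
    kind, _ = token
    return kind in ("IDENTIFIER", "NUMBER")

def is_expression(tokens):
    # An expression must start with a term
    if not tokens or not is_term(tokens[0]):
        return False
    if len(tokens) == 1:
        return True
    # Otherwise an operator and another expression must follow
    kind, text = tokens[1]
    return (
        kind == "OPERATOR"
        and text in ARITHMETIC_OPERATORS
        and is_expression(tokens[2:])
    )

def is_assignment(tokens):
    return (
        len(tokens) >= 3
        and tokens[0][0] == "IDENTIFIER"
        and tokens[1] == ("OPERATOR", "=")
        and is_expression(tokens[2:])
    )

# y = x + 2 follows the rules; y = + 2 does not
valid = [("IDENTIFIER", "y"), ("OPERATOR", "="), ("IDENTIFIER", "x"),
         ("OPERATOR", "+"), ("NUMBER", "2")]
invalid = [("IDENTIFIER", "y"), ("OPERATOR", "="),
           ("OPERATOR", "+"), ("NUMBER", "2")]

print(is_assignment(valid))    # True
print(is_assignment(invalid))  # False
```

A statement that fails the check is where a translator would report a syntax error.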
Like syntax analysis, code generation is broken into two distinct tasks:
Machine Code Creation
At this stage, the tokens are used to create the low-level machine code. Because the purpose of each token was identified during lexical analysis, the translator can assign memory to the variables and break the complex high-level statements down into simple low-level instructions.
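The sketch below gives a feel for this step, assuming statements have already been tokenised and checked. The `LOAD`, `ADD`, `SUB` and `STORE` mnemonics are invented for illustration and do not belong to any real instruction set; real machine code would be binary. Only `+` and `-` are handled, evaluated left to right.

```python
def generate(statements):
    """Turn tokenised assignment statements into simple low-level instructions."""
    addresses = {}     # variable name -> memory address assigned to it
    instructions = []

    def address_of(name):
        # Assign the next free memory address the first time a variable is seen
        if name not in addresses:
            addresses[name] = len(addresses)
        return addresses[name]

    def operand(token):
        kind, text = token
        # '#' marks a literal value; '[n]' marks the contents of address n
        return f"#{text}" if kind == "NUMBER" else f"[{address_of(text)}]"

    for tokens in statements:
        target, _equals, *expression = tokens
        instructions.append(f"LOAD {operand(expression[0])}")
        # The rest of the expression comes in (operator, term) pairs
        for op, term in zip(expression[1::2], expression[2::2]):
            mnemonic = {"+": "ADD", "-": "SUB"}[op[1]]
            instructions.append(f"{mnemonic} {operand(term)}")
        instructions.append(f"STORE [{address_of(target[1])}]")
    return instructions, addresses

# Tokens for:  x = 10   and   y = x + 2
statements = [
    [("IDENTIFIER", "x"), ("OPERATOR", "="), ("NUMBER", "10")],
    [("IDENTIFIER", "y"), ("OPERATOR", "="), ("IDENTIFIER", "x"),
     ("OPERATOR", "+"), ("NUMBER", "2")],
]
instructions, addresses = generate(statements)
print("\n".join(instructions))
print(addresses)
```

Here the single high-level statement `y = x + 2` becomes three simpler instructions, and each variable is given its own memory address.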
The final part of code generation is making the statements in the source code more efficient. This often means dealing with redundancy that the programmer has included, such as assigning a value to a variable where it is not needed. For example:
x = 10
y = x
x = 10
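One reading of the example above is that the second `x = 10` is redundant, because `x` already holds 10 when it is reached. The sketch below removes assignments of that kind. It is a simplified illustration that assumes straight-line code with no branches or loops, and the `(variable, value)` statement format and function name are hypothetical.

```python
def remove_redundant_assignments(statements):
    """Drop assignments that would not change the value a variable already holds."""
    known = {}       # variable -> value it is known to hold at this point
    optimised = []
    for target, value in statements:
        # A variable name on the right-hand side is replaced by its known value
        resolved = known.get(value, value) if isinstance(value, str) else value
        if target in known and known[target] == resolved:
            continue  # the variable already holds this value, so skip it
        known[target] = resolved
        optimised.append((target, value))
    return optimised

# The three statements from the example: x = 10, y = x, x = 10
program = [("x", 10), ("y", "x"), ("x", 10)]
print(remove_redundant_assignments(program))  # [('x', 10), ('y', 'x')]
```

The optimised program produces the same final values for `x` and `y` with one fewer statement.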