A JavaScript compiler, like any compiler, is a computer program that processes the statements written in a particular programming language. It then converts those lines of code into machine language, or otherwise “encodes” them, so that the processor can use them.
Introduction
Before looking at JavaScript compilers in detail, let us understand language statements. Usually, a developer writes statements in a programming language such as Pascal, C, C++, or Java, one line at a time, using a text editor.
The resulting text file consists of lines of code, also known as source code. The file is saved with the extension appropriate to the programming language and is then passed to that language’s compiler by specifying the name of the source file.
Working of a JavaScript Compiler
At compile time, the compiler first has to parse (or analyse) the language statements syntactically, in order, in one or more successive stages known as “passes”, to build the target program. The compiler also makes sure that statements that refer to another line of code, such as the use of a variable declared elsewhere, are resolved and ordered correctly.
Classically, the output generated by the compiler is known as the object code or the object module. The term “object” here has nothing to do with object-oriented programming. Object code is machine code that the processor executes one instruction at a time.
1. Lexical Analysis
Lexical analysis, also known as tokenisation, is the first phase of a compiler. The source text is split into its smallest meaningful units, called tokens, and white space is discarded. As they are separated, tokens are also assigned primitive types, such as “word” or “number”. Comments are dropped in this phase as well.
If the lexical analyser does not recognise a token, it generates an error message and drops the token. The lexical analyser works in close coordination with the syntax analyser: it checks that each token is legal and then passes the tokens to the syntax analyser on demand.
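The behaviour described above can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the `tokenize` function, the `//` comment syntax, and the rules for what counts as a word or a number are all hypothetical choices made for the example, not part of any particular compiler.

```javascript
// Hypothetical mini-lexer: drops comments and white space, classifies tokens,
// and reports (then drops) anything it does not recognise.
function tokenize (source) {
  var tokens = []
  var errors = []
  source.split('\n').forEach(function (line, index) {
    // Assumed comment syntax: everything after `//` on a line is dropped.
    var code = line.split('//')[0]
    var pieces = code.split(/\s+/).filter(function (t) { return t.length > 0 })
    pieces.forEach(function (t) {
      if (/^\d+$/.test(t)) {
        tokens.push({ type: 'number', value: t })
      } else if (/^[A-Za-z]+$/.test(t)) {
        tokens.push({ type: 'word', value: t })
      } else {
        // Unrecognised token: record an error message and drop the token.
        errors.push('Line ' + (index + 1) + ': unrecognised token "' + t + '"')
      }
    })
  })
  return { tokens: tokens, errors: errors }
}

var result = tokenize('Paper 0 // set the paper colour\nPen 5$')
console.log(result.tokens)
console.log(result.errors)
```

Here the comment on the first line never reaches the token list, and the malformed `5$` on the second line is reported as an error instead of being passed on to the parser.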
2. Syntax Analysis/Parsing
Syntax analysis is the second phase of a compiler. As soon as a block of text has been converted into tokens, we go through each token and try to derive the relationships among them. We group the numbers or symbols with the command keywords they belong to. After these steps, the tokens form a well-defined structure.
Once the structure is created, the compiler checks that the input conforms to the rules of the language’s formal grammar. As the name suggests, the syntactic structure of the input is checked against the selected programming language.
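The grouping and checking described above could look roughly like the following sketch. The `ARITY` table (how many numbers each command takes) and the `groupCommands` function are assumptions made for illustration; a real grammar would be richer than a fixed argument count.

```javascript
// Hypothetical grammar: each command word takes a fixed number of numeric arguments.
var ARITY = { Paper: 1, Pen: 1, Line: 4 }

// Groups each command word with its numbers and rejects input that breaks the grammar.
function groupCommands (tokens) {
  var commands = []
  var i = 0
  while (i < tokens.length) {
    var head = tokens[i]
    var known = Object.prototype.hasOwnProperty.call(ARITY, head.value)
    if (head.type !== 'word' || !known) {
      throw new Error('Unexpected token "' + head.value + '"')
    }
    var count = ARITY[head.value]
    var args = tokens.slice(i + 1, i + 1 + count)
    var allNumbers = args.every(function (t) { return t.type === 'number' })
    if (args.length < count || !allNumbers) {
      throw new Error(head.value + ' expects ' + count + ' number(s)')
    }
    commands.push({
      command: head.value,
      args: args.map(function (t) { return Number(t.value) })
    })
    i += 1 + count
  }
  return commands
}

var commands = groupCommands([
  { type: 'word', value: 'Pen' },
  { type: 'number', value: '100' },
  { type: 'word', value: 'Line' },
  { type: 'number', value: '0' },
  { type: 'number', value: '50' },
  { type: 'number', value: '100' },
  { type: 'number', value: '50' }
])
console.log(JSON.stringify(commands))
```

Input such as a lone `Pen` with no number after it fails the check and raises an error, which is the syntax analyser’s way of saying the input does not follow the grammar.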
3. Transformation
After syntax analysis, the structure generated by the parser is transformed into one better suited to generating the final result. The parse tree built by the parser serves as the step-by-step input for this transformation.
4. Code Optimisation
In this step, the compiler eliminates unnecessary intermediate variables and calculation steps so that the generated code is as efficient as possible. Temporary variables, such as those used for swapping or for storing duplicates, and unused array space are typically dropped in this step.
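One common way to remove a calculation step is constant folding: evaluating at compile time any sub-expression whose operands are already known. The text above does not name a specific technique, so treat this as one illustrative example. The tree shape (`num`, `var`, `add`, `mul` nodes) and the `fold` function are hypothetical.

```javascript
// Hypothetical expression tree nodes:
//   { type: 'num', value: 6 }
//   { type: 'var', name: 'x' }
//   { type: 'add' | 'mul', left: node, right: node }
function fold (node) {
  if (node.type === 'num' || node.type === 'var') {
    return node
  }
  var left = fold(node.left)
  var right = fold(node.right)
  // If both sides are known numbers, compute the result now instead of at run time.
  if (left.type === 'num' && right.type === 'num') {
    var value = node.type === 'add'
      ? left.value + right.value
      : left.value * right.value
    return { type: 'num', value: value }
  }
  return { type: node.type, left: left, right: right }
}

// x + (2 * 3): the multiplication can be folded, the addition cannot.
var tree = {
  type: 'add',
  left: { type: 'var', name: 'x' },
  right: {
    type: 'mul',
    left: { type: 'num', value: 2 },
    right: { type: 'num', value: 3 }
  }
}
var optimised = fold(tree)
console.log(JSON.stringify(optimised))
```

After folding, the tree represents `x + 6`, so the generated code needs one operation instead of two.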
5. Code Generation
The final step is to convert the optimised code into target code, for example assembly-like three-address instructions, that can ultimately be executed by the computer’s microprocessor or microcontroller.
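To give a feel for three-address code, the sketch below walks a small expression tree and emits one instruction per operation, each with at most two operands and one result. The tree shape, the `generate` function, and the `t1`, `t2` temporary names are assumptions made for illustration; real code generators also handle registers, memory, and control flow.

```javascript
// Hypothetical expression tree nodes:
//   { type: 'num', value: 6 }, { type: 'var', name: 'a' },
//   { type: 'add' | 'mul', left: node, right: node }
// Appends instructions to `out` and returns the name that holds the node's result.
function generate (node, out) {
  if (node.type === 'num') {
    return String(node.value)
  }
  if (node.type === 'var') {
    return node.name
  }
  var left = generate(node.left, out)
  var right = generate(node.right, out)
  var temp = 't' + (out.length + 1) // fresh temporary for this operation
  var op = node.type === 'add' ? '+' : '*'
  out.push(temp + ' = ' + left + ' ' + op + ' ' + right)
  return temp
}

// a + (b * c)
var expression = {
  type: 'add',
  left: { type: 'var', name: 'a' },
  right: {
    type: 'mul',
    left: { type: 'var', name: 'b' },
    right: { type: 'var', name: 'c' }
  }
}
var instructions = []
generate(expression, instructions)
console.log(instructions.join('\n'))
```

The multiplication is emitted first because its result is needed as an operand of the addition.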
Steps for creating a JavaScript Compiler
Now that we are clear about how a compiler works, let’s attempt to make one using JavaScript. This compiler takes DBN (Design By Numbers) code as input and converts it into SVG source code.
1. Write the Lexer function
The lexical analyser has to split a given input string into small meaningful pieces, known as tokens. For instance, in English, we can split the sentence “I love Coding Ninjas” into [I, love, Coding, Ninjas]. In DBN, tokens are delimited by white space and then classified as a “word” or a “number”.
function lexer (code) {
  return code.split(/\s+/)
    .filter(function (t) { return t.length > 0 })
    .map(function (t) {
      return isNaN(t)
        ? {type: 'word', value: t}
        : {type: 'number', value: t}
    })
}
Code Courtesy: lexer.js
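To see what the lexer produces, here is a small usage sketch. The `lexer` function is repeated from the listing above so that the snippet runs on its own; the input string is just an example.

```javascript
// Same lexer as in the listing above, repeated so this snippet is self-contained.
function lexer (code) {
  return code.split(/\s+/)
    .filter(function (t) { return t.length > 0 })
    .map(function (t) {
      return isNaN(t)
        ? { type: 'word', value: t }
        : { type: 'number', value: t }
    })
}

// Tokens may be separated by any white space, including newlines and repeated spaces.
var tokens = lexer('Paper 100\nPen   0')
console.log(tokens)
```

Note that number tokens keep their value as a string (`'100'`, not `100`), so later stages that do arithmetic rely on JavaScript’s implicit conversion or must convert the value themselves.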
2. Create the Parser function
The parser analyses each token and validates its syntactic information. Its main aim is to build an object known as the AST (Abstract Syntax Tree), a hierarchical representation of how the tokens relate to one another. Nodes are identified by type: for example, “NumberLiteral” means that the given value is a number, and “CallExpression” represents a command together with its arguments.
function parser (tokens) {
  var AST = {
    type: 'Drawing',
    body: []
  }
  // extract a token at a time as current_token. Loop until we are out of tokens.
  while (tokens.length > 0) {
    var current_token = tokens.shift()
    // Since a number token does not do anything by itself, we only analyze syntax when we find a word.
    if (current_token.type === 'word') {
      switch (current_token.value) {
        case 'Paper' :
          var expression = {
            type: 'CallExpression',
            name: 'Paper',
            arguments: []
          }
          // if current token is CallExpression of type Paper, next token should be color argument
          var argument = tokens.shift()
          // `argument` is undefined when the input ends right after 'Paper', so check it exists first
          if (argument && argument.type === 'number') {
            expression.arguments.push({ // add argument information to expression object
              type: 'NumberLiteral',
              value: argument.value
            })
            AST.body.push(expression) // push the expression object to body of our AST
          } else {
            throw new Error('Paper command must be followed by a number.')
          }
          break
        case 'Pen' :
          ...
        case 'Line':
          ...
      }
    }
  }
  return AST
}
Code Courtesy: parser.js
3. Create the Transformer function
Although the Abstract Syntax Tree created in step two is ideal for describing the hierarchy of the code, an SVG file cannot be created from it directly.
For instance, “Paper” is a concept that exists only in the DBN paradigm. In SVG, a <rect> element might be used to represent a Paper. The main job of the transformer function is to convert the AST into an SVG-friendly one.
function transformer (ast) {
  var svg_ast = {
    tag : 'svg',
    attr: {
      width: 100, height: 100, viewBox: '0 0 100 100',
      xmlns: 'http://www.w3.org/2000/svg', version: '1.1'
    },
    body:[]
  }
  var pen_color = 100 // default pen color is black
  // Extract a call expression at a time as `node`. Loop until we are out of expressions in body.
  while (ast.body.length > 0) {
    var node = ast.body.shift()
    switch (node.name) {
      case 'Paper' :
        var paper_color = 100 - node.arguments[0].value
        svg_ast.body.push({ // add rect element information to svg_ast's body
          tag : 'rect',
          attr : {
            x: 0, y: 0,
            width: 100, height:100,
            fill: 'rgb(' + paper_color + '%,' + paper_color + '%,' + paper_color + '%)'
          }
        })
        break
      case 'Pen':
        pen_color = 100 - node.arguments[0].value // keep current pen color in `pen_color` variable
        break
      case 'Line':
        ...
    }
  }
  return svg_ast
}
Code Courtesy: transformer.js
4. Code the Generator function
As the concluding step of the compiler, the generator function creates SVG code from the new AST made in step three.
function generator (svg_ast) {
  // create attributes string out of attr object
  // { "width": 100, "height": 100 } becomes 'width="100" height="100"'
  function createAttrString (attr) {
    return Object.keys(attr).map(function (key){
      return key + '="' + attr[key] + '"'
    }).join(' ')
  }
  // top node is always <svg>. Create attributes string for svg tag
  var svg_attr = createAttrString(svg_ast.attr)
  // for each elements in the body of svg_ast, generate svg tag
  var elements = svg_ast.body.map(function (node) {
    return '<' + node.tag + ' ' + createAttrString(node.attr) + '></' + node.tag + '>'
  }).join('\n\t')
  // wrap with open and close svg tag to complete SVG code
  return '<svg '+ svg_attr +'>\n' + elements + '\n</svg>'
}
Code Courtesy: generator.js
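As a quick check of the generator, the sketch below feeds it a hand-written svg_ast containing a single rect element. The generator is repeated from the listing above so that the snippet runs on its own; the sample tree is an assumption made for illustration, not actual output from the transformer.

```javascript
// Same generator as in the listing above, repeated so this snippet is self-contained.
function generator (svg_ast) {
  function createAttrString (attr) {
    return Object.keys(attr).map(function (key) {
      return key + '="' + attr[key] + '"'
    }).join(' ')
  }
  var svg_attr = createAttrString(svg_ast.attr)
  var elements = svg_ast.body.map(function (node) {
    return '<' + node.tag + ' ' + createAttrString(node.attr) + '></' + node.tag + '>'
  }).join('\n\t')
  return '<svg ' + svg_attr + '>\n' + elements + '\n</svg>'
}

// Hand-written sample tree (hypothetical): an svg root with one black rectangle.
var sample_ast = {
  tag: 'svg',
  attr: { width: 100, height: 100 },
  body: [
    {
      tag: 'rect',
      attr: { x: 0, y: 0, width: 100, height: 100, fill: 'rgb(0%,0%,0%)' }
    }
  ]
}
var svg = generator(sample_ast)
console.log(svg)
```

Each key of an `attr` object becomes one attribute on the tag, in the order the keys were written, and every element in `body` is emitted as an empty open/close tag pair inside the svg root.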
5. Assemble all these to generate the compiler
In the final step, we create an object named sbn, since this compiler can be called an “sbn compiler” (SVG by numbers compiler), and attach the lexer, parser, transformer, and generator functions to it as methods. We then add a “compile” method that invokes all of them in sequence. After this, we pass the input string to the compile method and get the SVG output.
var sbn = {}
sbn.VERSION = '0.0.1'
sbn.lexer = lexer
sbn.parser = parser
sbn.transformer = transformer
sbn.generator = generator
sbn.compile = function (code) {
  return this.generator(this.transformer(this.parser(this.lexer(code))))
}
// call sbn compiler
var code = 'Paper 0 Pen 100 Line 0 50 100 50'
var svg = sbn.compile(code)
document.body.innerHTML = svg
Code Courtesy: compiler.js
Frequently Asked Questions
JavaScript can be considered a “compiled language”, although it differs from other compiled languages such as C++ and Java.
The key difference is that the compiled code is not portable and is not produced in advance. In practice, JavaScript code is compiled only moments before it is executed.
In a wide range of JavaScript implementations, JavaScript is JIT-compiled to native machine code. Even though some parts of a JavaScript program may be interpreted briefly, JavaScript is not purely an interpreted language.
Earlier, JavaScript was an interpreted language. It usually had no compiler and was interpreted by the “JavaScript engine” built into each web browser.
In recent years, however, JavaScript engines have come to work as effective compilers. For instance, Google’s V8 engine, which also powers server-side JavaScript in Node.js, compiles JavaScript code into machine instructions instead of relying on classical interpretation alone. There have been other Just-In-Time compilers as well, such as Mozilla’s JägerMonkey.
There are numerous tools and languages that compile to JavaScript, e.g. Dart, Babel (ES6), CoffeeScript, and TypeScript. You can pick any of them based on your requirements. TypeScript tends to be a promising choice because it adds static typing to JavaScript, which is a dynamically typed language. That is one reason Google’s Angular 2 adopted Microsoft’s TypeScript over Google’s own Dart.
The prime function of a compiler is to convert a high-level language into a low-level language that the machine can understand. For example, a C or C++ compiler converts the source code directly into machine code, which depends on a particular platform. Java, by contrast, is platform-independent.
The Java compiler, named javac, converts the source code into an intermediate form known as Java bytecode. This bytecode is platform-independent: you can compile the source code with javac on Windows, Linux, or Mac, and the resulting bytecode runs on any platform that has a Java Virtual Machine (JVM).
JIT stands for Just In Time. In contrast to compiled languages such as C, where compilation is done ahead of time (that is, before the program is executed), a JIT compiler allows JavaScript to be compiled during execution.
In Java, the JIT compiler is enabled by default and is activated when a method is called. It runs “just in time”: it compiles the bytecode of a method into native machine code just before execution. Once a method has been compiled, the JVM calls the compiled code directly instead of interpreting it.
To speed up execution, V8 converts JavaScript code into efficient machine code rather than only running it through an interpreter. A “Just in Time” compiler works in a similar way in numerous modern JavaScript engines, such as Mozilla’s SpiderMonkey and Rhino.
Conclusion
The software development industry is one of the most dynamic industries: every minute, hundreds of pieces of software are updated, whether they are app code, websites, Git repositories, or development frameworks. As we saw in the case of JavaScript, a few years ago it was an interpreted language, but now it is one of the most efficient, if fragile, compiled languages.
Creating a compiler is not an impossible task since, in the end, a compiler is just software. Many constraints apply when creating a compiler, because it is the tool that locates bugs in other programs. There should be little or no scope for error in the compiler itself if we want people to use it. The JavaScript compiler is distinctive because of its “Just in Time” approach, which is why it was readily accepted by the software developer community.
By Vanshika Singolia