2. Implement a Lexical Analyzer for a given program using Lex Tool.

 Let's assume we have a simple programming language with keywords `if`, `else`, `while`, and identifiers (variable names) consisting of letters and digits. We want to tokenize a given program written in this language.


Here's the Lex specification file (`lexer.l`) along with explanations:


```lex

%{

#include <stdio.h>

%}


%option noyywrap


%%

if      { printf("IF\n"); }

else    { printf("ELSE\n"); }

while   { printf("WHILE\n"); }

[a-zA-Z][a-zA-Z0-9]* { printf("IDENTIFIER: %s\n", yytext); }

[ \t\n]  ;   /* skip whitespace */

.        { printf("UNKNOWN CHARACTER: %s\n", yytext); }

%%


int main() {

    yylex();

    return 0;

}

```


Explanation of each section:


- `%{ ... %}`: This section is used for including any necessary header files and declaring global variables or definitions. In this case, we include `stdio.h` for printing messages.


- `%option noyywrap`: By default, when the scanner reaches end of input it calls `yywrap()` to ask whether more input follows. This option tells Lex not to call `yywrap()`, so we don't have to define it ourselves or link against the Lex library just to get the default.


- `%%`: This delimiter appears twice: the first separates the definitions section from the rules, and the second separates the rules from the user code.


- `if`, `else`, `while`: These are the keywords of our language. Each rule pairs a pattern with an action to run when it matches; here, we print the corresponding token name. These rules come before the identifier rule, so when a keyword and the identifier pattern match the same text, the keyword rule wins.


- `[a-zA-Z][a-zA-Z0-9]*`: This regular expression matches identifiers. It starts with a letter (uppercase or lowercase) and can be followed by letters or digits. When an identifier is matched, we print its value using `yytext`.


- `[ \t\n]`: This regular expression matches whitespace characters (spaces, tabs, newlines). We skip these characters.


- `.`: This regular expression matches any character that didn't match any of the previous patterns. When an unknown character is encountered, we print its value using `yytext`.


- The final section defines `main()`, which starts the scanner by calling `yylex()`. The scanner reads from standard input and applies the rules until end of input.


To compile and run the program:


1. Save the Lex specification in a file named `lexer.l`.

2. Open a terminal and navigate to the directory containing `lexer.l`.

3. Run the following commands:

   - `lex lexer.l` (translates the Lex specification into C, producing `lex.yy.c`; `flex lexer.l` also works)

   - `gcc lex.yy.c -o lexer` (compiles the generated scanner; the `-ll` library flag, or `-lfl` with flex, is unnecessary here because we supply our own `main()` and use `%option noyywrap`)

   - `./lexer` (runs the program)


Now you can type input text (your program) into the running lexer, and it will tokenize the input and print the corresponding tokens. Press Ctrl+D to signal end of input and exit.
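For example, a session might look like this (the exact output depends on what you type):

```
$ ./lexer
if count1 else x2 @
IF
IDENTIFIER: count1
ELSE
IDENTIFIER: x2
UNKNOWN CHARACTER: @
```

Whitespace between the lexemes is consumed silently by the `[ \t\n]` rule, and the stray `@` falls through to the catch-all `.` rule.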
