@jserv
Collaborator

@jserv jserv commented Aug 6, 2025

This commit modernizes lexer token recognition by using the existing hashmap infrastructure and arena allocation system for performance and consistency, reducing average-case token lookup from O(n) to O(1).

Summary by Bito

This pull request replaces the lexer's chains of strcmp calls with hashmap lookups, reducing average-case token lookup from O(n) to O(1). It also introduces a cleanup function for the lexer hashmaps so that their memory is managed more consistently.

@jserv jserv requested review from ChAoSUnItY and DrXiao August 6, 2025 04:09
@jserv
Collaborator Author

jserv commented Aug 6, 2025

Due to an internal error in the parser, I cannot write more elegant code like the following:

// Create preprocessor directive lookup table.
// Entries must stay sorted in strcmp() order, otherwise the binary
// search below can miss directives that are present in the table.
typedef struct {
    const char *name;
    token_t token;
} directive_entry_t;

static const directive_entry_t directives[] = {
    {"#define", T_cppd_define}, {"#elif", T_cppd_elif},
    {"#else", T_cppd_else}, {"#endif", T_cppd_endif},
    {"#error", T_cppd_error}, {"#if", T_cppd_if},
    {"#ifdef", T_cppd_ifdef}, {"#ifndef", T_cppd_ifndef},
    {"#include", T_cppd_include}, {"#pragma", T_cppd_pragma},
    {"#undef", T_cppd_undef}, {NULL, T_start}
};

// Use binary search instead of linear strcmp chain
token_t lookup_directive(const char *token) {
    int low = 0, high = 10;  // Index of the last directive (sentinel excluded)
    while (low <= high) {
        int mid = low + (high - low) / 2;
        int cmp = strcmp(token, directives[mid].name);
        if (cmp == 0) return directives[mid].token;
        else if (cmp < 0) high = mid - 1;
        else low = mid + 1;
    }
    return T_identifier;  // Not a known directive
}

Instead, I have to insert each item into the hash map individually.
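
For readers unfamiliar with the approach, here is a minimal, self-contained sketch of what per-item registration into a hash map can look like. It is not the code merged in this PR and does not use the project's hashmap or arena API: the fixed-size open-addressing table, the FNV-1a hash, and the names `directive_put`, `directive_map_init`, and `lookup_directive_hashed` are all assumptions made for illustration, and the `token_t` enum is a hypothetical stand-in that only borrows the token names from the snippet above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical token kinds, named after those in the snippet above. */
typedef enum {
    T_identifier, T_cppd_include, T_cppd_define, T_cppd_undef,
    T_cppd_error, T_cppd_if, T_cppd_elif, T_cppd_ifdef,
    T_cppd_ifndef, T_cppd_else, T_cppd_endif, T_cppd_pragma
} token_t;

/* Assumed capacity: a power of two well above the 11 directives. */
#define DIRECTIVE_CAP 32

typedef struct {
    const char *key; /* NULL marks an empty slot */
    token_t token;
} directive_slot_t;

static directive_slot_t directive_map[DIRECTIVE_CAP];
static int directive_count;
static int directive_map_ready;

/* 32-bit FNV-1a string hash. */
static uint32_t hash_str(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char) *s;
        h *= 16777619u;
    }
    return h;
}

/* Insert one entry, resolving collisions by linear probing. */
static void directive_put(const char *key, token_t token)
{
    /* Keep at least one slot empty so that probing always terminates. */
    assert(directive_count < DIRECTIVE_CAP - 1);
    uint32_t i = hash_str(key) & (DIRECTIVE_CAP - 1);
    while (directive_map[i].key)
        i = (i + 1) & (DIRECTIVE_CAP - 1);
    directive_map[i].key = key;
    directive_map[i].token = token;
    directive_count++;
}

/* Each directive is registered with its own call, with no table
 * initializer involved. */
static void directive_map_init(void)
{
    directive_put("#include", T_cppd_include);
    directive_put("#define", T_cppd_define);
    directive_put("#undef", T_cppd_undef);
    directive_put("#error", T_cppd_error);
    directive_put("#if", T_cppd_if);
    directive_put("#elif", T_cppd_elif);
    directive_put("#ifdef", T_cppd_ifdef);
    directive_put("#ifndef", T_cppd_ifndef);
    directive_put("#else", T_cppd_else);
    directive_put("#endif", T_cppd_endif);
    directive_put("#pragma", T_cppd_pragma);
    directive_map_ready = 1;
}

/* Average-case O(1) lookup; unknown names fall back to T_identifier. */
token_t lookup_directive_hashed(const char *name)
{
    if (!directive_map_ready)
        directive_map_init();
    uint32_t i = hash_str(name) & (DIRECTIVE_CAP - 1);
    while (directive_map[i].key) {
        if (strcmp(directive_map[i].key, name) == 0)
            return directive_map[i].token;
        i = (i + 1) & (DIRECTIVE_CAP - 1);
    }
    return T_identifier;
}
```

In this sketch, `lookup_directive_hashed("#ifdef")` yields `T_cppd_ifdef`, and any string that was never registered yields `T_identifier`. Unlike the binary-search table, the order of the `directive_put` calls does not matter, which is one practical advantage of the more verbose per-item form.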

@jserv jserv requested review from fennecJ and vacantron August 6, 2025 04:33
@jserv jserv merged commit f0e6325 into master Aug 6, 2025
12 checks passed
@jserv jserv deleted the lexer-lookup branch August 6, 2025 13:52