AhmedBineuro/Toke

Toke

What is Toke?

Toke is a small, simple tokenizer for text parsing. The entire tokenizer lives in a single header file that you can include directly in your project.

What can you do with Toke?

With Toke, you can tokenize special characters and strings by adding them to a context: IncludeToken registers a name together with its character or string, and IncludeFormatToken registers a name together with a validation function (more on format tokens in What's new with Toke?). Toke automatically adds a TokenType called NULLTOK, with a token of '\0', that covers any remaining text or characters that match none of your tokens. Toke also consumes whitespace and newline characters automatically, while keeping track of the line each token came from.
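
To make the behavior described above concrete, here is a minimal, self-contained sketch of that kind of tokenizing loop. It is not Toke's implementation: SketchRule, SketchToken, match_rule, and sketch_tokenize are hypothetical names, and preferring the longest matching rule is an assumption made for the sketch.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are NOT Toke's real types. */
typedef struct {
    const char *name;   /* e.g. "OPEN TAG" */
    const char *text;   /* e.g. "<"        */
} SketchRule;

typedef struct {
    const char *type;   /* rule name, or "NULLTOK" for leftover text */
    const char *start;  /* points into the input buffer */
    size_t      length; /* number of characters in the token */
    int         line;   /* 1-based line on which the token starts */
} SketchToken;

/* Return the index of the longest rule matching at p, or -1 if none matches.
   Longest-match is an assumption of this sketch. */
static int match_rule(const SketchRule *rules, size_t n_rules, const char *p) {
    int best = -1;
    size_t best_len = 0;
    for (size_t i = 0; i < n_rules; i++) {
        size_t len = strlen(rules[i].text);
        if (len > best_len && strncmp(p, rules[i].text, len) == 0) {
            best = (int)i;
            best_len = len;
        }
    }
    return best;
}

/* Tokenize src into out (at most max_out tokens) and return the token count. */
size_t sketch_tokenize(const SketchRule *rules, size_t n_rules,
                       const char *src, SketchToken *out, size_t max_out) {
    size_t count = 0;
    int line = 1;
    const char *p = src;
    while (*p != '\0' && count < max_out) {
        /* Whitespace is consumed, but newlines still advance the line count. */
        if (isspace((unsigned char)*p)) {
            if (*p == '\n') line++;
            p++;
            continue;
        }
        int r = match_rule(rules, n_rules, p);
        if (r >= 0) {
            size_t len = strlen(rules[r].text);
            out[count++] = (SketchToken){ rules[r].name, p, len, line };
            p += len;
            continue;
        }
        /* No rule matched: gather text up to whitespace or the next rule match. */
        const char *start = p;
        while (*p != '\0' && !isspace((unsigned char)*p) &&
               match_rule(rules, n_rules, p) < 0) {
            p++;
        }
        out[count++] = (SketchToken){ "NULLTOK", start, (size_t)(p - start), line };
    }
    return count;
}
```

With rules for "<", ">", and "h1", the input "<h1>\nHello world" yields five tokens in this sketch: three registered tokens on line 1, followed by two NULLTOK tokens ("Hello" and "world") on line 2.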

What's new with Toke?

  • Format based tokens!
    • With format based tokens, you can run a post-processing pass that assigns a type to any remaining NULLTOK tokens
      • This means that the format matching happens AFTER the whole file has been parsed
    • To add a format based token, you have to provide a format validation function that takes one T_string variable and returns a boolean value
  • Format validation functions
    • Along with the format based tokens, there are a couple of pre-implemented validation functions: IsInteger, IsHex, IsFloat, and IsAlphabetic
    • These functions can be used directly for your custom tokens, or read as examples of how format validation works
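
The post-processing idea can be sketched as follows. This is an illustration under stated assumptions, not Toke's code: the validators here take a plain C string because the layout of T_string is not shown in this README, the 0x prefix rule for hex is a guess at a reasonable convention, and every name prefixed with sketch_ or Sketch is hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical validator signature. Toke's real validators take a T_string;
   a plain C string is used here as a stand-in. */
typedef bool (*SketchValidator)(const char *text);

/* True for strings such as "0x1F": a 0x/0X prefix plus at least one hex digit.
   The prefix requirement is an assumption of this sketch. */
bool sketch_is_hex(const char *text) {
    if (text[0] != '0' || (text[1] != 'x' && text[1] != 'X')) return false;
    if (text[2] == '\0') return false;
    for (const char *p = text + 2; *p != '\0'; p++) {
        if (!isxdigit((unsigned char)*p)) return false;
    }
    return true;
}

/* True for an optional sign followed by one or more decimal digits. */
bool sketch_is_integer(const char *text) {
    const char *p = text;
    if (*p == '+' || *p == '-') p++;
    if (*p == '\0') return false;
    for (; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p)) return false;
    }
    return true;
}

typedef struct {
    const char     *name;     /* type name to assign, e.g. "Hex" */
    SketchValidator accepts;  /* returns true if the text fits the format */
} SketchFormat;

typedef struct {
    const char *type;  /* current type name; "NULLTOK" means still untyped */
    const char *text;  /* NUL-terminated token text */
} SketchTypedToken;

/* Post-processing pass: give each leftover "NULLTOK" token the name of the
   first format whose validator accepts it. Returns how many were retyped. */
size_t sketch_apply_formats(SketchTypedToken *tokens, size_t n_tokens,
                            const SketchFormat *formats, size_t n_formats) {
    size_t retyped = 0;
    for (size_t i = 0; i < n_tokens; i++) {
        if (strcmp(tokens[i].type, "NULLTOK") != 0) continue;
        for (size_t f = 0; f < n_formats; f++) {
            if (formats[f].accepts(tokens[i].text)) {
                tokens[i].type = formats[f].name;
                retyped++;
                break;
            }
        }
    }
    return retyped;
}
```

Because the pass only looks at tokens that are still NULLTOK, tokens already matched by an exact character or string keep their type, which mirrors the "AFTER parsing" ordering described above.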

How to use Toke?

Toke can be used in 5 simple steps:

  1. Create the context
    Context* CTX=CreateContext();
  2. Include your tokens
    IncludeToken(CTX,"OPEN TAG","<");
    IncludeToken(CTX,"CLOSE TAG",">");
    IncludeToken(CTX,"SLASH","/");
    IncludeToken(CTX,"ASSIGNMENT","=");
    IncludeToken(CTX,"QUOTATION","\"");
    IncludeToken(CTX,"SEMICOLON",";");
    //String match example
    IncludeToken(CTX,"HEADER 1","h1");
  3. Include any format based tokens
    // IsHex is a function with the following signature
    // bool IsHex(T_string str)
    IncludeFormatToken(CTX,"Hex",IsHex);
  4. Tokenize the file!
    TokenArray* ta=TokenizeFile(CTX,"./index.html");
  5. Free the context once you're done!
    FreeContext(CTX);

For a more "real life" application, check the example in example/example.c, where I parse an HTML file.

About

A simple and small single-header tokenizing library for any purpose!
