Commit d726cf4

docs: Finalize the introduction section for project report; Remove duplicated doc files under the root directory. (#20)
1 parent 9343e9d commit d726cf4

2 files changed: +16 -172 lines changed

README.md

Lines changed: 16 additions & 27 deletions
@@ -1,22 +1,20 @@
 # log-surgeon: A performant log parsing library
-Project Link: [Homepage][home-page]
 
-Video Demo Link: [Video Demo][video-demo]
-
----
-
-## Team Members
-- Student 1: Siwei (Louis) He, 1004220960, [email protected]
-- Student 2: Zhihao Lin, 1005071299, [email protected]
-
----
+[![Build status][badge-build-status]][project-gh-action]
+![Apache Lisensed][badge-apache]
 
 ## Introduction
 
-`log-surgeon` is a library for high-performance parsing of unstructured text
-logs implemented using Rust.
+`log-surgeon` is a library for high-performance parsing of unstructured text logs implemented using
+Rust. This project originated as the course project for
+[ECE1724F1 Performant Software Systems with Rust][ece1724], offered in 2024 at the University of
+Toronto.
 
----
+- Project Link: [Homepage][home-page]
+- Video Demo Link: [Video Demo][video-demo]
+- Team Members
+- Student 1: [Siwei (Louis) He][github-siwei], 1004220960, [email protected]
+- Student 2: [Zhihao Lin][github-zhihao], 1005071299, [email protected]
 
 ## Motivation
 Today's large technology companies generate logs the magnitude of petabytes per day as a critical
@@ -83,8 +81,6 @@ Our project, [log-surgeon-rust][home-page], is designed to improve CLP's parsing
 safe and high-performant regular expression engine specialized for unstructured logs, allowing users
 to extract named variables from raw text log messages efficiently according to user-defined schema.
 
----
-
 ## Objective
 The objective of this project is to fill the gap explained in the motivation above in the current
 Rust ecosystem. We shall deliver a high-performance and memory-safe log parsing library using Rust.
@@ -107,8 +103,6 @@ The log parsing interface will provide user programmatic APIs to:
 - Feed input log stream to the log parser
 - Retrieve outputs (parsed log events structured according to the user schema) from the parser
 
----
-
 ## Features
 As a log parsing library, log-surgeon provides the following features that differ from general text
 parsers:
@@ -133,13 +127,9 @@ feature provides APIs for:
 - Merging multiple NFAs into a single DFA.
 - Simulating a DFA with character streams or strings.
 
----
-
 ## Architecture Overview
 ![log-surgeon-arch-overview](docs/src/overall-arch-diagram.png)
 
----
-
 ## User's Guide
 log-surgeon is a Rust library for high-performance parsing of unstructured text logs. It is being
 shipped as a Rust crate and can be included in your Rust project by adding the following line to
@@ -184,8 +174,6 @@ The example uses the repository relative path to include the dependency. If you
 library in your project, you can follow the user's guide above where you should specify the git URL
 to obtain the latest version of the library.
 
----
-
 ## Contributions by each team member
 1. **[Louis][github-siwei]**
 - Implemented the draft version of the AST-to-NFA conversion.
@@ -202,8 +190,6 @@ to obtain the latest version of the library.
 Both members contributed to the overall architecture, unit testing, integration testing, and library
 finalization. Both members reviewed the other's implementation through GitHub's Pull Request.
 
----
-
 ## Lessons learned and concluding remarks
 This project provided us with an excellent opportunity to learn about the Rust programming language.
 We gained hands-on experience with Rust's borrowing system, which helped us write safe and reliable
@@ -226,17 +212,20 @@ The future work:
 - Implement [tagged-DFA][wiki-tagged-dfa] to support more powerful variable extraction.
 - Optimize the lexer to emit tokens based on buffer views, reducing internal string copying.
 
-
+[badge-apache]: https://img.shields.io/badge/license-APACHE-blue.svg
+[badge-build-status]: https://github.com/Toplogic-Inc/log-surgeon-rust/workflows/CI/badge.svg
 [clp-paper]: https://www.usenix.org/system/files/osdi21-rodrigues.pdf
 [clp-s-paper]: https://www.usenix.org/system/files/osdi24-wang-rui.pdf
+[ece1724]: https://www.eecg.toronto.edu/~bli/ece1724
 [github-clp]: https://github.com/y-scope/clp
 [github-siwei]: https://github.com/Louis-He
 [github-zhihao]: https://github.com/LinZhihao-723
 [hadoop-logs]: https://zenodo.org/records/7114847
 [home-page]: https://github.com/Toplogic-Inc/log-surgeon-rust
 [mongodb-logs]: https://zenodo.org/records/11075361
+[project-gh-action]: https://github.com/Toplogic-Inc/log-surgeon-rust/actions
 [regex-syntax-ast-Ast]: https://docs.rs/regex-syntax/latest/regex_syntax/ast/enum.Ast.html
 [wiki-dfa]: https://en.wikipedia.org/wiki/Deterministic_finite_automaton
 [wiki-nfa]: https://en.wikipedia.org/wiki/Nondeterministic_finite_automaton
 [wiki-tagged-dfa]: https://en.wikipedia.org/wiki/Tagged_Deterministic_Finite_Automaton
-[video-demo]: TODO
+[video-demo]: https://www.youtube.com/watch?v=0mJwwBKXU7A&ab_channel=SiweiHe

proposal.md

Lines changed: 0 additions & 145 deletions
This file was deleted.
