refactor: Rework docstring parser #996
base: master
Conversation
Force-pushed from ffddcbe to 05dd1fe
The extractor needed some fine-tuning to only pick up top-level docstrings instead of (possibly wrong) locally scoped docstrings outside the reach of library users. The function signature parser should now be a tad more efficient: it performs no operations on the whole array that gets passed in beyond those strictly required by the elements (lines) that are actually part of the function signature. Prior to this, there were a bunch of `enumerate` and `join` calls that wouldn't exactly be efficient for, say, arrays containing thousands of lines of source code.
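A minimal sketch of that idea (hypothetical names, not the PR's actual code), assuming the parser receives the source file as a list of lines: scan forward from the signature's first line and stop as soon as the parenthesis depth returns to zero, so the rest of the file is never touched.

```python
def signature_lines(lines, start):
    """Collect only the lines that belong to one function signature.

    Scans from `start` until the parenthesis depth returns to zero,
    so no work is done on the remainder of a possibly huge file.
    """
    depth = 0
    sig = []
    for line in lines[start:]:
        sig.append(line)
        depth += line.count("(") - line.count(")")
        if depth == 0:
            break
    return sig


src = [
    "#let add(",
    "  a,",
    "  b: 2,",
    ") = a + b",
    "// thousands more lines ...",
]
print(signature_lines(src, 0))
```

This avoids whole-array `enumerate`/`join` passes entirely; the cost is proportional to the signature's length, not the file's.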
Force-pushed from 05dd1fe to 540ec7a
What exactly do you mean?
I mean the `[script]` recipe attribute and the `script-interpreter` setting. For both of these, Just provides nice built-ins to require a program to exist in the user's `PATH`. Granted, transitioning from an independent Python script to a script recipe in Just would be a bit of work, but it should reduce the cognitive load on the script by not expecting it to work independently of `just`. An example of this that I can quickly get my hands on is a justfile I have at hand:

```just
set unstable := true
set shell := ["fish", "-c"]
set script-interpreter := ["uv", "run", "--script"]
set quiet := true

alias c := clean
alias d := doc
alias cc := compile
alias r := run
alias dbg := debug

src_dir := if path_exists(justfile_directory() / "src") == "true" { justfile_directory() / "src" } else { error("src directory not found") }
build_dir := justfile_directory() / "build"
target_out := build_dir / "final_program"
lsd := require("lsd")
src_files := prepend(src_dir / "", replace(shell(lsd + " --icon=never -1 " + src_dir), "\n", ' '))
obj_files := replace(replace_regex(src_files, '([[:alpha:]]+)\.cc', '${1}.o'), src_dir, build_dir)
clangd_flags := if path_exists(justfile_directory() / "compile_flags.txt") == "true" { justfile_directory() / "compile_flags.txt" } else { error("compile_flags not found") }
cxx := require("clang++")
cxxflags := trim(replace(replace_regex(read(clangd_flags), '(?m)^-I(.*)?\n', ''), "\n", ' '))
ldflags := trim(env("LDFLAGS", "") + " -pie")
cppflags := env("CPPFLAGS", "") + " " + trim(replace_regex(read(clangd_flags), '(?m)^-[^I](.*)', ''))
doxygen := require("doxygen")
doc_dir := if path_exists(justfile_directory() / "doc") == "true" { justfile_directory() / "doc" } else { error("doc directory not found") }
doxyfile := if path_exists(doc_dir / "configDoxygen.cfg") == "true" { doc_dir / "configDoxygen.cfg" } else { error("doxyfile not found") }
lldb := require("lldb")

[private]
default:
    just --list --unsorted --justfile {{ justfile() }}

# generates doxygen documentation
[macos]
doc:
    {{ doxygen }} {{ doxyfile }}
    cd {{ doc_dir / "html" }} && pwd | pbcopy

# cleans up build artifacts and older docs
clean:
    rm -rf {{ build_dir }}
    rm -rf {{ doc_dir / "html" }}

# build current project (non-incrementally)
[macos]
compile: _compile
    {{ cxx }} \
        {{ obj_files }} \
        -o {{ target_out }} \
        {{ ldflags }}

[script]
_compile: clean
    # /// script
    # dependencies = ["sh"]
    # ///
    import sh

    cxx = sh.Command({{ quote(cxx) }})
    cppflags = [{{ replace(quote(cppflags), " ", "', '") }}]
    cxxflags = [{{ replace(quote(cxxflags), " ", "', '") }}]
    input = [{{ replace(quote(src_files), " ", "', '") }}]
    output = [{{ replace(quote(obj_files), " ", "', '") }}]

    sh.mkdir("-p", {{ quote(build_dir) }})
    for i, file in enumerate(input):
        cxx(*cppflags, *cxxflags, c=file, o=output[i])

# run the thing
[no-quiet]
run *args: compile
    {{ target_out }} {{ args }}

# debug the thing
[no-quiet]
debug: compile
    {{ lldb }} {{ target_out }}
```
Hm, I have no opinion on this. But I would like to keep tools separate – the script should work without `just`.
Either way, there's still pending work on this PR before moving on to anything related to the web documentation. I'll see then if I can make some changes to the script.
Following the plan in the PR this branch is part of, the function signature parser rework is done. The signature now gets parsed in a single pass, instead of having to separately consider the function and parameter spans and then parse the parameter span. The efficiency gains from the prior commit are kept, so no operations are performed on the whole array; only the elements of the array holding the source lines of the function are parsed.

There is, though, a small performance cost: a custom regex is built at runtime so that the whitespace indentation of some parameters' default values is accurately represented in the final output. Because the project doesn't use an autoformatter, if some lines are indented beyond a single multiple of 2 (per the .editorconfig file), the parser will correctly de-indent the contents of a named argument (provided it is not a string or Typst content type) by however much whitespace was detected at the start of the parameter name. The parameter parser now also correctly handles function default arguments.
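The de-indentation step could look roughly like this (a hedged sketch with a hypothetical helper name, not the PR's code): compile a small regex at runtime from the parameter name's indentation width, then strip exactly that much leading whitespace from every line of the default value.

```python
import re


def deindent_default(param_line: str, default_value: str) -> str:
    """De-indent a multi-line default value by however much whitespace
    was detected at the start of the parameter name.

    The regex is compiled at runtime for exactly that width, which is
    the minor performance cost mentioned above.
    """
    indent = len(param_line) - len(param_line.lstrip(" "))
    if indent == 0:
        return default_value
    # strip exactly `indent` leading spaces from every line
    pattern = re.compile(rf"(?m)^ {{{indent}}}")
    return pattern.sub("", default_value)


# the parameter name is indented by 4 spaces, so each line of its
# default value loses 4 leading spaces
print(deindent_default("    config: (", "    (\n      a: 1,\n    )"))
```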
I cherry-picked your commit onto master.
I just pushed the changes I've been making throughout the last few weeks to the parser. These include merging the function signature parser and the argument parser into a single extraction step. I also introduced support for more Typst type values, in case some future function uses one as a named argument's default. The docstring parser proper is mostly done; unlike the above, it's not yet working seamlessly with the rest of the pipeline. I've also experimented with some diagnostics in the docstring parsing process, but I ended up leaving them out. The highlight of the refactored docstring parser is that it now also supports multiline parameter documentation. To recap:
The parser is mostly done. This commit will be amended/fixed up once the docstring parser is completely done, so more details can be found in the accompanying PR.
Force-pushed from 454c568 to 8a2e9fb
The extractor needed some fine-tuning to only pick up top-level docstrings instead of (possibly wrong) locally scoped docstrings outside the reach of library users.

The function signature parser should now be a tad more efficient. It performs no operations on the whole array that gets passed in beyond those strictly required by the elements (lines) that are actually part of the function signature. Prior to this, there were a bunch of `enumerate` and `join` calls that wouldn't exactly be efficient for, say, arrays containing thousands of lines of source code.

Further work will continue in the function parameter parsing, by possibly modifying the parameters of the parser itself, such that without modifying the resulting `typst query` output, we avoid performing two passes through the argument list: one in the initial function signature parsing, and another in the parameter list parsing.

Once work on the function signature is done, the next step will be to fix the actual docstring parser, such that it picks up on newlines in function parameter documentation. An example of a docstring that I expect the parser to work through nicely is given in #986. Only after this is done will I try to move on to seeing what can be done with the type syntax incompatibilities between the manual and the web documentation.
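For the newline handling, one way to sketch it (with a hypothetical docstring format, not necessarily the project's actual one) is to attach indented continuation lines to the parameter entry directly above them:

```python
def group_param_docs(doc_lines):
    """Group docstring lines into per-parameter entries, keeping
    indented continuation lines attached to the parameter above.

    Assumed (hypothetical) format: '- name: text', with continuation
    lines indented by two spaces.
    """
    params = {}
    current = None
    for line in doc_lines:
        if line.startswith("- "):
            # a new parameter entry begins
            name, _, text = line[2:].partition(":")
            current = name.strip()
            params[current] = [text.strip()]
        elif current is not None and line.startswith("  "):
            # a continuation line of the previous parameter's docs
            params[current].append(line.strip())
    return {name: " ".join(parts) for name, parts in params.items()}


docs = [
    "- radius: the circle's radius,",
    "  in absolute units",
    "- fill: paint to fill with",
]
print(group_param_docs(docs))
```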
I also wanted to ask whether it's a good idea to be running the Python script for HTML generation directly and not through an isolated environment, by possibly using the nice integration `just` has with `uv` for script recipes [1].