Commit e11c68c

Improve names and types in Rust stdlib scraper…
Two main fixes:

1. Names: all pages except for modules' index pages ignored what module they were from, and were just prepended with `std::`. This meant there were 13 pages named `std::Iter`, about structs named `Iter` from different modules. It also meant that things outside modules, e.g. primitive types, were prefixed with `std::`, naming the page on `bool` as `std::bool`, although it can't be referenced that way in code. This also means that there are two pages named `std::char` - one for the module, and one for the primitive type `char`.

   This prefixes everything in a module with that module's path, and does not prefix primitives. It also includes submodules in the path. For example:

   std::fn → fn
   std::Iter → std::Option::Iter
   std::MetadataExt → std::os::linux::fs::MetadataExt

2. Types: almost everything was filed in `std`, with the exception of modules' index pages and primitive types. This meant there were over 30,000 pages in the `std` type, and many types for modules with only one page in them.

   This creates types for each module which include all submodules, and files anything not in a module, e.g. primitive types, in `std`. For example:

   std::bool / std::bool → std / bool
   std / std::Iter → std::option / std::option::Option::Iter

1 parent 18a79e1 commit e11c68c

1 file changed

+15
-8
lines changed

lib/docs/filters/rust/entries.rb

Lines changed: 15 additions & 8 deletions
@@ -22,7 +22,15 @@ def get_name
         else
           at_css('main h1').at_css('button')&.remove
           name = at_css('main h1').content.remove(/\A.+\s/).remove('⎘')
-          mod = slug.split('/').first
+          path = slug.split('/')
+          if path.length == 2
+            # Anything in the standard library but not in a `std::*` module is
+            # globally available, not `use`d from the `std` crate, so we don't
+            # prepend `std::` to their name.
+            return name
+          end
+          path.pop if path.last == 'index'
+          mod = path[0..-2].join('::')
           name.prepend("#{mod}::") unless name.start_with?(mod)
           name
         end
@@ -38,13 +46,12 @@ def get_type
         elsif slug.start_with?('error_codes')
           'Compiler Errors'
         else
-          path = name.split('::')
-          heading = at_css('main h1').content.strip
-          if path.length > 2 || (path.length == 2 && (heading.start_with?('Module') || heading.start_with?('Primitive')))
-            path[0..1].join('::')
-          else
-            path[0]
-          end
+          path = slug.split('/')
+          # Discard the filename, and use the first two path components as the
+          # type, or one if there is only one. This means anything in a module
+          # `std::foo` or submodule `std::foo::bar` gets type `std::foo`, and
+          # things not in modules, e.g. primitive types, get type `std`.
+          path[0..-2][0..1].join('::')
         end
       end
 
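
The slug handling in both hunks can be tried outside the scraper. The following is a minimal sketch, not the filter itself: `qualified_name` and `type_for` are hypothetical helper names, the slugs are assumed examples modelled on the rustdoc URL layout, and the short name that the real filter extracts from the page heading is passed in as a plain string.

```ruby
# Hypothetical stand-in for the `else` branch of get_name.
# `slug` is the page path; `name` is the name already taken from the heading.
def qualified_name(slug, name)
  path = slug.split('/')
  # Two components (e.g. "std/primitive.bool") means the item is not inside
  # a `std::*` module, so the name is returned without a prefix.
  return name if path.length == 2

  # A module's index page lives at ".../index"; drop that component first.
  path.pop if path.last == 'index'
  # Everything except the last remaining component forms the module path.
  mod = path[0..-2].join('::')
  name.start_with?(mod) ? name : "#{mod}::#{name}"
end

# Hypothetical stand-in for the `else` branch of get_type.
def type_for(slug)
  # Drop the filename, then keep at most the first two components.
  slug.split('/')[0..-2][0..1].join('::')
end

# Assumed example slugs, chosen to match the cases in the commit message.
[
  ['std/primitive.bool', 'bool'],
  ['std/option/struct.Iter', 'Iter'],
  ['std/os/linux/fs/trait.MetadataExt', 'MetadataExt']
].each do |slug, name|
  puts "#{slug}: name=#{qualified_name(slug, name)} type=#{type_for(slug)}"
end
```

Under these assumptions, an item nested several modules deep keeps its full path in the name, while its type is capped at the top-level module (`std::os` for the `MetadataExt` slug).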
