
Commit 8df5bc3

committed
Update README.md
1 parent 8ad2fe9 commit 8df5bc3

1 file changed

+13
-3
lines changed

backend/README.md

Lines changed: 13 additions & 3 deletions
@@ -1,11 +1,21 @@
11
# Congress.Dev Backend
22
This contains the two components of the backend
3-
- API
3+
- [API](https://api.congress.dev/ui)
44
- Legislation Parser
55

66
## API
7-
The API is an OpenAPI based system for interacting with the Postgres database
7+
The API is an OpenAPI-based system for interacting with the Postgres database.
88

99

1010
## Legislation Parser
11-
The parser reads the XML from various .gov sites and creates a representation of the USCode in the database, which it then uses to reference when parsing the legislation with regex.
11+
This parser is the core of the project and covers a few different processes.
12+
13+
### USCode Parsing
14+
The [USCode](https://uscode.house.gov/detailed_guide.xhtml) is a collection of all codified legislation enacted by Congress. It is available in a few formats, but the one we consume is XML. Each '[Release Point](https://uscode.house.gov/download/priorreleasepoints.htm)' corresponds to the codification of one or more pieces of legislation and consists of over 600MB of XML.
15+
16+
We convert the XML into a tree structure in the database for the next step. Currently we parse each Release Point in full, so you can expect the database to grow by 300-400 MB per month as new Release Points are published.
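
As a rough illustration of the XML-to-tree conversion, the sketch below flattens a nested document into rows that each point at their parent row, which is one common way to store a tree in a relational database. The sample XML, the `identifier` attribute handling, the row fields, and the `flatten` helper are all assumptions made for this example (the real files use namespaces and a much richer schema); this is not the parser's actual code.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for a Release Point file.
SAMPLE = """<title identifier="/us/usc/t1">
  <chapter identifier="/us/usc/t1/ch1">
    <section identifier="/us/usc/t1/s1"><heading>Example heading A</heading></section>
    <section identifier="/us/usc/t1/s2"><heading>Example heading B</heading></section>
  </chapter>
</title>"""


def flatten(xml_text):
    """Return one dict per identified element, linked to its parent by id."""
    rows = []

    def walk(elem, parent_id):
        ident = elem.get("identifier")
        if ident is not None:
            row_id = len(rows) + 1
            rows.append({
                "id": row_id,
                "parent_id": parent_id,
                "identifier": ident,
                "tag": elem.tag,
                # findtext only looks at direct children, so containers get None.
                "heading": elem.findtext("heading"),
            })
            parent_id = row_id
        for child in elem:
            walk(child, parent_id)

    walk(ET.fromstring(xml_text), None)
    return rows


rows = flatten(SAMPLE)
for row in rows:
    print(row["id"], row["parent_id"], row["identifier"])
```

Each row could then be inserted into a table with a self-referencing `parent_id` column, so a section can be looked up by identifier and its ancestors walked upward.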
17+
18+
### Legislation Parsing
19+
Once the USCode is parsed, we can [download](https://www.govinfo.gov/bulkdata/BILLS/116/2/hr/BILLS-116-2-hr.zip) the latest copy of legislation for both the House and the Senate. Using a collection of handwritten regular expressions, we parse the legislation text for the 'actions' that each bill proposes, then attempt to 'apply' those actions to predict what the USCode would look like once the bill is codified. The difference between the current version and our predicted version is then displayed in the frontend to highlight changes.
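
The parse, apply, and diff steps described above might look something like the following minimal sketch. It handles a single hypothetical 'strike and insert' pattern; the regex, the action dictionary shape, the sample bill text, and every function name here are assumptions for illustration, not the project's real rules.

```python
import difflib
import re

# One hypothetical pattern; a real parser would need many more.
STRIKE_INSERT = re.compile(
    r'by striking "(?P<old>[^"]+)" and inserting "(?P<new>[^"]+)"'
)


def find_actions(bill_text):
    """Extract 'strike and insert' actions from the bill text."""
    return [
        {"action": "strike_insert", "old": m["old"], "new": m["new"]}
        for m in STRIKE_INSERT.finditer(bill_text)
    ]


def apply_actions(section_text, actions):
    """Apply each action to the current text to get a predicted text."""
    for action in actions:
        section_text = section_text.replace(action["old"], action["new"], 1)
    return section_text


def changed_lines(current, predicted):
    """Keep only the removed ('- ') and added ('+ ') lines of the diff."""
    diff = difflib.ndiff(current.splitlines(), predicted.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


current = "The fee shall be $10 per application."
bill = 'Section 101 is amended by striking "$10" and inserting "$25".'

actions = find_actions(bill)
predicted = apply_actions(current, actions)
print(changed_lines(current, predicted))
```

Keeping extraction, application, and diffing as separate functions means a regex that fails to match only produces a missing action, rather than corrupting the predicted text.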
20+
21+
Currently we download the .zip bundle for both chambers every time. Recent developments in the API, however, suggest we might be able to request a list of bills and download only each day's new legislation once the year's earlier legislation has been bootstrapped.
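
A small sketch of both approaches follows. The URL pattern is inferred from the single download link above, and the `"s"` code for Senate bills is an assumption; `bundle_url`, `new_bills`, and the bill identifiers are hypothetical names used only for illustration.

```python
BULK_BASE = "https://www.govinfo.gov/bulkdata/BILLS"


def bundle_url(congress, session, bill_type):
    """Build the bulk .zip URL, following the pattern of the link above."""
    return (
        f"{BULK_BASE}/{congress}/{session}/{bill_type}/"
        f"BILLS-{congress}-{session}-{bill_type}.zip"
    )


def new_bills(available, already_parsed):
    """Return the listed bills that have not been parsed yet, in order."""
    seen = set(already_parsed)
    return [bill for bill in available if bill not in seen]


# Current approach: fetch the whole bundle for each chamber.
# "s" for the Senate is an assumption, not confirmed by the README.
for bill_type in ("hr", "s"):
    print(bundle_url(116, 2, bill_type))

# Possible incremental approach: only fetch what has not been seen.
print(new_bills(["hr1", "hr2", "hr3"], ["hr1", "hr2"]))
```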

0 commit comments
