
Commit 94fff19

Update package metadata, fix models codegen (#742)
1 parent 9b60a63 commit 94fff19

37 files changed: +435 additions, −251 deletions


docs/1.getting-started/1.installation.md

Lines changed: 5 additions & 7 deletions
@@ -37,16 +37,14 @@ Minimum hardware requirements are 256 MB RAM, 1 CPU core, and some disk space fo

The easiest way to install DipDup is to use our interactive installer script. It will ask you a few questions and install DipDup with all dependencies. Run the following command in your terminal:

- <!-- TODO: Ensure that installer script deploys and works as intended -->
-
```shell [Terminal]
curl -Lsf https://dipdup.io/install.py | python
```

- That's it! DipDip installed as a CLI application and available everywhere in a system. Now you can run `dipdup new` to spawn a new project from lots of ready-to-use templates and proceed to the next section: [Core concepts](2.core-concepts.md)
+ That's it! DipDup is installed as a CLI application and is available everywhere in the system. Now you can run `dipdup new` to spawn a new project from lots of ready-to-use templates and proceed to the next section: [Core concepts](2.core-concepts.md)

::banner{type="note"}
- Thi script performs some early checks, installs pipx for the current user, then installs dipdup and pdm with pipx. But it's always better to read the code before running it. To do so, `curl -Lsf https://dipdup.io/install.py | tee /tmp/install.py`, review the scripts, then run it with `python /tmp/install.py` to proceed.
+ This script performs some basic checks, installs pipx for the current user, then installs dipdup and pdm with pipx. But it's always better to read the code before running it. To do so, run `curl -Lsf https://dipdup.io/install.py | tee /tmp/install.py`, review the script, then run it with `python /tmp/install.py` to proceed.
::

### From scratch

@@ -63,7 +61,7 @@ Then use the snippets below to create a new Python project and add DipDup as a d

#### PDM (recommended)

- PDM is a very powerful package manager with a lot of features. It's a good choice for both beginners and advanced users.
+ PDM is a very powerful package manager with a lot of features. Also, it can run scripts from `pyproject.toml`, as npm does. It's a good choice for both beginners and advanced users.

```shell [Terminal]
pdm init --python ">=3.10,<3.11"

@@ -73,7 +71,7 @@ pdm venv activate

#### Poetry

- Poetry is another popular tool to manage Python projects. It's slower and less stable than PDM, but it's still a good choice.
+ Poetry is another popular tool to manage Python projects. It's less stable and often slower than PDM, but it's still a good choice.

```shell [Terminal]
poetry init --python ">=3.10,<3.11"

@@ -88,7 +86,7 @@ Finally, if you prefer to do everything manually, you can use pip. It's the most

```shell [Terminal]
python -m venv .venv
. .venv/bin/activate
- echo "dipdup" >> requirements.txt
+ echo "dipdup~={{ project.dipdup_version }}" >> requirements.txt
pip install -r requirements.txt -e .
```

docs/1.getting-started/2.core-concepts.md

Lines changed: 12 additions & 12 deletions
@@ -7,35 +7,35 @@ block: "getting_started"

# Core concepts

- Before proceeding to development, let's take a look at basic DipDup concepts.
-
- <!-- TODO: Better, but still very complex and boring -->
+ Before proceeding to development, let's take a look at basic DipDup concepts. The main terms are highlighted in _italics_ and will be discussed in detail in the following sections.

## Big picture

- DipDup is an SDK for building custom backends for decentralized applications, or, indexers. DipDup indexers are off-chain services that aggregate blockchain data from various sources and store it in a database.
+ DipDup is a _Python SDK_ for building custom backends for decentralized applications, or _indexers_. DipDup indexers are off-chain services that aggregate blockchain data from various sources and store it in a database.
+
+ Each indexer consists of a _YAML config_ file and a _Python package_ with models, handlers, and other code. The configuration file describes what contracts to index, what data to extract from them, and where to store the result. Powerful configuration features like templates, environment variable substitution, and merging multiple files let you make your indexer completely declarative. If you're coming from The Graph, the syntax is somewhat similar to Subgraph manifests.

- Each indexer consists of a YAML configuration file and a Python package with models, handlers, and other code. The configuration file describes the inventory of the indexer, i.e., what contracts to index and what data to extract from them. It supports templates, and environment variables, and uses a syntax similar to Subgraph manifest files. The Python package contains models, callbacks and queries. Models describe the domain-specific data structures you want to store in the database. Callbacks implement the business logic, i.e., how to convert blockchain data to your models. Other files in the package are optional and can be used to extend DipDup functionality.
+ An _index_ is a set of contracts and rules for processing them as a single entity. Your config can contain more than one index, but they are processed in parallel and cannot share data as execution order is not guaranteed.

- As a result, you get a service responsible for filling the database with indexed data. Then you can use it to build a custom API backend or integrate with existing ones. DipDup provides Hasura GraphQL Engine integration to expose indexed data via REST and GraphQL with little to no effort, but you can also use other API engines like PostgREST or develop one in-house.
+ The Python package contains ORM models, callbacks, typeclasses, scripts, and queries. Models describe the domain-specific data structures you want to store in the database. Callbacks implement the business logic, i.e., how to convert blockchain data to your models. Other files in the package are optional and can be used to extend DipDup functionality.
+
+ As a result, you get a service responsible for filling the database with indexed data. Then you can use it to build a custom API backend or integrate with existing ones. DipDup provides _Hasura GraphQL Engine_ integration to expose indexed data via REST and GraphQL with zero configuration, but you can use other API engines like PostgREST or develop one in-house.
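
To make the relationship between callbacks, typeclasses, and models concrete, here is a minimal, hypothetical handler sketch. The `demo_token` package name, `on_transfer` callback, `Holder` model, and the payload's fields are all illustrative; exact typeclass imports and signatures depend on the index kind and DipDup version.

```python
# handlers/on_transfer.py — illustrative sketch, not part of this commit
from dipdup.context import HandlerContext

from demo_token import models  # ORM models defined in the project's models/ directory


async def on_transfer(
    ctx: HandlerContext,
    transfer,  # an instance of a generated typeclass; the exact type depends on the index kind
) -> None:
    # Business logic: convert raw blockchain data into domain models and persist them.
    holder, _ = await models.Holder.get_or_create(address=transfer.to)
    holder.balance += transfer.amount
    await holder.save()
```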

<!-- TODO: SVG include doesn't work -->

![Generic DipDup setup and data flow](../assets/dipdup.svg)

## Storage layer

- DipDup uses PostgreSQL or SQLite as a database backend. All the data is stored in a single database schema created on the first run. Make sure it's used by DipDup exclusively since changes in index configuration or models require DipDup to drop the whole database schema and start indexing from scratch. You can, however, mark specific tables as immune to preserve them from being dropped.
-
- DipDup does not support database schema migration. Any change in models or index definitions will trigger reindexing. Migrations introduce complexity and do not guarantee data consistency. DipDup stores a hash of the SQL version of the DB schema and checks for changes each time you run indexing.
+ DipDup uses PostgreSQL or SQLite as a database backend. All the data is stored in a single database schema created on the first run. Make sure it's used by DipDup exclusively, since changes in index configuration or models trigger _reindexing_, dropping the whole database schema and starting indexing from scratch. You can, however, mark specific tables as immune to preserve them, or configure the action to perform for each reindexing reason.

- DipDup applies all updates atomically block by block, ensuring data integrity. If indexing is interrupted, the next time DipDup starts, it will check the database state and continue from the last block processed. DipDup state is stored in the database per index and can be used by API consumers to determine the current indexer head.
+ DipDup does not support database schema migrations, as they introduce complexity and can mess with data consistency. Any change in models or index definitions will trigger reindexing. DipDup stores hashes of the SQL schema and config file, and checks them each time you run indexing.

- An index is a set of contracts and rules for processing them as a single entity. Your config can contain more than one index, but they are processed in parallel and cannot share data as execution order is not guaranteed.
+ DipDup applies all updates _atomically_ block by block, ensuring data integrity. If indexing is interrupted, the next time DipDup starts, it will check the database state and continue from the last block processed. The DipDup state is stored in the database per index and can be used by API consumers to determine the current indexer head.
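
For example, an API consumer (or the indexer itself) could read that per-index state through the internal `dipdup.models.Index` model, which also appears in the queryset example further down this diff. A rough sketch; treating `name` and `level` as the relevant fields is an assumption about the internal schema:

```python
# Illustrative sketch: determine the current head level of a single index
from dipdup.models import Index


async def get_index_head(name: str) -> int | None:
    index = await Index.filter(name=name).first()
    return index.level if index else None
```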

## Handling chain reorgs

- Reorg messages signal chain reorganizations, which means some blocks, including all operations, are rolled back in favor of another with higher fitness. It's crucial to handle these messages correctly to avoid accumulating duplicate or invalid data. DipDup processes chain reorgs by restoring a previous database state, but you can implement your rollback logic by editing the `on_index_rollback`{lang="python"} system hook.
+ Reorg messages signal chain reorganizations, which means some blocks, including all operations, are _rolled back_ in favor of another with higher fitness. It's crucial to handle these messages correctly to avoid accumulating duplicate or invalid data. DipDup processes chain reorgs by restoring a previous database state, but you can implement your rollback logic by editing the `on_index_rollback`{lang="python"} system hook.
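
For reference, the generated `on_index_rollback` hook looks roughly like the sketch below; the exact signature and the `ctx.rollback` helper may differ between DipDup versions, so treat it as an approximation rather than the canonical template.

```python
# hooks/on_index_rollback.py — approximate shape of the default system hook
from dipdup.context import HookContext
from dipdup.index import Index


async def on_index_rollback(
    ctx: HookContext,
    index: Index,
    from_level: int,
    to_level: int,
) -> None:
    await ctx.execute_sql('on_index_rollback')
    # Restore the database state to `to_level`; replace this call with custom logic if needed
    await ctx.rollback(
        index=index.name,
        from_level=from_level,
        to_level=to_level,
    )
```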

<!--

docs/1.getting-started/4.package.md

Lines changed: 37 additions & 32 deletions
@@ -12,35 +12,42 @@ To generate all necessary directories and files according to config run the `ini

The structure of the resulting package is the following:

- | Path | Description |
- | --- | --- |
- | :file_folder: `abi` | Contract ABIs used to generate typeclasses |
- | :file_folder: `configs` | Environment-specific configs to merge with the root one |
- | :file_folder: `deploy` | Dockerfiles, compose files, and default variables |
- | :file_folder: `graphql` | Custom GraphQL queries for Hasura |
- | :file_folder: `handlers` | User-defined callbacks to process contract data |
- | :file_folder: `hasura` | Arbitrary Hasura metadata to apply during configuration |
- | :file_folder: `hooks` | User-defined callbacks to run manually or by schedule |
- | :file_folder: `models` | DipDup ORM models to store data in the database |
- | :file_folder: `sql` | SQL scripts and queries to run manually or on specific events |
- | :file_folder: `types` | Automatically generated Pydantic dataclasses for contract types |
-
- There's also a bunch on files in the root directory: .ignore files, pyproject.toml, PEP 561 marker, etc. Usually, you won't need to modify them.
+ | Path | Description |
+ | --- | --- |
+ | :file_folder: `abi` | Contract ABIs used to generate typeclasses |
+ | :file_folder: `configs` | Environment-specific configs to merge with the root one |
+ | :file_folder: `deploy` | Dockerfiles, compose files, and default env variables for each environment |
+ | :file_folder: `graphql` | Custom GraphQL queries to expose with Hasura engine |
+ | :file_folder: `handlers` | User-defined callbacks to process contract data |
+ | :file_folder: `hasura` | Arbitrary Hasura metadata to apply during configuration |
+ | :file_folder: `hooks` | User-defined callbacks to run manually or by schedule |
+ | :file_folder: `models` | DipDup ORM models to store data in the database |
+ | :file_folder: `sql` | SQL scripts and queries to run manually or on specific events |
+ | :file_folder: `types` | Automatically generated Pydantic dataclasses for contract types |
+ | `dipdup.yaml` | Root DipDup config; can be expanded with env-specific files |
+ | `pyproject.toml` | Python package metadata (introduced in PEP 518; see [details](https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/)) |
+
+ There are also a bunch of files in the root directory: .ignore files, PEP 561 marker, etc. Usually, you won't need to modify them.

## ABIs and typeclasses

- DipDup uses contract type information to generate dataclasses for developers to work with strictly typed data. These dataclasses are generated automatically from contract ABIs. In most cases, you don't need to modify them manually. The process is roughly the following:
+ DipDup uses contract type information to generate [Pydantic](https://docs.pydantic.dev/) models to work with strictly typed data. We call these models _typeclasses_. Modules in the `types` directory are generated automatically from contract ABIs and JSONSchemas during init. You can modify them manually, but usually won't need to. Under the hood, the process is roughly the following:

- 1. Contract ABIs are placed in the `abi` directory; either manually or during init.
+ 1. Contract ABIs are fetched from public sources or provided by the user.
2. DipDup converts these ABIs to intermediate JSONSchemas.
- 3. JSONSchemas converted to Pydantic dataclasses.
+ 3. JSONSchemas are converted to Pydantic models with [datamodel-code-generator](https://pydantic-docs.helpmanual.io/datamodel_code_generator/).

- This approach allows to work with complex contract types with nested structures and polymorphic variants.
+ This approach allows working with complex contract types with nested structures and polymorphic variants.
+
+ ::banner{type="note"}
+ Currently, we use Pydantic v1, but plan to migrate to v2 very soon.
+ ::
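
For illustration, a generated typeclass is a plain Pydantic (v1) model; the module path and field set below are made up and depend entirely on the contract ABI.

```python
# types/token/evm_events/transfer.py — illustrative shape of a generated typeclass
from pydantic import BaseModel


class Transfer(BaseModel):
    # Field names and types mirror the ABI/JSONSchema definition;
    # reserved words like `from` are typically suffixed with an underscore.
    from_: str
    to: str
    value: int
```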

<!--
DipDup receives all smart contract data (transaction parameters, resulting storage, big_map updates) in normalized form ([read more](https://baking-bad.org/blog/2021/03/03/tzkt-v14-released-with-improved-smart-contract-data-and-websocket-api/) about how TzKT handles Michelson expressions) but still as raw JSON. DipDup uses contract type information to generate data classes, which allow developers to work with strictly typed data.

- DipDup generates [Pydantic](https://pydantic-docs.helpmanual.io/datamodel_code_generator/) models out of JSONSchema. You might want to install additional plugins ([PyCharm](https://pydantic-docs.helpmanual.io/pycharm_plugin/), [mypy](https://pydantic-docs.helpmanual.io/mypy_plugin/)) for convenient work with this library.
+ DipDup generates models out of JSONSchema. You might want to install additional plugins ([PyCharm](https://pydantic-docs.helpmanual.io/pycharm_plugin/), [mypy](https://pydantic-docs.helpmanual.io/mypy_plugin/)) for convenient work with this library.

The following models are created at `init` for different indexes:

@@ -53,27 +60,25 @@ Other index kinds do not use code generated types.

## Nested packages

- Callbacks can be joined into packages to organize the project structure. Add one or multiple dots to the callback name to define nested packages:
+ Callbacks can be grouped into packages to organize the project structure. Add one or multiple dots to the callback name to define nested packages:

```yaml [dipdup.yaml]
- package: indexer
+ package: {{ project.package }}
hooks:
-   foo.bar:
-     callback: foo.bar
+   backup.restore:
+     callback: backup.on_restore
```

- After running the `init` command, you'll get the following directory tree:
-
- <!-- TODO: Borked tree -->
+ After running the `init` command, you'll get the following directory tree (shortened for brevity):

```
- indexer
+ {{ project.package }}
├── hooks
- │   ├── foo
- │   │   └── bar.py
+ │   ├── backup
+ │   │   └── on_restore.py
└── sql
-     └── foo
-         └── bar
+     └── backup
+         └── on_restore
```

- The same applies to handler callbacks. The callback alias still needs to be a valid Python module path: lowercase letters, underscores, and dots.
+ Handler callbacks can be grouped the same way. Note that the callback name still needs to be a valid Python module path: _only lowercase letters, underscores, and dots_.
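
For completeness, the stub generated for the `backup.restore` hook above would look roughly like this; the exact argument passed to `ctx.execute_sql` is an assumption.

```python
# hooks/backup/on_restore.py — approximate generated stub
from dipdup.context import HookContext


async def on_restore(ctx: HookContext) -> None:
    # Runs the matching scripts from sql/backup/on_restore, if any (argument format assumed)
    await ctx.execute_sql('backup.on_restore')
```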

docs/1.getting-started/5.models.md

Lines changed: 7 additions & 5 deletions
@@ -6,14 +6,16 @@ description: "DipDup is a Python framework for building smart contract indexers.

# Models

- To store indexed data in the database, you need to define models. Our storage layer is based on [Tortoise ORM](https://tortoise.github.io/index.html). It's fast, flexible, and has a syntax similar to Django ORM. We have extended it with some useful features like a copy-on-write rollback mechanism, caching, and more.
+ To store indexed data in the database, you need to define models: Python classes that represent database tables. DipDup uses a custom ORM to manage models and transactions.

- We plan to make things official and fork this library under a new name, but it's not ready yet. For now, let's call our implementation **DipDup ORM****.
+ ## DipDup ORM
+
+ Our storage layer is based on [Tortoise ORM](https://tortoise.github.io/index.html). This library is fast, flexible, and has a syntax familiar to Django users. We have extended it with some useful features like a copy-on-write rollback mechanism, caching, and more. We plan to make things official and fork Tortoise ORM under a new name, but it's not ready yet. For now, let's call our implementation **DipDup ORM**.

Before we begin to dive into the details, here's an important note:

::banner{type="warning"}
- Please, don't report DipDup issues to the Tortoise ORM bug tracker! We patch it heavily to better suit our needs, so it's not the same library anymore.
+ Please don't report DipDup ORM issues to the Tortoise ORM bug tracker! We patch it heavily to better suit our needs, so it's not the same library anymore.
::

You can use [Tortoise ORM docs](https://tortoise.github.io/examples.html) as a reference. We will describe only DipDup-specific features here.
@@ -25,7 +27,7 @@ Project models should be placed in the `models` directory in the project root. B
Here's an example containing all available fields:

```python
- {{ #include ../src/dipdup/templates/models.py.j2 }}
+ {{ #include ../src/dipdup/templates/models.py }}
```

Pay attention to the imports: field and model classes **must** be imported from the `dipdup` package instead of `tortoise` to make our extensions work.
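
A minimal sketch of such a model with DipDup ORM imports; the `Token` model and its fields are illustrative, not part of the bundled template.

```python
# models/__init__.py — illustrative model using dipdup imports instead of tortoise
from dipdup import fields
from dipdup.models import Model


class Token(Model):
    id = fields.IntField(pk=True)
    owner = fields.TextField()
    amount = fields.DecimalField(max_digits=20, decimal_places=6)

    class Meta:
        table = 'token'
```
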
@@ -82,7 +84,7 @@ Querysets are not copied between chained calls. Consider the following example:
await dipdup.models.Index.filter().order_by('-level').first()
```

- In Tortoise ORM each subsequent call creates a new queryset using expensive `copy.copy()` call. In DipDup ORM it's the same queryset, so it's much faster.
+ In Tortoise ORM, each subsequent call creates a new queryset using an expensive `copy.copy()` call. In DipDup ORM it's the same queryset, so it's much faster.

### Transactions
