Skip to content

Commit 3e6aa81

Browse files
committed
initial code commit
0 parents  commit 3e6aa81

File tree

14 files changed

+1018
-0
lines changed

14 files changed

+1018
-0
lines changed

.flake8

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
[flake8]
2+
# E203: Conflicts with black's formatting
3+
# E501: Conflicts with black's formatting, which defaults to a max
4+
# line length of 88 but will go over if necessary.
5+
# See https://github.com/ambv/black#line-length for more info
6+
# W605: Conflicts with regular expression formatting
7+
ignore = E203,E501,E731,W503,W605
8+
import-order-style = google
9+
# Packages added in this list should be added to the setup.cfg file as well
10+
application-import-names =
11+
py2sfn_task_tools
12+
exclude =
13+
*vendor*
14+
.venv
15+
.env
16+
.pants.d
17+
.pantscache
18+
.git
19+
__init__.py
20+
setup.py

.gitignore

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
.DS_Store
2+
.venv*
3+
**/__pycache__/
4+
**/.cache/
5+
**/*.coverage
6+
**/*.egg-info
7+
**/*.pyc
8+
**/.tox
9+
**/dist

.pre-commit-config.yaml

Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
---
2+
default_stages: [commit]
3+
4+
default_language_version:
5+
python: python3
6+
7+
exclude: "examples/.*$"
8+
9+
repos:
10+
- repo: https://github.com/pre-commit/pre-commit-hooks
11+
rev: v2.4.0
12+
hooks:
13+
- id: check-ast
14+
- id: check-builtin-literals
15+
- id: check-byte-order-marker
16+
- id: check-executables-have-shebangs
17+
# Need to define stages explicitly since `default_stages` was not being respected
18+
stages: [commit]
19+
- id: check-merge-conflict
20+
- id: debug-statements
21+
- id: forbid-new-submodules
22+
- id: no-commit-to-branch
23+
- id: trailing-whitespace
24+
args: [--markdown-linebreak-ext=md]
25+
# Need to define stages explicitly since `default_stages` was not being respected
26+
stages: [commit]
27+
28+
- repo: local
29+
hooks:
30+
- id: isort
31+
name: Sort Python imports (isort)
32+
entry: isort
33+
language: python
34+
types: [file, python]
35+
additional_dependencies: [isort==4.3.16]
36+
37+
- id: black
38+
name: Format Python (black)
39+
entry: black
40+
language: python
41+
types: [file, python]
42+
additional_dependencies: [black==19.3b0]
43+
44+
- id: pydocstyle
45+
name: Lint Python docstrings (pydocstyle)
46+
entry: pydocstyle
47+
language: python
48+
types: [file, python]
49+
additional_dependencies: [pydocstyle==4.0.1]
50+
51+
- id: flake8
52+
name: Lint Python (flake8)
53+
entry: flake8 --config py2sfn-task-tools/.flake8
54+
language: python
55+
types: [file, python]
56+
additional_dependencies:
57+
- flake8==3.7.9
58+
- "flake8-import-order<0.19,>=0.18"

LICENSE.md

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
Copyright (c) 2020, Narrative Science
2+
3+
All rights reserved.
4+
5+
Redistribution and use in source and binary forms, with or without
6+
modification, are permitted provided that the following conditions are met:
7+
8+
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
9+
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
10+
- Neither the name of the <organization> nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
11+
12+
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
13+
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
14+
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
15+
DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY
16+
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
17+
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
18+
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
19+
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
20+
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
21+
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

README.md

Lines changed: 121 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,121 @@
1+
# py2sfn-task-tools
2+
3+
[![](https://img.shields.io/pypi/v/py2sfn-task-tools.svg)](https://pypi.org/pypi/py2sfn-task-tools/) [![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
4+
5+
Tools for tasks embedded in an [AWS Step Functions](https://aws.amazon.com/step-functions/) state machine. This is a helper library for [py2sfn](https://github.com/NarrativeScience/py2sfn).
6+
7+
Features:
8+
9+
- Offload state data to DynamoDB/S3 instead of storing data in the *very* constrained state machine input data object
10+
- Cancel the currently executing workflow
11+
12+
Table of Contents:
13+
14+
- [Installation](#installation)
15+
- [Guide](#guide)
16+
- [Stopping the execution](#stopping-the-execution)
17+
- [Working with the State Data Client](#working-with-the-state-data-client)
18+
- [Development](#development)
19+
20+
## Installation
21+
22+
py2sfn-task-tools requires Python 3.6 or above. It should be installed in a [py2sfn task entry point](https://github.com/NarrativeScience/py2sfn#task-entry-points).
23+
24+
```bash
25+
pip install py2sfn-task-tools
26+
```
27+
28+
## Guide
29+
30+
Once the py2sfn-task-tools library is installed, a `Context` should be created and passed to the tasks. Each py2sfn task will then have a `context` object to work with.
31+
32+
### Stopping the execution
33+
34+
If you need to stop/cancel/abort the execution from within a task, you can use the `context.stop_execution` method within your task's `run` method. A common use case is if you need to check the value of a feature flag at the beginning of the execution and abort if it's false. For example:
35+
36+
```python
37+
if not some_condition:
38+
return await context.stop_execution()
39+
```
40+
41+
You can provide extra detail by passing `error` and `cause` keyword arguments to the `stop_execution` method. The `error` is a short string like a code or enum value whereas `cause` is a longer description.
42+
43+
### Working with the State Data Client
44+
45+
One of the stated Step Functions best practices is to avoid passing large payloads between states; the input data limit is only 32K characters. To get around this, you can choose to store data from your task code in a DynamoDB table. With DynamoDB, we have an item limit of 400KB to work with. When you put items into the table you receive a pointer to the DynamoDB item which you can return from your task so it gets includes in the input data object. From there, since the pointer is in the `data` dict, you can reload the stored data in a downstream task. This library's `StateDataClient` class provides methods for putting and getting items from this DynamoDB table. It's available in your task's `run` method as `context.state_data_client`.
46+
47+
The client methods are split between "local" and "global" variants. Local methods operate on items stored within the project whereas global methods can operate on items that were stored from any project. Global methods require a fully-specified partition key (primary key, contains the execution ID) and table name to locate the item whereas local methods only need a simple key because the partition key and table name can be infered from the project automatically. The `put_*` methods return a dict with metadata about the location of the item, including the `key`, `partition_key`, and `table_name`. If you return this metadata object from a task, it will get put on the `data` object and you can call a `get_*` method later in the state machine.
48+
49+
Many methods also accept an optional `index` argument. This argument needs to be provided when getting/putting an item that was originally stored as part of a `put_items` or `put_global_items` call. Providing the `index` is usually only done within a map iteration task.
50+
51+
Below are a few of the more common methods:
52+
53+
#### `put_item`/`put_items`
54+
55+
The `put_item` method puts an item in the state store. It takes `key`, `data`, and `index` arguments. For example:
56+
57+
```python
58+
context.state_data_client.put_item("characters", {"name": "jerry"})
59+
context.state_data_client.put_item("characters", {"name": "elaine"}, index=24)
60+
```
61+
62+
Note that the item at the given array index doesn't actually have to exist in the table before you call `put_item`. However, if it doesn't exist then you may have a fan-out logic bug upstream in your state machine.
63+
64+
The `put_items` method puts an entire list of items into the state store. Each item will be stored separately under its corresponding array index. For example:
65+
66+
```python
67+
context.state_data_client.put_items("characters", [{"name": "jerry"}, {"name": "elaine"}])
68+
```
69+
70+
#### `get_item`
71+
72+
The `get_item` method gets the data attribute from an item in the state store. It takes `key` and `index` arguments. For example:
73+
74+
```python
75+
context.state_data_client.get_item("characters") # -> {"name": "jerry"}
76+
context.state_data_client.get_item("characters", index=24) # -> {"name": "elaine"}
77+
```
78+
79+
#### `get_item_for_map_iteration`/`get_global_item_for_map_iteration`
80+
81+
The `get_item_for_map_iteration` method gets the data attribute from an item in the state store using the `event` object. This method only works when called within a map iterator task. For example, if the `put_items` example above was called in a task, and its value was given to a map state to fan out, we can use the `get_item_for_map_iteration` method within our iterator task to fetch each item:
82+
83+
```python
84+
# Iteration 0:
85+
context.state_data_client.get_item_for_map_iteration(event) # -> {"name": "jerry"}
86+
# Iteration 1:
87+
context.state_data_client.get_item_for_map_iteration(event) # -> {"name": "elaine"}
88+
```
89+
90+
This works because the map iterator state machine receives an input data object with the schema:
91+
92+
```json
93+
{
94+
"items_result_table_name": "<DynamoDB table for the project>",
95+
"items_result_partition_key": "<execution ID>:characters",
96+
"items_result_key": "characters",
97+
"context_index": "<array index>",
98+
"context_value.$": "1"
99+
}
100+
```
101+
102+
The `get_item_for_map_iteration` is a helper method that uses that input to locate the right item. The `get_global_item_for_map_iteration` method has the same signature. It should be called when you know that the array used to fan out could have come from another project (e.g. the map state is the first state in a state machine triggered by a subscription).
103+
104+
## Development
105+
106+
To run functional tests, you need an AWS IAM account with permissions to:
107+
108+
- Create/update/delete a DynamoDB table
109+
- Create/update/delete an S3 bucket
110+
111+
Set the following environment variables:
112+
113+
- `AWS_ACCESS_KEY_ID`
114+
- `AWS_SECRET_ACCESS_KEY`
115+
- `AWS_DEFAULT_REGION`
116+
117+
To run tests:
118+
119+
```bash
120+
tox
121+
```

pyproject.toml

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
[build-system]
2+
requires = ["flit_core >=2,<3"]
3+
build-backend = "flit_core.buildapi"
4+
5+
[tool.flit.metadata]
6+
module = "py2sfn_task_tools"
7+
author = "Jonathan Drake"
8+
author-email = "[email protected]"
9+
home-page = "https://github.com/NarrativeScience/py2sfn-task-tools"
10+
license = "BSD-3-Clause"
11+
description-file = "README.md"
12+
requires = [
13+
"backoff>=1.8.0,<2",
14+
"boto3",
15+
"sfn-workflow-client",
16+
]
17+
classifiers = [
18+
"Intended Audience :: Developers",
19+
"License :: OSI Approved :: BSD License",
20+
"Programming Language :: Python :: 3",
21+
"Topic :: Software Development :: Libraries :: Python Modules",
22+
]
23+
requires-python = ">=3.6,<4"
24+
keywords = "py2sfn,aws step functions,workflow"
25+
26+
[tool.flit.sdist]
27+
include = ["LICENSE.md"]
28+
exclude = ["tests/"]

setup.cfg

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
[isort]
2+
# These rules should conform to flake8-import-order's google import order style
3+
# and black's styling rules
4+
# If adding a new package to this list, please make sure to update the .flake8 file as
5+
# well
6+
atomic=true
7+
combine_as_imports=true
8+
default_section=THIRDPARTY
9+
force_sort_within_sections=true
10+
include_trailing_comma=true
11+
known_standard_library=typing
12+
known_first_party=
13+
py2sfn_task_tools
14+
test_utils
15+
line_length=88
16+
multi_line_output=3
17+
no_lines_before=LOCALFOLDER
18+
order_by_type=false
19+
sections=FUTURE,STDLIB,THIRDPARTY,FIRSTPARTY,LOCALFOLDER
20+
skip=__init__.py
21+
22+
[pydocstyle]
23+
convention=numpy
24+
add-select=D413,D416,D417
25+
add-ignore=D202,D205,D400,D401,D406,D407

src/py2sfn_task_tools/__init__.py

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
"""Tools for tasks embedded in an AWS Step Functions state machine."""
2+
3+
__version__ = "0.1.0"

src/py2sfn_task_tools/context.py

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
"""Contains a data class holding common clients for SFN tasks"""
2+
from typing import Callable, NamedTuple
3+
4+
from .state_data_client import StateDataClient
5+
6+
7+
class TaskContext(NamedTuple):
8+
"""Data class holding common clients for SFN tasks.
9+
10+
This will be passed as the ``context`` argument to each task's ``run`` method.
11+
"""
12+
13+
state_data_client: StateDataClient = None
14+
stop_execution: Callable = None
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
"""Contains exception classes for the SFN tools library"""
2+
3+
4+
class StateDataClientError(Exception):
5+
"""Base class for state data client exceptions"""
6+
7+
pass
8+
9+
10+
class NoItemFound(StateDataClientError):
11+
"""Raised when no item could be found in the state data table"""
12+
13+
pass

0 commit comments

Comments
 (0)