Commit e3854d2

misrasaurabh1, aseembits93, and cragwolfe authored
Setup Codeflash Github Actions to optimize all future code (#4082)
- This Pull Request sets up the `codeflash.yml` file, which will run on every new Pull Request that modifies source code in the `unstructured` directory.
- We set up the codeflash config in the pyproject.toml file. This defines basic project config for codeflash.
- The workflow uses uv to install the CI dependencies faster than your current caching solution. Speed is useful to get quicker optimizations.
- Please take a look at the requirements that are being installed. Feel free to add more to the install list. Codeflash tries to execute code, and if it is missing a dependency needed to make something run, it will fail to optimize.
- Codeflash is installed every time in CI. This helps the workflow always use the latest version of codeflash, as it improves rapidly. Feel free to add codeflash as a dev dependency as well, since we are about to release more local optimization tools like VS Code and Claude Code extensions.
- Feel free to modify this GitHub Action any way you want.

**Actions required to make this work:**

- Install the Codeflash GitHub app from [this link](https://github.com/apps/codeflash-ai/installations/select_target) to this repo. This is required for our github-bot to comment and create suggestions on the GitHub repo.
- Create a new `CODEFLASH_API_KEY` after signing up to [Codeflash from our website](https://www.codeflash.ai/). The onboarding will ask you to create an API key and show instructions on how to save the API key in your repo secrets.

Then, after this PR is merged, it will start generating new optimizations 🎉

---------

Signed-off-by: Saurabh Misra <[email protected]>
Co-authored-by: Aseem Saxena <[email protected]>
Co-authored-by: cragwolfe <[email protected]>
1 parent fed8942 commit e3854d2

4 files changed, with 70 additions and 1 deletion

.github/workflows/codeflash.yml

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+name: Codeflash Optimization
+
+on:
+  pull_request:
+    paths:
+      - 'unstructured/**'
+
+  workflow_dispatch:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  optimize:
+    name: Optimize new Python code
+    if: ${{ github.actor != 'codeflash-ai[bot]' }}
+    runs-on: ubuntu-latest
+    env:
+      NLTK_DATA: ${{ github.workspace }}/nltk_data
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+      - name: 🐍 Set up Python 3.12
+        uses: actions/setup-python@v5
+        with:
+          python-version: 3.12
+      - name: 📦 Install Environment
+        uses: ./.github/actions/base-cache
+        with:
+          python-version: 3.12
+      - name: ⚡️ Codeflash Optimization
+        env:
+          UNS_API_KEY: ${{ secrets.UNS_API_KEY }}
+          TESSERACT_VERSION: "5.5.1"
+          CODEFLASH_API_KEY: ${{ secrets.CODEFLASH_API_KEY }}
+        run: |
+          source .venv/bin/activate
+          sudo apt-get update
+          sudo apt-get install -y libmagic-dev poppler-utils libreoffice
+          sudo add-apt-repository -y ppa:alex-p/tesseract-ocr5
+          sudo apt-get update
+          sudo apt-get install -y tesseract-ocr tesseract-ocr-kor
+          tesseract --version
+          installed_tesseract_version=$(tesseract --version | grep -oP '(?<=tesseract )\d+\.\d+\.\d+')
+          if [ "$installed_tesseract_version" != "${{env.TESSERACT_VERSION}}" ]; then
+            echo "Tesseract version ${{env.TESSERACT_VERSION}} is required but found version $installed_tesseract_version"
+            exit 1
+          fi
+          # FIXME (yao): sometimes there is cache but we still miss argilla in the env; so we add make install-ci again
+          make install-ci
+          pip install codeflash
+          codeflash
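
The Tesseract guard in the `run` step above can be sketched in Python to make its logic easier to follow. This is only an illustration, not part of the commit: the function names are hypothetical, and the sample `tesseract --version` output is made up for the example rather than captured from a real run. Unlike `grep -o`, which prints every match, this sketch keeps only the first one.

```python
import re

# Same pattern as the workflow's `grep -oP`: three dot-separated numbers
# that immediately follow the literal text "tesseract ".
VERSION_PATTERN = re.compile(r"(?<=tesseract )\d+\.\d+\.\d+")


def installed_version(version_output):
    """Return the first X.Y.Z version found in the given output, or None."""
    match = VERSION_PATTERN.search(version_output)
    return match.group(0) if match else None


def check_version(version_output, required):
    """Return True only when the parsed version equals `required` exactly."""
    return installed_version(version_output) == required


# ASSUMPTION: illustrative output, not captured from a real run.
sample_output = "tesseract 5.5.1\n leptonica-1.82.0\n"

if not check_version(sample_output, "5.5.1"):
    raise SystemExit("Tesseract version 5.5.1 is required")
```

The comparison is an exact string match, as in the workflow, so a newer patch release from the PPA would fail the step until `TESSERACT_VERSION` is updated.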

CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -1,3 +1,11 @@
+## 0.18.15-dev0
+
+### Enhancements
+
+### Features
+
+### Fixes
+
 ## 0.18.14
 
 ### Enhancements

pyproject.toml

Lines changed: 7 additions & 0 deletions
@@ -56,3 +56,10 @@ select = [
     "UP034", # -- Avoid extraneous parentheses --
     "W", # -- Warnings, including invalid escape-sequence --
 ]
+
+[tool.codeflash]
+module-root = "unstructured"
+tests-root = "test_unstructured"
+test-framework = "pytest"
+ignore-paths = []
+formatter-cmds = ["ruff check --exit-zero --fix-only $file", "autoflake --in-place $file", "black --line-length=100 $file"]

unstructured/__version__.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.18.14" # pragma: no cover
+__version__ = "0.18.15-dev0" # pragma: no cover
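
This commit moves both `CHANGELOG.md` and `unstructured/__version__.py` to `0.18.15-dev0`. A minimal sketch of a check that the two stay in step might look like the following. The helper names are hypothetical and this is not the repository's own tooling. The inline strings mirror the post-commit content shown in the diffs.

```python
import re

# First lines of CHANGELOG.md after this commit, per the diff.
CHANGELOG_HEAD = """## 0.18.15-dev0

### Enhancements

### Features

### Fixes

## 0.18.14
"""

# Content of unstructured/__version__.py after this commit, per the diff.
VERSION_FILE = '__version__ = "0.18.15-dev0" # pragma: no cover\n'


def top_changelog_version(changelog_text):
    """Return the version in the first level-2 heading, or None."""
    # "## " at line start matches "## 0.18.15-dev0" but not "### Fixes".
    match = re.search(r"^## (\S+)", changelog_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def declared_version(version_file_text):
    """Return the string assigned to __version__, or None."""
    match = re.search(r'__version__\s*=\s*"([^"]+)"', version_file_text)
    return match.group(1) if match else None


def versions_agree(changelog_text, version_file_text):
    """True when both files name the same version, and one was found."""
    changelog = top_changelog_version(changelog_text)
    return changelog is not None and changelog == declared_version(version_file_text)


print(versions_agree(CHANGELOG_HEAD, VERSION_FILE))
```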
