This repository contains the source code of an AI-based structured web data extractor.
- Author: Jan Joneš
- Thesis: PDF, assignment, submission, slides
- Demo: live, Docker Hub, examples below
- Data: SWDE with visuals

- awe/: Python module (data manipulation and machine learning). See awe/README.md.
- js/: Node.js app (visual attribute extractor and inference demo). See js/README.md.
- docs/
  - dev/
    - data.md: dataset preparation.
    - extractor.md: running the visual extractor.
    - train.md: training instructions.
    - release.md: release instructions.
  - demo/

docker pull janjones/awe-demo
docker run --rm -it -p 3000:3000 janjones/awe-demo

Open a web browser and navigate to http://localhost:3000/.
For more details, see docs/demo/run.md.
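
The container may need a moment before it accepts connections. The following is a minimal sketch, not part of this repository, that polls the demo URL until it responds. The helper name `wait_for_http` and its parameters are hypothetical, and it assumes the demo answers plain HTTP requests at the root path on port 3000.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll `url` until it returns any HTTP response or `timeout` seconds pass.

    Hypothetical helper for illustration; not provided by the awe repository.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt gets its own 5-second limit.
            with urllib.request.urlopen(url, timeout=5.0):
                return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: the server is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_http("http://localhost:3000/")` returns `True` once the demo responds and `False` if nothing answers within a minute.
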
To train the model yourself, use the janjones/awe-gradient image:

docker pull janjones/awe-gradient
docker run --rm -it -v awe:/storage -p 3000:3000 janjones/awe-gradient bash

Then, run inside the Docker container:
git clone https://github.com/jjonescz/awe .
git clone https://github.com/jjonescz/swde-visual data/swde
python -m awe.training.params
python -m awe.training.train
# Model is trained, now you can run the demo.
cd js
pnpm install
DEBUG=1 pnpm run server

For more details, see
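
The two `python -m` training commands above can also be chained from a small script that stops at the first failure. This is only a sketch: `run_steps` and `TRAINING_STEPS` are hypothetical names, not part of the `awe` module, and the commands are assumed to be run inside the container from the root of the cloned repository.

```python
import subprocess
from typing import Sequence

# Commands copied from the training steps above (assumption: run inside the
# container, from the root of the cloned repository).
TRAINING_STEPS = [
    ["python", "-m", "awe.training.params"],
    ["python", "-m", "awe.training.train"],
]


def run_steps(steps: Sequence[Sequence[str]]) -> int:
    """Run each command in order and return how many completed.

    Raises subprocess.CalledProcessError as soon as a command exits non-zero,
    so later steps never run after a failure.
    """
    completed = 0
    for cmd in steps:
        print("$", " ".join(cmd), flush=True)
        subprocess.run(list(cmd), check=True)
        completed += 1
    return completed
```

Calling `run_steps(TRAINING_STEPS)` runs both steps in order. Because of `check=True`, a failure in the first command prevents the second from starting.
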
Generated by the live demo.

