Skip to content

Commit 3038b4e

Browse files
committed
feat: add docling loader
Signed-off-by: Panos Vagenas <[email protected]>
1 parent d048de0 commit 3038b4e

File tree

15 files changed

+6204
-1
lines changed

15 files changed

+6204
-1
lines changed

.flake8

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
[flake8]
2+
per-file-ignores = __init__.py:F401
3+
max-line-length = 88
4+
exclude = test/*
5+
max-complexity = 25
6+
docstring-convention = google
7+
ignore = W503,E203
8+

.gitignore

Lines changed: 113 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,113 @@
1+
node_modules/
2+
3+
.idea/
4+
*~
5+
*.DS_Store
6+
7+
# Byte-compiled / optimized / DLL files
8+
__pycache__/
9+
*.py[cod]
10+
*$py.class
11+
12+
# C extensions
13+
*.so
14+
15+
# Distribution / packaging
16+
.Python
17+
build/
18+
develop-eggs/
19+
dist/
20+
downloads/
21+
eggs/
22+
.eggs/
23+
lib/
24+
lib64/
25+
parts/
26+
sdist/
27+
var/
28+
wheels/
29+
*.egg-info/
30+
.installed.cfg
31+
*.egg
32+
MANIFEST
33+
34+
# PyInstaller
35+
# Usually these files are written by a python script from a template
36+
# before PyInstaller builds the exe, so as to inject date/other infos into it.
37+
*.manifest
38+
*.spec
39+
40+
# Installer logs
41+
pip-log.txt
42+
pip-delete-this-directory.txt
43+
44+
# Unit test / coverage reports
45+
htmlcov/
46+
.tox/
47+
.coverage
48+
.coverage.*
49+
.cache
50+
nosetests.xml
51+
coverage.xml
52+
*.cover
53+
.hypothesis/
54+
.pytest_cache/
55+
56+
# Translations
57+
*.mo
58+
*.pot
59+
60+
# Django stuff:
61+
*.log
62+
local_settings.py
63+
db.sqlite3
64+
65+
# Flask stuff:
66+
instance/
67+
.webassets-cache
68+
69+
# Scrapy stuff:
70+
.scrapy
71+
72+
# Sphinx documentation
73+
docs/_build/
74+
75+
# PyBuilder
76+
target/
77+
78+
# Jupyter Notebook
79+
.ipynb_checkpoints
80+
81+
# pyenv
82+
.python-version
83+
84+
# celery beat schedule file
85+
celerybeat-schedule
86+
87+
# SageMath parsed files
88+
*.sage.py
89+
90+
# Environments
91+
.env
92+
.venv
93+
env/
94+
venv/
95+
ENV/
96+
env.bak/
97+
venv.bak/
98+
99+
# Spyder project settings
100+
.spyderproject
101+
.spyproject
102+
103+
# Rope project settings
104+
.ropeproject
105+
106+
# mkdocs documentation
107+
/site
108+
109+
# mypy
110+
.mypy_cache/
111+
112+
# VisualStudioCode
113+
.vscode/

.pre-commit-config.yaml

Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,57 @@
1+
fail_fast: true
2+
repos:
3+
- repo: local
4+
hooks:
5+
- id: black
6+
name: Black
7+
entry: poetry run black docling_langchain test
8+
pass_filenames: false
9+
language: system
10+
files: '\.py$'
11+
- id: isort
12+
name: isort
13+
entry: poetry run isort docling_langchain test
14+
pass_filenames: false
15+
language: system
16+
files: '\.py$'
17+
- id: autoflake
18+
name: autoflake
19+
entry: poetry run autoflake docling_langchain test
20+
pass_filenames: false
21+
language: system
22+
files: '\.py$'
23+
- id: mypy
24+
name: MyPy
25+
entry: poetry run mypy docling_langchain test
26+
pass_filenames: false
27+
language: system
28+
files: '\.py$'
29+
- id: flake8
30+
name: Flake8
31+
entry: poetry run flake8 docling_langchain
32+
pass_filenames: false
33+
language: system
34+
files: '\.py$'
35+
- id: pytest
36+
name: Pytest
37+
entry: poetry run pytest test/
38+
pass_filenames: false
39+
language: system
40+
files: '\.py$'
41+
- id: nbqa_black
42+
name: nbQA Black
43+
entry: poetry run nbqa black examples/
44+
pass_filenames: false
45+
language: system
46+
files: '\.ipynb$'
47+
- id: nbqa_isort
48+
name: nbQA isort
49+
entry: poetry run nbqa isort examples/
50+
pass_filenames: false
51+
language: system
52+
files: '\.ipynb$'
53+
- id: poetry
54+
name: Poetry
55+
entry: poetry check --lock
56+
pass_filenames: false
57+
language: system

CODE_OF_CONDUCT.md

Lines changed: 129 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,129 @@
1+
# Contributor Covenant Code of Conduct
2+
3+
## Our Pledge
4+
5+
We as members, contributors, and leaders pledge to make participation in our
6+
community a harassment-free experience for everyone, regardless of age, body
7+
size, visible or invisible disability, ethnicity, sex characteristics, gender
8+
identity and expression, level of experience, education, socio-economic status,
9+
nationality, personal appearance, race, religion, or sexual identity
10+
and orientation.
11+
12+
We pledge to act and interact in ways that contribute to an open, welcoming,
13+
diverse, inclusive, and healthy community.
14+
15+
## Our Standards
16+
17+
Examples of behavior that contributes to a positive environment for our
18+
community include:
19+
20+
* Demonstrating empathy and kindness toward other people
21+
* Being respectful of differing opinions, viewpoints, and experiences
22+
* Giving and gracefully accepting constructive feedback
23+
* Accepting responsibility and apologizing to those affected by our mistakes,
24+
and learning from the experience
25+
* Focusing on what is best not just for us as individuals, but for the
26+
overall community
27+
28+
Examples of unacceptable behavior include:
29+
30+
* The use of sexualized language or imagery, and sexual attention or
31+
advances of any kind
32+
* Trolling, insulting or derogatory comments, and personal or political attacks
33+
* Public or private harassment
34+
* Publishing others' private information, such as a physical or email
35+
address, without their explicit permission
36+
* Other conduct which could reasonably be considered inappropriate in a
37+
professional setting
38+
39+
## Enforcement Responsibilities
40+
41+
Community leaders are responsible for clarifying and enforcing our standards of
42+
acceptable behavior and will take appropriate and fair corrective action in
43+
response to any behavior that they deem inappropriate, threatening, offensive,
44+
or harmful.
45+
46+
Community leaders have the right and responsibility to remove, edit, or reject
47+
comments, commits, code, wiki edits, issues, and other contributions that are
48+
not aligned to this Code of Conduct, and will communicate reasons for moderation
49+
decisions when appropriate.
50+
51+
## Scope
52+
53+
This Code of Conduct applies within all community spaces, and also applies when
54+
an individual is officially representing the community in public spaces.
55+
Examples of representing our community include using an official e-mail address,
56+
posting via an official social media account, or acting as an appointed
57+
representative at an online or offline event.
58+
59+
## Enforcement
60+
61+
Instances of abusive, harassing, or otherwise unacceptable behavior may be
62+
reported to the community leaders responsible for enforcement using
63+
64+
65+
All complaints will be reviewed and investigated promptly and fairly.
66+
67+
All community leaders are obligated to respect the privacy and security of the
68+
reporter of any incident.
69+
70+
## Enforcement Guidelines
71+
72+
Community leaders will follow these Community Impact Guidelines in determining
73+
the consequences for any action they deem in violation of this Code of Conduct:
74+
75+
### 1. Correction
76+
77+
**Community Impact**: Use of inappropriate language or other behavior deemed
78+
unprofessional or unwelcome in the community.
79+
80+
**Consequence**: A private, written warning from community leaders, providing
81+
clarity around the nature of the violation and an explanation of why the
82+
behavior was inappropriate. A public apology may be requested.
83+
84+
### 2. Warning
85+
86+
**Community Impact**: A violation through a single incident or series
87+
of actions.
88+
89+
**Consequence**: A warning with consequences for continued behavior. No
90+
interaction with the people involved, including unsolicited interaction with
91+
those enforcing the Code of Conduct, for a specified period of time. This
92+
includes avoiding interactions in community spaces as well as external channels
93+
like social media. Violating these terms may lead to a temporary or
94+
permanent ban.
95+
96+
### 3. Temporary Ban
97+
98+
**Community Impact**: A serious violation of community standards, including
99+
sustained inappropriate behavior.
100+
101+
**Consequence**: A temporary ban from any sort of interaction or public
102+
communication with the community for a specified period of time. No public or
103+
private interaction with the people involved, including unsolicited interaction
104+
with those enforcing the Code of Conduct, is allowed during this period.
105+
Violating these terms may lead to a permanent ban.
106+
107+
### 4. Permanent Ban
108+
109+
**Community Impact**: Demonstrating a pattern of violation of community
110+
standards, including sustained inappropriate behavior, harassment of an
111+
individual, or aggression toward or disparagement of classes of individuals.
112+
113+
**Consequence**: A permanent ban from any sort of public interaction within
114+
the community.
115+
116+
## Attribution
117+
118+
This Code of Conduct is adapted from the
119+
[Contributor Covenant](https://www.contributor-covenant.org),
120+
version 2.0, available at
121+
[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html](https://www.contributor-covenant.org/version/2/0/code_of_conduct.html).
122+
123+
Community Impact Guidelines were inspired by [Mozilla's code of conduct
124+
enforcement ladder](https://github.com/mozilla/diversity).
125+
126+
127+
For answers to common questions about this code of conduct, see the FAQ at
128+
[https://www.contributor-covenant.org/faq](https://www.contributor-covenant.org/faq). Translations are available at
129+
[https://www.contributor-covenant.org/translations](https://www.contributor-covenant.org/translations).

0 commit comments

Comments
 (0)