Commit bf7a78a

fix(docker): extract wordnet corpus after NLTK download (#61)
The wordnet corpus is downloaded as a zip file by default, but it needs to be extracted for NLTK to use it properly. Add an explicit unzip command after the NLTK download to ensure wordnet is available.
1 parent 0ab7ac7 commit bf7a78a

1 file changed: +1 -0 lines changed

docker/Dockerfile.backend

Lines changed: 1 addition & 0 deletions
@@ -136,6 +136,7 @@ RUN cd /opt/xagent/frontend && npm ci \
     && npm install -g pptxgenjs@4.0.1 \
     && python -m playwright install chromium \
     && python -c "import nltk; nltk.download('punkt'); nltk.download('punkt_tab'); nltk.download('wordnet'); nltk.download('averaged_perceptron_tagger')" \
+    && unzip /root/nltk_data/corpora/wordnet.zip -d /root/nltk_data/corpora \
     && python -c "import tiktoken; tiktoken.encoding_for_model('gpt-4')" \
     && chmod +x /opt/xagent/deploy/entrypoint.sh
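
For comparison, the extraction step could also be done in Python with the standard-library zipfile module. This is a minimal sketch, not part of the commit: the helper name ensure_corpus_extracted is hypothetical, and it assumes (as the unzip target in the diff implies) that wordnet.zip contains a top-level wordnet/ directory.

```python
import zipfile
from pathlib import Path


def ensure_corpus_extracted(corpora_dir, name="wordnet"):
    """Extract <corpora_dir>/<name>.zip into <corpora_dir> if <name>/ is missing.

    Hypothetical helper for illustration; it is not part of NLTK.
    Returns True if the archive was extracted, False if the directory
    already existed and nothing was done.
    """
    corpora_dir = Path(corpora_dir)
    target = corpora_dir / name           # e.g. .../corpora/wordnet
    archive = corpora_dir / f"{name}.zip"  # e.g. .../corpora/wordnet.zip

    if target.is_dir():
        return False  # already extracted, so the call is safe to repeat
    if not archive.is_file():
        raise FileNotFoundError(f"{archive} not found; download the corpus first")

    # Assumption: the archive holds a top-level <name>/ directory,
    # so extracting into corpora_dir yields corpora_dir/<name>/.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(corpora_dir)
    return True
```

With the path used in the diff, the call would be ensure_corpus_extracted("/root/nltk_data/corpora"). Unlike a bare unzip, this sketch skips extraction when the directory already exists, so it never stops to ask about overwriting files.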
