Commit 09fdd22

[UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py
1 parent d5a6e5f commit 09fdd22

3 files changed: 14 additions and 19 deletions
Lines changed: 6 additions & 17 deletions
@@ -1,43 +1,32 @@
 # AnyTextPipeline Pipeline

-From the repo [page](https://github.com/tyxsspa/AnyText)
+Project page: https://aigcdesigngroup.github.io/homepage_anytext

 "AnyText comprises a diffusion pipeline with two primary elements: an auxiliary latent module and a text embedding module. The former uses inputs like text glyph, position, and masked image to generate latent features for text generation or editing. The latter employs an OCR model for encoding stroke data as embeddings, which blend with image caption embeddings from the tokenizer to generate texts that seamlessly integrate with the background. We employed text-control diffusion loss and text perceptual loss for training to further enhance writing accuracy."

-For any usage questions, please refer to the [paper](https://arxiv.org/abs/2311.03054).
+Each text line that needs to be generated should be enclosed in double quotes. For any usage questions, please refer to the [paper](https://arxiv.org/abs/2311.03054).


 ```py
 import torch
 from diffusers import DiffusionPipeline
 from anytext_controlnet import AnyTextControlNetModel
-from diffusers import DDIMScheduler
 from diffusers.utils import load_image

-
 # I chose a font file shared by an HF staff:
 !wget https://huggingface.co/spaces/ysharma/TranslateQuotesInImageForwards/resolve/main/arial-unicode-ms.ttf

-# load control net and stable diffusion v1-5
 anytext_controlnet = AnyTextControlNetModel.from_pretrained("tolgacangoz/anytext-controlnet", torch_dtype=torch.float16,
 variant="fp16",)
 pipe = DiffusionPipeline.from_pretrained("tolgacangoz/anytext", font_path="arial-unicode-ms.ttf",
-controlnet=anytext_controlnet, torch_dtype=torch.float16,
-trust_remote_code=True,
-).to("cuda")
-
-pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
-# uncomment following line if PyTorch>=2.0 is not installed for memory optimization
-#pipe.enable_xformers_memory_efficient_attention()
-
-# uncomment following line if you want to offload the model to CPU for memory optimization
-# also remove the `.to("cuda")` part
-#pipe.enable_model_cpu_offload()
+controlnet=anytext_controlnet, torch_dtype=torch.float16,
+trust_remote_code=False, # One needs to give permission to run this pipeline's code
+).to("cuda")

 # generate image
 prompt = 'photo of caramel macchiato coffee on the table, top-down perspective, with "Any" "Text" written on it using cream'
 draw_pos = load_image("https://raw.githubusercontent.com/tyxsspa/AnyText/refs/heads/main/example_images/gen9.png")
 image = pipe(prompt, num_inference_steps=20, mode="generate", draw_pos=draw_pos,
-).images[0]
+).images[0]
 image
 ```
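
The sentence added to the README says that each text line to be generated must be enclosed in double quotes. A minimal sketch of what that convention means for the example prompt follows. The helper name `extract_quoted_lines` and the regular expression are assumptions made for illustration only; this is not the parsing code used by the AnyText pipeline.

```python
import re
from typing import List


def extract_quoted_lines(prompt: str) -> List[str]:
    """Return every substring of `prompt` enclosed in double quotes, in order.

    Hypothetical helper for illustration; not part of AnyText or Diffusers.
    """
    # Capture the characters between each pair of double quotes.
    return re.findall(r'"([^"]*)"', prompt)


# Same prompt as in the README example: two quoted text lines.
prompt = (
    "photo of caramel macchiato coffee on the table, top-down perspective, "
    'with "Any" "Text" written on it using cream'
)

text_lines = extract_quoted_lines(prompt)
print(text_lines)  # ['Any', 'Text']
```

Under this reading, the prompt carries two text lines, `Any` and `Text`, and a prompt with no double-quoted spans carries none.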

examples/research_projects/anytext/anytext.py

Lines changed: 0 additions & 2 deletions
@@ -149,7 +149,6 @@ def _is_whitespace(self, char):
 >>> import torch
 >>> from diffusers import DiffusionPipeline
 >>> from anytext_controlnet import AnyTextControlNetModel
->>> from diffusers import DDIMScheduler
 >>> from diffusers.utils import load_image

 >>> # I chose a font file shared by an HF staff:
@@ -162,7 +161,6 @@ def _is_whitespace(self, char):
 ... trust_remote_code=False, # One needs to give permission to run this pipeline's code
 ... ).to("cuda")

->>> pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

 >>> # generate image
 >>> prompt = 'photo of caramel macchiato coffee on the table, top-down perspective, with "Any" "Text" written on it using cream'

examples/research_projects/anytext/anytext_controlnet.py

Lines changed: 8 additions & 0 deletions
@@ -11,6 +11,14 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
+#
+# Based on [AnyText: Multilingual Visual Text Generation And Editing](https://huggingface.co/papers/2311.03054).
+# Authors: Yuxiang Tuo, Wangmeng Xiang, Jun-Yan He, Yifeng Geng, Xuansong Xie
+# Code: https://github.com/tyxsspa/AnyText with Apache-2.0 license
+#
+# Adapted to Diffusers by [M. Tolga Cangöz](https://github.com/tolgacangoz).
+
+
 from typing import Any, Dict, Optional, Tuple, Union

 import torch
