
Commit 00ae126

Update Tue Jan 14 16:47:19 CET 2025
1 parent fe8bb73 commit 00ae126

8 files changed: 40 additions & 90 deletions

de/daten.html

Lines changed: 1 addition & 1 deletion
@@ -454,7 +454,7 @@ <h2 id="annotationstiefe-textgenauigkeit-und-artefakte">Annotationstiefe, Textge
 <li>Textzeilen</li>
 </ul>

-<p><a href="https://ocr-d-repo.scc.kit.edu/api/v1/metastore/bagit">Zur Übersicht</a></p>
+<p><a href="https://github.com/OCR-D/gt_structure_text/releases">Zur Übersicht</a></p>

 <p>Die Spezialkorpora umfassen:</p>

en/data.html

Lines changed: 1 addition & 1 deletion
@@ -460,7 +460,7 @@ <h2 id="depth-of-annotation-text-accuracy-and-artifacts">Depth of Annotation, Te
 <li>Text lines</li>
 </ul>

-<p><a href="https://ocr-d-repo.scc.kit.edu/api/v1/metastore/bagit">Overview</a> (The list will be extended continuously.)</p>
+<p><a href="https://github.com/OCR-D/gt_structure_text/releasest">Overview</a></p>

 <p>The special corpora contain:</p>

en/faq.html

Lines changed: 31 additions & 70 deletions
@@ -424,12 +424,10 @@ <h2>
 <li><a href="#what-is-the-difference-between-ocr-d-and-abbyy">What is the difference between OCR-D and ABBYY?</a></li>
 <li><a href="#what-is-the-difference-between-ocr-d-and-tesseract">What is the difference between OCR-D and Tesseract?</a></li>
 <li><a href="#what-is-the-difference-between-ocr-d-and-transkribus">What is the difference between OCR-D and TRANSKRIBUS?</a></li>
-<li><a href="#what-is-the-difference-between-ocr-d-and-ocr4all">What is the difference between OCR-D and ocr4all?</a></li>
 <li><a href="#is-ocr-d-production-ready">Is OCR-D production-ready?</a></li>
 <li><a href="#which-formats-are-supported-by-ocr-d">Which formats are supported by OCR-D?</a></li>
 <li><a href="#why-does-ocr-d-need-mets-files-how-can-i-process-images-without-mets">Why does OCR-D need METS files? How can I process images without METS?</a></li>
 <li><a href="#how-much-does-it-cost-to-deploy-ocr-d">How much does it cost to deploy OCR-D?</a></li>
-<li><a href="#how-are-the-full-texts-produced-by-ocr-d-presented-to-the-library-user-are-they-integrated-into-the-library-catalog-and-can-therefore-be-used-for-full-text-search-in-the-catalog">How are the full texts produced by OCR-D presented to the (library) user? Are they integrated into the library catalog and can therefore be used for full text search in the catalog?</a></li>
 <li><a href="#what-are-the-system-requirements-for-ocr-d-software">What are the system requirements for OCR-D-software?</a></li>
 </ul>
 </li>
@@ -440,49 +438,9 @@ <h2>
 <li><a href="#how-do-i-get-help-on-ocr-d-processors">How do I get help on OCR-D processors?</a></li>
 <li><a href="#how-can-i-specify-parameters-on-the-command-line">How can I specify parameters on the command line?</a></li>
 <li><a href="#how-do-i-specify-multiple-inputoutput-file-groups">How do I specify multiple input/output file groups?</a></li>
-<li><a href="#how-to-configure-logging">How to configure logging?</a></li>
 <li><a href="#how-to-stop-tensorflow-logging-spam">How to stop tensorflow logging spam</a></li>
 </ul>
 </li>
-<li><a href="#ocr-d-module-project-software">OCR-D module project software</a>
-<ul>
-<li><a href="#where-can-i-find-official-ocr-d-module-project-software">Where can I find official OCR-D module project software?</a></li>
-<li><a href="#which-third-party-ocr-d-compatible-software-exists">Which third-party OCR-D-compatible software exists?</a></li>
-<li><a href="#which-processors-are-available">Which processors are available?</a></li>
-</ul>
-</li>
-<li><a href="#workflows-and-processors">Workflows and processors</a>
-<ul>
-<li><a href="#how-can-i-define-workflows">How can I define workflows?</a></li>
-<li><a href="#where-can-i-find-sample-workflows-to-experiment-with">Where can I find sample workflows to experiment with?</a></li>
-<li><a href="#how-to-handle-failing-workflows">How to handle failing workflows?</a></li>
-<li><a href="#why-do-some-processors-have-multiple-input-or-output-file-groups">Why do some processors have multiple input or output file groups?</a></li>
-<li><a href="#where-can-i-learn-about-the-input-and-output-file-groups-of-a-processor">Where can I learn about the input and output file groups of a processor?</a></li>
-<li><a href="#how-can-i-validate-my-workflow-is-correctly-wired">How can I validate my workflow is correctly wired?</a></li>
-<li><a href="#where-can-i-learn-about-the-parameters-of-a-processor">Where can I learn about the parameters of a processor?</a></li>
-</ul>
-</li>
-<li><a href="#ocrd_all"><code class="language-plaintext highlighter-rouge">ocrd_all</code></a>
-<ul>
-<li><a href="#what-is-ocrd_all">What is <code class="language-plaintext highlighter-rouge">ocrd_all</code>?</a></li>
-<li><a href="#how-to-update-ocrd_all">How to update <code class="language-plaintext highlighter-rouge">ocrd_all</code>?</a></li>
-<li><a href="#how-to-debug-ocrd_all-problems">How to debug <code class="language-plaintext highlighter-rouge">ocrd_all</code> problems?</a></li>
-<li><a href="#i-used-sudo-and-now-everything-is-broken">I used <code class="language-plaintext highlighter-rouge">sudo</code> and now everything is broken</a></li>
-</ul>
-</li>
-<li><a href="#training">Training</a>
-<ul>
-<li><a href="#i-want-to-train-a-custom-ocr-model-where-do-i-start">I want to train a custom OCR model. Where do I start?</a></li>
-</ul>
-</li>
-<li><a href="#ocr-d-ground-truth">OCR-D-Ground Truth</a>
-<ul>
-<li><a href="#which-of-the-three-transcription-levels-specified-in-the-transcription-guidelines-was-used-for-the-gt-of-ocr-d">Which of the three transcription levels specified in the Transcription Guidelines was used for the GT of OCR-D?</a></li>
-<li><a href="#are-the-three-transcription-levels-designed-hierarchically-meaning-does-level-3-include-level-2-and-level-1">Are the three transcription levels designed hierarchically? Meaning, does level 3 include level 2 and level 1?</a></li>
-<li><a href="#i-want-to-make-some-gt-myself-which-level-should-i-use-can-i-mix-levels">I want to make some GT myself. Which level should I use? Can I mix levels?</a></li>
-<li><a href="#i-have-some-transcriptions-of-early-modern-books-but-i-didnt-stick-to-the-ocr-d-gt-guidelines-would-my-transcriptions-still-be-useful-for-ocr-d">I have some transcriptions of early modern books, but I didn’t stick to the OCR-D GT guidelines. Would my transcriptions still be useful for OCR-D?</a></li>
-</ul>
-</li>
 </ul>
 </li>
 </ul>
@@ -509,7 +467,8 @@ <h3 id="where-can-i-get-support-on-ocr-d">Where can I get support on OCR-D?</h3>
 <ul>
 <li>Open an issue at the <a href="https://github.com/OCR-D/core">OCR-D/core repository</a></li>
 <li>Chat with OCR-D project members and other OCR-D users in <a href="https://gitter.im/OCR-D/Lobby">OCR-D’s chat room</a>.</li>
-<li>Send an email to …</li>
+<li>Join us at our <a href="https://ocr-d.de/en/community">regular calls</a>.</li>
+<li>Send an email to <a href="mailto:[email protected]?subject=Request%20via%20OCR-D.de">eckert[at]hab.de</a></li>
 </ul>

 <h3 id="what-is-the-difference-between-ocr-d-and-abbyy">What is the difference between OCR-D and ABBYY?</h3>
@@ -553,10 +512,12 @@ <h3 id="what-is-the-difference-between-ocr-d-and-transkribus">What is the differ
 interaction.</li>
 </ul>

-<h3 id="what-is-the-difference-between-ocr-d-and-ocr4all">What is the difference between OCR-D and ocr4all?</h3>
+<!--- ### What is the difference between OCR-D and ocr4all? --->

 <h3 id="is-ocr-d-production-ready">Is OCR-D production-ready?</h3>

+<p>Yes! Several libraries in Germany (e.g. Staatsbibliothek Berlin, ULB Göttingen, ULB Sachsen-Anhalt) are already using OCR-D at a large scale, with over 10 million pages digitized already.</p>
+
 <h3 id="which-formats-are-supported-by-ocr-d">Which formats are supported by OCR-D?</h3>

 <p>OCR-D is primarily based around METS as a container format and PAGE-XML for
@@ -584,7 +545,7 @@ <h3 id="how-much-does-it-cost-to-deploy-ocr-d">How much does it cost to deploy O
 <p>OCR-D is Free Software, licensed under the terms of the Apache 2.0 license and
 will be free to use and adapt in perpetuity.</p>

-<h3 id="how-are-the-full-texts-produced-by-ocr-d-presented-to-the-library-user-are-they-integrated-into-the-library-catalog-and-can-therefore-be-used-for-full-text-search-in-the-catalog">How are the full texts produced by OCR-D presented to the (library) user? Are they integrated into the library catalog and can therefore be used for full text search in the catalog?</h3>
+<!--- ### How are the full texts produced by OCR-D presented to the (library) user? Are they integrated into the library catalog and can therefore be used for full text search in the catalog? --->

 <h3 id="what-are-the-system-requirements-for-ocr-d-software">What are the system requirements for OCR-D-software?</h3>

@@ -658,7 +619,7 @@ <h3 id="how-do-i-specify-multiple-inputoutput-file-groups">How do I specify mult
 <code class="language-plaintext highlighter-rouge">DEFAULT</code> group and region-segmented layout information from the <code class="language-plaintext highlighter-rouge">REGIONS</code>
 group.</p>

-<h3 id="how-to-configure-logging">How to configure logging?</h3>
+<!--- ### How to configure logging? --->

 <h3 id="how-to-stop-tensorflow-logging-spam">How to stop tensorflow logging spam</h3>

@@ -676,53 +637,53 @@ <h3 id="how-to-stop-tensorflow-logging-spam">How to stop tensorflow logging spam
 <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">TF_CPP_MIN_LOG_LEVEL</span><span class="o">=</span>3
 </code></pre></div></div>

-<h2 id="ocr-d-module-project-software">OCR-D module project software</h2>
+<!--- ## OCR-D module project software -->

-<h3 id="where-can-i-find-official-ocr-d-module-project-software">Where can I find official OCR-D module project software?</h3>
+<!--- ### Where can I find official OCR-D module project software? --->

-<h3 id="which-third-party-ocr-d-compatible-software-exists">Which third-party OCR-D-compatible software exists?</h3>
+<!--- ### Which third-party OCR-D-compatible software exists? --->

-<h3 id="which-processors-are-available">Which processors are available?</h3>
+<!--- ### Which processors are available? --->

-<h2 id="workflows-and-processors">Workflows and processors</h2>
+<!--- ## Workflows and processors --->

-<h3 id="how-can-i-define-workflows">How can I define workflows?</h3>
+<!--- ### How can I define workflows? --->

-<h3 id="where-can-i-find-sample-workflows-to-experiment-with">Where can I find sample workflows to experiment with?</h3>
+<!--- ### Where can I find sample workflows to experiment with? --->

-<h3 id="how-to-handle-failing-workflows">How to handle failing workflows?</h3>
+<!--- ### How to handle failing workflows? --->

-<h3 id="why-do-some-processors-have-multiple-input-or-output-file-groups">Why do some processors have multiple input or output file groups?</h3>
+<!--- ### Why do some processors have multiple input or output file groups? --->

-<h3 id="where-can-i-learn-about-the-input-and-output-file-groups-of-a-processor">Where can I learn about the input and output file groups of a processor?</h3>
+<!--- ### Where can I learn about the input and output file groups of a processor? --->

-<h3 id="how-can-i-validate-my-workflow-is-correctly-wired">How can I validate my workflow is correctly wired?</h3>
+<!--- ### How can I validate my workflow is correctly wired? --->

-<h3 id="where-can-i-learn-about-the-parameters-of-a-processor">Where can I learn about the parameters of a processor?</h3>
+<!--- ### Where can I learn about the parameters of a processor? --->

-<h2 id="ocrd_all"><code class="language-plaintext highlighter-rouge">ocrd_all</code></h2>
+<!--- ## `ocrd_all` --->

-<h3 id="what-is-ocrd_all">What is <code class="language-plaintext highlighter-rouge">ocrd_all</code>?</h3>
+<!--- ### What is `ocrd_all`? --->

-<h3 id="how-to-update-ocrd_all">How to update <code class="language-plaintext highlighter-rouge">ocrd_all</code>?</h3>
+<!--- ### How to update `ocrd_all`? --->

-<h3 id="how-to-debug-ocrd_all-problems">How to debug <code class="language-plaintext highlighter-rouge">ocrd_all</code> problems?</h3>
+<!--- ### How to debug `ocrd_all` problems? --->

-<h3 id="i-used-sudo-and-now-everything-is-broken">I used <code class="language-plaintext highlighter-rouge">sudo</code> and now everything is broken</h3>
+<!--- ### I used `sudo` and now everything is broken --->

-<h2 id="training">Training</h2>
+<!--- ## Training --->

-<h3 id="i-want-to-train-a-custom-ocr-model-where-do-i-start">I want to train a custom OCR model. Where do I start?</h3>
+<!--- ### I want to train a custom OCR model. Where do I start? --->

-<h2 id="ocr-d-ground-truth">OCR-D-Ground Truth</h2>
+<!--- ## OCR-D-Ground Truth --->

-<h3 id="which-of-the-three-transcription-levels-specified-in-the-transcription-guidelines-was-used-for-the-gt-of-ocr-d">Which of the three transcription levels specified in the Transcription Guidelines was used for the GT of OCR-D?</h3>
+<!--- ### Which of the three transcription levels specified in the Transcription Guidelines was used for the GT of OCR-D? --->

-<h3 id="are-the-three-transcription-levels-designed-hierarchically-meaning-does-level-3-include-level-2-and-level-1">Are the three transcription levels designed hierarchically? Meaning, does level 3 include level 2 and level 1?</h3>
+<!--- ### Are the three transcription levels designed hierarchically? Meaning, does level 3 include level 2 and level 1? --->

-<h3 id="i-want-to-make-some-gt-myself-which-level-should-i-use-can-i-mix-levels">I want to make some GT myself. Which level should I use? Can I mix levels?</h3>
+<!--- ### I want to make some GT myself. Which level should I use? Can I mix levels? --->

-<h3 id="i-have-some-transcriptions-of-early-modern-books-but-i-didnt-stick-to-the-ocr-d-gt-guidelines-would-my-transcriptions-still-be-useful-for-ocr-d">I have some transcriptions of early modern books, but I didn’t stick to the OCR-D GT guidelines. Would my transcriptions still be useful for OCR-D?</h3>
+<!--- ### I have some transcriptions of early modern books, but I didn't stick to the OCR-D GT guidelines. Would my transcriptions still be useful for OCR-D? --->

 </main>
 </div><footer class="footer" style="padding: 1rem">

en/setup.html

Lines changed: 1 addition & 1 deletion
@@ -922,7 +922,7 @@ <h3 id="installation-1">Installation</h3>

 <h3 id="testing-the-native-installation">Testing the native installation</h3>

-<p>For example, let’s fetch a document from the <a href="https://ocr-d-repo.scc.kit.edu/api/v1/metastore/bagit/">OCR-D GT Repo</a>:</p>
+<p>For example, let’s fetch a document from the <a href="https://ola-hd.ocr-d.de/">OLA-HD Repo</a>:</p>

 <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wget <span class="s2">"https://ola-hd.ocr-d.de/api/export?id=21.11156/BFBAD520-65F4-430A-B4B2-C81A296C9E09&amp;internalId=false"</span> <span class="nt">-O</span> wundt_grundriss_1896.ocrd.zip
 <span class="nb">sudo </span>unzip wundt_grundriss_1896.ocrd.zip

feed.xml

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://ocr-d.de/feed.xml" rel="self" type="application/atom+xml" /><link href="https://ocr-d.de/" rel="alternate" type="text/html" /><updated>2025-01-14T16:21:23+01:00</updated><id>https://ocr-d.de/feed.xml</id><title type="html">OCR-D</title><subtitle>Write an awesome description for your new site here. You can edit this line in _config.yml. It will appear in your document head meta (for Google search results) and in your feed.xml site description.</subtitle><entry xml:lang="de"><title type="html">OCR-D Phase III gestartet</title><link href="https://ocr-d.de/de/2021/08/06/kick-off-phase3.html" rel="alternate" type="text/html" title="OCR-D Phase III gestartet" /><published>2021-08-06T00:00:00+02:00</published><updated>2021-08-06T00:00:00+02:00</updated><id>https://ocr-d.de/de/2021/08/06/kick-off-phase3</id><content type="html" xml:base="https://ocr-d.de/de/2021/08/06/kick-off-phase3.html"><![CDATA[<p>Am 30. Juli fand unser Kick-off-Workshop statt, der die Phase III von OCR-D einläutete.</p>
+<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://ocr-d.de/feed.xml" rel="self" type="application/atom+xml" /><link href="https://ocr-d.de/" rel="alternate" type="text/html" /><updated>2025-01-14T16:47:09+01:00</updated><id>https://ocr-d.de/feed.xml</id><title type="html">OCR-D</title><subtitle>Write an awesome description for your new site here. You can edit this line in _config.yml. It will appear in your document head meta (for Google search results) and in your feed.xml site description.</subtitle><entry xml:lang="de"><title type="html">OCR-D Phase III gestartet</title><link href="https://ocr-d.de/de/2021/08/06/kick-off-phase3.html" rel="alternate" type="text/html" title="OCR-D Phase III gestartet" /><published>2021-08-06T00:00:00+02:00</published><updated>2021-08-06T00:00:00+02:00</updated><id>https://ocr-d.de/de/2021/08/06/kick-off-phase3</id><content type="html" xml:base="https://ocr-d.de/de/2021/08/06/kick-off-phase3.html"><![CDATA[<p>Am 30. Juli fand unser Kick-off-Workshop statt, der die Phase III von OCR-D einläutete.</p>

 <p>Das Team gab eine Einführung in die <a href="/assets/kick-off/phase3.pdf">Ziele und öffentlichen Kommunikationskanäle von OCR-D in Phase III</a>, in <a href="/assets/kick-off/spec_core_ocrd_all.pdf">Status und Pläne der OCR-Software</a> und der <a href="/assets/kick-off/web-api.pdf">Web-API</a> und in den Umgang mit <a href="/assets/kick-off/gt.pdf">Ground Truth Daten in OCR-D</a>. Zudem gab das Koordinierungsprojekt einen Einblick in die bisherige Praxis der <a href="/assets/kick-off/software-development.pdf">Softwareentwicklung in OCR-D</a> mit Möglichkeiten, mitzuwirken.</p>

gt-repo/index.html

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 <h1>Redirecting...</h1>
 <script>
 const u = window.location.href
-window.location.href = 'https://ocr-d-repo.scc.kit.edu/api/v1/metastore/bagit'
+window.location.href = 'https://github.com/OCR-D/gt_structure_text/releases'
 </script>
 </body>
 </html>
