Commit 4e33628

Merge pull request #27 from leferrad/release/1.3.2
Release v1.3.2
2 parents c45886d + dfdde5a commit 4e33628

8 files changed: +58 -9 lines changed

.github/workflows/CI.yml

Lines changed: 2 additions & 0 deletions
@@ -47,6 +47,8 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v2
+      - name: Install Tesseract
+        run: sudo apt-get update && sudo apt-get install -y tesseract-ocr
       - uses: julia-actions/setup-julia@v1
         with:
           version: '1'

.github/workflows/CompatHelper.yml

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@ jobs:
   CompatHelper:
     runs-on: ubuntu-latest
     steps:
+      - name: Install Tesseract
+        run: sudo apt-get update && sudo apt-get install -y tesseract-ocr
       - name: Pkg.add("CompatHelper")
         run: julia -e 'using Pkg; Pkg.add("CompatHelper")'
       - name: CompatHelper.main()

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 name = "OCReract"
 uuid = "c9880795-194d-450c-832d-1e8a03a8ecd1"
 authors = ["Leandro Ferrado"]
-version = "1.3.1"
+version = "1.3.2"

 [deps]
 FileIO = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"

README.md

Lines changed: 3 additions & 1 deletion
@@ -10,11 +10,12 @@
 ## Installation

 From the Julia REPL, type `]` to enter the Pkg REPL mode and run:
+
 ```julia-repl
 pkg> add OCReract
 ```

-This is just a wrapper, so it assumes you already have installed [Tesseract](https://tesseract-ocr.github.io/tessdoc/Installation.html).
+This is just a wrapper, so it assumes you already have installed [Tesseract](https://tesseract-ocr.github.io/tessdoc/Installation.html). Also, be sure the binary `tesseract` is in your PATH (you can check this by running `tesseract --version` in your terminal).

 ## Usage

@@ -37,4 +38,5 @@ julia> println(strip(res_text));
 In a Julia session, run `Pkg.test("OCReract", coverage=true)`.

 ## Next steps
+
 - Develop a module for image pre-processing (to improve OCR results)

docs/src/index.md

Lines changed: 10 additions & 3 deletions
@@ -10,12 +10,13 @@ OCReract is a simple Julia wrapper of the well-known OCR engine called [Tesserac
 The Tesseract OCR engine must be installed manually. On ubuntu, this may be as simple as

 ```
-$ sudo apt-get install -y tesseract-ocr
+sudo apt-get install -y tesseract-ocr
 ```

 but the [installation instructions](https://tesseract-ocr.github.io/tessdoc/Installation.html) are the authoritative source.

 The Julia wrapper can be installed using the Julia package manager. From the Julia REPL, type `]` to enter the Pkg REPL mode and run:
+
 ```julia-repl
 pkg> add OCReract
 ```
@@ -26,7 +27,7 @@ In this simple example, we will process the following image through the two opti

 ![Test Image](https://raw.githubusercontent.com/leferrad/OCReract.jl/master/test/files/noisy.png)

-#### In disk
+### In disk

 Let's execute `run_tesseract` to process the image from repository's test folder, and then `cat` the resulting text file.

@@ -39,7 +40,7 @@ julia> read(`cat $res_path`, String)
 "Noisy image\nto test\nOCReract.jl\n\f"
 ```

-#### In memory
+### In memory

 `OCReract` uses [JuliaImages](https://juliaimages.org/latest/) module to process images in memory. So, the image should be loaded with `Images` module (or the lighter-weight combination `using ImageCore, FileIO`) to then execute `run_tesseract` to retrieve the result as a `String`.

@@ -61,6 +62,12 @@ OCReract.jl
 ```@index
 ```

+```@docs
+OCReract.OCReract
+OCReract.check_tesseract_installed
+OCReract.get_tesseract_version
+```
+
 ```@autodocs
 Modules = [OCReract]
 Private = false

src/OCReract.jl

Lines changed: 3 additions & 1 deletion
@@ -1,7 +1,9 @@
 module OCReract

 export
-    run_tesseract
+    run_tesseract,
+    get_tesseract_version,
+    check_tesseract_installed

 include("tesseract.jl")

src/tesseract.jl

Lines changed: 30 additions & 3 deletions
@@ -3,14 +3,27 @@ using ImageCore
 using Logging

 export
-    run_tesseract
+    run_tesseract,
+    get_tesseract_version,
+    check_tesseract_installed

 # Tesseract settings
 command = "tesseract"
 psm_valid_range = 0:14
 oem_valid_range = 0:4

-"""Util to check and inform whether Tesseract is installed or not"""
+"""
+    check_tesseract_installed()
+
+This function checks if Tesseract is installed in the system by running the command
+`$command --version`. If the command is not recognized, an error is logged.
+
+# Examples
+```julia-repl
+julia> using OCReract;
+julia> check_tesseract_installed()
+```
+"""
 function check_tesseract_installed()
     try
         read(`$command --version`, String);
@@ -21,7 +34,21 @@ end

 check_tesseract_installed()

-"""Util to get version of Tesseract installed"""
+"""
+    get_tesseract_version() -> String
+
+Function to get the version of Tesseract installed in the system.
+The version is extracted from the first line of the output of the command `$command --version`.
+
+# Returns
+- `String`: version of Tesseract installed
+
+# Examples
+```julia-repl
+julia> using OCReract;
+julia> get_tesseract_version()
+```
+"""
 function get_tesseract_version()
     info = read(`$command --version`, String)
     version = split(info, "\n")[1]
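
Reviewer note: the README change in this commit asks users to make sure the `tesseract` binary is in their PATH. As a rough sketch of how that can be checked from Julia without spawning a process, the snippet below uses `Sys.which` from Julia's standard `Base`. The helper name `tesseract_on_path` is hypothetical and is not part of OCReract; the package itself performs its check by running `tesseract --version`, as the hunk above shows.

```julia
# Hypothetical helper (not part of OCReract): report whether an executable
# can be resolved through the PATH environment variable.
function tesseract_on_path(cmd::AbstractString = "tesseract")
    # Sys.which returns the full path to the executable,
    # or `nothing` when it cannot be found.
    return Sys.which(cmd) !== nothing
end

if tesseract_on_path()
    println("tesseract found at ", Sys.which("tesseract"))
else
    @warn "tesseract was not found in PATH; install it before using OCReract"
end
```

Unlike running the command, this only confirms that an executable with that name exists; it does not confirm that the binary starts correctly.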

test/test_tesseract.jl

Lines changed: 7 additions & 0 deletions
@@ -116,6 +116,13 @@ end
 function test_get_tesseract_version()
     version = OCReract.get_tesseract_version()
     tesseract_string, version_string = split(version, " ")
+    # Remove v from version string if it exists
+    if tesseract_string == "tesseract" && version_string[1] == 'v'
+        version_string = version_string[2:end]
+    end
+    # Use only the version number (major.minor.patch)
+    version_string = split(version_string, ".")[1:3]
+    version_string = join(version_string, ".")
     @test tesseract_string == "tesseract"
     @test occursin(r"^([1-9]\d*|0)(\.(([1-9]\d*)|0)){2}$", version_string)
 end
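
Reviewer note: the test change above strips an optional leading `v` and keeps only the first three dot-separated components before matching the version pattern. The same normalization can be sketched as a small pure function, which makes it testable without Tesseract installed. The name `normalize_version` and the sample input strings are assumptions for illustration only; they are not taken from the repository.

```julia
# Hypothetical helper (not part of OCReract): extract "major.minor.patch"
# from the first line printed by `tesseract --version`.
function normalize_version(line::AbstractString)
    # Accept an optional "v" prefix and ignore anything after the patch number.
    m = match(r"^tesseract v?(\d+)\.(\d+)\.(\d+)", line)
    m === nothing && return nothing
    return join(m.captures, ".")
end

normalize_version("tesseract 4.1.1")            # returns "4.1.1"
normalize_version("tesseract v5.3.0.20221222")  # returns "5.3.0"
normalize_version("not a version line")         # returns nothing
```

A single regular expression handles the prefix and the truncation in one step, at the cost of being slightly less explicit than the step-by-step string handling in the test.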
