10tenKaizen

Overview

A robust ETL (Extract, Transform, Load) pipeline built on LibreOffice Basic scripting that transforms unstructured dictionary data output from the 10ten Japanese Reader browser extension, by Birchill (dictionary data from JMdict/EDICT, KANJIDIC and JMnedict/ENAMDICT), into normalized, structured LibreOffice Calc spreadsheet data.

Key Features

Complete ETL Workflow: Validates, parses, transforms, loads, and formats data in a single execution
Data Normalization: Converts semi-structured text into structured LibreOffice Calc columns
Error Handling: Multi-layer validation to maintain data integrity
Zero Dependencies: Pure UNO API implementation, no external requirements
Built-in Safeguards: 50-line limit per entry prevents infinite processing loops
Dynamic Row Insertion: Automatically creates rows when entry components exceed available space
Verbose Commentary: Built for LibreOffice Basic learning and ease of maintenance

Installation

(Tested on 10ten Japanese Reader 1.26.1, Mozilla Firefox 147.0.3, and LibreOffice 25.8.4.2)

Clone the repository

git clone https://github.com/sanchesfilho/10tenkaizen

Import the macro to LibreOffice Calc

Open LibreOffice Calc
At the program's main window, navigate: Tools → Macros → Organize Macros → Basic... (keyboard shortcut: Alt + F11)
On the BASIC Macros dialog box, navigate to My Macros → Standard → Module 1, and click "Edit". (Alternatively, you can create new Modules by clicking "Organizer...", navigating to My Macros → Standard, clicking "New", and clicking "Ok" on the "New Module" dialog box.)
The "My Macros & Dialogs" window will appear. There will be a template of code, which may look something like this:

1 REM  *****  BASIC  *****
2 
3 Sub Main
4 
5 End Sub
6

Select and delete all that pre-existing code in the module (e.g. by clicking on the code editor window and pressing Ctrl + A → Delete). Make sure that the cursor is positioned at line 1.
Navigate to File → Import BASIC...
On the Open File dialog, navigate to /10tenkaizen/, select "10tenKaizen_v1.0.0.bas" and click "Open"
Click "Save" (keyboard shortcut: Ctrl + S). You can now run the macro by clicking the "Run" button (keyboard shortcut: F5) inside the "My Macros & Dialogs" window

Set the macro as a shortcut (optional, but recommended)

At the program's main window, navigate to Tools → Customize...
Under the "Keyboard" tab, go into the "Category" panel, and navigate to Application Macros → My Macros → Standard → Module 1 (or other Module you have created).
Select "Macro10tenKaizen" on the "Function" panel.
Select your desired shortcut key or key combination inside the menu "Shortcut Keys", click "Assign", and click "OK".

How to Use

Copy Data: There are two main ways to copy an entry with the 10ten Japanese Reader browser extension:
- Method 1 (Menu navigation): Hover over a Japanese word → wait for the popup → click the word inside the popup → click "Entry" under the "Copy" tab.
- Method 2 (Keyboard shortcut): Hover over a word → wait for the popup → press the sequence C → E on your keyboard.
  IMPORTANT: Other copying methods are NOT supported by this code. Only the "Copy entry" format produces bracketed readings (e.g. [かな]) which are required by this macro.
Paste: Paste the copied 10ten Japanese Reader output (keyboard shortcut: Ctrl + V) into any LibreOffice Calc cell. The Text Import dialog is expected to appear automatically; then, click "OK". This spreads the output vertically across rows in the same column.
(If the Text Import dialog does not appear, use Paste Special (keyboard shortcut: Ctrl + Shift + V) and select "Use text import dialog").
You can paste either one or multiple 10ten Japanese Reader outputs, as long as they are all in the same column and there are no empty cells between them—the macro will automatically detect and process all consecutive entries until an empty cell is encountered.
Execute: Select the first cell of the first entry (i.e., the headword) and run Macro10tenKaizen as explained above, either via the "My Macros & Dialogs" window or the keyboard shortcut you have assigned.
Result: The macro automatically parses, normalizes, structures, merges, numbers, and formats the data across columns A–E.

Processing Flow Example

Raw Data Output From 10ten Japanese Reader: 仮名, 仮字, 假名 [かな] (n) kana; Japanese syllabaries (i.e. hiragana and katakana)

Normalized Data Output Schema via 10tenKaizen:

A	B	C	D	E
`1`	`仮名`	`仮字` `假名`	`かな`	`(n) kana; Japanese syllabaries` `(i.e. hiragana and katakana)`
`Auto-generated` `group identifier`	`Headword`	`Alternative forms`	`Reading`	`Definitions`

ETL Transformation Sequence

Parse: Tokenize the first line from the output ("仮名, 仮字, 假名 [かな]") into semantic components
Extract: Isolate headword, alternative writings, and enclosed reading ("かな") from parsed tokens
Collect: Aggregate definition lines vertically below the main entry (row 2 onwards) in the source column
Calculate: Determine required block height based on whichever is largest: word count, reading count, or definition count
Expand: Dynamically insert rows if calculated height exceeds available space between entries
Distribute: Map parsed elements to normalized columns B–E
Number: Assign sequential Group ID numbers to column A (auto-incremented based on previous blocks)
Structure: Apply cell merging (columns A–D) and border formatting for visual grouping
Cleanup: Erase processed source data to maintain sheet cleanliness
Iterate: Continue processing subsequent entries in the same manner until empty cell is encountered

Author / Contact

Jayme Sanches Filho

GitHub: @sanchesfilho
Email: jssfo@protonmail.com

License

Distributed under the MIT License. See LICENSE for more information.

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
10tenKaizen_v1.0.0.bas		10tenKaizen_v1.0.0.bas
LICENSE		LICENSE
README-ja.md		README-ja.md
README-pt_BR.md		README-pt_BR.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

10tenKaizen

Overview

Key Features

Installation

Clone the repository

Import the macro to LibreOffice Calc

Set the macro as a shortcut (optional, but recommended)

How to Use

Processing Flow Example

ETL Transformation Sequence

Author / Contact

License

About

Uh oh!

Releases

Packages

Languages

License

sanchesfilho/10tenkaizen

Folders and files

Latest commit

History

Repository files navigation

10tenKaizen

Overview

Key Features

Installation

Clone the repository

Import the macro to LibreOffice Calc

Set the macro as a shortcut (optional, but recommended)

How to Use

Processing Flow Example

ETL Transformation Sequence

Author / Contact

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages