# OCI Language Translation Tools

## Introduction

This repository contains two tools built on the OCI Language Translation service:

1. **Bulk Document Translation**: Automatically translates multiple documents stored in an OCI Object Storage bucket. The tool supports various document formats and preserves the original file structure in the target bucket.

2. **CSV/JSON Field Translation**: Translates selected columns in CSV files or selected keys in JSON documents while leaving the rest of the structure intact. This is useful for localizing data files without changing their format or the fields that should stay untranslated.

## Prerequisites

- Python 3.8 or higher
- OCI account with the Language Translation service enabled
- Required IAM policies and permissions
- Object Storage buckets (for document translation)
- OCI CLI configured with proper credentials

### OCI Setup Requirements

1. Create an OCI account if you don't have one
2. Enable the Language Translation service in your tenancy
3. Set up OCI CLI and create API keys:
   ```bash
   # Install OCI CLI
   bash -c "$(curl -L https://raw.githubusercontent.com/oracle/oci-cli/master/scripts/install/install.sh)"

   # Configure OCI CLI (this will create ~/.oci/config)
   oci setup config
   ```
4. Set up appropriate IAM policies
5. Create source and target buckets in Object Storage (for document translation)
6. Note your Object Storage namespace (visible in the OCI Console under Object Storage)

## Getting Started

1. Clone this repository:
   ```bash
   git clone <repository-url>
   cd oci-language-translation
   ```

2. Install required dependencies:
   ```bash
   pip install -r requirements.txt
   ```

3. Optionally set environment variables (each value can be set in `config.yaml` instead; a variable that is set takes precedence):
   ```bash
   # Optional: all of these values can be set in config.yaml
   export OCI_COMPARTMENT_ID="ocid1.compartment.oc1..your-compartment-id"
   export OCI_SOURCE_LANG="en"
   export OCI_TARGET_LANG="es"
   ```

4. Update `config.yaml` with your translation and storage settings:
   ```yaml
   # Language Translation Service Configuration
   language_translation:
     compartment_id: "ocid1.compartment.oc1..your-compartment-id"
     source_bucket: "source-bucket-name"
     target_bucket: "target-bucket-name"
     source_language: "en"  # ISO language code
     target_language: "es"  # ISO language code

   # Object Storage Configuration
   object_storage:
     namespace: "your-namespace"      # Your tenancy's Object Storage namespace
     bucket_name: "your-bucket-name"  # Bucket for CSV/JSON translations
   ```

5. For bulk document translation:
   ```bash
   python bucket_translation.py
   ```

6. For CSV/JSON translation:
   ```bash
   # For CSV files (column numbers start from 1)
   python csv_json_translation.py csv input.csv output.csv 1 2 3

   # For JSON files
   python csv_json_translation.py json input.json output.json key1 key2
   ```

## Usage Examples

### Bulk Document Translation
```bash
# Translate all documents from source bucket to target bucket
python bucket_translation.py
```

### CSV Translation
```bash
# Translate columns 1, 3, and 5 from English to Spanish
python csv_json_translation.py csv products.csv products_es.csv 1 3 5
```
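
The column selection in the command above can be sketched roughly as follows. This is an illustration only, not the code in `csv_json_translation.py`: the `translate_columns` helper, the `has_header` flag, and the `fake_translate` stand-in are hypothetical names, and the sketch assumes the first row is a header that stays untranslated. In a real run the `translate` callable would wrap a request to the OCI Language service.

```python
import csv
import io
from typing import Callable, Iterable, Set


def translate_columns(
    reader: Iterable[list],
    columns: Set[int],
    translate: Callable[[str], str],
    has_header: bool = True,
) -> list:
    """Return rows with the given 1-based columns passed through `translate`."""
    out = []
    for row_index, row in enumerate(reader):
        if has_header and row_index == 0:
            out.append(list(row))  # assumption: the header row is copied unchanged
            continue
        out.append([
            # Column numbers start from 1; empty cells are left as they are.
            translate(cell) if (col in columns and cell.strip()) else cell
            for col, cell in enumerate(row, start=1)
        ])
    return out


def fake_translate(text: str) -> str:
    """Stand-in translator for illustration; it only tags the text."""
    return f"[es] {text}"


source = io.StringIO("name,sku,details\nBlue shirt,A1,Cotton\n")
rows = translate_columns(csv.reader(source), {1, 3}, fake_translate)
```

Passing the translator in as a callable keeps the column logic testable without network access; columns that are not listed, such as `sku` here, are copied through unchanged.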

### JSON Translation
```bash
# Translate 'name' and 'details' fields in a JSON file
python csv_json_translation.py json catalog.json catalog_es.json name details
```
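
One way to translate only the named keys while preserving the rest of a JSON document is a recursive copy. The sketch below is a hypothetical illustration, not the tool's actual implementation: `translate_keys` and `fake_translate` are made-up names, and matching keys at every nesting depth is an assumption about the desired behavior.

```python
import json
from typing import Any, Callable, Set


def translate_keys(node: Any, keys: Set[str], translate: Callable[[str], str]) -> Any:
    """Recursively copy `node`, translating string values stored under `keys`."""
    if isinstance(node, dict):
        return {
            k: translate(v) if (k in keys and isinstance(v, str))
            else translate_keys(v, keys, translate)
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [translate_keys(item, keys, translate) for item in node]
    return node  # numbers, booleans, null, and untargeted strings pass through


def fake_translate(text: str) -> str:
    """Stand-in translator for illustration; it only tags the text."""
    return f"[es] {text}"


catalog = json.loads(
    '[{"name": "Lamp", "sku": "L-1", "details": "Desk lamp", "price": 20}]'
)
translated = translate_keys(catalog, {"name", "details"}, fake_translate)
```

The function builds a new structure instead of mutating the input, so the original document stays available for comparison or retries.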

## Configuration

The project uses three types of configuration:

1. **OCI Configuration** (`~/.oci/config`):
   - Created automatically by `oci setup config`
   - Contains your OCI authentication details
   - Used for API authentication

2. **Translation Configuration** (`config.yaml`):
   ```yaml
   # Language Translation Service Configuration
   language_translation:
     compartment_id: "ocid1.compartment.oc1..your-compartment-id"
     source_bucket: "source-bucket-name"
     target_bucket: "target-bucket-name"
     source_language: "en"
     target_language: "es"

   # Object Storage Configuration
   object_storage:
     namespace: "your-namespace"      # Your tenancy's Object Storage namespace
     bucket_name: "your-bucket-name"  # Bucket for CSV/JSON translations
   ```

3. **Environment Variables** (optional, override config.yaml):
   - `OCI_COMPARTMENT_ID`: Your OCI compartment OCID
   - `OCI_SOURCE_LANG`: Source language code
   - `OCI_TARGET_LANG`: Target language code

### Configuration Priority

The configuration values are loaded in the following priority order:
1. Environment variables (if set)
2. Values from config.yaml
3. Default values (for language codes only: en -> es)
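
The priority order above can be sketched as a small lookup. This is a minimal illustration under stated assumptions, not the project's actual loader: `resolve_settings`, `DEFAULTS`, and `ENV_VARS` are hypothetical names, `file_config` stands for the already-parsed `language_translation` section of `config.yaml`, and an empty environment variable is treated as unset.

```python
import os
from typing import Mapping, Optional

# Defaults exist for the language codes only (en -> es).
DEFAULTS = {"source_language": "en", "target_language": "es"}

# Setting name -> environment variable that overrides it.
ENV_VARS = {
    "compartment_id": "OCI_COMPARTMENT_ID",
    "source_language": "OCI_SOURCE_LANG",
    "target_language": "OCI_TARGET_LANG",
}


def resolve_settings(
    file_config: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Resolve each setting: environment variable, then config.yaml, then default."""
    env = os.environ if env is None else env
    resolved = {}
    for key, env_name in ENV_VARS.items():
        value = env.get(env_name) or file_config.get(key) or DEFAULTS.get(key)
        if value is None:
            raise ValueError(f"{key} must be set via {env_name} or config.yaml")
        resolved[key] = value
    return resolved


# Example: the environment overrides the file, and the file overrides the default.
file_config = {
    "compartment_id": "ocid1.compartment.oc1..example",
    "target_language": "fr",
}
settings = resolve_settings(file_config, env={"OCI_TARGET_LANG": "de"})
```

Here `target_language` resolves to `de` from the environment, `compartment_id` comes from the file, and `source_language` falls back to the default `en`.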

## Supported Languages

The service supports a wide range of languages. Common language codes include:
- English (en)
- Spanish (es)
- French (fr)
- German (de)
- Italian (it)
- Portuguese (pt)
- Chinese Simplified (zh-CN)
- Japanese (ja)

For a complete list of supported languages, refer to the OCI Language documentation.

## Error Handling

Both tools include error handling for the following:
- Configuration validation
- Service availability checks
- File format validation
- Translation status monitoring
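
As an illustration of the first item, configuration validation could look something like the sketch below. The specific checks, the `validate_translation_config` name, and the `REQUIRED_KEYS` tuple are assumptions made for this example; the checks the tools actually perform may differ.

```python
from typing import List, Mapping

# Keys expected in the `language_translation` section of config.yaml.
REQUIRED_KEYS = (
    "compartment_id",
    "source_bucket",
    "target_bucket",
    "source_language",
    "target_language",
)


def validate_translation_config(section: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the section looks usable."""
    problems = []
    for key in REQUIRED_KEYS:
        if not str(section.get(key) or "").strip():
            problems.append(f"missing value for '{key}'")
    compartment = str(section.get("compartment_id") or "")
    if compartment and not compartment.startswith("ocid1."):
        problems.append("compartment_id does not look like an OCID")
    source = section.get("source_language")
    if source and source == section.get("target_language"):
        problems.append("source_language and target_language are identical")
    return problems


good = {
    "compartment_id": "ocid1.compartment.oc1..example",
    "source_bucket": "source-bucket-name",
    "target_bucket": "target-bucket-name",
    "source_language": "en",
    "target_language": "es",
}
bad = {"compartment_id": "my-compartment", "source_language": "en", "target_language": "en"}

good_problems = validate_translation_config(good)
bad_problems = validate_translation_config(bad)
```

Collecting every problem into a list, instead of raising on the first one, lets a user fix the whole configuration in a single pass.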

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License; see the LICENSE file for details.