Skip to content

Conversation

@ben-kaufman
Copy link
Contributor

Adds a script to pull the latest Android translation files from Transifex and adjust the folders and format to match the project.

Note: Diffs in the strings files are due to formatting of the last line or new strings which were translated.

@claude
Copy link

claude bot commented Nov 11, 2025

Pull Request Review: Add script to pull Transifex translations

Summary

This PR adds automation for pulling translation files from Transifex, which is a good improvement over the previous manual process. The script is generally well-structured, but there are several issues that should be addressed.


Critical Issues

1. Missing Newline at End of File (.tx/config:8)

Issue: The .tx/config file is missing a newline at the end, which violates POSIX standards and can cause issues with some tools.

Fix: Add a newline at the end of the file.


Code Quality and Best Practices

2. Unsafe Use of find -exec mv (Line 21)

Issue: The command at line 21 could fail silently if files already exist in the destination, and the OR true swallows all errors.

Location: scripts/pull-translations.sh:21

Recommendation: Use a while loop with better error handling to avoid silently ignoring move failures.

3. Inconsistent Error Handling

Issue: The script uses set -e but then uses OR true and redirects stderr to /dev/null in multiple places, which can mask real errors.

Locations:

  • Line 21: OR true after find command
  • Line 92: redirects stderr when moving temp file
  • Line 102: suppresses wc errors

Recommendation: Be more selective about which errors to suppress. Consider using set +e around specific commands where you expect and handle failures explicitly.

4. Race Condition in Directory Deletion (Lines 116-123)

Issue: The script collects directories in an array but does not sort them depth-first, which could cause issues if nested directories are processed out of order.

Recommendation: Add reverse sort to ensure deeper directories are processed first.

5. AWK Script Could Be More Robust (Lines 81-91)

Issue: The AWK script modifies XML files but does not validate that they are well-formed before or after processing.

Recommendation: Add basic XML validation before processing files.


Performance Considerations

6. Multiple Passes Over Filesystem

Issue: The script makes multiple separate passes with find commands.

Assessment: Given that this is a one-time operation per pull, the performance impact is minimal. The current approach is more readable and maintainable.


Security Concerns

7. Path Injection Risk (Minor)

Issue: While variables are properly quoted, the script does not validate that RES_DIR exists and is within the expected location before performing operations.

Location: scripts/pull-translations.sh:7

Recommendation: Add validation to ensure the script is run from the project root.

8. No Validation of Transifex CLI Output

Issue: The script runs tx pull -a but does not check if it succeeded or if the files pulled are valid.

Location: scripts/pull-translations.sh:26

Recommendation: Add error checking for the Transifex command.


Documentation

9. Strong Documentation in README

Positive: The README.md additions are clear and helpful, providing step-by-step instructions for installing Transifex CLI, authenticating, and running the script.

10. Script Could Use More Comments

Suggestion: While the script is reasonably self-documenting, adding comments for the complex regex and AWK sections would help future maintainers.


Test Coverage

11. No Automated Tests

Issue: There are no tests for this script. Consider adding a dry-run mode that shows what would be changed without making changes.


Minor Improvements

12. Hardcoded Values

Location: Lines 41-59 contain hardcoded language code mappings.

Suggestion: Consider extracting these to a configuration file or associative array at the top of the script for easier maintenance.

13. Progress Visibility

Positive: The script has good progress output with counts at each stage. This is excellent for debugging and user feedback.


Summary of Recommendations

Must Fix:

  1. Add newline to .tx/config
  2. Add error checking for tx pull -a command
  3. Add project root validation before operations

Should Fix:
4. Improve error handling in merge function
5. Fix directory deletion order with reverse sort
6. Add XML validation before processing

Nice to Have:
7. Add dry-run mode for testing
8. Extract language mappings to configuration
9. Add more inline comments for complex sections


Overall Assessment

This is a solid contribution that improves the translation workflow. The script is well-structured with good user feedback. The main concerns are around error handling and validation. With the recommended fixes, this will be production-ready.

Recommendation: Approve with minor changes requested

@claude
Copy link

claude bot commented Nov 11, 2025

Pull Request Review - PR #464

Summary

This PR adds a comprehensive script for pulling Android translations from Transifex with automatic cleanup and normalization. Overall, this is a well-structured and valuable addition to the project's localization workflow. The code quality is high, with good error handling and clear documentation.


✅ Strengths

Code Quality

  • Excellent error handling: The script properly validates prerequisites, checks command availability, and handles errors gracefully
  • Clear structure: Well-organized with helper functions (rename_or_merge, validate_xml) that have single responsibilities
  • Good use of bash best practices:
    • Uses set -e with appropriate set +e sections where needed
    • Properly quotes variables to handle spaces in paths
    • Uses process substitution with find and null-terminated strings (-print0)
    • Uses local variables in functions

Documentation

  • Comprehensive README updates: Clear installation and authentication instructions for Transifex CLI
  • Inline comments: Script has helpful comments explaining logic, especially for complex regex patterns
  • User-friendly output: Progress messages are clear and informative

Functionality

  • Robust language code handling: Properly handles Android's BCP 47 format (values-b+es+419) and locale-specific translations
  • XML normalization: Ensures consistent formatting across all translation files
  • Cleanup logic: Removes empty files and directories to keep the repo clean

🔍 Potential Issues & Suggestions

1. Edge Case: Array handling in directory cleanup (Line 207)

for dir in $(printf '%s\n' "${dirs_to_check[@]}" | sort -u | sort -r); do

Issue: If dirs_to_check array is empty, this could cause issues.

Suggestion: Add a check before iterating to prevent issues with empty arrays.

2. Potential data loss concern (Lines 125-127)

values-b+pt+PT|values-pt_PT)
    echo "  Removing: $dir_name"
    rm -rf "$dir" && REMOVED_COUNT=$((REMOVED_COUNT + 1))

Concern: The script unconditionally deletes Portuguese-Portugal translations with rm -rf. If there are unique strings in these directories, they'll be lost.

Suggestions:

  • Consider logging which files are being deleted for audit purposes
  • Or merge pt-PT into pt-BR (values-pt) instead of deleting
  • Add a dry-run mode (--dry-run flag) to preview changes before executing

3. XML validation is basic (Lines 70-78)

Issue: The validation only checks for opening/closing tags but won't catch malformed XML inside the file.

Suggestion: Consider using xmllint for more robust validation if available, with fallback to the current approach.

4. Merge conflicts in rename_or_merge (Lines 47-52)

Issue: If a file with the same name exists in the destination, mv will overwrite it without warning.

Suggestion: Add mv -n (no-clobber) flag or implement conflict detection to prevent silent overwrites.


🔒 Security Considerations

✅ Good Practices

  • No credential hardcoding (uses Transifex CLI authentication)
  • Validates paths before operations
  • Uses 2>/dev/null to suppress error output appropriately

⚠️ Minor Concerns

  1. Temporary file handling (Line 169): Creates .tmp files that could be exposed briefly

    • Consider using mktemp for safer temporary file creation
  2. rm -rf usage (Line 127): While appropriate here, it's inherently dangerous

    • Current implementation is acceptable as it only operates on validated translation directories

📊 Test Coverage

Missing Tests

  • No automated tests for the script: Consider adding unit tests using bats-core or similar framework
  • Integration test that creates mock translation directories, runs the script, and validates expected output

🎯 Performance Considerations

Efficient Patterns

  • ✅ Uses find with -print0 for safe file handling
  • ✅ Processes files in single pass where possible
  • ✅ Uses awk instead of multiple sed calls for normalization

Current implementation is performant for the typical number of Android locales.


📝 Documentation Review

README.md Changes

Excellent additions:

  • Clear step-by-step instructions
  • Links to official Transifex documentation
  • Example .transifexrc configuration
  • Proper warning about API token handling

Minor suggestion: Add a note about what to do if translations conflict with local changes.


🔄 Transifex Config (.tx/config)

The configuration looks correct and is properly configured for Android resource structure.


📋 Translation File Changes

The changes to translation files appear to be:

  1. Formatting normalization: Removing trailing whitespace, ensuring newline at EOF
  2. New translations: Added strings in various languages (Arabic, Spanish, French)

These changes are expected output from the script and look correct.


🎯 Recommendations Summary

High Priority

  1. Add protection against empty array iteration (Line 207)
  2. Consider adding a --dry-run flag to preview changes

Medium Priority

  1. Improve XML validation with xmllint if available
  2. Add merge conflict detection in rename_or_merge
  3. Add basic tests for the script

Low Priority

  1. Document merge vs. delete behavior for Portuguese locales
  2. Add verbose mode (-v flag) for debugging

✅ Approval Status

Overall Assessment: ✅ APPROVE with minor suggestions

This is a solid, production-ready script that will significantly improve the translation workflow. The suggestions above are mostly for edge case handling and future maintainability rather than critical bugs.

Recommendation:

  • Safe to merge as-is if time is critical
  • Ideally, address the empty array check (Priority 1) before merging
  • Other suggestions can be addressed in follow-up PRs

Great work on this automation! 🎉


Review generated by Claude Code

@ovitrif
Copy link
Collaborator

ovitrif commented Nov 13, 2025

@ben-kaufman can you please merge with master, that will fix the e2e tests runs

Copy link
Collaborator

@ovitrif ovitrif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👏🏻

@claude
Copy link

claude bot commented Nov 13, 2025

Pull Request Review: Add script to pull Transifex translations

Overall Assessment

This is a well-structured PR that adds automation for pulling and normalizing translations from Transifex. The script is comprehensive and handles edge cases well. Below are detailed findings across different areas.


Code Quality and Best Practices ⭐

Strengths:

  • Excellent error handling: The script uses set -e appropriately and temporarily disables it when needed (lines 90-93, 192-204)
  • Good validation: Validates prerequisites before execution (lines 12-21, 82-87)
  • Clear structure: Well-organized with helper functions and logical flow
  • Informative output: Provides clear progress messages and summaries
  • Defensive programming: Uses null-terminated strings with -print0 and proper quoting

Areas for Improvement:

  1. Line 47-52: Race condition in merge operation

    while IFS= read -r -d '' item; do
        if ! mv "$item" "$dst/" 2>/dev/null; then

    The find command could theoretically list items that disappear before mv executes. This is handled gracefully with error counting, but consider adding a check:

    [ -e "$item" ] && ! mv "$item" "$dst/" 2>/dev/null
  2. Line 135: Complex sed expression could be clearer

    new_name=$(echo "$dir_name" | sed 's/values-b+\([a-z][a-z]*\)+\([A-Z0-9][A-Z0-9]*\)/values-\1_\2/')

    Consider adding a comment explaining the BCP 47 to underscore format conversion, or extract to a function with a descriptive name.

  3. Line 194-195: Multiple processes spawned for simple checks

    line_count=$(wc -l < "$file" 2>/dev/null | tr -d ' ' || echo "0")
    string_count=$(grep -c '<string name=' "$file" 2>/dev/null || echo "0")

    Could be optimized by reading the file once:

    IFS=$'\n' read -d '' -r -a file_info < <(awk '/^/ {lines++} /<string name=/ {strings++} END {print lines; print strings}' "$file" || echo -e "0\n0")
    line_count="${file_info[0]}"
    string_count="${file_info[1]}"
  4. Missing SheBang verification: Consider adding bash version check since the script uses bash-specific features like BASH_SOURCE, process substitution, and arrays.


Potential Bugs or Issues 🐛

  1. Line 157-168: AWK logic issue

    if (saw_resources && $0 == "") next
    if (saw_resources && $0 != "") saw_resources = 0

    This logic seems intended to remove trailing empty lines after </resources>, but the second condition saw_resources = 0 would be triggered by any non-empty line after </resources>, potentially allowing empty lines later. The logic should be:

    if (saw_resources) next  # Skip all lines after </resources>
  2. Line 207: Array iteration without bounds check

    for dir in $(printf '%s\n' "${dirs_to_check[@]}" | sort -u | sort -r); do

    If dirs_to_check is empty, this could produce unexpected behavior. Add:

    if [ ${#dirs_to_check[@]} -eq 0 ]; then
        echo "  No directories to check"
    else
        for dir in $(printf '%s\n' "${dirs_to_check[@]}" | sort -u | sort -r); do
  3. Line 73: XML validation is too basic
    The validation only checks for presence of tags, not well-formedness:

    if ! grep -q '<resources' "$file" 2>/dev/null || ! grep -q '</resources>' "$file" 2>/dev/null; then

    Consider using xmllint if available for proper validation:

    if command -v xmllint &> /dev/null; then
        xmllint --noout "$file" 2>/dev/null || return 1
    else
        # fallback to basic validation
    fi
  4. Line 126-127: rm -rf without confirmation

    rm -rf "$dir" && REMOVED_COUNT=$((REMOVED_COUNT + 1))

    While this is scripted automation, consider logging what's being removed before deletion for debugging purposes.


Performance Considerations ⚡

  1. Multiple find operations: The script runs find multiple times (lines 141, 180, 203, 210). Consider consolidating where possible.

  2. Serial processing: XML normalization (lines 151-180) processes files serially. For large translation sets, consider parallel processing with GNU parallel or background jobs.

  3. Redundant sorting: Line 207 does sort -u | sort -r - the second sort is unnecessary if you use sort -ur.


Security Concerns 🔒

Good Security Practices:

  • Proper quoting prevents injection attacks
  • Uses 2>/dev/null instead of exposing errors that could leak paths
  • Validates input directories exist

Recommendations:

  1. API Token in README.md (lines 13-19): The README instructs users to store their Transifex token in plaintext in ~/.transifexrc. Consider mentioning:

    • Set restrictive permissions: chmod 600 ~/.transifexrc
    • Alternative: Use environment variables instead
    • Warn not to commit this file
  2. Temporary file handling (line 169): Uses predictable temporary file names ($file.tmp). While low risk, consider using mktemp for better security:

    tmp_file=$(mktemp) || { echo "Failed to create temp file"; continue; }
    if awk '...' "$file" > "$tmp_file"; then
        mv "$tmp_file" "$file"
    fi
    rm -f "$tmp_file"  # Cleanup on failure

Test Coverage 🧪

Missing:

  • No automated tests for the script
  • No CI/CD integration to verify script functionality

Recommendations:

  1. Add a test script that:

    • Mocks the Transifex CLI
    • Creates test XML files with various edge cases
    • Verifies directory renaming logic
    • Tests empty file cleanup
  2. Consider adding the script to CI to verify it remains executable and has no syntax errors:

    - name: Validate scripts
      run: |
        bash -n scripts/pull-translations.sh
        # shellcheck scripts/pull-translations.sh  # if available

Documentation 📝

Strengths:

  • Excellent README updates with clear installation and usage instructions
  • Good inline comments explaining complex logic
  • Clear Transifex configuration

Suggestions:

  1. Add a comment at the top of the script explaining its purpose and high-level flow
  2. Document the expected directory structure and naming conventions
  3. Add examples of the transformations (e.g., values-arbvalues-ar)

Additional Observations

  1. .tx/config file: Clean and properly configured for Android string resources

  2. Translation file changes: The changes to strings.xml files are minimal (mostly formatting), which confirms the script's normalization is working as intended

  3. Locale handling: The script correctly handles complex Android locale naming:

    • BCP 47 format (values-b+es+419)
    • Underscore format (values-es_ES)
    • Regional variants

Recommendations Summary

High Priority:

  • Fix AWK logic in XML normalization (line 157-168)
  • Add XML validation with xmllint if available
  • Document security best practices for API token storage

Medium Priority:

  • Add basic test coverage
  • Optimize performance for large translation sets
  • Add bounds checking for empty arrays

Low Priority:

  • Refactor complex sed expressions for clarity
  • Consolidate redundant find operations
  • Use mktemp for temporary files

Conclusion

This is a solid, production-ready script with good error handling and clear logic. The issues identified are mostly minor improvements rather than blocking problems. The automation will significantly improve the translation workflow.

Recommendation: Approve with minor suggestions

Great work on the thorough error handling and edge case management!

@ovitrif ovitrif merged commit 24be4bc into master Nov 13, 2025
14 checks passed
@ovitrif ovitrif deleted the chore/update-translations branch November 13, 2025 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants