Retrospective research opportunities and forward-thinking testing/engagement using ClearML data #915

@Enkidu93

Description

Some time ago, John developed a script to scrape build job data from ClearML (research and production runs), which I've since refined and expanded. Currently, we only use this data (to my knowledge) for some high-level reporting (e.g., how many projects are using Serval, how many new projects there are this month, etc.), but I think there's a lot more we could glean from it. Here are some ideas:
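
For context on what "scraping" looks like here: the data can be pulled with the ClearML Python SDK. Below is a minimal sketch of that kind of query, not the actual script; the project names, status filter, and record shape are all assumptions for illustration.

```python
from clearml import Task

def scrape_runs(project_name: str) -> list[dict]:
    """Pull basic metadata for every completed task in one ClearML project."""
    tasks = Task.get_tasks(
        project_name=project_name,
        task_filter={"status": ["completed"]},  # only finished runs
    )
    return [
        {
            "task_id": t.id,
            "name": t.name,
            "project": project_name,
            "created": t.data.created,               # creation timestamp
            "config": t.get_parameters_as_dict(),    # full (hyper)parameter tree
            "metrics": t.get_last_scalar_metrics(),  # last reported scalar values
        }
        for t in tasks
    ]

# Hypothetical project names; one ClearML project per translation project is assumed.
runs = [r for p in ("ProjectA", "ProjectB") for r in scrape_runs(p)]
```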

  • Using the ClearML data, we can identify which production projects are long-time, consistent users of our tools (see the sketch after this list). Knowing this, we could:
    • Update/expand our list of standard NMT testing projects to include some of these projects, so we can see how updates to the pipeline would affect our most consistent users, not only with respect to scores like BLEU but also with respect to newer features like marker placement or quotation denormalization.
    • Attempt to identify patterns in these projects: Are they from certain language families? Certain regions? Certain partner organizations? If we ran a test over a random sample of 250 projects, would the long-time projects tend to score higher than the others?
    • Reach out to the project owners and develop some kind of inner circle where we could explore feature ideas and hear concerns.
  • Using the production data, we could also do the reverse: identify projects that tried our tools once, a long time ago, and haven't tried them since. We could take the same steps for these projects as in the bullet points above, as well as:
    • Reach out to these projects and ask whether they've encountered difficulties with their drafts, or encourage them to retry (if it's been long enough).
  • Since we can scrape the complete config from all research ClearML runs through the API, there's an opportunity to analyze the effect of different configuration options retrospectively (see the sketch after this list). This could include:
    • Language or script codes (which could be mapped to families, regions, etc.)
    • Hyperparameters
    • Or simply establishing baselines across many runs, or tracking long-term trends (are our drafts getting better?)
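
To make the analyses referenced above concrete, here is a hedged sketch over records shaped like the `scrape_runs` output from the first snippet. The thresholds and column names (`config.General.mixed_precision`, `metrics.eval.BLEU.last`) are assumptions; the real names depend on how parameters and metrics were logged.

```python
import pandas as pd

# Flatten the nested config/metrics dicts into dotted columns via json_normalize.
df = pd.json_normalize(runs)
df["created"] = pd.to_datetime(df["created"], utc=True)

# Long-time, consistent users: runs spanning a year or more, recurring often.
by_project = df.groupby("project")["created"].agg(["min", "max", "count"])
by_project["span_days"] = (by_project["max"] - by_project["min"]).dt.days
long_timers = by_project[(by_project["span_days"] >= 365) & (by_project["count"] >= 10)]

# The reverse: one-off projects whose single run is more than six months old.
cutoff = pd.Timestamp.now(tz="UTC") - pd.DateOffset(months=6)
one_offs = by_project[(by_project["count"] == 1) & (by_project["max"] < cutoff)]

# Retrospective effect of one configuration option on scores.
flag_col = "config.General.mixed_precision"  # hypothetical flag name
score_col = "metrics.eval.BLEU.last"         # hypothetical metric path
print(df.groupby(flag_col)[score_col].describe())
```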

And I'm sure there are more opportunities than these!
