@mdayican
Contributor

@mdayican mdayican commented Oct 8, 2025

EM-6870 add mimetype update task

@mdayican mdayican force-pushed the EM-6870-add-mimetask branch from b1781c7 to 457e0ca Compare October 9, 2025 06:02
@mdayican mdayican marked this pull request as ready for review October 10, 2025 12:10
@yogesh-hullatti
Contributor

Hi @mdayican,
My comments are below:

  1. Remove the unused method markMimeTypeUpdated() from src/main/java/uk/gov/hmcts/dm/repository/DocumentContentVersionRepository.java.
  2. DocumentContentVersion has 12 attributes, but we only update 2 of them: mimeType and mimeTypeUpdated. Instead of making a DB call to load the entire object, we can use something like the query below, which I believe is more efficient and lets us remove a lot of the code related to loading and try/catch.
// Efficient: repository-level update to avoid loading the entity
// In DocumentContentVersionRepository:
@Modifying
@Query("update DocumentContentVersion d set d.mimeType = :mimeType, d.mimeTypeUpdated = true where d.id = :id")
int updateMimeType(@Param("id") UUID id, @Param("mimeType") String mimeType);

@mdayican
Contributor Author

Fixed.


private final DocumentContentVersionRepository documentContentVersionRepository;
private final StoredDocumentRepository storedDocumentRepository;
private final MimeTypeDetectionService mimeTypeDetectionService; // New dependency
Contributor

This Java comment can be removed.

  public DocumentContentVersionService(DocumentContentVersionRepository documentContentVersionRepository,
- StoredDocumentRepository storedDocumentRepository) {
+ StoredDocumentRepository storedDocumentRepository,
+ MimeTypeDetectionService mimeTypeDetectionService) { // Injected here
Contributor

This Java comment can be removed.


log.info("Iteration {}: Found {} records to process for MIME type update.", iteration, documentIds.size());

ExecutorService executorService = Executors.newFixedThreadPool(threadLimit);
Contributor

We could also apply the updates in batches, similar to:

int batchCommitSize = 500; // Define the batch size for committing to the DB
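
To make the batching suggestion concrete, here is a minimal sketch using only the JDK. The class name BatchUpdateSketch, the partition and processInBatches helpers, and the commitBatch callback are hypothetical and are not part of this PR. The sketch assumes that each batch would be written in its own transaction by whatever the callback does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.function.Consumer;

/** Hypothetical helper: splits pending document IDs into commit-sized batches. */
class BatchUpdateSketch {

    /** Returns consecutive sublists of at most batchSize elements, in the original order. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /** Hands each batch to commitBatch and returns the number of batches processed. */
    static int processInBatches(List<UUID> ids, int batchCommitSize,
                                Consumer<List<UUID>> commitBatch) {
        List<List<UUID>> batches = partition(ids, batchCommitSize);
        for (List<UUID> batch : batches) {
            // Assumption: the callback persists the whole batch in one transaction.
            commitBatch.accept(batch);
        }
        return batches.size();
    }
}
```

With batchCommitSize = 500, a list of 1201 IDs would be committed as three batches of 500, 500 and 201 records, so a failure part-way through only rolls back the batch in flight.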

log.info("Detected MIME type for {} as: {}", documentVersionId, mimeType);
return mimeType;

} catch (IOException e) {
Contributor

A unit test is missing for this catch block.
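
One way such a test could look, sketched with JDK types only. ContentSource, Detector and detectMimeType are hypothetical stand-ins for the real service and its collaborators, and the sketch assumes the catch block reports failure by returning null; the actual method may behave differently. In the project's own test suite the same idea would presumably be expressed with its existing test framework and a mock that throws IOException.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch of exercising an IOException catch block. */
class MimeDetectionCatchSketch {

    /** Stand-in for whatever opens the stored document content. */
    interface ContentSource {
        InputStream open() throws IOException;
    }

    /** Stand-in for the MIME type detector. */
    interface Detector {
        String detect(InputStream in) throws IOException;
    }

    /** Returns the detected MIME type, or null if reading the content fails. */
    static String detectMimeType(ContentSource source, Detector detector) {
        try (InputStream in = source.open()) {
            return detector.detect(in);
        } catch (IOException e) {
            // Assumption: an I/O failure is signalled with null so the caller can skip the record.
            return null;
        }
    }

    /** The check a unit test would make: a failing source must reach the catch block. */
    static boolean ioFailureYieldsNull() {
        ContentSource failing = () -> {
            throw new IOException("simulated read failure");
        };
        return detectMimeType(failing, in -> "application/pdf") == null;
    }
}
```

The key point is that the test injects a collaborator that throws IOException, then asserts on whatever the catch block is meant to produce.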
