feat: Add serializer for Histotable used by HighestUriPrecedenceProvider #646

Closed
adam-miller wants to merge 1 commit into internetarchive:master from adam-miller:add-histotable-serializer-for-highesturiprecedenceprovider

Conversation

@adam-miller
Contributor

This fixes an issue when using the HighestUriPrecedenceProvider where the nested generics in the Histotable fail to serialize properly.

Sample config:

  <bean id="frontier" class="org.archive.crawler.frontier.BdbFrontier">
    <property name="queuePrecedencePolicy">
      <bean class="org.archive.crawler.frontier.precedence.HighestUriQueuePrecedencePolicy"/>
    </property>
    <property name="maxRetries" value="40"/>
    <property name="retryDelaySeconds" value="180"/>
    <property name="dumpPendingAtClose" value="true"/>
  </bean>

@adam-miller
Contributor Author

I'm left with the following warning:

00:02  WARN: Class is not registered: org.archive.crawler.frontier.precedence.HighestUriQueuePrecedencePolicy$HighestUriPrecedenceProvider
Note: To register this class use: kryo.register(org.archive.crawler.frontier.precedence.HighestUriQueuePrecedencePolicy.HighestUriPrecedenceProvider.class);

However, this class is in the engine subproject, and importing it into KryoBinding would create a circular dependency. I see that other classes, such as BdbWorkQueue, seem to get around this with an autoregisterTo() method, but it's not clear to me how that method would get called if I implemented it in HighestUriQueuePrecedencePolicy.
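
For reference, here is a minimal, self-contained sketch of the general pattern I believe autoregisterTo() relies on: the low-level module looks up a static hook by name via reflection, so it never needs a compile-time import of the higher-level class. Everything in it is a hypothetical stand-in (Registry, autoregister, Provider, Histogram); it is not Heritrix's or Kryo's actual API, and how Heritrix decides when to invoke the hook is exactly the part I have not confirmed.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.LinkedHashSet;
import java.util.Set;

public class AutoRegisterSketch {

    /** Hypothetical stand-in for the serializer's class registry (not Kryo's API). */
    static class Registry {
        private final Set<Class<?>> registered = new LinkedHashSet<>();

        void register(Class<?> type) {
            registered.add(type);
        }

        boolean isRegistered(Class<?> type) {
            return registered.contains(type);
        }
    }

    /**
     * Would live in the low-level module. It only receives a Class object,
     * so it has no compile-time dependency on the classes it registers.
     */
    static void autoregister(Registry registry, Class<?> type) {
        try {
            // Look for a static hook named autoregisterTo(Registry) on the class.
            Method hook = type.getDeclaredMethod("autoregisterTo", Registry.class);
            if (!Modifier.isStatic(hook.getModifiers())) {
                throw new IllegalStateException("autoregisterTo must be static on " + type);
            }
            hook.setAccessible(true);
            hook.invoke(null, registry);
        } catch (NoSuchMethodException e) {
            // No hook declared: fall back to registering just the class itself.
            registry.register(type);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("autoregisterTo failed for " + type, e);
        }
    }

    /** Hypothetical stand-in for a helper type such as Histotable. */
    static class Histogram {
    }

    /** Hypothetical stand-in for a higher-level class that owns its own registrations. */
    static class Provider {
        static void autoregisterTo(Registry registry) {
            registry.register(Provider.class);
            registry.register(Histogram.class); // helper types it serializes
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        autoregister(registry, Provider.class);
        System.out.println("Provider registered: " + registry.isRegistered(Provider.class));
        System.out.println("Histogram registered: " + registry.isRegistered(Histogram.class));
    }
}
```

In this sketch the caller still has to hand the Class object to autoregister(); in a real split across subprojects that could come from Class.forName with a class name, or from whatever code first encounters an instance of the class.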

@adam-miller adam-miller marked this pull request as draft April 22, 2025 00:57
@adam-miller adam-miller changed the base branch from master to master-ait-contrib April 22, 2025 01:25
@adam-miller adam-miller changed the base branch from master-ait-contrib to master April 22, 2025 01:26
@adam-miller adam-miller deleted the add-histotable-serializer-for-highesturiprecedenceprovider branch April 24, 2025 15:25