This documentation explains how to integrate internal linking analysis data (PageRank, inbound links, etc.) into your Solr search results to improve relevancy and enable advanced search features.
- PageRank Integration: Boost search results based on page authority
- Inbound Link Weighting: Consider popularity in search relevance
- Centrality Metrics: Use page network position for ranking
- Custom Sorting: Enable sorting by page importance
- Configurable Relevance: Adjust factors through TypoScript
The extension ships with a ready-made TypoScript configuration. Include it in your site's TypoScript setup:
@import "EXT:page_link_insights/Configuration/TypoScript/setup.typoscript"
Alternatively, uncomment the import in ext_localconf.php.
The configuration includes:
- DataProcessor for page metrics
- Field definitions
- Relevance formula
- Sorting options
The Solr integration requires the main features of the extension to be working:
- Make sure the Scheduler task has been run at least once
- Verify metrics are stored in the tx_pagelinkinsights_pageanalysis table
- Reindex your pages in Solr
To verify the metrics are properly indexed:
- Access the Solr admin interface
- Search with *:* to see all documents
- Verify the presence of fields:
  - pagerank_floatS: Authority score of the page
  - inbound_links_intS: Number of incoming content links
  - centrality_floatS: Network centrality score
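The manual check above can also be scripted. The following is a minimal sketch, assuming a Solr core that is reachable over HTTP and exposes the standard `/select` handler with JSON output. The helper names `build_select_url`, `missing_metric_fields`, and `check_core` are illustrative and not part of the extension, and the core URL in the usage note is a placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Field names as listed in this documentation.
METRIC_FIELDS = ("pagerank_floatS", "inbound_links_intS", "centrality_floatS")


def build_select_url(core_url, rows=10):
    """Build a *:* query against Solr's standard /select handler."""
    params = {
        "q": "*:*",
        "rows": rows,
        "wt": "json",
        "fl": ",".join(("id",) + METRIC_FIELDS),
    }
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def missing_metric_fields(response):
    """Map each document id to the metric fields it lacks."""
    report = {}
    for doc in response["response"]["docs"]:
        missing = [name for name in METRIC_FIELDS if name not in doc]
        if missing:
            report[doc.get("id", "<no id>")] = missing
    return report


def check_core(core_url):
    """Fetch a sample of documents and report missing metric fields."""
    with urlopen(build_select_url(core_url)) as handle:
        return missing_metric_fields(json.load(handle))
```

Calling `check_core("http://localhost:8983/solr/core_en")` (hypothetical URL) returns an empty dictionary when every sampled document carries all three fields. Since the metrics are calculated only for pages, documents of other record types can be expected to appear in the report.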
The default configuration uses the following influence factors:
- PageRank (multiplier: 2.0)
- Inbound links (multiplier: 1.5)
- Base Solr score (multiplier: 1.0)
The final formula combines these elements:
final_score = (base_score * 1.0) + (pagerank * 2.0) + (inbound_links * 1.5)
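To get a feel for how the three terms contribute, the formula can be reproduced outside Solr. This is a small sketch with made-up input values; the function name `final_score` and its parameter names are chosen for illustration only.

```python
def final_score(base_score, pagerank, inbound_links,
                w_base=1.0, w_pagerank=2.0, w_inbound=1.5):
    """Weighted sum from the formula above, using the default multipliers."""
    return (base_score * w_base
            + pagerank * w_pagerank
            + inbound_links * w_inbound)


# Hypothetical page: base score 4.2, PageRank 0.35, 12 inbound links.
print(final_score(4.2, 0.35, 12))
```

For these sample values the result is roughly 22.9, of which 18 comes from the inbound link term alone. Raw link counts can therefore dominate the sum unless they are scaled down or given a smaller multiplier.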
The configuration adds a new sorting option "Page Rank" to your Solr frontend, allowing users to sort by page importance rather than just relevance.
You can customize the influence of different metrics by changing the multipliers:
plugin.tx_solr.search.relevance.multiplier {
    # Increase PageRank influence
    pagerank = 3.0
    # Decrease link count influence
    inboundLinks = 1.0
}
For more complex scoring, you can modify the formula directly:
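# A TypoScript "=" assignment ends at the line break, so keep the formula on one line.
# Add additional factors as further arguments to sum().
plugin.tx_solr.search.relevance.formula = sum(mul(queryNorm(dismax(v:1)), 1.0), mul(fieldValue(pagerank_floatS), 2.0), mul(div(fieldValue(inbound_links_intS), 100), 1.5))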
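Note that this formula divides the inbound link count by 100 before applying the 1.5 multiplier. The sketch below mirrors that arithmetic in Python so the effect of the divisor can be previewed before changing the configuration. It is an approximation for experimentation only: `custom_score` and `link_scale` are illustrative names, and the base score is passed in directly instead of being derived from a real query.

```python
def custom_score(base_score, pagerank, inbound_links, link_scale=100):
    """Return the total score and its per-term breakdown.

    Mirrors the formula above: inbound links are divided by
    link_scale before the 1.5 multiplier is applied.
    """
    terms = {
        "base": base_score * 1.0,
        "pagerank": pagerank * 2.0,
        "inbound_links": (inbound_links / link_scale) * 1.5,
    }
    return sum(terms.values()), terms


# Hypothetical page: base score 4.2, PageRank 0.35, 12 inbound links.
total, terms = custom_score(4.2, 0.35, 12)
print(total, terms)
```

With these sample values the inbound link term shrinks to about 0.18, so the text relevance and PageRank terms decide the ranking.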
If metrics aren't showing up in your Solr index:
- Run the Page Link Insights scheduler task
- Check that data exists in the tx_pagelinkinsights_pageanalysis table
- Reindex the pages in Solr
- Clear all TYPO3 caches
If search results aren't ranked as expected:
- Verify the actual PageRank values in the Page Link Insights module
- Check multiplier settings in TypoScript
- Examine Solr's debug output (for example, the score explanation returned with debugQuery=true)
- Adjust multipliers to achieve desired balance
For large sites, consider implementing incremental updates:
- Configure the scheduler task to run more frequently on a subset of pages
- Mark only affected pages for reindexing in Solr
- Use dedicated indexing queues for metrics updates
To find optimal relevance settings:
- Create multiple search configurations with different weights
- Use Solr search collections to compare results
- Analyze user behavior to determine best settings
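Before running such comparisons against a live index, the effect of different weight sets can be previewed offline. The following sketch ranks a handful of invented pages under two hypothetical configurations; the page data, the configuration names, and the `rank` helper are all assumptions made for illustration.

```python
def rank(pages, weights):
    """Return page ids ordered by weighted score, best first."""
    def score(page):
        return (page["base"] * weights["base"]
                + page["pagerank"] * weights["pagerank"]
                + page["inbound_links"] * weights["inbound_links"])
    return [page["id"] for page in sorted(pages, key=score, reverse=True)]


# Invented sample data: base is the text relevance score for one query.
pages = [
    {"id": "home", "base": 1.0, "pagerank": 0.9, "inbound_links": 40},
    {"id": "faq", "base": 3.0, "pagerank": 0.2, "inbound_links": 5},
    {"id": "news", "base": 2.5, "pagerank": 0.4, "inbound_links": 1},
]

# Two hypothetical weight sets to compare.
configs = {
    "default": {"base": 1.0, "pagerank": 2.0, "inbound_links": 1.5},
    "text_heavy": {"base": 5.0, "pagerank": 1.0, "inbound_links": 0.1},
}

for name, weights in configs.items():
    print(name, rank(pages, weights))
```

With the default weights the heavily linked home page ranks first even though its text match is the weakest; the text-heavy weights put the best text match on top. Comparing orderings like this for a set of representative queries helps narrow down candidate settings before testing them with real users.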
Q: Will this slow down my Solr searches?
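A: The impact should be small. The metrics are calculated by the scheduler task and written to the index during indexing; at search time Solr only reads the indexed values to evaluate the relevance formula.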
Q: How often should I update PageRank metrics?
A: For most sites, weekly is sufficient. Sites with frequent content updates may benefit from daily runs.
Q: Can I use this with non-page records?
A: Currently, the metrics are calculated only for pages, but the concept could be extended to other record types.