Thunderstone Search Appliance Manual

Data From Field Example - Subfetch to use PDF Contents for a Web Page

Subfetches allow you to use content from other URLs to populate the current URL's record. For example, suppose a site hosts articles, where each article has a web page describing it and a link to a PDF of the full article. We'd like searches that match the article's contents to take us to the web page, not to the PDF itself.

If the web page has a meta header called "pdfLink" with a URL to the article PDF, we can use the body of the PDF as a replacement for the web page's body with two Data from Field rules like this:

First Data from Field rule:

Second Data from Field rule:

The Subfetch Data from Field rule fetches the URL specified in the pdfLink header. This retrieves the PDF, but does not change the web page's record on its own. The second rule then pulls from the PDF's text output and uses that as the Body of the current web page, so the record keeps the web page's URL but is searchable by the article's contents.
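
The two-step flow described above can be modeled outside the appliance to make the order of operations concrete. The following Python sketch is only a conceptual illustration, not the appliance's implementation or configuration syntax: the `Document` class, the `FAKE_SITE` table, the example URLs, and the function names are all hypothetical stand-ins introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Document:
    """Hypothetical model of a fetched URL: its text body and meta headers."""
    url: str
    body: str
    meta: Dict[str, str] = field(default_factory=dict)


# Hypothetical in-memory stand-in for fetching and text-converting URLs.
FAKE_SITE = {
    "https://example.com/articles/1": Document(
        url="https://example.com/articles/1",
        body="Summary page for article 1.",
        meta={"pdfLink": "https://example.com/pdf/1.pdf"},
    ),
    "https://example.com/pdf/1.pdf": Document(
        url="https://example.com/pdf/1.pdf",
        body="Full text of article 1, extracted from the PDF.",
    ),
    "https://example.com/about": Document(
        url="https://example.com/about",
        body="About this site.",
    ),
}


def fetch(url: str) -> Optional[Document]:
    """Return the simulated document for a URL, or None if it is unknown."""
    return FAKE_SITE.get(url)


def subfetch_rule(page: Document, header: str) -> Optional[Document]:
    """Step 1: fetch the URL named in a meta header. The page is not changed."""
    link = page.meta.get(header)
    return fetch(link) if link else None


def replace_body_rule(page: Document, fetched: Optional[Document]) -> Document:
    """Step 2: use the fetched document's text as the page's Body."""
    if fetched is None:
        return page  # nothing was subfetched, so keep the page as it is
    return Document(url=page.url, body=fetched.body, meta=dict(page.meta))


def build_record(url: str) -> Document:
    """Apply both steps in order and return the record that would be indexed."""
    page = fetch(url)
    if page is None:
        raise LookupError(f"cannot fetch {url}")
    fetched = subfetch_rule(page, "pdfLink")
    return replace_body_rule(page, fetched)


record = build_record("https://example.com/articles/1")
print(record.url)
print(record.body)
```

In this model the resulting record keeps the web page's URL while its body holds the PDF text, which is why a search matching the article's contents would lead to the web page. A page without a pdfLink header passes through unchanged.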

Copyright © Thunderstone Software     Last updated: Dec 5 2019