Document

A document is an individual piece of content that can be indexed in and retrieved from your search engine. Documents are a representation of your content stored in Silverstripe Search. Only content added to documents is available to search.

A document is made up of fields, each with a key and a value. The key is the name of the field and is the same for all documents within the engine. A field’s value holds the content specific to that document. For example, your documents might have a key called title, which you use to store the title of each page, such as “Home Page” or “About Page”. An example document could look like:

Field key        Field value
title            Home Page
content          Welcome to the home page
url              /
date_published   2024-07-17 10:00:00

Giving your documents a structure like this, combined with a schema, allows you to use the full power of Silverstripe Search. In the above example, having a date_published field allows you to search for all documents published after a certain date.
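
To make the idea of structured fields concrete, here is a minimal sketch in Python that represents documents as plain dictionaries and filters them by date_published locally. It does not use the Silverstripe Search API. The second document, the DATE_FORMAT constant, and the published_after helper are hypothetical and exist only for illustration.

```python
from datetime import datetime

# Documents as plain dictionaries. The first mirrors the example above;
# the second is a made-up document added so the filter has something to exclude.
documents = [
    {
        "title": "Home Page",
        "content": "Welcome to the home page",
        "url": "/",
        "date_published": "2024-07-17 10:00:00",
    },
    {
        "title": "About Page",
        "content": "About us",
        "url": "/about",
        "date_published": "2023-01-05 09:30:00",
    },
]

# Assumed date layout, matching the example value "2024-07-17 10:00:00".
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def published_after(docs, cutoff):
    """Return the documents whose date_published is later than cutoff."""
    return [
        doc
        for doc in docs
        if datetime.strptime(doc["date_published"], DATE_FORMAT) > cutoff
    ]


recent = published_after(documents, datetime(2024, 1, 1))
print([doc["title"] for doc in recent])
```

Because every document uses the same date_published key, a single rule can be applied across the whole set. The search service offers this kind of filtering on your indexed documents; the local filter here only illustrates why consistent keys matter.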

Documents are added to an Engine programmatically. Find out more in the Developer's guide.

Files

Silverstripe Search can search files such as PDF (.pdf) and Microsoft Office (.docx) documents. To do this, it extracts the file content into a Silverstripe Search document. With the help of a developer, you can send the file in a special binary field called _attachment. The service then extracts the content it can process and puts it in the body field. There is a 15MB limit on file size.
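
As a rough sketch of what a developer might do before sending a file, the Python below builds a document dictionary with an _attachment field and rejects files over the size limit. Several things here are assumptions rather than confirmed behaviour: that the binary content is sent base64-encoded, that "15MB" means 15 × 1024 × 1024 bytes, and that title and url are fields in your schema. The build_file_document helper is hypothetical; check the Developer's guide for the actual request format.

```python
import base64

# Assumption: the 15MB limit is interpreted as 15 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 15 * 1024 * 1024


def build_file_document(filename, data, url):
    """Build a document dict carrying file content in the _attachment field.

    Hypothetical helper: the base64 encoding and the title/url fields are
    assumptions, not confirmed details of the Silverstripe Search API.
    """
    if len(data) > MAX_FILE_BYTES:
        raise ValueError(
            f"{filename} is {len(data)} bytes, over the {MAX_FILE_BYTES}-byte limit"
        )
    return {
        "title": filename,
        "url": url,
        # Binary content is encoded as ASCII text so it can travel in JSON.
        "_attachment": base64.b64encode(data).decode("ascii"),
    }


doc = build_file_document("report.pdf", b"%PDF-1.4 example", "/assets/report.pdf")
print(sorted(doc.keys()))
```

Checking the size on your side before sending avoids uploading a file the service would not accept.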

TIP

File content extraction is only supported on Tiers with Analyst features. For more information, see Features.

Supported file types

  • .txt
  • .py
  • .rst
  • .html
  • .markdown
  • .json
  • .xml
  • .csv
  • .md
  • .ppt
  • .rtf
  • .docx
  • .odt
  • .xls
  • .xlsx
  • .rb
  • .paper
  • .sh
  • .pptx
  • .pdf
  • .doc
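
The list above can be turned into a simple pre-flight check. This is a minimal sketch in Python, assuming extensions are matched case-insensitively (not stated on this page); SUPPORTED_EXTENSIONS and is_supported are hypothetical names.

```python
from pathlib import Path

# Extensions copied from the list above.
SUPPORTED_EXTENSIONS = {
    ".txt", ".py", ".rst", ".html", ".markdown", ".json", ".xml",
    ".csv", ".md", ".ppt", ".rtf", ".docx", ".odt", ".xls", ".xlsx",
    ".rb", ".paper", ".sh", ".pptx", ".pdf", ".doc",
}


def is_supported(filename):
    """Return True if the file's final extension is in the supported list.

    Assumption: matching is case-insensitive, so "REPORT.PDF" counts as .pdf.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


for name in ["annual-report.pdf", "notes.MD", "photo.png"]:
    print(name, is_supported(name))
```

Only the final extension is considered, so a name such as archive.tar.gz is checked as .gz and rejected.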