A Basic Guide to SharePoint Indexing and Crawling

Topics: Office 365, SharePoint, Microsoft Search, Document Library

SharePoint is a fantastic tool for businesses to organize and manage their files. Thanks to the power of Microsoft Search, users can search across all of your organization’s files and find exactly what they’re looking for. However, this only works if your content is properly crawled and indexed. It’s okay if you don’t know what that means; most users don’t. By the end of this guide, you should have a solid understanding of SharePoint crawling and indexing so you can take full advantage of everything SharePoint Online has to offer your business.

What are Crawling and Indexing?

The terms “crawling” and “indexing” are often used interchangeably. Both refer to organizing and cataloguing the information within your SharePoint site, but each plays a different part in that process. Crawling is the mechanism your site uses to scan all the documents inside it. Indexing is the process of sorting and integrating that information into the search database. Once a document has been crawled and indexed, it is part of your site’s search index and eligible to appear when a user searches for related content.

Types of Crawling

SharePoint sites can perform three different kinds of crawling.

Full Crawl: A full crawl is when the crawler sifts through the content and metadata for your whole site. The time this takes varies depending on the number of files your business is storing.

Incremental Crawl: An incremental crawl is when the crawler only sifts through items created or updated since the last crawl.

Continuous Crawl: A continuous crawl is when a crawler checks the change logs on your sites regularly (every 15 minutes is the default).

Generally, your site will perform a crawl every 4-8 hours, which can vary depending on which version of SharePoint your business uses.
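
To make the difference between a full and an incremental crawl concrete, here is a minimal Python sketch. It is a toy model, not how SharePoint’s crawler is actually implemented: the `Item` class, the integer timestamps, and both function names are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    modified: int  # hypothetical timestamp, e.g. minutes since some start point


def full_crawl(items):
    """Visit every item, regardless of when it last changed."""
    return [item.name for item in items]


def incremental_crawl(items, last_crawl):
    """Visit only items created or updated after the previous crawl."""
    return [item.name for item in items if item.modified > last_crawl]


# Hypothetical document library with three files.
library = [
    Item("budget.xlsx", 10),
    Item("policy.docx", 45),
    Item("notes.txt", 70),
]

full = full_crawl(library)
changed = incremental_crawl(library, last_crawl=40)

print(full)     # all three files
print(changed)  # only the files modified after minute 40
```

The incremental version does less work because it skips anything that has not changed, which is why it is usually the faster option on a large site.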

The Search Index

As stated previously, the search index contains all the searchable content within your site. Which documents a search retrieves is determined by your SharePoint site’s search schema. The search schema is made up of crawled properties, crawled property categories, the mappings from crawled properties to managed properties, and the managed property settings.

A crawled property is a piece of content or metadata that the crawler extracts from an item, such as the author, title, or subject. To include this information in the search index, you must map the crawled properties to managed properties in your SharePoint site. Managed properties are the attributes that determine how your content shows in search results. If you do not map a crawled property to a managed property, it will not be entered in the search index.
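
The mapping step can be pictured with a short Python sketch. This is only an illustration of the idea under stated assumptions: the property names, the sample values, and the `build_index_entry` function are hypothetical and are not taken from a real search schema.

```python
# Hypothetical crawled properties extracted from a single document.
crawled = {
    "ows_Author": "J. Doe",
    "ows_Title": "Q3 Budget",
    "ows_InternalNote": "draft",
}

# Hypothetical mapping from crawled property names to managed property names.
mapping = {
    "ows_Author": "Author",
    "ows_Title": "Title",
}


def build_index_entry(crawled_props, prop_mapping):
    """Keep only crawled properties that are mapped to a managed property."""
    return {
        prop_mapping[name]: value
        for name, value in crawled_props.items()
        if name in prop_mapping
    }


entry = build_index_entry(crawled, mapping)
print(entry)
```

In this sketch the unmapped `ows_InternalNote` value never reaches the index entry, which mirrors the rule above: no mapping, no entry in the search index.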

Your organization’s search schema controls what users can search, how they search, and how the results are presented on your intranet. It’s important to make sure that your search schema is tailored to your business’s needs. Invest the time and resources to manage your search schema so that your search index gives users what they are looking for in a timely, convenient manner.

Re-Indexing

Content that has not been crawled and indexed is not searchable. If you want a new file or document to show in the search index, you either need to wait for the next crawl or manually request a re-indexing. The process for this may be different depending on which version of SharePoint your organization uses. SharePoint Online uses an automated crawl schedule that is managed by Microsoft and can’t be changed. A newly uploaded file generally becomes available in search results between 15 minutes and an hour after upload. In contrast, SharePoint On-Premises allows an organization to alter its crawling schedule and manually re-index its SharePoint site as needed.
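
If you want to check whether a file has made it into the index yet, one option is to query the SharePoint Search REST endpoint (`/_api/search/query`). The Python sketch below only builds the request URL. The site URL and file name are hypothetical placeholders, the `build_search_url` helper is made up for this example, and actually sending the request requires an authenticated session, which is not shown here.

```python
from urllib.parse import quote


def build_search_url(site_url: str, query_text: str) -> str:
    """Build a Search REST query URL with the query text in single quotes."""
    # The endpoint expects querytext='...'; percent-encode the quoted text.
    encoded = quote(f"'{query_text}'", safe="")
    return f"{site_url.rstrip('/')}/_api/search/query?querytext={encoded}"


# Hypothetical site and file name, used only for illustration.
url = build_search_url("https://contoso.sharepoint.com/sites/hr", "Q3-Budget.xlsx")
print(url)
```

If the file does not appear in the results of a query like this, it most likely has not been crawled yet.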

If your organization uses SharePoint On-Premises, there are several things to note about manually re-indexing your SharePoint site. First, when adding or updating a managed property list or library, make sure you are only changing the search schema, not the site itself. Changing the site itself can cause a major headache for your business. Second, make sure that you are choosing the right type of crawling for your purposes. Indexing your entire SharePoint site can take a long time and can cause a significant slowdown within the search system.

How to Manually Re-Index Your SharePoint Site

If you need to manually re-index your site, here are the steps you need to follow:

  • On the site, select the Settings gear in the top right corner of the screen.
  • Select Site Settings. If you don’t see Site Settings, select Site Information and then select View All Site Settings.
  • Under Search, click Search and Offline Availability.
  • Under the Reindex Site section, click Reindex Site.
  • A warning page will appear. Click Reindex Site again to confirm. The content will then be re-indexed during the next scheduled site crawl.

Dock 365’s Document Management Solution

A business’s success hinges on its users’ abilities to find the files they need in a timely manner. At Dock 365, we've seen the impact that quick content retrieval can have on an organization’s effectiveness. That’s why we designed our Document Management System to efficiently sort and manage your documents and files.

Book a free demo today to see how effective our intuitive, user-friendly system is!
