Overview

The Glean SharePoint connector enables secure and efficient data fetching from the Microsoft 365 (M365) SharePoint platform. User permissions are strictly enforced, and all data remains securely within your Glean environment.
  • Glean requires authentication to the M365 instance to fetch relevant information.
  • Authentication is accomplished by creating and registering an App Registration for each deployment.
  • Glean understands all user access permissions and strictly enforces permissions for users at the time of the query. This ensures that users are not able to see results that they do not have access to.
  • Quicklinks are provided for creating Word, Excel, and PowerPoint documents.

Integration Features

SharePoint

Glean will capture:
  • Site Pages (web part or wiki page libraries)
  • Site Drives (document libraries)
  • Other Site Lists (Basic List and Calendar List items) [optional configuration, not enabled by default]

Objects Supported

  • Folders: Captured and indexed within SharePoint
  • Documents: Various types stored in SharePoint
  • Native File Types: Office including Word, Excel, PowerPoint, etc.
  • Content from Personal and Shared Drives: Captured from both drive types

API Usage & Permissions

Glean uses the standard Microsoft Graph API v1.0 and the SharePoint REST API to ingest data. It uses application permissions with admin-granted access.
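
As a rough illustration of what application permissions look like in practice, the sketch below builds an OAuth 2.0 client-credentials token request for Microsoft Entra ID and an authenticated Graph v1.0 request. It is not Glean's implementation: the function names and the tenant, client, and secret values are hypothetical placeholders, and nothing is sent over the network.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> tuple[str, str]:
    """Return (url, form_body) for a client-credentials token request.

    All three arguments are placeholders for values taken from the
    App Registration; they are assumptions for this sketch.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # ".default" requests the application permissions already
        # consented to by an administrator.
        "scope": "https://graph.microsoft.com/.default",
    })
    return url, body


def graph_request(path: str, access_token: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for a Graph v1.0 call using a bearer token."""
    url = f"{GRAPH_BASE}/{path.lstrip('/')}"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/json",
    }
    return url, headers


# Hypothetical values, for illustration only.
token_url, token_body = build_token_request("contoso-tenant", "app-id", "s3cret")
drives_url, drives_headers = graph_request("/sites/root/drives", "<access-token>")
```

The token body would be sent as an `application/x-www-form-urlencoded` POST to `token_url`, and the `access_token` from the response used as the bearer token on subsequent Graph calls.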

Setup Prerequisites

A tenant administrator (with global admin privileges for both the Azure/Entra ID and SharePoint admin portals) is required to set up several dedicated service applications and grant them the required privileges.

Sites.Selected

The Sites.Selected permission can be used instead of Sites.FullControl.All, but there are some significant trade-offs:
  • Each individual site to be crawled must be explicitly set in the Glean admin UI.
  • Changes to content and content permissions are only synced every 24 hours.
  • Ranking for search and generative AI results will be heavily impacted as Sites.Selected prevents the crawler from accessing activity and content metadata.

Files.ReadWrite.All

Glean subscribes to the webhook events for all files in the tenant. This allows Glean to react to changes in content and permissions in near real-time. The minimum permission available for webhooks to be both set up and re-authenticated is Files.ReadWrite.All.
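
To make the webhook mechanism more concrete, the sketch below builds the JSON payload for a Microsoft Graph change-notification subscription on a single drive. This is an illustration under stated assumptions, not Glean's actual code: the function name, the one-subscription-per-drive layout, the one-day lifetime, and the notification URL are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def build_drive_subscription(
    drive_id: str,
    notification_url: str,
    client_state: str,
    now: datetime,
    lifetime: timedelta = timedelta(days=1),  # assumed lifetime for this sketch
) -> dict[str, str]:
    """Build the body for POST /subscriptions on a drive's root folder."""
    expires = now.astimezone(timezone.utc) + lifetime
    return {
        # Drive item subscriptions report changes as "updated".
        "changeType": "updated",
        "notificationUrl": notification_url,
        "resource": f"/drives/{drive_id}/root",
        "expirationDateTime": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Opaque value echoed back in each notification so the
        # receiver can check where it came from.
        "clientState": client_state,
    }


# Hypothetical values, for illustration only.
payload = build_drive_subscription(
    drive_id="b!example-drive-id",
    notification_url="https://example.invalid/webhooks/sharepoint",
    client_state="opaque-shared-value",
    now=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```

The payload would be sent as the JSON body of `POST https://graph.microsoft.com/v1.0/subscriptions`. Subscriptions expire, so they are extended with a `PATCH` to `/subscriptions/{id}` carrying a new `expirationDateTime`.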

SharePoint REST API Permissions

To read data from the SharePoint REST API and crawl site collections, site content, and content permissions via REST, the FullControl permission needs to be granted. Microsoft does not provide granular controls or a dedicated read scope for these data endpoints in the SharePoint REST API, so the FullControl permission is required.
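
For orientation, the sketch below shows how REST requests against a site might be addressed. The two resources used, `web/lists` and `web/roleassignments`, are standard SharePoint REST endpoints chosen for illustration; which endpoints Glean actually calls is not stated here. The site URL, token, and function name are hypothetical.

```python
def sharepoint_rest_request(
    site_url: str, resource: str, access_token: str
) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for a SharePoint REST call under a site's /_api/ root."""
    url = f"{site_url.rstrip('/')}/_api/{resource.lstrip('/')}"
    headers = {
        "Authorization": f"Bearer {access_token}",
        # Ask for plain JSON without OData metadata.
        "Accept": "application/json;odata=nometadata",
    }
    return url, headers


# Hypothetical site and token, for illustration only.
site = "https://contoso.sharepoint.com/sites/engineering"

# Site content: the lists and libraries in the site.
lists_url, headers = sharepoint_rest_request(site, "web/lists", "<access-token>")

# Content permissions: who holds which role on the site.
perms_url, _ = sharepoint_rest_request(
    site,
    "web/roleassignments?$expand=Member,RoleDefinitionBindings",
    "<access-token>",
)
```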

Versions Supported

There are no specific version limitations for the SharePoint connector.

License Tier(s) Required

There are no specific license tier requirements for the SharePoint connector.