Export all Sitecore live pages to Excel

When it comes to exporting data from Sitecore, I often get asked about the easiest and most efficient way to do it. Fortunately, Sitecore PowerShell Extensions (SPE) makes this straightforward. With it, you can quickly write or adjust a script to export data from your Sitecore instance in formats such as Excel and CSV, without waiting for a code deployment.

In fact, I previously wrote a blog post on exporting data from Sitecore into different formats, covering everything from setting up your script to executing the export smoothly. But recently, I received a new request that added a bit of a twist: I needed to export all the live pages from our Sitecore-powered website.

Code first, talk later

# ========================
# Configuration Variables
# ========================
$rootPath = "/sitecore/content/Demo/home"
$databaseName = "master"
$excludedItemName = "Page Components"
# Placeholder value: replace with the ID of the final state in your own workflow
$workflowFinalStateId = "{SQ6541R7-5ABM-3X45-23V1-Z3E2262EA6F3}"
$searchableFieldName = "searchable"
$disableIndexingFieldName = "Disable Indexing"
$titleFieldName = "Title"
$workflowStateFieldName = "__Workflow state"

# ========================
# Initialization
# ========================
# Resolve the root item from the configured database, e.g. master:/sitecore/content/Demo/home
$rootItem = Get-Item -Path "$($databaseName):$rootPath"

if (-not $rootItem) {
    Write-Error "Root item not found at path $rootPath in database $databaseName"
    return
}

$results = @()

# ========================
# Processing Descendants
# ========================
$items = $rootItem.Axes.GetDescendants() | Where-Object { $_.TemplateID -ne $null }

foreach ($item in $items) {

    # Skip excluded item by name
    if ($item.Name -eq $excludedItemName) {
        continue
    }

    $hasPresentation = $false

    # Check if item has a presentation layout
    try {
        if ($item.Visualization -and $item.Visualization.Layout -ne $null) {
            $hasPresentation = $true
        }
    } catch {
        # Some items might not support Visualization; ignore errors
    }

    if ($hasPresentation) {

        $workflowField = $item[$workflowStateFieldName]
        $isFinal = ($workflowField -eq $workflowFinalStateId)

        if ($isFinal) {
            # Determine search visibility
            $showInSearch = if ($item[$searchableFieldName] -eq "1") { "Yes" } else { "No" }

            # Determine sitemap inclusion
            $includeInSitemap = if ($item[$disableIndexingFieldName] -eq "1") { "No" } else { "Yes" }

            # Fetch title and URL
            $title = $item[$titleFieldName]
            $url = [Sitecore.Links.LinkManager]::GetItemUrl($item)

            # Add to results
            $results += [PSCustomObject]@{
                ID                = $item.ID.Guid
                Name              = $item.Name
                Title             = $title
                ShowInSearch      = $showInSearch
                IncludeInSitemap  = $includeInSitemap
                Link              = $url
            }
        }
    }
}

# ========================
# Output Results
# ========================
$results | Select-Object ID, Name, Title, ShowInSearch, IncludeInSitemap, Link | Show-ListView

Link to file on GitHub: https://github.com/zaheer-tariq/Sitecore-Blog-Gists/blob/main/PowerShell/Reports/GetAllLivePages.ps1

Sitemap – My Initial Thought

My first instinct was to leverage the Sitemap since it’s a natural starting point for getting all live pages. However, I quickly realized that my requirements were slightly different. I didn’t just need a list of live pages; I also needed specific attributes tied to those pages.

For example, while a page may be live and part of the Sitemap, it might not be indexed for search. This happens because we set separate flags on page items to control their visibility in search and in the Sitemap. Some pages are in the Sitemap but hidden from search, and vice versa. I wanted my export to capture this level of detail, allowing us to identify discrepancies or opportunities to adjust our search indexing strategy.
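As a rough illustration of the kind of discrepancy check the export makes possible, the sketch below filters for pages whose two flags disagree. It assumes a $results collection shaped like the one the script builds, with ShowInSearch and IncludeInSitemap set to "Yes" or "No". The sample rows and the $mismatches variable are made up for illustration, so the snippet runs in any PowerShell session.

```powershell
# Hypothetical sample rows standing in for the $results collection built by the script.
$results = @(
    [PSCustomObject]@{ Name = "about-us";  ShowInSearch = "Yes"; IncludeInSitemap = "Yes" }
    [PSCustomObject]@{ Name = "thank-you"; ShowInSearch = "No";  IncludeInSitemap = "Yes" }
    [PSCustomObject]@{ Name = "campaign";  ShowInSearch = "Yes"; IncludeInSitemap = "No" }
)

# Keep only the pages where the search flag and the sitemap flag disagree.
$mismatches = @($results | Where-Object { $_.ShowInSearch -ne $_.IncludeInSitemap })

# Print the mismatched pages as a simple table.
$mismatches | Format-Table Name, ShowInSearch, IncludeInSitemap
```

Inside Sitecore you could pipe $mismatches to Show-ListView instead of Format-Table, the same way the script does with its full result set.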

Solution – A Custom PowerShell Script

To achieve this, I wrote a custom script that extracts the necessary data, including:

  • All live pages from Sitecore, limited to pages that are in the final workflow state and have presentation details
  • We do not have a web database because this is a headless solution, so the script reads from the master database
  • Search visibility status
  • Any custom flags that determine if a page is hidden from search or the Sitemap

The script is designed with flexibility in mind. I’ve used variables strategically so you can easily tweak it according to your own environment. Whether you need to adjust the output format or filter specific attributes, this script can be adapted to fit your needs.
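As one example of adjusting the output format, the results could be written straight to a CSV file, which Excel opens directly, instead of being shown in a list view. This is only a minimal sketch: the sample row, the $csvPath variable, and the file name are assumptions for illustration, and when the script runs inside Sitecore the file is written on the server, not on your own machine.

```powershell
# Hypothetical sample row standing in for the $results collection built by the script.
$results = @(
    [PSCustomObject]@{
        ID               = [guid]::NewGuid()
        Name             = "about-us"
        ShowInSearch     = "Yes"
        IncludeInSitemap = "Yes"
        Link             = "/about-us"
    }
)

# Assumed output location; change it to a folder the running account can write to.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) "live-pages.csv"

# -NoTypeInformation omits the "#TYPE" header line that Windows PowerShell 5.1 adds by default.
$results | Export-Csv -Path $csvPath -NoTypeInformation -Encoding UTF8

Write-Host "Exported $($results.Count) row(s) to $csvPath"
```

Swapping the last line of the script for an Export-Csv call like this keeps everything else unchanged, since the filtering happens before the output step.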

Final Summary

Exporting data from Sitecore using PowerShell is easy and powerful. Whether you need to create reports, move data, or connect with other systems, Sitecore PowerShell Extensions help you do this efficiently. With PowerShell’s scripting, administrators and developers can simplify their tasks and work more effectively in managing Sitecore environments.

Keywords: Sitecore PowerShell, Export data from Sitecore, Sitecore PowerShell Extensions, Sitecore data export script, Sitecore automation, Sitecore CSV export, PowerShell scripting in Sitecore, Sitecore item export, Sitecore content management, Sitecore administration tools
