@AndyButland AndyButland commented Mar 7, 2025

This PR tackles an issue raised about having different URL segments for content for when the content is rendered as a page, versus when it's used as an ancestor for a page in the route.

Here's the specific example, where it's argued that doing so can add some SEO benefits.

Given the following content structure, these routes should be used:

Software                              /software
 Business Software                    /software/business-software
   HR Software                        /software/business/hr-software
     Payroll Software                 /software/business/hr/payroll-software
     Recruitment Software             /software/business/hr/recruitment-software 
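
The mapping above follows one rule: a page keeps its full segment when it is the page being rendered, and drops the "-software" suffix when it appears as an ancestor in the route. Here is a minimal, self-contained sketch of that rule. `SoftwareRouteSketch` and `Slugify` are hypothetical names used only for illustration; `Slugify` is a simplified stand-in for Umbraco's URL segment generation, not the real implementation.

```csharp
using System;
using System.Collections.Generic;

public static class SoftwareRouteSketch
{
    private const string Suffix = "-software";

    // Simplified stand-in for URL segment generation: lower-case, spaces to hyphens.
    public static string Slugify(string name)
        => name.Trim().ToLowerInvariant().Replace(' ', '-');

    // pathFromRoot holds the content names from the root down to the rendered page.
    public static string BuildRoute(IReadOnlyList<string> pathFromRoot)
    {
        if (pathFromRoot.Count == 0)
        {
            throw new ArgumentException("At least one content name is required.", nameof(pathFromRoot));
        }

        var segments = new List<string>();
        for (var i = 0; i < pathFromRoot.Count; i++)
        {
            string full = Slugify(pathFromRoot[i]);
            bool isRenderedPage = i == pathFromRoot.Count - 1;

            // Ancestors lose the suffix; the rendered page keeps its full segment.
            segments.Add(isRenderedPage ? full : StripSuffix(full));
        }

        return "/" + string.Join("/", segments);
    }

    private static string StripSuffix(string segment)
        => segment.EndsWith(Suffix, StringComparison.Ordinal)
            ? segment.Substring(0, segment.Length - Suffix.Length)
            : segment;
}
```

For example, `BuildRoute(new[] { "Software", "Business Software", "HR Software", "Payroll Software" })` yields `/software/business/hr/payroll-software`. The root "Software" is unaffected because its segment does not end in "-software".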

To support this we can use an IUrlProvider and an IContentFinder, as documented here and here. The issue is that our IContentFinder needs to look up the document from the route, and will use the IDocumentService for this. But this only supports one segment per document, culture and draft/published flag combination.

URL segments are provided by registered IUrlSegmentProvider classes, which, although you can register more than one, terminate when they find and return a segment. So you can never find more than one segment for a given document, culture and draft/published flag combination.

This PR makes the following changes:

  • Persistence and migration updates to update the database schema to allow saving of more than one segment per document, culture and draft/published flag combination (expanding the unique index on the umbracoDocumentUrl table to include the segment column).
  • Adding a new field isPrimary to the table, which allows us to record which segment was first resolved from the IUrlSegmentProvider collection.
    • In doing this we can maintain any existing functionality that retrieves a single segment for a document, culture and draft/published flag combination, as this will still return the first one found even if we have found more.
  • Adds a property called AllowAdditionalSegments to IUrlSegmentProvider, which defaults to false. If set to true, the provider will no longer terminate and thus allow other providers to return segments. This allows for more than one segment per document, culture and draft/published flag combination to be resolved and persisted.
  • Updates the methods in the implementation of IDocumentService to use all segments when looking up a document by the provided route.
  • Integration tests to verify behaviour.
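
To illustrate the non-terminating behaviour described in the list above, here is a rough, self-contained sketch of how a provider chain could collect several segments and flag the first one as primary. It does not reproduce the code in this PR: `ISegmentProviderSketch`, `ResolvedSegment` and `SegmentResolverSketch` are hypothetical stand-ins, the providers take a plain content name rather than an IContentBase, and skipping duplicate segments is my own assumption.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for IUrlSegmentProvider.
public interface ISegmentProviderSketch
{
    // Defaults to false, so a provider terminates the chain unless it opts in.
    bool AllowAdditionalSegments => false;

    string? GetUrlSegment(string contentName);
}

public sealed record ResolvedSegment(string Segment, bool IsPrimary);

public sealed class DefaultProviderSketch : ISegmentProviderSketch
{
    public string? GetUrlSegment(string contentName)
        => SegmentResolverSketch.Slugify(contentName);
}

public sealed class SuffixStrippingProviderSketch : ISegmentProviderSketch
{
    private const string Suffix = " Software";

    public bool AllowAdditionalSegments => true;

    public string? GetUrlSegment(string contentName)
        => contentName.EndsWith(Suffix, StringComparison.Ordinal)
            ? SegmentResolverSketch.Slugify(contentName.Substring(0, contentName.Length - Suffix.Length))
            : null;
}

public static class SegmentResolverSketch
{
    public static string Slugify(string name)
        => name.Trim().ToLowerInvariant().Replace(' ', '-');

    public static IReadOnlyList<ResolvedSegment> Resolve(
        IEnumerable<ISegmentProviderSketch> providers, string contentName)
    {
        var result = new List<ResolvedSegment>();
        foreach (ISegmentProviderSketch provider in providers)
        {
            string? segment = provider.GetUrlSegment(contentName);
            if (string.IsNullOrEmpty(segment))
            {
                continue; // This provider does not apply; try the next one.
            }

            // Assumption: duplicates are skipped. The first segment found is primary.
            if (result.All(x => x.Segment != segment))
            {
                result.Add(new ResolvedSegment(segment, IsPrimary: result.Count == 0));
            }

            if (provider.AllowAdditionalSegments is false)
            {
                break; // Terminating provider: stop, as before this PR.
            }
        }

        return result;
    }
}
```

With `SuffixStrippingProviderSketch` registered ahead of `DefaultProviderSketch`, resolving "HR Software" gives "hr" (primary) followed by "hr-software". With only the default provider registered, a single primary segment is returned, which matches the existing behaviour.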

Code Sample

Putting it all together we can meet the use case with the following custom IUrlProvider, IContentFinder and IUrlSegmentProvider:

Composer and components to meet the described use case
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.Configuration.Models;
using Umbraco.Cms.Core.DependencyInjection;
using Umbraco.Cms.Core.Models;
using Umbraco.Cms.Core.Models.PublishedContent;
using Umbraco.Cms.Core.PublishedCache;
using Umbraco.Cms.Core.Routing;
using Umbraco.Cms.Core.Services;
using Umbraco.Cms.Core.Services.Navigation;
using Umbraco.Cms.Core.Strings;
using Umbraco.Cms.Core.Web;
using Umbraco.Extensions;

namespace Umbraco.Cms.Web.UI.Composers;

public class SoftwarePageUrlsComposer : IComposer
{
    public void Compose(IUmbracoBuilder builder)
    {
        builder.UrlSegmentProviders().Insert<SoftwarePageUrlSegmentProvider>();
        builder.ContentFinders().InsertBefore<ContentFinderByUrlNew, SoftwarePageContentFinder>();
        builder.UrlProviders().Insert<SoftwarePageUrlProvider>();
    }
}

public class SoftwarePageUrlProvider : NewDefaultUrlProvider
{
    private readonly IPublishedContentCache _publishedContentCache;
    private readonly IDocumentUrlService _documentUrlService;
    private readonly IDocumentNavigationQueryService _navigationQueryService;
    private readonly IShortStringHelper _shortStringHelper;

    public SoftwarePageUrlProvider(
        IOptionsMonitor<RequestHandlerSettings> requestSettings,
        ILogger<DefaultUrlProvider> logger,
        ISiteDomainMapper siteDomainMapper,
        IUmbracoContextAccessor umbracoContextAccessor,
        UriUtility uriUtility,
        ILocalizationService localizationService,
        IPublishedContentCache publishedContentCache,
        IDomainCache domainCache,
        IIdKeyMap idKeyMap,
        IDocumentUrlService documentUrlService,
        IDocumentNavigationQueryService navigationQueryService,
        IPublishedContentStatusFilteringService publishedContentStatusFilteringService,
        IShortStringHelper shortStringHelper)
        : base(
            requestSettings,
            logger,
            siteDomainMapper,
            umbracoContextAccessor,
            uriUtility,
            localizationService,
            publishedContentCache,
            domainCache,
            idKeyMap,
            documentUrlService,
            navigationQueryService,
            publishedContentStatusFilteringService)
    {
        _publishedContentCache = publishedContentCache;
        _documentUrlService = documentUrlService;
        _navigationQueryService = navigationQueryService;
        _shortStringHelper = shortStringHelper;
    }

    public override UrlInfo? GetUrl(IPublishedContent content, UrlMode mode, string? culture, Uri current)
    {
        // Only apply this provider for software pages.
        if (SoftwarePageUrlHelper.IsSoftwarePage(content.ContentType.Alias, content.Name) is false)
        {
            return base.GetUrl(content, mode, culture, current);
        }

        UrlInfo? defaultUrlInfo = base.GetUrl(content, mode, culture, current);
        if (defaultUrlInfo is null)
        {
            return null;
        }

        if (_navigationQueryService.TryGetAncestorsKeys(content.Key, out IEnumerable<Guid>? ancestorsKeys) is false)
        {
            return null;
        }

        IEnumerable<string> ancestorContentSegments = ancestorsKeys
            .Reverse()
            .Skip(1)
            .Select(x => _publishedContentCache.GetById(false, x))
            .Where(x => x is not null)
            .Select(x => CreateSoftwarePageUrlSegment(x!));

        var url = $"/{string.Join('/', ancestorContentSegments)}/{content.Name.ToUrlSegment(_shortStringHelper)}";

        return new UrlInfo(url, true, defaultUrlInfo.Culture);
    }

    private string CreateSoftwarePageUrlSegment(IPublishedContent content)
        => content.Name.ToUrlSegment(_shortStringHelper).Replace("-software", string.Empty);
}

public class SoftwarePageContentFinder : IContentFinder
{
    private readonly IDocumentUrlService _documentUrlService;
    private readonly IUmbracoContextAccessor _umbracoContextAccessor;

    public SoftwarePageContentFinder(IDocumentUrlService documentUrlService, IUmbracoContextAccessor umbracoContextAccessor)
    {
        _documentUrlService = documentUrlService;
        _umbracoContextAccessor = umbracoContextAccessor;
    }

    public Task<bool> TryFindContent(IPublishedRequestBuilder request)
    {
        if (_umbracoContextAccessor.TryGetUmbracoContext(out IUmbracoContext? umbracoContext) is false)
        {
            return Task.FromResult(false);
        }

        var path = request.Uri.GetAbsolutePathDecoded();

        // Only looking for software pages, found under the "/software" path.
        if (path.StartsWith("/software") is false)
        {
            return Task.FromResult(false);
        }

        // Make sure content not found when browsing via a default URL e.g. /software/business-software/hr-software.
        // This is valid according to segments, as both "business-software" and "business" are registered as segments.
        // But only /software/business/hr-software should be navigable.
        var pathParts = path.Split('/');
        if (pathParts.Take(pathParts.Length - 1).Any(x => x.EndsWith("-software")))
        {
            // Need to explicitly set 404, otherwise the default content finder by route will find it.
            IPublishedContent? fileNotFoundContent = GetFileNotFoundContent(umbracoContext);
            request.SetPublishedContent(fileNotFoundContent);
            request.SetIs404();
            return Task.FromResult(true);
        }

        // Look up the document key by route and set the content if found.
        // This look-up will be using the registered segments.
        Guid? documentKey = _documentUrlService.GetDocumentKeyByRoute(path, null, null, false);
        if (documentKey.HasValue is false)
        {
            return Task.FromResult(false);
        }

        IPublishedContent? content = umbracoContext.Content.GetById(documentKey.Value);
        if (content is null)
        {
            return Task.FromResult(false);
        }

        request.SetPublishedContent(content);
        return Task.FromResult(true);
    }

    private static IPublishedContent? GetFileNotFoundContent(IUmbracoContext umbracoContext)
        => umbracoContext.Content.GetById(Guid.Parse("6e806c1f-9b61-437f-91d3-35650f38f560"));
}

public class SoftwarePageUrlSegmentProvider : IUrlSegmentProvider
{
    private readonly IUrlSegmentProvider _provider;

    public SoftwarePageUrlSegmentProvider(IShortStringHelper stringHelper) => _provider = new DefaultUrlSegmentProvider(stringHelper);

    public bool AllowAdditionalSegments => true; // Ensure that the default URL segment provider is still called.

    public string? GetUrlSegment(IContentBase content, string? culture = null)
    {
        // Only apply this provider for software pages.
        if (SoftwarePageUrlHelper.IsSoftwarePage(content.ContentType.Alias, content.Name) is false)
        {
            return null;
        }

        // Remove "-software" from the default URL segment and register that as a URL segment too.
        // As this IUrlSegmentProvider is not terminating, the default URL segment provider will still be called and include
        // the segment with the "-software" suffix too.
        var segment = _provider.GetUrlSegment(content, culture);
        return segment?.Replace("-software", string.Empty);
    }
}

public static class SoftwarePageUrlHelper
{
    public static bool IsSoftwarePage(string contentTypeAlias, string? contentName) =>
        contentTypeAlias == "textPage" &&
            !string.IsNullOrEmpty(contentName) &&
            contentName.EndsWith(" Software");
}

To Review:

  • I've taken the opportunity for a bit of clean-up in the classes amended here, in particular adding XML header comments and re-ordering methods so they read more easily within the class. Unfortunately that makes the change to DocumentUrlService in particular a little harder to spot via GitHub, so you may need to pull this down locally, or review commits 2-4 independently.

To Test:

  • Using the above code and integration tests as reference, verify that multiple segments can be associated with a document and that they can be used by a content finder to resolve content.
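
As a rough mental model for what to verify, here is a self-contained sketch of a route look-up in which each document may carry more than one segment. This is not the DocumentUrlService implementation: `NodeSketch` and `RouteLookupSketch` are hypothetical, and the in-memory tree stands in for the persisted segments.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory document with one or more registered segments.
public sealed class NodeSketch
{
    public NodeSketch(string name, params string[] segments)
    {
        Name = name;
        Segments = segments;
    }

    public string Name { get; }

    public IReadOnlyList<string> Segments { get; }

    public List<NodeSketch> Children { get; } = new();
}

public static class RouteLookupSketch
{
    // Builds Software > Business Software > HR Software, each with both segments.
    public static NodeSketch BuildExampleTree()
    {
        var software = new NodeSketch("Software", "software");
        var business = new NodeSketch("Business Software", "business", "business-software");
        var hr = new NodeSketch("HR Software", "hr", "hr-software");
        software.Children.Add(business);
        business.Children.Add(hr);
        return software;
    }

    // Walks the route one part at a time, matching ANY registered segment.
    public static NodeSketch? FindByRoute(NodeSketch root, string route)
    {
        string[] parts = route.Split('/', StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length == 0 || root.Segments.Contains(parts[0]) is false)
        {
            return null;
        }

        NodeSketch current = root;
        foreach (string part in parts.Skip(1))
        {
            NodeSketch? next = current.Children.FirstOrDefault(c => c.Segments.Contains(part));
            if (next is null)
            {
                return null;
            }

            current = next;
        }

        return current;
    }
}
```

In this model both `/software/business/hr-software` and `/software/business-software/hr-software` resolve to "HR Software", which is why the SoftwarePageContentFinder above explicitly rejects routes where a non-final part ends in "-software".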

@AndyButland AndyButland changed the title WIP: Multiple URL segments per document Multiple URL segments per document Mar 10, 2025
@AndyButland AndyButland marked this pull request as ready for review March 10, 2025 06:38
@AndyButland AndyButland changed the title Multiple URL segments per document Allow multiple URL segments per document Mar 10, 2025
Contributor

@nikolajlauridsen nikolajlauridsen left a comment

Unfortunately, there's a problem with your migration when running on SQLite. You cannot add the isPrimary column since bools are saved as INTEGER NOT NULL and there's no default value. One way to solve this would be to create a new table with the new column, migrate over the data, and replace the existing table with the new one. AddGuidsToUserGroups has an example of this; there's also more of an explanation in the comments there.

Also, a smaller thing, but we want to avoid raw SQL, so we should do something like this instead:

Execute.Sql(Sql().Update<DocumentUrlDto>(u => u.Set(x => x.IsPrimary, 1))).Do();

@nikolajlauridsen
Contributor

Other than that, I think it looks good, at least as far as I can tell. I only did a minor tidy-up.

@AndyButland
Contributor Author

Unfortunately, there's a problem with your migration when running on SQLite. You cannot add the isPrimary column since bools are saved as INTEGER NOT NULL and there's no default value.

Ah, good catch - I've been working with SQL Server locally and forgot about this SQLite restriction. I'll have a look.

Just quickly though - you say there's no default value. Actually ideally I would have liked one, to just set the default to "1" and then I wouldn't need the update statement. But I couldn't find a migration where we had default values on columns. Do you know of one or know if we can do this please?

@nikolajlauridsen
Contributor

That would also be a solution, and I think it would be fine since it should still align with existing behaviour. I remember doing it once, but I can't for the life of me remember how I did it right now 😅

@nikolajlauridsen
Contributor

nikolajlauridsen commented Mar 11, 2025

Oh right I remember now, you can add [Constraint(Default = true)] to the DocumentUrlDto.IsPrimary property.
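
For illustration, here is a self-contained sketch of where that attribute would sit and why it helps: SQLite only accepts adding a NOT NULL column to an existing table when the column has a non-null default. The `ConstraintAttribute` below is a local stand-in for Umbraco's attribute of the same name, and `DocumentUrlDtoSketch` and `ColumnDdlSketch` are hypothetical; the DDL string is only a rough approximation, not what the migration actually generates.

```csharp
#nullable enable
using System;
using System.Reflection;

// Local stand-in for Umbraco's ConstraintAttribute (assumption: only Default is modelled).
[AttributeUsage(AttributeTargets.Property)]
public sealed class ConstraintAttribute : Attribute
{
    public object? Default { get; set; }
}

// Hypothetical, cut-down DTO showing only the new column.
public class DocumentUrlDtoSketch
{
    [Constraint(Default = true)]
    public bool IsPrimary { get; set; }
}

public static class ColumnDdlSketch
{
    // Produces an approximate column definition for a bool property.
    public static string DescribeBoolColumn(Type dtoType, string propertyName, string columnName)
    {
        PropertyInfo property = dtoType.GetProperty(propertyName)
            ?? throw new ArgumentException($"Property '{propertyName}' not found.", nameof(propertyName));

        object? defaultValue = property.GetCustomAttribute<ConstraintAttribute>()?.Default;
        string ddl = $"{columnName} INTEGER NOT NULL";

        // With a default, existing rows get a value and the column can be added in place.
        return defaultValue is bool value ? $"{ddl} DEFAULT {(value ? 1 : 0)}" : ddl;
    }
}
```

Calling `ColumnDdlSketch.DescribeBoolColumn(typeof(DocumentUrlDtoSketch), "IsPrimary", "isPrimary")` returns `isPrimary INTEGER NOT NULL DEFAULT 1`. With the default in place, existing rows are treated as primary, which should make a separate update statement unnecessary.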

The migration runs now. However, I run into a new issue: after the migration has run, I get a System.InvalidOperationException: The service needs to be initialized before it can be used, from the DocumentUrlService.

@AndyButland
Contributor Author

OK - thanks for that. You can leave it with me now and I'll see if I can sort out what's going on there.

@AndyButland AndyButland added area/backend status/needs-docs Requires new or updated documentation labels Mar 12, 2025
@AndyButland
Contributor Author

AndyButland commented Mar 12, 2025

I've updated to resolve the migration on SQLite. However I can't seem to trigger the issue you've mentioned. I guess it's occurring due to the URLs being rebuilt before the migration occurs - as the new field added won't be there yet. But I don't see it in test - either with attended or unattended migrations.

And I can see we have this check in DocumentUrlServiceInitializerNotificationHandler to not attempt to rebuild the URLs when in an upgrade state:

        if (_runtimeState.Level == RuntimeLevel.Upgrade)
        {
            //Special case on the first upgrade, as the database is not ready yet.
            return;
        }

Contributor

@nikolajlauridsen nikolajlauridsen left a comment

Found an unused logger I've removed, but other than that it looks good to me now.

On the error, I can't reproduce it anymore either. I think I might have put my system in a weird state when playing around with the migrations, so in my opinion it's good to merge now 👍

@AndyButland
Contributor Author

Great, thanks for the review.
