Tag: Information Architecture

Site Topology Planning and Taxonomies

In the previous article SharePoint Site Topology Planning I discussed some of the technical implications of organizing the sites within one or more applications, site collections, and sub-sites.  The article started to get pretty long, so I decided to save the taxonomy part of the discussion for a separate article.

Organizing Sites

In the previous article I addressed different types of content and using that to help segment the sites across applications and site collections.  Within a given application it is possible to provide some meaningful segmentation by configuring managed paths. 

For example, you may segment collaboration sites with the following url structure:

  • http://collaborate.company.com – Collaboration Application
    • /Communities – Communities of Practice Managed Path
      • /Proposals – Proposal Community of Practice Site Collection
      • /Procurement – Procurement Community of Practice Site Collection
    • /Projects – Projects Managed Path
      • /Alpha – Alpha Project Site Collection
      • /Omega – Omega Project Site Collection
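
To make the URL structure above concrete, here is a minimal sketch that expands a plan of managed paths and site collections into full URLs. The `PLAN` dictionary and the `site_collection_urls` helper are hypothetical names used only for illustration; nothing here calls SharePoint itself.

```python
# Hypothetical plan mirroring the example above: managed path -> site collections.
PLAN = {
    "Communities": ["Proposals", "Procurement"],
    "Projects": ["Alpha", "Omega"],
}


def site_collection_urls(app_url, plan):
    """Return the full URL for every site collection in the plan."""
    base = app_url.rstrip("/")
    return [
        f"{base}/{managed_path}/{name}"
        for managed_path, names in plan.items()
        for name in names
    ]


urls = site_collection_urls("http://collaborate.company.com", PLAN)
for url in urls:
    print(url)
```

Writing the plan down as data like this makes it easy to review the segmentation with stakeholders before any sites are created.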

Organizational Hierarchies

Most people understand hierarchies, and most businesses (at least in the west) have been organized in hierarchies for many years.  It is natural for people to think of their organization in this manner, but this may not be the best way to plan for the topology of your sites.

Traditional Intranets tend to go from the largest organizational unit down to the smallest.  There may be multiple divisions, with multiple business units, with multiple departments, with multiple teams, with people that actually do the work.  Sites or portals that go 5 or more levels deep can become very difficult to manage and even harder to use.  Modern businesses need to remain agile with teams always being redefined, combined and split up.  In  most situations it is a good idea to fight the hierarchy tendencies and strive for a flatter structure. 

From a SharePoint perspective a flatter structure with more site collections will make it easier to reorganize sites and structure versus a single site collection with 5+ sub-site layers deep.  As previously discussed Site Collections can be backed up and restored with a high level of fidelity (completeness) compared to a sub-site’s export and import options.  The key to usability and manageability is to find the right amount of segmentation and site collection structure.
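
As a rough way to spot where an existing structure has grown too deep, the sketch below counts URL path segments and flags anything nested five or more levels down. The URLs are made up for illustration, and counting path segments is only an approximation of sub-site depth (it assumes each segment is a site rather than a library or folder).

```python
from urllib.parse import urlparse


def site_depth(url):
    """Count the path segments below the application root."""
    return len([s for s in urlparse(url).path.split("/") if s])


def flag_deep_sites(urls, limit=5):
    """Return the URLs nested `limit` or more levels deep."""
    return [url for url in urls if site_depth(url) >= limit]


# Hypothetical URLs for illustration only.
urls = [
    "http://intranet/division/unit/department/team/project",
    "http://intranet/sites/alpha",
]
deep = flag_deep_sites(urls)
print(deep)
```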

Finding Sites and Content in a Flat World

An alternative to a rigid hierarchy is adopting flexible taxonomies with tagging.  Tagging provides a flexible and dynamic method of describing the content and sites that can evolve over time.  A great example of this is a site like StackOverflow compared with the rigid structure of the MSDN/TechNet Forums. The flat structure decreases the chance of duplication and provides new opportunities to view the data in new and unique ways. 

The SharePoint 2010 system fully supports tagging without the need for custom or third party add-on components.  I fully expect that this will be a popular feature within the new version.
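
The idea of finding sites by tag rather than by position in a hierarchy can be sketched in a few lines. This is a generic illustration, not the SharePoint 2010 tagging API; the site URLs, tags, and function names are all assumptions made up for the example.

```python
from collections import defaultdict


def build_tag_index(site_tags):
    """Map each tag (lower-cased) to the set of site URLs that carry it."""
    index = defaultdict(set)
    for url, tags in site_tags.items():
        for tag in tags:
            index[tag.lower()].add(url)
    return index


def find_sites(index, *tags):
    """Return the sites that carry every requested tag."""
    matches = [index.get(tag.lower(), set()) for tag in tags]
    return set.intersection(*matches) if matches else set()


# Hypothetical sites and tags for illustration only.
SITE_TAGS = {
    "http://collaborate.company.com/Projects/Alpha": {"project", "procurement"},
    "http://collaborate.company.com/Projects/Omega": {"project"},
    "http://collaborate.company.com/Communities/Procurement": {"community", "procurement"},
}

index = build_tag_index(SITE_TAGS)
print(find_sites(index, "procurement"))
print(find_sites(index, "project", "procurement"))
```

The same site can show up under several tags, which is exactly what a strict hierarchy cannot offer without duplication.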


Following the guidance between the two articles you should be able to properly plan your site topology.  Assumptions and business decisions do change, but if you establish the right level of granularity with applications and site collections you will be able to migrate and relocate things as needed.


SharePoint Site Topology Planning

This is the next in a series of articles addressing core SharePoint implementation topics.  I hope that this is valuable both to groups looking to implement SharePoint for the first time and to groups planning for an upgrade or migration to SharePoint 2010.

A Series of Containers

I tend to think about SharePoint from a container perspective.  Each of the containers has a set of settings and features that can be configured and administered for all of the containers within.  It is important to understand the boundaries of each level so that you can create a site topology that meets all of your objectives.  I’ll cover a sub-set of these I typically consider while planning the site topology.  The containers I’m going to address include:

Farm – In the context of this article, all of the Applications, Site Collections, and Sites hosted by one or more connected SharePoint servers.  I say “connected” because it is possible, and fairly likely that most organizations will have more than one Farm (Dev, Staging, Production for things like Intranet, Extranet, Internet, etc).  A farm has one or more content Applications plus additional applications for things like Central Administration.

Application – An application would be the top level address which maps to an application in IIS.  For example http://sharepoint.  An application has one or more Site Collections.  The application level is also the level where you have the opportunity to identify one or more content databases for storing the content on the SQL server.

Site Collection – A site collection has one or more Sites also referred to as Webs or Sub-Sites.

Sub-Sites or Webs – This is the smallest container which resides under and can be managed by the Site Collection.
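
One way to picture the nesting of these containers is as a small data model. The classes below are a planning sketch only; they are hypothetical and do not correspond to the SharePoint object model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Web:
    """A site (web) that may contain further sub-sites."""
    name: str
    webs: List["Web"] = field(default_factory=list)


@dataclass
class SiteCollection:
    """A site collection, identified by URL, with one top-level web."""
    url: str
    root_web: Web


@dataclass
class Application:
    """A web application with its content databases and site collections."""
    url: str
    content_databases: List[str] = field(default_factory=list)
    site_collections: List[SiteCollection] = field(default_factory=list)


@dataclass
class Farm:
    """The outermost container: one or more applications."""
    name: str
    applications: List[Application] = field(default_factory=list)


def count_webs(web):
    """Count a web plus all of the sub-sites nested beneath it."""
    return 1 + sum(count_webs(child) for child in web.webs)


# Hypothetical topology for illustration only.
root = Web("Alpha", webs=[Web("Design"), Web("Testing")])
farm = Farm(
    "Production",
    applications=[
        Application(
            "http://sharepoint",
            content_databases=["WSS_Content"],
            site_collections=[SiteCollection("http://sharepoint/sites/alpha", root)],
        )
    ],
)
print(count_webs(root))
```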

Types of Content

The first step is to identify what types of content or site(s) you expect to host.

  • Internet
  • Extranet
  • Intranet
  • MySites
  • Company or Divisional Portal(s)
  • Web Content Management (Publishing)
  • Electronic Content Management (Document Management)
  • Business Intelligence
  • Project Management
  • Team Collaboration
  • Application Hosting

Not all of those types of content apply to every organization, but it should be pretty clear that the content, the update frequency and the manner in which it is used and managed can vary quite a bit from type to type. 

Considerations of Multiple Applications

Very small companies or workgroups may be able to get by with all of the types of content hosted in a single Application, but for anything larger there should be plans to segment it to multiple Applications in the farm.

Here are some considerations when using multiple Applications:

Authentication Model – What type of authentication model will be used?  Options include Anonymous, Windows Integrated (NTLM or Kerberos), and Forms Based Authentication (FBA).  Since an Application can only support a single authentication mode, that can dictate additional applications.

Edit:  Anders Rask was kind enough to point out some changes to the authentication model in 2010 that make the previous statements incorrect.

In 2007:  Options include Anonymous, Windows Integrated (NTLM or Kerberos), and Forms Based Authentication (FBA).  Since an Application can only support a single authentication mode, that can dictate additional applications.

In 2010:  There are two high-level options: Classic, which supports Windows Authentication only, and Claims Based, which supports one or more different providers.  Additional information can be found in TechNet’s article Plan Authentication Methods.

Sessions – When a user visits an application a session is created.  It is important to understand that their session is for a specific IIS Application so if they visit the main company portal and then click to see the MySite they may be asked to authenticate again.  With Anonymous this should not be an issue.  With Windows Integrated this should not be an issue if 1) users are using Internet Explorer 2) they access the site(s) with the same account they use to access their computer and 3) their browser is properly configured to pass logged on user information. 

Application Scoped Solutions – Solutions can be scoped to specific Applications which can provide some flexibility in deploying new features.  In a large environment it is important to only show features to the areas where they apply.

Considerations for Site Collections

The site collection has some important boundaries to consider when deciding to use one versus multiple site collections.  They include:

Amount of Data – It is important to keep your site collections at a maintainable level.  There is no hard limit, but as Site Collections grow over 40GB they can get more difficult to maintain and will take longer to restore.  SQL Server tuning becomes more critical with larger databases.  It is a good idea to segment your content across multiple site collections (and across multiple content databases) in a manner that makes sense.

SharePoint Groups – SharePoint Groups can be a good way to organize users, especially when they do not map to functional areas or groups otherwise managed in Active Directory.  These groups are defined and contained within a given Site Collection which means if you want to use them in multiple site collections they have to be duplicated and maintained separately which can be problematic.

Site Collection Administrator – A site collection administrator has great power and control within the site collection container.  A Site Collection administrator can choose themes, manage all security within the site collection, manage activated solutions and potentially deploy other customizations if policy permits.  Enabling content owners should be a top priority, and the Site Collection provides a good container for that.

Quota Management – Quotas are set at the Site Collection level so if granular quota management is required for billing or charge back purposes it may have some impact on how site collections are segmented. 

Navigation – The default SharePoint navigation provider does not span site collections (and therefore applications) which can make a standardized or unified navigation scheme difficult to maintain.  This tends to work fine for team collaboration sites, but can be cumbersome when you need to link many site collections.

Content Types – I think Content Types were one of the most important changes introduced with WSS 3.0 and MOSS.  An entire overview of content types is outside the scope of this article, but keep in mind that within the current release Content Types are created and maintained within a Site Collection.  If a Content Type applies to multiple site collections then it needs to be duplicated.  SharePoint 2010 will support farm level content types which will remove the need to duplicate them.

Profiles (WSS Only) – If you are using WSS it is important to understand that the profiles are stored at the Site Collection level.  If you add custom attributes they will need to be added to all site collections they apply to.

Implications on Backup and Recovery

There are a few different methods to back up and recover SharePoint content.  Within the context of this article it is important to understand the difference between the commands that stsadm provides.

Site Collection Backup and Restore – Considering the boundaries previously discussed, there is a lot of extra content stored in the top level site of a site collection.  Doing an stsadm backup will provide a high fidelity snapshot of the content, configuration, workflows, and other customizations.  It is important to note that if you are moving the site collection between applications or farms that you will need to install any solutions or dependencies referenced in the current location.

Sub-Site or Web Export and Import – The Export process offered for Sub-Sites is great for archiving, but it does not provide the fidelity needed to move sites around.  It will not save workflow, features, solutions or alerts.  I’ve also had inconsistent results with DataViews on sites being migrated in this manner.

If you think you will need to migrate the content or want flexibility, it would be in your best interest to consider using more Site Collections rather than deeply nested Sub-Sites.
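
For reference, the two pairs of stsadm operations discussed above are `backup`/`restore` (site collection level) and `export`/`import` (sub-site level). The sketch below only builds the command line as a list of arguments and does not run anything; the helper name, URLs, and file names are hypothetical, and optional switches such as overwrite behavior are left out.

```python
def stsadm_command(operation, url, filename):
    """Build the argument list for an stsadm backup, restore, export, or import."""
    allowed = {"backup", "restore", "export", "import"}
    if operation not in allowed:
        raise ValueError(f"unsupported operation: {operation}")
    return ["stsadm", "-o", operation, "-url", url, "-filename", filename]


# Site collection backup: the high-fidelity option.
backup = stsadm_command("backup", "http://sharepoint/sites/alpha", "alpha.bak")
# Sub-site export: useful for archiving, lower fidelity.
export = stsadm_command("export", "http://sharepoint/sites/alpha/design", "design.cmp")

print(" ".join(backup))
print(" ".join(export))
```

The argument lists could be handed to `subprocess.run` on a server where stsadm is on the path, but that step is deliberately left out here.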

Upgrade and Migration Considerations

The purpose and content within a site collection or site can evolve over time.  Some sites that started very narrow in purpose may change and now warrant their own Site Collection or Application.  Preparing for a migration or upgrade is a good time to run through this exercise again to validate the assumptions and decisions that were previously made or overlooked.  While it may make the move more difficult, the changes will pay dividends over the coming years, offering a system better tuned to the users’ needs and more maintainable by the site owners and farm administrators.


Stacking Managed Paths

Most SharePoint administrators are likely familiar with the Managed Paths feature that allows certain paths to be designated to hold site collections. The “sites” entry is added by default. Not everyone knows that those paths can be stacked to provide the same functionality at different levels.

Flat versus Deep
As part of any Taxonomy plan the team should always consider the appropriate depth to the site collections. Some might argue for a flat taxonomy with lots of siblings, while others will push for a traditional hierarchy.

Using single-level Managed Paths helps to provide a fairly flat site collection taxonomy. That would give you a top-level site, and then site collections organized one level down. In most of the systems I have worked with there is more than one entry, typically separated by general purpose (e.g. Projects, Team Sites, Applications, ECM Sites).

In some medium complex organizations that may not be enough. Perhaps there are multiple business units, and the sites could benefit from some BU level landing pages with the site collections below that. Stacking the managed paths will provide this.

For example:

Regular managed paths: http://myserver/units/

Stacked managed paths: http://myserver/units/euro/sites/

Note: In very large organizations it is likely that these sites would be in separate web applications or even separate farms. This technique may still offer value.
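
To see how the stacked example above would carve up URLs, here is a minimal sketch that works out which site collection a URL falls under. It assumes both entries are wildcard managed paths and that the longest matching path wins; the function name and the sample URLs beyond those in the example are hypothetical, and this is not how SharePoint itself is implemented.

```python
from urllib.parse import urlparse

# Wildcard managed paths from the example above.
MANAGED_PATHS = ["units", "units/euro/sites"]


def site_collection_root(url, managed_paths):
    """Return the server-relative path of the site collection that owns `url`."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    lowered = [s.lower() for s in segments]
    best = None
    for managed in managed_paths:
        prefix = [p.lower() for p in managed.strip("/").split("/")]
        # A wildcard path needs one extra segment to name the site collection.
        if len(lowered) > len(prefix) and lowered[: len(prefix)] == prefix:
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return "/"  # falls back to the root site collection
    return "/" + "/".join(segments[: len(best) + 1])


print(site_collection_root("http://myserver/units/euro", MANAGED_PATHS))
print(site_collection_root("http://myserver/units/euro/sites/finance/docs", MANAGED_PATHS))
```

The first URL resolves to the landing site collection at /units/euro, while the second resolves to a separate site collection at /units/euro/sites/finance.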

Landing Pages/Sites
To establish a landing area Site Collection above each of the managed path entries you will need to define a single level managed path and create the site collection as usual. Using the example above, /units would be used for the managed path and the site collection would be set up at http://myserver/units/euro. This would serve as the landing page for European Operations. The site collections underneath would be organized in the sites path.

When establishing the managed paths, the system does not check to see if a site exists at the path you supplied. If you were to set up a site collection at http://myserver/units and then establish /units as a managed path, the site would no longer be accessible.
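
Since the system will not warn you, a quick check against a list of existing site collection URLs before adding a managed path can catch this. The sketch below is a hypothetical helper that works on a plain list of URLs; getting that list out of your farm is left to whatever inventory you already keep.

```python
from urllib.parse import urlparse


def shadowed_by_managed_path(existing_site_collections, new_managed_path):
    """Return the site collections sitting exactly at the proposed managed path."""
    target = "/" + new_managed_path.strip("/").lower()
    return [
        url
        for url in existing_site_collections
        if urlparse(url).path.rstrip("/").lower() == target
    ]


# Hypothetical inventory for illustration only.
existing = ["http://myserver/units", "http://myserver/sites/hr"]
conflicts = shadowed_by_managed_path(existing, "/units")
print(conflicts)
```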

Site Collections versus Sub-Sites
There are a number of reasons why you might choose the isolation of Site Collections versus creating sub-sites. One big reason to require Site Collections would be for Quota Management which may be needed to support a chargeback system. Since quotas are set at the site collection level you will need isolation in order to manage that effectively and provide accurate notification and reporting.
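
Because each site collection carries its own quota, a per-site-collection status report is straightforward to reason about. The sketch below classifies usage against a warning level and a maximum; the numbers, paths, and status labels are made up for illustration.

```python
def quota_status(used_mb, warning_mb, maximum_mb):
    """Classify a site collection's storage use against its quota."""
    if used_mb >= maximum_mb:
        return "limit reached"
    if used_mb >= warning_mb:
        return "warning"
    return "ok"


# Hypothetical usage figures: (used, warning level, maximum) in MB.
usage = {
    "/Projects/Alpha": (450, 400, 500),
    "/Projects/Omega": (120, 400, 500),
}
report = {path: quota_status(*values) for path, values in usage.items()}
print(report)
```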

Search versus Information Architecture

At a recent Triangle SP User Group meeting an interesting debate broke out regarding the use of search to find information inside of a SharePoint system. One person went as far as to say that “search is a last resort” while another said “if a user has to search, you have failed.” Those are some pretty strong positions to say the least. I believe that Information Architecture and Taxonomy are incredibly important, but Search is as well and should not be a last resort. If proper planning is not done in both areas people are going to have trouble finding what they are looking for.

Preferred Method Depends on Context
First I think this debate requires context since the type and the amount of content to look through can range dramatically. Browsing a Wiki with 20 pages is not the same as browsing a Document Library filled with 100,000 invoices. For the former it would seem crazy to need to rely on search, but with the latter it is absolutely essential to most end users.

Users Think Differently
I have had the chance to do usability studies for large SharePoint implementations in a few organizations now. Early on I was amazed at some of the feedback I received with regards to organization and taxonomy. All users do not process thoughts the same way, so while you can work hard to cover all your bases some will still be left out. Put as much time into planning the Taxonomy as possible, and review it on a regular basis as the content and purpose evolves.

Search Needs Tuning
The search service with MOSS can be pretty powerful, but it isn’t something you just turn on and expect targeted search results to be delivered. Time should be spent planning the search system to take advantage of its powerful features including authoritative levels, keywords, best bets, scopes, etc.

As part of this planning effort you should try to identify which sections or sites could most benefit from custom search interaction. If the nature of the data is such that searching is likely to be needed, configure tools to make it as easy as possible to pull that specific set of data. In Mike Gannotti’s presentation, one of the things he suggested was creating a sub-site for each feature or type of list. The advantage of this is that when users search at the “This Site” scope, only data from that level will be returned. With the exception of Wiki configuration, this is not something I have typically done.

Learned Behavior – Browsing versus Searching
After the meeting I reflected on the topic quite a bit, and then a day or two later something occurred to me while working on my Vista machine. When I went to open a program I found myself using the desktop search feature to find the programs that were not pinned to the Start Menu. In the past, on previous computers, I had spent an insane amount of time building a system to help organize my program menu. Since I had too many things installed and the menu would wrap, I created a taxonomy with high level categories that led to the program folders (e.g. Dev Tools, Sys Tools, Office). Everyone agrees that browsing to something when you know its exact location is very quick. However, it is quicker still to type in a name and then double click the result returned. Reaching the Event Viewer, for example, has never been so easy. Without realizing it, my behavior in this OS has changed. While I am not a typical end-user, I believe that I am not unique in being able to adapt to new ways of finding information.

End User Ownership and Evolving Content
Some of the biggest Information Architecture challenges I have seen come from the fact that in most organizations many of the site collections are owned and managed by business units. You can try to educate them on proper planning and maintenance, but in most cases it is far from their top priority even as people complain about usability.

In many cases the purpose of the system changes leaving it out of sync with the taxonomy or navigation design. This is a sure fire thing that pushes users to rely on search.

Best Way Forward
I think the best way to tackle all of these challenges starts with proper planning and site owner education up front, followed by periodic reviews of existing sites after the fact. If you can schedule reviews either quarterly or semi-annually as part of the overall service offering, it will help you limit some of the issues of end user ownership and evolving content. By keeping the system tuned, users will be more productive, and therefore more interested in using the system, which increases end user buy-in.

Where Do You Stand?
Where do you stand on the debate? Do you have anything to contribute to the discussion? Is Search a last resort? Is developing a maintainable taxonomy even possible?

Planning for Separate Site Collections

When planning out the site structure of a new SharePoint system, here are five things I take into consideration. Taking these into account during your planning phase will hopefully reduce the amount of rework needed as your system evolves.

Amount of data

If you are expecting a large data set, perhaps because you are scanning all supplier invoices and using SharePoint for Content Management, it’s important to consider the size of a Site Collection and the underlying Content Database from an Administration standpoint. Larger Site Collections and Content DBs require more care and attention along with a more advanced skill set. There have been a number of discussions on what the guidelines are, but I aim to keep Content DBs under 40 GB unless there is a real exception. In this case, the size of the data overrules any of the other answers.

For site collections I expect to grow large, I set them up in a dedicated Content DB right from the start. This saves the trouble of having to move them to a different content DB later, which can take some time with large sites.
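
As a planning aid, the approach can be sketched as a small allocation routine: large site collections get a dedicated Content DB, and the rest are packed into shared databases without exceeding the 40 GB guideline. The 20 GB cut-off for "large", the database names, and the size estimates are all assumptions made up for this example.

```python
DB_LIMIT_GB = 40  # working guideline from the text, not a hard product limit


def plan_content_databases(expected_sizes_gb, dedicated_threshold_gb=20):
    """Map content database names to the site collections they should hold."""
    plan = {}
    shared = []  # shared databases created so far, with running totals
    ordered = sorted(expected_sizes_gb.items(), key=lambda item: -item[1])
    for name, size in ordered:
        if size >= dedicated_threshold_gb:
            # Large site collections get their own database from the start.
            plan[f"WSS_Content_{name}"] = [name]
            continue
        for db in shared:
            if db["used"] + size <= DB_LIMIT_GB:
                db["used"] += size
                plan[db["name"]].append(name)
                break
        else:
            # No shared database has room, so open a new one.
            db = {"name": f"WSS_Content_Shared{len(shared) + 1}", "used": size}
            shared.append(db)
            plan[db["name"]] = [name]
    return plan


# Hypothetical size estimates in GB.
estimates = {"Invoices": 35, "Alpha": 10, "Omega": 15, "HR": 12, "Proposals": 8}
plan = plan_content_databases(estimates)
for database, site_collections in plan.items():
    print(database, site_collections)
```

This is a simple first-fit approach, so it will not always find the tightest packing, but it keeps every shared database within the guideline.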

Type and Purpose of the Sites

In a small to mid-sized organization I typically treat sites that support Intranet or department level collaboration a little differently than cross-functional or project type sites. It’s easy to define the groups around the functional areas and you can reuse those groups in other sites where there is overlap.

In larger organizations sites typically require more isolation. There may be some interaction between the groups, but not within divisions. In this case the Site Collections can be structured to support the organizational boundaries.

Groups Definition and Membership

Groups are defined within the site collection container. If you set up a group called “Management” in Site Collection A, it wouldn’t be available to Site Collection B unless it’s set up separately. It may not sound like a big deal, but it gets very tedious when you have to manage the same group in multiple places. To make matters worse, administrators often take for granted who is in the group based on the name. When setting Site Permissions, a group’s membership is not directly visible.
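
A simple audit can surface same-named groups whose membership has drifted apart between site collections. The sketch below assumes you have already exported group membership into a dictionary; the names and structure are hypothetical.

```python
def find_group_drift(groups_by_site_collection):
    """Return groups that exist in several site collections with different members."""
    seen = {}
    for site_collection, groups in groups_by_site_collection.items():
        for group_name, members in groups.items():
            seen.setdefault(group_name, {})[site_collection] = frozenset(members)
    return {
        name: copies
        for name, copies in seen.items()
        if len(copies) > 1 and len(set(copies.values())) > 1
    }


# Hypothetical membership export for illustration only.
membership = {
    "Site Collection A": {"Management": {"ann", "bob"}},
    "Site Collection B": {"Management": {"ann", "carol"}, "Finance": {"dave"}},
}
drift = find_group_drift(membership)
for group, copies in drift.items():
    print(group, {sc: sorted(members) for sc, members in copies.items()})
```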


Navigation

WSS and MOSS have some good out of the box navigation systems. They are limited, though, when it comes to trying to tie together multiple Site Collections. There are solutions like defining custom providers, but then you are introducing sophisticated customizations that not every company can support. A last ditch option is to manually link to sites and resources, but then everything becomes much more difficult to manage.

If it is important to have a consistent horizontal navigation scheme, look to keep as much as possible within a limited number of site collections.

Aggregating or Reusing Data

One of SharePoint’s greatest values is in its ability to support syndication and aggregation of content to deliver it where the users need it. You can aggregate the content using tools like the Content Query Web Part or DataViews. These techniques get a little more complicated when content is in different site collections. Custom code or third party tools are then required to bring things together.

Agree? Disagree? Let me know, I would love to hear your feedback.
