
SEO: Optimizing your ASP.NET MVC site for Google Search using meta properties, robots.txt and sitemap.xml

For many websites, “organic” search results from Google are the most important source of traffic. This means that it is extremely important that content on the website gets a high ranking in the search results, and that it stands out among the other search results. In this post I am going to implement some basic SEO techniques in ASP.NET MVC so that the pages of the site can be found and indexed as well as possible by Google and other search engines.

This snippet from the search results shows how it is possible to draw extra attention to your search results, using schema.org microformats as an SEO technique:

image

And you probably found this post using Google, after clicking a link to this site that shows my picture and name as the author.

image

This post is about how to get these optimizations on your site, using ASP.NET MVC.

Black-hat and white-hat SEO
In many ways, SEO is a battleground. Google always tries to present the most relevant results for its users at the top (complementing those results with the best-paying advertisements). It uses proprietary algorithms and lots of data to calculate the best results, and those algorithms change all the time. Understanding those algorithms and manipulating search results has become big business: search engine optimization. There is black-hat SEO and white-hat SEO (and many shades of grey in between). Black-hat SEO (like paying a company to litter internet forums and blog discussions with links to your site, or paying for Twitter followers and Facebook Likes) is dangerous: if it is ever found out, your site may be banned from the Google index for some time, or you may become an object of public mockery on the internet. Also, users who are tricked into visiting your site are probably not going to be customers anyway. See this video from the Onion if you want to have a good laugh about social media marketing (yes, this really happens).

My advice is to stay as far away as you can from scammy SEO companies and their techniques. Correctly implementing basic SEO is usually good enough to get your content on the first page of search results, if you have good content. Think, in advance, about your target audience on the wider internet, how to attract them to your site and then how to convert them to customers. If basic SEO does not give the results you want, consider advertising. If your Content Management System is unable to deliver SEO-optimized pages, fix that first and do not try to work around it.

The starting point for this article is the ASP.NET MVC site that I created before, based on the ASP.NET MVC 4 Internet Site template. I created generic content pages using a simple content editing feature based on Markdown. The content is stored in a SQL Server table, that is initialized with Fluent Migrator and queried using the SqlFu Micro ORM.

This is the action plan to implement SEO:

  • Create a SEO friendly “Slug” to optimize the URL
  • Add keywords and <meta> properties to all our pages
  • Specify a canonical url for all content
  • Use schema.org or RDFa microformats to optimize display of search results
  • Add author picture to search results
  • Create a robots.txt
  • Create a sitemap.xml
  • Use Google Webmaster Tools to check the results

Database changes
First, we need to add columns to the database to store the SEO fields. Because I am using FluentMigrator, I do this by adding a new Migration class next to our earlier migrations; it will be run automatically the next time the application is started. The ContentPage table already has a field “Slug” that contains an SEO-optimized url segment derived from the title. This url is very important for the SEO ranking. I will add a text field Keywords, another text field GooglePlusId for the Google Plus user id of the author, a datetime Modified and a boolean NoSearch. Modified will automatically be set to the last modification time using a trigger.


using FluentMigrator;
using System;

namespace Auction.Web.Domain.Migrations
{
    [Migration(3)]
    public class ContentPage : Migration
    {
        public override void Up()
        {
            Create.Table("ContentPage")
                .WithColumn("Id").AsInt32().NotNullable().PrimaryKey().Identity()
                .WithColumn("Slug").AsString(120).NotNullable().Indexed("IX_slug")
                .WithColumn("Title").AsString(150).NotNullable()
                .WithColumn("Author").AsString(100).Nullable()
                .WithColumn("SortOrder").AsFloat().NotNullable().WithDefaultValue(0.0f)
                .WithColumn("Markdown").AsString(8000).Nullable();

            Execute.Sql("SET IDENTITY_INSERT [ContentPage] ON");
            Insert.IntoTable("ContentPage").Row(new { Id = 1, Slug = "about", Title = "About this site", Author = "Maarten Sikkema", MarkDown = @"Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Quisque posuere laoreet dui sit amet euismod. Morbi laoreet laoreet justo eu auctor. Suspendisse facilisis dui non elit blandit faucibus. Proin nec malesuada augue. Quisque ut sollicitudin mi.
Ut eleifend quam a massa gravida scelerisque. Cras commodo, risus a fermentum tristique, mi nibh vulputate dui, id rhoncus quam sem vitae orci. Sed et eros condimentum, dictum urna sit amet, imperdiet massa.
Nunc sed fermentum nibh, ut tristique augue. Curabitur in nisl a purus luctus porta. Sed tellus augue, hendrerit scelerisque enim sed, euismod facilisis purus.
Integer sed leo nec tellus elementum egestas ac ut magna. Donec risus libero, cursus quis mauris ut, tristique porttitor nulla.
[More information](http://www.macaw.nl)" });

            Execute.Sql("SET IDENTITY_INSERT [ContentPage] OFF");
        }

        public override void Down()
        {
            Delete.Table("ContentPage");
        }
    }
}
(ContentPage.cs)

using FluentMigrator;

namespace Auction.Web.Domain.Migrations
{
    [Migration(5)]
    public class Seo : Migration
    {
        public override void Down()
        {
            Execute.Sql(@"drop trigger SetContentPageModified");
            Delete.Column("Modified").FromTable("ContentPage");
            Delete.Column("Keywords").FromTable("ContentPage");
            Delete.Column("GooglePlusId").FromTable("ContentPage");
            Delete.Column("NoSearch").FromTable("ContentPage");
        }

        public override void Up()
        {
            Alter.Table("ContentPage")
                .AddColumn("Modified").AsDateTime().WithDefault(SystemMethods.CurrentUTCDateTime)
                .AddColumn("Keywords").AsString(120).Nullable()
                .AddColumn("GooglePlusId").AsString(30).Nullable()
                .AddColumn("NoSearch").AsBoolean().WithDefaultValue(false);

            // Give the existing rows sensible values for the new columns.
            Execute.Sql(@"update ContentPage set NoSearch = 0, Modified = getutcdate(), GooglePlusId = '109374356863565148764'");

            // Keep Modified up to date on every update that does not set it explicitly.
            Execute.Sql(@"create trigger SetContentPageModified on ContentPage for update as
                begin
                    if not update(Modified)
                    begin
                        update ContentPage
                        set Modified=getutcdate()
                        from ContentPage inner join inserted i
                        on ContentPage.Id = i.Id
                    end
                end"
            );
        }
    }
}
(SeoMigration.cs)


The ContentPage Model (that maps to the ContentPage table in SQL Server) looks like this:


using SqlFu;
using System;
using System.ComponentModel.DataAnnotations;

namespace Auction.Web.Domain.Entities
{
    [Table("ContentPage", PrimaryKey = "Id", AutoGenerated = true)]
    public class ContentPage : Entity<int>
    {
        [Required]
        [RegularExpression(@"^[a-z0-9-]+$")]
        [Display(Name = "SEO friendly url: only lowercase, number and dash (-) character allowed")]
        public string Slug { get; set; }

        [Required]
        public string Title { get; set; }

        [Required]
        public string Author { get; set; }

        [RegularExpression(@"^[0-9]*$")]
        [Display(Name = "Google+ Id of the author (for picture in search results)")]
        public string GooglePlusId { get; set; }

        [Required]
        public float SortOrder { get; set; }

        [Display(Name = "Comma-separated list of keywords for search engines")]
        public string Keywords { get; set; }

        [Required]
        [Display(Name = "Content in Markdown format")]
        public string Markdown { get; set; }

        [Display(Name = "Last Modified On")]
        [QueryOnly]
        public DateTime Modified { get; set; }

        [Required]
        [Display(Name = "Exclude this content from search engine results")]
        public bool NoSearch { get; set; }
    }
}
(ContentPage.cs)
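
The Slug field only accepts lowercase letters, digits and dashes, and the column holds at most 120 characters. The post does not show how the slug is derived from the title, so the following is only a minimal sketch of one way to do it. The names SlugHelper and ToSlug are hypothetical and not part of the original project.

```csharp
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper (an assumption for illustration, not from the original project).
public static class SlugHelper
{
    public static string ToSlug(string title, int maxLength = 120)
    {
        if (string.IsNullOrWhiteSpace(title))
        {
            return string.Empty;
        }

        // Decompose accented characters (é becomes e + combining accent) and drop the accents.
        var normalized = title.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(normalized.Length);
        foreach (var c in normalized)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }

        // Lowercase, then turn every run of other characters into a single dash.
        var slug = sb.ToString().ToLowerInvariant();
        slug = Regex.Replace(slug, "[^a-z0-9]+", "-").Trim('-');

        // Respect the column length without leaving a trailing dash.
        if (slug.Length > maxLength)
        {
            slug = slug.Substring(0, maxLength).Trim('-');
        }

        return slug;
    }
}
```

For example, SlugHelper.ToSlug("Café au lait") gives "cafe-au-lait", which passes the regular expression on the Slug property.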


I also add these fields to the edit pages in the Admin area, so that I can edit content pages. The edit page looks like this:

image

Meta properties
Now we can create the view for this content page. We will use these extra fields to help Google index and show the page optimally. The _Layout page has a section named “meta” that is added in the <head>. In this section we put all the page-specific meta properties:

  • description: a short abstract of the content. I don’t want to enter this manually, so I just use the first 50 characters of the content. Of course I could also have used another field
  • og properties: these are mainly for Facebook, but it seems Google also uses them for sharing on Google Plus
  • <link rel="canonical">: this is a very important property that is used to avoid feeding Google duplicate content. Google does not like duplicate content, and will lower its ranking. However, Google sees http and https as different addresses. It also treats urls in a case-sensitive way, so it sees /Page/About as a different address than /page/about. ASP.NET MVC is case-insensitive by default, and coming from a Windows background we are not used to working with case-sensitive paths, so we will easily make mistakes with internal links. To avoid getting “punished” for that, we need to specify a canonical url for every page: the one original address of the page. The address should be fully qualified, starting with http:// or https://. I have not found negative consequences of using https:// addresses for the canonical url and actually think it is a good idea to use https:// for all pages by default now.

I am using a web.config AppSetting for the external server address. This makes sure that the canonical url is the same, even if the site has multiple DNS addresses (domain.com and www.domain.com).
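
To make the idea concrete, here is a minimal sketch of building a canonical url from a fixed host name and the requested path. It is not the code used in the view (which relies on Url.Action); the class name CanonicalUrl, the choice to strip the query string, and the optional lowercasePath flag are assumptions for illustration. In the site, the host name would come from the “HostName” AppSetting.

```csharp
using System;

// Hypothetical helper (an assumption for illustration, not from the original project).
public static class CanonicalUrl
{
    public static string Build(string hostName, string pathAndQuery, bool lowercasePath = false)
    {
        if (string.IsNullOrWhiteSpace(hostName))
        {
            throw new ArgumentException("A host name is required.", "hostName");
        }

        // Drop the query string and fragment so that tracking parameters
        // do not produce extra addresses for the same content.
        var path = pathAndQuery ?? "/";
        var cut = path.IndexOfAny(new[] { '?', '#' });
        if (cut >= 0)
        {
            path = path.Substring(0, cut);
        }
        if (!path.StartsWith("/", StringComparison.Ordinal))
        {
            path = "/" + path;
        }
        if (lowercasePath)
        {
            path = path.ToLowerInvariant();
        }

        // Always use https and the single configured host name.
        var builder = new UriBuilder(Uri.UriSchemeHttps, hostName.Trim().ToLowerInvariant())
        {
            Path = path
        };
        return builder.Uri.AbsoluteUri;
    }
}
```

With this sketch, CanonicalUrl.Build("www.domain.com", "/Page/About?utm_source=x") returns "https://www.domain.com/Page/About". Whatever casing you choose, use the same form in the canonical link, the internal links and the sitemap.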


Author picture
Google also supports microformats, and in some cases will display these in its results. Microformats are extra attributes in the markup that indicate a semantic meaning. For pages about products with ratings it will show the rating with 1-5 stars, and in some cases also the price or price range. In weblogs and other publications, Google is able to display a picture of the author alongside the link. Google gets this information using a microformat. Google supports multiple microformats, most importantly RDFa and Schema.org. See the Google webmaster tools for the most recent information.

In order to get an author picture to show up next to a content page, we need to do several things:

  • Set up a Google Plus profile (I have looked into whether this can be done with other social networks, but it only seems to work with Google+; this looks like a clear case of monopoly power abuse to me)
  • Enter Profile information (with the picture you want to show in the search results) and add yourself in one or more Circles
  • Register your profile as the owner of your site (or sub-site) using Google Webmaster tools
  • Link from your content to your Google profile with a specific syntax: <a href="https://plus.google.com/[googleplusid]?rel=author">Google+</a>. The rel=author is the important bit that Google uses to match you as the author.

This is the resulting view:


@using Auction.Web.Utility
@model Auction.Web.Domain.Entities.ContentPage

@{
    ViewBag.Title = Model.Title;
    // Use at most the first 50 characters of the content as the description.
    ViewBag.Description = Model.Markdown.Substring(0, Math.Min(Model.Markdown.Length, 50));
}

@section meta {
    <meta name="description" content="@ViewBag.Description" />
    <meta name="keywords" content="@Model.Keywords" />
    <meta property="og:title" content="@ViewBag.Title" />
    <meta property="og:description" content="@ViewBag.Description" />
    <meta property="og:url" content="@Url.Action("ContentPage", "Page", new RouteValueDictionary(new { area = ViewContext.RouteData.DataTokens["area"], slug = Model.Slug }), Uri.UriSchemeHttps, System.Configuration.ConfigurationManager.AppSettings["HostName"])" />
    <meta property="og:image" content="@Request.Url.GetLeftPart(UriPartial.Authority)/Content/images/auction.jpg" />
    <link rel="canonical" href="@Url.Action("ContentPage", "Page", new RouteValueDictionary(new { area = ViewContext.RouteData.DataTokens["area"], slug = Model.Slug }), Uri.UriSchemeHttps, System.Configuration.ConfigurationManager.AppSettings["HostName"])" />
}

<article class="content-item">
    <header class="page-header">
        <h1>@Model.Title</h1>
        <div class="byline">
            By <a rel="author" href="https://plus.google.com/@(Model.GooglePlusId)?rel=author">@Model.Author</a>
            on <time pubdate datetime="@Model.Modified.ToLocalTime().ToString("yyyy-MM-dd")" title="@Model.Modified.ToLocalTime().ToLongDateString()">@Model.Modified.ToLocalTime().ToShortDateString()</time>
        </div>
    </header>
    @Html.Markdown(Model.Markdown)
</article>
(ContentPage.cshtml)
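
Taking the first 50 characters of raw Markdown can cut a word in half or leave Markdown syntax in the description. As a possible refinement, here is a minimal sketch of a helper that strips the most common Markdown markers and cuts at a word boundary. The names MetaDescription and FromMarkdown are hypothetical, and the regular expressions only cover simple cases; this is not a full Markdown parser.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (an assumption for illustration, not from the original project).
public static class MetaDescription
{
    public static string FromMarkdown(string markdown, int maxLength = 50)
    {
        if (string.IsNullOrWhiteSpace(markdown))
        {
            return string.Empty;
        }

        // Replace links of the form [text](url) with just their text.
        var text = Regex.Replace(markdown, @"\[([^\]]*)\]\([^)]*\)", "$1");
        // Remove common Markdown markers (headings, emphasis, code, quotes).
        text = Regex.Replace(text, "[#*_`>]", "");
        // Collapse line breaks and repeated spaces into single spaces.
        text = Regex.Replace(text, @"\s+", " ").Trim();

        if (text.Length <= maxLength)
        {
            return text;
        }

        // Cut at the last space at or before maxLength, so that no word is split.
        var cut = text.LastIndexOf(' ', maxLength);
        if (cut <= 0)
        {
            cut = maxLength;
        }
        return text.Substring(0, cut).TrimEnd();
    }
}
```

In the view, the description line could then become ViewBag.Description = MetaDescription.FromMarkdown(Model.Markdown); which also copes with empty content.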


If you did everything well then, after a day or two, your picture will start showing up next to your blog search results!



ROBOTS.TXT and Sitemap.xml
The robots.txt file is an artifact of ancient internet time, when people searched with Altavista and the only cross-platform format was ASCII text. Before indexing a site, search bots look for a text file named ROBOTS.TXT in which the site owner can specify which parts of the site should be excluded from search engines. You can opt out completely, or exclude specific parts of the site. Only being able to exclude parts of a site is a bit limited, therefore the sitemap was invented. This happened during the XML days, so this is an XML file. Using the sitemap, a site owner can help the search bot easily find all content that should be indexed by providing links in a structured way. In the sitemap file it is also possible to specify the expected frequency of changes and the relative importance of pages, using a number between 0 and 1.

We will implement these two files now in our project. First, we will register the routes:


using Microsoft.AspNet.SignalR;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Mvc;
using System.Web.Routing;

namespace Auction.Web
{
    public class RouteConfig
    {
        public static void RegisterRoutes(RouteCollection routes)
        {
            routes.MapHubs();

            routes.MapRoute(
                name: "sitemap.xml",
                url: "sitemap.xml",
                defaults: new { controller = "Site", action = "SitemapXml" },
                namespaces: new[] { "Auction.Web.Controllers" }
            );

            routes.MapRoute(
                name: "robots.txt",
                url: "robots.txt",
                defaults: new { controller = "Site", action = "RobotsText" },
                namespaces: new[] { "Auction.Web.Controllers" }
            );

            routes.MapRoute(
                name: "googleid.html",
                url: "google{id}.html",
                defaults: new { controller = "Site", action = "Google" },
                namespaces: new[] { "Auction.Web.Controllers" }
            );

            routes.MapRoute(
                name: "ContentPage",
                url: "Page/{slug}",
                defaults: new { controller = "Page", action = "ContentPage" },
                namespaces: new[] { "Auction.Web.Controllers" }
            );

            routes.MapRoute(
                name: "Default",
                url: "{controller}/{action}/{id}",
                defaults: new { controller = "Home", action = "Index", id = UrlParameter.Optional },
                namespaces: new[] { "Auction.Web.Controllers" }
            );
        }
    }
}
(RouteConfig.cs)


Make sure that, in web.config, you set <modules runAllManagedModulesForAllRequests="true">. Otherwise the routes won’t get called.

Now we can implement them in the SiteController:


using Auction.Web.Domain.Queries;
using Auction.Web.Models;
using Auction.Web.Utility;
using System;
using System.Collections.Generic;
using System.Configuration;
using System.Diagnostics;
using System.Globalization;
using System.IO;
using System.Text;
using System.Web.Mvc;
using System.Xml.Linq;

namespace Auction.Web.Controllers
{
    [AllowAnonymous]
    public class SiteController : BaseController
    {
        private const string SitemapsNamespace = "http://www.sitemaps.org/schemas/sitemap/0.9";

        public ActionResult AllowCookies(string ReturnUrl)
        {
            CookieConsent.SetCookieConsent(Response, true);
            return RedirectToLocal(ReturnUrl);
        }

        public ActionResult NoCookies(string ReturnUrl)
        {
            CookieConsent.SetCookieConsent(Response, false);
            // if we got an ajax submit, just return 200 OK, else redirect back
            if (Request.IsAjaxRequest())
                return new HttpStatusCodeResult(System.Net.HttpStatusCode.OK);
            else
                return RedirectToLocal(ReturnUrl);
        }

        [OutputCache(Duration = 60 * 60 * 24 * 365, Location = System.Web.UI.OutputCacheLocation.Any)]
        public ActionResult FacebookChannel()
        {
            return View();
        }

        [OutputCache(Duration = 60 * 60 * 24, Location = System.Web.UI.OutputCacheLocation.Any)]
        public FileContentResult RobotsText()
        {
            var content = new StringBuilder("User-agent: *" + Environment.NewLine);

            if (string.Equals(ConfigurationManager.AppSettings["SiteStatus"], "live", StringComparison.InvariantCultureIgnoreCase))
            {
                content.Append("Disallow: ").Append("/Account" + Environment.NewLine);
                content.Append("Disallow: ").Append("/Error" + Environment.NewLine);
                content.Append("Disallow: ").Append("/signalr" + Environment.NewLine);

                // exclude content pages with NoSearch set to "true"
                var items = Query(new GetSeoContentPages(noSearch: true));
                foreach (var item in items)
                {
                    content.Append("Disallow: ").Append(Url.Action("ContentPage", "Page", new { area = "", slug = item.Slug })).Append(Environment.NewLine);
                }
                content.Append("Sitemap: ").Append("https://").Append(ConfigurationManager.AppSettings["HostName"]).Append("/sitemap.xml" + Environment.NewLine);
            }
            else
            {
                // disallow indexing for test and dev servers
                content.Append("Disallow: /" + Environment.NewLine);
            }

            return File(
                Encoding.UTF8.GetBytes(content.ToString()),
                "text/plain");
        }

        [NonAction]
        private IEnumerable<SitemapNode> GetSitemapNodes()
        {
            List<SitemapNode> nodes = new List<SitemapNode>();

            nodes.Add(new SitemapNode(this.ControllerContext.RequestContext, new { area = "", controller = "Home", action = "Index" })
            {
                Frequency = SitemapFrequency.Always,
                Priority = 0.8
            });

            // include only content pages that are not excluded from search
            var items = Query(new GetSeoContentPages(false));
            foreach (var item in items)
            {
                // the route value must be named "slug" to match the "Page/{slug}" route,
                // so that the sitemap url is the same as the canonical url
                nodes.Add(new SitemapNode(this.ControllerContext.RequestContext, new { area = "", controller = "Page", action = "ContentPage", slug = item.Slug })
                {
                    Frequency = SitemapFrequency.Yearly,
                    Priority = 0.5,
                    LastModified = item.Modified
                });
            }

            return nodes;
        }

        [NonAction]
        private string GetSitemapXml()
        {
            XElement root;
            XNamespace xmlns = SitemapsNamespace;

            var nodes = GetSitemapNodes();

            root = new XElement(xmlns + "urlset");

            foreach (var node in nodes)
            {
                // child elements in the order of the sitemap schema: loc, lastmod, changefreq, priority
                root.Add(
                    new XElement(xmlns + "url",
                        new XElement(xmlns + "loc", Uri.EscapeUriString(node.Url)),
                        node.LastModified == null ? null : new XElement(xmlns + "lastmod", node.LastModified.Value.ToLocalTime().ToString("yyyy-MM-ddTHH:mm:sszzz", CultureInfo.InvariantCulture)),
                        node.Frequency == null ? null : new XElement(xmlns + "changefreq", node.Frequency.Value.ToString().ToLowerInvariant()),
                        node.Priority == null ? null : new XElement(xmlns + "priority", node.Priority.Value.ToString("F1", CultureInfo.InvariantCulture))
                    ));
            }

            // save through a UTF-8 writer so that the XML declaration says utf-8
            using (var ms = new MemoryStream())
            {
                using (var writer = new StreamWriter(ms, Encoding.UTF8))
                {
                    root.Save(writer);
                }

                return Encoding.UTF8.GetString(ms.ToArray());
            }
        }

        [HttpGet]
        [OutputCache(Duration = 24 * 60 * 60, Location = System.Web.UI.OutputCacheLocation.Any)]
        public ActionResult SitemapXml()
        {
            Trace.WriteLine("sitemap.xml was requested. User Agent: " + Request.Headers.Get("User-Agent"));

            var content = GetSitemapXml();
            return Content(content, "application/xml", Encoding.UTF8);
        }

        public ActionResult Google(string id)
        {
            if (ConfigurationManager.AppSettings["GoogleId"] == id)
                return View(model: id);
            else
                return new HttpNotFoundResult();
        }
    }
}
(SiteController.cs)

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Routing;
using System.Web.Mvc;

namespace Auction.Web.Models
{
    public class SitemapNode
    {
        public string Url { get; set; }
        public DateTime? LastModified { get; set; }
        public SitemapFrequency? Frequency { get; set; }
        public double? Priority { get; set; }

        public SitemapNode(string url)
        {
            Url = url;
            Priority = null;
            Frequency = null;
            LastModified = null;
        }

        public SitemapNode(RequestContext request, object routeValues)
        {
            Url = GetUrl(request, new RouteValueDictionary(routeValues));
            Priority = null;
            Frequency = null;
            LastModified = null;
        }

        private string GetUrl(RequestContext request, RouteValueDictionary values)
        {
            var routes = RouteTable.Routes;
            var data = routes.GetVirtualPathForArea(request, values);

            if (data == null)
            {
                return null;
            }

            // check for a usable request before reading its url
            if (request.HttpContext == null || request.HttpContext.Request == null)
            {
                return null;
            }

            var baseUrl = request.HttpContext.Request.Url;
            var relativeUrl = data.VirtualPath;

            return baseUrl != null
                ? new Uri(baseUrl, relativeUrl).AbsoluteUri
                : null;
        }
    }

    public enum SitemapFrequency
    {
        Never,
        Yearly,
        Monthly,
        Weekly,
        Daily,
        Hourly,
        Always
    }
}
(SitemapNode.cs)




I have implemented ROBOTS.TXT to look for an App Setting “SiteStatus”. If this is anything other than “live” (we use test and dev values), it will exclude the whole site from indexing. Otherwise, it will return:

image

(I found that Google actually tries to index SignalR output, which won’t do much good…)
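
To show the two outcomes side by side, here is a minimal sketch of the same decision as a standalone function, without the controller and the database query. The class name RobotsTxt and the Build signature are hypothetical; the list of paths stands in for the fixed entries plus the pages that have NoSearch set.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (an assumption for illustration, not from the original project).
public static class RobotsTxt
{
    public static string Build(string siteStatus, string hostName, IEnumerable<string> disallowedPaths)
    {
        var sb = new StringBuilder();
        sb.Append("User-agent: *\n");

        // Anything other than "live" (for example "test" or "dev") blocks the whole site.
        if (!string.Equals(siteStatus, "live", StringComparison.OrdinalIgnoreCase))
        {
            sb.Append("Disallow: /\n");
            return sb.ToString();
        }

        // On the live site, only the listed paths are excluded.
        foreach (var path in disallowedPaths ?? new string[0])
        {
            sb.Append("Disallow: ").Append(path).Append('\n');
        }

        // Point the search bot at the sitemap.
        sb.Append("Sitemap: https://").Append(hostName).Append("/sitemap.xml\n");
        return sb.ToString();
    }
}
```

Calling RobotsTxt.Build("live", "www.domain.com", new[] { "/Account", "/Error", "/signalr" }) produces:

```
User-agent: *
Disallow: /Account
Disallow: /Error
Disallow: /signalr
Sitemap: https://www.domain.com/sitemap.xml
```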

Opening Sitemap.xml gives this result:

image

Nice, that is the result that Google expects.
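
For reference, the shape of a sitemap entry can be reproduced outside the controller with a few lines of LINQ to XML. This is only a sketch with hypothetical names (SitemapSketch, UrlEntry, Build), and it writes the date-only form of lastmod for brevity.

```csharp
using System;
using System.Globalization;
using System.Xml.Linq;

// Hypothetical helper (an assumption for illustration, not from the original project).
public static class SitemapSketch
{
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Builds one <url> element; optional values are left out when they are null.
    public static XElement UrlEntry(string loc, DateTime? lastModified, string changeFreq, double? priority)
    {
        var url = new XElement(Ns + "url", new XElement(Ns + "loc", loc));
        if (lastModified.HasValue)
        {
            url.Add(new XElement(Ns + "lastmod", lastModified.Value.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)));
        }
        if (changeFreq != null)
        {
            url.Add(new XElement(Ns + "changefreq", changeFreq));
        }
        if (priority.HasValue)
        {
            url.Add(new XElement(Ns + "priority", priority.Value.ToString("F1", CultureInfo.InvariantCulture)));
        }
        return url;
    }

    // Wraps the entries in the <urlset> root element and returns the XML as text.
    public static string Build(params XElement[] entries)
    {
        return new XElement(Ns + "urlset", entries).ToString();
    }
}
```

A call such as SitemapSketch.Build(SitemapSketch.UrlEntry("https://www.domain.com/Page/about", new DateTime(2013, 6, 1), "yearly", 0.5)) yields a urlset element in the sitemaps.org namespace with one url entry containing loc, lastmod, changefreq and priority.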

The SiteController also implements a Google ID action. Upon registering your site in the Google Webmaster tools, Google will supply an ID that you can set in an AppSetting in web.config, after which Google can validate the ID. This proves your ownership of the website to Google. Now you can manage how Google interacts with your site. Google will show errors in your robots.txt or sitemap file if it finds them, and it will show dead links. If you use microformats, you can also test them here. Google will show you how it will display your search results.

image

[This post is part of the series Development of a mobile website with apps and social features]