UPDATE 21st April: It would appear that the snippet of malformed HTML is no longer being included in our blogs! The information doesn't seem to be included in any other fashion, so I'm assuming it will be back once Blogger have decided how to do it properly.
I've been messing about with Blogger templates over the last few days and I've spotted that Blogger seem to have broken the HTML of every blog they host! They have added an invalid element within the head section of each page which causes the premature closing of the head section. Depending on what scripts etc. you use in your template this may cause a problem.
The new code that Blogger have added to our templates is aimed at adding extra metadata to each page, which in turn will give Google more information about each page within a blog when it is included in a search result. They are using the schema.org metadata format to achieve this. Specifically, each page of this blog now contains the following in the head section:
<itemscopetag itemscope='itemscope' itemtype='http://schema.org/Blog'>
<meta content='Code from an English Coffee Drinker' itemprop='name'/>
</itemscopetag>
As you can probably gather, this snippet essentially tags the page as being from a blog whose title is "Code from an English Coffee Drinker". From the full Blog schema you can see that there is actually a whole set of properties that Google could set for each blog, and I'm guessing that at some point in the future they will add more information, which in turn will enrich their search result pages. Now I'm all for adding extra metadata (I've even written a GATE application that runs ANNIE over webpages and then embeds appropriate schema.org metadata), but unfortunately Blogger have messed up their implementation.
The problem is that they have used an itemscopetag tag, which isn't valid in any version of the HTML specification. The specification also tells us that if an unknown tag is encountered while parsing the head section of a page, the parser should "act as if an end tag token with the tag name "head" had been seen, and reprocess the current token". This causes the premature closing of the head section, with everything after it becoming part of the body instead. Depending on what has been forced out of head and into body, and which browser you are using, you may see different results. For example, it looks as if links to the Chrome Web Store are broken by this.
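If you want to see which parts of your own page are affected, the rule quoted above can be roughly simulated. The following is only a minimal sketch in Python, not a real HTML5 parser: it uses the standard library's html.parser as a tokenizer, and the HeadScanner class, the HEAD_TAGS set and the sample page are all my own illustrative assumptions rather than anything Blogger or the specification provides.

```python
from html.parser import HTMLParser

# Start tags that the parsing rules accept while still inside <head>.
# Assumption: a simplified list; anything else forces <head> to close.
HEAD_TAGS = {"base", "basefont", "bgsound", "link", "meta", "title",
             "noscript", "noframes", "style", "script", "template"}


class HeadScanner(HTMLParser):
    """Report which tag closes <head> early and what follows it."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.seen_head_end = False
        self.closed_by = None   # first unknown tag found inside <head>
        self.displaced = []     # start tags written before </head> that
                                # would now end up in <body>

    def handle_starttag(self, tag, attrs):
        if tag == "head" and self.closed_by is None and not self.seen_head_end:
            self.in_head = True
        elif self.in_head:
            if tag not in HEAD_TAGS:
                # Unknown tag: act as if </head> had been seen.
                self.in_head = False
                self.closed_by = tag
        elif self.closed_by is not None and not self.seen_head_end:
            self.displaced.append(tag)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
            self.seen_head_end = True


# Hypothetical page using the snippet Blogger inserts.
page = """<html><head>
<title>Blog</title>
<itemscopetag itemscope='itemscope' itemtype='http://schema.org/Blog'>
<meta content='Code from an English Coffee Drinker' itemprop='name'/>
</itemscopetag>
<link rel='stylesheet' href='style.css'/>
<script src='app.js'></script>
</head><body></body></html>"""

scanner = HeadScanner()
scanner.feed(page)
print(scanner.closed_by, scanner.displaced)
```

Anything listed in displaced was written inside head in the template but would be treated as body content by a conforming parser.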
What Blogger should have done was to add the information to the body tag or one of the main content div tags instead. For example, they could have started the body as follows:
<body itemscope='itemscope' itemtype='http://schema.org/Blog'>
<meta content='Code from an English Coffee Drinker' itemprop='name'/>
This would have embedded exactly the same metadata but in a format conformant with the HTML specification, and which follows the instructions given on the schema.org site.
Unfortunately there doesn't appear to be any way to remove this code from our blogs. The best we can do is to move the piece of template code that generates the invalid tag (as well as lots of other code) as late in the head section as possible, so that it pushes as little code as possible into the body. To do this you need to edit the HTML version of your template and move the line:
<b:include data='blog' name='all-head-content'/>
to just before the closing head tag, so that it looks like:
<b:include data='blog' name='all-head-content'/>
</head>
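If you keep a local copy of your template, the same move can be scripted. This is just a rough sketch in Python that assumes the include line appears exactly as written above (same quoting and spacing) and that the template contains a literal closing head tag; the function name move_include_to_head_end and the sample template are made up for illustration.

```python
# The line Blogger templates use to pull in the generated head content.
INCLUDE = "<b:include data='blog' name='all-head-content'/>"


def move_include_to_head_end(template: str) -> str:
    """Move the all-head-content include to just before </head>."""
    if INCLUDE not in template or "</head>" not in template:
        # Nothing recognisable to move, so leave the template untouched.
        return template
    # Remove the first occurrence, then re-insert it before </head>.
    without = template.replace(INCLUDE, "", 1)
    return without.replace("</head>", INCLUDE + "\n</head>", 1)


# Hypothetical, heavily shortened template.
template = (
    "<head>\n"
    "<b:include data='blog' name='all-head-content'/>\n"
    "<title>My Blog</title>\n"
    "</head>\n"
)
moved = move_include_to_head_end(template)
print(moved)
```

The edit can of course be made by hand in the template editor; the script only saves retyping if you maintain several templates.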
Hopefully Blogger will fix the code they generate soon, but until then we just have to minimize the damage they inflict on our blogs any way we can.