Get localized pages into search engines

Normally only the pages in the default language get indexed.

Why

Zotonic is really great if you want a multilingual website. Suppose your content is partly localized: there is a default language (of course), and some resources exist only in the default language, some only in another language, and some in several languages. Zotonic uses content negotiation and cookies to switch between localized versions. However, it always presents content in the default language to search engine bots. The problem is to get everything indexed in a way that is acceptable to search engines, avoiding the indexing of duplicate content (in the same language), without degrading the user experience.

Assumptions

Readers are assumed to be comfortable with Erlang programming, applying patches and developing Zotonic templates.

This tutorial requires:
– a Zotonic website in at least two languages, with mod_translation and mod_seo_sitemap enabled
– (optional) a Google Webmaster Tools account with this website

How

After this tutorial, you will have:
– the same experience for users, who in most cases will see all content at URL and will be able to switch languages as before (typically through a widget)
– for each resource available in several languages, the default-language content indexed at URL and each other available language indexed at URL?lg=CODE, where CODE is the language code (en, nl, fr, etc.)
– each resource available in the default language still available, as before, at URL
– alternate versions that search engine bots can discover
– additionally, a sitemap.xml that lists content in all languages, i.e. URL and URL?lg=CODE for resources available in several languages
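
The URL scheme described in this list can be sketched as a small pure function. This is only an illustration, not part of Zotonic: the module name lang_urls and the function indexed_urls/3 are hypothetical, and the fallback for resources that are not available in the default language (a single entry at URL) is an assumption.

```erlang
-module(lang_urls).
-export([indexed_urls/3]).

%% Hypothetical helper (not a Zotonic API): list the URLs that should end up
%% indexed for one resource.
%%   Url       - the page URL of the resource, e.g. "/page/1"
%%   Languages - language codes the resource is available in
%%   Default   - the site's default language code
indexed_urls(Url, Languages, Default) ->
    case lists:member(Default, Languages) of
        true ->
            %% Default-language content stays at Url,
            %% every other language gets Url?lg=CODE.
            [Url | [Url ++ "?lg=" ++ Code || Code <- Languages, Code =/= Default]];
        false ->
            %% Assumption: without a default-language version,
            %% the resource is only listed at its plain URL.
            [Url]
    end.
```

For example, lang_urls:indexed_urls("/page/1", ["en", "fr"], "en") yields the plain URL plus "/page/1?lg=fr".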

1. Language switcher

Somewhere in your page, include the language switch template.

{% include "_language_switch.tpl" %}

You can customize this template, but that is a topic for another cookbook entry.

2. Make resources available at ?lg=CODE

This is the tough part: you need to modify mod_translation.
Save the patch shown below in a file called mod_translation.diff at the root of your Zotonic checkout, and apply it with:

$ patch -p1 < mod_translation.diff

The patch:

diff --git a/modules/mod_translation/mod_translation.erl b/modules/mod_translation/mod_translation.erl
index 944be84..a063858 100644
--- a/modules/mod_translation/mod_translation.erl
+++ b/modules/mod_translation/mod_translation.erl
@@ -62,16 +62,34 @@ init(Context) ->
 %% @doc Check if the user has a prefered language (in the user's persistent data). If not
 %%      then check the accept-language header (if any) against the available languages.
 observe_session_init_fold(session_init_fold, Context, _Context) ->
+    % Honor lg parameter for search engines.
+    case z_context:get_q(lg, Context) of
+        undefined ->
+            language_negotiation(Context);
+        Lang ->
+            LanguageAvailable = languages_available(Context),
+            case lists:member(Lang, LanguageAvailable) of
+                true ->
+                    do_set_language(list_to_atom(Lang), Context);
+                false ->
+                    language_negotiation(Context)
+            end
+    end.
+
+languages_available(Context) ->
+    [ atom_to_list(Lang)
+        || {Lang, Props} <- get_language_config(Context),
+           proplists:get_value(is_enabled, Props) =:= true
+    ].
+
+language_negotiation(Context) ->
     case z_context:get_persistent(language, Context) of
         undefined ->
             case z_context:get_req_header("accept-language", Context) of
                 undefined ->
                     Context;
                 AcceptLanguage ->
-                    LanguagesAvailable = [ atom_to_list(Lang)
-                                            || {Lang, Props} <- get_language_config(Context),
-                                               proplists:get_value(is_enabled, Props) =:= true
-                                         ],
+                    LanguagesAvailable = languages_available(Context),
                     case catch do_choose(LanguagesAvailable, AcceptLanguage) of
                         Lang when is_list(Lang) ->
                             do_set_language(list_to_atom(Lang), Context);

Recompile (with z:m(). in the Zotonic shell). To test that it works, view a localized page at URL?lg=CODE in a browser that has no cookie for your website. Alternatively, fetch the page as Googlebot in Google Webmaster Tools.
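
If you prefer to check from an Erlang shell rather than a browser, the following minimal sketch may help. It assumes the standard inets HTTP client (httpc) is available. The module name lg_check and both functions are hypothetical helpers, not part of Zotonic. By default httpc sends no cookies, and no accept-language header is added here, so the request resembles that of a bot.

```erlang
-module(lg_check).
-export([lg_url/2, fetch/2]).

%% Hypothetical helper: append the lg parameter to a URL, using "&" when the
%% URL already carries a query string.
lg_url(Url, Code) ->
    Sep = case lists:member($?, Url) of
              true -> "&";
              false -> "?"
          end,
    Url ++ Sep ++ "lg=" ++ Code.

%% Fetch Url?lg=Code without cookies or an accept-language header.
%% Returns {StatusCode, Body} so the language of the body can be inspected.
fetch(Url, Code) ->
    {ok, _Started} = application:ensure_all_started(inets),
    {ok, {{_Version, Status, _Reason}, _Headers, Body}} =
        httpc:request(get, {lg_url(Url, Code), []}, [], []),
    {Status, Body}.
```

Calling lg_check:fetch("http://localhost:8000/page/1", "fr") (the host, port and path are placeholders for your own site) should return the French version of the page if the patch is active.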

3. Add a reference to these resources in the head section

You need to edit the _html_head.tpl file, creating it if it does not exist. You can create it in your site's templates directory, or directly in mod_translation to apply it to all your sites.

In this file, put the following content:

{% if id and id.page_url %}
    {% for code,lang in m.config.i18n.language_list.list %}
        {% if all or lang.is_enabled %}
            {% if z_language != code and m.config.i18n.language.value != code %}
<link rel="alternate" hreflang="{{ code }}" href="{{ id.page_url }}?lg={{ code }}"/>
            {% endif %}
        {% endif %}
    {% endfor %}
{% endif %}

Rescan for modules (to let Zotonic find this file) and load a page available in several languages. If you view it in the default language, the head section will include an alternate link to the localized content.
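
To make clear which links the template above emits, here is a rough Erlang equivalent of its filtering logic. It is a sketch for illustration only: the module alt_links, the function alternate_links/4 and the {Code, IsEnabled} tuple format are assumptions, not how Zotonic stores its language configuration.

```erlang
-module(alt_links).
-export([alternate_links/4]).

%% Hypothetical helper mirroring the template's conditions: emit one alternate
%% link per enabled language that is neither the language currently shown
%% (Current) nor the site's default language (Default).
%%   Languages - list of {Code, IsEnabled} tuples (assumed format)
alternate_links(PageUrl, Languages, Current, Default) ->
    [ lists:flatten(
        io_lib:format(
          "<link rel=\"alternate\" hreflang=\"~s\" href=\"~s?lg=~s\"/>",
          [Code, PageUrl, Code]))
      || {Code, IsEnabled} <- Languages,
         IsEnabled =:= true,
         Code =/= Current,
         Code =/= Default ].
```

With English as both the current and the default language, and French enabled but Dutch disabled, only the French link is produced.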

4. List localized pages in the sitemap.xml file

Create a new _sitemap_xml.tpl file in your site's templates directory:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
{% with m.site.hostname|default:"localhost" as hostname %}
<url>
    <loc>http://{{ hostname }}/</loc>
    <lastmod>{{ m.rsc.home.modified|default:now|date:"c" }}</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.00</priority>
</url>
    {% for id in result %}
        {% if not id.seo_noindex and id.page_url != "/" %}
            {% for lang in id.language|default:[z_language] %}
<url>
                {% if not z_language|member:id.language or lang == z_language %}
   <loc>http://{{ hostname }}{{ id.page_url|escapexml }}</loc>
                {% else %}
   <loc>http://{{ hostname }}{{ id.page_url|escapexml }}?lg={{ lang }}</loc>
                {% endif %}
   <lastmod>{{ id.modified|date:"c" }}</lastmod>
   <changefreq>daily</changefreq>
   <priority>{% if id.page_path %}0.8{% else %}0.5{% endif %}</priority>
</url>
            {% endfor %}
        {% endif %}
    {% endfor %}
{% endwith %}
</urlset>

Troubleshooting

There are no troubleshooting steps available for this guide. Please provide any you have learned in the comments below or on the Zotonic Users Group.

This page is part of the Zotonic documentation, which is licensed under the Apache License 2.0.