Preparing for a successful migration
Before we dive into the tasks involved in the migration plan, let’s go over some of the things we’ve been doing to get ready.
Replacement subdomain for controller docs
As mentioned a bit earlier, there is some Automation controller documentation available from docs.ansible.com. That documentation is intended for Red Hat customers and is not moving to Read The Docs. We plan to keep the content on the web server after the migration is complete. For this reason, we need to create a new subdomain so folks can still find Automation controller documentation.
We’ve consulted the team responsible for that content and have decided to go with the legacy-controller-docs.ansible.com subdomain.
Consolidating and converting redirects
To facilitate the migration to Read The Docs, we had to drastically reduce the number of server-side redirects. Read The Docs imposes a limit of 100 redirects per project. When we started out with the documentation for the Ansible community package, there were thousands of server-side redirects.
Earlier this year we consolidated all the pre-collections redirects. We also added a Sphinx extension for redirects in the Ansible community docs.
As a result, we should not break any redirects in place for the Ansible community docs defined in the top-level .htaccess file or the pre-collections redirects.
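As a rough illustration of the consolidation idea, the sketch below checks whether a set of exact redirects, once partly replaced by prefix rules, fits under the 100-redirect limit mentioned above. The paths and the rule format are hypothetical and do not reflect the actual redirect files or the Read The Docs redirect syntax.

```python
RTD_REDIRECT_LIMIT = 100  # per-project limit stated in the text above


def apply_prefix_rules(path, prefix_rules):
    """Return the target produced by the first matching prefix rule, or None."""
    for old_prefix, new_prefix in prefix_rules:
        if path.startswith(old_prefix):
            return new_prefix + path[len(old_prefix):]
    return None


def remaining_exact_redirects(exact, prefix_rules):
    """Keep only the exact redirects that no prefix rule already reproduces."""
    return {
        old: new
        for old, new in exact.items()
        if apply_prefix_rules(old, prefix_rules) != new
    }


def fits_limit(exact, prefix_rules, limit=RTD_REDIRECT_LIMIT):
    """Check whether prefix rules plus leftover exact redirects fit the limit."""
    leftover = remaining_exact_redirects(exact, prefix_rules)
    return len(prefix_rules) + len(leftover) <= limit


# Hypothetical example data, for illustration only.
exact = {
    "/ansible/2.3/a.html": "/ansible/latest/a.html",
    "/ansible/2.3/old.html": "/ansible/latest/new.html",
}
rules = [("/ansible/2.3/", "/ansible/latest/")]
print(remaining_exact_redirects(exact, rules))
```

Any exact redirect that survives this filter has to take priority over the prefix rule that also matches it, otherwise the page would land on the wrong target.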
Phase one: Setting things up
Phase one of the migration sets up the legacy-controller-docs.ansible.com subdomain for Automation controller docs and adjusts DNS records.
Creating a separate landing page
We need to create a landing page for the legacy-controller-docs.ansible.com subdomain that should:
- Briefly explain the migration to Read The Docs.
- Provide clear links to Ansible community and Ansible core documentation.
- State that Ansible core documentation for version 2.15 and later will be available from docs.ansible.com after the migration.
Note that we can enable builds for earlier versions in Read The Docs if necessary. It would be good to get feedback.
- Explain that we will not remove any pages from the current server but will no longer actively maintain or update them.
For example, https://docs.ansible.com/ansible/11/getting_started/index.html will still be available from the legacy-controller-docs.ansible.com subdomain.
- Provide entry points to the main pages that will be available after the migration, as follows:
ansible-tower.html
automation-tower-chinese-translations.html
automation-tower-japanese-translations.html
automation-tower-korean-translations.html
automation-tower-prior-versions.html
platform.html
To create the archive landing page, do the following:
- Create an archive-index.html file in the ansible/docsite repository.
This index page should be as simple as possible and not generated from any template. It should use straightforward inline styling.
- Update the .htaccess configuration to allow access to archive-index.html.
- Update the docsite build job in Jenkins to deploy archive-index.html to the web server.
After migration, we will separate the index page from the ansible/docsite repository.
Configuring DNS
Complete the following steps to set up legacy-controller-docs.ansible.com alongside docs.ansible.com.
- Create a new A record for legacy-controller-docs.ansible.com that points to the IP address of the EC2 instance.
- Convert the A record for docs.ansible.com into a CNAME that points to legacy-controller-docs.ansible.com.
This step makes for a smoother transition when we move the CNAME to Read The Docs. The switch becomes a simple replacement of the CNAME value rather than an A-to-CNAME change, which means faster propagation. Likewise, if an issue arises with the migration and we need to roll back, we can quickly revert the CNAME to legacy-controller-docs.ansible.com.
- Lower the Time To Live (TTL) setting for the docs.ansible.com record.
This step helps the CNAME change propagate quickly. The TTL setting tells DNS resolvers how long to cache the record before querying for it again. We can lower the TTL to something like 60 seconds. After doing this, we must wait for a period equal to the original TTL, because resolvers that cached the record before the change keep it for the original TTL duration. Once this waiting period is over, all new DNS queries get the response with the shorter TTL value.
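The waiting period described in the TTL step can be worked out with simple date arithmetic. The sketch below is only an illustration; the original TTL of 3600 seconds and the timestamp are assumptions, not values taken from the actual DNS zone.

```python
from datetime import datetime, timedelta, timezone


def earliest_safe_switch(ttl_lowered_at, original_ttl_seconds):
    """Return the time after which no resolver should still hold the old TTL.

    Resolvers that cached the record just before the TTL was lowered keep it
    for the original TTL, so the CNAME change should wait at least that long.
    """
    return ttl_lowered_at + timedelta(seconds=original_ttl_seconds)


# Assumed values, for illustration only.
lowered_at = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(earliest_safe_switch(lowered_at, original_ttl_seconds=3600))
```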
Configuring the web server
We need to make some changes on the web server so that the docs are available from both docs.ansible.com and legacy-controller-docs.ansible.com.
- Configure the web server to handle requests for legacy-controller-docs.ansible.com.
For example, create /etc/httpd/conf.d/legacy-controller-docs.ansible.com.conf.
- Configure the web server to serve content for both docs.ansible.com and legacy-controller-docs.ansible.com.
- Configure the web server to use different index files for each subdomain.
The resulting configuration should be something like the following:
<VirtualHost *:80>
    # Serve both hostnames from the same document root.
    ServerName docs.ansible.com
    ServerAlias legacy-controller-docs.ansible.com
    DocumentRoot /var/www/html/docs/
    <Directory /var/www/html/docs/>
        Options Indexes FollowSymLinks
        AllowOverride All
        Require all granted
    </Directory>
    # Pick the index file based on the requested hostname.
    <If "%{HTTP_HOST} == 'docs.ansible.com'">
        DirectoryIndex index.html
    </If>
    <If "%{HTTP_HOST} == 'legacy-controller-docs.ansible.com'">
        DirectoryIndex archive-index.html
    </If>
    ErrorLog /var/log/httpd/docs.ansible.com-error.log
    CustomLog /var/log/httpd/docs.ansible.com-access.log combined
</VirtualHost>
Test plan for phase one
The archive subdomain, legacy-controller-docs.ansible.com, will be served from Red Hat managed infrastructure.
- Use dig to verify DNS settings for the archive subdomain and docs.ansible.com.
  - Determine if legacy-controller-docs.ansible.com has an A record, for example: dig a legacy-controller-docs.ansible.com
  - Determine if docs.ansible.com has a CNAME record, for example: dig cname docs.ansible.com
  - Verify TTL for docs.ansible.com, for example: dig docs.ansible.com +ttlunits
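If these checks need to be repeated, the answer lines from dig can be parsed with a few lines of Python. This is a minimal sketch that assumes the output comes from `dig +noall +answer` without `+ttlunits`, so that TTLs are plain integers. The sample records, including the 192.0.2.10 address from the documentation range, are made up for illustration.

```python
def parse_dig_answer(output):
    """Parse answer-section lines of the form 'name ttl class type value'."""
    records = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and dig comments
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[1].isdigit():
            continue  # not a resource record in the expected format
        name, ttl, _rclass, rtype, value = fields
        records.append({
            "name": name.rstrip("."),
            "ttl": int(ttl),
            "type": rtype,
            "value": value.rstrip("."),
        })
    return records


# Hypothetical sample output, for illustration only.
sample = """\
docs.ansible.com.\t60\tIN\tCNAME\tlegacy-controller-docs.ansible.com.
legacy-controller-docs.ansible.com. 60 IN A 192.0.2.10
"""
for record in parse_dig_answer(sample):
    print(record)
```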
We can also complete the following steps to test that everything is set up correctly:
- Ensure that the archive landing page is available from legacy-controller-docs.ansible.com.
- Ensure that entry points for legacy Automation controller documentation are available from legacy-controller-docs.ansible.com.
- Ensure that the docsite build job in Jenkins can update the archive landing page.
- Ensure that both legacy-controller-docs.ansible.com and docs.ansible.com serve the same set of web pages. For example, run gobuster dir -u https://docs.ansible.com -w /path/to/wordlist.txt against each host and compare the results.
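The last check above can also be scripted. The following sketch requests the same paths from both hosts and reports any path where the status codes differ. The function names and the list of paths are hypothetical. The `fetch` parameter exists only so the comparison logic can be exercised without network access.

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the final HTTP status for url (urlopen follows redirects)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def compare_hosts(paths, hosts, fetch=fetch_status):
    """Return {path: {host: status}} for every path whose statuses differ."""
    mismatches = {}
    for path in paths:
        statuses = {host: fetch(f"https://{host}{path}") for host in hosts}
        if len(set(statuses.values())) > 1:
            mismatches[path] = statuses
    return mismatches


HOSTS = ["docs.ansible.com", "legacy-controller-docs.ansible.com"]
```

Calling `compare_hosts(["/platform.html"], HOSTS)` would perform real requests. An empty result means both hosts answered every path with the same status.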
Phase two: Migrating the subdomain
Update the CNAME record for docs.ansible.com to point to Read The Docs.
- Follow the steps to add a custom domain on Read The Docs.
a. Enter docs.ansible.com as the custom domain.
b. Select the Canonical option.
- Update the DNS record for docs.ansible.com so that it points to readthedocs.io.
- Wait for the changes to propagate and then test with something like nslookup to verify the CNAME record.
- Update the canonical url in conf.py to include the projects subdirectory. See docs/docsite/rst/conf.py at commit 350ef3df2c61dcce411f5c237ebc288079586003 in the ansible/ansible-documentation repository.
Test plan for phase two
After we add the docs.ansible.com subdomain to Read The Docs and update the DNS record, we should verify that changes have propagated as expected.
- Check the canonical name that the subdomain points to:
nslookup -type=CNAME docs.ansible.com
- Check against Google’s DNS:
nslookup docs.ansible.com 8.8.8.8
- Check against Read The Docs’ name server:
nslookup docs.ansible.com tegan.ns.cloudflare.com
Phase three: Cleaning up
After the docs.ansible.com subdomain is migrated to Read The Docs hosting, we should do some cleanup work.
Tasks in Google search console
To help preserve SEO authority of docs.ansible.com, we should perform the following steps in the search console:
Custom sitemaps
XML sitemaps help search engines discover and index the site structure faster. Read The Docs automatically creates sitemaps; however, we should consider generating a custom sitemap for the top-level project.
Additionally, we should create a new XML sitemap for the new subdomain to replace the existing ones.
Here are some commands used to create sitemaps:
sudo dnf install nodejs
sudo npm install -g sitemap-generator-cli
sitemap-generator -f ansible-sitemap.xml https://docs.ansible.com/ansible/latest/
sitemap-generator -f automation-controller-sitemap.xml https://docs.ansible.com/automation-controller/latest/
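As an alternative to crawling again, an existing sitemap could be pointed at the new subdomain by rewriting the host in each `<loc>` entry. The sketch below uses only the Python standard library. The function name and the sample sitemap content are hypothetical, and whether the real sitemaps need anything beyond a host swap has not been verified.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, urlunsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Keep the default namespace on output instead of an "ns0:" prefix.
ET.register_namespace("", SITEMAP_NS)


def rewrite_sitemap_host(xml_text, old_host, new_host):
    """Return the sitemap XML with old_host replaced by new_host in every <loc>."""
    root = ET.fromstring(xml_text)
    for loc in root.iter(f"{{{SITEMAP_NS}}}loc"):
        parts = urlsplit((loc.text or "").strip())
        if parts.netloc == old_host:
            loc.text = urlunsplit(parts._replace(netloc=new_host))
    return ET.tostring(root, encoding="unicode")


# Hypothetical sitemap content, for illustration only.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://docs.ansible.com/automation-controller/latest/</loc></url>"
    "</urlset>"
)
print(rewrite_sitemap_host(
    sample_sitemap, "docs.ansible.com", "legacy-controller-docs.ansible.com"
))
```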
Custom robots.txt file
The robots.txt file helps improve SEO by controlling which urls search crawlers can access. We currently disallow several urls in robots.txt. We should investigate how to integrate those rules into the Sphinx configuration, following the Read The Docs documentation. Alternatively, that might be something we just copy across from the top-level domain in the Read The Docs configuration.
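Whichever approach we choose, the resulting rules can be sanity-checked with the `urllib.robotparser` module from the Python standard library. The sketch below is a minimal example. The disallowed path is hypothetical and does not reflect the contents of the current robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt content disallows url for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical rules, for illustration only.
sample_robots = """\
User-agent: *
Disallow: /ansible/2.3/
"""
print(is_blocked(sample_robots, "https://docs.ansible.com/ansible/2.3/index.html"))
print(is_blocked(sample_robots, "https://docs.ansible.com/ansible/latest/index.html"))
```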
Updating internal links
Even though we will have redirects in place that should automatically point to the updated docs.ansible.com urls, we should ensure that as many links are updated as possible. Updating internal linking should help both users and search engines navigate the new structure without relying on redirects.
Here are some places where we should scan for docs.ansible.com urls and make batch updates where necessary:
ansible/ansible-documentation
ansible/ansible
ansible/aap-docs
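A scan of this kind could start from a simple regular expression. The sketch below collects the distinct docs.ansible.com urls found in a piece of text. The pattern is an assumption about how links appear in those repositories and may need tuning for reStructuredText or Markdown link syntax.

```python
import re

# Match http(s) urls on docs.ansible.com up to whitespace or a closing delimiter.
DOCS_URL = re.compile(r"https?://docs\.ansible\.com[^\s\"'`<>)\]]*")


def find_docs_urls(text):
    """Return the sorted, de-duplicated docs.ansible.com urls found in text."""
    found = {match.rstrip(".,;:") for match in DOCS_URL.findall(text)}
    return sorted(found)


# Hypothetical input, for illustration only.
sample_text = (
    "See https://docs.ansible.com/ansible/latest/index.html, then "
    "`http://docs.ansible.com/ansible-core/devel/`. "
    "Unrelated: https://example.com/docs.ansible.com"
)
for url in find_docs_urls(sample_text):
    print(url)
```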
Separating the landing pages
Landing pages refer to the top-level pages that guide users to relevant parts of the documentation.
After the migration, there will be two landing pages:
- docs.ansible.com on Read The Docs and sourced from the ansible/docsite repository
- legacy-controller-docs.ansible.com on the web server and sourced from the ansible/archive-docsite repository
Updating the archive landing page
We should complete the following steps to modify the landing page for legacy-controller-docs.ansible.com:
- Temporarily disable the Jenkins job to build the docsite.
- Fork the ansible/docsite repository to ansible/archive-docsite.
- Rename archive-index.html to index.html.
- Create a new standalone 404 page with the cowsay image.
- Remove the following files and folders:
├── ansible/
├── data/
├── requirements/
├── sass/
├── static/css/
├── static/images/community_logo.svg
├── static/js/
├── templates/
├── .pip-tools.toml
├── .readthedocs.yaml
├── ansible-sitemap.xml
├── build.py
└── noxfile.py
After removing the preceding files and folders, continue with these steps:
- Update .htaccess to use the standalone 404 page.
- Update .htaccess to remove redirects that applied to the docs.ansible.com subdomain.
- Update the catch-all redirect in .htaccess at https://github.com/ansible/docsite/blob/c02fae53bbfae3b296f38b1b04b7639d3431b98a/.htaccess#L11
- Update robots.txt to disallow the /ansible and /ansible-core directories. Consider disallowing all content to avoid competing with redhat.com for Automation controller content.
- Update robots.txt to modify the sitemaps. Remove ansible-sitemap and update the subdomain for automation-controller-sitemap.
- Update automation-controller-sitemap.xml to reflect the change of the subdomain.
- Update the Jenkins job to prune deleted files and folders from the rsync step.
- Update the Jenkins job to clone the ansible/archive-docsite repository instead of ansible/docsite.
- Enable the Jenkins job to build the docsite and run it.
- Modify the web server configuration to serve content for legacy-controller-docs.ansible.com only and to use the index.html file.
Updating the landing page on Read The Docs
We should update the docs.ansible.com landing pages to put the focus on the content journeys, which essentially means removing the following files and folders:
├── ansible/
├── templates/ansible_community.html
├── templates/automation-tower-*.html
├── templates/core-translated-ja.html
├── templates/core.html
├── templates/platform.html
├── .htaccess
├── ansible-sitemap.xml
├── automation-controller-sitemap.xml
└── robots.txt
- Remove the preceding files and folders.
- Update templates as appropriate to reflect the changes.
- Update the ecosystem.html page to ensure links to Read The Docs projects are correct.
- Ensure there is a link to the ecosystem page on the index.
- Add redirects to the top-level Read The Docs project for the deleted templates/* pages.
As a future enhancement, we should consider building the docs.ansible.com landing pages with Sphinx. This would allow us to make better use of the Read The Docs widget that provides cross-project search.
Updating the prior versions page
We should ensure that the templates/ansible-prior-versions.html page has the correct urls for older versions of the Ansible community docs.
For instance, https://docs.ansible.com/archive/ansible/2.7/ changes to https://ansible.readthedocs.io/projects/ansible/2.7-archive/
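The url change in this example can be expressed as a small mapping function, which may help when updating the page in bulk. This sketch assumes every archived version follows the same pattern as the 2.7 example above. That pattern has not been checked against the actual Read The Docs project versions.

```python
import re

ARCHIVE_URL = re.compile(
    r"^https://docs\.ansible\.com/archive/ansible/"
    r"(?P<version>[0-9]+(?:\.[0-9]+)*)/(?P<rest>.*)$"
)


def to_rtd_archive_url(url):
    """Map an old archive url to its assumed Read The Docs equivalent.

    Urls that do not match the archive pattern are returned unchanged.
    """
    match = ARCHIVE_URL.match(url)
    if match is None:
        return url
    return (
        "https://ansible.readthedocs.io/projects/ansible/"
        f"{match['version']}-archive/{match['rest']}"
    )


print(to_rtd_archive_url("https://docs.ansible.com/archive/ansible/2.7/"))
```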
Test plan for phase three clean up work
After the docs.ansible.com subdomain is transferred to Read The Docs hosting, additional testing is needed to ensure that pages are served correctly, redirects are working as expected, and that everything is in place.
- Use curl to check and validate sitemaps.
- Use the url checker to validate pages available from docs.ansible.com/ansible and docs.ansible.com/ansible-core are in place and still return HTTP 200 status.
- Ensure that the landing pages for docs.ansible.com and legacy-controller-docs.ansible.com have the correct entry points and are clearly separated.
- Ensure that any pages in the ansible/* and ansible-core/* directories on the web server redirect to Read The Docs.
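The redirect check in the last item could be automated along the following lines. This is a sketch only. It assumes the old pages are requested through legacy-controller-docs.ansible.com and that a correct redirect points at the docs.ansible.com host, neither of which is confirmed by the plan above. The `fetch` parameter allows the logic to be tested without network access.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 307, 308)


def head(url):
    """Send a HEAD request without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def redirects_to(url, expected_host, fetch=head):
    """Return True if url answers with a redirect whose target is on expected_host."""
    status, location = fetch(url)
    if status not in REDIRECT_CODES or not location:
        return False
    return urlsplit(location).netloc == expected_host
```

For example, `redirects_to("https://legacy-controller-docs.ansible.com/ansible/latest/index.html", "docs.ansible.com")` would issue a real request and report whether the response redirects to the expected host.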