How to move a self hosted #WordPress blog to WordPress.com

I migrated my coldstreams.com blog to coldstreams.wordpress.com and the old web site automatically redirects to the new URL.

In theory, this process is simple – but getting it done right was complicated.

The hard part was getting the redirection to work for all cases. If you just want to redirect coldstreams.com to coldstreams.wordpress.com – that is easy!

But that redirects all references to coldstreams.com to the new URL.

  • What do you do if you still want to access other directories on your old web site like coldstreams.com/thesis or coldstreams.com/public?
  • If you have links on other pages that point to the old self hosted blog like coldstreams.com/?p=11074 you want those to still point to coldstreams.com and not the new URL. How do you do that?

The solution was to edit the .htaccess file on the Apache web server and implement a set of RewriteRules that translate some URLs but not others. That sounds simple, but I am not a RewriteRules guru (I had never even looked at them before!). There is a lot of bad documentation online, plus forum Q&A that is wrong or insufficient. Getting the RewriteRules to work for me took hours of plowing through that documentation, plus trial and error.

Here I explain what I did. I hope it is helpful for your WordPress migrations!

Step 1 – Migrate your content

After you set up your account at WordPress.com, export the content from your self-hosted WordPress blog. Log in to its Dashboard (wp-admin) and go to Tools | Export. Under Choose what to export, select All content, then click Download Export File to save the export file to your computer.

Step 2 – Import content to new WordPress.com blog

Log in to your WordPress.com account and, if needed, click My Sites to see the options listed down the left side. Select Settings, then select the Import tab at the top. Click Start Import on the WordPress item and specify the export file you downloaded from your own web site.

[Image: import process]

WordPress.com suggests it may take 15 minutes to import your site. In my case, it took about 3 1/2 hours! During the import process, WordPress.com fetches all of your referenced media files (like images) and copies those to WordPress.com.

Step 3 – Adjust any direct http references

On my web site I use widgets, and some Text widgets referenced images at the old URL. Most of your media will transfer over automatically, but I had uploaded some files to my old server using ftp, so WordPress knew nothing about them and could not change their URLs automatically. I had to copy those images to the media library of the new WordPress.com blog and update the URLs to point to the new location.

Step 4 – Setting up Redirection using .htaccess

My old blog coldstreams.com was in the root directory (something I do not recommend).

Redirecting coldstreams.com to coldstreams.wordpress.com is straightforward using a .htaccess file. The Apache server (and some others) supports .htaccess files. The server reads this file, which contains a set of rules and instructions for mapping (or redirecting) URLs from one location to another.

A common use of .htaccess is to perform site redirection. That is, the user types coldstreams.com but, because we’ve moved the web site to a new URL, we want to translate coldstreams.com to coldstreams.wordpress.com.

The common way to do this is to edit your existing .htaccess file, or create a new one, and put in a single rule (the file also needs a RewriteEngine on line before the rule, or the rule is ignored):

RewriteRule ^$ https://coldstreams.wordpress.com/ [R=301,L]

The rule uses a cryptic notation (known as regular expressions) to pattern match something and if the pattern is found, convert what it found to the text on the right.

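If we typed coldstreams.com, the funny notation ^$ matches and the server sends us to the text on the right, https://coldstreams.wordpress.com/. Strictly speaking, the pattern is compared with the path that follows the host name, not with the host name itself: ^ marks the start and $ the end, so ^$ matches an empty path, which is a request for the root of the site.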

The options at the far right tell the server to respond with a 301 Redirection code to the caller. That tells search engines that the old URL has permanently redirected to the new location. The L means this is the last rule to process in the .htaccess file – ignore any more rules after this one.

That is a brief explanation of the concept.
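
As a minimal sketch, a complete .htaccess for this simple case might look like the following. It assumes mod_rewrite is enabled on the server and that the host allows .htaccess overrides. The commented-out catch-all variant is only an illustration of the alternative, not something taken from my own file.

```apache
# Switch the rewrite engine on; RewriteRule lines are ignored without this.
RewriteEngine on

# Root only: in .htaccess the pattern is tested against the path without its
# leading slash, so ^$ matches a request for coldstreams.com/ and nothing else.
# The query string, if any, is carried over to the new URL automatically.
RewriteRule ^$ https://coldstreams.wordpress.com/ [R=301,L]

# Catch-all variant (illustrative): redirects every path and keeps the path.
# RewriteRule ^(.*)$ https://coldstreams.wordpress.com/$1 [R=301,L]
```

The root-only form leaves a URL like coldstreams.com/thesis alone, while the catch-all form would send it to coldstreams.wordpress.com/thesis.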

What could go wrong? LOTS and LOTS!

First, my web server has other stuff located on it. For example, I have files in coldstreams.com/thesis, coldstreams.com/public and so on.

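A rule that redirects everything (for example, a catch-all pattern such as ^(.*)$ with $1 at the end of the target) maps coldstreams.com/thesis to coldstreams.wordpress.com/thesis, which is wrong!!! That content has not moved and still needs to be accessible. The ^$ rule above avoids this because it only matches the root of the site.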

Another problem – there are online links that reference the old blog pages such as

http://coldstreams.com/?p=11074

If we use the above RewriteRule, this gets changed to

https://coldstreams.wordpress.com/?p=11074

which results in a page not found since that page ID does not exist on the new web site.

I wanted to keep the old links functional so that someone can still access the old content by typing coldstreams.com/?p=11074 without redirecting to the new web site. That way, links saved on other web pages will still work to get to the old content.

Step 5 – The .htaccess file for redirecting self hosted WordPress to WordPress.com

I put together an .htaccess file that works for my situation. It might be useful for others who:

  1. Want to migrate a self-hosted WordPress blog to WordPress.com
  2. Had their old blog in the root directory (or not)
  3. Still wish to retain the old web site and access to other folders on that server
  4. Want old WordPress /?p=nnn references to continue to link to the old content

My goals were:

  • My other web sites, hosted as subdomains, still work at the old URLs: social.coldstreams.com, 3d.coldstreams.com and ajb.coldstreams.com continue to work as they always have.
  • I and others can still access subfolders like coldstreams.com/public
  • If someone clicks on an old link of the form coldstreams.com/?p=11074, it will redirect to something useful rather than a missing page.

Step 5a – Copy the original WordPress blog out of the root directory into a new folder

I created a new folder named coldstreams2. I copied all of the WordPress files from the original root directory blog into coldstreams2.

Step 5b – Create the .htaccess file for your self hosted server

Here is my .htaccess file

Options +FollowSymlinks
RewriteEngine on

RewriteCond %{QUERY_STRING} p=([0-9]*)
RewriteCond %{HTTP_HOST}  !^(.*)social.coldstreams.com
RewriteCond %{HTTP_HOST}  !^(.*)3d.coldstreams.com
RewriteCond %{HTTP_HOST}  !^(.*)ajb.coldstreams.com
RewriteCond %{REQUEST_URI}  !=/coldstreams2/
RewriteRule ^(.*)$ http://coldstreams.com/coldstreams2/ [R=301,L]
# If contains ?p=nnn, but does not contain social or 3d or ajb
# and does not contain coldstreams2, then redirect to new folder
# The test for coldstreams2 prevents this from entering a redirect loop after the first change

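RewriteCond %{QUERY_STRING} (^|&)feed=rss2(&|$)
RewriteRule ^$ https://coldstreams.wordpress.com/feed/? [R=301,L]
# if the query string asks for the old newsfeed (?feed=rss2), then redirect to the new newsfeed
# (a RewriteRule pattern only sees the path, so the query string is tested with RewriteCond;
# the trailing ? drops the old query string from the redirect)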

RewriteCond %{QUERY_STRING} !p=([0-9]*)
RewriteRule ^$ https://coldstreams.wordpress.com/ [R=301,L]
# if the URL does not contain the ?p= query, then redirect to the new web site

# All other URLs of the form coldstreams.com/folder will pass through unchanged
# so that we can then reach /public, /thesis, etc

The first, convoluted rule says that if the URL has the form coldstreams.com/?p=11074, then we translate that URL to coldstreams.com/coldstreams2/?p=11074.

This is convoluted because it tests first to see

  • if there is a p= in the URL
  • AND the URL is not social.coldstreams.com, 3d.coldstreams.com or ajb.coldstreams.com (all of which may contain p= but should not be translated)
  • AND the URL does not already contain /coldstreams2/
  • THEN translate the original URL to coldstreams.com/coldstreams2/?p=nnn

Since this rule has an [L] after it, if all the conditions are met, then the translation is done and this is the last rule to apply.

Question – Why do we need to test for /coldstreams2/ in the URL since this is what we are translating to? Because the .htaccess process runs more than once. The first time, we translate to coldstreams.com/coldstreams2/?p=nnn

That’s the redirection. The new URL is then fetched, which goes right back to our server where the .htaccess rules get applied once again.

If we did not check for the /coldstreams2/, on the second time through the rules after the redirection, we’d match on the ?p=nnn case and do the same redirection again. In effect, we would have a redirect loop – which is very bad!

By checking for /coldstreams2/ we are, in effect, checking that we have already done this conversion and should not do it again!
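
Here is a sketch of the same first rule with the tests tightened a little. It is an untested variation, not what is running on my server. The combined host pattern, the anchored p= test and the prefix match on /coldstreams2/ are all assumptions about how one might tidy it, and it assumes the subdomain blogs are requested without a www. prefix.

```apache
RewriteEngine on

# A real p= parameter with at least one digit (not just any "p=" substring)
RewriteCond %{QUERY_STRING} (^|&)p=[0-9]+(&|$)
# Not one of the subdomain blogs; dots are escaped so they match literally
RewriteCond %{HTTP_HOST} !^(social|3d|ajb)\.coldstreams\.com [NC]
# Loop guard: skip any request whose path already starts with /coldstreams2/
RewriteCond %{REQUEST_URI} !^/coldstreams2/
# No query string in the target, so ?p=nnn is appended unchanged
RewriteRule ^(.*)$ http://coldstreams.com/coldstreams2/ [R=301,L]
```

The conditions are ANDed together by default, so the rule only fires when all three are true. On the second pass the path starts with /coldstreams2/, the loop guard fails, and the request falls through to the copied blog.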

If the URL did not match the first rule, then try the next one. The next one checks to see if the URL references the original RSS newsfeed (coldstreams.com/?feed=rss2) and if it does, it translates that RSS request to the new server’s RSS feed.

If we did not match that rule, we go on to the next one.

The next and final rule says that if the request is for the root of the site and the URL does not contain ?p=nnn, then redirect to the new web site.

With no more rules, anything else falls through unchanged – which means URLs like coldstreams.com/public operate as normal. And references to things like 3d.coldstreams.com/?p=3454 just pass all the way through to those WordPress blogs (which are still self hosted).

Afterword

As far as using .htaccess files goes, I am close to an idiot. Okay, I do have a couple of degrees in computer science and software engineering and I know what a regular expression is. However, I had never used .htaccess, RewriteCond and RewriteRules before. In reading about them to set this up, I learned that RewriteRules are incredibly powerful but are a minefield of details that can blow up in your face. The nature of writing these rules makes the process highly prone to errors, let alone that I did not really know what I was doing.

Without question there are better ways that some RewriteRules guru can come up with – but this works for me.

This online tool – http://htaccess.mwl.be/ – provides an interactive way to test your rules. Sort of. The tool is a great idea, but you can encounter some oddities, like needing to double escape some characters (first for JavaScript and then for .htaccess, so I had to put \\? in one example rule I experimented with). I also had situations where the rules worked in the simulator but not on the server, and vice versa. Still, the simulator was helpful in learning how to work with RewriteRules.
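
When the simulator and the server disagree, the server's own rewrite trace shows what actually happened. As a sketch, assuming Apache 2.4 and access to the main server or virtual host configuration (LogLevel is not allowed in .htaccess, so this is usually not available on shared hosting), the following turns on mod_rewrite tracing:

```apache
# Goes in the server or virtual host configuration, not in .htaccess.
# Keeps normal logging at "alert" and adds mod_rewrite trace output
# (trace1 is the least detail, trace8 the most) to the error log.
LogLevel alert rewrite:trace3
```

Turn it back down once the rules work, since tracing slows the server and fills the log quickly.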
