Moving back to WordPress


You might not have noticed (or you might have...who knows) but this site is now back on WordPress. I had shifted to Jekyll back in December for the speed and security of static pages, but I ended up writing less, and I didn't like that. So, it's back to WordPress.

Most pages should be working correctly. There will definitely be problems with embeds (images, etc.), and I'll be working through those over the next several weeks (years?).

Post categories and tags are also messed up and I'll be reindexing those as well. The search bar on the right works, so stick to that if you're looking for something specific.


Featured image Monopoly flickr photo by randomwire shared under a Creative Commons (BY-NC-SA) license

Reclaiming Jekyll


In December, I moved off of WordPress to Jekyll. This is easy to do with GitHub Pages, but I wanted to self-host because keeping an SSL certificate was important to me. I followed Tim Owens’ sample and had everything up and working well.

I faced two specific challenges, though.

  1. FTP and SSH uploads were blocked in several places where I normally work. That meant I needed to remember which files needed to be uploaded via cPanel every time I wanted to publish a post. I would often forget an image or have a broken link, which meant regenerating the entire site.
  2. Because SSH was blocked, I had to use a cron job to publish the post. I would set one up to run every 5 minutes while I was working to make sure the changes were correct. Then, I would delete the cron job.

The bigger issue was that building on the server duplicated the site files. I’d have a folder of all of my posts and assets (images, styles, etc.) that would get copied into the live site. Instead of shrinking my server footprint, I had doubled it. No good.

My next idea was to use git, which is preinstalled on Reclaim shared hosting (which is awesome), to manage all of my files. But, I ran into the SSH problem again (HTTPS doesn’t work out of the box with git and setting it up is a headache). I also had problems tying the folder to the Reclaim location for some reason. So, that idea was out.

I continued to think about the problem and I finally landed on it: I wanted to keep all of my files on Reclaim when I really only needed the _site directory. I can build it on my computer and then publish only the live components.

This introduced another problem: it’s more complicated than just uploading the new post. The _site directory is changed and paginated with each build, so all of the relative links have the potential to change. How would I limit my upload to the _site directory without needing to build on the server?

It turns out that you can pull single directories from a GitHub repo online. The key is only checking out the directory you want. Instead of using git pull to fetch and merge everything, you break it down into several steps.

  1. Set up an empty repository using git init.
  2. Assign a remote repo via URL using git remote add. Mine is called nodes-site and maps to https://github.com/bennettscience/nodes.git.
  3. Fetch the entire project with git fetch nodes-site. This finds and maps the entire project to git but doesn’t actually add any files yet.
  4. Check out a single folder with git checkout nodes-site/master -- _site. This creates a read-only directory!
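
The four steps above can be sketched as a short shell session. This is only a sketch under stated assumptions: so that it runs without network access, it first builds a throwaway local repository as a stand-in for the GitHub remote. The `source-repo` and `live` directory names and the sample files are made up for illustration; the remote name `nodes-site` and the `master` branch follow the steps above.

```shell
#!/bin/sh
set -e

# Throwaway workspace (hypothetical paths); source-repo stands in for GitHub.
work=$(mktemp -d)
src="$work/source-repo"
live="$work/live"

# Build the stand-in remote: source files plus a built _site folder.
mkdir -p "$src/_posts" "$src/_site"
echo "draft text" > "$src/_posts/2018-01-01-hello.md"
echo "<h1>Hello</h1>" > "$src/_site/index.html"
git init -q "$src"
git -C "$src" symbolic-ref HEAD refs/heads/master   # use the branch name from step 4
git -C "$src" add .
git -C "$src" -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Build site"

# 1. Set up an empty repository.
git init -q "$live"
cd "$live"
# 2. Assign the remote (a local path here; a GitHub URL on a real server).
git remote add nodes-site "$src"
# 3. Fetch the project history; no files are written to the folder yet.
git fetch -q nodes-site
# 4. Check out only the _site folder from the fetched branch.
git checkout nodes-site/master -- _site

ls "$live"   # lists _site only; _posts never lands in the live folder
```

On a real server, step 2 would point at the GitHub URL rather than a local path, and the stand-in setup at the top would not be needed.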

I don’t need to write any files on the server. I do all of that on the computer. This step just grabs what’s been published to the GitHub repo and displays it as a live page on blog.ohheybrian.com.

Here’s my new process:

  1. Write and build the site on the computer. It runs faster, no need for the Internet.
  2. Use git to track all of the changes and push it up to GitHub. All of the files are public already through the blog, so I don’t care that it’s available publicly there, too. In fact, it serves as a nice backup in case I really bork something up.
  3. Write the steps above as a cron job to pull in the _site directory a couple of times per day. If nothing changes, no new files are copied over. If there’s been a new post, that’s reflected in Git and the entire directory restructures with the changes.
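
Step 3 could look something like the sketch below: a small script that a cron job runs to refresh `_site`. The function name `pull_site`, the `SITE_DIR` variable, the file paths, and the schedule in the comment are all assumptions for illustration; the remote name and branch follow the checkout steps described earlier.

```shell
#!/bin/sh
# pull_site DIR: refresh the _site folder in DIR from the nodes-site remote.
# Assumes DIR is already a git repository with that remote configured.
pull_site() {
    dir=$1
    cd "$dir" || return 1
    # Download any new commits; nothing in the working folder changes yet.
    git fetch -q nodes-site || return 1
    # Update _site to match the remote's master branch.
    git checkout nodes-site/master -- _site
}

# Run only when SITE_DIR is set, e.g. from cron (hypothetical paths),
# at 06:00 and 18:00 every day:
# 0 6,18 * * * SITE_DIR=/home/user/blog /bin/sh /home/user/bin/pull-site.sh
if [ -n "${SITE_DIR:-}" ]; then
    pull_site "$SITE_DIR"
fi
```

One caveat worth checking on your own setup: by default this form of `git checkout` adds and updates files but does not delete ones that were removed upstream, so stale pages may need an occasional manual cleanup.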

My original folder (with everything) came in around 300MB. The new folder, with only the published material, is about 180MB. So, I saved about 40% of my disk space by publishing only the live pages.


This StackOverflow post got me moving in the right direction.

Featured image: Allen-Bradley Clock Tower flickr photo by kallao shared under a Creative Commons (BY-NC-ND) license

Moving from WordPress to Jekyll

This is a long post in several parts.


Introduction

Long story short, I moved from self-hosted WordPress to a static HTML site generated by Jekyll.

WordPress does its job really well. I think there was some statistic [citation needed] that showed nearly 30% of the Internet runs on WordPress in one form or another. That’s a lot of the Internet.

But, because of that ubiquity, there is a significant drawback: WordPress sites are prime targets for malicious code and hacking. A plugin on my site shows how many dozens (and sometimes hundreds!) of login attempts there have been. It’s a battle to make sure security plugins are always up to date. That leads to another issue: incompatibility between plugins.

So, this entire blog - 2018 all the way back to 2010 - is a set of static HTML pages generated by Jekyll on my Reclaim Hosting server. No more logins, no more plugins to check and update. Just nice, clean, lightweight HTML.

It took me several weeks to work out the details for the migration. It wasn’t too bad, but I learned some things along the way that I’d like to share here.

Exporting WordPress

Jekyll uses Markdown and YAML data to generate a website. It’s quite clever how they pulled it all together, actually, to mimic a typical dynamic (database-driven) blog like WordPress. There is a plugin which will export your WordPress blog formatted for Jekyll, including all post metadata like tags, permalinks, and image resources. It gives you a .zip file which you can then extract and use to populate your new Jekyll site.

First, it extracts your entire media library. WordPress automatically generates several files for each image you’ve uploaded for different display situations. My media folder was well over 300 MB because I didn’t take the time to clean the library up. I’d suggest cleaning up any unused image files before the export.

Second, any pages you have on your site (not blog posts) get their own folder. Take time to go through each folder and make sure it’ll fit the Jekyll file structure.

Finally, do a regular WordPress XML export so you have an entire backup of your blog. The Jekyll plugin only converts posts and pages. If you have other resources, you’ll want to save them somewhere before deleting or redirecting your site.
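
To get a sense of how much of an exported media folder is made up of auto-generated sizes, something like the following sketch may help. It assumes the resized copies follow the usual `name-WIDTHxHEIGHT.ext` pattern (for example `photo-150x150.jpg`) and that the export keeps media under `wp-content/uploads`. The function name `list_resized` and that path are assumptions, so check them against your own export before deleting anything.

```shell
#!/bin/sh
# list_resized DIR: print image files under DIR whose names end in
# -WIDTHxHEIGHT, the suffix WordPress typically adds to generated sizes.
list_resized() {
    find "$1" -type f | grep -E -- '-[0-9]+x[0-9]+\.(jpe?g|png|gif)$'
}

# Example (hypothetical path): count the generated copies in an export.
# list_resized wp-content/uploads | wc -l
```

An original file that happens to be named this way (say, a screenshot saved as `desktop-1920x1080.png`) would match too, so treat the output as a list to review rather than a list to delete.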

Hosting

The great thing about Jekyll is that it is ready to go with GitHub Pages. If you’re already using GitHub, you can go that route with your username.github.io account with a single commit and push. I have a lot of traffic (humblebrag much?) to blog.ohheybrian.com already and I don’t want to set up a redirect. I’m also already using my GitHub Pages site for my web apps (https://dev.ohheybrian.com). You can map a custom domain to GitHub Pages, but you cannot use HTTPS on that domain, which was a dealbreaker for me.

Each web host is different, so you need to see if yours supports Ruby 2.4 or higher. Lucky for me, Tim Owens from Reclaim Hosting already had a blog post on setting it up with my Reclaim account. I followed his instructions to the letter and got it working on the second try (I borked some theme and file structures on the first, so I deleted everything and started over).

SSL is a big deal. If you don’t know what it is, read The Six Step “Happy Path” to SSL by Troy Hunt (or anything else he writes, honestly).

Comments

I don’t get a ton of comments, but with a static HTML site, there isn’t an easy way to include them. If you’re hosting with GitHub Pages, Staticman is an awesome and secure way to include comments on your posts. Another option would be to use a third-party tool like Disqus. I didn’t go with Disqus because they’ve had some trouble with clean use of user data in the past.

I decided to create a custom commenting form (based on this post) using Firebase. It’s a real-time database run by Google which can push information anywhere I want it to go. Each post has a query to the database to check for comments. Pushing the comments to the database is handled with a little bit of JavaScript, which I’ve modified from the linked tutorial:

Firebase also includes cloud functions that can be written in Node.js. I’ve never written any applications in Node, so this was a learning experience for me. This function watches the comment database and simply notifies me if a change has been made using the following script:

It could definitely use some refinement, but it does what I need it to do.

Updating

Relying on an Internet connection to write a blog post seems so 2012. With Jekyll, I can write in any text editor and then upload when it’s ready. If I’m on my main machine, I can even serve the page locally to see what the update will look like as if it were live on the web. It’s a small perk, but as I’ve moved to working more and more with text files (rather than things like Google Docs), it’s nice to be able to open a blank text file and start writing. I can come back whenever I want and finish up or let it sit on the large pile of started-and-never-finished posts.
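
As a rough illustration of that workflow, the sketch below creates a dated, empty post file ready for editing. Jekyll reads posts from `_posts` with names of the form `YYYY-MM-DD-title.md`. The helper name `new_post`, the slug rule, and the `layout: post` front matter are assumptions that depend on your theme, not a description of this site's setup.

```shell
#!/bin/sh
# new_post "Post Title": create _posts/YYYY-MM-DD-post-title.md with minimal
# front matter and print its path. Run from the root of the Jekyll site.
new_post() {
    title=$1
    # Lowercase the title and turn runs of other characters into hyphens.
    slug=$(printf '%s' "$title" | tr '[:upper:]' '[:lower:]' \
        | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//')
    file="_posts/$(date +%Y-%m-%d)-$slug.md"
    mkdir -p _posts
    # Front matter block, then a blank line where the writing starts.
    printf '%s\n' '---' 'layout: post' "title: \"$title\"" '---' '' > "$file"
    echo "$file"
}

# Preview locally afterwards (by default at http://localhost:4000):
# jekyll serve
```

A title containing double quotes would need escaping in the front matter; the sketch does not handle that case.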

Conclusion

In the end, this is a highly technical shift away from something built for the end user into something I have absolute control over. If the blog breaks, it’s my fault and I will have to work to fix it, which is satisfying in its own nerdy way. It’s definitely not the easiest route to start (or continue) blogging, but it’s mine, which is fulfilling.

If you’d like to know more about how to make a switch, feel free to try out that nifty commenting section below or just write me an email: brian [at] ohheybrian [dot] com.

Featured image is Find your way flickr photo by maximilianschiffer shared under a Creative Commons (BY-ND) license