Server Hardening and SSL

Last night, I got [a self-hosted photo sharing site](https://photos.ohheybrian.com) up and running on my Raspberry Pi 3. You can see more about that process [here](https://blog.ohheybrian.com/2018/11/forget-the-mac-mini-bring-on-the-raspberry/).

Putting it on the real, live Internet is scary. Securing a server is no small task, so I took some steps based on [these tips](https://serverfault.com/questions/212269/tips-for-securing-a-lamp-server) to make sure I don't royally get myself into trouble.

(I have a stinking feeling that posting this exposes vulnerabilities even more, but _c'est la vie_.)

To start: new user password. Easy to do using `sudo raspi-config` and going through the menus. It's big, it's unique and no, I'm not giving any hints.

As for updating the OS, I have a cron job which runs as root to update and reboot every couple of days. Lychee is [active on GitHub](https://github.com/lycheeorg/lychee) and I've starred it so I'll get updates with releases, etc. I also took some steps to separate the Apache server from the OS.
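
The cron entry itself is nothing fancy; something along these lines in root's crontab does it (the schedule and flags here are a sketch of the idea, not a copy of my actual file):

```
# Every other day at 4:00 AM: refresh package lists, apply upgrades, then reboot
0 4 */2 * * apt-get update && apt-get -y upgrade && /sbin/reboot
```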

Putting a self-hosted server online requires port forwarding, which means opening specific ports to outside traffic. I only opened the public HTTP/HTTPS ports. Several sites say to open SSH ports as well, but that's where I feel most timid. I don't plan on running anything heavy enough to require in-the-moment updates from somewhere remote. (There's also the fact that my school network blocks SSH traffic entirely, so there's even less reason to enable it.)

Once the ports were open, I had to find my external IP address and update my DNS Zone records on [Reclaim Hosting](https://reclaimhosting.com). By default, Comcast assigns dynamic IP addresses so they can adjust network traffic. Most tutorials encourage users to request static IPs for home servers, but others say they've used a dynamic address for years without issue. I'll see if I can work myself up to calling.

Anyways, I logged into my cPanel and added an A record for a new subdomain: [photos.ohheybrian.com](https://photos.ohheybrian.com) that pointed to my public IP address. The router sees that traffic coming in and points it at the Raspberry Pi. I tested on my phone and, hey presto, it worked.
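
In zone-file terms, that A record boils down to a single line like this (the TTL and address below are placeholders, not my real values):

```
; point the subdomain at the home network's public IP (placeholder address shown)
photos.ohheybrian.com.    14400    IN    A    203.0.113.10
```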

Opening HTTP/HTTPS ports came next. It's easy to get unencrypted traffic in and out, but, like the rest of my sites, I wanted to make sure this one was served over SSL by default. I couldn't assign a Let's Encrypt certificate through Reclaim because the site isn't hosted on their servers. [The Internet came through with another good tutorial](https://www.tecmint.com/install-free-lets-encrypt-ssl-certificate-for-apache-on-debian-and-ubuntu/) and I was off.

First, I had to enable the `ssl` module on the Apache server:

```
sudo a2enmod ssl
sudo a2ensite default-ssl.conf
sudo service apache2 restart
```

Once Apache could accept SSL traffic, it was time to install the Let's Encrypt client, which lives on GitHub:

```
cd /usr/local
sudo git clone https://github.com/letsencrypt/letsencrypt
```

I then had to install the Apache2 plugin:

```
sudo apt-get install python-certbot-apache
```

From there, the entire process is automated. I moved into the install directory and then ran:

```bash
cd /usr/local/letsencrypt
sudo ./letsencrypt-auto --apache -d photos.ohheybrian.com
```

It works by verifying that you own the domain and then sending that verification to the Let's Encrypt servers to generate the certificate. The default certificate lifetime is three months, but you can also cron-job the renewal if nothing about the site is changing.
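
A renewal cron job would look something like this (the schedule is arbitrary, and the path assumes the clone from earlier):

```
# Twice a day, ask Let's Encrypt to renew any certificate that's close to expiring
0 3,15 * * * cd /usr/local/letsencrypt && ./letsencrypt-auto renew --quiet
```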

After I was issued the certificate, I went to https://photos.ohheybrian.com and got a 'could not connect' error, which was curious. After more DuckDuckGoing, I realized that HTTPS uses a different port than plain HTTP, 443 instead of 80 (duh). So, back to the router to update the port forwarding and it was finished.

There are several more steps I want to take, like disaggregating accounts (one for Apache, one for MySQL, one for phpMyAdmin) so that if one _happens_ to be compromised, the whole thing isn't borked.

---

_Featured image is They Call It Camel Rock flickr photo by carfull...in Wyoming shared under a Creative Commons (BY-NC-ND) license_

Reclaiming Jekyll

In December, I moved off of WordPress to Jekyll. This is easy to do with GitHub Pages, but I wanted to self-host because keeping an SSL certificate was important to me. I followed Tim Owen's sample and had everything up and working well.

I faced two specific challenges, though.

  1. FTP and SSH uploads were blocked in several places where I normally work. That meant I needed to remember which files needed to be uploaded via cPanel every time I wanted to publish a post. I would often forget an image or have a broken link, which meant regenerating the entire site.
  2. Because SSH was blocked, I had to use a cron job to publish the post. I would set one up to run every 5 minutes while I was working to make sure the changes were correct. Then, I would delete the cron job.

The bigger issue was that building on the server duplicated the site files. I'd have a folder of all of my posts and assets (images, styles, etc.) that would get copied into the live site. Instead of shrinking my server footprint, it doubled. No good.

My next idea was to use git, which is preinstalled on Reclaim shared hosting (which is awesome), to manage all of my files. But, I ran into the SSH problem again (HTTPS doesn’t work out of the box with git and setting it up is a headache). I also had problems tying the folder to the Reclaim location for some reason. So, that idea was out.

I continued to think about the problem and finally landed on it: I was keeping all of my files on Reclaim when I really only needed the `_site` directory. I can build the site on my computer and then publish only the live components.

This introduced another problem: it's more complicated than just uploading the new post. The `_site` directory is rebuilt and repaginated with each build, so all of the relative links have the potential to change. How would I limit my upload to the `_site` directory without needing to build on the server?

It turns out that you can pull single directories from a GitHub repo. The key is only checking out the directory you want. Instead of using `git pull` to fetch and merge everything, you break it down into several steps:

  1. Set up an empty repository using `git init`.
  2. Assign a remote repo via URL using `git remote add`. Mine is called nodes-site and maps to https://github.com/bennettscience/nodes.git.
  3. Fetch the entire project with `git fetch nodes-site`. This finds and maps the entire project to git but doesn't actually add any files yet.
  4. Check out a single folder with `git checkout nodes-site/master -- _site`. This creates a read-only directory! (The full sequence is sketched below.)
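
Put together on the server, the whole sequence looks roughly like this (the remote name and URL are mine; run it from whatever directory serves the blog):

```bash
# one-time setup: empty repo in the web directory, pointed at the GitHub project
git init
git remote add nodes-site https://github.com/bennettscience/nodes.git

# map the whole project without writing any files yet
git fetch nodes-site

# pull down only the built _site directory from the master branch
git checkout nodes-site/master -- _site
```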

I don’t need to write any files on the server…I do all of that on the computer. This step just grabs what’s been published to the GitHub repo and displays it as a live page on blog.ohheybrian.com.

Here’s my new process:

  1. Write and build the site on the computer. It runs faster, no need for the Internet.
  2. Use git to track all of the changes and push it up to GitHub. All of the files are public already through the blog, so I don’t care that it’s available publicly there, too. In fact, it serves as a nice backup in case I really bork something up.
  3. Write the steps above as a cron job to pull in the `_site` directory a couple of times per day (sketched below). If nothing changes, no new files are copied over. If there’s been a new post, that’s reflected in Git and the entire directory restructures with the changes.
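
That cron job is just the fetch-and-checkout pair from above on a schedule; something like this (the path and times are illustrative, not my actual setup):

```
# twice a day, grab the latest build and refresh the live _site directory
0 6,18 * * * cd /home/bennett/public_html/blog && git fetch nodes-site && git checkout nodes-site/master -- _site
```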

My original folder (with everything) came in around 300MB. The new folder, with only the published material, is about 180MB. So, I saved about 40% of my disk space by publishing only the live pages.


This StackOverflow post got me moving in the right direction.

Featured image: Allen-Bradley Clock Tower flickr photo by kallao shared under a Creative Commons (BY-NC-ND) license