How We Avoid Downtime

In this blog post, we recap Catan Universe's extended downtime and explain how Colonist avoids these issues.

On October 31st, Catan Universe, an online Catan site, announced that it would be offline from November 1st to November 3rd for a backend migration. On November 3rd, they extended the downtime to November 4th. This is where the real story began.

November 6th - Catan Universe is still unplayable

Catan Universe's Discord account announces that the issues are still not fixed
Catan Universe Official Statement

November 9th - Still unplayable, the team has "identified underlying issues."

Catan Universe's Discord account announces that the issues are still not fixed after almost a week
Catan Universe Official Statement

November 11th - "If you are an avid online player, we still suggest not playing online matches."

Catan Universe's Discord account announces that the issues are still not fixed (Nov 11)
Catan Universe Official Statement

Catan Universe's Discord account announces that the issues are still not fixed. The downtime has now run more than a week past the original estimate. The team takes weekends off, so no fix will arrive until November 14th at the earliest.

November 14th - "Today we should be able to release the first fixes for the game sessions themselves"

Catan Universe's Discord account announces that the issues are being fixed (Nov 14)
Catan Universe Official Statement

November 18th - Matchmaking issues are slowly being resolved, but disconnections and failures to join games will take more time.

November 18th, CU team outlines plan going forward
Catan Universe Official Statement

November 25th - Most recent update. The game remains extremely buggy, and the team claims to be making some progress.

CU Discord announcement that they will roll back previous statements
Catan Universe Official Statement

Catan Universe Downtime in Data

Catan Universe Past Month Game Data

As you can see, games on Catan Universe dropped by over 50% when downtime began, and have yet to approach pre-migration numbers.

This post is not meant simply to shame Catan Universe for their handling of the situation. I am sure that they are working as hard as they can to bring their game back online. Instead, I am taking this opportunity to give a look behind the curtain and share the internal practices that have prevented Colonist from ever having an extended downtime.

Colonist's Best Practices

Every change to our codebase undergoes a multitude of checks:

  1. First, each developer tests a feature on their own local and online test servers.
  2. Then, all code changes run through a variety of automated tests to ensure they meet our standards.
  3. Next, changes must be approved by two members of our team before being uploaded to one of our test servers.
  4. Changes reach the test servers at least one week before we plan to release an update, which gives us ample time to test and fix problems.
  5. Our team of QA testers spends the week leading up to an update checking edge cases and ensuring all changes run smoothly.
  6. Finally, an incredible group of Colonist players does final testing before an update is released.
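
For the curious, here is a rough sketch of how a checklist like the one above could be expressed as a release gate in code. This is only an illustration, not our actual tooling: the `Change` class, its field names, and the `blocking_reasons` function are all hypothetical, and the two thresholds are simply the "two approvals" and "one week" figures from the list.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# Thresholds taken from the checklist; the constant names are illustrative.
REQUIRED_APPROVALS = 2
MIN_TIME_ON_TEST_SERVER = timedelta(days=7)


@dataclass
class Change:
    """Hypothetical record of where a change stands in the checklist."""
    developer_tested: bool                 # step 1
    automated_tests_passed: bool           # step 2
    approvals: int                         # step 3
    on_test_server_since: Optional[date]   # step 4 (None if not deployed yet)
    qa_signed_off: bool                    # step 5
    player_testing_done: bool              # step 6


def blocking_reasons(change: Change, release_day: date) -> List[str]:
    """Return every reason the change may not ship on release_day.

    An empty list means all six checks are satisfied.
    """
    reasons = []
    if not change.developer_tested:
        reasons.append("developer has not tested the feature")
    if not change.automated_tests_passed:
        reasons.append("automated tests have not passed")
    if change.approvals < REQUIRED_APPROVALS:
        reasons.append(
            f"needs {REQUIRED_APPROVALS} approvals, has {change.approvals}"
        )
    if change.on_test_server_since is None:
        reasons.append("not yet on a test server")
    elif release_day - change.on_test_server_since < MIN_TIME_ON_TEST_SERVER:
        reasons.append("less than one week on the test servers")
    if not change.qa_signed_off:
        reasons.append("QA has not signed off")
    if not change.player_testing_done:
        reasons.append("player testing is not finished")
    return reasons
```

Calling `blocking_reasons(change, release_day)` returns an empty list only when every step has been completed. Collecting all the reasons, rather than stopping at the first failure, makes it easier to see everything that still stands between a change and release.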

All of these practices support our core pillar of No Backward Improvements. We plan to create another blog post delving deeper into this concept, but the short version is: ensure that people want to play the latest version of the game, not the previous version.

I hope you enjoyed this look into how the sausage gets made here at Colonist. If you found it interesting and want to get involved, you're in luck! We're looking for exceptional testers, developers, and more! Check out the available positions on our careers page.