Hotfix 8/28/2018 & Commentary

Post by chimpy » Tue Aug 28, 2018 8:23 am

Chimpy here. I wanted to spend a bit of time in these notes stepping through some of our process, to help the community see behind the curtain a bit. Through the first 3+ days of Season three we have produced close to 500MB of log data. In the world of flat text files, that is a lot of content to sift through. Fortunately, we have a number of tools these days to sort, categorize, and parse these logs (think tools like Splunk or the ELK stack). By reviewing these logs proactively, we are looking for trouble spots that need to be fixed. Our review thus far has shown that the effort put into the core stability of the server has paid off. Systemic issues have been greatly reduced, and the majority of error log data was related to localized problems - things impacting only a handful of players. This is great news for overall server health.
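For anyone curious what "sort, categorize, and parse" looks like at its simplest, here is a rough sketch in Python. It is not our actual tooling: the log line format, the sample messages, and the function names are all made up for illustration. The idea is just to group error lines that differ only by IDs and count them, so the noisiest problems float to the top.

```python
import re
from collections import Counter

# Hypothetical log format (an assumption): "<date> <time> <LEVEL> <message>"
LINE_RE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<msg>.*)$")


def normalize(msg):
    """Replace runs of digits so messages differing only by IDs group together."""
    return re.sub(r"\d+", "<n>", msg)


def top_errors(lines, limit=5):
    """Count ERROR lines by normalized message, most frequent first."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[normalize(match.group("msg"))] += 1
    return counts.most_common(limit)


# Made-up sample lines, not real server output
sample = [
    "2018-08-27 10:00:01 ERROR power 1042 not found for player 17",
    "2018-08-27 10:00:02 INFO player 17 logged in",
    "2018-08-27 10:00:05 ERROR power 1042 not found for player 23",
    "2018-08-27 10:00:09 ERROR failure to send to client 5",
]

for message, count in top_errors(sample):
    print(count, message)
```

Tools like Splunk or the ELK stack do this kind of grouping at much larger scale, with search and dashboards on top.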

One of the errors we saw (~1k errors every 24 hours) was related to the 'Kick' power. This may seem like a shitty place to spend any effort at all, and I don't disagree with you, but getting to the point where we know that the Kick power is causing an issue is where the majority of the effort goes. In the log files we were seeing an error that essentially stated "attempted to perform a power and couldn't find the power to perform." This sent us down a path where we added additional logging to the code around this execution so that we could get as much insight as possible. The concern was that we had a number of powers failing for some reason, and obviously that would impact the game quite a bit. What we found was that the code calling this was related to deferred powers - ones that wait a period of time to execute - and digging into the database then revealed that Kick was the only power that didn't have a key indicating what needed to be done when the deferral was up.
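To make the database finding a bit more concrete, here is a minimal sketch of the kind of check that surfaces it. The column names (`deferred`, `on_complete`) and every row other than Kick are hypothetical; the real powers tables are laid out differently. The point is only that a deferred power with no "what to do when the timer is up" key is the odd one out.

```python
# Hypothetical rows standing in for a powers table; the column names and
# the non-Kick entries are assumptions for illustration only.
powers = [
    {"name": "Kick", "deferred": True, "on_complete": None},
    {"name": "Fireball", "deferred": True, "on_complete": "apply_damage"},
    {"name": "Sprint", "deferred": False, "on_complete": None},
]


def deferred_without_action(rows):
    """Return names of deferred powers that have no completion key set."""
    return [
        row["name"]
        for row in rows
        if row["deferred"] and not row["on_complete"]
    ]


print(deferred_without_action(powers))
```

A power that is not deferred is allowed to have no completion key, which is why the check looks at both columns.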

The interesting thing here, and some of you may have seen it in game, was that Kick had another issue where it would execute and then, a second or two later, spam three times that it had executed again. Combining these two issues gave us a clearer picture of what was going on and a path toward resolution. In this case there was no change to the code outside of removing the extra logging, but there was a database script that needed to be executed to fix the entries in the powers tables. That script is being deployed to production today and both of these issues will be closed. Why was the database entry messed up? I have my theories, but it would be inappropriate to speculate without more data.

Why did I relay this story? Again, to show that changes and fixes do take time, and sometimes we go down a path thinking we found a whale that turns out to be a minnow. Between the research, fixes, deployment, and testing, Malant and I are at about 3 hours on this one change. That's a lot of time when this is a side hustle. Something to keep in mind as we go through this journey together.

In this hotfix:
1. Database update to correct an issue with Kick
2. Additional logging to support investigations into "Failure to send" messages (outbound communication from server to client)
3. An update to mob timers to address an issue with mobs that cast spells
4. An update to the Combat Manager to better detail issues with Weapons

