GuildPortal Dev Blog

Updates from Aaron Lewis, GuildPortal Code Monkey

Posted 1/4/2013 4:14 PM by Aaron Lewis. 1153984 reads.

From 1/1/2013 until 1/3/2013 at 10:14 AM (Mountain), GuildPortal services were offline. Here's what went down, in sequence. The cause of the problem had its start on 12/16/2012, so I'll begin there:

12/16/2012 to 12/31/2012: The server backups, which are stored on a network share, failed to execute for two weeks straight. In the case of backup failures, our server provider's management tools are supposed to alert them so that whatever the problem is can be fixed. That didn't happen. So for two weeks, GuildPortal was flying without a backup, and nobody knew.

1/1/2013: The drive on the database server that stores the full-text index fails, and the site goes down. When there is a drive/controller failure, our server provider's system (again) is supposed to alert them. The first tech I spoke with later acknowledged this, but said that they weren't using their old notification system anymore, since the company that acquired them was going to have them use their newer system. What happens in between? No notifications or alerts of any kind.
 
1/1/2013, 1:30 PM: Sandy finds out the site's been down all night when she does her usual check to make sure things are running. I get on the phone with our server provider while she lets everyone know that we're on it on our Facebook page. I am connected to someone in technical support who is genuinely helpful, but who has very little SQL knowledge. 

I put up an "offline" page on the GuildPortal site, informing everyone that we'll be back, and providing a link to our Facebook page where we begin posting updates as we have them, and answering questions.
 
Over the next 7 hours, the tech repeatedly attempts to get ahold of a DBA to help with the problem. We both breathe a little easier when we get a response back in the form of an IM from the regular DBA from home. A few IMs later, we have something to try, and the tech and I give it a go, brimming with optimism. I post on Facebook that everything should be up and running within 30+ minutes.

1/1/2013, 9:30 PM: It's taking far too long. I suspect something is wrong, so I cancel the operation. It takes 4 hours to cancel, even though it had only been running for 1. The culprit, it turns out, was the database file itself. When the N drive failed, it left the primary database file and the associated log file in a totally unstable (and, it now appears, unrecoverable) state. The DBA we were IMing is now incommunicado.

No problem, we thought, because I'd been paying for the extended SQL support package faithfully for all these years. That means he can hop on the phone and get the revered On-Call SQL Team to fly in on an epic mount, day or night. While we wait for them to respond to his initial pages and phone calls and IMs, we try a couple different restore strategies. At the time we didn't know the state of the main database file, so again, we were optimistic.

I again, stupidly, post an update on Facebook saying that our latest attempt resulted in a success and that the site would be up any moment. I end up staring at a spinning ball with the words "query executing" next to it for the next several hours.

1/2/2013, 2:03 AM: The restore failed. All phone calls, pages, and IMs to the On-Call SQL Team have been totally ignored. The tech is frustrated, apologetic, and (if I read it right) a little embarrassed for his company. I post apologies and stop giving ETAs on Facebook, but continue communicating with everyone, not wanting you guys to think I'd snuggled up to a pillow and said "heck with them."

It's now 13 hours after the initial phone call, and the odds of us getting a DBA involved at all before the regular morning shift are looking slimmer with every passing minute. Dread sinks in (forgive the melodrama).

The tech and I part on the phone, since he can only really sit there and listen to me breathe while I try different ways to restore the database. Attaching, single-file attaching, standard restore, restore with full-text indexing belayed, different recovery models, etc. After we hang up, he continues to attempt to get ahold of the On-Call SQL Team until his shift ends, to no avail. My efforts, of course, were equally doomed.

1/2/2013, 5:08 AM: I've tried everything I could think of. At this point, I post updates on Facebook and wait for a herd of DBAs to show up all bright-eyed for work at our server management company's HQ (they're two hours ahead of me).

1/2/2013, 9:09 AM: I'm informed that a DBA is working to get the server fixed. I remain cheerful and optimistic, since now we at least had a DBA on the case, and (I thought) I'd be able to give you guys a reasonably accurate ETA before too long. I post to Facebook, with a smiley face even.

I thank the DBA very much for taking it on and ask very politely and delicately (I'm careful around DBAs) that, any time he has even the smallest update I could give you guys, he just shoot a real quick e-mail my way. KK? Tks u!

I head back to Facebook, posting and chatting it up to let you all know I'm staying with it until it's fixed, and also because it helps me stay awake. We all wait for the DBA to work his magic, and send us little updates that I can relay to you.

1/2/2013, 1:50 PM: The DBA comes back after some attempts with bad news. The database file is corrupt. But he has a plan that he has used before, and he is confident it will work. I post his e-mail, verbatim, on Facebook. Hope glimmers once more. Like anime eyes.

To pass the time while this plan is set in motion, I post some polls to Facebook. We all again wait, collectively, for the Awesomeness to happen. One of my guild leaders posts that he's a DBA too, and gives me a tip to pass along to Johnny that might save a lot of time.

So I call up and ask to speak with our server management company's DBA, thinking he'd appreciate the tip. Well, as soon as the guy who answered the phone IMed him that I had some info that might "help" him, he wouldn't take my call. Busy guy, I guess! Must have already thought of it anyway, right? No hard feelings...

1/2/2013, 4:02 PM: After waiting for around two hours with no update, and my guild leaders becoming understandably more anxious, I lose my patience, Samuel L. Jackson style. In a totally unreasonable rage, I write the following, unthinkably terrible thing to the DBA. The following is the actual body of the e-mail I sent. Parents, you might want to send your children out of the room. Here it is:

"Hiya Johnny! How's it going? Look like it's going to work?"

1/2/2013, 5:32 PM: The DBA replies with open hostility, accusing my "team" (I have a team?!) of detaching the database in a way that corrupted it, when in fact the loss of the N drive finished it off long before, and the failure of the entire alerting process, all the way back from the backups that had failed weeks before to this point, was the real villain of the story. None of which I had control over.

He closes by inviting me and my "team" to do it ourselves if we think we can do better.

At that point, I swear, the world turned upside down. It wasn't a proud moment for me, but I clicked on the reply button and typed up an e-mail that conveyed some of the anger that had been building up, but more of the hurt and disbelief at what was going on.

Anyway, as far as I can tell, immediately after reading this e-mail, the DBA stopped any running restore, disconnected from his session, and walked away. I wouldn't hear anything from our hardware service provider until the following morning. The worst part is that he full well knew he wasn't just punishing me, but all of you, as well.

After I posted what had happened to Facebook, many, many guild leaders (you guys, yay!) basically raided the hardware service provider's Facebook page. Immediately upon seeing it (once they got in), they responded, saying they'd make fixing it a "top priority." While I wasn't contacted until a couple hours after that, I am sure that you guys proving you were real, and not to be trifled with, had everything to do with the fact that we would get some real results, and soon.

To my horror, the best that could be done was a restore from the last backup that had succeeded, on 12/16/2012. So any new data from then until the site went back online would be lost. Though catastrophic and totally unacceptable in my (and I'm sure, your) eyes, I had to give the go-ahead. There was just no other option, at least, none that they could or would provide.

1/3/2013, 10:14 AM: The site comes back online, with data restored from 12/16/2012.

This will not happen again.

For my part, even if we're paying our provider extra for support packages that include monitoring, alerts, and reliable backups, I will not take it for granted that anything, whether it's something I have control over or not, is working as it should be. I will check on backups and perform many of the other IT-type tasks that we have been relying on someone else to handle.
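To make "check on backups" concrete, here is a minimal sketch of the kind of independent check described above. It is an illustration only, not what GuildPortal runs: the function names, the `*.bak` pattern, and the 26-hour threshold are all assumptions, and it only looks at file timestamps on whatever share the backups land on.

```python
import time
from pathlib import Path


def newest_backup_age_hours(backup_dir, pattern="*.bak", now=None):
    """Return the age in hours of the newest matching file, or None if none exist.

    backup_dir and pattern are hypothetical; point them at the real backup share.
    """
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).glob(pattern) if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def backup_is_stale(backup_dir, max_age_hours=26, pattern="*.bak", now=None):
    """True when there is no backup at all, or the newest one is too old."""
    age = newest_backup_age_hours(backup_dir, pattern, now)
    return age is None or age > max_age_hours
```

Run from a scheduled task that e-mails somebody when `backup_is_stale(...)` returns True, a check like this would have flagged the problem within about a day of 12/16 instead of two weeks later. It only proves a file was written recently, not that the file restores cleanly, so a periodic test restore is still needed.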

I'll invest (as soon as I can -- this event has cost us dearly, financially, and GuildPortal is already "in the red" because of the economy and lack of new games that really draw new players into the MMO world) in more hardware that will add more layers of fault tolerance to all tiers of the GuildPortal service.

Sandy has been diligently refunding all subscriptions for all new sites created after 12/16/2012 (since they're not there any more). She is nearly done as I write this update. If you fall into that category of site, and you haven't seen a refund come through PayPal yet, please give her another day to finish up. You do not need to send in a support ticket. We're not waiting for you to contact us to refund you; we are just doing it.

I'm extremely sorry this happened!

This has been, without any question, the worst downtime event in GuildPortal history. It's the worst data loss event for sure (there was only one other -- the result of me writing a bad trigger that deleted a bunch of old shouts it shouldn't have).

It would be tempting to blame our hardware service provider for it all, but deflecting is their game, not mine. The short truth of it is that I should have been double-checking that they were doing their part, whether I was paying more for alerting and reliability services or not. That should have been something I considered part of my job before this all happened, and it most certainly will be, moving forward.

To all of you who lost data, I cannot begin to convey to you how much I feel your rage, anger, disbelief and loss over this. You're not simply punching in letters and numbers to practice your typing, you're building community. To have over two weeks of that just taken away as if it never happened is unacceptable. I was awake from 1/1/2013 at 8:30 AM until the site went back online at 10:14 AM on 1/3/2013, because I wanted to keep providing updates, or at least keep the lines of communication open while we were down... To let you know that your guild is very important to us, whether it's a paid site or not. I didn't choose the guild web hosting vertical niche just because there was nothing else filling it back when we started; I love the idea of providing and enhancing tools for people to build and personalize their communities online, and I just so happen(ed) to be a gamer.

I feel terrible for disappointing you. While the financial hit to us is tremendous, GuildPortal will survive and be here for your guild for years to come. I'm currently looking for another job, as, like I said before, we're in the red, and have been for some time. Once I have one, I'll still be able to provide support and some feature enhancements; I'll just be limited to a couple hours a day or so.

Thank you!

If you are going to leave over this, or have already left, thank you for making your home with us for however long you did. Thank you to all you guild leaders and members who were understanding and supportive on our Facebook page. Thank you to those who are staying, and a promise: it won't happen again.

As always, thank you for choosing GuildPortal!

Posted 10/27/2012 4:38 PM by Aaron Lewis. 1170289 reads.

10/27 Update - Activity Wall

The Activity Wall, a new widget, goes live today! It's like the walls you'll find commonly on social networking sites. It's what the old Status Updates widget was kind of trying to do, only it does it a lot better. It could also be looked at as a replacement for shout boxes altogether, since it supports media.

In order for people to post to it, they'll need to be granted higher than public/applicant access to your site. Here are some of the features:

Posting

The familiar WYSIWYG editor is used, only in a slimmer version. Tools available for now are: toggle full-screen edit mode, spell check, some formatting, insert link, insert image, and insert video (from either YouTube or Vimeo). You may find the area too small to work with, especially if you're inserting big images or videos, so make use of that full-screen toggle on the far left!

Adding Video

Adding video from YouTube or Vimeo is easy. Just go to the web page on YouTube or Vimeo where the video is shown, copy the address, click the blue Play icon to the far right in the wall editor, and paste the URL. Hit your tab key and you'll be presented with a preview of what it'll look like in your post, along with some options.

Most of them can be left alone. The one you want to pay attention to is "Play the video automatically on load." You'll probably want to un-check that box, else risk the wrath of guildies opening the page with the wall on it, getting hit with all kinds of videos starting to play at the same time!

The Next Thing

Once you've added your text/video/images, click the post button and boom -- there you have it. Emoticons are automatically parsed based on defaults and/or any custom emoticons the guild uses. Clicking on the name of the poster displays the standard drop-down menu for doing things like viewing their profile, visiting their blog, chatting with them if they are online, and all that good stuff.

Images

Know how sometimes, you can put an image in a post or a news item and if it's too big, it'll stretch out the page, wrecking the design? Well, I think I've got that figured out now (and with all the layout possibilities there are due to customization, and the fact that IE ignores max-width unless everything's set a particular way at the parent level, it took a while -- that, and I'm dumb as a rock)! Anyway, when you post an image to the wall, it'll now do its best to fit inside the available space, without stretching things out. Note: if your browser is way, way old, it'll probably be icky like before. So with a large image (the one shown is actually around 1200 pixels wide, in a widget that's about 700 pixels wide), here's what a well-behaving browser will show (minus the purple arrows I thought were neat while putting the screens together in Fireworks):

But hey! What if the image is gigantic because there's that much going on? Easy to do with something like an in-game screenshot. And you might want to see it full size. No problem! You can click any images on the wall and they'll open up all sexy like in a gallery-type scroll view dyno-resizing nifty thingy. Stuff. Whatever you wanna call it. Hey, I'm not a writer, k? Anyway, it's got arrows (way better-looking than my purple ones up there) that you can use to move between other images on the wall.
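For the curious, the fit-to-width behavior described above comes down to simple proportional scaling. On the site the browser does this through CSS max-width; the sketch below just spells out the arithmetic. The function name is made up for illustration, and the 900-pixel height is an assumed value (only the roughly 1200 and 700 pixel widths come from the example above).

```python
def fit_within(width, height, max_width):
    """Scale (width, height) down to fit max_width, keeping the aspect ratio.

    Images that already fit are returned unchanged (never scaled up).
    """
    if width <= 0 or height <= 0 or max_width <= 0:
        raise ValueError("dimensions must be positive")
    if width <= max_width:
        return width, height
    # Multiply before dividing so the result stays as exact as possible.
    return max_width, max(1, round(height * max_width / width))


# A 1200-pixel-wide image (height assumed to be 900) in a 700-pixel widget:
print(fit_within(1200, 900, 700))  # (700, 525)
```

The full-size version is still there for the gallery view; only the displayed size changes.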

The Morning After (after you post, I mean)

There are a few things at the bottom of each post. People can click on Comments to show comments or add their own. The date and time of the post has been sacrificed in a pagan ritual, making way for the more friendly "how long ago" display. There are tools to delete the post if you're a Super Admin or the original author, and if you're the author you can edit the post, too.
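The "how long ago" display is a small, well-known technique: take the gap between the post time and now, then report it in the largest unit that fits. Here is a rough sketch of the idea. The function name, the wording of the labels, and the choice of units are assumptions for illustration, not the wall's actual code.

```python
from datetime import datetime, timedelta


def time_ago(posted, now):
    """Render the gap between two datetimes as a short 'how long ago' string."""
    seconds = int((now - posted).total_seconds())
    if seconds < 60:
        return "just now"
    # Try the largest unit first; anything of 60 seconds or more matches "minute".
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        if seconds >= size:
            count = seconds // size
            return f"{count} {unit}{'' if count == 1 else 's'} ago"


now = datetime(2012, 10, 27, 16, 38)
print(time_ago(now - timedelta(hours=3), now))  # 3 hours ago
```

Rounding down (3 hours 59 minutes still reads "3 hours ago") is a design choice; it keeps a post from ever looking older than it is.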

The comments are pretty simple. Not much to explain there. I thought about spinning it so they sounded all complicated and neat and shiny, but... Yeah. No. Oh, and both of those posts are from me. I talk to myself while testing, and for a reason I cannot fathom, I always revert to a despicable sub-set of the English language.

Future Enhancements

Instead of cramming the thing with everything I could think of, taking ideas from some of the major social sites out there, I decided to hold back and push it as it is now. I'm counting on feedback to direct the decision making when it comes to further enhancements for the wall. After all, there are things I found that I like which a lot of you might really not, and there were some features the big boys are sporting now that I really find... icky (who says "icky?" I do!).

So let me know what you would like to see done moving forward. Maybe WYSIWYG editing of comments instead of the simple type-and-hit-enter behavior it has now? Or automatically-entered posts by the site when people apply, add a shout, post something in the forums, add a news item, create a new raid, add an image to the gallery (all linked automatically to the source item)? Anyway, let me know!

Chat Updates

When someone requests a one-on-one chat session, you will now hear a... beeping thingy. Useful if you have more than one browser window open, are looking at a different screen, or have your head spun around facing behind you, like that girl from the Exorcist. I do that sometimes. Don't judge me!


Also, on-demand playable sound effects have been fixed in guild chat. Type /sounds for a clickable list of the currently available ones. In addition, lots of little bugs and stuff were fixed.

Other Stuff

  • Many widgets have had their displays cleaned up a little. There will be more of this going forward, as we move to a more universally clean (and still customizable) theme format. This is primarily being done to enable us (and you) to create much higher quality themes than is now possible. We will be making use of HTML5, CSS3, and responsive design principles.
  • Quirky behaviors in some of the style editors have been un-quirkified.
  • Your hamster has been watching you with malice as you sleep.
  • The ability to add an image to a post via link instead of the image manager has been restored.
  • Many other bug fixes.

Posted 10/10/2012 9:59 AM by Aaron Lewis. 520136 reads.

Many enhancements have been incrementally rolled out since the last release announcement. I'll summarize what's been going on, minus a lot of the minor bug fixes:
  • New Feature: Promotion Letters. Whenever a member is promoted to a higher level, you can replace the default system-sent mail with your own custom one, and you can have a unique letter for every level. For example, you can have a different letter sent when someone is promoted to member versus when someone is promoted to officer or council. Get started with promotion letters in Control Panel > General > Promotion Letters or Guild Bar > Admin > Member Management > Promotion Letters.
  • Enhancement: New World of Warcraft roster with in-game WoW Guild Achievements. Configuration-free, fast sorting and filtering, more frequently updated, real thumbnails of your characters. Stay tuned, more new stuff for WoW is on the way.
  • Enhancement: WYSIWYG editor file selection/uploading. The editor tool for selecting an image has been revamped to function much like the file manager in the Control Panel. However, you now have additional buttons: one to add media (sound, video, etc.) and another to add Flash content. All three tools allow for direct upload while editing your content.
  • New Feature: User uploads from WYSIWYG Editor. Previously, there was no way for members to upload images or any other media for use in their forum posts. Now, they have access to the three tools mentioned above, but all of their uploads are stored in a special sub-directory off the guild root with the format /MemberUploads/memberid. That is their root directory, and they can create sub-directories, drag-and-drop copy files, and directly edit images (add text, skew, rotate, crop, etc.). They cannot, however, see the guild root folder or navigate to the root of other guild member folders.
  • Enhancement: Page Footer. The old page footer had a pretty low limit on the number of characters it allowed. This limitation has been eased up, and you may now also specify a background gradient fade and the top edge color and size for the footer area. If you want a solid color instead of a gradient, just select the same color for both the start and end colors. Control Panel > Style Tools > Page Footer or Guild Bar > Admin > Site Customization > Footer.
  • CSS: For those who use custom CSS, the class for the new footer area is gp5-footer.
  • Enhancement: 8 new GuildWars 2 themes have been added.
  • Maint: Some of you may have noticed the new error reporting form you are taken to when you encounter a run-time error. Many of you who have, have filled out the "what were you doing when the error happened" field, and I just wanted to drop a quick "thank ya" for doing so. The details you provide, along with the actual error details, are both put together to automatically create a new support ticket, which is assigned directly to -- waaaaaaaait for it -- development. Anyway, it makes getting to the cause of a problem much easier and has resulted in many hotfixes over the past month! Oh, and to the individual who typed in "I was sleeping"... lol
Finally, a bunch of bugs were fixed.
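For anyone wondering how the member upload folders in the list above can be kept separate from each other, the usual technique is to resolve every requested path against that member's own root and refuse anything that climbs out of it. The sketch below illustrates that idea only. The /MemberUploads/memberid layout comes from the note above; the function name, the numeric member id, and the forward-slash paths are assumptions, and this is not GuildPortal's actual code.

```python
from pathlib import PurePosixPath


def resolve_member_path(member_id, relative_path):
    """Map a member-supplied path into /MemberUploads/<member_id>, rejecting escapes."""
    # int() rejects ids such as "../42" before they can reach the path.
    root = PurePosixPath("/MemberUploads") / str(int(member_id))
    parts = []
    for part in PurePosixPath(relative_path).parts:
        if part.startswith("/") or part == ".":
            continue  # treat a leading slash as "relative to my own root"
        if part == "..":
            if not parts:
                raise PermissionError("path escapes the member's upload root")
            parts.pop()
        else:
            parts.append(part)
    return root.joinpath(*parts)


print(resolve_member_path(42, "screens/raid.png"))  # /MemberUploads/42/screens/raid.png
```

A request such as `resolve_member_path(42, "../43/secret.png")` raises PermissionError, which is the "cannot navigate to other member folders" rule in code form.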

Posted 9/4/2012 12:02 PM by Aaron Lewis. 232712 reads.

Hey all! Please, feel free to rate the dev blog service updates (using the star rating thing), and leave comments! Let me know what you liked about any update, what you didn't like, what you'd like to see more of, and all that good stuff. So far, most of our feedback comes from admins who post in the Help Community, and while they provide many of the feature suggestions that are pushed out every week, they represent 0.78% of all active guild leaders -- and 0.13% of all recently active members -- on GuildPortal.

Now, I'd love to get more guild leaders (and even their members -- we know they have ideas, too) into the Help Community, but if you don't want to join that, feel free to leave any comments on any release here on the blog at the bottom of any post, and I will read them.

I know a lot of people might not expect much along the lines of responsiveness from SaaS (software as a service) providers when it comes to listening to their ideas. Even fewer people expect to actually see their feature requests implemented any time in the foreseeable future. It's totally understandable, and I empathize.

For example, I've had ideas for things I'd love to see added to Facebook, Hotmail, GitHub, and many other SaaS providers. But I didn't send them in because either 1) they didn't even bother putting up a form or forum where I could submit my request, or 2) I had absolutely zero faith that any human being that was capable of making the decision to implement my idea would ever actually see it.

Now, the guild leaders who know this isn't the case with GuildPortal are the ones who frequent the Help Community. They make feature requests all the time. Here is how we break down feature requests, and how long it usually takes before each category of request is live on production (keep in mind that providing support for existing functionality takes up the bulk of our time, and that maintenance, upgrades, tuning, and refactoring must also be constantly done to keep GuildPortal speedy and clean):

 Who Benefits from Feature                 Complexity   ETA (Cycles)

 Many Guild Leaders                        Low          1-2
                                           Medium       2-4
                                           High         4-8+

 A Few Guild Leaders                       Low          2-4
                                           Medium       4+
                                           High         8+

 Many Guild Leaders, Many Guild Members    Low
                                           Medium       1-4
                                           High         4+

 Many Guild Members                        Low          1-2
                                           Medium       2-4
                                           High         4-8+

 Few Guild Members                         Low          1-4
                                           Medium       4+
                                           High         8+

Behind the scenes, our "cycles" are milestones. Every week has its own release milestone. But sometimes single features -- or a combination of related features -- will have their own milestone and branch of the code, so that it can be worked on without its changes (especially if there are a lot of core, architectural changes) interfering with the regular support/feature update milestone code bases. This is a fairly recent addition to our toolset, and it makes the entire development and release process -- including adding new features -- much easier to manage.

Every feature request that is made is reviewed by both Sandy and me, whether it comes in via a post on the Help Community, in a Support Ticket, or a comment on the Dev Blog. It is then entered into our issue tracking system, which is a ticket system developed in-house that integrates with GitHub, where bugs, feature requests, and a lot of other things are stored in a way that allows us to link fixes and enhancements to the actual lines of code that were affected.

Feature requests with no ETA are internally assigned to a milestone specifically for those types of feature requests, and it is regularly reviewed to see if anything in there can be squeezed into the next actual release milestone. Not all feature requests are implemented. For each of them, we need to balance the benefit versus the impact to GuildPortal overall. Also, we have to consider the amount of time each request would take to develop.

But no feature request is ever deleted or ignored. So for the many guild leaders out there who haven't asked for anything because you don't think anybody's listening: please, talk to us. We're not Microsoft -- we're entirely family-run by a married couple with a dog who forces us to go outside every couple hours to throw the ball for her. We do listen. Most of the features you see us adding to the service, week after week, are the direct result of a guild leader asking for it either on the Help Community forums or by sending in a support ticket. I'd like to see that extended to commenting on development blog service updates, as well.

Posted 8/31/2012 2:15 PM by Aaron Lewis. 217701 reads.

The following went down during this release cycle (some of the more urgent items were deployed as hotfixes):
  • Enhancement: You can now customize the names of the access levels (Public, Associate, Member, Council, Officer, SuperAdmin). They appear in the forums and other places. From any guild page, click the Admin item on the Guild Bar, hover over Member Management, and then select Custom Level Names. You can also get there from Member Management in the Control Panel.
  • Enhancement: The paging while reading posts has been enhanced a little.
  • CSS: A div has been added that wraps around quotes (previously, the quote titles and bodies were all on their own). The class name is quoteWrapper.
  • CSS: Widget and forum category headers now stretch their background images vertically to match the height of the container. This was done to remove the need to regenerate your images/gradients whenever you changed the font size or the padding.
  • CSS: The ForumCategoryHeader class had rules with !important in them, making it difficult for those who dig into CSS to customize it. The !important directives have been removed.
  • Bug: When mod authors update their mods, the "details" will now save properly.
  • Bug: The edit dialog for voice server status widgets has been set back to the correct one.
  • Bug: For IE users, the voting poll results were sometimes not displaying. Fixed.
  • Enhancement: Auto-suggest has been added to the public GuildPortal page search (along the top). It'll make it easier for people to find your guild more quickly -- it displays things like your game, server, and even more detail if they hover over / select it via arrow keys.
  • Enhancement: Tooltips from GuildHead have been incorporated and will now automatically work on any GuildWars2 site.
  • Bug: The new, less-intrusive widget admin "thingy" (technical term) was sometimes making IE go into convulsive fits, playing with its mind, attempting to get it to give up the location of the rebel bases. I told it that IE does not know the location of the rebel bases, so now it's all better.
  • Enhancement: Member admin tools have been added to the Guild Bar.
  • Enhancement: The shout box was displaying uglier than a dirty monkey at a fancy dinner party. Well, I've never been to a fancy dinner party, but I imagine a dirty monkey would look pretty ugly at one. Anyway, it's been cleaned up (pretty much to spec with what Pinstripesc suggested). Thanks!
  • Bug: The ability to disable paging, and to select page size when it's enabled, has been restored to the admin member editor grid. Now if too many of you guilds with over 1000 members disable paging and refresh the grid a whole bunch, I'm going to know about it and I'm going to do something really mean. I'm not sure what yet, but you're not going to like it one bit!
  • Maint: Some enhanced debugging tools have been added, so it'll make problem resolution faster and all that.
Thanks for choosing GuildPortal! We am u!