OUTAGE: RESOLVED Salt Lake City VPS Outage

Discussion in 'Network and Server Status' started by Bryan, Apr 5, 2021.

  1. Bryan

    Bryan Administrator

    Messages:
    7,740
    Likes Received:
    1,883
    Unfortunately, there was a catastrophic issue at the Salt Lake City datacenter. Here is the latest update:

    Hello Everyone,
    Now that we have a better understanding of what happened we would like to give everyone an update.
    One of our old generators, which has worked for years and was recently load tested, had a mechanical failure and caught fire. This cut power to our core routers, and the fire suppression system controlled the fire. Unfortunately, the fire department opted to cut power to the rest of the building as a precaution, even though the power systems were independent. We are currently waiting for an emergency inspector to arrive and give the all-clear so we can bring most of the servers in Ogden back online. Some servers will have an extended outage, as they may require rebuilds due to some water damage. There is a high probability that the data on those servers is intact.
    We would like to thank you for your patience; please know that we are doing everything we can to get everyone back online.
     
  2. Joshua Johnson

    Joshua Johnson Premium VPS Client Hosting Client

    Messages:
    22
    Likes Received:
    7
    Location:
    lake charles, la
    :( hopefully be back online soon
     
    Last edited: Apr 5, 2021
    Bryan likes this.
  3. Fedora

    Fedora Premium Hosting Client VPS Client

    Messages:
    2,641
    Likes Received:
    3,014
    Oh my cat! :eek:
     
    Bryan likes this.
  4. Bryan

    Bryan Administrator

    Here is the latest update. Thank you @Konstantin

    As you may have noticed, over the weekend there was an incident at the WebNX Ogden, Utah data center. Since then we have been assessing the situation and determining a course of action to best help our clients get back online and running. We are exercising extreme caution in order to safeguard the integrity and security of our customers’ data while also expediting the process of getting online as soon as possible. Here at WebNX we value the relationship that we have with our customers and will be as transparent as possible throughout this entire process without delaying our primary goal of returning our customers to full working order.

    Here is a brief overview of the events as they transpired. Sunday afternoon the city power was disrupted and, as designed, our backup generators automatically switched on. However, during that transition, one of our backup generators that had been recently tested and benchmarked specifically for this situation experienced a catastrophic failure, caught fire, and as a result initiated the fire suppression protocol.

    The WebNX data center in Ogden, Utah experienced some damage. Customers’ servers in one of our main bays were exposed to water and may have been damaged. No fire damage was inflicted on customer servers. The majority of the hardware in our data center was spared, but there are machines that need to be inspected for water damage and possibly rebuilt. As of now, we are working to bring power, network, and unaffected hardware back online within the next day or two.

    We thank you for your continued support and understanding. We try to plan for worst-case scenarios and hope for the best in all that we do. We ask that you please submit only one support ticket and refrain from opening multiple support requests until the bulk of the situation is resolved. Please understand that we are working feverishly to resolve this as fast as possible.

    As for the issue of compensation for downtime, we would advise you to review the Service Level Agreement, which can be found at https://webnx.com/sla/. Any issues/credits will be reviewed next week once the dust has settled. Thank you once again. Please stay tuned to our Facebook page for regular updates.
    The WebNX team

    Source: Facebook
     
    Konstantin likes this.
  5. Bryan

    Bryan Administrator

    Another update:

    All our servers were in the splash zone (like Shamu at Sea World), were potentially affected by the fire suppression system, and unfortunately are not being brought up yet. The datacenter is going to need to go through every single server individually, boot it up, and burn it in to make sure everything is running properly, then go from there. If the server hardware is not running properly, the datacenter will attempt to move the drives into new server equipment and boot from there.

    This is unfortunately going to take time and there is no ETA in sight. If you are a paying VPS client here, refunds will absolutely be issued for the downtime. I ask that you allow us some time to process those refunds, and please don't open a PayPal dispute, etc., as that would make me very angry. :p

    Thanks for your understanding. I will continue to update everybody as we get more information.
     
  6. Fedora

    Fedora Premium Hosting Client VPS Client

    Bryan likes this.
  7. Bryan

    Bryan Administrator

    We truly apologize for the lack of updates over the last few days, and we hope to do better moving forward. Tonight we reached a milestone, so it is a good time to give a positive status update. We’d first like to address those who may not fully appreciate why it could take weeks to bring all the servers back online. After the full fire, city, and electrical inspections were done, and given the options open to us, the amount of work ahead was on par with building a new datacenter from scratch. It was a task of recreating within weeks what we have built over the past 5 years here in Utah.

    On the first night alone we had to build brand new cores, something that is usually planned over weeks. Then we had to find space wherever we could and plan the installation of enough racks to replace all that were taken down, along with all the aggregation and distribution switches supporting those racks, as they were also damaged. As the days went on and thousands of servers were pulled from racks and set out to dry, multiple other crews were working to restore electrical, networking, and supporting structures. It was a hectic environment, all while we were trying to bring clients back online as others expanded the space for more servers. Even with all that work, it wasn’t proceeding fast enough.

    So we also pulled in a number of vendors to brainstorm the different options available to us, and one of them has turned out to be working quite well after tonight’s testing. We are very happy to announce that we expect to have full power restored to bay 3 later today, which will allow over 80% of the damaged racks to come back online once key components are repaired or replaced. Coupled with all the work done elsewhere in the building, this will allow us to be back online on a much quicker timeline than we previously expected. We hope to have all servers fully recovered as early as next Friday. We are running test equipment against the networking gear in bay 3 to verify network stability.
    To address the obvious failure to provide timely updates to our client base, we’re working on revamping internal processes and creating additional positions of authority to prevent this in the future. While we didn’t properly communicate the amount of effort being expended, we hope this update makes clear that we have been doing everything in our power to get all of the servers back online as quickly as possible.

    We also want to extend our thanks to those who have sent positive messages, expressed understanding of the difficulty, and even ordered food delivered to our crews, who have been working overtime to bring the datacenter back online. Your patience and your trust in us to make things right are truly appreciated.
    We will provide further updates as we bring the first servers in bay 3 online again.

    Daniel Pautz
    Founder and CEO
    WebNX.com
     
  8. Bryan

    Bryan Administrator

    We are quickly approaching the halfway mark with server inspections. By our rough estimates, about 20% of servers need some repairs. These repairs range from something as simple as a battery replacement to a new CPU. Data on these machines appears to have remained mostly intact, with few exceptions. Servers that are still down should be up and running by Friday at the very latest. Thank you so much for all your patience throughout this time. We truly appreciate the loyalty we have seen from our customers and members of the online community.
     
  9. Bryan

    Bryan Administrator

    The VPS server should be back online as of an hour ago. We and the datacenter are still doing a burn-in, verifying data integrity, etc., so more downtime is possible. But we're getting somewhere!
     
  10. Bryan

    Bryan Administrator

    Everything is back online except 1 server, which doesn't affect anyone here. Well, it does, but not with IF. :)

    We will go ahead and mark this as closed. Apologies again for the downtime. The datacenter worked (and is still working) pretty doggone hard to get everything up so quickly, so kudos to them. I know they have learned a lot from this experience, and hopefully they can prevent something similar from happening in the future.

    The only ongoing issue is potentially slightly lower network speeds. We are not seeing any issues ourselves at the moment; in fact, we are maxing out our port. But the network is still suffering slightly, and there is a lot of data being moved in and out of that facility as more servers come back online.

    Code:
    10gb.bin            100%[===================>]   9.77G   112MB/s    in 89s
    
    2021-04-14 02:28:37 (112 MB/s) - ‘10gb.bin’ saved [10485760000/10485760000]
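
For anyone wanting to sanity-check the "maxing out our port" claim, here is a rough Python sketch that converts the wget figures above into line rate. The byte count and the 89 seconds come straight from the output; the 1 Gbit/s port size is an assumption for illustration, since the thread does not state it, and wget rounds the elapsed time, so treat the results as approximate.

```python
# Figures copied from the wget output above.
BYTES_TRANSFERRED = 10_485_760_000
ELAPSED_SECONDS = 89  # wget reports whole seconds, so this is rounded

# Assumption (not stated in the thread): the server sits on a 1 Gbit/s port.
ASSUMED_PORT_BITS_PER_SECOND = 1_000_000_000


def throughput(bytes_transferred: int, seconds: float) -> dict:
    """Return the average transfer rate in a few common units."""
    bytes_per_second = bytes_transferred / seconds
    return {
        "MiB/s": bytes_per_second / 2**20,       # binary units, as wget prints
        "MB/s": bytes_per_second / 10**6,        # decimal megabytes
        "Mbit/s": bytes_per_second * 8 / 10**6,  # line rate in megabits
    }


rates = throughput(BYTES_TRANSFERRED, ELAPSED_SECONDS)

# Share of the assumed port capacity used by this single download.
port_utilisation = rates["Mbit/s"] * 10**6 / ASSUMED_PORT_BITS_PER_SECOND

for unit, value in rates.items():
    print(f"{value:.1f} {unit}")
print(f"{port_utilisation:.1%} of the assumed port")
```

Under that assumption the download averages roughly 112 MiB/s, or a little over 940 Mbit/s, which is about what a 1 Gbit/s link can carry once protocol overhead is accounted for.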

    If your VPS is still down or not working properly, please open a ticket here, or PM or email me. Everything should be operational as of 4/13.
     
