Cloudstar

History


Maintenance DNS Belgium ( 07/03/2024 )

07/03/2024 07:30 until 07/03/2024 09:00

On Thursday 07/03/2024 between 07:30 and 09:00 CET, DNS Belgium will perform maintenance on the .be registration platform. This will cause two downtime periods of approximately 30 minutes each. During that time it will be impossible to check whether a .be domain name is available for registration, start a new domain name registration or transfer, or change nameservers or contacts.

 

Issue in datacenter in Brussels ( 16/01/2024 )

15/01/2024 17:05 until 16/01/2024 00:00

There is currently an issue in the datacenter in Brussels, affecting the hypervisors HV-09 and HV-08. As a result, 50 of our (virtual) servers are down.

All servers and services have been checked and should be available again.

 

Timeline

  • 20h : all servers are running. All services should be up again.
  • 19h35 : We are booting all cloud servers.
  • 19h20 : The startup procedure is still running.
  • 18h50 : The hardware is being restarted in the datacenter. Due to the complexity, this may take more than 30 minutes. We will provide an update in 30 minutes.
  • 18h30 : The datacenter engineers are still looking for a solution.
  • 18h00 : An engineer is currently assessing the situation in the datacenter.
  • 17h15 : The problem has been escalated to the datacenter. After initial investigation there seems to be a general issue in the datacenter.
  • 17h05 : Our monitoring detects that there is an issue in the datacenter in Brussels. 

Error when activating email on new domains ( 03/12/2023 )

03/12/2023 03:00 until 03/12/2023 19:00

Since Sunday morning at 3 am, there has been an error when activating email on new domain names. The error started after an update of the spam filter software on one of the nodes (node1) of the spam filter cluster. Due to the update, the control panel of the spam filter software and the API are not available. This prevents any changes from being made to the spam filters.

The error has been escalated to the software vendor.

There is no problem with the delivery of incoming or outgoing email, but as a precaution we have removed node1 from the cluster. Email delivery is now running via the 5 other nodes in the cluster.

We are working to solve the problem as soon as possible.

UPDATE 18h50 : The software vendor applied a patch. The API and control panel are available again.

Hardware failure on HV11 ( 08/06/2023 )

08/06/2023 17:21 until 08/06/2023 18:00

Our monitoring detected a hardware failure on hardware machine HV11. This causes downtime for cloud servers of customers.

We are migrating the cloud servers to other hardware machines.

Update 18h: all cloud servers have been migrated and started. All issues have been resolved.

Upgrade network infrastructure ( 28/04/2023 )

28/04/2023 01:00 until 28/04/2023 07:00

On Friday 28 April, we will be conducting maintenance on the network in the Amsterdam datacenter between 1 AM and 7 AM.

The maintenance is needed to improve the networking inside the datacenter. We aim to minimize the impact for the servers, but there may be a connectivity disruption for about 2 minutes per cloud server.

Network issues in the Amsterdam datacenter - apache17 ( 05/11/2022 )

04/11/2022 13:23 until 05/11/2022 08:20

Our monitoring noticed network issues in our datacenter in Amsterdam. Due to a DDoS attack, some servers and services are slower or have lost their connection to the internet.
All servers are available again.

 

Timeline

  • 13h23: our monitoring detects that some servers in our datacenter in Amsterdam are unavailable.
  • 13h29: the problem is escalated to the datacenter.
  • 13h45: we are still working to resolve the issue.
  • 13h55: Most servers are reachable again, but the situation is not stable at this time.
  • 14h15: Most servers and services are available again, but there may be intermittent issues. We keep monitoring the situation.
  • 15h05: All servers are available again, except for 1 server (apache17.websrv.be). We are trying to resolve this issue as soon as possible.
  • 18h30: we are still working to resolve the issue with apache17. This server is currently only available over IPv6.
  • 20h30: we are still following up with the staff in the datacenter. 
  • 23h: The webserver apache17 is now reachable from the internet. There may be short interruptions.
  • 8h20: The webserver apache17 is now also available. We will keep monitoring.

Firmware upgrade Synergy Blade Enclosures ( 18/10/2022 )

17/10/2022 20:00 until 18/10/2022 04:00

On October 17th there will be a firmware upgrade on the blade enclosures in the Zaventem datacenters. These enclosures hold the blades HV08, HV09, HV10 and HV11. The upgrades ensure the stability, security and performance of the hardware platform.

No downtime is expected. Users may notice a (very) short network hiccup. We will do everything to make sure that there is no impact on our services, by performing this maintenance at night.

Virtual machine unreachable on HV10 ( 25/09/2022 )

25/09/2022 12:03 until 25/09/2022 20:45

At 12h03 our monitoring detected an issue with some virtual machines on the hardware node HV10.

An initial investigation indicates that a hardware failure is causing the issue. An attempt to reboot the hardware machine in the datacenter failed. The hardware vendor - HPE - will also start an intervention soon.

All services are operational again.

We will send a full post-incident report to customers with cloud servers that were impacted by this incident.

 

History

Update 13h50 : we are still investigating.

Update 14h30 : HPE has sent technicians to the datacenter to resolve the hardware problem.

Update 15h25 : The vendor of the virtualisation software is developing a workaround, so we can move the virtual machines. 

Update 15h45 : We are migrating the virtual machines one by one to other hardware. The first machines are back online.

Update 16h20 : Most servers are migrated to other hardware nodes and are running.

Update 16h45 : All virtual servers have been migrated to other hardware nodes.

Update 20h00 : The hardware node HV10 has now been repaired by an HPE engineer. All tests were successful and HV10 has been re-integrated into the cloud cluster. We will be moving some non-critical servers to HV10. We will be monitoring this very closely.

 

Shared hosting, mail and nameservers - Installation of patches and upgrades ( 17/08/2022 )

17/08/2022 04:00 until 17/08/2022 08:00

On Wednesday August 17th, between 4 am and 8 am, we will install software updates for the operating system and system software, in order to keep your files, emails and data on our servers safe. These updates will solve (critical) software vulnerabilities and optimize the performance of the server system.

The maintenance will be performed on all Linux webhosting servers (apache01 to apache16), the database servers (mysql01 to mysql09), the cloudemail.be-environment and the nameservers (ns1 to ns5).

During the maintenance period there may be short service disruptions. We will try to keep the nuisance to a minimum.

Update storage system 3PAR - Diegem ( 06/07/2022 )

05/07/2022 20:00 until 06/07/2022 07:00

We will be upgrading our 3PAR systems in the Diegem data centre with recommended patches.

All cloud servers and shared servers in the Brussels datacenters use the 3PAR SAN. During this maintenance there will be less redundancy. We do not expect any impact on the availability and performance of the servers, as the 3PAR SANs are highly redundant.

Update storage system 3PAR - Zaventem ( 29/06/2022 )

28/06/2022 20:00 until 29/06/2022 06:00

We will be upgrading our 3PAR systems in the data centre in Zaventem with recommended patches.

All cloud servers and shared servers in the Brussels datacenters use the 3PAR SAN. During this maintenance there will be less redundancy. We do not expect any impact on the availability and performance of the servers, as the 3PAR SANs are highly redundant.

Maintenance DNS Belgium ( 21/06/2022 )

21/06/2022 07:00 until 21/06/2022 10:00

DNS Belgium plans to perform maintenance on the .be registration platform on Tuesday 21/06/2022.

This will cause downtime for these services during the maintenance window:

  • checking whether a .be domain is available
  • registration or transfer of .be domain names
  • changes to the contacts or nameserver configuration

During this time window there will be no zone-file updates. Changes will be queued and processed after maintenance ends.

The nameservers of DNS Belgium are not impacted by this maintenance, so all .be domain names will remain active during the maintenance.

Hypervisor maintenance HV04-HV06-HV10-HV11 ( 14/04/2022 )

08/04/2022 09:00 until 14/04/2022 22:30

This week we will perform firmware upgrades on hypervisors HV04, HV06, HV10 and HV11 to keep them secure and fast-performing.

In order to upgrade the hypervisors we will empty them one by one by live-migrating the cloud servers to another hypervisor. There will be no downtime for the cloud servers and the services hosted on them, but there may be a performance loss for a few minutes during the live migration. The migration of the cloud servers will be performed outside peak-utilisation hours as much as possible.

  • HV04 will be upgraded on Friday April 8th, starting at 9h.
  • HV06 will be upgraded on Tuesday April 12th, starting at 9h.
  • HV10 will be upgraded on Wednesday April 13th, starting at 9h.
  • HV11 will be upgraded on Thursday April 14th, starting at 9h.

After the upgrades and tests, the cloud servers will be moved back to the upgraded hypervisors.
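
As an illustration only, emptying a hypervisor before an upgrade can be sketched as a small planning step. The sketch below assumes that the number of cloud servers per hypervisor is an acceptable stand-in for load, which is a simplification, and the server names are hypothetical. It only produces a list of moves and does not call any virtualisation API.

```python
from typing import Dict, List, Tuple


def plan_drain(placement: Dict[str, List[str]], target: str) -> List[Tuple[str, str]]:
    """Return (server, destination) pairs that empty the hypervisor `target`.

    Each server is sent to whichever other hypervisor currently hosts the
    fewest servers, so the load stays roughly balanced during the upgrade.
    """
    # Track server counts separately so the input mapping is not modified.
    load = {hv: len(servers) for hv, servers in placement.items() if hv != target}
    if not load:
        raise ValueError("no other hypervisor available to receive servers")

    moves = []
    for server in placement.get(target, []):
        # Ties are broken by hypervisor name to keep the plan deterministic.
        destination = min(sorted(load), key=load.get)
        moves.append((server, destination))
        load[destination] += 1
    return moves


# Hypothetical placement, for illustration only.
placement = {
    "HV04": ["web-a", "web-b", "db-a"],
    "HV06": ["mail-a"],
    "HV10": ["web-c", "web-d"],
}

for server, destination in plan_drain(placement, "HV04"):
    print(f"live-migrate {server}: HV04 -> {destination}")
```

A real plan would also have to take memory, CPU and storage headroom on the receiving hypervisors into account.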

 

Shared hosting, mail and nameservers - Installation of patches and upgrades ( 12/04/2022 )

12/04/2022 04:00 until 12/04/2022 06:00

On Tuesday April 12th, between 4 am and 6 am, we will install software updates for the operating system and system software, in order to keep your files, emails and data on our servers safe. These updates will solve (critical) software vulnerabilities and optimize the performance of the server system.

The maintenance will be performed on all Linux webhosting servers (apache01 to apache16), the database servers (mysql01 to mysql09), the cloudemail.be-environment and the nameservers (ns1 to ns5).

During the maintenance period there may be short service disruptions. We will try to keep the nuisance to a minimum.

Emergency maintenance - Core Network (Nossegem - Sint-Lambrechts-Woluwe) ( 15/02/2022 )

15/02/2022 22:00 until 15/02/2022 23:59

Recently we have faced network issues in the core network in the Stuart datacenters, and these issues have been investigated together with our vendor. The conclusion is that we were facing a bug in the switches that causes conflicts between virtual and physical port numbers. This bug will be fixed in a future release.

In the meantime Stuart will implement a workaround to mitigate this bug and avoid further issues. Later, when the vendor releases a bug fix, we will patch our core switches.

During today's maintenance we will make some physical changes, with the related configuration changes, as a workaround to prevent the bug from triggering in the future. There should be no impact, unless the bug triggers while the workaround is being applied. If we see impact on the network during this maintenance, we will fix it immediately.

 

Networking issue in the datacenter in Sint-Lambrechts-Woluwe ( 25/01/2022 )

25/01/2022 09:02 until 25/01/2022 23:11

Our monitoring picked up on a networking issue in the datacenter in Sint-Lambrechts-Woluwe. This causes some servers to be unreachable from the internet. A preliminary analysis indicated that the issues originate from the core-routers in the datacenter.

Engineers in the datacenter are investigating the issue and working on a solution.

  • 9h41 : Most servers are available again. A limited number of servers show issues.
  • 12h03 : Not all issues are fully resolved. A small number of servers still have issues connecting to the network. Engineers keep working on a definitive solution.
  • 15h25 : There are still some issues with the network. A small number of Windows servers has issues with connecting to the network. Shared hosting servers and the email-servers are fully functional. Engineers keep working on a definitive solution.
  • 18h08 : Engineers keep working on a definitive solution.
  • 20h01 : Engineers keep working on a definitive solution.
  • 21h22 : All servers are reachable now. 
  • 22h30 : Some servers are offline.
  • 23h11 : All servers are reachable again.

Network issue in datacenters in Nossegem and Sint-Lambrechts-Woluwe ( 24/01/2022 )

24/01/2022 20:55 until 24/01/2022 22:36

At 20h55 our monitoring system detected a network issue in the datacenters in Nossegem and Sint-Lambrechts-Woluwe. This causes servers to respond more slowly or to become completely unavailable.

The problem has been escalated to the datacenter where engineers are working to resolve the issue as fast as possible.

21h57 : most servers are available again, but we still see some instability on the network.

22h36 : all servers are available again. We continue to monitor.

 

Hypervisor maintenance HV07-HV08-HV09 ( 14/01/2022 )

10/01/2022 09:00 until 14/01/2022 18:00

This week we will perform firmware upgrades on hypervisors HV07, HV08 and HV09 to keep them secure and fast-performing.

In order to upgrade the hypervisors we will empty them one by one by live-migrating the cloud servers to another hypervisor. There will be no downtime for the cloud servers and the services hosted on them, but there may be a performance loss during the live migration. The migration of the cloud servers will be performed outside peak-utilisation hours as much as possible.

  • HV07 will be upgraded on Tuesday January 11th, starting at 10h.
  • HV08 will be upgraded on Wednesday January 12th, starting at 10h.
  • HV09 will be upgraded on Thursday January 13th, starting at 10h.

After the upgrades and tests, the cloud servers will be moved back to the upgraded hypervisors.

Storage migration of shared hosting servers ( 18/10/2021 )

18/10/2021 00:00 until 18/10/2021 06:00

To maximize the reliability and performance of our shared hosting environment, we will move some of our shared hosting servers to a new storage environment.

The move will take place in the night of Sunday October 17th to Monday October 18th, starting at midnight CET. There will be downtime for the websites hosted on our servers. The downtime will be 30 to 60 minutes per server, depending on the size of the disks in the virtual machines. We expect all machines to be moved by 6h.

These servers will be moved:

  • cs-one-apache06.websrv.be
  • cs-one-apache07.websrv.be
  • cs-one-apache08.websrv.be
  • cs-one-apache10.websrv.be
  • cs-one-apache11-su
  • cs-one-apache12-sj
  • cs-one-mssql03.websrv.be
  • cs-one-webfarm-lb01

Maintenance on shared hosting servers (web + database + mail) ( 28/09/2021 )

28/09/2021 04:00 until 28/09/2021 06:00

On Tuesday September 28th, between 4h and 6h in the morning, we will update our shared hosting environment with the latest security patches and the latest versions of the operating system and software components (such as Apache, MySQL and Varnish). To activate the new software versions a restart of the servers is necessary. Websites hosted on these servers may be unavailable for a few minutes during the maintenance window.

This maintenance is performed every two months and is necessary to keep all services secure, fix issues, roll out new features and optimize performance.

Network migration in Nossegem and Sint-Lambrechts-Woluwe ( 09/09/2021 )

08/09/2021 23:00 until 09/09/2021 01:30

On Wednesday September 8th, between 23h and 3h, the hypervisors HV8, HV9, HV10 and HV11 will be migrated to a new network infrastructure in the datacenters of Nossegem and Sint-Lambrechts-Woluwe.
Because the migration will also impact the storage network, we will, out of an abundance of caution, shut down the cloud servers. After the migration the servers will be booted again. This is to avoid possible corruption of the disks of the cloud servers.
There will be a downtime of 30 to 60 minutes.
We will notify all customers with impacted cloud servers with an individual communication.

Update 1h30 : the maintenance has been completed successfully.

DDoS attack on nameservers ( 24/07/2021 )

24/07/2021 21:00 until 24/07/2021 23:30

We are currently experiencing a DDoS attack on our nameservers. This may cause domain names to be unavailable.

We are blocking traffic to our nameservers from several countries. Traffic from Belgium, the Netherlands, Luxembourg, France and Germany is allowed.

UPDATE 23h30: At this time traffic from most other countries (including the US) is possible again.

Maintenance Linux Webhosting servers ( 29/06/2021 )

29/06/2021 04:00 until 29/06/2021 06:00

On June 29th we will perform maintenance on our Linux shared hosting servers (apache01 to apache14, mysql01 to mysql07). During maintenance, the operating system and server software will be updated with security updates. After the installation, the servers will be rebooted. 

During the maintenance period there will be several minutes of downtime for the websites and databases hosted within the shared hosting environment.

Maintenance is carried out between 4 a.m. and 6 a.m., so that disruption to you and your customers is limited. Thanks to this maintenance, the hosted websites and databases remain secure and fast.

Packetloss for some servers ( 23/06/2021 )

23/06/2021 10:43 until 23/06/2021 12:00

Since 10h43 we have noticed a problem in our network with higher packet loss on some international peering connections. Some websites may load slower than usual or experience short interruptions for a small percentage of users. For users on local networks like Telenet or Proximus, there is no slowness.
We and our datacenter partner are investigating the issue.

Update 12h : The issue has been resolved.
 

EMERGENCY MAINTENANCE - 07/06/2021 between 23h and 2h ( 08/06/2021 )

07/06/2021 23:00 until 08/06/2021 00:20

This afternoon a fire detector generated a false alert and activated the extinguisher system in the datacenter in Nossegem (Brussels). As a result, an inert gas was released in the datacenter.
The release of the extinguishing gas triggered several hard disks to go into a safety mode. This has been detected for the OS disks of 2 of the hypervisors (hv08 and hv09) in our private cloud infrastructure.
We want to fix this issue as soon as possible to guarantee the stability of our services. That is why we will perform emergency maintenance this evening between 23h CET and 2h CET. During this maintenance we will shut down the cloud servers on hv08 and hv09 and boot them on another hypervisor.
Due to the shutdown and reboot of the cloud servers there will be a downtime of a few minutes per cloud server.

 

Update 0h00 : The maintenance has been completed successfully. All servers are available again.

Cloud Server with disks in read-only status ( 04/06/2021 )

04/06/2021 08:30 until 04/06/2021 10:00

Due to a problem on the SAN, 62 cloud servers had disks in read-only status. All servers have been fully operational since 9h56 (Brussels time).
The issue is related to the issue that occurred last week on Thursday May 27th.

History

  • 8h11 : Disks of 62 servers go into read-only mode due to an issue on the SAN.
  • 9h56 : All 62 servers have been rebooted, have had a disk check and the services are checked.

Cause

HPE has confirmed that the issue is caused by a software problem in their SAN. The problem is related to garbage collection: the disks on the SAN fill up until no more data can be written to the SAN. This causes the disks of the cloud servers to go into read-only mode.
HPE has created a patch for this issue and is testing it. As soon as the patch becomes available, we will roll it out during a maintenance window.

Actions

Until the HPE patch is available, we are monitoring the impacted LUN, as we have done since Thursday May 27th. Whenever there is not enough storage available, we add additional storage to the LUN. This morning the LUN ran out of available space because a large amount of "garbage" information was suddenly added as a result of intensive I/O activity. We are investigating how we can avoid this. 

We will also look at additional actions that we can take while waiting for the patch.
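
To make the interim action concrete, the following is a minimal sketch of the kind of check that decides when and by how much to extend a LUN. The 20% free-space target and the sizes in the example are assumptions chosen for illustration, not values from this incident, and the function does not talk to any storage system.

```python
def extra_capacity_needed(capacity_gb: float, used_gb: float,
                          min_free_percent: int = 20) -> float:
    """Return how many GB to add so that at least `min_free_percent`
    of the enlarged LUN is free. Returns 0.0 if there is enough headroom."""
    if not 0 < min_free_percent < 100:
        raise ValueError("min_free_percent must be between 1 and 99")
    if capacity_gb <= 0 or used_gb < 0:
        raise ValueError("capacity must be positive and usage non-negative")

    # Smallest capacity at which the used data leaves the requested share free:
    # (required - used) / required >= min_free_percent / 100
    required_gb = used_gb * 100 / (100 - min_free_percent)
    return max(0.0, required_gb - capacity_gb)


# Hypothetical LUN: 1000 GB in size, 900 GB used.
print(extra_capacity_needed(1000, 900))
```

A sudden burst of writes, as happened this morning, can still outrun a periodic check like this, which is why the target headroom matters as much as the check itself.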

Read-only storage on some servers ( 27/05/2021 )

27/05/2021 09:54 until 27/05/2021 18:47

On May 27 at 9h46 a problem was detected with the storage system that caused disks to go read-only on some servers. As a result, information could no longer be written to databases and files. Some applications were inaccessible or gave error messages. After a reboot and disk check the servers were available again.
The problem occurred 4 times throughout the day.
Up to 62 servers were impacted. The problem was limited to 1 storage LUN in the data center in Nossegem.
At 18h47 a workaround was implemented. Since then the situation has been stable.
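
For readers who want to check their own Linux cloud server for this condition, the sketch below lists mount points that are currently mounted read-only. It assumes the standard `/proc/mounts` layout (device, mount point, filesystem type, options) and takes the file content as text; the sample data is hypothetical.

```python
from typing import List


def read_only_mounts(mounts_text: str) -> List[str]:
    """Return the mount points whose mount options include 'ro'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        # Match the exact option 'ro'; 'errors=remount-ro' must not count.
        if "ro" in options:
            result.append(mount_point)
    return result


# Hypothetical sample in /proc/mounts format.
sample = """\
/dev/sda1 / ext4 ro,relatime,errors=remount-ro 0 0
/dev/sdb1 /var/lib/mysql ext4 rw,relatime 0 0
"""

print(read_only_mounts(sample))
```

On a live system the text would come from reading `/proc/mounts`. Some filesystems are read-only by design, so a real check would filter on the mount points that are expected to be writable.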

History

  • 9h46 : Our monitoring system reports that the disks of dozens of servers have gone into a read-only mode. We immediately started investigating the issue and escalated it to the data center.
  • 10h03 : After an initial analysis, it seems to be a temporary issue. We test with some test machines whether they reboot correctly.
  • 10h11: We are restarting all impacted servers one by one, after which they will become operational again.
  • 11h07 : All servers have now been rebooted and their disks checked. We are checking all services on the servers.
  • 11h30: All servers have been checked and all services are available again.
  • 14h06 : The problem occurs again. We commence restarting all machines.
  • 15h30 : All machines have started up again and are checked.
  • 16h05: The problem re-occurs. We are restarting all machines.
  • 17h10: All machines have been restarted.
  • 17h37: The disks go back to read-only. We start over with booting up and checking the servers.
  • 17h40 : The root cause has been found. The necessary logs are collected to pass to the SAN vendor so that a definitive solution can be implemented.
  • 18h47: A workaround has been implemented.

Maintenance on shared hosting - mysql03.websrv.be ( 09/04/2021 )

09/04/2021 03:00 until 09/04/2021 05:00

In the night of April 9th, between 3:00 AM and 5:00 AM, we will perform urgent maintenance tasks on the database server mysql03.websrv.be.
During this maintenance, there will be a service disruption for the websites that use databases on the server mysql03.websrv.be. This may take about 30 minutes.
The intervention will also be performed on the replica server mysql04.websrv.be. For this server there will be no impact on the hosted websites.

The maintenance is performed to improve the stability and performance of the databases.
 

Performance issue on Cloud Servers ( 05/04/2021 )

05/04/2021 12:22 until 05/04/2021 13:00

Since around 12h today, our monitoring system has detected a performance issue on cloud servers.

Update 12h35

The performance issues are resolved. 

Maintenance on shared hosting - mysql01.websrv.be ( 10/02/2021 )

10/02/2021 03:00 until 10/02/2021 09:00

In the night of February 10th at 3:00 AM we will perform maintenance tasks on the database server mysql01.websrv.be.
During this maintenance, there will be a service disruption for the websites that use databases on the server mysql01.websrv.be. This may take 30 minutes.
The intervention will also be performed on the replica server mysql02.websrv.be. For this server there will be no impact on the hosted websites.

The maintenance is performed to improve the stability and performance of the databases.

Maintenance on myaccount.cloudstar.be ( 31/10/2020 )

31/10/2020 09:00 until 31/10/2020 17:00

On Saturday 31 October, we will perform maintenance on the cloudstar control panel myaccount.cloudstar.be. As a result, there can be brief interruptions throughout the day when ordering, managing and setting up products and services. Please be patient and try again at a later time. There is no impact on the functioning of domain names, websites and servers.

During the maintenance our support teams will be available to answer your questions.

Maintenance is performed to improve performance, safety and stability.

Hardware maintenance - HV04 and HV06 ( 11/03/2020 )

10/03/2020 20:00 until 11/03/2020 06:00

On March 10th we will perform maintenance on hardware machines HV04 and HV06.

There will be no impact on the cloud servers that are hosted on these machines. We will migrate the cloud servers to other hardware nodes.

Upgrade monitoring system ( 03/03/2020 )

03/03/2020 12:00 until 03/03/2020 18:00

On March 3rd, between noon and 6 pm, we will upgrade our monitoring platform based on Zabbix. The upgrade itself will take approximately 30 minutes.

During the upgrade it will not be possible to log in to Zabbix, and no new measurements will be recorded temporarily.

The upgrade of Zabbix will ensure that the monitoring platform remains safe and efficient.

Upgrade firmware HV05 - no impact expected ( 18/12/2019 )

17/12/2019 20:00 until 18/12/2019 04:00

On December 17th at 20h, the firmware of hypervisor HV05 will be upgraded. This upgrade is part of normal, preventive maintenance. It fixes bugs in the firmware and is needed to keep guaranteeing stability and security.

Prior to the maintenance we will move all virtual machines on HV05 to other hypervisors, so that there is no impact on the applications and websites we host.

Border Router Network Update ( 10/10/2019 )

10/10/2019 06:00 until 10/10/2019 22:00

One of the BGP border routers in our datacenters will be upgraded to a newer version.

No impact is expected.

Storage issue in Combell datacenter ( 04/10/2019 )

04/10/2019 10:30 until 04/10/2019 12:05

There is currently an issue with storage in the Combell datacenter. Combell engineers are working to solve the problem. Due to this issue, the cloud servers with bxl and ant in the server name are unreachable.
As soon as more information is available, we will publish it here.

Update 10h45
The storage issue has been resolved and most servers are reachable again. We are now going through all servers to check that they are working properly.

Update 11h05
After an initial fix, the link to the storage is unreachable again, which means cloud servers are down. Combell is working to solve the problem.

Update 12h05 - resolved
The storage issues appear to be over. All servers are online again. Combell has confirmed that a permanent solution is now in place.
The root cause was one of the storage nodes being unreachable.

We will keep following this up and look at which actions we can take to prevent this from happening again in the future.
