Power management in an Internet server and network room


by Christophe » 12/05/07, 13:09

Two days ago, one of the hosting providers where I have a server suffered a power failure.

I am posting the final report here: it is interesting because it explains (roughly) how power is managed in a network room and which power problems can arise during an incident. It also shows that an accident is often a succession of minor incidents, and that a single missed check (in this case, putting the generator back in automatic-start mode) can lead to a "catastrophe".

Dear Customer,

After more than 20 hours on a war footing, we can finally give you some explanation of the major technical incident we experienced between 10 May 17h and 11 May 15h30 (the time at which all services were 100% operational again).
This incident is exceptional in both its nature and its consequences...

Summary:

Yesterday at 16h15, the building's EDF (GEG) power supply tripped at the cell located outside the building. We do not know the precise reason the supply was cut (and neither does GEG).

In such a situation, power to the servers should not be cut, since the building is protected by 3 large inverters (UPS units) as well as a 400 kVA diesel generator. The system normally works well: there was already an EDF outage a week ago, and it had no impact.

As you saw, that is not what happened yesterday.

Staff from Cogent (the international group that operates this datacenter) intervened at the beginning of the week to carry out maintenance and tests on the diesel generator. They evidently did not put the generator back in automatic-start mode, which makes it start as soon as it detects that EDF power is gone.

The result was inevitable: the generator did not start, the inverters drained completely, and the servers lost power.

Moreover, the site is normally monitored from several NOCs located in Paris, New York and Spain, so that Cogent can detect this type of trouble very quickly and intervene. This time it did not work, because the monitoring system itself had been suffering an incident for a few days (which should be resolved today or tomorrow).

PHPNET was on site less than 10 minutes after the servers lost power, on the one hand to start the generator manually and restore power, and on the other to restart all the servers.
As trouble never comes alone... When GEG restored mains power, the automatic switchover system tried to switch back to it, and that is where we got our second power cut...

In fact, the normal switching procedure is: EDF => inverters => generator. Conversely, to switch back to EDF, the path is generator => inverters => EDF.
Since the inverters had not had time to recharge sufficiently, power to the servers was cut again.
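
To illustrate the switchover logic described above, here is a minimal Python sketch (it is not part of PHPNET's report). It assumes that switching back to the mains is only safe when the energy stored in the inverters covers the transfer gap with some margin. The function name, the safety margin and all the figures used with it are hypothetical.

```python
# Order in which the load moves when the mains fail, as described in the report.
FAILOVER_PATH = ["edf", "inverters", "generator"]
# Returning to the mains follows the same path in reverse.
RETURN_PATH = list(reversed(FAILOVER_PATH))


def can_switch_back(charge_fraction, capacity_kwh, load_kw, transfer_s, margin=2.0):
    """Return True if the inverters hold enough energy to bridge the transfer.

    charge_fraction: current battery charge, from 0.0 (empty) to 1.0 (full)
    capacity_kwh:    total usable battery capacity (hypothetical figure)
    load_kw:         power drawn by the servers during the transfer
    transfer_s:      how long the inverters must carry the load alone
    margin:          safety factor applied to the energy needed (assumption)
    """
    available_kwh = charge_fraction * capacity_kwh
    needed_kwh = load_kw * transfer_s / 3600.0
    return available_kwh >= margin * needed_kwh


if __name__ == "__main__":
    # Nearly empty batteries: switching back would drop the servers.
    print(can_switch_back(0.01, 50.0, 300.0, 60.0))
    # Half-charged batteries: the transfer is covered with margin.
    print(can_switch_back(0.5, 50.0, 300.0, 60.0))
```

With these made-up figures, a 60-second transfer at 300 kW needs 5 kWh, so the check fails at 1% charge and passes at 50%. An automatic switchover that ran such a check would simply stay on the generator until the batteries had recharged.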

To top it off, the EDF supply tripped again a few minutes after the switchover because the site's power draw was too high. Air conditioning units and servers draw 2 to 3 times more electricity at startup, so consumption exceeded the settings of the cell on the outside of the building, causing yet another cut for the servers.

So we had to switch off the air conditioning and restart each part of the building at staggered intervals, so as not to trip the supply again.
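
The effect of a staggered restart can be sketched with a little arithmetic in Python (again, an illustration that is not part of the report). The sketch assumes that each zone draws its steady load multiplied by an inrush factor while it starts, then falls back to its steady load. The zone loads and the 400 kW trip limit are hypothetical; only the factor of 2 to 3 comes from the text.

```python
def simultaneous_peak(loads_kw, inrush_factor):
    """Peak draw if every zone starts at the same moment."""
    return inrush_factor * sum(loads_kw)


def staggered_peaks(loads_kw, inrush_factor):
    """Peak draw at each step when zones are started one after another.

    Each step adds the inrush of the zone being started to the steady
    load of the zones that are already running.
    """
    peaks = []
    running_kw = 0.0
    for load in loads_kw:
        peaks.append(running_kw + inrush_factor * load)
        running_kw += load
    return peaks


if __name__ == "__main__":
    zones_kw = [100, 80, 60, 40]   # hypothetical steady loads per zone
    trip_limit_kw = 400            # hypothetical breaker setting
    factor = 3.0                   # upper end of the "2 to 3 times" range

    print(simultaneous_peak(zones_kw, factor) > trip_limit_kw)    # everything at once
    print(max(staggered_peaks(zones_kw, factor)) > trip_limit_kw)  # one zone at a time
```

With these figures, starting everything at once would peak at 840 kW, well above the limit, while starting the zones one at a time never exceeds 360 kW. Starting the largest zone first keeps the later peaks lower, since the biggest inrush occurs while nothing else is running.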

These numerous power cuts caused the loss of many hard drives in the servers and (especially) the loss of several file systems. However, we successfully switched to our backup system last night for a few hours.

Mail server 1 (cluster1) had to be restored from our last backup because its data was not recoverable. The situation is now resolved.

Today, responsibility for this incident must be attributed to Cogent, who should have made sure that the generator was in automatic-start mode and, above all, should have monitored the datacenter properly so as to intervene before the inverters were empty.

We will open the necessary negotiations to obtain financial compensation, which will be passed on to your PHPNET subscription in proportion to the outage you suffered.

PHPNET is currently studying various options for setting up a datacenter, because we no longer want to depend on the goodwill of providers like Cogent or Redbus.
We will keep you informed about this project, which should be completed within the next 12 months.

The entire PHPNET team joins me in thanking you for your understanding and in apologizing for the inconvenience this may have caused you.

Have an excellent weekend,
----
PHPNET
123 ter cours de la Libération
38100 GRENOBLE

by nonoLeRobot » 12/05/07, 14:10

We do not talk about it often, but data center power consumption is a real problem. (Here apparently around 400 kW, judging by the generator.)

IBM is launching a major project to reduce consumption, or at least to keep it from increasing despite growing computing power:

http://www.presence-pc.com/actualite/IBM-Green-project-23275/

by Colmant » 12/05/07, 14:11

Indeed, things were acting up while I was writing messages, and at the same time the battery in my wireless mouse died.

It was during or just after jérome -dominique 234 got banned again; I thought that, with his lofty IQ, he had managed to infect the system with a virus...
Given my limited computer skills, I am reassured.
See you

by Christophe » 12/05/07, 14:56

No, Colmant, this incident concerns the server that was hosting the blog (www.econologie.info), not the site and this forum :)

So everything you noticed is pure coincidence (except that the .com server is starting to saturate... investigations are underway to solve the saturation problem)

by I Citro » 13/05/07, 16:37

It is true that I sometimes have the impression that the .com is really slow... but since my ADSL throughput sometimes fluctuates between 300 kb/s and 3000 kb/s, I blamed it on my ISP... until I learned (I have my sources :? ) that the server is saturating. :?