Automated power-on of servers in a data center

Instead of writing your own monitoring solution, I strongly recommend using an existing tool, so that all the basic monitoring and alerting functionality is already implemented. If you choose Nagios, you get basic server and network resource monitoring for free, and the following plugins should give you most of the rest of what you need:

check_file_ages_in_dirs will tell you whether backup files exist; here is a blog post I wrote with some basic examples.

check_file can monitor file size and content (using regexes), so you can write backup statistics to a file and monitor them.

One thing you won't get from Nagios is trending and graphing; I recommend looking at Munin for that, since it is simple to set up and, like Nagios, has stacks of contributed plugins.

3
asked 16 November 2011 at 19:44
5 answers

Some APC PDUs have configurable power-on delays. In APC's words:

Allows users to configure the sequence in which power is turned on or off for each outlet. This helps avoid in-rushes at start-up, which can cause overloaded circuits and dropped loads. Sequencing also allows users to predetermine which piece of equipment is turned on first so other equipment dependant on that unit will function properly.

That sounds like it might meet your needs.

1
answered 3 December 2019 at 04:59

On recent server hardware, you can set systems to power on automatically when power is restored. In addition, you can configure a fixed or random power-on delay (to avoid overloading the circuit). This is usually a BIOS setting, and it can help with restoring power in a particular order.

Outside of that, I'd always recommend a switched PDU (power distribution unit) for co-location facility deployments. Using one, you can have granular control over the power application and monitor/meter individual power ports. This can tie into your monitoring system.

2
answered 3 December 2019 at 04:59

Easiest case: All servers react to Wake on LAN. Wake them in the desired order and check if they are alive with Nagios or something similar.
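As a minimal sketch of the Wake on LAN step, the following builds and broadcasts a magic packet (six 0xFF bytes followed by the target MAC address repeated 16 times). The function names, the broadcast address, and UDP port 9 are assumptions chosen for illustration; your network may need a subnet-specific broadcast address.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the Wake-on-LAN magic packet for a MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"invalid MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)
    # Six 0xFF bytes followed by the MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str,
                      broadcast: str = "255.255.255.255",
                      port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling send_magic_packet once per server, in the desired order, covers the wake-up half; the liveness check is left to the monitoring system.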

If that doesn't work, you will need networked PDUs with at least one outlet for every server, e.g. from APC. Then you can replace the WOL part from above with turning on the outlets in the desired order. This might work with SNMP or something vendor-specific.
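A rough sketch of the ordering logic, independent of how the power is actually switched: the power_on callable is a hypothetical placeholder for whatever your PDU offers (an SNMP set, a vendor tool, or a Wake on LAN packet), and the liveness check here is just a TCP connection attempt to an assumed SSH port rather than a real Nagios check.

```python
import socket
import time
from typing import Callable, Iterable, List, Tuple


def is_alive(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (port 22 is an assumption)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def power_on_in_order(servers: Iterable[Tuple[str, int]],
                      power_on: Callable[[int], None],
                      check: Callable[[str], bool] = is_alive,
                      wait_seconds: float = 300.0,
                      poll_seconds: float = 10.0,
                      sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Power servers on one at a time; return the hosts that never came up.

    servers is a sequence of (host, outlet) pairs in the desired order.
    power_on is a hypothetical hook that switches one outlet on.
    """
    failed = []
    for host, outlet in servers:
        power_on(outlet)
        waited = 0.0
        # Poll until the host answers or the time budget is used up.
        while not check(host):
            if waited >= wait_seconds:
                failed.append(host)
                break
            sleep(poll_seconds)
            waited += poll_seconds
    return failed
```

This version records a failed host and moves on to the next one. If later servers depend strictly on earlier ones, stopping at the first failure may be the safer choice.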

3
answered 3 December 2019 at 04:59

The fire department, maybe. I'm not sure if it's a good idea to slam your power grid with that many systems powering up all at once...but I'm not an electrician.

At least, I don't know if I'd trust an automated system to do something like that.

1
answered 3 December 2019 at 04:59

You have a few possibilities.

Use Wake on LAN from a script that notifies you whether each server booted correctly or not.

Almost all recent servers have a management interface that lets you connect to the server remotely to manage the BIOS and boot options and to power it on. With HP it's iLO:

http://h18013.www1.hp.com/products/servers/management/remotemgmt.html

We currently have a setup that uses Zabbix. We have it configured to send email when a switch, server, or printer is offline. We also monitor our UPS so that, when the power level gets too low after a power failure, a shutdown command is sent to all our servers, ESXi hosts, VMs, switches, management consoles, routers, etc.

We then configured Zabbix to power up the servers in the order we wanted. We get a notification when a server doesn't boot correctly.

Took a bit of work but was worth it.

2
answered 3 December 2019 at 04:59
