Installing NVIDIA Drivers for Diskless Environment

I'm trying to set up a cluster of 8 computers plus a main file server. Ideally, I'd like to set this up as a PXE-booted, quasi-diskless/quasi-stateless environment (i.e. the only local storage is /var, where things like the Torque configuration will go). Each of the 8 compute nodes has 4 NVIDIA Tesla K40m GPUs, but the root file server has no GPU.

Ideally, I'd like to create the complete installation on the file server (at /node) and then PXE-boot the compute nodes from it, but I haven't found a way to install the NVIDIA drivers on a machine that has no NVIDIA GPU. I found one question on NVIDIA's forums where someone attempted this without success.

Alternatively, I could install the NVIDIA drivers on one of the compute nodes (one is currently running CentOS on its local disks) under, for example, /usr/local/nvidia, keep track of the files the installer creates, and build a tarball of them to copy into the file server installation.
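To make the "keep track of what files it creates" idea concrete, here is a minimal sketch, assuming the installer only adds new files. A before/after listing like this will not notice files that were modified in place. The function names and the /var/tmp paths are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# snapshot_files ROOT OUT
# Write a sorted list of every regular file and symlink under ROOT to OUT.
snapshot_files() {
    # -xdev keeps find on a single filesystem, so /proc, /sys and
    # network mounts below ROOT are not walked.
    find "$1" -xdev \( -type f -o -type l \) | LC_ALL=C sort > "$2"
}

# new_files BEFORE AFTER
# Print the paths that appear in AFTER but not in BEFORE.
new_files() {
    # comm -13 suppresses lines unique to BEFORE and lines common to both.
    LC_ALL=C comm -13 "$1" "$2"
}

# Hypothetical usage on the compute node (paths are placeholders):
#   snapshot_files / /var/tmp/before.list
#   ... run the NVIDIA installer here ...
#   snapshot_files / /var/tmp/after.list
#   new_files /var/tmp/before.list /var/tmp/after.list > /var/tmp/nvidia.list
#   tar -czf /var/tmp/nvidia-files.tar.gz -T /var/tmp/nvidia.list
# If the list files live on the scanned filesystem they will show up as
# "new" too, so review nvidia.list before building the tarball.
```

The kernel module built this way matches the kernel of the node it was built on, so the image served from /node would need to run the same kernel version.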

Lastly, I could just maintain eight separate installations, but I don't like this from a long-term maintenance perspective (each compute node will be running Torque jobs, so I'd like the nodes to look more or less identical).

In summary, what I'm asking for is this:

  1. Can I install the NVIDIA drivers without an NVIDIA GPU on board?
  2. Is there some other way I should be going about this?

For reference, we're running CentOS 7.

[root@compute-3 /]# uname -a
Linux compute-3 3.10.0-514.2.2.el7.x86_64 #1 SMP Tue Dec 6 23:06:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
asked 16 January 2017 at 01:20
1 answer

Use RPM packages, as you would for everything else.

At the moment, the best NVIDIA driver packages come from Negativo17.
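As a rough sketch of what the RPM route could look like on the file server: yum can install packages into an alternate root with --installroot, which does not require a GPU on the machine doing the install. The repo file URL and the package names below are assumptions about the Negativo17 repository and should be checked against its own documentation. NODE_ROOT, REPO_URL, RUN and the function name are hypothetical. By default the script only prints the commands.

```shell
#!/bin/sh
# Assumptions, not taken from the answer: repo URL and package names.
NODE_ROOT=${NODE_ROOT:-/node}
REPO_URL=${REPO_URL:-https://negativo17.org/repos/epel-nvidia.repo}
# RUN=echo prints each command instead of running it (dry run).
# Set RUN to an empty string to execute for real.
RUN=${RUN-echo}

install_driver_into_image() {
    # Put the repo definition inside the image so yum finds it there.
    $RUN curl -fsSL -o "$NODE_ROOT/etc/yum.repos.d/epel-nvidia.repo" "$REPO_URL"
    # Install into the image tree rather than the file server's own root.
    $RUN yum --installroot="$NODE_ROOT" -y install nvidia-driver nvidia-driver-cuda
}

# Example: print what would be run.
#   install_driver_into_image
```

The kernel module package still has to match the kernel the nodes boot (3.10.0-514.2.2.el7 in the question), so a matching kernel module or DKMS package would be needed in the image as well.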

answered 3 December 2019 at 11:30
