Optimal ARC and L2ARC settings for purpose-specific storage applications

I am configuring a server that runs three ZFS pools, two of which are rather purpose-specific, and I feel the default recommendations are simply not optimized for them. Networking is handled by dual 10 Gbit adapters.

Pool 1 is bulk file storage. It contains raw video data that is rarely written or read, plus occasional backups. There is no point in caching anything from this pool: it is high-bandwidth data that is read in one sweep from beginning to end, so caching any of it would be a complete waste of memory. Latency is not much of an issue, and bandwidth is ample because the data is highly compressible. The pool is made of 8 HDDs in RAID-Z2, with 24 TB of usable capacity.

Pool 2 stores compressed video frames. Portions of this content are read frequently when compositing video projects. The frequently used portion is usually larger than the server's total RAM. There is a low-latency requirement, though not an ultra-low one; bandwidth is more important. The pool is made of 3 HDDs in RAID-Z1, with 8 TB of usable capacity, plus a 1 TB NVMe SSD for L2ARC.

Pool 3 is general storage for several computer systems that boot and run software from it rather than from local storage. Since it serves several machines as their primary system storage, its latency and bandwidth requirements are the highest. This pool is mostly read from; writes are limited to what the client systems do. The pool is made of 3 SATA SSDs in RAID-Z1, with 1 TB of usable capacity.

My goal is to minimize the ARC used by the first two pools in order to maximize the ARC available to the third.

Pool 1 has no benefit from caching whatsoever, so what's the minimum safe amount of ARC I can set for it?

Pool 2 can benefit from ARC but it is not really worth it, as L2ARC is fast enough for the purpose and the drive has 1 TB of capacity. Ideally, I would be happy if I could get away without using any ARC for this volume, and using the full terabyte of L2ARC, but it seems that at least some ARC is needed for L2ARC header data.

So, given an L2ARC capacity of 1 TB and a pool record size of 64 KB, 1 TB / 64 KB × 70 B gives me ~0.995 GB. Does this mean I can safely cap the ARC for that pool at 1 GB? Or does it need more?
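The estimate above can be reproduced with a few lines of shell arithmetic. This is only a sketch: the 70 bytes per header is the figure used in the question and is treated here as an assumption (the actual header size depends on the ZFS version), and `l2arc_header_bytes` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Estimate the ARC memory consumed by L2ARC headers.
# Assumption: 70 bytes per cached record, taken from the question;
# the real per-header size varies between ZFS versions.
l2arc_header_bytes() {
    local l2arc_bytes=$1 recordsize_bytes=$2 header_bytes=${3:-70}
    echo $(( l2arc_bytes / recordsize_bytes * header_bytes ))
}

decimal_tb=$(( 10**12 ))      # 1 TB as drive vendors count it
binary_tib=$(( 1 << 40 ))     # 1 TiB
recordsize=$(( 64 * 1024 ))   # 64 KiB records

l2arc_header_bytes "$decimal_tb" "$recordsize"   # prints 1068115230
l2arc_header_bytes "$binary_tib" "$recordsize"   # prints 1174405120
```

The first result is about 0.995 GiB, which matches the figure above; the second is about 1.09 GiB. Both assume every cached block is a full 64 KiB record, so blocks that are smaller on disk (compressed records, small files) would mean more headers and more ARC used for them.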

It seems that the ARC holds both the read cache and the information needed to manage the L2ARC, so what I need is some option that prioritizes managing a larger L2ARC over caching actual data in RAM. And, if necessary, one that mandates that anything evicted from ARC is moved to L2ARC, in case the eviction policies do not abide by the usual caching hierarchy.

The general recommendations I've read suggest about 1 GB of RAM per 1 TB of storage. I am planning 32 GB of RAM for 33 TB of storage, so I am almost dead on. They also suggest a ratio of 4 or 5 to 1 for L2ARC versus ARC, which I fall short of by quite a lot. The goal is to cut pool 1's ARC as low as possible, and to cut pool 2's ARC to only as much as it needs to use the whole 1 TB of L2ARC, in order to maximize the RAM available for pool 3's ARC.

asked 24 March 2018 at 22:59
1 answer

First, I really suggest you rethink the layout of pools #2 and #3: a 3-way mirror will give you neither low latency nor high bandwidth. Rather than an expensive 1 TB NVMe disk for L2ARC (which, by the way, is unbalanced given the small 32 GB ARC), I would use more 7200 RPM disks in a RAID10 layout, or even cheaper but reliable SSDs (e.g. Samsung 850 Pro/Evo or Crucial MX500).

At the very least, you could put all the disks into a single RAID10 pool (with an SSD L2ARC) and segment that single pool by means of multiple datasets.

That said, you can specify how ARC/L2ARC should be used on a dataset-by-dataset basis with the primarycache and secondarycache options:

  • zfs set primarycache=none <dataset>; zfs set secondarycache=none <dataset> will disable any ARC/L2ARC caching for the dataset. You can also issue zfs set logbias=throughput <dataset> to favor throughput over latency during write operations;
  • zfs set primarycache=metadata <dataset> will enable metadata-only caching for the second dataset. Note that L2ARC is fed from the ARC; this means that if the ARC caches only metadata, the same will be true for L2ARC;
  • leave the ARC/L2ARC options at their defaults for the third dataset.
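Put together, the three bullets above might look like the following sketch. The dataset names (tank/raw, tank/frames, tank/boot) and the `run` and `cache_policy` helpers are hypothetical and only for illustration. The script is a dry run by default: it prints the commands and executes them only when APPLY=1 is set.

```shell
#!/usr/bin/env bash
# Per-dataset cache policy following the three bullets above.
# Assumption: tank/raw, tank/frames and tank/boot are hypothetical dataset names.
# Dry run by default: commands are printed; set APPLY=1 to actually run them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

cache_policy() {
    # Dataset 1: streaming data, no ARC or L2ARC caching, throughput-biased writes
    run zfs set primarycache=none tank/raw
    run zfs set secondarycache=none tank/raw
    run zfs set logbias=throughput tank/raw
    # Dataset 2: keep only metadata in ARC (L2ARC then holds only metadata too)
    run zfs set primarycache=metadata tank/frames
    # Dataset 3: nothing to change; just show the current (default) values
    run zfs get primarycache,secondarycache tank/boot
}

cache_policy
```

Since these are ordinary dataset properties, they can be changed again later with another zfs set if the workload on a dataset changes.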

Finally, you can configure your ZFS instance to use more than the default 50% of your RAM for ARC (look for zfs_arc_max in the module man page).
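As a rough illustration of the zfs_arc_max setting, the sketch below prints a module options line for a given ARC cap. It assumes OpenZFS on Linux, where zfs_arc_max is a kernel module parameter expressed in bytes; the 24 GiB value and the `arc_max_option` helper are hypothetical, not a recommendation.

```shell
#!/usr/bin/env bash
# Print a modprobe options line that caps the ARC at a given number of GiB.
# Assumptions: OpenZFS on Linux; 24 GiB is an illustrative value only.
arc_max_option() {
    local gib=$1
    echo "options zfs zfs_arc_max=$(( gib * 1024 * 1024 * 1024 ))"
}

arc_max_option 24   # prints: options zfs zfs_arc_max=25769803776

# To persist across reboots, add the printed line to /etc/modprobe.d/zfs.conf (as root).
# To apply at runtime, write the byte value to
# /sys/module/zfs/parameters/zfs_arc_max (as root).
```

On a 32 GB machine this would leave roughly 8 GB for everything outside the ARC; how much headroom is really needed depends on what else runs on the server.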

answered 3 December 2019 at 06:26
