Bab 12. Administrasi Tingkat Lanjut

Bab ini meninjau kembali beberapa aspek yang telah kami uraikan, dengan perspektif yang berbeda: alih-alih memasang pada sebuah komputer, kita akan mempelajari sistem deployment masal; alih-alih membuat volume RAID atau LVM pada saat instalasi, kita akan belajar melakukannya secara manual sehingga nanti kita dapat merevisi pilihan awal kita. Akhirnya, kita akan mendiskusikan perkakas pemantauan dan teknik virtualisasi. Sebagai konsekuensinya, bab ini secara lebih khusus menarget para administrator profesional, and sedikit kurang brfokus pada para individu yang bertanggungjawab atas jaringan rumahan mereka.

12.1. RAID dan LVM

Bab 4, Instalasi mempresentasikan teknologi ini dari sudut pandang pemasang, dan bagaimana itu mengintegrasikan mereka untuk membuat deployment mereka mudah dari awal. Setelah instalasi awal, seorang administrator mesti bisa menangani keperluan ruang penyimpanan yang berkembang tanpa mesti mengandalkan instalasi ulang yang mahal. Maka mereka mesti paham peralatan yang diperlukan untuk memanipulasi volume RAID dan LVM.

RAID dan LVM adalah teknik untuk mengabstrakkan volume yang dikait dari pasangan fisik mereka (yaitu hard disk atau partisi); yang pertama mengamankan data dari kegagalan perangkat keras dengan memperkenalkan redundansi, yang belakangan membuat manajemen volume lebih luwes dan tak bergantung kepada ukuran sebenarnya dari disk yang mendasarinya. Dalam kedua kasus, sistem pada akhirnya mendapat perangkat blok baru, yang dapat dipakai untuk membuat sistem berkas atau ruang swap, tanpa perlu mereka dipetakan ke satu disk fisik. RAID dan LVM datang dari latar belakang yang cukup berbeda, tapi fungsionalitas mereka sebagian dapat bertumpang tindih, sehingga mereka sering disinggung bersama-sama.

PERSPEKTIF Btrfs menggabung LVM dan RAID

While LVM and RAID are two distinct kernel subsystems that come between the disk block devices and their filesystems, btrfs is a filesystem, initially developed at Oracle, that purports to combine the featuresets of LVM and RAID and much more.

→ https://btrfs.wiki.kernel.org/index.php/Main_Page

Diantara fitur yang menarik adalah kemampuan membuat snapshot dari suatu pohon sistem berkas pada sebarang waktu. Snapshot ini pada awalnya tak memakai sebarang ruang disk, data hanya diduplikasi ketika satu dari salinan-salinan dimodifikasi. Sistem berkas juga menangani kompresi transparan dari berkas, dan checksum memastikan integritas dari semua data yang disimpan.

Pada kedua kasus RAID dan LVM, kernel menyediakan suatu berkas perangkat blok, mirip dengan yang berkaitan dengan suatu hard disk atau suatu partisi. Ketika suatu aplikasi, atau bagian lain dari kernel, meminta akses ke suatu blok dari perangkat seperti itu, subsistem yang sesuai mengarahkan blok ke lapisan fisik yang relevan. Bergantung kepada konfigurasi, blok ini dapat disimpan pada satu atau beberapa disk fisik, dan lokasi fisiknya mungkin tak berkorelasi langsung ke lokasi blok dalam perangkat lojik.

12.1.1. RAID Perangkat Lunak

RAID means Redundant Array of Independent Disks. The goal of this system is to prevent data loss in case of hard disk failure. The general principle is quite simple: data are stored on several physical disks instead of only one, with a configurable level of redundancy. Depending on this amount of redundancy, and even in the event of an unexpected disk failure, data can be losslessly reconstructed from the remaining disks.

RAID can be implemented either by dedicated hardware (RAID modules integrated into SCSI or SATA controller cards) or by software abstraction (the kernel). Whether hardware or software, a RAID system with enough redundancy can transparently stay operational when a disk fails; the upper layers of the stack (applications) can even keep accessing the data in spite of the failure. Of course, this “degraded mode” can have an impact on performance, and redundancy is reduced, so a further disk failure can lead to data loss. In practice, therefore, one will strive to only stay in this degraded mode for as long as it takes to replace the failed disk. Once the new disk is in place, the RAID system can reconstruct the required data so as to return to a safe mode. The applications won't notice anything, apart from potentially reduced access speed, while the array is in degraded mode or during the reconstruction phase.

When RAID is implemented by hardware, its configuration generally happens within the BIOS setup tool, and the kernel will consider a RAID array as a single disk, which will work as a standard physical disk, although the device name may be different (depending on the driver).

Kami hanya berfokus pada RAID perangkat lunak dalam buku ini.

12.1.1.1. Tingkat-tingkat RAID

RAID is actually not a single system, but a range of systems identified by their levels; the levels differ by their layout and the amount of redundancy they provide. The more redundant, the more failure-proof, since the system will be able to keep working with more failed disks. The counterpart is that the usable space shrinks for a given set of disks; seen the other way, more disks will be needed to store a given amount of data.

RAID Linier: Even though the kernel's RAID subsystem allows creating “linear RAID”, this is not proper RAID, since this setup doesn't involve any redundancy. The kernel merely aggregates several disks end-to-end and provides the resulting aggregated volume as one virtual disk (one block device). That's about its only function. This setup is rarely used by itself (see later for the exceptions), especially since the lack of redundancy means that one disk failing makes the whole aggregate, and therefore all the data, unavailable.
RAID-0: This level doesn't provide any redundancy either, but disks aren't simply stuck on end one after another: they are divided in stripes, and the blocks on the virtual device are stored on stripes on alternating physical disks. In a two-disk RAID-0 setup, for instance, even-numbered blocks of the virtual device will be stored on the first physical disk, while odd-numbered blocks will end up on the second physical disk.
This system doesn't aim at increasing reliability, since (as in the linear case) the availability of all the data is jeopardized as soon as one disk fails, but at increasing performance: during sequential access to large amounts of contiguous data, the kernel will be able to read from both disks (or write to them) in parallel, which increases the data transfer rate. The disks are utilized entirely by the RAID device, so they should have the same size not to lose performance.
RAID-0 use is shrinking, its niche being filled by LVM (see later).
RAID-1: This level, also known as “RAID mirroring”, is both the simplest and the most widely used setup. In its standard form, it uses two physical disks of the same size, and provides a logical volume of the same size again. Data are stored identically on both disks, hence the “mirror” nickname. When one disk fails, the data is still available on the other. For really critical data, RAID-1 can of course be set up on more than two disks, with a direct impact on the ratio of hardware cost versus available payload space.
CATATAN Ukuran klaster dan disk
If two disks of different sizes are set up in a mirror, the bigger one will not be fully used, since it will contain the same data as the smallest one and nothing more. The useful available space provided by a RAID-1 volume therefore matches the size of the smallest disk in the array. This still holds for RAID volumes with a higher RAID level, even though redundancy is stored differently.
It is therefore important, when setting up RAID arrays (except for RAID-0 and “linear RAID”), to only assemble disks of identical, or very close, sizes, to avoid wasting resources.
CATATAN Disk cadangan
RAID levels that include redundancy allow assigning more disks than required to an array. The extra disks are used as spares when one of the main disks fails. For instance, in a mirror of two disks plus one spare, if one of the first two disks fails, the kernel will automatically (and immediately) reconstruct the mirror using the spare disk, so that redundancy stays assured after the reconstruction time. This can be used as another kind of safeguard for critical data.
One would be forgiven for wondering how this is better than simply mirroring on three disks to start with. The advantage of the “spare disk” configuration is that the spare disk can be shared across several RAID volumes. For instance, one can have three mirrored volumes, with redundancy assured even in the event of one disk failure, with only seven disks (three pairs, plus one shared spare), instead of the nine disks that would be required by three triplets.
This RAID level, although expensive (since only half of the physical storage space, at best, is useful), is widely used in practice. It is simple to understand, and it allows very simple backups: since both disks have identical contents, one of them can be temporarily extracted with no impact on the working system. Read performance is often increased since the kernel can read half of the data on each disk in parallel, while write performance isn't too severely degraded. In case of a RAID-1 array of N disks, the data stays available even with N-1 disk failures.
RAID-4: This RAID level, not widely used, uses N disks to store useful data, and an extra disk to store redundancy information. If that disk fails, the system can reconstruct its contents from the other N. If one of the N data disks fails, the remaining N-1 combined with the “parity” disk contain enough information to reconstruct the required data.
RAID-4 isn't too expensive since it only involves a one-in-N increase in costs and has no noticeable impact on read performance, but writes are slowed down. Furthermore, since a write to any of the N disks also involves a write to the parity disk, the latter sees many more writes than the former, and its lifespan can shorten dramatically as a consequence. Data on a RAID-4 array is safe only up to one failed disk (of the N+1).
RAID-5: RAID-5 menjawab masalah asimetri dari RAID-4: blok paritas disebar ke seluruh N+1 disk, tanpa ada satu disk yang memiliki peran tertentu.
Kinerja baca dan tulis identik dengan RAID-4. Di sini, sistem tetap berfungsi bila satu disk (dari N+1) gagal, tapi tak boleh lebih.
RAID-6: RAID-6 dapat dianggap perluasan dari RAID-5, dimana setiap seri N blok melibatkan dua blok redundansi, dan setiap seri N+2 blok disebar ke N+2 disk.
This RAID level is slightly more expensive than the previous two, but it brings some extra safety since up to two drives (of the N+2) can fail without compromising data availability. The counterpart is that write operations now involve writing one data block and two redundancy blocks, which makes them even slower.
RAID-1+0: This isn't strictly speaking, a RAID level, but a stacking of two RAID groupings. Starting from 2×N disks, one first sets them up by pairs into N RAID-1 volumes; these N volumes are then aggregated into one, either by “linear RAID” or (increasingly) by LVM. This last case goes farther than pure RAID, but there's no problem with that.
RAID-1+0 can survive multiple disk failures: up to N in the 2×N array described above, provided that at least one disk keeps working in each of the RAID-1 pairs.
LEBIH JAUH RAID-10
RAID-10 is generally considered a synonym of RAID-1+0, but a Linux specificity makes it actually a generalization. This setup allows a system where each block is stored on two different disks, even with an odd number of disks, the copies being spread out along a configurable model.
Performances will vary depending on the chosen repartition model and redundancy level, and of the workload of the logical volume.

Obviously, the RAID level will be chosen according to the constraints and requirements of each application. Note that a single computer can have several distinct RAID arrays with different configurations.

12.1.1.2. Menyiapkan RAID

Setting up RAID volumes requires the mdadm package; it provides the mdadm command, which allows creating and manipulating RAID arrays, as well as scripts and tools integrating it to the rest of the system, including the monitoring system.

Our example will be a server with a number of disks, some of which are already used, the rest being available to setup RAID. We initially have the following disks and partitions:

the sdb disk, 4 GB, is entirely available;
the sdc disk, 4 GB, is also entirely available;
on the sdd disk, only partition sdd2 (about 4 GB) is available;
finally, a sde disk, still 4 GB, entirely available.

We're going to use these physical elements to build two volumes, one RAID-0 and one mirror (RAID-1). Let's start with the RAID-0 volume:

# mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
# mdadm --query /dev/md0
/dev/md0: 8.00GiB raid0 2 devices, 0 spares. Use mdadm --detail for more detail.
# mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Tue Jun 25 08:47:49 2019
        Raid Level : raid0
        Array Size : 8378368 (7.99 GiB 8.58 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Tue Jun 25 08:47:49 2019
             State : clean 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

        Chunk Size : 512K

Consistency Policy : none

              Name : mirwiz:0  (local to host debian)
              UUID : 146e104f:66ccc06d:71c262d7:9af1fbc7
            Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       32        0      active sync   /dev/sdb
       1       8       48        1      active sync   /dev/sdc
# mkfs.ext4 /dev/md0
mke2fs 1.44.5 (15-Dec-2018)
Discarding device blocks: done                            
Creating filesystem with 2094592 4k blocks and 524288 inodes
Filesystem UUID: 413c3dff-ab5e-44e7-ad34-cf1a029cfe98
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done 

# mkdir /srv/raid-0
# mount /dev/md0 /srv/raid-0
# df -h /srv/raid-0
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        7.9G   36M  7.4G   1% /srv/raid-0

The mdadm --create command requires several parameters: the name of the volume to create (/dev/md*, with MD standing for Multiple Device), the RAID level, the number of disks (which is compulsory despite being mostly meaningful only with RAID-1 and above), and the physical drives to use. Once the device is created, we can use it like we'd use a normal partition, create a filesystem on it, mount that filesystem, and so on. Note that our creation of a RAID-0 volume on md0 is nothing but coincidence, and the numbering of the array doesn't need to be correlated to the chosen amount of redundancy. It's also possible to create named RAID arrays, by giving mdadm parameters such as /dev/md/linear instead of /dev/md0.

Creation of a RAID-1 follows a similar fashion, the differences only being noticeable after the creation:

# mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdd2 /dev/sde
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: largest drive (/dev/sdd2) exceeds size (4192192K) by more than 1%
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md1 started.
# mdadm --query /dev/md1
/dev/md1: 4.00GiB raid1 2 devices, 0 spares. Use mdadm --detail for more detail.
# mdadm --detail /dev/md1
/dev/md1:
           Version : 1.2
     Creation Time : Tue Jun 25 10:21:22 2019
        Raid Level : raid1
        Array Size : 4189184 (4.00 GiB 4.29 GB)
     Used Dev Size : 4189184 (4.00 GiB 4.29 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Tue Jun 25 10:22:03 2019
             State : clean, resyncing 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : resync

     Resync Status : 93% complete

              Name : mirwiz:1  (local to host debian)
              UUID : 7d123734:9677b7d6:72194f7d:9050771c
            Events : 16

    Number   Major   Minor   RaidDevice State
       0       8       64        0      active sync   /dev/sdd2
       1       8       80        1      active sync   /dev/sde
# mdadm --detail /dev/md1
/dev/md1:
[...]
          State : clean
[...]

A few remarks are in order. First, mdadm notices that the physical elements have different sizes; since this implies that some space will be lost on the bigger element, a confirmation is required.

More importantly, note the state of the mirror. The normal state of a RAID mirror is that both disks have exactly the same contents. However, nothing guarantees this is the case when the volume is first created. The RAID subsystem will therefore provide that guarantee itself, and there will be a synchronization phase as soon as the RAID device is created. After some time (the exact amount will depend on the actual size of the disks…), the RAID array switches to the “active” or “clean” state. Note that during this reconstruction phase, the mirror is in a degraded mode, and redundancy isn't assured. A disk failing during that risk window could lead to losing all the data. Large amounts of critical data, however, are rarely stored on a freshly created RAID array before its initial synchronization. Note that even in degraded mode, the /dev/md1 is usable, and a filesystem can be created on it, as well as some data copied on it.

TIPS Memulai mirror dalam mode terdegradasi

Sometimes two disks are not immediately available when one wants to start a RAID-1 mirror, for instance because one of the disks one plans to include is already used to store the data one wants to move to the array. In such circumstances, it is possible to deliberately create a degraded RAID-1 array by passing missing instead of a device file as one of the arguments to mdadm. Once the data have been copied to the “mirror”, the old disk can be added to the array. A synchronization will then take place, giving us the redundancy that was wanted in the first place.

TIPS Menyiapkan cermin tanpa sinkronisasi

Volume RAID-1 sering dibuat untuk digunakan sebagai disk baru, sering dianggap kosong. Isi awal sebenarnya dari disk ini karena itu tidak sangat relevan, karena hanya perlu diketahui bahwa data setelah penciptaan volume, khususnya sistem berkas, dapat diakses setelahnya.

Karena itu orang mungkin bertanya-tanya tentang titik sinkronisasi kedua disk pada waktu penciptaan. Mengapa peduli apakah isi identik pada zona volume yang kita ketahui hanya dapat dibaca setelah kita telah menulis?

Untungnya, tahap sinkronisasi ini dapat dihindari dengan memberikan opsi --assume-clean untuk mdadm. Namun, pilihan ini dapat menyebabkan kejutan dalam kasus-kasus di mana data awal akan dibaca (misalnya jika sistem berkas tersebut sudah hadir pada disk fisik), itulah sebabnya itu tidak diaktifkan secara default.

Sekarang mari kita lihat apa yang terjadi ketika salah satu elemen array RAID-1 gagal. mdadm, khususnya opsi --fail, memungkinkan simulasi suatu kegagalan disk:

# mdadm /dev/md1 --fail /dev/sde
mdadm: set /dev/sde faulty in /dev/md1
# mdadm --detail /dev/md1
/dev/md1:
[...]
       Update Time : Tue Jun 25 11:03:44 2019
             State : clean, degraded 
    Active Devices : 1
   Working Devices : 1
    Failed Devices : 1
     Spare Devices : 0

Consistency Policy : resync

              Name : mirwiz:1  (local to host debian)
              UUID : 7d123734:9677b7d6:72194f7d:9050771c
            Events : 20

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       1       8       80        1      active sync   /dev/sdd2

       0       8       64        -      faulty   /dev/sde

Isi dari volume masih dapat diakses (dan, jika dipasang, aplikasi tidak menyadari apapun), tapi keselamatan data tidak dijamin lagi: seandainya sdd disk gagal bergantian, data akan hilang. Kami ingin menghindari risiko, jadi kami akan mengganti disk yang gagal dengan yang baru, sdf:

# mdadm /dev/md1 --add /dev/sdf
mdadm: added /dev/sdf
# mdadm --detail /dev/md1
/dev/md1:
[...]
      Raid Devices : 2
     Total Devices : 3
       Persistence : Superblock is persistent

       Update Time : Tue Jun 25 11:09:42 2019
             State : clean, degraded, recovering 
    Active Devices : 1
   Working Devices : 2
    Failed Devices : 1
     Spare Devices : 1

Consistency Policy : resync

    Rebuild Status : 27% complete

              Name : mirwiz:1  (local to host debian)
              UUID : 7d123734:9677b7d6:72194f7d:9050771c
            Events : 26

    Number   Major   Minor   RaidDevice State
       2       8       96        0      spare rebuilding   /dev/sdf
       1       8       80        1      active sync   /dev/sdd2

       0       8       64        -      faulty   /dev/sde
# [...]
[...]
# mdadm --detail /dev/md1
/dev/md1:
[...]
       Update Time : Tue Jun 25 11:10:47 2019
             State : clean 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 1
     Spare Devices : 0

Consistency Policy : resync

              Name : mirwiz:1  (local to host debian)
              UUID : 7d123734:9677b7d6:72194f7d:9050771c
            Events : 39

    Number   Major   Minor   RaidDevice State
       2       8       96        0      active sync   /dev/sdd2
       1       8       80        1      active sync   /dev/sdf

       0       8       64        -      faulty   /dev/sde

Di sini lagi, kernel secara otomatis memicu tahap rekonstruksi yang ketika berlangsung, meskipun volume masih dapat diakses, berada dalam mode terdegradasi. Setelah rekonstruksi berakhir, array RAID kembali ke keadaan normal. Kita kemudian dapat memberitahu ke sistem bahwa disk sde akan dihapus dari array, sehingga berakhir dengan RAID mirror klasik pada dua disk:

# mdadm /dev/md1 --remove /dev/sde
mdadm: hot removed /dev/sde from /dev/md1
# mdadm --detail /dev/md1
/dev/md1:
[...]
    Number   Major   Minor   RaidDevice State
       2       8       96        0      active sync   /dev/sdd2
       1       8       80        1      active sync   /dev/sdf

Selanjutnya drive dapat secara fisik dicabut saat server berikutnya dimatikan, atau bahkan dicabut saat menyala ketika konfigurasi hardware mengizinkan hot-swap. Konfigurasi tersebut termasuk beberapa pengendali SCSI, kebanyakan disk SATA, dan drive eksternal yang beroperasi pada USB atau Firewire.

12.1.1.3. Mem-back up Konfigurasi

Most of the meta-data concerning RAID volumes are saved directly on the disks that make up these arrays, so that the kernel can detect the arrays and their components and assemble them automatically when the system starts up. However, backing up this configuration is encouraged, because this detection isn't fail-proof, and it is only expected that it will fail precisely in sensitive circumstances. In our example, if the sde disk failure had been real (instead of simulated) and the system had been restarted without removing this sde disk, this disk could start working again due to having been probed during the reboot. The kernel would then have three physical elements, each claiming to contain half of the same RAID volume. Another source of confusion can come when RAID volumes from two servers are consolidated onto one server only. If these arrays were running normally before the disks were moved, the kernel would be able to detect and reassemble the pairs properly; but if the moved disks had been aggregated into an md1 on the old server, and the new server already has an md1, one of the mirrors would be renamed.

Backing up the configuration is therefore important, if only for reference. The standard way to do it is by editing the /etc/mdadm/mdadm.conf file, an example of which is listed here:

Contoh 12.1. berkas konfigurasi mdadm

# mdadm.conf
#
# !NB! Run update-initramfs -u after updating this file.
# !NB! This will ensure that initramfs has an uptodate copy.
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
DEVICE /dev/sd*

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 metadata=1.2 name=mirwiz:0 UUID=146e104f:66ccc06d:71c262d7:9af1fbc7
ARRAY /dev/md1 metadata=1.2 name=mirwiz:1 UUID=7d123734:9677b7d6:72194f7d:9050771c

# This configuration was auto-generated on Tue, 25 Jun 2019 07:54:35 -0400 by mkconf

One of the most useful details is the DEVICE option, which lists the devices where the system will automatically look for components of RAID volumes at start-up time. In our example, we replaced the default value, partitions containers, with an explicit list of device files, since we chose to use entire disks and not only partitions, for some volumes.

Dua baris terakhir dalam contoh kita adalah yang memungkinkan kernel untuk secara aman memilih nomor volume yang ditetapkan ke array mana. Metadata yang tersimpan pada disk itu sendiri cukup untuk membangun kembali volume, tetapi tidak untuk menentukan nomor volume (dan nama perangkat /dev/md* yang cocok).

Untungnya, baris-baris ini dapat dihasilkan secara otomatis:

# mdadm --misc --detail --brief /dev/md?
ARRAY /dev/md0 metadata=1.2 name=mirwiz:0 UUID=146e104f:66ccc06d:71c262d7:9af1fbc7
ARRAY /dev/md1 metadata=1.2 name=mirwiz:1 UUID=7d123734:9677b7d6:72194f7d:9050771c

Isi dari dua baris terakhir ini tidak tergantung pada daftar disk yang disertakan dalam volume. Maka tidak diperlukan untuk meregenerasi baris-baris ini ketika menggantikan disk gagal dengan yang baru. Di sisi lain, perawatan harus diambil untuk memperbarui berkas ketika membuat atau menghapus sebuah array RAID.

12.1.2. LVM

LVM, Logical Volume Manager, adalah pendekatan lain untuk mengabstrakkan volume logis dari dukungan fisik mereka, yang berfokus pada peningkatan fleksibilitas daripada meningkatkan kehandalan. LVM dapat mengubah volume logis secara transparan bagi aplikasi; sebagai contoh, sangat mungkin untuk menambahkan disk baru, memigrasi data ke mereka, dan menghapus disk lama, tanpa melepas kait volume.

12.1.2.1. Konsep LVM

Fleksibilitas ini dicapai dengan tingkat abstraksi yang melibatkan tiga konsep.

Pertama, PV (Physical Volume) adalah entitas terdekat dengan perangkat keras: itu bisa berupa partisi pada disk atau seluruh disk, atau bahkan perangkat blok lain (termasuk, sebagai contoh, sebuah array RAID). Perhatikan bahwa ketika sebuah elemen fisik diatur hingga menjadi PV untuk LVM, itu mesti hanya diakses melalui LVM, jika tidak sistem akan bingung.

Sejumlah PV dapat dikumpulkan dalam VG (Volume Group), yang dapat dibandingkan dengan disk virtual dan extensible. VG abstrak, dan tidak muncul dalam perangkat berkas di hirarki /dev, sehingga tidak ada risiko menggunakan mereka secara langsung.

Jenis ke tiga objek adalah LV (Logical Volume), yang berupa potongan dari suatu VG; jika kita memakai analogi VG-sebagai-disk, LV setara dengan partisi. LV muncul sebagai perangkat blok dengan entri di /dev, dan dapat digunakan seperti setiap partisi fisik lainnya dapat (paling sering, mewadahi sebuah sistem berkas atau ruang swap).

The important thing is that the splitting of a VG into LVs is entirely independent of its physical components (the PVs). A VG with only a single physical component (a disk for instance) can be split into a dozen logical volumes; similarly, a VG can use several physical disks and appear as a single large logical volume. The only constraint, obviously, is that the total size allocated to LVs can't be bigger than the total capacity of the PVs in the volume group.

Namun sering masuk akal untuk memiliki semacam keseragaman antara komponen fisik VG, dan untuk membagi VG menjadi volume logis yang akan memiliki pola penggunaan serupa. Misalnya, jika perangkat keras yang tersedia termasuk disk cepat dan disk lambat, yang cepat dapat dikelompokkan ke satu VG dan yang lambat ke lain; potongan pertama dapat kemudian ditugaskan untuk aplikasi yang membutuhkan akses data yang cepat, sementara yang kedua akan disimpan untuk tugas-tugas yang kurang menuntut.

Dalam kasus apapun, perlu diingat bahwa LV tidak perlu melekat ke PV manapun. Dimungkinkan untuk mempengaruhi mana data dari LV secara fisik disimpan, tapi kemungkinan ini tidak diperlukan untuk penggunaan sehari-hari. Sebaliknya: ketika set komponen fisik VG berkembang, lokasi penyimpanan fisik yang sesuai dengan LV tertentu dapat bermigrasi di seluruh disk (dan tentu saja tetap di dalam PVs yang ditugaskan untuk VG).

12.1.2.2. Menyiapkan LVM

Mari kita sekarang ikuti, langkah demi langkah, proses pengaturan LVM untuk kasus penggunaan yang khas: kami ingin menyederhanakan situasi kompleks penyimpanan. Situasi seperti ini biasanya terjadi setelah beberapa sejarah yang panjang dan berbelit dari akumulasi langkah-langkah sementara. Untuk tujuan ilustrasi, kami akan mempertimbangkan server yang kebutuhan penyimpanannya telah berubah dari waktu ke waktu, berakhir dalam labirin dari partisi-partisi yang terpecah ke beberapa disk yang terpakai sebagian. Secara lebih konkret, partisi berikut tersedia:

pada disk sdb, sebuah partisi sdb2, 4 GB;
pada disk sdc, sebuah partisi sdc3, 3 GB;
disk sdd, 4 GB, sepenuhnya tersedia;
pada disk sdf, partisi sdf1, 4 GB; dan partisi sdf2, 5 GB.

Selain itu, mari kita asumsikan bahwa disk sdb dan sdf adalah lebih cepat daripada dua lainnya.

Tujuan kami adalah untuk mengatur tiga volume logis untuk tiga aplikasi yang berbeda: server berkas memerlukan ruang penyimpanan 5 GB, sebuah basis data (1 GB) dan ruang untuk back-up (12 GB). Dua yang pertama perlu kinerja yang baik, tapi back-up kurang kritis dalam hal kecepatan akses. Semua kendala ini mencegah penggunaan partisi sendirian; menggunakan LVM dapat mengabstraksi ukuran fisik dari perangkat, sehingga satu-satunya batas adalah jumlah ruang yang tersedia.

Alat-alat yang diperlukan ada dalam paket lvm2 dan dependensinya. Ketika mereka sedang diinstal, pengaturan LVM mengambil tiga langkah, cocok dengan konsep tiga tingkat.

Pertama, kami siapkan volume fisik menggunakan pvcreate:

# pvdisplay
# pvcreate /dev/sdb2
  Physical volume "/dev/sdb2" successfully created.
# pvdisplay
  "/dev/sdb2" is a new physical volume of "4.00 GiB"
  --- NEW Physical volume ---
  PV Name               /dev/sdb2
  VG Name               
  PV Size               4.00 GiB
  Allocatable           NO
  PE Size               0   
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               X66yQI-Q0Jk-3wOV-xmLa-wpqo-km9c-PmNOCW

# for i in sdc3 sdd sdf1 sdf2 ; do pvcreate /dev/$i ; done
  Physical volume "/dev/sdc3" successfully created.
  Physical volume "/dev/sdd" successfully created.
  Physical volume "/dev/sdf1" successfully created.
  Physical volume "/dev/sdf2" successfully created.
# pvdisplay -C
  PV         VG Fmt  Attr PSize PFree
  /dev/sdb2     lvm2 ---  4.00g 4.00g
  /dev/sdc3     lvm2 ---  3.00g 3.00g
  /dev/sdd      lvm2 ---  4.00g 4.00g
  /dev/sdf1     lvm2 ---  4.00g 4.00g
  /dev/sdf2     lvm2 ---  5.00g 5.00g

Sejauh ini, masih baik; perhatikan bahwa PV dapat disiapkan pada seluruh disk maupun pada partisi individunya. Seperti yang ditunjukkan di atas, perintah pvdisplay menampilkan daftar PVs yang ada, dengan dua format keluaran mungkin.

Sekarang mari kita merakit elemen-elemen fisik ini menjadi VG menggunakan vgcreate. Kita akan mengumpulkan hanya PV-PV dari disk cepat ke VG vg_critical; VG lain, vg_normal, juga akan memuat elemen-elemen yang lebih lambat.

# vgdisplay
  No volume groups found
# vgcreate vg_critical /dev/sdb2 /dev/sdf1
  Volume group "vg_critical" successfully created
# vgdisplay
  --- Volume group ---
  VG Name               vg_critical
  System ID             
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               7.99 GiB
  PE Size               4.00 MiB
  Total PE              2046
  Alloc PE / Size       0 / 0   
  Free  PE / Size       2046 / 7.99 GiB
  VG UUID               CcVj0Q-GbJA-2k9u-W6lC-7fZc-jHgk-qzKdX8

# vgcreate vg_normal /dev/sdc3 /dev/sdd /dev/sdf2
  Volume group "vg_normal" successfully created
# vgdisplay -C
  VG          #PV #LV #SN Attr   VSize   VFree  
  vg_critical   2   0   0 wz--n-   7.99g   7.99g
  vg_normal     3   0   0 wz--n- <10.99g <10.99g

Here again, commands are rather straightforward (and vgdisplay proposes two output formats). Note that it is quite possible to use two partitions of the same physical disk into two different VGs. Note also that we used a vg_ prefix to name our VGs, but it is nothing more than a convention.

Kita sekarang memiliki dua "disk virtual", masing-masing berukuran sekitar 8 GB dan 12 GB. Mari kita sekarang mengukir mereka ke dalam "partisi virtual" (LV). Ini melibatkan perintah lvcreate, dan sintaks yang agak lebih kompleks:

# lvdisplay
# lvcreate -n lv_files -L 5G vg_critical
  Logical volume "lv_files" created.
# lvdisplay
  --- Logical volume ---
  LV Path                /dev/vg_critical/lv_files
  LV Name                lv_files
  VG Name                vg_critical
  LV UUID                njyynD-gMQx-4VD1-4dYV-GaKd-4HSp-ypRFKV
  LV Write Access        read/write
  LV Creation host, time mirwiz, 2019-06-25 12:49:09 -0400
  LV Status              available
  # open                 0
  LV Size                5.00 GiB
  Current LE             1280
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

# lvcreate -n lv_base -L 1G vg_critical
  Logical volume "lv_base" created.
# lvcreate -n lv_backups -L 11.99G vg_normal
  Logical volume "lv_backups" created.
# lvdisplay -C
  LV         VG          Attr     LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync  Convert
  lv_base    vg_critical -wi-a---  1.00g                                           
  lv_files   vg_critical -wi-a---  5.00g                                           
  lv_backups vg_normal   -wi-a--- 11.99g

Dua parameter diperlukan ketika membuat volume logis; mereka harus diberikan ke lvcreate sebagai opsi. Nama LV yang akan dibuat ditetapkan dengan opsi -n, dan ukurannya biasanya diberikan menggunakan opsi -L. Tentu saja kita juga perlu memberitahu ke perintah, VG mana yang dikenai operasi, maka diberikanlah parameter terakhir pada baris perintah.

LEBIH JAUH opsi-opsi lvcreate

The lvcreate command has several options to allow tweaking how the LV is created.

Mari kita pertama menjelaskan opsi -l, dengannya ukuran LV dapat diberikan sebagai cacah blok (sebagai lawan dari unit "manusia" yang kita digunakan di atas). Blok-blok ini (disebut PE, physical extents dalam istilah LVM) adalah unit-unit ruang penyimpanan yang bersebelahan di PV, dan mereka tidak dapat dipecah di LV. Ketika seseorang ingin menentukan ruang penyimpanan untuk LV secara cukup presisi, misalnya menggunakan seluruh ruang yang tersedia, opsi -l mungkin akan lebih disukai daripada -L.

Memungkinkan juga untuk menunjuk pada lokasi fisik LV, sehingga extent disimpan pada PV tertentu (tentu saja masih tetap ada di dalam yang ditugaskan untuk VG). Karena kita tahu bahwa sdb lebih cepat daripada sdf, kita mungkin ingin menyimpan lv_base di sana jika kita ingin memberikan keuntungan kepada server basis data dibandingkan dengan server berkas. Baris perintah menjadi: lvcreate -n lv_base -L 1G vg_critical /dev/sdb2. Perhatikan bahwa perintah ini bisa gagal jika PV tidak memiliki cukup extent bebas. Dalam contoh kita, kita mungkin harus membuat lv_base sebelum lv_files untuk menghindari situasi ini - atau membebaskan sebagian ruang di sdb2 dengan perintah pvmove.

Volume logis, sekali dibuat, akan menjadi berkas perangkat blok dalam /dev/mapper/:

# ls -l /dev/mapper
total 0
crw------- 1 root root 10, 236 Jun 10 16:52 control
lrwxrwxrwx 1 root root       7 Jun 10 17:05 vg_critical-lv_base -> ../dm-1
lrwxrwxrwx 1 root root       7 Jun 10 17:05 vg_critical-lv_files -> ../dm-0
lrwxrwxrwx 1 root root       7 Jun 10 17:05 vg_normal-lv_backups -> ../dm-2
# ls -l /dev/dm-*
brw-rw---T 1 root disk 253, 0 Jun 10 17:05 /dev/dm-0
brw-rw---- 1 root disk 253, 1 Jun 10 17:05 /dev/dm-1
brw-rw---- 1 root disk 253, 2 Jun 10 17:05 /dev/dm-2

CATATAN Mendeteksi otomatis volume LVM

Ketika komputer boot, unit layanan systemd lvm2-activation mengeksekusi vgchange -aay untuk "mengaktifkan" kelompok volume: memindai perangkat yang tersedia; yang telah diinisialisasi sebagai fisik untuk LVM didaftarkan ke subsistem LVM, yang berasal dari kelompok-kelompok volume dirakit, dan volume logis yang relevan dimulai dan dibuat tersedia. Karena itu tidak perlu menyunting berkas konfigurasi ketika membuat atau memodifikasi volume-volume LVM.

Namun, perlu diketahui bahwa tata letak elemen LVM (volume fisik dan logis, dan kelompok-kelompok volume) direkam cadang dalam /etc/lvm/backup, yang dapat berguna dalam hal ada masalah (atau hanya untuk sekedar mengintip di balik layar).

Untuk membuat semua lebih mudah, taut simbolik juga dibuat dalam direktori-direktori yang cocok dengan VG:

# ls -l /dev/vg_critical
total 0
lrwxrwxrwx 1 root root 7 Jun 10 17:05 lv_base -> ../dm-1
lrwxrwxrwx 1 root root 7 Jun 10 17:05 lv_files -> ../dm-0
# ls -l /dev/vg_normal
total 0
lrwxrwxrwx 1 root root 7 Jun 10 17:05 lv_backups -> ../dm-2

LV kemudian dapat digunakan persis seperti partisi standar:

# mkfs.ext4 /dev/vg_normal/lv_backups
mke2fs 1.44.5 (15-Dec-2018)
Discarding device blocks: done                            
Creating filesystem with 3145728 4k blocks and 786432 inodes
Filesystem UUID: bae1f819-1f74-4518-80f0-53e30b8ae88d
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done 

# mkdir /srv/backups
# mount /dev/vg_normal/lv_backups /srv/backups
# df -h /srv/backups
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/vg_normal-lv_backups   12G   41M   12G   1% /srv/backups
# [...]
[...]
# cat /etc/fstab
[...]
/dev/vg_critical/lv_base    /srv/base       ext4 defaults 0 2
/dev/vg_critical/lv_files   /srv/files      ext4 defaults 0 2
/dev/vg_normal/lv_backups   /srv/backups    ext4 defaults 0 2

Dari sudut pandang aplikasi, berbagai partisi kecil sekarang telah diabstrakkan ke dalam satu volume 12 GB besar, dengan nama yang lebih mudah.

12.1.2.3. LVM Dari Waktu Ke Waktu

Meskipun kemampuan untuk mengagregasi partisi atau disk fisik itu nyaman, ini bukanlah keuntungan utama yang dibawa oleh LVM. Fleksibilitas yang dibawanya terutama teramati seiring berjalannya waktu, ketika kebutuhan berevolusi. Dalam contoh kita, mari kita asumsikan bahwa berkas besar baru harus disimpan, dan bahwa LV yang didedikasikan untuk server berkas terlalu kecil untuk menampung mereka. Karena kita belum menggunakan seluruh ruang yang tersedia di vg_critical, kita bisa perbesar lv_files. Untuk tujuan tersebut, kita akan menggunakan perintah lvresize, lalu resize2fs untuk mengadaptasi sistem berkas:

# df -h /srv/files/
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/vg_critical-lv_files  4.9G   20M  4.6G   1% /srv/files
# lvdisplay -C vg_critical/lv_files
  LV       VG          Attr     LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync  Convert
  lv_files vg_critical -wi-ao-- 5.00g
# vgdisplay -C vg_critical
  VG          #PV #LV #SN Attr   VSize VFree
  vg_critical   2   2   0 wz--n- 7.99g 1.99g
# lvresize -L 6G vg_critical/lv_files
  Size of logical volume vg_critical/lv_files changed from 5.00 GiB (1280 extents) to 6.00 GiB (1536 extents).
  Logical volume vg_critical/lv_files successfully resized.
# lvdisplay -C vg_critical/lv_files
  LV       VG          Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv_files vg_critical -wi-ao---- 6.00g
# resize2fs /dev/vg_critical/lv_files
resize2fs 1.44.5 (15-Dec-2018)
Filesystem at /dev/vg_critical/lv_files is mounted on /srv/files; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 1
The filesystem on /dev/vg_critical/lv_files is now 1572864 (4k) blocks long.

# df -h /srv/files/
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/vg_critical-lv_files  5.9G   20M  5.6G   1% /srv/files

HATI-HATI Mengubah ukuran sistem berkas

Tidak semua sistem berkas dapat diubah ukurannya secara daring; mengubah ukuran volume oleh karena itu mungkin pertama memerlukan melepas kait sistem berkas dan mengait ulang setelah itu. Tentu saja, jika seseorang ingin mengecilkan ruang yang dialokasikan untuk LV, sistem berkas harus diperkecil dulu; urutan dibalik ketika perubahan ukuran untuk arah lain: volume logis harus diperbesar sebelum sistem berkas di atasnya. Hal ini cukup sederhana, karena kapanpun ukuran sistem tidak boleh lebih besar dari perangkat blok tempat dia berada (apakah perangkat berupa partisi fisik atau volume logis).

Sistem berkas ext3, ext4, dan xfs dapat diperbesar secara daring, tanpa melepas kain; menyusutkan memerlukan melepas kait. Sistem berkas reiserfs memungkinkan perubahan ukuran secara daring di kedua arah. ext2 tidak memungkinkan keduanya, dan selalu membutuhkan melepas kait.

Kita bisa melanjutkan dengan cara yang sama untuk memperbesar volume yang mewadai basis data, tapi kita telah mencapai batas ruang VG yang tersedia:

# df -h /srv/base/
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/vg_critical-lv_base  976M  2.6M  907M   1% /srv/base
# vgdisplay -C vg_critical
  VG          #PV #LV #SN Attr   VSize VFree   
  vg_critical   2   2   0 wz--n- 7.99g 92.00m

Tidak masalah, karena LVM memungkinkan menambahkan volume fisik ke grup volume yang ada. Misalnya, mungkin kita telah memperhatikan bahwa partisi sdb1, yang sejauh ini digunakan di luar LVM, hanya berisi arsip yang dapat dipindahkan ke lv_backups. Kita sekarang dapat mendaur ulang itu dan mengintegrasikannya ke grup volume, dan dengan demikian memperoleh kembali ruang bebas. Ini adalah tujuan dari perintah vgextend. Tentu saja, partisi harus disiapkan sebagai sebuah volume fisik terlebih dahulu. Setelah VG telah diperbesar, kita dapat menggunakan perintah sejenis seperti yang sebelumnya untuk menumbuhkan volume logis kemudian sistem berkasnya:

# pvcreate /dev/sdb1
  Physical volume "/dev/sdb1" successfully created.
# vgextend vg_critical /dev/sdb1
  Volume group "vg_critical" successfully extended
# vgdisplay -C vg_critical
  VG          #PV #LV #SN Attr   VSize VFree
  vg_critical   3   2   0 wz--n- 9.09g 1.09g
# [...]
[...]
# df -h /srv/base/
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/vg_critical-lv_base  2.0G  854M  1.1G  45% /srv/base

LEBIH JAUH LVM tingkat lanjut

LVM also caters for more advanced uses, where many details can be specified by hand. For instance, an administrator can tweak the size of the blocks that make up physical and logical volumes, as well as their physical layout. It is also possible to move blocks across PVs, for instance, to fine-tune performance or, in a more mundane way, to free a PV when one needs to extract the corresponding physical disk from the VG (whether to affect it to another VG or to remove it from LVM altogether). The manual pages describing the commands are generally clear and detailed. A good entry point is the lvm(8) manual page.

12.1.3. RAID atau LVM?

RAID dan LVM keduanya membawa keuntungan tak terbantahkan bila kita abaikan kasus sederhana komputer desktop dengan satu hard disk dengan pola penggunaan tidak berubah dari waktu ke waktu. Namun, RAID dan LVM mengambil arah yang berbeda, dengan tujuan divergen, dan sah-sah saja bertanya-tanya mana yang harus diambil. Jawaban paling tepat akan tentu saja tergantung pada kebutuhan saat ini dan masa mendatang.

Ada beberapa kasus sederhana dimana pertanyaan tidak benar-benar muncul. Jika kebutuhan adalah untuk mengamankan data terhadap kegagalan perangkat keras, maka jelas RAID akan disiapkan pada array disk, karena LVM tidak benar-benar menjawab masalah ini. Si sisi lain, jika kebutuhan adalah untuk skema penyimpanan yang fleksibel dimana volume dibuat independen terhadap tata letak fisik dari disk, RAID tidak banyak membantu dan LVM akan menjadi pilihan yang tepat.

CATATAN Jika kinerja penting…

Jika kecepatan masukan/keluar adalah esensinya, terutama dalam hal waktu akses, menggunakan LVM dan/atau RAID di salah satu dari banyak kombinasi mungkin memiliki dampak pada kinerja, dan ini mungkin mempengaruhi keputusan untuk memilih yang mana. Namun, perbedaan-perbedaan dalam kinerja benar-benar kecil, dan hanya terukur dalam beberapa kasus penggunaan. Jika kinerja penting, keuntungan terbaik yang diperoleh adalah menggunakan media penyimpanan bukan rotasi ( solid-state drive); biaya per megabyte mereka lebih tinggi daripada hard disk drive standar, dan kapasitas mereka biasanya lebih kecil, tapi mereka memberikan kinerja yang sangat baik untuk akses acak. Jika pola penggunaan mencakup banyak operasi keluaran/masukan yang terpencar di seluruh sistem berkas, misalnya untuk basis data tempat query-query yang kompleks rutin dijalankan, maka keuntungan dari menjalankan mereka pada SSD jauh lebih besar daripada apa pun yang bisa diperoleh dengan memilih LVM atas RAID atau sebaliknya. Dalam situasi ini, pilihan harus ditentukan oleh pertimbangan selain murni kecepatan, karena aspek kinerja paling mudah ditangani dengan menggunakan SSD.

The third notable use case is when one just wants to aggregate two disks into one volume, either for performance reasons or to have a single filesystem that is larger than any of the available disks. This case can be addressed both by a RAID-0 (or even linear-RAID) and by an LVM volume. When in this situation, and barring extra constraints (for instance, keeping in line with the rest of the computers if they only use RAID), the configuration of choice will often be LVM. The initial set up is barely more complex, and that slight increase in complexity more than makes up for the extra flexibility that LVM brings if the requirements change or if new disks need to be added.

Kemudian tentu saja, ada kasus penggunaan yang benar-benar menarik, dimana sistem penyimpanan perlu dibuat tahan terhadap kegagalan perangkat keras dan fleksibel tentang alokasi volume. RAID maupun LVM masing-masing dapat menjawab kedua persyaratan; ini adalah di mana kita menggunakan keduanya pada saat yang sama -- atau lebih tepatnya, satu di atas yang lain. Skema yang memiliki semua tapi belum menjadi standar karena RAID dan LVM telah mencapai kedewasaan untuk memastikan redundansi data pertama dengan pengelompokan disk dalam sejumlah kecil larik RAID besar, dan menggunakan larik RAID ini sebagai volume fisik LVM; partisi logis kemudian dapat ditoreh dari LV-LV ini untuk sistem berkas. Nilai jual konfigurasi ini adalah bahwa ketika sebuah disk gagal, hanya sejumlah kecil larik RAID yang perlu dibangun kembali, sehingga membatasi waktu yang dihabiskan oleh administrator untuk pemulihan.

Mari kita ambil contoh konkret: departemen hubungan masyarakat di Falcot Corp memerlukan sebuah workstation untuk penyuntingan video, tapi anggaran departemen tidak mengizinkan berinvestasi di perangkat keras kelas tinggi secara lengkap. Keputusan dibuat untuk mendukung perangkat keras yang khusus untuk sifat grafis pekerjaan (monitor dan kartu video), dan tetap dengan perangkat keras generik untuk penyimpanan. Namun, seperti sudah dikenal luas, video digital memiliki beberapa persyaratan khusus untuk penyimpanan: banyaknya data yang akan disimpan besar, dan kecepatan pembacaan dan penulisan data ini penting untuk keseluruhan kinerja sistem (lebih daripada waktu akses rata-rata, misalnya). Batasan-batasn ini perlu dipenuhi dengan perangkat keras generik, dalam kasus ini dua hard disk drive SATA 300 GB; data sistem juga harus dibuat tahan terhadap kegagalan perangkat keras, termasuk sebagian data pengguna. Klip video yang diedit memang harus aman, tetapi tidak perlu bergegas menyunting video yang tertunda, karena mereka masih berada pada kaset.

RAID-1 and LVM are combined to satisfy these constraints. The disks are attached to two different SATA controllers to optimize parallel access and reduce the risk of a simultaneous failure, and they therefore appear as sda and sdc. They are partitioned identically along the following scheme:

# fdisk -l /dev/sda

Disk /dev/sda: 300 GB, 300090728448 bytes, 586114704 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00039a9f

Device    Boot     Start       End   Sectors Size Id Type
/dev/sda1 *         2048   1992060   1990012 1.0G fd Linux raid autodetect
/dev/sda2        1992061   3984120   1992059 1.0G 82 Linux swap / Solaris
/dev/sda3        4000185 586099395 582099210 298G 5  Extended
/dev/sda5        4000185 203977305 199977120 102G fd Linux raid autodetect
/dev/sda6      203977306 403970490 199993184 102G fd Linux raid autodetect
/dev/sda7      403970491 586099395 182128904  93G 8e Linux LVM

The first partitions of both disks (about 1 GB) are assembled into a RAID-1 volume, md0. This mirror is directly used to store the root filesystem.
The sda2 and sdc2 partitions are used as swap partitions, providing a total 2 GB of swap space. With 1 GB of RAM, the workstation has a comfortable amount of available memory.
The sda5 and sdc5 partitions, as well as sda6 and sdc6, are assembled into two new RAID-1 volumes of about 100 GB each, md1 and md2. Both these mirrors are initialized as physical volumes for LVM, and assigned to the vg_raid volume group. This VG thus contains about 200 GB of safe space.
The remaining partitions, sda7 and sdc7, are directly used as physical volumes, and assigned to another VG called vg_bulk, which therefore ends up with roughly 200 GB of space.

Once the VGs are created, they can be partitioned in a very flexible way. One must keep in mind that LVs created in vg_raid will be preserved even if one of the disks fails, which will not be the case for LVs created in vg_bulk; on the other hand, the latter will be allocated in parallel on both disks, which allows higher read or write speeds for large files.

We will therefore create the lv_var and lv_home LVs on vg_raid, to host the matching filesystems; another large LV, lv_movies, will be used to host the definitive versions of movies after editing. The other VG will be split into a large lv_rushes, for data straight out of the digital video cameras, and a lv_tmp for temporary files. The location of the work area is a less straightforward choice to make: while good performance is needed for that volume, is it worth risking losing work if a disk fails during an editing session? Depending on the answer to that question, the relevant LV will be created on one VG or the other.

We now have both some redundancy for important data and much flexibility in how the available space is split across the applications.

CATATAN Mengapa tiga volume RAID-1?

We could have set up one RAID-1 volume only, to serve as a physical volume for vg_raid. Why create three of them, then?

The rationale for the first split (md0 vs. the others) is about data safety: data written to both elements of a RAID-1 mirror are exactly the same, and it is therefore possible to bypass the RAID layer and mount one of the disks directly. In case of a kernel bug, for instance, or if the LVM metadata become corrupted, it is still possible to boot a minimal system to access critical data such as the layout of disks in the RAID and LVM volumes; the metadata can then be reconstructed and the files can be accessed again, so that the system can be brought back to its nominal state.

The rationale for the second split (md1 vs. md2) is less clear-cut, and more related to acknowledging that the future is uncertain. When the workstation is first assembled, the exact storage requirements are not necessarily known with perfect precision; they can also evolve over time. In our case, we can't know in advance the actual storage space requirements for video rushes and complete video clips. If one particular clip needs a very large amount of rushes, and the VG dedicated to redundant data is less than halfway full, we can re-use some of its unneeded space. We can remove one of the physical volumes, say md2, from vg_raid and either assign it to vg_bulk directly (if the expected duration of the operation is short enough that we can live with the temporary drop in performance), or undo the RAID setup on md2 and integrate its components sda6 and sdc6 into the bulk VG (which grows by 200 GB instead of 100 GB); the lv_rushes logical volume can then be grown according to requirements.