Optimising RDBMS performance with WAL

aditya goel

5 min readJan 21, 2024

If you are landing at this blog directly, it’s advisable to first check through this blog.

Question → Explain the structure of Hard-Disk & how the data is being stored in Database ?

Answer → Each Disk is circular in shape and have following properties :-

Disk has various Tracks & Sectors on it.
The combination of Track & Sector is also called as BLOCK. Data is usually kept in Blocks. It can be of any size, since it depends upon manufacturer.
Whenever we read/write into the Disk, we always do the same in terms of Blocks. Each Block is therefore addressed in terms of : (Track-Number, Sector-Number).
Assuming the size of each block to be 512, Inside each Block, addresses of each Byte starts from 0 and goes up-till 511. We also call it as OffSet. Each Byte is therefore addressed in terms of :: (Track-Number, Sector-Number, OffSet). These are the three crucial things, that we require to reach to a particular Byte.
The data on the disk is read by moving the HeadPost and spinning the spindle.

Question → Demonstrate the concept of Database-Block with an example ?

Answer → Let’s consider an example where each record is of 128 Bytes and assume that, each block size is of 512 Bytes, then we would be able to keep at-max 4 records (rows) in any given block.

Question → What are the usual Block-Sizes with the Database ?

Answer → Typically, the size of each block is as follows :-

1.) For PostGres, by default, the size of block is 8 KiloBytes.

2.) For MySql, default size is 16KB per block, although this can be decreased to 1KB or increased to 64KB.

3.) For Oracle, default block-size recommendations are 8 KB to 200 KB for optimal performance. The default block size is 8 KB.

Oracle recommends smaller block sizes for online transaction processing (OLTP) or mixed workload environments. For decision support system (DSS) workload environments, Oracle recommends larger block sizes.
Oracle has five block sizes (configurable): 2 KB, 4 KB, 8 KB, 16 KB, and 32 KB.
Smaller block sizes are good for small rows with lots of random access. However, small block sizes waste a relatively large amount of each block for metadata. Large block sizes waste storage and memory.

Question → What’s the importance of Commit in the database ?

Answer → Once a transaction is committed on the Hard-disk, it shall be safe from power-loss, OS-failure OR Hardware-Failure as well.

Question → What’s the importance of Commit in the database ?

Answer → Once a transaction is committed on the Hard-disk, it shall be safe from power-loss, OS-failure OR Hardware-Failure as well.

Question → What happens whenever say any particular row is being updated ?

Answer → Once the row is updated, the corresponding block in which this row is present shall be updated.

Question → What happens whenever say any particular row is being updated by the user ?

Answer → Once the row is updated, the corresponding block in which this row is present shall be updated.

The place/file where these changes are being logged is called as WAL i.e. Write-Ahead-Logging.

This WAL is then synched to the Database periodically.

Question → What are the Advantages of WAL ?

Advantages → WAL is one way of improvising the performance of the database in a massive way. Note the following aspects :-

Only update-statements are being appended into this Log-File and then periodically, these changes are being sent to the Table.
Basically the motivation is NOT to write every transaction/update to the database synchronously but rather buffer these changes In-Memory (let’s say every one second) and then flush it onto the Disk.
Here we are slashing/reducing a lot of disk-writes, but we are not running away from the Guarantees that RDBMS have given to the user.
Typically we gain an immense set of performance by avoiding a lot of disk-writes.

For example :-

1.) With MySql → The innodb_flush_log_at_timeout variable controls the frequency of InnoDB log file flushes into the MySQL. The frequency can range from 1 to 2700 seconds, with the default value being 1 second.

2.) With PostgreSQL → The term flush is often used when the Write Ahead Logs (WAL) are written to their final location. The WAL buffers can then be reused for incoming transactions.

Question → How does WAL ensure the Data-Integrity ?

Answer → Below is how the WAL ensures the Consistency of the data as that promised by RDBMS :-

Whenever any particular update is being written to the WAL, first the HASH for that row is being computed using CRC-32 algorithm (i.e. Cyclic Redundancy Check).
This Hash is first written to the WAL file and then contents are being written there in WAL file.
Now, while reading the contents from WAL, we can use this hash to check if the record is correct & it has not been messed.

Question → How is this WAL structured ?

Answer → Below is how the WAL files are structured :-

WAL is nothing, but it is composition of some files, where each file is of size 16 MB.
Each File consists of Pages, where each Page is of size 8 KB. Within this page, we add our “Write-Ahead-Log-Entry”.
As soon as we write something into this WAL, it gives an identifier to this entry. This identifier is known as “Log Sequence Number”. This LSN is Byte-Offset into the Log File.

That’s all in this blog. We shall see you in next blog.

References :-

Optimising RDBMS performance with WAL

Written by aditya goel

No responses yet