Last Updated on October 26, 2022 by Pradeep


SAP HANA Database – Introduction

The SAP HANA Database is a massively parallel relational database management system (RDBMS) that runs fully in main memory. Traditional RDBMSs were designed to optimize performance on hardware with limited main memory. The SAP HANA database, by contrast, is designed around the idea that memory is available in quantities large enough to hold all business data.

As a result, input/output (I/O) access to the hard disk is no longer the main constraint in SAP HANA. Whereas traditional database systems put the most effort into optimizing hard disk access, SAP HANA focuses on optimizing memory access between the CPU cache and main memory.


Features of SAP HANA Database

To process massive quantities of data in main memory and provide immediate results for both analysis and transactions, SAP HANA relies on the following techniques:

SAP HANA Database keeps data In Memory

Although SAP HANA Database keeps data in memory, you still need non-volatile storage to ensure that write operations are durable. Read operations, on the other hand, can assume that all relevant data resides permanently in main memory, so they can be executed without disk I/O.

Conceptual View of SAP HANA Database
Source: https://blogs.sap.com/

Computer Architecture – Old vs. New

Computer architectures have changed over the last few decades. The main memory capacity available on a single computer has grown drastically, accompanied by dramatic drops in price.

Today, a single enterprise-class server can hold terabytes of data in main memory. Hard disks are at the very bottom of the storage hierarchy. Because they are cheap, it is affordable to have a very large amount of storage at this level. However, the hard disk is not only the slowest medium; because there are typically four layers between the hard disk and the CPU registers, it also has the highest latency.

SAP HANA Database Storage Hierarchy


Optimize In-Memory Data access

Once data has been migrated to SAP HANA, all of it is readily available in the main memory of the SAP HANA Database. As a result, data movement between the CPU cache and main memory becomes the new performance bottleneck. SAP HANA addresses this by using columnar storage and data compression techniques that reduce the overall size of data in memory and achieve high hit ratios in the different caching layers of the CPU.
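To make the compression idea concrete, here is a minimal sketch of dictionary encoding, a technique commonly used in column stores. The text above does not name a specific compression technique, so treat this as an illustration only; the function names and the sample `country` column are hypothetical.

```python
def dictionary_encode(column):
    """Return (dictionary, value_ids) for a list of column values."""
    # Sorted list of distinct values; a value's position is its integer ID.
    dictionary = sorted(set(column))
    position = {value: idx for idx, value in enumerate(dictionary)}
    # The column itself is stored as a compact vector of integer IDs.
    value_ids = [position[value] for value in column]
    return dictionary, value_ids


def dictionary_decode(dictionary, value_ids):
    """Rebuild the original column from the dictionary and the ID vector."""
    return [dictionary[idx] for idx in value_ids]


# Hypothetical column with few distinct values compared to the number of rows.
country = ["DE", "US", "DE", "IN", "US", "DE", "DE", "IN"]
dictionary, value_ids = dictionary_encode(country)
print(dictionary)  # ['DE', 'IN', 'US']
print(value_ids)   # [0, 2, 0, 1, 2, 0, 0, 1]
```

The fewer distinct values a column has, the smaller the dictionary and the fewer bits each ID needs, which is one reason compression tends to work well on columnar data.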

Support massive parallel data processing

Modern computer systems have a continuously increasing number of processing cores. To take advantage of massively parallel multi-core processors, SAP HANA Database translates SQL processing instructions into an optimized model that allows parallel execution and scales well with the number of cores. The optimization includes partitioning the data into sections for which the calculations can be executed in parallel.

The fast read performance of the columnar store has its price. Write operations, particularly inserts and updates, are more complicated and less efficient in a columnar store. To overcome this drawback, SAP HANA introduces the concept of a delta store.
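The delta store idea can be sketched roughly as follows: writes go to a small append-only structure, reads combine it with the read-optimized main part, and a periodic merge folds the delta back in. This is a simplified illustration under assumed names (`ColumnWithDelta`, `merge_delta`), not SAP HANA's actual implementation.

```python
from bisect import bisect_left, bisect_right


class ColumnWithDelta:
    """Toy column: a sorted, read-optimized main part plus an append-only delta."""

    def __init__(self, values=()):
        self.main = sorted(values)  # read-optimized part (sorted for illustration)
        self.delta = []             # write-optimized part

    def insert(self, value):
        # Writes only append to the delta; the main part is not reorganized.
        self.delta.append(value)

    def count(self, value):
        # Reads must combine both parts to see all data.
        in_main = bisect_right(self.main, value) - bisect_left(self.main, value)
        in_delta = sum(1 for v in self.delta if v == value)
        return in_main + in_delta

    def merge_delta(self):
        # Periodically fold the delta into a rebuilt main part.
        self.main = sorted(self.main + self.delta)
        self.delta = []


col = ColumnWithDelta([10, 20, 20, 30])
col.insert(20)
col.insert(5)
print(col.count(20))  # 3
col.merge_delta()
print(col.main)       # [5, 10, 20, 20, 20, 30]
```

The design choice is a trade-off: inserts stay cheap because they never touch the main part, while reads pay a small extra cost for scanning the delta until the next merge.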

In the SAP HANA Database, main memory is the first level of storage next to the CPU caches, and it is directly accessible. Data in main memory can typically be accessed more than 100,000 times faster than data on hard disks.

Traditional RDBMSs employ disks as the primary data store and use main memory as a buffer for data processing. Compared with that approach, keeping data in memory can improve database performance through the advantage in access time alone.

Keeping data in memory raises a number of questions. First, what happens if there is a power outage? Main memory is volatile storage: it loses its content when it is out of power. In this context, we refer to a set of properties known as atomicity, consistency, isolation, and durability (ACID). These properties ensure that database transactions are processed reliably and are not vulnerable to external disruptions.

SAP HANA ensures that changes are durable and that the database can be restored to the most recent committed state after a restart. To achieve this goal efficiently, the persistence layer uses a combination of write-ahead logs, shadow paging, and savepoints.
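The interplay of write-ahead logging and savepoints can be sketched as follows. This is a toy model under stated assumptions: Python lists and dictionaries stand in for durable storage, shadow paging is not modeled, and the class and method names (`TinyStore`, `write_savepoint`, `crash_and_recover`) are hypothetical.

```python
import json


class TinyStore:
    """Toy in-memory store with a log and a savepoint for recovery."""

    def __init__(self):
        self.memory = {}     # volatile in-memory state
        self.log = []        # stands in for the durable log on disk
        self.savepoint = {}  # stands in for the last durable data snapshot

    def commit(self, key, value):
        # Write-ahead: record the change in the log first, then change memory.
        self.log.append(json.dumps({"key": key, "value": value}))
        self.memory[key] = value

    def write_savepoint(self):
        # Persist a full snapshot; earlier log entries are no longer needed.
        self.savepoint = dict(self.memory)
        self.log.clear()

    def crash_and_recover(self):
        # Memory content is lost; rebuild it from the savepoint, then replay the log.
        self.memory = dict(self.savepoint)
        for record in self.log:
            entry = json.loads(record)
            self.memory[entry["key"]] = entry["value"]


store = TinyStore()
store.commit("a", 1)
store.write_savepoint()
store.commit("b", 2)       # only in the log, not yet in a savepoint
store.crash_and_recover()
print(store.memory)        # {'a': 1, 'b': 2}
```

After recovery, the state reflects every committed change: those captured by the savepoint and those replayed from the log written afterwards.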


Columnar Data Organization

Relational databases represent data in two-dimensional structures called tables. A table is a set of data elements organized into vertical columns, or attributes (identified by their names), and horizontal rows, or records. Main memory, however, is a one-dimensional space: memory addresses start at zero and increase serially to the highest available location. To store the data in memory, the database storage layer has to decide how to map the two-dimensional table structures to the linear memory address space.


Let’s take an example to understand this concept.

Columnar Data Organization in SAP HANA Database

In a row-based layout, all attributes of a tuple are stored consecutively in memory. In a columnar layout, on the other hand, the values of each individual column are stored together.

Row-based Layout

Row based Layout in table showing Columnar data in SAP HANA

Columnar Layout

Column based Layout in table showing Columnar data in SAP HANA
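The two mappings from a table to a linear address space can be sketched in a few lines. The sample table and the function names below are hypothetical and exist only to illustrate the two layouts.

```python
# Hypothetical table with columns (id, name, amount).
table = [
    (1, "Alice", 300),
    (2, "Bob", 150),
    (3, "Cara", 420),
]


def row_layout(rows):
    """Store all attributes of each row one after another."""
    return [value for row in rows for value in row]


def column_layout(rows):
    """Store all values of each column one after another."""
    return [value for column in zip(*rows) for value in column]


n_rows, n_cols = len(table), len(table[0])
by_row = row_layout(table)
by_col = column_layout(table)
print(by_row)  # [1, 'Alice', 300, 2, 'Bob', 150, 3, 'Cara', 420]
print(by_col)  # [1, 2, 3, 'Alice', 'Bob', 'Cara', 300, 150, 420]

# Summing the "amount" column (column index 2):
# the row layout needs strided access, the column layout one contiguous slice.
amount_from_rows = by_row[2::n_cols]
amount_from_cols = by_col[2 * n_rows:3 * n_rows]
print(sum(amount_from_rows), sum(amount_from_cols))  # 870 870
```

Both layouts hold the same values; the difference is which values end up next to each other, which determines how much unrelated data a column scan has to skip over.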

Advantages of Column-based Layout

Column-based storage is typically advantageous when:

  • Calculations are executed on a single column or only a few columns.
  • The table is searched based on the values of a few columns.
  • The table has a large number of columns.
  • The table has a large number of records, and mostly columnar operations are required (aggregate, scan, etc.).
  • The majority of the columns contain few distinct values compared to the number of rows.

SAP HANA Database supports both row-store and column-store tables. The highest performance is typically achieved with column-store tables held in memory.

Parallel Execution in Columnar Store

Column-based storage simplifies parallel execution across multiple processor cores. In a column store, data is already vertically partitioned, which means that parallelization can be achieved on different levels.

First, operations on different columns can easily be processed in parallel. If multiple columns need to be searched or aggregated, then each of these operations can be assigned to a different processor core.

Second, operations on one column can be executed in parallel by dividing the column into multiple sections that are processed by different processor cores.
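Both levels of parallelism can be sketched with Python's standard `concurrent.futures` module. The helper names and sample columns are hypothetical. Note that threads in CPython do not give true CPU parallelism for pure-Python work, so this sketch only illustrates how the work is decomposed, not the speedup a database engine would see.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_sections(column, n_sections):
    """Divide one column into up to n_sections contiguous sections."""
    size = max(1, -(-len(column) // n_sections))  # ceiling division
    return [column[i:i + size] for i in range(0, len(column), size)]


def parallel_sum(column, n_sections=4):
    """Level 2: aggregate one column by processing its sections in parallel."""
    sections = split_into_sections(column, n_sections)
    with ThreadPoolExecutor(max_workers=n_sections) as pool:
        partial_sums = list(pool.map(sum, sections))
    return sum(partial_sums)  # combine the partial results


def parallel_column_sums(columns):
    """Level 1: aggregate different columns in parallel, one task per column."""
    with ThreadPoolExecutor(max_workers=max(1, len(columns))) as pool:
        totals = list(pool.map(sum, columns.values()))
    return dict(zip(columns.keys(), totals))


# Hypothetical column store: each column is its own list.
columns = {"amount": list(range(1, 101)), "quantity": [2] * 100}
print(parallel_sum(columns["amount"]))  # 5050
print(parallel_column_sums(columns))    # {'amount': 5050, 'quantity': 200}
```

Because each column is already a separate, contiguous structure, neither level requires reorganizing the data before the work is handed to the workers.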

Parallel execution in a column store in SAP HANA
Source: https://blogs.sap.com/
