HANA News Blog

Golden Rules for HANA Partitioning

Jens Gleichmann • 26. November 2024
SAP HANA Partitioning process

Partitioning is not only useful to get rid of the record limitation alerts ID 17 ("Record count of non-partitioned column-store tables"), 20 ("Table growth of non-partitioned column-store tables") or 27 ("Record count of column-store table partitions"), it can also improve the performance of SQLs, startup, HSR, data migration and recovery. A proper design is also useful to make use of NSE by paging out range partitions which are not frequently used.
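
As a minimal sketch of that last point (schema, table name and partition number are hypothetical, assuming HANA 2.0 SPS04 or later), a rarely used range partition can be converted to page loadable for NSE:

-- Hypothetical example: move an infrequently used range partition to NSE
-- (page loadable) so it is no longer kept fully in memory.
ALTER TABLE "SAPABAP1"."ZSALES_DOC" ALTER PARTITION 3 PAGE LOADABLE;

-- Moving it back to the in-memory default:
ALTER TABLE "SAPABAP1"."ZSALES_DOC" ALTER PARTITION 3 COLUMN LOADABLE;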

I have also seen quite a few wrong designs, like using multiple columns as the partitioning attribute, partitions that are too big or too small, or designs with too many empty partitions. It can also happen that a design which is currently totally correct is not scalable and shows bad performance later due to a massive change rate (increase due to growth as well as decrease due to archiving). This means you have to monitor the SQLs and the change rate and rate your partition design again after 8-12 months. Also, due to new features (dynamic range partitioning / dynamic aging: threshold, interval + distance) it can make sense to redesign the partitioning. It is a moving target and like building a house - it will never be 100% complete and fitting your requirements, because the requirements will change over time.

Overall it is a very complex topic which should only be handled by experts. A repartitioning can be really painful due to long downtimes (offline / standard partitioning) or it can impact the business due to long runtimes and resource consumption (online partitioning).

Rules of thumb

Rules of thumb for the initial partitioning:

  • min. 100 million entries per partition for the initial partitioning
  • max. 500 million entries per partition for the initial partitioning
  • if you choose too many partitions you can end up with bad performance, because one thread per partition has to be triggered (e.g. with a statement concurrency limit of 20 threads and 40 partitions to be scanned, the statement has to wait for resources)
  • if you choose too few partitions, you may have to repartition again pretty soon, which means another maintenance window / downtime
  • recommendation: HASH partitioning on a selective column, typically part of the primary key
  • make sure that a single partition doesn't exceed a size of about 25 to 30 GB due to delta merge performance (SAP Note 2100010)
  • not too many empty partitions


Some of these rules can be checked by using the mini checks of SAP Note 1969700.
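
For a quick manual check of record count and size per partition, a query against the monitoring view M_CS_TABLES along these lines can be used (schema and table name are placeholders):

SELECT SCHEMA_NAME, TABLE_NAME, PART_ID, RECORD_COUNT,
       -- compare against the 25-30 GB per partition rule of thumb
       ROUND(MEMORY_SIZE_IN_TOTAL / 1024 / 1024 / 1024, 2) AS SIZE_GB
  FROM M_CS_TABLES
 WHERE SCHEMA_NAME = 'SAPABAP1'
   AND TABLE_NAME  = 'ZSALES_DOC'
 ORDER BY PART_ID;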


HASH 

Use HASH partitioning if you do not have a number, ID or interval which is constantly increasing and used in the SQLs' WHERE clauses. If you are not aware of the change rate and distribution, you can use HASH partitioning as a safe harbor. But choose the number of partitions wisely - not too many and not too few! You cannot easily add partitions as with RANGE partitioning, so the scalability is limited and a repartitioning can be expensive.
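
A minimal sketch of what this can look like (table, columns and partition counts are hypothetical):

-- HASH partitioning on a selective column that is part of the primary key
CREATE COLUMN TABLE "ZSALES_DOC" (
   "MANDT" NVARCHAR(3),
   "VBELN" NVARCHAR(10),
   "POSNR" NVARCHAR(6),
   "NETWR" DECIMAL(15,2),
   PRIMARY KEY ("MANDT", "VBELN", "POSNR")
)
PARTITION BY HASH ("VBELN") PARTITIONS 8;

-- Changing the number of HASH partitions later means redistributing
-- every row, which is why the initial number should be chosen wisely:
ALTER TABLE "ZSALES_DOC" PARTITION BY HASH ("VBELN") PARTITIONS 16;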


RANGE

Use RANGE partitioning if you have a time attribute or a change number, i.e. an integer-like value which is constantly increasing and used in the SQLs' WHERE clauses.

Typical date/time fields (in the ABAP dictionary the data type differs from the one on the HANA DB):

GJAHR (4 chars, ABAP DDIC: NUMC, HANA: NVARCHAR)

UDATE (8 chars, ABAP DDIC: DATS, HANA: NVARCHAR)

TIMESTAMP (21 chars, ABAP DDIC: DEC, HANA: DECIMAL)


Other typical integer-like fields which can be used for RANGE partitioning (there are certainly more):

KNUMV

CHANGENR


This does not mean that every table with such columns should be partitioned by these attributes. It depends on the distribution and selectivity. Every system is different and there is no silver bullet.

The advantage of RANGE partitioning is that you can add new partitions within milliseconds without disturbing business operations. This makes RANGE partitioning the smarter partitioning option. You can also rebalance the design by merging or splitting partitions. During the partitioning process only the defined and affected ranges will be touched. This allows you to redesign the partitioning without big downtimes. This applies to standard/offline partitioning; for online partitioning, the complete table always has to be touched!

In all normal cases RANGE partitioning also includes an OTHERS partition, in which all data is stored that does not fall into a valid range. There are some features regarding dynamic range options which will be covered in a separate article.
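
As a hedged sketch (table, columns and year ranges are hypothetical), a yearly RANGE partitioning with an OTHERS partition and a later, cheap ADD PARTITION could look like this:

-- RANGE partitioning on a constantly increasing, selective column
CREATE COLUMN TABLE "ZDOC_HIST" (
   "MANDT" NVARCHAR(3),
   "BELNR" NVARCHAR(10),
   "GJAHR" NVARCHAR(4),
   "UDATE" NVARCHAR(8),
   PRIMARY KEY ("MANDT", "BELNR", "GJAHR")
)
PARTITION BY RANGE ("GJAHR")
  (PARTITION '2022' <= VALUES < '2023',
   PARTITION '2023' <= VALUES < '2024',
   PARTITION '2024' <= VALUES < '2025',
   PARTITION OTHERS);

-- Adding the next year later only touches the affected range (and the
-- OTHERS partition if it already contains rows for that range):
ALTER TABLE "ZDOC_HIST" ADD PARTITION '2025' <= VALUES < '2026';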

Multilevel partitioning

If your system includes huge tables it might be wise to use more than one attribute: one on the first level and one on the second level. Whether this makes sense depends on the affected table and columns. There are only rare scenarios where it makes sense to combine multiple attributes on one level.
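
A minimal hash-range sketch (all names are hypothetical): the first level spreads the load over the document number, the second level groups by fiscal year for pruning; note that the second-level column does not have to be part of the primary key:

CREATE COLUMN TABLE "ZBILLING_ITEM" (
   "MANDT" NVARCHAR(3),
   "VBELN" NVARCHAR(10),
   "POSNR" NVARCHAR(6),
   "GJAHR" NVARCHAR(4),
   PRIMARY KEY ("MANDT", "VBELN", "POSNR")
)
PARTITION BY HASH ("VBELN") PARTITIONS 4,
             RANGE ("GJAHR")
  (PARTITION '2023' <= VALUES < '2024',
   PARTITION '2024' <= VALUES < '2025',
   PARTITION OTHERS);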


Designing Partitions

Online repartitioning is actually based on table replication. Tables with the naming convention _SYS_OMR_<source_table>#<id> are used as interim tables during online repartitioning operations. For details please read the “Designing Partitions” section in the documentation.
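
A hedged sketch of such an online repartitioning (table and ranges are hypothetical; the ONLINE keyword is assumed to be available, i.e. HANA 2.0 SPS04 or later):

-- The new layout is built in the background via an interim
-- _SYS_OMR_... replica table while the source table stays available:
ALTER TABLE "ZDOC_HIST"
  PARTITION BY RANGE ("GJAHR")
    (PARTITION '2020' <= VALUES < '2025',
     PARTITION OTHERS)
  ONLINE;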


Summary:

  1.   Use partitioning columns that are often used in WHERE clauses for partition pruning.
  2.   If you don't know which partition scheme to use, start with hash partitioning.
  3.   Use as many columns in the hash partitioning as required for good load balancing, but try to use only those columns that are typically used in requests.
  4.   Queries do not necessarily become faster when smaller partitions are searched. Often queries make use of indexes and the table or partition size is not significant. If the search criterion is not selective though, partition size does matter.
  5.   Using time-based partitioning often involves the use of hash-range partitioning with range on a date column.
  6.   If you split an index (SAP also refers to CS tables as indexes), always use a multiple of the source parts (for example 2 to 4 partitions). This way the split will be executed in parallel mode and also does not require parts to be moved to a single server first.
  7.   Do not split/merge a table unless necessary.
  8.   Ideally tables have a time criterion in the primary key. This can then be used for time-based partitioning.
  9.   Single-level partitioning limitation: only key columns can be used as partitioning columns (homogeneous partitioning).
  10.   The client (MANDT/MANDANT) as a single partitioning attribute is not recommended - it is only useful in multilevel partitioning scenarios with real multi-client environments.


In the end, if you want to be on the safe side, just contact us. We will find a scalable design, may improve performance, or also work out an NSE design for your tables. It is a complex topic, and for a proper design you need deep knowledge of SQL performance tuning, partitioning options and how HANA works in general. In the near future our new book will be released with all details regarding partitioning.

