OSAdmin

BLUEPRINT BULLETIN

ARCHITECT FACTORY

Release Date:  12/08/2020 Updated

Prepared by: Peter Fugere
Severity: Critical

  Summary:   Do not partition data loads by period. Doing so circumvents the system's default logic for doing the math correctly and presenting the correct data in the different views.

 

Purpose:    Understand the correct method to partition large data loads.

 

Key Facts and Recommendations:

Facts of a real-life situation

  • A client was loading a large data set (about 8 million records) by month. The IT group pushed back on grouping the load by entity, saying it wasn’t possible. Because the data being seeded was for the forecast, it would be easy to load a single file for a given month to an “Admin” location, where the load could run overnight with less impact on performance. When a new scenario was added, the load covered different months. This effectively partitioned the file by period.
  • Because data was being loaded to the same entities for different periods, the system could not determine which data was missing or could not clear the data properly.
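The overlap described above can be illustrated with a minimal Python sketch. It is not OneStream code; the record layout, entity names, and helper functions are all hypothetical and exist only to show that period-based files each touch the same entities, while entity-based files do not.

```python
from collections import defaultdict

# Hypothetical records: (entity, period, amount). Illustrative values only.
records = [
    ("E100", "2021M1", 10.0),
    ("E100", "2021M2", 12.0),
    ("E200", "2021M1", 7.5),
    ("E200", "2021M2", 8.0),
]

def entities_per_partition(rows, key_index):
    """Group rows by the chosen column and return the entity set of each partition."""
    parts = defaultdict(set)
    for row in rows:
        parts[row[key_index]].add(row[0])
    return dict(parts)

def overlapping(parts):
    """Return True if any entity appears in more than one partition."""
    seen = set()
    for entities in parts.values():
        if seen & entities:
            return True
        seen |= entities
    return False

by_period = entities_per_partition(records, key_index=1)
by_entity = entities_per_partition(records, key_index=0)

period_overlap = overlapping(by_period)  # every period file hits the same entities
entity_overlap = overlapping(by_entity)  # each entity lives in exactly one file
```

In this toy data, each period partition contains both entities, so separate loads write to the same entities. Partitioning by entity keeps the partitions disjoint.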

Key Recommendation

Create a load file by entity group. This loads much faster because a subset of the data can be loaded at a time, and because grouping the data by entity produces smaller partitions that allow the use of the multithreading built into OneStream software.
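As a rough sketch of this recommendation, the following Python example splits one flat extract into one file body per entity group before loading. It does not use any OneStream API. The column names, the `ENTITY_GROUPS` mapping, and the `split_by_entity_group` helper are assumptions made for illustration; a real project would derive the grouping from its own entity hierarchy.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mapping from entity to entity group (an assumption, not from the bulletin).
ENTITY_GROUPS = {"E100": "NorthAmerica", "E110": "NorthAmerica", "E200": "Europe"}

def split_by_entity_group(source, entity_column="Entity"):
    """Read CSV text and return {group_name: csv_text}, one output per entity group."""
    reader = csv.DictReader(io.StringIO(source))
    rows_by_group = defaultdict(list)
    for row in reader:
        # Entities missing from the mapping are collected under "Unmapped".
        group = ENTITY_GROUPS.get(row[entity_column], "Unmapped")
        rows_by_group[group].append(row)

    outputs = {}
    for group, rows in rows_by_group.items():
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=reader.fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        outputs[group] = buf.getvalue()
    return outputs

sample = (
    "Entity,Period,Account,Amount\n"
    "E100,2021M1,Sales,10\n"
    "E200,2021M1,Sales,7\n"
    "E110,2021M2,Sales,12\n"
    "E100,2021M2,Sales,11\n"
)
files = split_by_entity_group(sample)
```

Each value in `files` holds all periods for its entity group, so no two load files write to the same entity.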

 

Benefits and Costs

Spreading sequential loads speeds up the data load and consolidation process, significantly improving performance by using the built-in multithreading capability of the software.

 

Any deviations must be reviewed by the project team and signed off by the customer.

Version history
Last update:
‎08-24-2021 12:36 PM