Knowledge Base Article

Data Processing and Performance - A comprehensive guide of tables, and design

Overview

To maintain well performing application, one must understand how the underlying database works and more importantly its limitations. Understanding how a system works, allows designers and administrators to create reliable, stable, and optimal performing applications.   This white paper is intended to guide the design of those optimal data processing strategies for the OneStream platform.

First, this document will provide a detailed look at the data structures used by the stage engine as well as those used by the in-memory financial analytic engine, providing a deep understanding of how the OneStream stage engine functions in relation to the in-memory financial analytic engine. The relationship between stage engine data structures and finance engine data structures will be discussed in detail. Understanding how data is stored and manipulated by these engines will help consultants build OneStream applications that are optimized for high-volume data processing.

Second, the workflow engine configuration will be examined in detail throughout the document since it acts as the controller / orchestrated of most tasks in the system. The workflow engine is the primary tool used to configure data processing sequences and performance characteristics in an OneStream application. The are many different workflow structures and settings that specifically relate to data processing and these settings will be discussed in relation to the processing engine that they impact.

Finally, this document will define best practices and logical data processing limits. This will include suggestions on how to create workflow structures and settings for specific data processing workloads. With respect to data defining processing limits, this document will help define practical / logical data processing limits in relation to hard/physical data processing limits and will provide a detailed explanation of the suggested logical limits. This is an important topic because in many situations the physical data processing limit will accept/tolerate that amount of data that is being processed, but the same data may be able to be processed in a much more efficient manner by adhering to logical limits and building the appropriate workflow structures to partition data.

These concepts are particularly important because they enable efficient storage, potential parallel processing and high-performance reporting/consumption when properly implemented.

Conclusion

Large Data Units can create problems for loading, calculating, consolidating, and reporting data. This really is a limitation of what the hardware and networks can support. Your design needs to consider this. But from this paper, I hope you can take away some options to relieve some of the pressure points that could appear.

Published 3 years ago
Version 1.0
No CommentsBe the first to comment