Forum Discussion

Kelley_M
New Contributor
3 years ago

Direct Connect Loads into OneStream

When loading data into OneStream, our longest workflow takes ~40 minutes to load. We would like to kick off the 5 direct connect loads within this workflow so they start simultaneously instead of sequentially. Have any of you organized your loads similarly, and do you have tips or recommendations, please?

  • Hi Kelley, we actually just did this.
    It is important that the workflow steps are in different workflows (different workflow steps within one workflow don’t run in parallel). You also need to make the load ID (the first item of your load file) dynamic.
    OneStream advises a maximum of 3 parallel loads per server, so you could create two DM jobs pointed at two different servers, each with 3 threads (that’s what we do).
    We also use the direct load (if you don’t expect transformation issues), which is also 20 to 30% faster with large data sets.
    Next to that, the transformation itself is important, so check the log after the load to see what takes the most time. If it is the parsing or the writing to the cube, then running in parallel will help you; if it is a complex transformation, then you can try to fix that in the SQL/source.
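    As a rough illustration (not our exact rule), a sketch of an extensibility BR that queues the Data Management sequences is below. The sequence names are placeholders, and BRApi.Utilities.StartDataMgmtSequence is assumed to queue each sequence asynchronously; verify the exact API and signature in your platform release.

      ' Minimal sketch, assuming five placeholder DM sequences (one per direct connect load)
      ' and an asynchronous StartDataMgmtSequence call -- confirm against your release's BRApi.
      ' (Standard OneStream BR template imports assumed.)
      Namespace OneStream.BusinessRule.Extender.ParallelDirectConnectLoads
          Public Class MainClass
              Public Function Main(ByVal si As SessionInfo, ByVal globals As BRGlobals, ByVal api As Object, ByVal args As ExtenderArgs) As Object
                  ' One DM sequence per direct connect load; each sequence targets its own
                  ' workflow, since steps inside a single workflow do not run in parallel.
                  Dim sequences As New List(Of String) From {"DC_Load_GL", "DC_Load_AP", "DC_Load_AR", "DC_Load_FA", "DC_Load_HR"}

                  For Each seqName As String In sequences
                      ' Assumed call: queues the sequence rather than waiting for it, so the
                      ' loads run side by side (split the sequences across two servers to
                      ' stay within ~3 parallel loads per server).
                      BRApi.Utilities.StartDataMgmtSequence(si, seqName, Nothing)
                  Next

                  Return Nothing
              End Function
          End Class
      End Namespace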

  • We have a scheduled BR that imports from 4 different ERPs into 50+ different WFs. Once finished, it marks the Import task as complete and sends out an email. I have seen this done two ways. One way was to take the base WFs and cycle through them, querying for assigned entities. The other was separate BRs that were each configured to an individual ERP and would update those WFs with the assigned entities being pulled. We had to do it these ways because we update every hour in one case and every 4 hours in the other.
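    For anyone wanting to build something similar, a heavily simplified sketch of the orchestration loop is below. The helper names (GetWorkflowsForErp, GetAssignedEntities, ImportWorkflow, MarkImportCompleteAndNotify) are hypothetical stand-ins for the actual BRApi workflow, status, and email calls, which vary by release; the point is only the structure.

      ' Hypothetical orchestration sketch; the helpers below stand in for real BRApi calls.
      Public Sub RunScheduledErpImports(ByVal si As SessionInfo, ByVal erpName As String)
          ' Hypothetical lookup of the base workflow profiles fed by this ERP.
          Dim baseWorkflows As List(Of String) = GetWorkflowsForErp(si, erpName)

          For Each wfName As String In baseWorkflows
              ' Hypothetical helper: entities assigned to this workflow profile.
              Dim assignedEntities As List(Of String) = GetAssignedEntities(si, wfName)

              ' Hypothetical helper: runs the import for those entities.
              ImportWorkflow(si, wfName, assignedEntities)

              ' Hypothetical helper: marks the Import task complete and sends the email.
              MarkImportCompleteAndNotify(si, wfName)
          Next
      End Sub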

  • Keep in mind that direct connects do not have drill-back capability, and it is also an all-or-nothing sort of load. If you have level-2 data units in multiple workflows, those need to be sibling workflow profiles (similar to what is there currently).
    Shameless plug: We do have a chapter on direct connects in our upcoming book.

    • Kelley_M
      New Contributor

      Thank you Celvin. Do you need a “proofreader” for that chapter? LOL!

      • I meant direct load (not direct connect). It keeps only one table in Stage (only the summary is kept).

  • I would start by looking at which part of the load takes the longest: the actual pulling of data from the source system, the transformation rules, or something else? From there, start optimizing for performance. E.g., maybe you can have the source system aggregate the data before pulling it, or maybe you have some transformation rules using the "Mask" option with BRs that can be optimized, etc.
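    As a small, invented example of pushing aggregation down to the source before pulling: instead of extracting line-level detail, the connector's source query could sum to the grain OneStream actually loads. The table and column names below are made up; substitute your ERP's schema.

      ' Invented example of a pre-aggregated source query for a connector rule.
      Dim sourceQuery As String =
          "SELECT CompanyCode, AccountCode, CostCenter, FiscalPeriod, SUM(Amount) AS Amount " &
          "FROM GL_JournalLines " &
          "WHERE FiscalYear = @FiscalYear " &
          "GROUP BY CompanyCode, AccountCode, CostCenter, FiscalPeriod"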