05-05-2022 05:20 PM - last edited on 08-11-2023 11:15 AM by JackLacava
Recently I heard in these forums that the FDX methods (like FdxExecuteCubeView) should be used from a BR whenever data-retrieval performance is critical.
Can someone help me understand why they are faster?
We've already built quite a lot of client-integration software around cube views. The cube views are executed via the REST API; we retrieve data with "api/DataProvider/GetAdoDataSetForCubeViewCommand".
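For context, this is roughly the shape of our client-side call today (a simplified sketch: the token handling and the request body come from our own wrapper, not the exact REST contract, so check the REST API documentation for the real payload fields):

' Simplified sketch of calling the cube view REST endpoint from an integration client.
' baseUrl, accessToken and requestJson are supplied by our own wrapper code; the JSON body
' must contain the cube view command parameters as described in the REST API documentation.
Imports System.Net.Http
Imports System.Net.Http.Headers
Imports System.Text
Imports System.Threading.Tasks

Module CubeViewRestClient
    Async Function GetCubeViewDataAsync(baseUrl As String, accessToken As String, requestJson As String) As Task(Of String)
        Using client As New HttpClient()
            client.DefaultRequestHeaders.Authorization = New AuthenticationHeaderValue("Bearer", accessToken)
            Dim content As New StringContent(requestJson, Encoding.UTF8, "application/json")
            Dim response As HttpResponseMessage = Await client.PostAsync(baseUrl & "/api/DataProvider/GetAdoDataSetForCubeViewCommand", content)
            response.EnsureSuccessStatusCode()
            ' The response is the ADO data set serialized as JSON; we deserialize it downstream.
            Return Await response.Content.ReadAsStringAsync()
        End Using
    End Function
End Module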
Now that I have discovered the FDX methods, it puts me in a troublesome position: I realize that a ton of our integration software should probably be rewritten to use them, and the rewrite may take some time. Why won't OneStream simply extend the "normal" cube view interface of the REST API and add an option to run it in FDX mode? That would be similar to the "turbo" buttons they used to put on the tower of a PC to make it run faster (for those users who had a preference between a fast computer and a slow one).
It would be helpful to understand the differences between these two (fast and slow) mechanisms for retrieving data from cube views.
Solved! Go to Solution.
05-06-2022 06:47 AM
Hi,
This is not my area of expertise, but I found something in the AnalyticBlend training that I attended; see the screenshot below. It looks like the FDX functions are BRApis that can run in parallel, which is what makes the performance better.
Peter
05-06-2022 08:38 AM
I noticed that as well. I think that is part of it.
But I think there is more to it than just parallelization. I am already running cube views in parallel via the REST API, and even then I don't get the same overall efficiency as the FDX methods.
Additionally, the data from the FDX methods seems a bit more "low-level" or "raw" in nature. It returns only member names (not descriptions), and for whatever reason it annotates the current/selected member of every dimension (including UDs) even when those dimensions do not appear in the cube view's row or column definitions.
05-06-2022 08:48 AM - edited 05-06-2022 08:53 AM
To add to Peter's points, the FDX methods were specifically created to programmatically extract data from the appropriate OneStream tables or other source systems in the most efficient way. The FDX connectors were primarily created in conjunction with the BI Blend process, but they can be used for any process that requires data to be extracted from OneStream.
FDX (Fast Data Extract) BRApis offer a variety of options for connecting to Data Sources for BI Blend. A key differentiator between the FDX BRApis and other collection methods is their support for parallel processing, in-memory processing, and management of the Time dimension.
FDX BRApis provide the functionality to build Connectors that extract data from the Cube (for example FdxExecuteCubeView) and from the Stage (for example FdxExecuteStageTargetTimePivot), among other OneStream sources.
Performance is gained in two ways. First, the FDX BRApis can process in parallel: extracting data by Cube Data Unit, for example, will process all the Data Units defined in the filter in parallel. Second, the FDX BRApis do not generate a .csv file the way the Data Management "Export Data" or "Export File" processes do; the results of the export are managed during the BI Blend "in-memory" processing.
In cases of very large data sets where multiple periods are loaded, processing can be slow because each period is reflected as its own data record. The FDX BRApis offer a way to pivot the Time records into columns to create a matrix data layout. The Data Source can associate each of the periods with an "Attribute Value" field within the Integration settings, and the design then treats each record as a collection of 12 periods during processing.
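For example, a minimal sketch of a Business Rule call that extracts a cube view in parallel (the cube view name, dimension names, member filters, and parallel query count are purely illustrative values):

' Minimal sketch: pull a cube view into a DataTable in parallel from a Business Rule.
' Assumes the usual Business Rule context where "si" (SessionInfo) is available.
Dim cvName As String = "cvActualsExtract"              ' illustrative cube view name
Dim entityFilter As String = "E#TotalEntities.Base"    ' illustrative entity member filter
Dim scenarioFilter As String = "S#Actual"              ' illustrative scenario member filter
Dim timeFilter As String = "T#2022.Periods"            ' illustrative time member filter
Dim parallelQueryCount As Integer = 8                  ' number of queries run in parallel
' Remaining arguments: no name/value pairs, no cell text, standard fields, no extra row filter, log stats.
Dim dt As DataTable = BRApi.Import.Data.FdxExecuteCubeView(si, cvName, "Entities", entityFilter,
    "Scenarios", scenarioFilter, timeFilter, Nothing, False, True, String.Empty, parallelQueryCount, True)
' dt now holds the extracted records and can be handed to BI Blend or any other downstream process.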
05-06-2022 09:42 AM
Thanks for the update Tony!
Peter
05-06-2022 10:56 AM
Thanks Tony.
Sorry to tag on another question. One issue that came up when swapping over to FDX for cube views is a concern about memory usage. Is there any additional responsibility that our BR has insofar as memory is concerned? Is the owner of the BR responsible for releasing memory used by the FDX call, or possibly for flushing a cache? Sometimes .NET APIs provide deliberate Dispose() methods for this purpose, when the release of memory needs to happen proactively.
I was working in our pre-prod environment the other day (I was the only one in there) and was running a massive number of FDX operations. Memory usage continually increased until it reached the capacity of the server, and we eventually needed to restart IIS because the dashboard adapters started failing and returning the word "null". If this memory in IIS were related to legitimate "caching", I would expect memory usage to level out after repeatedly executing the same FDX cube views. We are talking about 16 GB of RAM on the server, and each cube view would conceivably use 1 GB at the very most.
I am preparing to deploy my work to production, where this will become someone else's problem. 😉 But I want to make sure that I've done my own due diligence, to limit the need for technical support tickets with OneStream in the future. I'm pretty good at working with .NET memory dumps, so if you can give me an idea of which FDX class instances should be the focus of my attention, I should be able to do more to test for possible memory leaks.
We are using 6.5 in pre-prod and 6.3 in prod.
11-19-2022 03:59 AM
Hi Tony,
I recently modified my parameters to generate a range of dates (From and To parameters) instead of inputting only one month per extract. I have just discovered that this requires a lot of memory, since the files are large, and the process takes a long time and gets stuck. Is this always the case? That is why one of my client's setups extracts only one month at a time. I wonder if this will be okay in Production, since they have more memory there; I am just concerned that it will break in prod once deployed.
11-21-2022 08:25 AM
Since you changed the parameters to go beyond one period of data, did you restructure the resulting data table by using one of the TimePivot options, such as FdxExecuteCubeViewTimePivot? That way the number of data records in the data table is reduced, and each record carries a column of values for each of the periods. It should help reduce the amount of memory needed, since each additional period updates a value in a column rather than creating an entirely new record row in the table.
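Something like this is what I mean; I am assuming here that the TimePivot overload takes essentially the same arguments as FdxExecuteCubeView, so confirm the exact signature in the Business Rule editor's intellisense for your version:

' Before: FdxExecuteCubeView returns one row per intersection per period, so a From/To
' range of 12 months multiplies the row count by 12.
Dim dtByPeriod As DataTable = BRApi.Import.Data.FdxExecuteCubeView(si, cvNameToExtract, entityDimName,
    entityMemFilter, scenarioDimName, scenarioMemFilter, timeMemFilter, nameValuePairs,
    includeCellText, useStandardFields, filter, parallelQueryCount, logStats)

' After: the TimePivot variant pivots the periods into columns, so the same range produces
' one row per intersection with a value column per period (argument list assumed to mirror the above).
Dim dtPivoted As DataTable = BRApi.Import.Data.FdxExecuteCubeViewTimePivot(si, cvNameToExtract, entityDimName,
    entityMemFilter, scenarioDimName, scenarioMemFilter, timeMemFilter, nameValuePairs,
    includeCellText, useStandardFields, filter, parallelQueryCount, logStats)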
03-22-2023 08:55 AM - edited 03-22-2023 08:56 AM
Hi Tony,
Thank you for the feedback. We are using FdxExecuteCubeView.
Would it be better to use the TimePivot variant instead? I can test this, but are there other options for FdxExecuteCubeView? Thanks in advance.
Dim dt As DataTable = BRApi.Import.Data.FdxExecuteCubeView(si, cvNameToExtract, entityDimName, entityMemFilter, scenarioDimName, scenarioMemFilter, timeMemFilter, nameValuePairs, includeCellText, useStandardFields, filter, parallelQueryCount, logStats)
06-12-2023 10:28 AM
Thanks for the details. Is it possible to extract the in-memory data to a SQL table using FdxExecuteCubeView?
Any example code would be helpful.
Thank you.
Sridhar
06-12-2023 10:49 AM
What memory data are you talking about? Once the data comes into a DataTable, you can use the SaveCustomTable method in the BRApi to save it to a custom table. If it is an external table, you would have to configure that in OneStream first.
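A rough sketch of that pattern, continuing from the FdxExecuteCubeView call earlier in the thread (the save method's exact name and argument list are from memory, so verify them under BRApi.Database in your version before relying on this):

' Persist the FDX result (the DataTable "dt" from FdxExecuteCubeView) to a custom application table.
' Assumption: the method is BRApi.Database.SaveCustomDataTable (the "SaveCustomTable" mentioned above)
' and it takes the session, a database connection, the target table name and the DataTable;
' confirm the exact name and argument order in the BRApi browser before using this.
Dim targetTable As String = "XFC_CubeViewExtract"      ' illustrative custom table name
Using dbConnApp = BRApi.Database.CreateApplicationDbConnInfo(si)
    BRApi.Database.SaveCustomDataTable(si, dbConnApp, targetTable, dt)
End Using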
06-12-2023 10:55 AM
Thank you so much @ckattookaran
That's what I am looking for; I wasn't aware that OneStream has a SaveCustomTable BRApi.
06-12-2023 11:05 AM
Thank you @ckattookaran
Sridhar - there are a lot of BRApis that may interest you under Database or Import > Data if you are in a Business Rule.
08-23-2023 04:32 AM
Hi Tony,
Can you share the business rule for FdxExecuteStageTargetTimePivot? Appreciate it.
05-06-2022 12:10 PM
To address your questions, I would recommend submitting a ticket with Technical Support so they can go through your situation. In general, you should not be responsible for clearing memory: garbage collection in IIS and the Memory Manager should clear memory out once the function has finished executing. The assumption that a Cube View rendering will not exceed 1 GB is not entirely correct, though. It all depends on the amount of data pulled into cache and then inserted into a data table. Each data record in the database takes 3,200 bytes, so if the Cube View definition yields 1 million data records, you will use 3.2 GB of memory just for the data records, not including the additional memory for the new data table object. Every Cube View can be wildly different, so I can't give you a definitive answer on any of this.

It would be a good idea to meet with Technical Support to run through all of it. The memory not releasing could be a concern, but we won't know unless Technical Support gets involved to troubleshoot. OneStream has a lot of memory-intensive processes, Cube Views being one of them. 16 GB on a server is OK for general development but not optimal for a Production environment; 16 GB of memory can be consumed very quickly depending on the process, data volumes, number of large data tables in memory, etc.
05-06-2022 12:42 PM
Thanks @TonyToniTone
In the past the tech support team was not terribly interested in having our memory dumps. Moreover, a 16 GB memory dump that demonstrates a problem is pretty unwieldy, even once you manage to download it and open it in a tool like WinDbg. I was just going to watch counters and take some smaller memory dumps after running a few cube views and triggering a garbage collection.
Ideally it would be possible to confirm a memory problem before it reaches 3 or 5 GB. If necessary, I could work on creating an artificial repro with tiny cube views. I just wanted to know what to look for in order to determine whether things were "leaking" or just being "cached". (I know that this distinction can sometimes depend on a lot of factors.)
Based on your response, it sounds like you have no knowledge of FDX-specific memory issues, so I will hold off on assuming that there is a problem for now... My suspicion is only based on the behavior I noticed and on the requirement that IIS be reset every night. Yes, our production servers do have quite a lot more RAM (32 or 64 GB, I think).
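For anyone following the same path, this is the sort of throwaway instrumentation I had in mind inside the Business Rule (the GC calls are standard .NET; BRApi.ErrorLog.LogMessage just writes to the OneStream error log):

' Log managed-memory usage before and after a forced garbage collection, so that repeated
' FDX runs show whether usage levels out (caching) or keeps climbing (a possible leak).
Dim beforeMb As Long = GC.GetTotalMemory(False) \ (1024 * 1024)
GC.Collect()
GC.WaitForPendingFinalizers()
GC.Collect()
Dim afterMb As Long = GC.GetTotalMemory(True) \ (1024 * 1024)
BRApi.ErrorLog.LogMessage(si, String.Format("FDX memory check: {0} MB before GC, {1} MB after GC", beforeMb, afterMb))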