Forum Discussion

NicoleBruno
Valued Contributor
2 years ago

What's the best way to pull 16M+ rows of data?

Hello! 
I have a user group who needs a specific set of data pulled just after close every month in order to populate other reporting tools they use. Here's the formatting they require: 

I'm struggling with my options because this results in 16M+ rows of data. I get an error when trying to export the Cube View (CV), and a Quick View (QV) won't show all rows even with the timeout set to the maximum number of seconds (which I believe is 600). A Data Management (DM) extract won't produce the right format, and I don't think a QV can be run automatically and sent via Parcel Service. What are my other options? 
Thanks for helping me brainstorm!

  • Jorge_Haces
    New Contributor II

    You can use the function "BRApi.Import.Data.FdxExecuteCubeView(si, cubeViewToExtract, entityDimName, entityMemFilter, scenarioDimName, scenarioMemFilter, timeMemFilter, nameValuePairs, includeCellText, useStandardFields, filter, parallelQueryCount, logStats)" to retrieve the data into a DataTable, then use a StreamWriter to write it out to a CSV file.
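
    Just as a rough, untested sketch (VB.NET, using the session object si that's available in business rules) - the Cube View name, member filters, parameter values, and file path below are all placeholders, and the exact member-filter syntax is worth double-checking against the FDX examples in the docs:

        ' Pull the Cube View into a DataTable via the FDX call above.
        ' (Assumes the usual Imports: System.Data, System.IO, System.Collections.Generic.)
        Dim dt As DataTable = BRApi.Import.Data.FdxExecuteCubeView( _
            si, "MyExtractCV", "Entity", "E#TopEntity.Base", _
            "Scenario", "S#Actual", "2023M1", _
            Nothing, False, True, String.Empty, 4, False)

        ' Stream the rows straight to disk so all 16M records never sit in one big string.
        Using sw As New StreamWriter("\\fileshare\extracts\MonthlyExtract.csv", False)
            ' Header row from the column names.
            Dim headers As New List(Of String)
            For Each col As DataColumn In dt.Columns
                headers.Add(col.ColumnName)
            Next
            sw.WriteLine(String.Join(",", headers))
            For Each row As DataRow In dt.Rows
                ' Naive join - add quoting/escaping if any field can contain commas.
                sw.WriteLine(String.Join(",", row.ItemArray))
            Next
        End Using

    If it needs to run itself just after close, one option is to put that in an Extensibility Rule and trigger it from a scheduled Data Management sequence.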

    Hope it helps!


    • NicoleBruno
      Valued Contributor

      Thanks - is there more documentation somewhere that you know of that I could review, to see whether this will give me the flexibility I need? 

      • JackLacava
        Honored Contributor

        It's part of the so-called FDX (Fast Data eXtract) APIs. If you search around the forum you should find quite a few examples. We also cover them in the BiBlend class, so if you know anyone who attended it, you can ask them for the material. They're covered in the latest docs at https://documentation.onestream.com - the examples there show them applied to Data Source Connectors, but they work just as well in Dashboard Data Set rules or any other type of rule.

        Basically, they allow fast extraction of Cube Views, Data Unit records, or Stage records, and they also let you pivot the output on the Time dimension if necessary (i.e. getting a column for January, one for February, and so on, which can compress the number of records pretty dramatically). I don't think flexibility will be a problem, but they do require a bit of coding to set up all the parameters.
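
        To picture that with made-up numbers: twelve monthly records like

            Entity  Account  Time     Amount
            E100    Sales    2023M1   100
            E100    Sales    2023M2   120
            ...
            E100    Sales    2023M12  95

        collapse into a single pivoted row with one column per period:

            Entity  Account  2023M1  2023M2  ...  2023M12
            E100    Sales    100     120     ...  95

        so a 16M-row extract could shrink towards 16M / 12, roughly 1.3M rows, if most intersections have data in all twelve months.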