Does the "In Use" tag impact Data Unit Size?

camagruder
New Contributor III

We are looking for ways to reduce the data unit size in one of our cubes in order to increase reporting performance without having to do a redesign.  We currently have +20k UD1's and +10k accounts in this cube and 16 years of historical data. 

Would updating the "In use" tag to false for inactive accounts and UD1's reduce the data unit size and help improve data pulls and reporting performance for current years?  We aren't able to completely remove these accounts/UD1's because we are going back 16 years for historical purposes.

1 ACCEPTED SOLUTION

Hi Christy,  

As for zeros source, I would be just taking an educated guess.  But you are right to want to explore that.  Some people find the No-Data for zero settings create a situation where they feel they have to load zeros. Ive been guilty of this too, but it can be a problem.  A zero is not a NoData value.

 

As for not disrupting the end users - It is always best practice to have at least two cubes - rarely would you only have one.  A detail and summary cube.  This allows to make an update like what you need here without impacting the end users.  If you only have one cube, then your options are getting limited.  With two cubes its easier to update the metadata for the parent cube.  IF you are working with RCS, they can help you.  I am happy to help too off this thread if you want to brainstorm design ideas.  

View solution in original post

5 REPLIES 5

ChristianW
Valued Contributor

Data unit size is mainly related to data not metadata.

If increasing the reporting performance is the only reason for your request, I would strongly advise to have a look at hybrid scenarios. If done correctly it can dramatically reduce the data unit size used for reporting without copying data on the database (just in memory).

ChristianW_0-1645542753762.png

 

We currently have >900 users with >1k reports and who knows how many quickviews and spreadsheets so even going down the hybrid scenario route may be a difficult task because of the impact to so many users.  We may end of having to go down that route but I was looking for anything that would be seamless to users as a first step.

So what you are describing is not an uncommon design issue on large applications, and it is covered in the Data Processing and Performance Guide, that is posted on this forum.  You have a couple options before you will have to fix the design.  I would determine if you have a high number of "zero and near zero data" in your application.  That would need to be found, identify how it got there and removed.  Next you will need to set up the application for better use of dimension extensibility.  If you don't do look at the dimensions, other steps will either not work well or confuse the users more.

 

After that -  

1) You could create a Hybrid Scenario.  This will create summary data at parent entities, so users will need to use the new scenario for reporting an analysis, and understand how to drill to the detail.

2) You could summarize via consolidation rules - There is not a new scenario, but rules of this complexity present different challenges.

If you find this is not enough, and you need to go down the redesign path, you will need to plan on multiple cubes and deeper dimensions (dimensions, not members)

Hi Peter - thanks for the reply!

We have been reading your performance guides and I really appreciate all the detail and information that you and Tom have provided.  The large client that was mentioned in the performance white paper seemed so familiar 😀

We have been combing through the system diagnostics dashboard looking for opportunities to improve performance.  It appears most of the zeroes in the forecasts are where data from actuals were seeded into the forecast since actuals are hitting many more accounts than the forecast.  So if you look at a February forecast, January actuals are causing a lot of zeros in the forecast months (Feb-Dec) since there is nothing forecast in those months.  Once we get to a December forecast, we have much less zeroes (less than 10% which I believe was your recommendation).  Should we be looking at our seeding logic or are these type of zeroes unavoidable?

Capture.PNG

I am not exactly sure how we would explore better use of dimension extensibility without disrupting users.  I might need to explore that further.  Do you have any suggestions?

We are starting to play around with hybrid scenarios this week so I am also hopeful that might be a possibility that we can start moving some of our users toward for reporting.  

Thanks again,

Christy

 

 

 

 

Hi Christy,  

As for zeros source, I would be just taking an educated guess.  But you are right to want to explore that.  Some people find the No-Data for zero settings create a situation where they feel they have to load zeros. Ive been guilty of this too, but it can be a problem.  A zero is not a NoData value.

 

As for not disrupting the end users - It is always best practice to have at least two cubes - rarely would you only have one.  A detail and summary cube.  This allows to make an update like what you need here without impacting the end users.  If you only have one cube, then your options are getting limited.  With two cubes its easier to update the metadata for the parent cube.  IF you are working with RCS, they can help you.  I am happy to help too off this thread if you want to brainstorm design ideas.