Forum Discussion
YahyaOS
2 years ago · New Contributor II
Problem with Harvest Directory
Hi everyone,
I'm having a problem with the Harvest directory. I have two batches running at the same time, each retrieving files from an SFTP server and putting them in the Harvest directory in order...
RobbSalzmann
2 years ago · Valued Contributor II
My understanding of batch processing is that all files available will be processed by the batch. The way to "filter" is to put the filter upstream of this (in your case, the SFTP process), allowing only the files you wish to process to land in the harvest folder.
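As a rough illustration of filtering upstream, here is a minimal sketch in plain Python. It is not OneStream or SFTP client code: `select_files` and the `GL_*.csv` pattern are hypothetical names, and the assumption is that your SFTP step can list remote file names before downloading anything.

```python
from fnmatch import fnmatchcase


def select_files(remote_names, pattern):
    """Return only the file names that match a glob pattern (case-sensitive).

    remote_names: file names listed on the SFTP server (assumed input).
    pattern: glob pattern for the files this batch should process.
    """
    return [name for name in remote_names if fnmatchcase(name, pattern)]


# Only the files returned here would be downloaded to the harvest folder.
wanted = select_files(["GL_2024.csv", "HR_2024.csv", "notes.txt"], "GL_*.csv")
```

Anything that does not match the pattern never lands in the harvest folder, so the batch has nothing extra to pick up.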
- YahyaOS · 2 years ago · New Contributor II
Thank you for your response.
My problem is that I have two Data Management sequences that run at the same time, and each one retrieves one type of file from an SFTP server.
We manage to filter by file name upstream, before the files arrive in Harvest. But once the two Data Management sequences are launched and each retrieves its files, the files all end up in the same Harvest directory, which means that sometimes a file belonging to one sequence is processed by the other, and vice versa. I need a solution to avoid that.
Thank you in advance.
- franciscoamores · 2 years ago · Contributor II
Right now there is no way to have specific batch executions harvest from different folders. It'd be great to have an additional parameter for a harvest subfolder, so this is a good one for IdeaStream.
In the meantime, the workaround I can think of is a semaphore solution:
- have different batch subfolders, one for each system (you can download to another parent folder, like contents)
- have 2 DM sequences, one per system
- have one ER which performs the semaphore logic:
1) get the current DM sequence name
2) check whether the other DM sequence task is running
3) if it is, skip execution (red light)
4) if it is not (green light), move the files from the subfolder of the current source system into the Harvest folder
5) execute the harvest
This way you make sure each sequence processes only its own files. If you want more parallelism, you could move the batch harvest execution to an ER in a separate DM and have the semaphore ER execute that DM with the Queue API method.
You can also have a single DM job that takes the system as a parameter and updates the description of the task to include the system. You can do this with the API.
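The semaphore steps above can be sketched roughly as follows. This is plain Python for illustration only, not OneStream business rule code: `run_with_semaphore`, `running_sequences`, `subfolders` and `execute_harvest` are all hypothetical names. It assumes the rule can obtain the names of the currently running sequences and knows which staging subfolder belongs to which sequence.

```python
import shutil
from pathlib import Path


def run_with_semaphore(current_seq, running_sequences, subfolders,
                       harvest_dir, execute_harvest):
    """Move this sequence's files into Harvest only if no other sequence runs.

    current_seq: name of the sequence calling this logic (step 1).
    running_sequences: names of all sequences running now (assumed input).
    subfolders: mapping of sequence name to its staging subfolder.
    harvest_dir: the shared harvest folder.
    execute_harvest: callable that triggers the batch harvest (placeholder).
    Returns True if the harvest ran, False if it was skipped.
    """
    # Steps 2 and 3: red light if any other sequence is running.
    if any(name != current_seq for name in running_sequences):
        return False

    # Step 4: green light, move only this sequence's files into Harvest.
    harvest = Path(harvest_dir)
    harvest.mkdir(parents=True, exist_ok=True)
    for item in sorted(Path(subfolders[current_seq]).iterdir()):
        if item.is_file():
            shutil.move(str(item), str(harvest / item.name))

    # Step 5: run the harvest on a folder holding only this sequence's files.
    execute_harvest()
    return True
```

Note that the check and the move are not atomic in this sketch, so two sequences starting at exactly the same moment could both see a green light. A real implementation would need some form of lock or retry around steps 2 to 4.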