Hey everyone. We are implementing a use case for a client that requires import of transactional data found within a csv file. The data is comma delimited, and enclosed in quotes. For example, if the data source contained only three fields, this is what it would look like:
"Field 1 value","Field 2 value","Field 3 value"
The problem that I'm having is that sometimes, Field 2 also contains a quote in it. For example, the client sells parts, and Field 2 contains a description of the part. The description happens to contain a quote (for example, 2" pipe). The data extract routine the client uses automatically includes a second quote to cancel out the first. So the value of Field 2 in this example would be: 2"" pipe. The file, if opened in notepad, would look like this:
"Field 1 value","2"" pipe","Field 3 value"
I've not figured out a way for OneStream to correctly handle this. The only solution that seems to work is pre-processing the file to strip out instances of two double quotes that aren't preceded by or followed by a comma because we have to handle null field cases, where a null field would appropriately be represented by two double quotes. For example: if Field 1 and Field 3 were null, the file would appear as follows, and the only set of two double quotes that needs to be removed is the one within Field 2:
Am I missing a configuration option that handles this use case more elegantly than pre-processing the file?
Appreciate any tips!
Hi Krishna, I ended up writing a RegEx statement within the workflow automation rule I wrote to read each line in the file, find and replace the offending string, and write the dataset to a new file. It does not seem possible to handle this using a complex expression or transformation rule within the Data Source definition, because the existence of the two double quotes corrupts the processing of the delimiters in the file and causes OneStream to import data into incorrect columns. I was hoping there was a way to handle this more natively within the Data Source definition itself, as the existence of two quotes is an accepted csv file format and technically should be handled correctly by the platform.
I do appreciate your comment, though.