Replace comma delimited file with pipe, but not remove comma or quotes etc within text qualifier field,...











up vote
0
down vote

favorite
1












My file is a comma delimited file and text qualifier is ~, but my requirement is to find and replace comma delimited file with |(pipe) delimited file and remove text qualifier ~ with nothing, however,
I should not remove quote or double quotes or any special character within the data present in text qualifier. eg: ~abc",~ I need it as abc",



Below is the content of my source file and the way i expect the output or manipulated file to be.



Source file:



364034,2015652205,26722,2015,4,~C25753-4~,~TC25753,~,~2WD Double Cab 144.2" SLT,~,~Y~,40506.16,43555.00,1095.00,~043,005,006,007,003,008,016,041,012,029,068,027,028,033~,3,~2WD Double Cab 144.2"~,~SLT~,6,4,~N~,~S~,~N~,~S~,~N~,~N~,~N~,~~,~ ~,~Confirmed~,~w2015k65m22t5~,~Sierra 2500HD~,~Double Cab Standard Box 2-Wheel Drive SLT~,~Rear Wheel Drive~,~Extended Cab Pickup - Standard Bed~


After cleansing I need the file like this:



364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043|005|006|007|003|008|016|041|012|029|068|027|028|033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed


I tried sed -i -e with multiple options but the output is not 100% right.



I tried following but it's not giving right result that i wanted



sed -i -e 's/,~/|/g' file_name
sed -i -e 's/~,/|/g' file_name
sed -i -e 's/~//g' file_name
sed -i -e 's/([0-9],[0-9])/|/g' file_name
sed -i -e 's/r//g' file_name









share|improve this question
























  • format your question to make it readable
    – RomanPerekhrest
    May 26 '17 at 21:06






  • 1




    How about tr -d '~' <file | tr -s ',' '|', or sed equivalent sed 's/~//g;s/,+/|/g' file. Is this what you are looking for?
    – jimmij
    May 26 '17 at 21:13












  • Hi There, I am looking for basically replacing , with | ( but not when these are enclosed within ~abc,#~) then finally remove text qualifier ~ Input file is like: 1234,4826,~Ter1337,#~,~~,~46437uyy"@~ Desired manipulated input file 1234|4826|Ter1337,#||46437uyy"@
    – Sudheer
    May 30 '17 at 19:56

















up vote
0
down vote

favorite
1












My file is a comma delimited file and text qualifier is ~, but my requirement is to find and replace comma delimited file with |(pipe) delimited file and remove text qualifier ~ with nothing, however,
I should not remove quote or double quotes or any special character within the data present in text qualifier. eg: ~abc",~ I need it as abc",



Below is the content of my source file and the way i expect the output or manipulated file to be.



Source file:



364034,2015652205,26722,2015,4,~C25753-4~,~TC25753,~,~2WD Double Cab 144.2" SLT,~,~Y~,40506.16,43555.00,1095.00,~043,005,006,007,003,008,016,041,012,029,068,027,028,033~,3,~2WD Double Cab 144.2"~,~SLT~,6,4,~N~,~S~,~N~,~S~,~N~,~N~,~N~,~~,~ ~,~Confirmed~,~w2015k65m22t5~,~Sierra 2500HD~,~Double Cab Standard Box 2-Wheel Drive SLT~,~Rear Wheel Drive~,~Extended Cab Pickup - Standard Bed~


After cleansing I need the file like this:



364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043|005|006|007|003|008|016|041|012|029|068|027|028|033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed


I tried sed -i -e with multiple options but the output is not 100% right.



I tried following but it's not giving right result that i wanted



sed -i -e 's/,~/|/g' file_name
sed -i -e 's/~,/|/g' file_name
sed -i -e 's/~//g' file_name
sed -i -e 's/([0-9],[0-9])/|/g' file_name
sed -i -e 's/r//g' file_name









share|improve this question
























  • format your question to make it readable
    – RomanPerekhrest
    May 26 '17 at 21:06






  • 1




    How about tr -d '~' <file | tr -s ',' '|', or sed equivalent sed 's/~//g;s/,+/|/g' file. Is this what you are looking for?
    – jimmij
    May 26 '17 at 21:13












  • Hi There, I am looking for basically replacing , with | ( but not when these are enclosed within ~abc,#~) then finally remove text qualifier ~ Input file is like: 1234,4826,~Ter1337,#~,~~,~46437uyy"@~ Desired manipulated input file 1234|4826|Ter1337,#||46437uyy"@
    – Sudheer
    May 30 '17 at 19:56















up vote
0
down vote

favorite
1









up vote
0
down vote

favorite
1






1





My file is a comma delimited file and text qualifier is ~, but my requirement is to find and replace comma delimited file with |(pipe) delimited file and remove text qualifier ~ with nothing, however,
I should not remove quote or double quotes or any special character within the data present in text qualifier. eg: ~abc",~ I need it as abc",



Below is the content of my source file and the way i expect the output or manipulated file to be.



Source file:



364034,2015652205,26722,2015,4,~C25753-4~,~TC25753,~,~2WD Double Cab 144.2" SLT,~,~Y~,40506.16,43555.00,1095.00,~043,005,006,007,003,008,016,041,012,029,068,027,028,033~,3,~2WD Double Cab 144.2"~,~SLT~,6,4,~N~,~S~,~N~,~S~,~N~,~N~,~N~,~~,~ ~,~Confirmed~,~w2015k65m22t5~,~Sierra 2500HD~,~Double Cab Standard Box 2-Wheel Drive SLT~,~Rear Wheel Drive~,~Extended Cab Pickup - Standard Bed~


After cleansing I need the file like this:



364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043|005|006|007|003|008|016|041|012|029|068|027|028|033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed


I tried sed -i -e with multiple options but the output is not 100% right.



I tried following but it's not giving right result that i wanted



sed -i -e 's/,~/|/g' file_name
sed -i -e 's/~,/|/g' file_name
sed -i -e 's/~//g' file_name
sed -i -e 's/([0-9],[0-9])/|/g' file_name
sed -i -e 's/r//g' file_name









share|improve this question















My file is a comma delimited file and text qualifier is ~, but my requirement is to find and replace comma delimited file with |(pipe) delimited file and remove text qualifier ~ with nothing, however,
I should not remove quote or double quotes or any special character within the data present in text qualifier. eg: ~abc",~ I need it as abc",



Below is the content of my source file and the way i expect the output or manipulated file to be.



Source file:



364034,2015652205,26722,2015,4,~C25753-4~,~TC25753,~,~2WD Double Cab 144.2" SLT,~,~Y~,40506.16,43555.00,1095.00,~043,005,006,007,003,008,016,041,012,029,068,027,028,033~,3,~2WD Double Cab 144.2"~,~SLT~,6,4,~N~,~S~,~N~,~S~,~N~,~N~,~N~,~~,~ ~,~Confirmed~,~w2015k65m22t5~,~Sierra 2500HD~,~Double Cab Standard Box 2-Wheel Drive SLT~,~Rear Wheel Drive~,~Extended Cab Pickup - Standard Bed~


After cleansing I need the file like this:



364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043|005|006|007|003|008|016|041|012|029|068|027|028|033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed


I tried sed -i -e with multiple options but the output is not 100% right.



I tried following but it's not giving right result that i wanted



sed -i -e 's/,~/|/g' file_name
sed -i -e 's/~,/|/g' file_name
sed -i -e 's/~//g' file_name
sed -i -e 's/([0-9],[0-9])/|/g' file_name
sed -i -e 's/r//g' file_name






shell-script sed regular-expression replace






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 2 days ago









Rui F Ribeiro

38.2k1475123




38.2k1475123










asked May 26 '17 at 20:40









Sudheer

1




1












  • format your question to make it readable
    – RomanPerekhrest
    May 26 '17 at 21:06






  • 1




    How about tr -d '~' <file | tr -s ',' '|', or sed equivalent sed 's/~//g;s/,+/|/g' file. Is this what you are looking for?
    – jimmij
    May 26 '17 at 21:13












  • Hi There, I am looking for basically replacing , with | ( but not when these are enclosed within ~abc,#~) then finally remove text qualifier ~ Input file is like: 1234,4826,~Ter1337,#~,~~,~46437uyy"@~ Desired manipulated input file 1234|4826|Ter1337,#||46437uyy"@
    – Sudheer
    May 30 '17 at 19:56




















  • format your question to make it readable
    – RomanPerekhrest
    May 26 '17 at 21:06






  • 1




    How about tr -d '~' <file | tr -s ',' '|', or sed equivalent sed 's/~//g;s/,+/|/g' file. Is this what you are looking for?
    – jimmij
    May 26 '17 at 21:13












  • Hi There, I am looking for basically replacing , with | ( but not when these are enclosed within ~abc,#~) then finally remove text qualifier ~ Input file is like: 1234,4826,~Ter1337,#~,~~,~46437uyy"@~ Desired manipulated input file 1234|4826|Ter1337,#||46437uyy"@
    – Sudheer
    May 30 '17 at 19:56


















format your question to make it readable
– RomanPerekhrest
May 26 '17 at 21:06




format your question to make it readable
– RomanPerekhrest
May 26 '17 at 21:06




1




1




How about tr -d '~' <file | tr -s ',' '|', or sed equivalent sed 's/~//g;s/,+/|/g' file. Is this what you are looking for?
– jimmij
May 26 '17 at 21:13






How about tr -d '~' <file | tr -s ',' '|', or sed equivalent sed 's/~//g;s/,+/|/g' file. Is this what you are looking for?
– jimmij
May 26 '17 at 21:13














Hi There, I am looking for basically replacing , with | ( but not when these are enclosed within ~abc,#~) then finally remove text qualifier ~ Input file is like: 1234,4826,~Ter1337,#~,~~,~46437uyy"@~ Desired manipulated input file 1234|4826|Ter1337,#||46437uyy"@
– Sudheer
May 30 '17 at 19:56






Hi There, I am looking for basically replacing , with | ( but not when these are enclosed within ~abc,#~) then finally remove text qualifier ~ Input file is like: 1234,4826,~Ter1337,#~,~~,~46437uyy"@~ Desired manipulated input file 1234|4826|Ter1337,#||46437uyy"@
– Sudheer
May 30 '17 at 19:56












2 Answers
2






active

oldest

votes

















up vote
2
down vote













ESC=$(printf '33')
RED="${ESC}[0;31m"
NC="${ESC}[0m"

sed -e '
/./!b
/[^[:space:]]/!b

s/.*/
&,/

:loop
h
s/(n),/|1/; # An empty field
s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/; # +-.NNN
s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/; # +-NNN.MMM +-NNN. +-NNN
s/(n)~([0-9][0-9]*),/2|1/; # ~NNN
s/(n)([0-9][0-9]*)~,/2|1/; # NNN~
s/(n)~([^~]*)~,/2|1/; # ~...~
x;G
/^(.*)n1$/{
g;'"s/n([^,]*)/${RED}1${NC}/"'
i
***'"${RED}ERROR${NC}"'*** Unable to process the field shown colored.

Cause of error: What this means is that this particular field is not

Fix: You should add to the sed code in the :loop label to
digest the able to be processed by the sed code as it stands.

The record with the offending field shown colored red:

q
}
g; # all clear: recover and carry on...
/n$/!bloop

s/..$//
' csv.data




Working




  • We base the solution on the various types of fields.

  • Skip empty or blank lines.

  • Append a "," to simplify regex used, we'll take it away at the end.

  • To set the ball rolling we place a marker, n, at the beginning of line. This marker will travel from left to right, jumping a field processed a time.

  • The action begins in the do-while loop, in the body of which, we process a field at a time. The field beginning is signaled by n and we process the various varieties of fields that can occur. Everytime, we bring the field processed away to the left of n and replace the , by a |.

  • The looping stops when the marker n hits the end of line /n$/ and we then take away the marker as well as the dummy , we had placed in the beginning.




Results



364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043|005|006|007|003|008|016|041|012|029|068|027|028|033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed





share|improve this answer























  • Hi Rakesh, Thanks much for replying quickly. I incorporated the logic you provided in a Shell script which is supposed to perform the task explained for multiple files in a directory. Following is the shell script and when I execute the code it gets stuck on First file execution without performing anything. Can you please check and provide any suggestions to address the issue?
    – Sudheer
    May 30 '17 at 19:44












  • Here is the shell script #!/bin/ksh cd /CustomFolder/input/Test FILES=$(ls -t *.txt) for f in $FILES do sed -e ' # skip empty and blank lines /./!b /[^[:space:]]/!b s/$/,/ G;s/(.*)(.)/21/ :loop s/(n),/|1/ s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/ s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/ s/(n)~([0-9]*),/2|1/ s/(n)([0-9]*)~,/2|1/ s/(n)~([^~]*)~,/2|1/ /n$/!bloop s/..$// ' $f done exit 0
    – Sudheer
    May 30 '17 at 19:46










  • @Sudheer What that means is that there are some fields which are not of the type being handled presently in the code. Providing a diagnostic facility within to help you zero in on the offender.
    – user218374
    May 30 '17 at 22:21


















up vote
1
down vote













I'd suggest using a dedicated CSV parser such as perl's Text::CSV



perl -MText::CSV -lne '
BEGIN{ $csv = Text::CSV->new({ quote_char => "~" , escape_char => "~" , allow_whitespace => 1}) }
print join "|", $csv->fields() if $csv->parse($_)
' file_name
364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043,005,006,007,003,008,016,041,012,029,068,027,028,033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed





share|improve this answer





















  • Thanks for your suggestion but I can't use perl, I need to use standard unix shell scripting or unix commands.
    – Sudheer
    May 30 '17 at 19:52











Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "106"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














 

draft saved


draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f367488%2freplace-comma-delimited-file-with-pipe-but-not-remove-comma-or-quotes-etc-withi%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
2
down vote













ESC=$(printf '33')
RED="${ESC}[0;31m"
NC="${ESC}[0m"

sed -e '
/./!b
/[^[:space:]]/!b

s/.*/
&,/

:loop
h
s/(n),/|1/; # An empty field
s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/; # +-.NNN
s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/; # +-NNN.MMM +-NNN. +-NNN
s/(n)~([0-9][0-9]*),/2|1/; # ~NNN
s/(n)([0-9][0-9]*)~,/2|1/; # NNN~
s/(n)~([^~]*)~,/2|1/; # ~...~
x;G
/^(.*)n1$/{
g;'"s/n([^,]*)/${RED}1${NC}/"'
i
***'"${RED}ERROR${NC}"'*** Unable to process the field shown colored.

Cause of error: What this means is that this particular field is not

Fix: You should add to the sed code in the :loop label to
digest the able to be processed by the sed code as it stands.

The record with the offending field shown colored red:

q
}
g; # all clear: recover and carry on...
/n$/!bloop

s/..$//
' csv.data




Working




  • We base the solution on the various types of fields.

  • Skip empty or blank lines.

  • Append a "," to simplify regex used, we'll take it away at the end.

  • To set the ball rolling we place a marker, n, at the beginning of line. This marker will travel from left to right, jumping a field processed a time.

  • The action begins in the do-while loop, in the body of which, we process a field at a time. The field beginning is signaled by n and we process the various varieties of fields that can occur. Everytime, we bring the field processed away to the left of n and replace the , by a |.

  • The looping stops when the marker n hits the end of line /n$/ and we then take away the marker as well as the dummy , we had placed in the beginning.




Results



364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043|005|006|007|003|008|016|041|012|029|068|027|028|033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed





share|improve this answer























  • Hi Rakesh, Thanks much for replying quickly. I incorporated the logic you provided in a Shell script which is supposed to perform the task explained for multiple files in a directory. Following is the shell script and when I execute the code it gets stuck on First file execution without performing anything. Can you please check and provide any suggestions to address the issue?
    – Sudheer
    May 30 '17 at 19:44












  • Here is the shell script #!/bin/ksh cd /CustomFolder/input/Test FILES=$(ls -t *.txt) for f in $FILES do sed -e ' # skip empty and blank lines /./!b /[^[:space:]]/!b s/$/,/ G;s/(.*)(.)/21/ :loop s/(n),/|1/ s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/ s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/ s/(n)~([0-9]*),/2|1/ s/(n)([0-9]*)~,/2|1/ s/(n)~([^~]*)~,/2|1/ /n$/!bloop s/..$// ' $f done exit 0
    – Sudheer
    May 30 '17 at 19:46










  • @Sudheer What that means is that there are some fields which are not of the type being handled presently in the code. Providing a diagnostic facility within to help you zero in on the offender.
    – user218374
    May 30 '17 at 22:21















up vote
2
down vote













ESC=$(printf '33')
RED="${ESC}[0;31m"
NC="${ESC}[0m"

sed -e '
/./!b
/[^[:space:]]/!b

s/.*/
&,/

:loop
h
s/(n),/|1/; # An empty field
s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/; # +-.NNN
s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/; # +-NNN.MMM +-NNN. +-NNN
s/(n)~([0-9][0-9]*),/2|1/; # ~NNN
s/(n)([0-9][0-9]*)~,/2|1/; # NNN~
s/(n)~([^~]*)~,/2|1/; # ~...~
x;G
/^(.*)n1$/{
g;'"s/n([^,]*)/${RED}1${NC}/"'
i
***'"${RED}ERROR${NC}"'*** Unable to process the field shown colored.

Cause of error: What this means is that this particular field is not

Fix: You should add to the sed code in the :loop label to
digest the able to be processed by the sed code as it stands.

The record with the offending field shown colored red:

q
}
g; # all clear: recover and carry on...
/n$/!bloop

s/..$//
' csv.data




Working




  • We base the solution on the various types of fields.

  • Skip empty or blank lines.

  • Append a "," to simplify regex used, we'll take it away at the end.

  • To set the ball rolling we place a marker, n, at the beginning of line. This marker will travel from left to right, jumping a field processed a time.

  • The action begins in the do-while loop, in the body of which, we process a field at a time. The field beginning is signaled by n and we process the various varieties of fields that can occur. Everytime, we bring the field processed away to the left of n and replace the , by a |.

  • The looping stops when the marker n hits the end of line /n$/ and we then take away the marker as well as the dummy , we had placed in the beginning.




Results



364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043|005|006|007|003|008|016|041|012|029|068|027|028|033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed





share|improve this answer























  • Hi Rakesh, Thanks much for replying quickly. I incorporated the logic you provided in a Shell script which is supposed to perform the task explained for multiple files in a directory. Following is the shell script and when I execute the code it gets stuck on First file execution without performing anything. Can you please check and provide any suggestions to address the issue?
    – Sudheer
    May 30 '17 at 19:44












  • Here is the shell script #!/bin/ksh cd /CustomFolder/input/Test FILES=$(ls -t *.txt) for f in $FILES do sed -e ' # skip empty and blank lines /./!b /[^[:space:]]/!b s/$/,/ G;s/(.*)(.)/21/ :loop s/(n),/|1/ s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/ s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/ s/(n)~([0-9]*),/2|1/ s/(n)([0-9]*)~,/2|1/ s/(n)~([^~]*)~,/2|1/ /n$/!bloop s/..$// ' $f done exit 0
    – Sudheer
    May 30 '17 at 19:46










  • @Sudheer What that means is that there are some fields which are not of the type being handled presently in the code. Providing a diagnostic facility within to help you zero in on the offender.
    – user218374
    May 30 '17 at 22:21













up vote
2
down vote










up vote
2
down vote









ESC=$(printf '33')
RED="${ESC}[0;31m"
NC="${ESC}[0m"

sed -e '
/./!b
/[^[:space:]]/!b

s/.*/
&,/

:loop
h
s/(n),/|1/; # An empty field
s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/; # +-.NNN
s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/; # +-NNN.MMM +-NNN. +-NNN
s/(n)~([0-9][0-9]*),/2|1/; # ~NNN
s/(n)([0-9][0-9]*)~,/2|1/; # NNN~
s/(n)~([^~]*)~,/2|1/; # ~...~
x;G
/^(.*)n1$/{
g;'"s/n([^,]*)/${RED}1${NC}/"'
i
***'"${RED}ERROR${NC}"'*** Unable to process the field shown colored.

Cause of error: What this means is that this particular field is not

Fix: You should add to the sed code in the :loop label to
digest the able to be processed by the sed code as it stands.

The record with the offending field shown colored red:

q
}
g; # all clear: recover and carry on...
/n$/!bloop

s/..$//
' csv.data




Working




  • We base the solution on the various types of fields.

  • Skip empty or blank lines.

  • Append a "," to simplify regex used, we'll take it away at the end.

  • To set the ball rolling we place a marker, n, at the beginning of line. This marker will travel from left to right, jumping a field processed a time.

  • The action begins in the do-while loop, in the body of which, we process a field at a time. The field beginning is signaled by n and we process the various varieties of fields that can occur. Everytime, we bring the field processed away to the left of n and replace the , by a |.

  • The looping stops when the marker n hits the end of line /n$/ and we then take away the marker as well as the dummy , we had placed in the beginning.




Results



364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043|005|006|007|003|008|016|041|012|029|068|027|028|033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed





share|improve this answer














ESC=$(printf '33')
RED="${ESC}[0;31m"
NC="${ESC}[0m"

sed -e '
/./!b
/[^[:space:]]/!b

s/.*/
&,/

:loop
h
s/(n),/|1/; # An empty field
s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/; # +-.NNN
s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/; # +-NNN.MMM +-NNN. +-NNN
s/(n)~([0-9][0-9]*),/2|1/; # ~NNN
s/(n)([0-9][0-9]*)~,/2|1/; # NNN~
s/(n)~([^~]*)~,/2|1/; # ~...~
x;G
/^(.*)n1$/{
g;'"s/n([^,]*)/${RED}1${NC}/"'
i
***'"${RED}ERROR${NC}"'*** Unable to process the field shown colored.

Cause of error: What this means is that this particular field is not

Fix: You should add to the sed code in the :loop label to
digest the able to be processed by the sed code as it stands.

The record with the offending field shown colored red:

q
}
g; # all clear: recover and carry on...
/n$/!bloop

s/..$//
' csv.data




Working




  • We base the solution on the various types of fields.

  • Skip empty or blank lines.

  • Append a "," to simplify regex used, we'll take it away at the end.

  • To set the ball rolling we place a marker, n, at the beginning of line. This marker will travel from left to right, jumping a field processed a time.

  • The action begins in the do-while loop, in the body of which, we process a field at a time. The field beginning is signaled by n and we process the various varieties of fields that can occur. Everytime, we bring the field processed away to the left of n and replace the , by a |.

  • The looping stops when the marker n hits the end of line /n$/ and we then take away the marker as well as the dummy , we had placed in the beginning.




Results



364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043|005|006|007|003|008|016|041|012|029|068|027|028|033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed






share|improve this answer














share|improve this answer



share|improve this answer








edited May 31 '17 at 14:47

























answered May 27 '17 at 9:49







user218374



















  • Hi Rakesh, Thanks much for replying quickly. I incorporated the logic you provided in a Shell script which is supposed to perform the task explained for multiple files in a directory. Following is the shell script and when I execute the code it gets stuck on First file execution without performing anything. Can you please check and provide any suggestions to address the issue?
    – Sudheer
    May 30 '17 at 19:44












  • Here is the shell script #!/bin/ksh cd /CustomFolder/input/Test FILES=$(ls -t *.txt) for f in $FILES do sed -e ' # skip empty and blank lines /./!b /[^[:space:]]/!b s/$/,/ G;s/(.*)(.)/21/ :loop s/(n),/|1/ s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/ s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/ s/(n)~([0-9]*),/2|1/ s/(n)([0-9]*)~,/2|1/ s/(n)~([^~]*)~,/2|1/ /n$/!bloop s/..$// ' $f done exit 0
    – Sudheer
    May 30 '17 at 19:46










  • @Sudheer What that means is that there are some fields which are not of the type being handled presently in the code. Providing a diagnostic facility within to help you zero in on the offender.
    – user218374
    May 30 '17 at 22:21


















  • Hi Rakesh, Thanks much for replying quickly. I incorporated the logic you provided in a Shell script which is supposed to perform the task explained for multiple files in a directory. Following is the shell script and when I execute the code it gets stuck on First file execution without performing anything. Can you please check and provide any suggestions to address the issue?
    – Sudheer
    May 30 '17 at 19:44












  • Here is the shell script #!/bin/ksh cd /CustomFolder/input/Test FILES=$(ls -t *.txt) for f in $FILES do sed -e ' # skip empty and blank lines /./!b /[^[:space:]]/!b s/$/,/ G;s/(.*)(.)/21/ :loop s/(n),/|1/ s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/ s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/ s/(n)~([0-9]*),/2|1/ s/(n)([0-9]*)~,/2|1/ s/(n)~([^~]*)~,/2|1/ /n$/!bloop s/..$// ' $f done exit 0
    – Sudheer
    May 30 '17 at 19:46










  • @Sudheer What that means is that there are some fields which are not of the type being handled presently in the code. Providing a diagnostic facility within to help you zero in on the offender.
    – user218374
    May 30 '17 at 22:21
















Hi Rakesh, Thanks much for replying quickly. I incorporated the logic you provided in a Shell script which is supposed to perform the task explained for multiple files in a directory. Following is the shell script and when I execute the code it gets stuck on First file execution without performing anything. Can you please check and provide any suggestions to address the issue?
– Sudheer
May 30 '17 at 19:44






Hi Rakesh, Thanks much for replying quickly. I incorporated the logic you provided in a Shell script which is supposed to perform the task explained for multiple files in a directory. Following is the shell script and when I execute the code it gets stuck on First file execution without performing anything. Can you please check and provide any suggestions to address the issue?
– Sudheer
May 30 '17 at 19:44














Here is the shell script #!/bin/ksh cd /CustomFolder/input/Test FILES=$(ls -t *.txt) for f in $FILES do sed -e ' # skip empty and blank lines /./!b /[^[:space:]]/!b s/$/,/ G;s/(.*)(.)/21/ :loop s/(n),/|1/ s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/ s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/ s/(n)~([0-9]*),/2|1/ s/(n)([0-9]*)~,/2|1/ s/(n)~([^~]*)~,/2|1/ /n$/!bloop s/..$// ' $f done exit 0
– Sudheer
May 30 '17 at 19:46




Here is the shell script #!/bin/ksh cd /CustomFolder/input/Test FILES=$(ls -t *.txt) for f in $FILES do sed -e ' # skip empty and blank lines /./!b /[^[:space:]]/!b s/$/,/ G;s/(.*)(.)/21/ :loop s/(n),/|1/ s/(n)([+-]{0,1}[.][0-9]{1,}),/2|1/ s/(n)([+-]{0,1}[0-9]{1,}([.][0-9]*){0,1}),/2|1/ s/(n)~([0-9]*),/2|1/ s/(n)([0-9]*)~,/2|1/ s/(n)~([^~]*)~,/2|1/ /n$/!bloop s/..$// ' $f done exit 0
– Sudheer
May 30 '17 at 19:46












@Sudheer What that means is that there are some fields which are not of the type being handled presently in the code. Providing a diagnostic facility within to help you zero in on the offender.
– user218374
May 30 '17 at 22:21




@Sudheer What that means is that there are some fields which are not of the type being handled presently in the code. Providing a diagnostic facility within to help you zero in on the offender.
– user218374
May 30 '17 at 22:21












up vote
1
down vote













I'd suggest using a dedicated CSV parser such as perl's Text::CSV



perl -MText::CSV -lne '
BEGIN{ $csv = Text::CSV->new({ quote_char => "~" , escape_char => "~" , allow_whitespace => 1}) }
print join "|", $csv->fields() if $csv->parse($_)
' file_name
364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043,005,006,007,003,008,016,041,012,029,068,027,028,033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed





share|improve this answer





















  • Thanks for your suggestion but I can't use perl, I need to use standard unix shell scripting or unix commands.
    – Sudheer
    May 30 '17 at 19:52















up vote
1
down vote













I'd suggest using a dedicated CSV parser such as perl's Text::CSV



perl -MText::CSV -lne '
BEGIN{ $csv = Text::CSV->new({ quote_char => "~" , escape_char => "~" , allow_whitespace => 1}) }
print join "|", $csv->fields() if $csv->parse($_)
' file_name
364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043,005,006,007,003,008,016,041,012,029,068,027,028,033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed





share|improve this answer





















  • Thanks for your suggestion but I can't use perl, I need to use standard unix shell scripting or unix commands.
    – Sudheer
    May 30 '17 at 19:52













up vote
1
down vote










up vote
1
down vote









I'd suggest using a dedicated CSV parser such as perl's Text::CSV



perl -MText::CSV -lne '
BEGIN{ $csv = Text::CSV->new({ quote_char => "~" , escape_char => "~" , allow_whitespace => 1}) }
print join "|", $csv->fields() if $csv->parse($_)
' file_name
364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043,005,006,007,003,008,016,041,012,029,068,027,028,033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed





share|improve this answer












I'd suggest using a dedicated CSV parser such as perl's Text::CSV



perl -MText::CSV -lne '
BEGIN{ $csv = Text::CSV->new({ quote_char => "~" , escape_char => "~" , allow_whitespace => 1}) }
print join "|", $csv->fields() if $csv->parse($_)
' file_name
364034|2015652205|26722|2015|4|C25753-4|TC25753,|2WD Double Cab 144.2" SLT,|Y|40506.16|43555.00|1095.00|043,005,006,007,003,008,016,041,012,029,068,027,028,033|3|2WD Double Cab 144.2"|SLT|6|4|N|S|N|S|N|N|N|| |Confirmed|w2015k65m22t5|Sierra 2500HD|Double Cab Standard Box 2-Wheel Drive SLT|Rear Wheel Drive|Extended Cab Pickup - Standard Bed






share|improve this answer












share|improve this answer



share|improve this answer










answered May 26 '17 at 22:27









steeldriver

33.6k34982




33.6k34982












  • Thanks for your suggestion but I can't use perl, I need to use standard unix shell scripting or unix commands.
    – Sudheer
    May 30 '17 at 19:52


















  • Thanks for your suggestion but I can't use perl, I need to use standard unix shell scripting or unix commands.
    – Sudheer
    May 30 '17 at 19:52
















Thanks for your suggestion but I can't use perl, I need to use standard unix shell scripting or unix commands.
– Sudheer
May 30 '17 at 19:52




Thanks for your suggestion but I can't use perl, I need to use standard unix shell scripting or unix commands.
– Sudheer
May 30 '17 at 19:52


















 

draft saved


draft discarded



















































 


draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f367488%2freplace-comma-delimited-file-with-pipe-but-not-remove-comma-or-quotes-etc-withi%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

サソリ

広島県道265号伴広島線

Setup Asymptote in Texstudio