Truncate trailing commas from each line based on 1st field in a delimited file
up vote
2
down vote
favorite
I need to removing trailing commas based on the record_type (1st field). Input file has 50 delimiters, I need reduce them based on the record type. IF 1st field is 400, remove last 10 delimiters, if 300 remove 5 delimiters, if 210 remove 2 commas. The pattern of 400, 300 and 210 repeats and the order has to me maintained.
For example:
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,,,,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,,,
I need the output as
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
I tried awk and sed but they are truncating the whole file.
text-processing awk sed
add a comment |
up vote
2
down vote
favorite
I need to removing trailing commas based on the record_type (1st field). Input file has 50 delimiters, I need reduce them based on the record type. IF 1st field is 400, remove last 10 delimiters, if 300 remove 5 delimiters, if 210 remove 2 commas. The pattern of 400, 300 and 210 repeats and the order has to me maintained.
For example:
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,,,,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,,,
I need the output as
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
I tried awk and sed but they are truncating the whole file.
text-processing awk sed
you saidif 200 remove 2 commas
- did you mean if 210 remove 2 commas?
– рüффп
Jan 18 '17 at 22:33
yes, 210. Thank you.
– sata
Jan 18 '17 at 22:47
This seems strange to me. You don't want the same number of fields on all lines?
– Kusalananda
Jan 26 '17 at 12:18
add a comment |
up vote
2
down vote
favorite
up vote
2
down vote
favorite
I need to removing trailing commas based on the record_type (1st field). Input file has 50 delimiters, I need reduce them based on the record type. IF 1st field is 400, remove last 10 delimiters, if 300 remove 5 delimiters, if 210 remove 2 commas. The pattern of 400, 300 and 210 repeats and the order has to me maintained.
For example:
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,,,,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,,,
I need the output as
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
I tried awk and sed but they are truncating the whole file.
text-processing awk sed
I need to removing trailing commas based on the record_type (1st field). Input file has 50 delimiters, I need reduce them based on the record type. IF 1st field is 400, remove last 10 delimiters, if 300 remove 5 delimiters, if 210 remove 2 commas. The pattern of 400, 300 and 210 repeats and the order has to me maintained.
For example:
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,,,,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,,,
I need the output as
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
I tried awk and sed but they are truncating the whole file.
text-processing awk sed
text-processing awk sed
edited Nov 21 at 21:07
Rui F Ribeiro
38.2k1475125
38.2k1475125
asked Jan 18 '17 at 21:51
sata
112
112
you saidif 200 remove 2 commas
- did you mean if 210 remove 2 commas?
– рüффп
Jan 18 '17 at 22:33
yes, 210. Thank you.
– sata
Jan 18 '17 at 22:47
This seems strange to me. You don't want the same number of fields on all lines?
– Kusalananda
Jan 26 '17 at 12:18
add a comment |
you saidif 200 remove 2 commas
- did you mean if 210 remove 2 commas?
– рüффп
Jan 18 '17 at 22:33
yes, 210. Thank you.
– sata
Jan 18 '17 at 22:47
This seems strange to me. You don't want the same number of fields on all lines?
– Kusalananda
Jan 26 '17 at 12:18
you said
if 200 remove 2 commas
- did you mean if 210 remove 2 commas?– рüффп
Jan 18 '17 at 22:33
you said
if 200 remove 2 commas
- did you mean if 210 remove 2 commas?– рüффп
Jan 18 '17 at 22:33
yes, 210. Thank you.
– sata
Jan 18 '17 at 22:47
yes, 210. Thank you.
– sata
Jan 18 '17 at 22:47
This seems strange to me. You don't want the same number of fields on all lines?
– Kusalananda
Jan 26 '17 at 12:18
This seems strange to me. You don't want the same number of fields on all lines?
– Kusalananda
Jan 26 '17 at 12:18
add a comment |
3 Answers
3
active
oldest
votes
up vote
3
down vote
Sed can meet your requirements. This will match the desired beginning of the string, and then remove the desired number of commas from the end.
sed -e '/^400/ s/,{10}$//' -e '/^300/ s/,{5}$//' -e '/^210/ s/,{2}$//'
add a comment |
up vote
1
down vote
AWK approach. We define trunk
function to print a sub-string of the whole line, from index 0 to index length - n characters. The rest is simply pattern matching, and calling trunk
function with appropriate number of characters to be removed.
As one liner :
$ awk -F ',' 'function trunk(n){print substr($0,0,length($0)-n)}; $1==400{trunk(10)};$1==300{trunk(5)};$1==210{trunk(2)} ' input.txt
From a script this would be like so:
#!/usr/bin/awk -f
BEGIN { FS="," };
function trunk(n){
print substr($0,0,length($0)-n)
};
$1==400{ trunk(10)};
$1==300{trunk(5)};
$1==210{trunk(2)};
And here it is in action:
$ ./trunk_lines.awk input.txt
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
add a comment |
up vote
1
down vote
Given that the trailing fields are empty (or if you want to delete them as well)
awk -F, -vOFS=, '$1=="400"{NF-=10} $1=="300"{NF-=5} $1=="210"{NF-=2} 1' file
or if you want to be clever (which can be both a good and bad thing)
awk -F, -vOFS=, 'BEGIN{x[400]=10;x[300]=5;x[210]=2} {NF-=x[$1]} 1' file
Trailing fields are mostly empty..I just had to match the number of delimiters for the other application to accept my file.
– sata
Jan 18 '17 at 23:09
add a comment |
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
3
down vote
Sed can meet your requirements. This will match the desired beginning of the string, and then remove the desired number of commas from the end.
sed -e '/^400/ s/,{10}$//' -e '/^300/ s/,{5}$//' -e '/^210/ s/,{2}$//'
add a comment |
up vote
3
down vote
Sed can meet your requirements. This will match the desired beginning of the string, and then remove the desired number of commas from the end.
sed -e '/^400/ s/,{10}$//' -e '/^300/ s/,{5}$//' -e '/^210/ s/,{2}$//'
add a comment |
up vote
3
down vote
up vote
3
down vote
Sed can meet your requirements. This will match the desired beginning of the string, and then remove the desired number of commas from the end.
sed -e '/^400/ s/,{10}$//' -e '/^300/ s/,{5}$//' -e '/^210/ s/,{2}$//'
Sed can meet your requirements. This will match the desired beginning of the string, and then remove the desired number of commas from the end.
sed -e '/^400/ s/,{10}$//' -e '/^300/ s/,{5}$//' -e '/^210/ s/,{2}$//'
answered Jan 18 '17 at 22:19
Stephen Rauch
3,308101328
3,308101328
add a comment |
add a comment |
up vote
1
down vote
AWK approach. We define trunk
function to print a sub-string of the whole line, from index 0 to index length - n characters. The rest is simply pattern matching, and calling trunk
function with appropriate number of characters to be removed.
As one liner :
$ awk -F ',' 'function trunk(n){print substr($0,0,length($0)-n)}; $1==400{trunk(10)};$1==300{trunk(5)};$1==210{trunk(2)} ' input.txt
From a script this would be like so:
#!/usr/bin/awk -f
BEGIN { FS="," };
function trunk(n){
print substr($0,0,length($0)-n)
};
$1==400{ trunk(10)};
$1==300{trunk(5)};
$1==210{trunk(2)};
And here it is in action:
$ ./trunk_lines.awk input.txt
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
add a comment |
up vote
1
down vote
AWK approach. We define trunk
function to print a sub-string of the whole line, from index 0 to index length - n characters. The rest is simply pattern matching, and calling trunk
function with appropriate number of characters to be removed.
As one liner :
$ awk -F ',' 'function trunk(n){print substr($0,0,length($0)-n)}; $1==400{trunk(10)};$1==300{trunk(5)};$1==210{trunk(2)} ' input.txt
From a script this would be like so:
#!/usr/bin/awk -f
BEGIN { FS="," };
function trunk(n){
print substr($0,0,length($0)-n)
};
$1==400{ trunk(10)};
$1==300{trunk(5)};
$1==210{trunk(2)};
And here it is in action:
$ ./trunk_lines.awk input.txt
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
add a comment |
up vote
1
down vote
up vote
1
down vote
AWK approach. We define trunk
function to print a sub-string of the whole line, from index 0 to index length - n characters. The rest is simply pattern matching, and calling trunk
function with appropriate number of characters to be removed.
As one liner :
$ awk -F ',' 'function trunk(n){print substr($0,0,length($0)-n)}; $1==400{trunk(10)};$1==300{trunk(5)};$1==210{trunk(2)} ' input.txt
From a script this would be like so:
#!/usr/bin/awk -f
BEGIN { FS="," };
function trunk(n){
print substr($0,0,length($0)-n)
};
$1==400{ trunk(10)};
$1==300{trunk(5)};
$1==210{trunk(2)};
And here it is in action:
$ ./trunk_lines.awk input.txt
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
AWK approach. We define trunk
function to print a sub-string of the whole line, from index 0 to index length - n characters. The rest is simply pattern matching, and calling trunk
function with appropriate number of characters to be removed.
As one liner :
$ awk -F ',' 'function trunk(n){print substr($0,0,length($0)-n)}; $1==400{trunk(10)};$1==300{trunk(5)};$1==210{trunk(2)} ' input.txt
From a script this would be like so:
#!/usr/bin/awk -f
BEGIN { FS="," };
function trunk(n){
print substr($0,0,length($0)-n)
};
$1==400{ trunk(10)};
$1==300{trunk(5)};
$1==210{trunk(2)};
And here it is in action:
$ ./trunk_lines.awk input.txt
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
answered Jan 18 '17 at 22:58
Sergiy Kolodyazhnyy
8,11212051
8,11212051
add a comment |
add a comment |
up vote
1
down vote
Given that the trailing fields are empty (or if you want to delete them as well)
awk -F, -vOFS=, '$1=="400"{NF-=10} $1=="300"{NF-=5} $1=="210"{NF-=2} 1' file
or if you want to be clever (which can be both a good and bad thing)
awk -F, -vOFS=, 'BEGIN{x[400]=10;x[300]=5;x[210]=2} {NF-=x[$1]} 1' file
Trailing fields are mostly empty..I just had to match the number of delimiters for the other application to accept my file.
– sata
Jan 18 '17 at 23:09
add a comment |
up vote
1
down vote
Given that the trailing fields are empty (or if you want to delete them as well)
awk -F, -vOFS=, '$1=="400"{NF-=10} $1=="300"{NF-=5} $1=="210"{NF-=2} 1' file
or if you want to be clever (which can be both a good and bad thing)
awk -F, -vOFS=, 'BEGIN{x[400]=10;x[300]=5;x[210]=2} {NF-=x[$1]} 1' file
Trailing fields are mostly empty..I just had to match the number of delimiters for the other application to accept my file.
– sata
Jan 18 '17 at 23:09
add a comment |
up vote
1
down vote
up vote
1
down vote
Given that the trailing fields are empty (or if you want to delete them as well)
awk -F, -vOFS=, '$1=="400"{NF-=10} $1=="300"{NF-=5} $1=="210"{NF-=2} 1' file
or if you want to be clever (which can be both a good and bad thing)
awk -F, -vOFS=, 'BEGIN{x[400]=10;x[300]=5;x[210]=2} {NF-=x[$1]} 1' file
Given that the trailing fields are empty (or if you want to delete them as well)
awk -F, -vOFS=, '$1=="400"{NF-=10} $1=="300"{NF-=5} $1=="210"{NF-=2} 1' file
or if you want to be clever (which can be both a good and bad thing)
awk -F, -vOFS=, 'BEGIN{x[400]=10;x[300]=5;x[210]=2} {NF-=x[$1]} 1' file
answered Jan 18 '17 at 22:59
dave_thompson_085
2,02711011
2,02711011
Trailing fields are mostly empty..I just had to match the number of delimiters for the other application to accept my file.
– sata
Jan 18 '17 at 23:09
add a comment |
Trailing fields are mostly empty..I just had to match the number of delimiters for the other application to accept my file.
– sata
Jan 18 '17 at 23:09
Trailing fields are mostly empty..I just had to match the number of delimiters for the other application to accept my file.
– sata
Jan 18 '17 at 23:09
Trailing fields are mostly empty..I just had to match the number of delimiters for the other application to accept my file.
– sata
Jan 18 '17 at 23:09
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f338460%2ftruncate-trailing-commas-from-each-line-based-on-1st-field-in-a-delimited-file%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
you said
if 200 remove 2 commas
- did you mean if 210 remove 2 commas?– рüффп
Jan 18 '17 at 22:33
yes, 210. Thank you.
– sata
Jan 18 '17 at 22:47
This seems strange to me. You don't want the same number of fields on all lines?
– Kusalananda
Jan 26 '17 at 12:18