Remove all duplicate words from a string using a shell script
up vote
7
down vote
favorite
I have a string like
"aaa,aaa,aaa,bbb,bbb,ccc,bbb,ccc"
I want to remove the duplicate words from the string, so that the output is
"aaa,bbb,ccc"
I tried this code (Source):
$ echo "zebra ant spider spider ant zebra ant" | xargs -n1 | sort -u | xargs
It works fine with that sample value, but when I pass the value of my own variable, the output still contains all the duplicate words.
How can I remove the duplicate values?
UPDATE
My question is about combining all the corresponding values into a single string when the user is the same. I have data like this:
user name | colour
AAA | red
AAA | black
BBB | red
BBB | blue
AAA | blue
AAA | red
CCC | red
CCC | red
AAA | green
AAA | red
AAA | black
BBB | red
BBB | blue
AAA | blue
AAA | red
CCC | red
CCC | red
AAA | green
In my code I fetch all the distinct users and then concatenate the colour strings successfully. For that I am using this code:
# (inside the loop that reads the records)
if [ "$c" = "" ]; then   # $c is defined globally
    c="$colour1"
else
    c="$c,$colour1"
fi
When I print this $c variable I get this output (for user AAA):
"red,black,blue,red,green,red,black,blue,red,green,"
I want to remove the duplicate colours, so the desired output is
"red,black,blue,green"
To get this output I used the code above,
echo "zebra ant spider spider ant zebra ant" | xargs -n1 | sort -u | xargs
but it still displays the output with the duplicate values, like
"red,black,blue,red,green,red,black,blue,red,green,"
Thanks
shell-script shell text-processing xargs duplicate
3
Please clarify what is wrong with what you are using. I don't understand what you mean by "when I give my variable value". What value do you give? Where does it fail?
– terdon♦
Mar 23 '17 at 12:57
echo 'aaa aaa aaa bbb bbb ccc bbb ccc' | xargs -n1 | sort -u | xargs
gives aaa bbb ccc
.. so you need to show the exact code you tried and the output you got, with the string in a variable: s='aaa aaa aaa bbb bbb ccc bbb ccc'; echo "$s" | xargs -n1 | sort -u | xargs
– Sundeep
Mar 23 '17 at 13:01
The string value comes in dynamically. It prints the same value (still containing the duplicate values).
– Urvashi
Mar 23 '17 at 13:02
1
yeah, show the code that failed, otherwise how would we know what could've gone wrong?
– Sundeep
Mar 23 '17 at 13:02
Does the order matter?
– Jacob Vlijm
Mar 23 '17 at 14:06
edited May 23 '17 at 12:39
Community♦
asked Mar 23 '17 at 12:41
Urvashi
9 Answers
up vote
7
down vote
accepted
One more awk, just for fun:
$ a="aaa bbb aaa bbb ccc aaa ddd bbb ccc"
$ echo "$a" | awk '{for (i=1;i<=NF;i++) if (!a[$i]++) printf("%s%s",$i,FS)}{printf("\n")}'
aaa bbb ccc ddd
By the way, even your solution works fine with variables:
$ b="zebra ant spider spider ant zebra ant"
$ echo "$b" | xargs -n1 | sort -u | xargs
ant spider zebra
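The strings in the question are comma-separated, whereas xargs and awk split on whitespace by default, so the whole string is treated as a single word; that may be why the duplicates survived there. A minimal sketch for that case, assuming the items contain no spaces or newlines (dedupe_csv is a hypothetical helper name, not something from the question):

```shell
# dedupe_csv (hypothetical name): drop repeated items from a comma-separated
# string while keeping the first-seen order.
dedupe_csv() {
    printf '%s\n' "$1" |
        tr ',' '\n' |               # one item per line
        awk 'NF && !seen[$0]++' |   # skip empty items, print each item once
        paste -sd, -                # join the lines back with commas
}

dedupe_csv "red,black,blue,red,green,red,black,blue,red,green,"
# prints: red,black,blue,green
```

The NF test is there only to drop the empty item produced by a trailing comma, as in the $c value shown in the question.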
This works for me. Thanks @George Vasiliou
– Urvashi
Mar 24 '17 at 5:59
up vote
8
down vote
$ echo "zebra ant spider spider ant zebra ant" | awk -v RS="[ \n]+" '!n[$0]++'
zebra
ant
spider
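The pattern !n[$0]++ packs a lookup and a counter update into one expression: n[$0] is 0 (false) the first time a record is seen, so the negation is true and awk's default action prints the record; the post-increment then marks it as seen. A longer but equivalent sketch, assuming one word per input line (print_first_seen is a hypothetical name):

```shell
# print_first_seen (hypothetical name): spelled-out form of awk '!n[$0]++'.
print_first_seen() {
    awk '{
        if (count[$0] == 0)   # not seen before
            print             # so print the record
        count[$0]++           # remember it for later records
    }'
}

printf '%s\n' zebra ant spider spider ant zebra ant | print_first_seen
```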
1
Very clever!!!!
– George Vasiliou
Mar 24 '17 at 0:54
@GeorgeVasiliou, thank you [or to tell the truth, very lazy :-) ]
– JJoao
Mar 24 '17 at 8:44
up vote
6
down vote
With tr, sort and uniq:
echo "zebra ant spider spider ant zebra ant" | tr ' ' '\n' | sort | uniq
or
echo "zebra ant spider spider ant zebra ant" | tr ' ' '\n' | sort | uniq | xargs
to get one line
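For the comma-separated string from the question, the same idea might look like the sketch below, with tr splitting on commas and paste joining the result again. Note that sort -u returns the items in sorted order rather than input order. sorted_unique_csv is a hypothetical helper name, and LC_ALL=C is set only to make the ordering predictable:

```shell
# sorted_unique_csv (hypothetical name): unique items of a comma-separated
# string, in sorted order.
sorted_unique_csv() {
    printf '%s\n' "$1" |
        tr ',' '\n' |          # one item per line
        sed '/^$/d' |          # drop empty items, e.g. from a trailing comma
        LC_ALL=C sort -u |     # sort and keep one copy of each item
        paste -sd, -           # join the lines back with commas
}

sorted_unique_csv "aaa,aaa,aaa,bbb,bbb,ccc,bbb,ccc"
# prints: aaa,bbb,ccc
```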
You need to add | xargs to join the output to one line again
– Philippos
Mar 23 '17 at 12:59
3
Or use sort -u. Or even an awk '!u[$0]++'.
– Benoît
Mar 23 '17 at 18:42
1
@Benoît Wow, I did not know about sort -u. I've been using sort | uniq all this time. The wasted keystrokes...
– gardenhead
Mar 24 '17 at 1:25
up vote
2
down vote
With GNU sed:
sed ':s;s/\(\<\S*\>\)\(.*\)\<\1\>/\1\2/g;ts'
You may add ;s/  */ /g to remove duplicate spaces.
It works like this: if a word occurs a second time in the line, remove it and start over, until no duplication is found anymore.
What are \< and \>?
– someonewithpc
Mar 23 '17 at 20:19
@someonewithpc They match no character, but the beginning and end of a word to prevent substrings from being matched.
– Philippos
Mar 23 '17 at 21:29
Nice, but is that portable? Also, aren't words separated by whitespace? Seems redundant to match not whitespace followed by the end of a word.
– someonewithpc
Mar 23 '17 at 21:34
1
@someonewithpc No, it's not standard, that's why I wrote GNU sed. The nice part is that you don't have to handle the first and last string separately.
– Philippos
Mar 23 '17 at 21:44
up vote
2
down vote
perl -lane '$,=$";print grep { ! $h{$_}++ } @F'
up vote
2
down vote
Obligatory awk solution:
$ echo "ant zebra ant spider spider ant zebra ant" |
awk -vRS=" " -vORS=" " '!a[$1] {a[$1]++} END{ for (x in a) print x; } ' ; echo
zebra ant spider
(The final echo is there for the newline.)
Plus one for the awk! I was also building an awk solution just for fun. There is a slight possibility that the words are printed in random order in the END section, due to the arbitrary way awk iterates over array keys.
– George Vasiliou
Mar 23 '17 at 14:14
Yes, they will be printed in an essentially random order. The sort solution doesn't keep the original order either, though.
– ilkkachu
Mar 23 '17 at 14:17
Yes, good point! Even sort prints in different order than input.
– George Vasiliou
Mar 23 '17 at 14:18
1
@ilkkachu Actually we don't need to wait for the input to end. We can decide whether or not to print each word as it is read, with a slight modification to your code: awk -vRS=" " -vORS=" " '!a[$1]++ {print $1}' ; echo
This preserves the order.
– user218374
Mar 23 '17 at 14:31
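The same first-occurrence test can be keyed on a pair of fields, which is closer to the UPDATE in the question: collect each user's colours once, in the order they first appear. A rough sketch, assuming the "user name | colour" layout shown there with a single header line; colours_per_user is a hypothetical name and the "user: colours" output format is only illustrative:

```shell
# colours_per_user (hypothetical name): read "user | colour" rows on stdin
# and print one line per user with that user's distinct colours.
colours_per_user() {
    awk -F'[ ]*[|][ ]*' '
        NR == 1 { next }              # skip the header line
        NF < 2  { next }              # skip rows without a colour
        !seen[$1, $2]++ {             # first time this user/colour pair appears
            if ($1 in list) {
                list[$1] = list[$1] "," $2
            } else {
                order[++n] = $1       # remember users in first-seen order
                list[$1] = $2
            }
        }
        END { for (i = 1; i <= n; i++) print order[i] ": " list[order[i]] }
    '
}

colours_per_user <<'EOF'
user name | colour
AAA | red
AAA | black
BBB | red
BBB | blue
AAA | blue
AAA | red
CCC | red
CCC | red
AAA | green
EOF
# prints:
# AAA: red,black,blue,green
# BBB: red,blue
# CCC: red
```

Doing the grouping in a single awk pass avoids building the duplicated string in $c first and cleaning it up afterwards.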
up vote
1
down vote
Python
Option 1
#!/usr/bin/env python
# get_unique_words.py
import sys
l = []  # unique words, in first-seen order
for w in sys.argv[1].split(','):
    if w not in l:
        l += [ w ]
print(','.join(l))
Make executable, then call from Bash:
$ ./get_unique_words.py "aaa,aaa,aaa,bbb,bbb,ccc,bbb,ccc"
aaa,bbb,ccc
Or you could implement it as a Bash function, but the syntax is messy.
get_unique_words(){
    python -c "
l = []
for w in '$1'.split(','):
    if w not in l:
        l += [ w ]
print(','.join(l))"
}
Option 2
This option can become a one-liner if needed:
#!/usr/bin/env python
# get_unique_words.py
import sys
s_in = sys.argv[1]
l_in = s_in.split(',') # Turn the string into a list.
set_out = set(l_in) # Turning a list into a set removes duplicate items (the order is not preserved).
s_out = ','.join(set_out)
print(s_out)
In Bash:
get_unique_words(){
    python -c "print(','.join(set('$1'.split(','))))"
}
up vote
0
down vote
awk '{ delete a; for (i=1; i<=NF; i++) a[$i]++; n=asorti(a, b); for (i=1; i<=n; i++) printf "%s ", b[i]; print "" }' filename > newfile
I do not get it
– Pierre.Vriens
Dec 2 at 7:00
up vote
-2
down vote
a="aaa aaa aaa bbb bbb ccc bbb ccc"
for item in $a    # unquoted on purpose: split $a into words
do
    echo "$item"
done | sort -u | (while read -r i; do ans="$ans $i"; done; echo $ans)
Please add an explanation on how your code works and why you did this and that.
– xhienne
Mar 24 '17 at 1:37
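As far as the loop above goes, the for loop only splits the string into one word per line; sort -u does the actual de-duplication (in sorted order), and the while read loop joins the lines again. If the input order should be kept and no external commands are wanted, a sketch along these lines is one option. It assumes plain words without glob characters such as * or ?, and unique_words is a hypothetical name:

```shell
# unique_words (hypothetical name): keep the first occurrence of each word,
# using only shell built-ins.
unique_words() {
    result=""
    for word in $1; do                 # unquoted on purpose: split into words
        case " $result " in
            *" $word "*) ;;            # already collected, skip it
            *) result="${result:+$result }$word" ;;
        esac
    done
    printf '%s\n' "$result"
}

unique_words "aaa aaa aaa bbb bbb ccc bbb ccc"
# prints: aaa bbb ccc
```

Padding both the collected list and the candidate with spaces keeps a word like "ant" from matching inside a longer word like "giant".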
edited Mar 23 '17 at 14:20
answered Mar 23 '17 at 14:12
George Vasiliou
answered Mar 23 '17 at 15:25
JJoao
edited Mar 23 '17 at 13:01
answered Mar 23 '17 at 12:55
Michael D.
answered Mar 23 '17 at 12:52
Philippos
answered Mar 23 '17 at 13:07
user218374
edited Mar 23 '17 at 13:58
answered Mar 23 '17 at 13:52
ilkkachu
54.2k782147
54.2k782147
Plus one for the awk ! I was builting also an awk solution just for fun. There is a slight possibility words to be printed in random order at END section due to the random way that awk itterates in array keys.
– George Vasiliou
Mar 23 '17 at 14:14
Yes, they will be printed in an essentially random order. Thesort
solution doesn't keep the original order either, though.
– ilkkachu
Mar 23 '17 at 14:17
Yes, good point! Even sort prints in different order than input.
– George Vasiliou
Mar 23 '17 at 14:18
1
@ilkkachu Actually we don't need to wait for the input to end. We can make decision to print or not to print with a slight modification to your code:awk -vRS=" " -vORS=" " '!a[$1]++ {print $1}' ; echo
This preserves the order.
– user218374
Mar 23 '17 at 14:31
add a comment |
Plus one for the awk ! I was builting also an awk solution just for fun. There is a slight possibility words to be printed in random order at END section due to the random way that awk itterates in array keys.
– George Vasiliou
Mar 23 '17 at 14:14
Yes, they will be printed in an essentially random order. Thesort
solution doesn't keep the original order either, though.
– ilkkachu
Mar 23 '17 at 14:17
Yes, good point! Even sort prints in different order than input.
– George Vasiliou
Mar 23 '17 at 14:18
1
@ilkkachu Actually we don't need to wait for the input to end. We can make decision to print or not to print with a slight modification to your code:awk -vRS=" " -vORS=" " '!a[$1]++ {print $1}' ; echo
This preserves the order.
– user218374
Mar 23 '17 at 14:31
Plus one for the awk ! I was builting also an awk solution just for fun. There is a slight possibility words to be printed in random order at END section due to the random way that awk itterates in array keys.
– George Vasiliou
Mar 23 '17 at 14:14
Plus one for the awk ! I was builting also an awk solution just for fun. There is a slight possibility words to be printed in random order at END section due to the random way that awk itterates in array keys.
– George Vasiliou
Mar 23 '17 at 14:14
Yes, they will be printed in an essentially random order. The
sort
solution doesn't keep the original order either, though.– ilkkachu
Mar 23 '17 at 14:17
Yes, they will be printed in an essentially random order. The
sort
solution doesn't keep the original order either, though.– ilkkachu
Mar 23 '17 at 14:17
Yes, good point! Even sort prints in different order than input.
– George Vasiliou
Mar 23 '17 at 14:18
Yes, good point! Even sort prints in different order than input.
– George Vasiliou
Mar 23 '17 at 14:18
1
1
@ilkkachu Actually we don't need to wait for the input to end. We can make decision to print or not to print with a slight modification to your code:
awk -vRS=" " -vORS=" " '!a[$1]++ {print $1}' ; echo
This preserves the order.– user218374
Mar 23 '17 at 14:31
@ilkkachu Actually we don't need to wait for the input to end. We can make decision to print or not to print with a slight modification to your code:
awk -vRS=" " -vORS=" " '!a[$1]++ {print $1}' ; echo
This preserves the order.– user218374
Mar 23 '17 at 14:31
add a comment |
up vote
1
down vote
Python
Option 1
#!/usr/bin/env python
# get_unique_words.py
import sys
l =
for w in sys.argv[1].split(','):
if w not in l:
l += [ w ]
print ','.join(l)
Make executable, then call from Bash:
$ ./get_unique_words.py "aaa,aaa,aaa,bbb,bbb,ccc,bbb,ccc"
aaa,bbb,ccc
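Or you could implement it as a Bash function, but the syntax is messy. Passing the string as an argument, rather than pasting it into the Python source, keeps quotes in the input from breaking the code.
get_unique_words(){
    python -c "
import sys
l = []
for w in sys.argv[1].split(','):
    if w not in l:
        l.append(w)
print(','.join(l))" "$1"
}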
Option 2
This option can become a one-liner if needed:
#!/usr/bin/env python
# get_unique_words.py
import sys
s_in = sys.argv[1]
l_in = s_in.split(',')  # Turn the string into a list.
set_out = set(l_in)  # Turning a list into a set removes duplicate items (a set is unordered, so the original order is not kept).
s_out = ','.join(set_out)
print(s_out)
In Bash:
get_unique_words(){
    python -c "import sys; print(','.join(set(sys.argv[1].split(','))))" "$1"
}
edited May 14 '17 at 3:19
answered Mar 23 '17 at 20:34
wjandrea
up vote
0
down vote
awk '{ delete a; for (i=1; i<=NF; i++) a[$i]++; n=asorti(a, b); for (i=1; i<=n; i++) printf "%s ", b[i]; print "" }' filename > newfile
answered Dec 2 at 4:18
天津神 こと
I do not get it
– Pierre.Vriens
Dec 2 at 7:00
up vote
-2
down vote
a="aaa aaa aaa bbb bbb ccc bbb ccc"
for item in $a
do
    echo "$item"
done | sort -u | (while read -r i; do ans="$ans $i"; done ; echo $ans)
edited Mar 24 '17 at 0:27
answered Mar 24 '17 at 0:18
Tododo Fly
Please add an explanation on how your code works and why you did this and that.
– xhienne
Mar 24 '17 at 1:37
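The loop in the answer above prints one word per line and lets sort -u drop the repeats, which also reorders the words. For comparison, here is a minimal sketch of a plain POSIX shell loop that keeps the first occurrence of each word in its original position. The function name `dedupe_words` is hypothetical, and the sketch assumes the words are separated by whitespace and contain no glob characters such as `*`.

```shell
# Hypothetical helper: drop repeated words from a space-separated string,
# keeping the order in which each word first appears.
dedupe_words() {
    result=
    # $1 is deliberately unquoted so the shell splits it into words.
    for w in $1; do
        case " $result " in
            *" $w "*) ;;                             # already collected: skip
            *) result="${result:+$result }$w" ;;     # first sighting: append
        esac
    done
    printf '%s\n' "$result"
}

dedupe_words "aaa aaa aaa bbb bbb ccc bbb ccc"
# prints: aaa bbb ccc
```

The membership check is a substring match against the collected words, padded with spaces on both sides, so no external commands are needed.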
Please clarify what is wrong with what you are using. I don't understand what you mean by "when I give my variable value". What value do you give? Where does it fail?
– terdon♦
Mar 23 '17 at 12:57
echo 'aaa aaa aaa bbb bbb ccc bbb ccc' | xargs -n1 | sort -u | xargs gives aaa bbb ccc, so you need to show the exact code you tried and the output you got, with the string in a variable: s='aaa aaa aaa bbb bbb ccc bbb ccc'; echo "$s" | xargs -n1 | sort -u | xargs
– Sundeep
Mar 23 '17 at 13:01
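One likely reason the pipeline appears to do nothing with the asker's value: xargs -n1 splits its input on whitespace, so a comma-separated string with no spaces is treated as a single word and comes out unchanged. A minimal sketch that splits on commas instead is shown below. The function name `dedupe_sorted` is made up for illustration, the output is sorted rather than kept in input order, and it assumes the only empty item is the one after a trailing comma.

```shell
# Hypothetical helper: split on commas, sort uniquely, rejoin with commas.
dedupe_sorted() {
    # printf '%s' adds no newline, so a trailing comma becomes the final
    # line terminator instead of producing an empty item.
    printf '%s' "$1" | tr ',' '\n' | sort -u | paste -sd, -
}

dedupe_sorted "red,black,blue,red,green,red,black,blue,red,green,"
# prints: black,blue,green,red
```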
The string value comes in dynamically. It prints the same value, still containing the duplicates.
– Urvashi
Mar 23 '17 at 13:02
yeah, show the code that failed, otherwise how would we know what could've gone wrong?
– Sundeep
Mar 23 '17 at 13:02
Does the order matter?
– Jacob Vlijm
Mar 23 '17 at 14:06