Delete all files that DON'T have duplicate names?
Given a large list of files, containing the following:
FILE1.doc
FILE1.pdf
FILE2.doc
FILE3.doc
FILE3.pdf
FILE4.doc
Is there a terminal command that would allow me to remove all files that do not have a duplicate name in the list? In this case, that would be FILE2.doc and FILE4.doc.
shell command-line terminal
edited Jul 16 '15 at 6:09 by Anthon
asked Jul 16 '15 at 4:46 by antonpug
How do you know that the first filename of that list is not FILE.doc\nFILE1.pdf? Any of the newlines can be part of a file name.
– Anthon
Jul 16 '15 at 6:10
At what point are you halting the uniqueness comparison? At the first dot? At the last dot? What if there is no dot in the filename? If you had file1.doc and FILE1.DOC is this a duplicate?
– roaima
Jul 16 '15 at 7:18
If you have two files, file.doc, file.pdf and only one of them appears in your list, should both be deleted or just the one that is listed?
– roaima
Jul 16 '15 at 7:21
3 Answers
Using bash, this will remove all files that don't have another file with the same name but different extension:
for f in *; do same=("${f%.*}".*); [ "${#same[@]}" -eq 1 ] && rm "$f"; done
Because every expansion is quoted, this approach copes with file names that contain white space. Names that begin with a dash would additionally need rm -- "$f".
How it works
for f in *; do
This starts a loop over all files in the current directory.
same=("${f%.*}".*)
This creates a bash array with the names of all files with the same basename.
$f is the name of our file.
${f%.*} is the name of the file without its extension. If, for example, the file is FILE1.doc, then ${f%.*} is FILE1.
"${f%.*}".* is all the files with the same basename but any extension.
("${f%.*}".*) is a bash array of those names.
same=("${f%.*}".*) assigns the array to the variable same.
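As a quick illustration of the expansion described above, the sketch below prints the stem that ${f%.*} yields next to the one ${f%%.*} would yield, which cuts at the first dot instead of the last. The function name show_stems and the sample names are made up for the example.

```bash
#!/usr/bin/env bash
# show_stems is a hypothetical helper used only for this illustration.
show_stems() {
    local f
    for f in "$@"; do
        # %.*  removes the shortest trailing match of ".*" (from the last dot)
        # %%.* removes the longest trailing match of ".*" (from the first dot)
        printf '%s -> last-dot stem: %s, first-dot stem: %s\n' \
            "$f" "${f%.*}" "${f%%.*}"
    done
}

show_stems FILE1.doc archive.tar.gz README
```

A name with no dot at all, such as README, is left unchanged by both expansions.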
[ "${#same[@]}" -eq 1 ] && rm "$f"
If there is only one file with this basename, we delete it.
"${#same[@]}" is the number of files in the array same.
[ "${#same[@]}" -eq 1 ] is true if there is only one such file.
&& is logical-and. It causes the statement which follows, rm "$f", to be executed only if the statement which precedes it returns logical true.
done
This marks the end of the for loop.
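Before deleting anything, it can help to see what the loop would remove. The following is a minimal sketch of a dry run built on the same array test. The name list_unpaired and its directory argument are assumptions for illustration, not part of the answer, and the sketch additionally skips anything that is not a regular file.

```bash
#!/usr/bin/env bash
# list_unpaired is a hypothetical helper: it prints, one per line, the files
# in the given directory (default: current directory) that the loop would delete.
list_unpaired() {
    local dir=${1:-.} f same
    (
        cd -- "$dir" || exit 1
        for f in *; do
            [ -f "$f" ] || continue      # skip directories and an unmatched glob
            same=("${f%.*}".*)           # every name matching <stem>.*
            if [ "${#same[@]}" -eq 1 ]; then
                printf '%s\n' "$f"       # to delete instead: rm -- "$f"
            fi
        done
    )
}
```

Running list_unpaired . prints the candidates. Once the list looks right, replacing the printf with rm -- "$f" performs the deletion.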
edited Jul 16 '15 at 6:31
answered Jul 16 '15 at 5:23 by John1024
Thank you for a very descriptive answer. This is awesome.
– antonpug
Jul 16 '15 at 13:25
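Suppose your list of files is in some file /tmp/files.list, e.g. after ls * > /tmp/files.list.
Then sort -u /tmp/files.list gives you a sorted file list without duplicates (not needed if you did the ls * > /tmp/files.list above). You could process that with an awk script that counts how many names share each stem and prints the names whose stem occurs only once, e.g.
sort -u /tmp/files.list | awk '
function stem(file) {
    sub(/\.[^.]*$/, "", file)   # drop the last extension, if any
    return file
}
{ names[NR] = $0; count[stem($0)]++ }
END {
    # print only the names whose stem occurs exactly once
    for (i = 1; i <= NR; i++)
        if (count[stem(names[i])] == 1) print names[i]
}'
Once the printed names look right, they can be passed to rm.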
Beware, I have not tested this.
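If you end up with a file of names to delete, one per line, a loop like the following sketch removes them without tripping over spaces. The helper name remove_listed is hypothetical. As the comments on the question point out, a one-name-per-line list cannot represent file names that themselves contain a newline.

```bash
#!/usr/bin/env bash
# remove_listed is a hypothetical helper: it deletes every file named,
# one per line, in the list file given as its first argument.
remove_listed() {
    local list=$1 name
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue   # ignore blank lines
        rm -- "$name"                # "--" keeps names such as "-x" from being read as options
    done < "$list"
}
```

IFS= and -r keep leading blanks and backslashes in each name intact.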
edited Apr 13 '17 at 12:36 by Community♦
answered Jul 16 '15 at 5:08 by Basile Starynkevitch
Another way, simple-looking but more involved than it appears, is:
for x in `for i in *; do echo $i ; done | cut -d'.' -f1 | uniq -u `; do rm $x.*; done
edited 1 hour ago by Rui F Ribeiro
answered Jul 16 '15 at 6:10 by neuron
Your code breaks with file names containing spaces and other special characters. See unix.stackexchange.com/questions/131766/… for some tips to fix that.
– Gilles
Jul 16 '15 at 21:14
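The breakage mentioned in the comment comes from the unquoted expansions and the word splitting of the backtick substitution. Below is a sketch of the same idea that keeps each name intact. Like cut -d'.' -f1, it cuts stems at the first dot. It assumes bash 4 or later (for associative arrays), list_unique_stems is a hypothetical name, and it prints rather than deletes.

```bash
#!/usr/bin/env bash
# list_unique_stems is a hypothetical helper: given file names as arguments,
# it prints those whose first-dot stem is shared with no other argument.
list_unique_stems() {
    local f key
    local -A count=()                # associative array: needs bash >= 4
    for f in "$@"; do
        key=_${f%%.*}                # "_" prefix so the key is never empty
        count[$key]=$(( ${count[$key]:-0} + 1 ))
    done
    for f in "$@"; do
        key=_${f%%.*}
        if [ "${count[$key]}" -eq 1 ]; then
            printf '%s\n' "$f"       # to delete instead: rm -- "$f"
        fi
    done
}
```

Called as list_unique_stems * in the directory of interest, it lists the files that have no same-stem companion.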