Delete all files that DON'T have duplicate names?












Given a large list of files, containing the following:

FILE1.doc
FILE1.pdf
FILE2.doc
FILE3.doc
FILE3.pdf
FILE4.doc

Is there a terminal command that would allow me to remove all files that do not have a duplicate name in the list? In this case, that would be FILE2.doc and FILE4.doc.

shell command-line terminal

edited Jul 16 '15 at 6:09 by Anthon
asked Jul 16 '15 at 4:46 by antonpug

  • How do you know that the first filename of that list is not FILE.doc\nFILE1.pdf? Any of the newlines can be part of a file name.
    – Anthon
    Jul 16 '15 at 6:10

  • At what point are you halting the uniqueness comparison? At the first dot? At the last dot? What if there is no dot in the filename? If you had file1.doc and FILE1.DOC, is this a duplicate?
    – roaima
    Jul 16 '15 at 7:18

  • If you have two files, file.doc and file.pdf, and only one of them appears in your list, should both be deleted or just the one that is listed?
    – roaima
    Jul 16 '15 at 7:21

3 Answers
Using bash, this will remove all files that don't have another file with the same name but a different extension:

for f in *; do same=("${f%.*}".*); [ "${#same[@]}" -eq 1 ] && rm -- "$f"; done

This approach is safe for file names that contain white space or other special characters, because every expansion is quoted. The -- stops rm from treating a name that begins with a dash as an option.

How it works

  • for f in *; do

    This starts a loop over every name in the current directory that does not begin with a dot.

  • same=("${f%.*}".*)

    This creates a bash array with the names of all files that share this file's basename.

    $f is the name of our file. ${f%.*} is the name of the file without its last extension. If, for example, the file is FILE1.doc, then ${f%.*} is FILE1. "${f%.*}".* is a glob that matches all the files with the same basename and any extension. ("${f%.*}".*) is a bash array of those names. same=("${f%.*}".*) assigns the array to the variable same.

  • [ "${#same[@]}" -eq 1 ] && rm -- "$f"

    If there is only one file with this basename, we delete it.

    "${#same[@]}" is the number of elements in the array same. [ "${#same[@]}" -eq 1 ] is true if there is only one such file.

    && is logical-and. It causes the statement which follows, rm -- "$f", to be executed only if the statement which precedes it returns logical true.

  • done

    This marks the end of the for loop.
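
Because the loop deletes files, it may be worth previewing what it would remove first. The following is only a sketch of such a dry run, not part of the answer: list_unique_names is a hypothetical helper name, and the scratch directory simply recreates the sample list from the question. It uses the same array test but prints each candidate instead of calling rm.

```bash
#!/usr/bin/env bash
# Dry run: print (do not delete) the files whose name, minus its last
# extension, is not shared by another file in the given directory.
# list_unique_names is a hypothetical helper name used for illustration.
list_unique_names() (
    cd -- "$1" || exit 1
    for f in *; do
        [ -e "$f" ] || continue          # skip the literal "*" in an empty directory
        same=("${f%.*}".*)               # every name of the form STEM.<anything>
        if [ "${#same[@]}" -eq 1 ]; then
            printf '%s\n' "$f"
        fi
    done
)

# Recreate the question's sample list in a scratch directory and preview it.
demo_dir=$(mktemp -d)
touch "$demo_dir"/FILE1.doc "$demo_dir"/FILE1.pdf "$demo_dir"/FILE2.doc \
      "$demo_dir"/FILE3.doc "$demo_dir"/FILE3.pdf "$demo_dir"/FILE4.doc
list_unique_names "$demo_dir"
rm -r -- "$demo_dir"
```

On the sample list this should print FILE2.doc and FILE4.doc. A name without any dot is compared only against NAME.*, not against itself, so such files are worth checking by hand before deleting anything.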









edited Jul 16 '15 at 6:31
answered Jul 16 '15 at 5:23 by John1024

  • Thank you for a very descriptive answer. This is awesome.
    – antonpug
    Jul 16 '15 at 13:25

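Suppose your list of files is in some file /tmp/files.list, e.g. after ls * > /tmp/files.list.

Then sort -u /tmp/files.list gives you a sorted file list without duplicates (not needed if you did the ls * > /tmp/files.list above). You could process that with some awk script, inspired by this, that counts how many names share each stem (the name without its last extension) and hands the names whose stem occurs only once to rm, e.g.

sort -u /tmp/files.list | awk '
function stem(file) {          # name without its last extension
    sub("[.][^./]*$", "", file)
    return file
}
{ name[NR] = $0; count[stem($0)]++ }
END {                          # print the names whose stem occurs only once
    for (i = 1; i <= NR; i++)
        if (count[stem(name[i])] == 1) print name[i]
}' | xargs -r -d '\n' rm --    # -r and -d are GNU xargs options

This assumes one file name per line, so it cannot handle names that contain newlines.

Beware, I have not tested this.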
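
Since the answer is untested and the command deletes files, a dry run on the question's sample list may help. The sketch below rests on two assumptions: the list holds one file name per line, and the "stem" is the name without its last extension. It counts stems in awk and only prints the names whose stem occurs once. The names list_unique_stems and stem are hypothetical and chosen for illustration.

```bash
#!/usr/bin/env bash
# Read file names (one per line) on stdin and print those whose stem,
# i.e. the name without its last extension, appears only once.
# list_unique_stems is a hypothetical helper name.
list_unique_stems() {
    awk '
    function stem(file) {              # strip the last ".ext", if any
        sub("[.][^./]*$", "", file)
        return file
    }
    { name[NR] = $0; count[stem($0)]++ }
    END {
        for (i = 1; i <= NR; i++)
            if (count[stem(name[i])] == 1) print name[i]
    }'
}

# Sample list from the question.
printf '%s\n' FILE1.doc FILE1.pdf FILE2.doc FILE3.doc FILE3.pdf FILE4.doc |
    list_unique_stems
```

For the sample list this should print FILE2.doc and FILE4.doc. Once the output looks right, it could be piped to a deletion step.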







edited Apr 13 '17 at 12:36 by Community
answered Jul 16 '15 at 5:08 by Basile Starynkevitch

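Another way, which looks simple but is more complex than it seems:

for x in `for i in *; do echo $i ; done | cut -d'.' -f1 | uniq -u `; do rm $x.*; done

It takes the part of each name before the first dot, lets uniq -u keep the stems that occur only once, and removes the matching files.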
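
The one-liner above relies on word splitting, so names with spaces break it. As a rough sketch of the same idea (cut at the first dot, keep the stems that occur once), the version below quotes every expansion and reads the stems line by line. It only prints the candidates, still assumes that no name contains a newline, and list_unique_by_first_dot is a hypothetical name.

```bash
#!/usr/bin/env bash
# Print the files in a directory whose stem (everything before the FIRST
# dot) is not shared by any other file there. Nothing is deleted.
# list_unique_by_first_dot is a hypothetical helper name.
list_unique_by_first_dot() (
    cd -- "$1" || exit 1
    for f in *; do
        [ -e "$f" ] || continue            # empty directory: skip the literal "*"
        printf '%s\n' "${f%%.*}"           # stem up to the first dot
    done | sort | uniq -u |                # keep the stems that occur exactly once
    while IFS= read -r stem; do
        for match in "$stem" "$stem".*; do # the bare name, or stem plus extension(s)
            if [ -e "$match" ]; then
                printf '%s\n' "$match"
            fi
        done
    done
)

# Example on a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/FILE1.doc" "$demo_dir/FILE1.pdf" "$demo_dir/my notes.doc"
list_unique_by_first_dot "$demo_dir"
rm -r -- "$demo_dir"
```

Cutting at the first dot means that d.tar.gz and d.txt count as duplicates of each other here, which differs from stripping only the last extension.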






edited 1 hour ago by Rui F Ribeiro
answered Jul 16 '15 at 6:10 by neuron

  • Your code breaks with file names containing spaces and other special characters. See unix.stackexchange.com/questions/131766/… for some tips to fix that.
    – Gilles
    Jul 16 '15 at 21:14
