Shell script sort











I'm trying to sort a small file in which some entries contain two words, but I want each such pair to be sorted as a single entry.

For example, consider this small list:

peter barker painter
carl baker cook
joshua carpenter

These are all names and occupations. Now say I want to use sort to sort these entries.

The problem is that sort splits fields on whitespace, so if I run sort -k 1n I'll sort by first name only.

But I want to sort by full name, and then have the option to sort by occupation as well. As you can see, some entries don't have a full name: joshua only has his first name and his occupation. For him I want to sort only by first name, but for the others by full name.

Can this be achieved?










      linux text-processing sort






      asked Nov 23 at 15:42 by mrmagin
      edited Nov 23 at 15:58 by Kusalananda

          1 Answer
          Assuming that only the surname is ever missing (never the first name), and that the individual words in the file do not contain spaces (which would make this extremely difficult), first convert the data to tab-delimited format, with each missing surname replaced by an empty field:



          $ awk -v OFS='\t' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' <file
          peter barker painter
          carl baker cook
          joshua carpenter


          The awk script handles lines that contain two or three fields. Lines that already have three fields are simply rewritten as three tab-delimited fields (assigning $1 = $1 forces awk to rebuild the line using OFS). For lines that originally contained only two fields, the second field is moved into the third and the second is left empty.
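
Because tabs are hard to see in a terminal, the following minimal sketch makes the empty surname field visible. It feeds the sample data from the question through the awk program on standard input instead of a file, and the variable name reformatted is only an illustration, not part of the original answer.

```shell
#!/bin/sh
# Run the awk reformatting step on the sample data from the question.
# OFS is a real tab here, written as '\t' (awk expands the escape).
reformatted=$(
    printf '%s\n' ' peter barker painter' 'carl baker cook' 'joshua carpenter' |
    awk -v OFS='\t' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }'
)

# Display each tab as "|" so that the empty second field of the
# two-word line shows up as "||".
printf '%s\n' "$reformatted" | tr '\t' '|'
```

The two-word line should come out as joshua||carpenter, which confirms that the occupation is now in the third field for every line.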



          Then sort the data with tabs as delimiters:



          $ awk -v OFS='\t' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' <file | sort -t $'\t' -k1,2 -k3
          carl baker cook
          joshua carpenter
          peter barker painter


          The sort is on the full name (fields one and two), then on occupation. This assumes a shell such as bash that understands $'\t' as a tab character.
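
The question also asks for the option to sort by occupation. One way to do that, sketched here under the same assumptions as the answer, is to list the third field first in the sort keys. The tab variable is an illustration for shells that lack $'...' quoting, and by_occupation is a hypothetical variable name.

```shell
#!/bin/sh
# A literal tab character, obtained portably (no $'\t' needed).
tab=$(printf '\t')

# Reformat as in the answer, then sort on occupation (field 3) first
# and use the full name (fields 1 and 2) only to break ties.
by_occupation=$(
    printf '%s\n' ' peter barker painter' 'carl baker cook' 'joshua carpenter' |
    awk -v OFS='\t' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' |
    sort -t "$tab" -k3,3 -k1,2
)

printf '%s\n' "$by_occupation"
```

With the sample data this puts the carpenter first, then the cook, then the painter. Writing the key as -k3,3 rather than -k3 limits the first key to the occupation field alone.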





          Instead of tabs, you may use any other character that does not interfere with the data (here :):



          $ awk -v OFS=':' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' <file | sort -t ':' -k1,2 -k3
          carl:baker:cook
          joshua::carpenter
          peter:barker:painter


          Then replace the chosen delimiter character by passing the result through tr (here replacing with tabs, because it looks nice):



          $ awk -v OFS=':' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' <file | sort -t ':' -k1,2 -k3 | tr ':' '\t'
          carl baker cook
          joshua carpenter
          peter barker painter
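
The requirement that the delimiter "does not interfere with the data" can be checked before sorting. The sketch below wraps the colon-based pipeline in a function that refuses input already containing a colon. The function name sort_people and the error message are hypothetical, not part of the original answer.

```shell
#!/bin/sh
# sort_people (hypothetical name) reads the list on standard input and
# writes it sorted by full name, then occupation, as tab-delimited text.
sort_people() {
    input=$(cat)

    # The colon is only safe as a delimiter if the data never contains it.
    if printf '%s\n' "$input" | grep -q ':'; then
        echo 'error: input contains ":", choose another delimiter' >&2
        return 1
    fi

    printf '%s\n' "$input" |
        awk -v OFS=':' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' |
        sort -t ':' -k1,2 -k3 |
        tr ':' '\t'
}

sorted=$(printf '%s\n' ' peter barker painter' 'carl baker cook' 'joshua carpenter' | sort_people)
printf '%s\n' "$sorted"
```

Reading all input into a variable is fine for a small file like the one in the question. For large inputs, a temporary file would be a more reasonable choice.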





          answered Nov 23 at 15:50 by Kusalananda
          edited Nov 23 at 15:57