Delete multiple columns using awk or sed












I have a database with 6037 space-separated columns and 450 rows like the one below:



1807 1452 1598 1 6.655713  A B A B ... 0 
1808 1452 1763 1 9.362033 0 0 A B ... A
1809 1452 1527 2 6.728534 A B A A ... B
1810 1452 1367 2 9.4055 A B A A B ... A
... ... ... ... ... ... ... ... ... ...
1812 1452 1258 1 6.363032 0 0 A B ... B


I want to get a new database with only the first 676 columns.



Preferably with a solution that uses awk or sed.







text-processing sed awk






edited 2 hours ago by dessert
asked 3 hours ago by andrec




  • The delimiter is the space.
    – andrec, 2 hours ago

2 Answers
If the column delimiter in your file is a single character, e.g. a space, cut can do that easily:

cut -d' ' -f-676 <in >out

This prints only the space-separated columns from the first to the 676th.
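
As a quick check on a smaller scale, the sketch below keeps the first 3 or 4 columns of some made-up lines (the sample data and variable names are illustrative only, not taken from the question's file). It also shows a caveat worth knowing: cut counts every single delimiter character, so two spaces in a row delimit an empty field. The first sample row in the question appears to contain such a double space, in which case squeezing runs of spaces with tr -s first may be what you want.

```shell
# Made-up sample: the first line has two spaces between "c" and "d".
sample='a b c  d e
1 2 3 4 5'

# Keep columns 1-3; both lines use single spaces up to column 3.
first3=$(printf '%s\n' "$sample" | cut -d' ' -f-3)
printf '%s\n' "$first3"

# Keep columns 1-4: on the first line the 4th field is the empty
# string between the two spaces, so cut prints "a b c " there.
naive4=$(printf '%s\n' "$sample" | cut -d' ' -f-4)

# Squeezing runs of spaces first gives the columns one would expect.
squeezed4=$(printf '%s\n' "$sample" | tr -s ' ' | cut -d' ' -f-4)
printf '%s\n' "$squeezed4"
```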



If you need every run of whitespace characters to count as one delimiter, a sed solution (GNU sed) is:

sed -r 's/\s+\S+//676g' <in >out

Here \s+\S+ matches one column together with the whitespace in front of it. The first column has no leading whitespace, so the Nth match is column N+1, and the 676g flag deletes the 676th match and every later one, i.e. columns 677 onwards. If your lines start with whitespace, the first column is matched as well and you need 677g instead. Using character groups you can specify any set of delimiters you need, e.g. for “4”, “#” and “K”:

sed -r 's/[4#K]+[^4#K]+//676g' <in >out
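
To see how the number in the flag relates to the column number, here is a small sketch on made-up data that keeps the first 2 columns. It assumes GNU sed, since -r, \s, \S and the combination of a number with g are GNU extensions; the sample lines and variable names are illustrative.

```shell
# Made-up sample with mixed whitespace (two spaces and a tab).
line=$(printf 'a  b\tc d')

# \s+\S+ matches "  b", "<tab>c" and " d": match N is column N+1.
# 2g deletes matches 2 and 3, which leaves columns 1 and 2.
kept=$(printf '%s\n' "$line" | sed -r 's/\s+\S+//2g')
printf '%s\n' "$kept"

# A line with leading whitespace: column 1 is now match 1 as well,
# so keeping 2 columns needs 3g.
kept_lead=$(printf '  a b c d\n' | sed -r 's/\s+\S+//3g')
printf '%s\n' "$kept_lead"
```

The original whitespace between the kept columns is preserved, because sed only deletes text and never rebuilds the line.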


For a reasonable awk approach kindly refer to steeldriver’s answer, but here is another one that loops over the first 676 columns and prints them separated by FS:

awk '{for (i=1;i<=676;i++) printf "%s%s", (i==1?"":FS), $i; print ""}' <in >out

The explicit "%s%s" format keeps a % sign in the data from being read as a format specifier. If the field separator is a character group, FS is a regular expression and not usable as output text, so you have to specify the separator for the output yourself, e.g. for [4#K] and "sep":

awk -F'[4#K]' '{for (i=1;i<=676;i++) printf "%s%s", (i==1?"":"sep"), $i; print ""}' <in >out
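
A scaled-down sketch of the loop on made-up data, with n=3 (n and the sample are illustrative). It also bounds the loop by NF so that a line with fewer than n columns does not get trailing separators; that bound is an addition in this sketch, not part of the commands above.

```shell
# Made-up sample: the second line has fewer than 3 columns.
sample='a b c d e
x y'

# Print at most n columns per line, joined by a single space.
out=$(printf '%s\n' "$sample" |
  awk -v n=3 '{
    m = (NF < n) ? NF : n            # do not run past the last field
    for (i = 1; i <= m; i++)
      printf "%s%s", (i == 1 ? "" : " "), $i
    print ""                          # end the output line
  }')
printf '%s\n' "$out"
```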






edited 2 hours ago
answered 3 hours ago by dessert

























For a single-character delimiter (such as space or comma) I would recommend using the cut command over either awk or sed.

However, since you asked about awk specifically, I think a reasonable way to do it would be to decrement the field count:

awk -v last=676 '{while(NF>last) NF--} 1' datafile

Tested in GNU Awk (gawk) and mawk.
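
A scaled-down sketch of the same idea with last=3 on made-up data, followed by a check that every output line has the expected number of columns. For brevity it assigns NF directly instead of decrementing it in a loop, which should have the same effect in gawk and mawk; either way awk rebuilds the line with OFS (a single space by default). The sample and the variable names are illustrative.

```shell
# Made-up sample with 5 columns per line.
sample='1 2 3 4 5
6 7 8 9 10'

# Truncate each record to its first 3 fields.
trimmed=$(printf '%s\n' "$sample" | awk -v last=3 'NF > last {NF = last} 1')
printf '%s\n' "$trimmed"

# List the distinct field counts in the result; one value is expected.
counts=$(printf '%s\n' "$trimmed" | awk '{print NF}' | sort -u)
printf '%s\n' "$counts"
```

Running the same awk '{print NF}' | sort -u check on the real output file is a cheap way to confirm that all 450 rows ended up with 676 columns.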







edited 2 hours ago
answered 2 hours ago by steeldriver





















