How can I get the compressed file size of all the lines returned by zgrep on a .gz file? [on hold]























If I grep something on a text file then pipe to wc -c, I can see the size of all the lines returned in bytes. How can I get the compressed file size of all the lines returned by zgrep on a .gz file?



For example, I have a file named a.gz:



    zgrep abc a.gz | wc -c
    bytes of abc in gz, 395714


ll *.gz gives me:



    bytes of *.gz file, 113276


ll a (the uncompressed file) gives me:



    bytes of a, 1501625


How can I find the compressed size of all the lines returned by zgrep abc a.gz? Piping to wc -c as above gives the uncompressed size: 395714 bytes is larger than the whole 113276-byte .gz file.

























put on hold as unclear what you're asking by Jeff Schaller, JigglyNaga, Archemar, X Tian, thrig Nov 28 at 20:17


Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.











  • 1




    The size of those lines as compressed in the original file, or re-compressed?
    – Jeff Schaller
    Nov 28 at 1:38






  • 2




    I'm not sure, but I think gzip compresses data in blocks, not line-by-line. Also, zgrep's output is the decompressed string (abc in your example), not the compressed version.
    – Jeff Schaller
    Nov 28 at 1:49
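
To illustrate the point about zgrep's output, here is a minimal sketch. The sample data and the temporary directory are assumptions made up for the demonstration; only the file name a.gz and the pattern abc come from the question. It shows that zgrep counts the same bytes as an explicit decompress-then-grep pipeline, and that piping the matches through gzip -c gives their size when re-compressed on their own, which is not the same thing as their share of the original file.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the asker's file.
tmp=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 500; i++) { print "abc record " i; print "xyz entry " i } }' > "$tmp/a"
gzip -c < "$tmp/a" > "$tmp/a.gz"

# zgrep decompresses first and then greps, so wc -c sees uncompressed bytes.
zgrep_bytes=$(( $(zgrep abc "$tmp/a.gz" | wc -c) ))

# The same pipeline with the decompression step written out explicitly.
plain_bytes=$(( $(gzip -dc "$tmp/a.gz" | grep abc | wc -c) ))

# Size of the matching lines when compressed again as a separate gzip stream.
recompressed_bytes=$(( $(zgrep abc "$tmp/a.gz" | gzip -c | wc -c) ))

echo "matching lines, uncompressed:  $zgrep_bytes bytes"
echo "matching lines, re-compressed: $recompressed_bytes bytes"
rm -rf "$tmp"
```

The re-compressed figure includes a fresh gzip header and trailer and is built without any of the surrounding data, so it should be read as a separate measurement rather than a slice of a.gz.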






  • 3




    Unless your original text is large enough to be a block size, I think it'll be compressed with data around it, potentially throwing off your calculations...?
    – Jeff Schaller
    Nov 28 at 1:51






  • 1




    That's a good point. It'll be hard to strip the data compressed around it. Also, the compressing algorithm might compress each of the same pattern differently depending on the data around it.
    – kouichi
    Nov 28 at 2:00
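
The dependence on surrounding data can be seen with a small experiment. This is a sketch under assumptions: the sample line and the numeric filler are invented for the demonstration. DEFLATE, the algorithm gzip uses, can replace a string with a back-reference to an earlier copy within a 32 KiB window, so a line that already appeared nearby costs far less than the same line compressed alone.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Hypothetical sample line; any reasonably varied text would do.
line='abc the quick brown fox jumps over the lazy dog 0123456789'

# Context that already contains the line once, followed by unrelated data.
{
  printf '%s\n' "$line"
  awk 'BEGIN { for (i = 1; i <= 2000; i++) print i }'
} > "$tmp/context"

# The same context with the line appended a second time.
{ cat "$tmp/context"; printf '%s\n' "$line"; } > "$tmp/context_plus"

# Cost of the line as its own gzip stream (includes header and trailer).
alone=$(( $(printf '%s\n' "$line" | gzip -c | wc -c) ))

# Extra bytes the second copy adds to the compressed context.
without=$(( $(gzip -c < "$tmp/context" | wc -c) ))
with=$(( $(gzip -c < "$tmp/context_plus" | wc -c) ))
marginal=$(( with - without ))

echo "line compressed alone:         $alone bytes"
echo "extra cost inside the context: $marginal bytes"
rm -rf "$tmp"
```

Because the cost of a line depends on what precedes it, there is no single well-defined "compressed size" for a set of lines taken out of a larger file.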






  • 1




    gzip doesn't know or care about lines. You can get a rough estimate of how much those lines add to the size by comparing the output of wc -c a.gz to zgrep -v pattern a.gz | gzip -c | wc -c. Notice the -v option to zgrep.
    – mosvy
    Nov 28 at 3:46
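
The comparison suggested in this comment could be scripted roughly as follows. The sample data is an assumption invented so the sketch runs on its own; with a real file, replace the generated a.gz and the pattern abc with your own.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Hypothetical sample data: half the lines match "abc", half do not.
awk 'BEGIN { for (i = 1; i <= 500; i++) { print "abc record " i; print "xyz entry " i } }' > "$tmp/a"
gzip -c < "$tmp/a" > "$tmp/a.gz"

# Compressed size of the whole file.
total=$(( $(wc -c < "$tmp/a.gz") ))

# Compressed size of everything except the matching lines (note the -v).
rest=$(( $(zgrep -v abc "$tmp/a.gz" | gzip -c | wc -c) ))

# Rough estimate of how much the matching lines add to the compressed file.
estimate=$(( total - rest ))

echo "whole file:            $total bytes"
echo "without matches:       $rest bytes"
echo "estimated cost of abc: $estimate bytes"
rm -rf "$tmp"
```

The result is only an estimate. If the original a.gz was created with a different compression level, or stores the original file name in its header, the two sizes are not strictly comparable.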

















linux grep gzip wc














edited Nov 28 at 1:37 by Jeff Schaller

asked Nov 28 at 1:24 by kouichi



