Ghostscript takes a long time loading fonts while extracting text from a PostScript file












I want to extract the text of the first page (i.e., the cover) of the PostScript file manual.ps.



To do this, I am using the following command (the sed is needed to collapse the long runs of spaces in the output):



$ gs -q -sDEVICE=txtwrite -sOutputFile=- -dFirstPage=1 -dLastPage=1 -dNOPAUSE -dBATCH -dSAFER manual.ps | sed -e 's/ +/ /g' -e 's/^ *//'

PLAN 9
from
BELL LABS
PROGRAMMER’S MANUAL
First Edition
Computing Science Research Center
AT&T Bell Laboratories
Murray Hill, New Jersey


This command takes 16 long seconds to run on this specific file, while it takes only 0.1 s on other files. The output is correct, though: it is indeed the cover of the document.
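The sed cleanup mentioned above can be tried in isolation, without Ghostscript. What follows is only a minimal sketch, assuming a POSIX sed in which `+` is not special in basic regular expressions, so "one or more spaces" is written with the interval ` \{1,\}` instead. The function name `squeeze_spaces` is made up for illustration and is not part of the original command.

```shell
# Hypothetical helper: collapse every run of spaces into a single space,
# then strip any spaces left at the start of each line.
squeeze_spaces() {
    sed -e 's/ \{1,\}/ /g' -e 's/^ *//'
}

# Sample input standing in for the padded txtwrite output.
printf '   PLAN      9\n      from\n' | squeeze_spaces
```

With the real file, the gs output would be piped into `squeeze_spaces` in place of the inline sed expressions.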



Then I removed the -q (quiet) option to see what was happening:



$ gs -sDEVICE=txtwrite -sOutputFile=- -dFirstPage=1 -dLastPage=1 -dNOPAUSE -dBATCH -dSAFER manual.ps | sed -e 's/ +/ /g' -e 's/^ *//'

GPL Ghostscript 9.26 (2018-11-20)
Copyright (C) 2018 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Loading StandardSymbolsPS font from /usr/share/ghostscript/9.26/Resource/Font/StandardSymbolsPS... 5059180 3427366 2377528 1080269 1 done.
Loading NimbusRoman-Regular font from /usr/share/ghostscript/9.26/Resource/Font/NimbusRoman-Regular... 5109484 3613274 2397728 1095599 1 done.
Loading NimbusRoman-Italic font from /usr/share/ghostscript/9.26/Resource/Font/NimbusRoman-Italic... 5195796 3824381 2438128 1133973 2 done.
PLAN 9
from
BELL LABS
PROGRAMMER’S MANUAL
First Edition
Computing Science Research Center
AT&T Bell Laboratories
Murray Hill, New Jersey
Loading NimbusRoman-Bold font from /usr/share/ghostscript/9.26/Resource/Font/NimbusRoman-Bold... 5423508 4050637 2438128 1136854 2 done.
Loading NimbusMonoPS-Regular font from /usr/share/ghostscript/9.26/Resource/Font/NimbusMonoPS-Regular... 5651220 4284228 2559328 1246375 2 done.


For comparison, this is the output (without the -q option) of another file:



$ gs -sDEVICE=txtwrite -sOutputFile=- -dFirstPage=1 -dLastPage=1 -dNOPAUSE -dBATCH -dSAFER troff.ps | sed -e 's/ +/ /g' -e 's/^ +//'

GPL Ghostscript 9.26 (2018-11-20)
Copyright (C) 2018 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Loading StandardSymbolsPS font from /usr/share/ghostscript/9.26/Resource/Font/StandardSymbolsPS... 5059180 3436099 2801728 1490759 1 done.
Loading NimbusRoman-Regular font from /usr/share/ghostscript/9.26/Resource/Font/NimbusRoman-Regular... 5109484 3622007 2821928 1505697 1 done.
Troff User's Manual
Joseph F. Χssanna
Brian W. Kernighan
bwk@research.bell-labs.com
Introduction
Troff and nroff are text processors that format [...]


This time fonts are loaded as well, but only before the output is written, and the whole run takes only a tenth of a second.
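To compare the two runs with numbers rather than by eye, a small timing wrapper could be used. This is a rough sketch, assuming a `date` that supports `+%s`; it only resolves whole seconds, which is enough to tell a 16-second run from a sub-second one. The name `elapsed_seconds` is hypothetical, and `sleep` and `true` stand in for the gs invocations so that the block runs anywhere.

```shell
# Hypothetical helper: run a command, discard its output, and print the
# elapsed wall-clock time in whole seconds.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo "$((end - start))"
}

# Stand-in commands; replace them with the gs command lines for each file.
elapsed_seconds sleep 1
elapsed_seconds true
```

The gs command for manual.ps or troff.ps would be passed as the arguments, without the trailing sed pipeline, since the wrapper throws the output away.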



So, how can I make that first command run faster? Is it really necessary to load fonts to convert a .ps file into text? How can I optimize the gs command? And, most importantly, why does this happen only with manual.ps and not with any of the other PostScript files I have tried?










      ghostscript postscript






asked by Seninha





















