/bin/ls: Argument list too long
I am a biology person running a program named AutoDock. I have some files from the ZINC library in .mol2 format. As required, I split these files with the csplit command, and all of the pieces ended up in my directory. The parent file was split into a very large number of small files, each named like ZINC14382748.mol2. Now I have to convert all of these files to pdbqt format, and I have to use the following script:

#!/bin/csh
#
# $Id: ex02.csh,v 1.5 2007/07/19 21:52:59 rhuey Exp $
#
# use the 'prepare_ligands.py' python script to create pdbq files
cd $VSTROOT/VirtualScreening/Ligands
foreach f (`ls *`)
    echo $f
    pythonsh ../../prepare_ligand4.py -l $f -d ../etc/ligand_dict.py
end
When I run it, it says:
/bin/ls: Argument list too long
In short, upon successful completion it should duplicate that large number of files into another format. Is there any reasonable solution to tackle this problem?
shell-script files wildcards arguments csh
What is the command you typed? It probably contained something like ls *, which will expand to all the files in your directory. If you have a lot of those files, the resulting command line is too long to process.
– michas
Jul 9 '16 at 3:16
I used the following script, and I forgot to mention that I am working on a workstation with 32 GB of RAM; it reports 6 cores, 12 processors, and a cache size of 12288.
– Ash
Jul 9 '16 at 3:27
#!/bin/csh
#
# $Id: ex02.csh,v 1.5 2007/07/19 21:52:59 rhuey Exp $
#
# use the 'prepare_ligands.py' python script to create pdbq files
cd $VSTROOT/VirtualScreening/Ligands
foreach f (`ls *`)
    echo $f
    pythonsh ../../prepare_ligand4.py -l $f -d ../etc/ligand_dict.py
end
– Ash
Jul 9 '16 at 3:27
Please edit your question so that it has the script in it.
– Stephen Harris
Jul 9 '16 at 3:54
Try changing the loop to foreach f (./*) instead of foreach f (`ls *`).
– clk
Jul 9 '16 at 4:36
edited Jul 9 '16 at 22:38 by Gilles
asked Jul 9 '16 at 3:13 by Ash
4 Answers
Don't parse the output of ls. Just say foreach f (*).

Also:
- You should always quote your shell variable references (e.g., "$f") unless you have a good reason not to, and you're sure you know what you're doing.

answered Jul 9 '16 at 4:52 by G-Man
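For readers not tied to csh, the same idea carries over to any Bourne-style shell: iterating a glob directly keeps the expansion inside the shell, so no external command ever receives the huge argument list. A minimal sketch (the loop body stands in for the real pythonsh call, and 100 files stand in for millions):

```shell
#!/bin/sh
# Demo: looping over a glob never passes the expansion to an external
# program, so the kernel's argv size limit is not involved at all.
dir=$(mktemp -d)
cd "$dir" || exit 1
for i in $(seq 1 100); do touch "ZINC$i.mol2"; done

count=0
for f in ./*.mol2; do        # glob expands inside the shell itself
    count=$((count + 1))     # stand-in for the real pythonsh call
done
echo "$count"                # prints 100
cd / && rm -rf "$dir"
```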
Now it says "command not found". Well, I think I need to recompile my kernel. I have already changed values directly in this file, binfmts.h. Do you know how to successfully recompile it? I am using a workstation with 32 GB of RAM.
– Ash
Jul 9 '16 at 7:15
Well, go ahead and do what you think you need to do, but I doubt that recompiling your kernel will have any effect on whether this script works.
– G-Man
Jul 9 '16 at 7:28
@Ash It should not say "command not found". Please show your new script and the complete error message. Exactly what did you type to run it? Did you maybe try to use (ba)sh to run it? Try running it explicitly with csh: csh myscript. Also be aware that your problem has nothing to do with the memory of your computer or the configuration of your kernel. It is simply a bug in your script.
– michas
Jul 9 '16 at 9:28
Dear michas, in tcsh I run it with "source ex02.csh"; the error is the same, "argument too long". In bash I ran "./ex02.csh"; same error. I changed (`ls *`) to (./ *) and got an error like "./: Permission denied". I also tried your way, "csh ex02.csh"; same error, "argument too long".
– Ash
Jul 9 '16 at 10:48
The source of the problem is that you have far too many small files.

If I'm reading it right, you have over 14 million files. There is no way that ANY shell is going to be able to put over 14 million file names on the command line. Aside from that, your filenames seem to be about 18 characters long, so that's roughly 18 * 14M, or about 252 megabytes, just to hold the filenames.

bash, for example, has a limit of 128 KB, ever so slightly smaller than 252 MB. I have no idea what limit csh has because I don't use it, but it's unlikely to be any larger than bash's command-line length limit, and it certainly won't be 252 MB or greater.
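Whatever the exact figure on a given machine, the limit on the combined size of arguments and environment is queryable. A quick sketch:

```shell
# Query the system's limit on argv + environment passed to exec().
# The value varies by system; on Linux it is commonly a few megabytes,
# far below the ~252 MB this file set would need.
arg_max=$(getconf ARG_MAX)
echo "$arg_max"
```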
However, all is not lost: you can use find ... -exec instead.

find . -maxdepth 1 -type f -name '*.mol2' \
    -exec pythonsh ../../prepare_ligand4.py -l {} -d ../etc/ligand_dict.py \;
This will run prepare_ligand4.py ONCE for each file, so it will take a very long time. You might be able to speed it up a little (not much, not with 14+M files to process) by using find ... -print0 with xargs -0 -P ... or GNU parallel -0 ... instead of find ... -exec.
A much better solution would be to download the source code for prepare_ligand4.py and modify it so that you can give it one large file (e.g. the original file before csplit-ing it) and have it process each block individually. That would be much faster and easier to work with. You'll probably still have over 14M output files (assuming that a combined output file would be useless... if it isn't, you're in luck!), but that's better than having 14M input files AND 14M output files.

This would, of course, require some skill with python programming.

Maybe someone has already encountered the same problem and written their own enhanced version of prepare_ligand4.py. It's worth spending some time searching, or maybe try the AutoDock forum or even contact the AutoDock author.

answered Jul 9 '16 at 15:41 by cas
The 128 kB limit is not a bash limit, it's a kernel limit. Bash can handle command lines that are as long as memory allows, but they have to be calls to builtins or functions. Contrast bash -c 'echo {1..999999} >/dev/null' with bash -c '/bin/echo {1..999999} >/dev/null'.
– Gilles
Jul 9 '16 at 22:34
You obviously have a lot of files. Consider using GNU Parallel: http://www.gnu.org/software/parallel/ ('ls -U' does not sort the files, which makes it faster.)

cd $VSTROOT/VirtualScreening/Ligands
ls -U ZINC* | parallel "echo {}; pythonsh ../../prepare_ligand4.py -l {} -d ../etc/ligand_dict.py"

I don't understand why you echo it. Do you pass it on to a new script? My guess is that 'prepare_ligand4.py' is the script that does the conversion, in which case this should do the job (in parallel):

cd $VSTROOT/VirtualScreening/Ligands
ls -U ZINC* | parallel pythonsh ../../prepare_ligand4.py -l {} -d ../etc/ligand_dict.py
Yes man, I have 24 million files, but the good thing is that all of them are zipped into different files, and each of those contains about 136,000 files. Sorry, I am not good at computer stuff; my friend helped me and changed the csh to bash for me. So if you mean I can run it very fast, I will be happy to see and try your script :P. Most of the things you explained above are out of my box. Cheers @hschou
– Ash
Jul 9 '16 at 11:43
Your script will only process one file at a time, so it will take a long time. Install GNU Parallel to use all your cores in parallel.
– hschou
Jul 9 '16 at 14:46
I have solved the problem; let me share it with you. I renamed bash.csh to bash.sh, then changed my script so it runs in bash. Here is my new script, to help others with the same issue in the future:
#!/bin/bash
cd $VSTROOT/VirtualScreening/Ligands/
for f in ZINC*.mol2
do
echo "$f"
pythonsh ../../prepare_ligand4.py -l "$f" -d ../etc/ligand_dict.py
done
For a beginner like me: here ZINC is a name component present in all of the ligand file names, so adjust it according to your ligand names. Thanks for your time, and thanks to my friend here who helped me passionately.
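One small robustness tweak to the script above: when no ZINC*.mol2 files match, the glob is left unexpanded and the loop runs once on the literal pattern string. A portable guard (a minimal sketch) skips that case:

```shell
#!/bin/sh
# In an empty directory the glob 'ZINC*.mol2' stays literal, so the
# loop would otherwise run once with f='ZINC*.mol2'. The [ -e ] test
# filters that out.
dir=$(mktemp -d)          # empty directory: no .mol2 files at all
cd "$dir" || exit 1

ran=0
for f in ZINC*.mol2; do
    [ -e "$f" ] || continue   # skip the unexpanded literal pattern
    ran=1                     # would call pythonsh on "$f" here
done
echo "$ran"               # prints 0: no real files, body skipped
cd / && rm -rf "$dir"
```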
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
Don’t parse the output ofls
.
Just sayforeach f (*)
. Also,- You should always quote your shell variable references
(e.g.,"$f"
) unless you have a good reason not to,
and you’re sure you know what you’re doing.
Now it says, command not found. Well I think I need to recompile my kernel. Already I have change values directly in to this file, binfmts.h. Do you know how it successfully recompile it? I am using a work station with 32G of RAM.
– Ash
Jul 9 '16 at 7:15
2
Well, go ahead and do what you think you need to do, but I doubt that recompiling your kernel will have any effect on whether this script works.
– G-Man
Jul 9 '16 at 7:28
1
@Ash It should not say "command not found". Please show your new script and the complete error message. Exactly what did you type to run it? Did you maybe try to use (ba)sh to run it? Try to run it explicitly with csh:csh myscript
Also be aware that your problem has nothing to do with the memory of your computer or the configuration of your kernel. It is simply a bug in your script.
– michas
Jul 9 '16 at 9:28
Dear Michas, in tcsh i run it with source ex02.csh, the error is same "argument too long". In bash i ran it "./ex02.sch", the error is same "argument too long". I changed ("ls *") to ("./ *") and got an error like this, "./:Permission denied"). Also i tried your way, csh ex02.csh, error is same "argument too long"
– Ash
Jul 9 '16 at 10:48
add a comment |
Don’t parse the output ofls
.
Just sayforeach f (*)
. Also,- You should always quote your shell variable references
(e.g.,"$f"
) unless you have a good reason not to,
and you’re sure you know what you’re doing.
Now it says, command not found. Well I think I need to recompile my kernel. Already I have change values directly in to this file, binfmts.h. Do you know how it successfully recompile it? I am using a work station with 32G of RAM.
– Ash
Jul 9 '16 at 7:15
2
Well, go ahead and do what you think you need to do, but I doubt that recompiling your kernel will have any effect on whether this script works.
– G-Man
Jul 9 '16 at 7:28
1
@Ash It should not say "command not found". Please show your new script and the complete error message. Exactly what did you type to run it? Did you maybe try to use (ba)sh to run it? Try to run it explicitly with csh:csh myscript
Also be aware that your problem has nothing to do with the memory of your computer or the configuration of your kernel. It is simply a bug in your script.
– michas
Jul 9 '16 at 9:28
Dear Michas, in tcsh i run it with source ex02.csh, the error is same "argument too long". In bash i ran it "./ex02.sch", the error is same "argument too long". I changed ("ls *") to ("./ *") and got an error like this, "./:Permission denied"). Also i tried your way, csh ex02.csh, error is same "argument too long"
– Ash
Jul 9 '16 at 10:48
add a comment |
Don’t parse the output ofls
.
Just sayforeach f (*)
. Also,- You should always quote your shell variable references
(e.g.,"$f"
) unless you have a good reason not to,
and you’re sure you know what you’re doing.
Don’t parse the output ofls
.
Just sayforeach f (*)
. Also,- You should always quote your shell variable references
(e.g.,"$f"
) unless you have a good reason not to,
and you’re sure you know what you’re doing.
answered Jul 9 '16 at 4:52
G-ManG-Man
13.2k93466
13.2k93466
Now it says, command not found. Well I think I need to recompile my kernel. Already I have change values directly in to this file, binfmts.h. Do you know how it successfully recompile it? I am using a work station with 32G of RAM.
– Ash
Jul 9 '16 at 7:15
2
Well, go ahead and do what you think you need to do, but I doubt that recompiling your kernel will have any effect on whether this script works.
– G-Man
Jul 9 '16 at 7:28
1
@Ash It should not say "command not found". Please show your new script and the complete error message. Exactly what did you type to run it? Did you maybe try to use (ba)sh to run it? Try to run it explicitly with csh:csh myscript
Also be aware that your problem has nothing to do with the memory of your computer or the configuration of your kernel. It is simply a bug in your script.
– michas
Jul 9 '16 at 9:28
Dear Michas, in tcsh i run it with source ex02.csh, the error is same "argument too long". In bash i ran it "./ex02.sch", the error is same "argument too long". I changed ("ls *") to ("./ *") and got an error like this, "./:Permission denied"). Also i tried your way, csh ex02.csh, error is same "argument too long"
– Ash
Jul 9 '16 at 10:48
add a comment |
Now it says, command not found. Well I think I need to recompile my kernel. Already I have change values directly in to this file, binfmts.h. Do you know how it successfully recompile it? I am using a work station with 32G of RAM.
– Ash
Jul 9 '16 at 7:15
2
Well, go ahead and do what you think you need to do, but I doubt that recompiling your kernel will have any effect on whether this script works.
– G-Man
Jul 9 '16 at 7:28
1
@Ash It should not say "command not found". Please show your new script and the complete error message. Exactly what did you type to run it? Did you maybe try to use (ba)sh to run it? Try to run it explicitly with csh:csh myscript
Also be aware that your problem has nothing to do with the memory of your computer or the configuration of your kernel. It is simply a bug in your script.
– michas
Jul 9 '16 at 9:28
Dear Michas, in tcsh i run it with source ex02.csh, the error is same "argument too long". In bash i ran it "./ex02.sch", the error is same "argument too long". I changed ("ls *") to ("./ *") and got an error like this, "./:Permission denied"). Also i tried your way, csh ex02.csh, error is same "argument too long"
– Ash
Jul 9 '16 at 10:48
Now it says, command not found. Well I think I need to recompile my kernel. Already I have change values directly in to this file, binfmts.h. Do you know how it successfully recompile it? I am using a work station with 32G of RAM.
– Ash
Jul 9 '16 at 7:15
Now it says, command not found. Well I think I need to recompile my kernel. Already I have change values directly in to this file, binfmts.h. Do you know how it successfully recompile it? I am using a work station with 32G of RAM.
– Ash
Jul 9 '16 at 7:15
2
2
Well, go ahead and do what you think you need to do, but I doubt that recompiling your kernel will have any effect on whether this script works.
– G-Man
Jul 9 '16 at 7:28
Well, go ahead and do what you think you need to do, but I doubt that recompiling your kernel will have any effect on whether this script works.
– G-Man
Jul 9 '16 at 7:28
1
1
@Ash It should not say "command not found". Please show your new script and the complete error message. Exactly what did you type to run it? Did you maybe try to use (ba)sh to run it? Try to run it explicitly with csh:
csh myscript
Also be aware that your problem has nothing to do with the memory of your computer or the configuration of your kernel. It is simply a bug in your script.– michas
Jul 9 '16 at 9:28
@Ash It should not say "command not found". Please show your new script and the complete error message. Exactly what did you type to run it? Did you maybe try to use (ba)sh to run it? Try to run it explicitly with csh:
csh myscript
Also be aware that your problem has nothing to do with the memory of your computer or the configuration of your kernel. It is simply a bug in your script.– michas
Jul 9 '16 at 9:28
Dear Michas, in tcsh i run it with source ex02.csh, the error is same "argument too long". In bash i ran it "./ex02.sch", the error is same "argument too long". I changed ("ls *") to ("./ *") and got an error like this, "./:Permission denied"). Also i tried your way, csh ex02.csh, error is same "argument too long"
– Ash
Jul 9 '16 at 10:48
Dear Michas, in tcsh i run it with source ex02.csh, the error is same "argument too long". In bash i ran it "./ex02.sch", the error is same "argument too long". I changed ("ls *") to ("./ *") and got an error like this, "./:Permission denied"). Also i tried your way, csh ex02.csh, error is same "argument too long"
– Ash
Jul 9 '16 at 10:48
add a comment |
The source of the problem is that you have far too many small files.
If I'm reading it right, you have over 14 million files. There is no way that ANY shell is going to be able to have over 14 million file names on the command line. Aside from that. your filenames seem to be about 18 characters long, so that's roughly 18*14M or about 252 megabytes just to hold the filenames.
bash
for example, has a limit of 128KB. ever so slightly smaller than 252MB. I have no idea what limit csh
has because i don't use it. It's unlikely to be any larger than bash's command-line length limit. It certainly won't be 252MB or greater.
However, all is not lost, you can use find ... -exec
instead.
find . -maxdepth 1 -type f -name '*.mol2'
-exec pythonsh ../../prepare_ligand4.py -l {} -d ../etc/ligand_dict.py ;
This will run prepare_ligand4.py
ONCE for each file, so will take a very long time. You might be able to speed it up a little (not much, not with 14+M files to process) by using find ... -print0
with xargs -0 -P ...
or GNU parallel -0 ...
instead of find ... -exec
A much better solution would be to download source code for prepare_ligand4.py
and modify it so that you can give it one large file (e.g. the original file before csplit
-ing it) and it will process each block individually. It'll be much faster and easier to work with. You'll probably still have over 14M output files (assuming that a combined output file would be useless...if it isn't you're in luck!), but that's better than having 14M input files AND 14M output files.
This would, of course, require some skill with python
programming.
Maybe someone has already encountered the same problem and written their own enhanced version of prepare_ligand4.py
. It's worth spending some time searching, or maybe try the Autodock Forum or even contact the Autodock author.
2
The 128kB limit is not a bash limit, it's a kernel limit. Bash can handle command lines that are as long as memory allows, but they have to be calls to builtins or functions. Contrastbash -c 'echo {1..999999} >/dev/null'
withbash -c '/bin/echo {1..999999} >/dev/null'
– Gilles
Jul 9 '16 at 22:34
add a comment |
The source of the problem is that you have far too many small files.
If I'm reading it right, you have over 14 million files. There is no way that ANY shell is going to be able to have over 14 million file names on the command line. Aside from that. your filenames seem to be about 18 characters long, so that's roughly 18*14M or about 252 megabytes just to hold the filenames.
bash
for example, has a limit of 128KB. ever so slightly smaller than 252MB. I have no idea what limit csh
has because i don't use it. It's unlikely to be any larger than bash's command-line length limit. It certainly won't be 252MB or greater.
However, all is not lost, you can use find ... -exec
instead.
find . -maxdepth 1 -type f -name '*.mol2'
-exec pythonsh ../../prepare_ligand4.py -l {} -d ../etc/ligand_dict.py ;
This will run prepare_ligand4.py
ONCE for each file, so will take a very long time. You might be able to speed it up a little (not much, not with 14+M files to process) by using find ... -print0
with xargs -0 -P ...
or GNU parallel -0 ...
instead of find ... -exec
A much better solution would be to download source code for prepare_ligand4.py
and modify it so that you can give it one large file (e.g. the original file before csplit
-ing it) and it will process each block individually. It'll be much faster and easier to work with. You'll probably still have over 14M output files (assuming that a combined output file would be useless...if it isn't you're in luck!), but that's better than having 14M input files AND 14M output files.
This would, of course, require some skill with python
programming.
Maybe someone has already encountered the same problem and written their own enhanced version of prepare_ligand4.py
. It's worth spending some time searching, or maybe try the Autodock Forum or even contact the Autodock author.
2
The 128kB limit is not a bash limit, it's a kernel limit. Bash can handle command lines that are as long as memory allows, but they have to be calls to builtins or functions. Contrastbash -c 'echo {1..999999} >/dev/null'
withbash -c '/bin/echo {1..999999} >/dev/null'
– Gilles
Jul 9 '16 at 22:34
add a comment |
The source of the problem is that you have far too many small files.
If I'm reading it right, you have over 14 million files. There is no way that ANY shell is going to be able to have over 14 million file names on the command line. Aside from that. your filenames seem to be about 18 characters long, so that's roughly 18*14M or about 252 megabytes just to hold the filenames.
bash
for example, has a limit of 128KB. ever so slightly smaller than 252MB. I have no idea what limit csh
has because i don't use it. It's unlikely to be any larger than bash's command-line length limit. It certainly won't be 252MB or greater.
However, all is not lost, you can use find ... -exec
instead.
find . -maxdepth 1 -type f -name '*.mol2'
-exec pythonsh ../../prepare_ligand4.py -l {} -d ../etc/ligand_dict.py ;
This will run prepare_ligand4.py
ONCE for each file, so will take a very long time. You might be able to speed it up a little (not much, not with 14+M files to process) by using find ... -print0
with xargs -0 -P ...
or GNU parallel -0 ...
instead of find ... -exec
A much better solution would be to download source code for prepare_ligand4.py
and modify it so that you can give it one large file (e.g. the original file before csplit
-ing it) and it will process each block individually. It'll be much faster and easier to work with. You'll probably still have over 14M output files (assuming that a combined output file would be useless...if it isn't you're in luck!), but that's better than having 14M input files AND 14M output files.
This would, of course, require some skill with python
programming.
Maybe someone has already encountered the same problem and written their own enhanced version of prepare_ligand4.py
. It's worth spending some time searching, or maybe try the Autodock Forum or even contact the Autodock author.
The source of the problem is that you have far too many small files.
If I'm reading it right, you have over 14 million files. There is no way that ANY shell is going to be able to have over 14 million file names on the command line. Aside from that. your filenames seem to be about 18 characters long, so that's roughly 18*14M or about 252 megabytes just to hold the filenames.
bash
for example, has a limit of 128KB. ever so slightly smaller than 252MB. I have no idea what limit csh
has because i don't use it. It's unlikely to be any larger than bash's command-line length limit. It certainly won't be 252MB or greater.
However, all is not lost, you can use find ... -exec
instead.
find . -maxdepth 1 -type f -name '*.mol2'
-exec pythonsh ../../prepare_ligand4.py -l {} -d ../etc/ligand_dict.py ;
This will run prepare_ligand4.py
ONCE for each file, so will take a very long time. You might be able to speed it up a little (not much, not with 14+M files to process) by using find ... -print0
with xargs -0 -P ...
or GNU parallel -0 ...
instead of find ... -exec
A much better solution would be to download source code for prepare_ligand4.py
and modify it so that you can give it one large file (e.g. the original file before csplit
-ing it) and it will process each block individually. It'll be much faster and easier to work with. You'll probably still have over 14M output files (assuming that a combined output file would be useless...if it isn't you're in luck!), but that's better than having 14M input files AND 14M output files.
This would, of course, require some skill with python
programming.
Maybe someone has already encountered the same problem and written their own enhanced version of prepare_ligand4.py
. It's worth spending some time searching, or maybe try the Autodock Forum or even contact the Autodock author.
answered Jul 9 '16 at 15:41
cascas
39.1k454101
39.1k454101
The 128kB limit is not a bash limit, it's a kernel limit. Bash can handle command lines that are as long as memory allows, but they have to be calls to builtins or functions. Contrast bash -c 'echo {1..999999} >/dev/null' with bash -c '/bin/echo {1..999999} >/dev/null'
– Gilles
Jul 9 '16 at 22:34
You obviously have a lot of files. Consider using GNU Parallel http://www.gnu.org/software/parallel/ . 'ls -U' does not sort the files, which makes it faster. Note that ls -U ZINC* would hit the same argument-list limit (the shell still expands the glob onto ls's command line), so pipe the bare listing through grep instead:
cd $VSTROOT/VirtualScreening/Ligands
ls -U | grep '^ZINC' | parallel 'echo {}; pythonsh ../../prepare_ligand4.py -l {} -d ../etc/ligand_dict.py'
I don't understand why you echo it. Do you pass it on to another script? My guess is that 'prepare_ligand4.py' is the script doing the conversion, in which case this should do the job (in parallel):
cd $VSTROOT/VirtualScreening/Ligands
ls -U | grep '^ZINC' | parallel pythonsh ../../prepare_ligand4.py -l {} -d ../etc/ligand_dict.py
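If GNU Parallel is not installed, the same pipe feeds xargs -P almost as well. A self-contained sketch in a scratch directory, with echo standing in for the pythonsh call:

```shell
rm -rf /tmp/pipedemo && mkdir -p /tmp/pipedemo && cd /tmp/pipedemo
touch ZINC10.mol2 ZINC11.mol2 ZINC12.mol2 README.txt

# The bare 'ls -U' never puts the filenames on any command line, so the
# argv limit is never hit; grep keeps only the ligand files. In the real
# run, replace the echo stand-in with:
#   pythonsh ../../prepare_ligand4.py -l {} -d ../etc/ligand_dict.py
ls -U | grep '^ZINC' |
  xargs -P 4 -I {} echo "convert {}" > pipe.log
```

(Parsing ls output line-by-line is safe here only because ZINC filenames contain no spaces or newlines; for arbitrary names, prefer the find -print0 | xargs -0 approach from the other answer.)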
Yes man, I have 24 million files, but the good thing is that all of them are zipped into different archives, each containing about 136,000 files. Sorry, I am not good at computer stuff; my friend helped me and changed the csh to bash for me. So if you mean I can run it very fast, I will be happy to see and try your script :P. Most of the things you explained above are out of my box. Cheers @hschou
– Ash
Jul 9 '16 at 11:43
Your script will only process one file at a time, so it will take a long time. Install GNU Parallel to use all your cores in parallel.
– hschou
Jul 9 '16 at 14:46
answered Jul 9 '16 at 11:16, edited Jul 9 '16 at 14:43
hschou
I have solved the problem; let me share it with you. I renamed the script from bash.csh to bash.sh, then changed it so that it runs in bash. Here is my new script, to help anyone who hits the same issue in future:
#!/bin/bash
cd $VSTROOT/VirtualScreening/Ligands/
for f in ZINC*.mol2
do
    echo "$f"
    pythonsh ../../prepare_ligand4.py -l "$f" -d ../etc/ligand_dict.py
done
For a beginner like me: here ZINC is a part of the name present in all of the ligand files, so adjust it according to your ligand names. Thanks for your time, and to my friend here who helped me passionately.
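The reason this loop dodges the error: the ZINC*.mol2 glob is expanded by bash itself (which has no fixed limit beyond available memory), and each external command only ever receives a single filename, so the kernel argv limit behind "Argument list too long" never comes into play. A self-contained demo of the same pattern, with echo standing in for pythonsh:

```shell
rm -rf /tmp/loopdemo && mkdir -p /tmp/loopdemo && cd /tmp/loopdemo
touch ZINC001.mol2 ZINC002.mol2 notaligand.pdbqt

# Each iteration handles exactly one filename; in the real script that
# one name is what gets passed to pythonsh.
for f in ZINC001.mol2 ZINC*.mol2
do
    case "$f" in ZINC001.mol2) [ -e seen ] && continue; touch seen;; esac
    echo "converting $f"
done > loop.log
```

A simpler form of the loop is exactly the accepted script above; the demo only adds scratch-directory setup so it can run anywhere.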
answered Jul 9 '16 at 11:09, edited Jul 9 '16 at 22:37 by Gilles
Ash
What is the command you typed? It probably contained something like ls *, which will expand to all the files in your directory. If you have a lot of those files, the resulting command line is too long to process.
– michas
Jul 9 '16 at 3:16
I used the following script, and I forgot to mention that I am working on a workstation with 32G of RAM, 6 cores and 12 processors, and a cache size of 12288.
– Ash
Jul 9 '16 at 3:27
#!/bin/csh # # $Id: ex02.csh,v 1.5 2007/07/19 21:52:59 rhuey Exp $ # # use the 'prepare_ligands.py' python script to create pdbq files cd $VSTROOT/VirtualScreening/Ligands foreach f (`ls *`) echo $f pythonsh ../../prepare_ligand4.py -l $f -d ../etc/ligand_dict.py end
– Ash
Jul 9 '16 at 3:27
Please edit your question so that it has the script in it.
– Stephen Harris
Jul 9 '16 at 3:54
Try changing the loop to foreach f (./*) instead of foreach f (`ls *`).
– clk
Jul 9 '16 at 4:36