Splitting a file using perl
I have a CSV file and would like to split it into smaller files, based on matching column values, using Perl. I am working on Linux (RHEL 6).
example:
fruit1, fruit2,pricerate,quantity
orange, apple, 3,9
apple,lemon,8,1
orange, apple,3,8
pineapple,papaya,9,19
orange,apple,3,7
pineapple,papaya,9,10
Output is something like:
file1:
fruit1,fruit2,pricerate,quantity
orange,apple, 3,9
orange,apple,3,8
orange,apple,3,7
file2:
fruit1,fruit2,pricerate,quantity
pineapple,papaya,9,19
pineapple,papaya,9,10
The unmatched rows go into a separate file, say file3.
perl split csv-simple
Just a question, Why do you think this question is related to Linux/Unix?
– VaTo
Jun 8 '15 at 16:32
Also, where would apple,lemon go?
– choroba
Jun 8 '15 at 16:49
Apologies, Saul. I should have mentioned earlier that I am trying this on Linux (RHEL 6).
– namai
Jun 8 '15 at 17:03
Hi choroba, all the unmatched ones go into a separate file.
– namai
Jun 8 '15 at 17:04
@namai No problem. Please feel free to edit your question and include that information, along with anything else you think is relevant, so that it is easier for people to help you (the more details about your problem, the better). Without those details, people who want to help may get frustrated and give up. Take this as a suggestion on my part.
– VaTo
Jun 8 '15 at 17:09
edited Nov 25 at 14:59 by Rui F Ribeiro
asked Jun 8 '15 at 16:26 by namai
1 Answer
One way to solve this is:
- Open the input file.
- Store the first line of the input file (the header).
- For every line in the input file after the header:
  - Read the first two columns.
  - If no output file has been opened yet for that combination of values, open a new output file and store its file handle in a hash, keyed on those values. Write the header line to the new output file too.
  - Fetch the handle of the output file for this line from the file handle hash, and write the line to that file.
Here is some example code, which matches on the first two fields:
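#!/usr/bin/perl
use strict;
use warnings;

my %filehandles;    # output file handle for each "fruit1,fruit2" key
my $filenum = 1;    # number used in the next output file name

open my $in, '<', 'fruit.csv'
    or die "Cannot open input file fruit.csv: $!";

# The first line is the header; strip spaces so it matches the data lines
my $header = <$in>;
defined $header or die "Input file fruit.csv is empty.";
$header =~ s/ //g;

while ( my $line = <$in> ) {
    # Remove spaces from input
    $line =~ s/ //g;
    next unless $line =~ /\S/;    # skip blank lines

    # The first two columns together select the output file
    my @fields = split /,/, $line;
    my $key = "$fields[0],$fields[1]";

    if ( !$filehandles{$key} ) {
        # First time this key is seen: open a new file and write the header
        open $filehandles{$key}, '>', "file$filenum"
            or die "Cannot open output file file$filenum: $!";
        print { $filehandles{$key} } $header;
        $filenum++;
    }
    print { $filehandles{$key} } $line;
}

close $in;
close $_ or die "Cannot close output file: $!" for values %filehandles;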
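Note that the code above gives every distinct combination its own file, so apple,lemon would end up in a file by itself rather than in a shared file for unmatched rows. If that shared file is wanted, a two-pass variant might look like the sketch below. It rests on assumptions that are not in the question: "unmatched" is taken to mean that the combination of the first two columns occurs only once, the input is small enough to hold in memory, and the sub name split_csv_with_unmatched and its two parameters are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name and parameters are illustrative assumptions).
# Splits $input into "${prefix}1", "${prefix}2", ...: one file per key
# that occurs at least twice, plus one final file for all other rows.
# Returns the output file names in the order they were created.
sub split_csv_with_unmatched {
    my ( $input, $prefix ) = @_;

    open my $in, '<', $input or die "Cannot open $input: $!";
    my $header = <$in>;
    defined $header or die "$input is empty";
    chomp $header;
    $header =~ s/ //g;

    # Pass 1: read all rows and count how often each key occurs.
    # Assumption: the whole file fits in memory.
    my ( @rows, %count );
    while ( my $line = <$in> ) {
        chomp $line;
        $line =~ s/ //g;                 # remove spaces, as in the answer
        next unless length $line;        # skip blank lines
        my $key = join ',', map { $_ // '' } ( split /,/, $line )[ 0, 1 ];
        $count{$key}++;
        push @rows, [ $key, $line ];
    }
    close $in;

    # Opens the next numbered output file and writes the header to it.
    my @created;
    my $open_output = sub {
        my $name = $prefix . ( @created + 1 );
        open my $out, '>', $name or die "Cannot open $name: $!";
        print {$out} "$header\n";
        push @created, $name;
        return $out;
    };

    # Pass 2: rows with a repeated key go to that key's file,
    # everything else is collected for the shared unmatched file.
    my ( %fh, @unmatched );
    for my $row (@rows) {
        my ( $key, $line ) = @$row;
        if ( $count{$key} < 2 ) {
            push @unmatched, $line;
            next;
        }
        $fh{$key} //= $open_output->();
        print { $fh{$key} } "$line\n";
    }
    close $_ or die "Cannot close output file: $!" for values %fh;

    # The unmatched file is created last, and only if it is needed.
    if (@unmatched) {
        my $out = $open_output->();
        print {$out} map { "$_\n" } @unmatched;
        close $out or die "Cannot close output file: $!";
    }
    return @created;
}

# Runs only when two arguments are given: input file and output prefix.
if ( @ARGV == 2 ) {
    print "$_\n" for split_csv_with_unmatched(@ARGV);
}
```

Saved under a name of your choosing (split_unmatched.pl here is only a placeholder), it could be run as perl split_unmatched.pl fruit.csv file. With the sample data from the question this should write the orange,apple rows to file1, the pineapple,papaya rows to file2 and the apple,lemon row to file3. The defined-or operators (// and //=) need Perl 5.10 or later.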
answered Jun 30 '15 at 13:06 by Sietse