egrep regular expression for times over five minutes
I have the following time formats in a text file:
`1` equals one second.
`5|01` equals five minutes and one second.
`13|01` equals thirteen minutes and one second.
`21|12|01` equals 21 hours, 12 minutes, and 1 second.
I need to egrep for any times over five minutes.
I'm using the following regex, but it doesn't work because it excludes times such as 13|00.
'[[:space:]0-9][[:space:]0-9][[:space:]|][[:space:]0-9][[:space:]6-9][|][0-9][0-9]'
Here's an example:
lite on 1
lite on 01
lite on 5|22
lite on 23|14
lite on 1|14|23
grep regular-expression
edited Nov 21 at 21:37
Rui F Ribeiro
asked Oct 1 '15 at 15:57
user72055
grep alone won't solve your problem because you're not matching characters, but a condition. See awk: linux.die.net/man/1/awk
– Centimane Oct 1 '15 at 16:03
Can you give an example of some of the actual output of a file containing these times? I have a couple of ideas, but with regex, formatting is always the key.
– Gravy Oct 1 '15 at 16:36
Does it have to be grep? Because while regex is amazing for matching text patterns, it's really quite poor at understanding them as values. I'd strongly suggest looking towards perl or awk instead.
– Sobrique Oct 1 '15 at 20:54
@user72055, if you have one hour and one minute, is the one minute padded with a space or a zero?
– glenn jackman Oct 2 '15 at 16:14
3 Answers
Accepted answer (score 5)
Ignoring the spaces (which you can fill in yourself later) and possible leading zeros (likewise), you're looking to match any of
[5-9]\|[0-9]+
[1-9][0-9]\|[0-9]+
[0-9]+\|[0-9]+\|[0-9]+
for times in the range
[5,10) minutes
[10,100) minutes
1+ hours
respectively.
So join those together in a match group (...|...) with sufficient anchoring at the beginning and end (so you don't match on 14|59 or 1|00|00).
This gives
grep -E 'on +([5-9]\|[0-9]+|[1-9][0-9]\|[0-9]+|[0-9]+\|[0-9]+\|[0-9]+) *$'
We can simplify a little, because the seconds are common to all three regexps:
grep -E 'on +([5-9]|[1-9][0-9]|[0-9]+\|[0-9]+)\|[0-9]+ *$'
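As a quick check of the simplified pattern, here is a minimal sketch that runs it against the sample lines from the question. It assumes the field separators are written as `\|` (in an extended regular expression a bare `|` means alternation, so a literal pipe has to be escaped). The function name `over_five_grep` is made up for illustration and is not part of the answer.

```shell
#!/bin/sh
# Sketch only: over_five_grep is a hypothetical helper name.
over_five_grep() {
    # \| matches a literal pipe character; a bare | separates alternatives.
    # Alternatives inside the group: 5-9 minutes, 10-99 minutes, hours|minutes.
    grep -E 'on +([5-9]|[1-9][0-9]|[0-9]+\|[0-9]+)\|[0-9]+ *$' "$@"
}

# Sample input copied from the question.
printf '%s\n' 'lite on 1' 'lite on 01' 'lite on 5|22' 'lite on 23|14' 'lite on 1|14|23' |
    over_five_grep
```

With the sample input this should print only the `5|22`, `23|14` and `1|14|23` lines. Note that this pattern also accepts `5|00`, and that it makes no attempt to handle leading zeros.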
edited Oct 1 '15 at 17:03
answered Oct 1 '15 at 16:57
Toby Speight
Note: I'm assuming that 5|00 counts as "over 5 minutes" in the above, as there's probably a truncated fractional second hiding there...
– Toby Speight Oct 1 '15 at 17:05
Correct me if I'm wrong, but wouldn't something like 0|0|3 pass your filter? I'd check that the leftmost [x-9]|... is [1-9] and not [0-9].
– Dani_l Oct 1 '15 at 17:07
OP's examples don't specify testing for validity, so you are probably correct. However, filtering for 0|01|13 should be simple enough by requiring [1-9] on each leftmost range in the group, no?
– Dani_l Oct 1 '15 at 17:17
@Dani: I did start by ignoring leading zeros - if they may be present, then it's simple to account for them, as you say. Keeping regexps simple always depends on how much you can assume about the format; here, it appears to be machine-generated rather than written by humans, so let's capitalise on that!
– Toby Speight Oct 1 '15 at 17:25
I know the OP explicitly requested a regex, but I'm really not sure that that's the tool to be using for this job. Regex really isn't particularly good at numeric comparisons.
– Sobrique Oct 2 '15 at 11:09
Answer (score 1)
This should work:
grep -E '( [0-9]{1,2}\|[0-9]{1,2}\|[0-9][1-9] )|( [0-9][0-9]\|[0-9]{2,2} )|( [5-9]\|[0-9][1-9] )|( [6-9]\|[0-9][0-9] )' <file>
Basically, you are creating four patterns, each enclosed in (), separated by |. The unescaped | acts as "or" in the regex.
The {1,2} parts mean one or two instances of the preceding pattern, so [0-9]{1,2} means one or two digits in the range 0-9.
After that, we create a basic test case for each of your possible digit combinations via the "or" syntax.
edited Oct 1 '15 at 17:14
answered Oct 1 '15 at 16:57
Gravy
@Dani_l No no no, nitpick away. It's one of those days where I apparently can't brain at all. Thanks for keeping an eye out.
– Gravy Oct 1 '15 at 17:16
Answer (score 1)
Don't use grep. Regular expressions are for matching patterns, but they're really horrible for matching values. You can probably do it, but you're using a hammer as a screwdriver. Technically it works, but it's messy and inefficient.
So instead:
#!/usr/bin/env perl
use strict;
use warnings;
while (<DATA>) {
    my @numbers = m/(\d+)/g;
    my $seconds = pop(@numbers);
    $seconds += ( pop(@numbers) // 0 ) * 60;         # second field: minutes -> seconds
    $seconds += ( pop(@numbers) // 0 ) * 60 * 60;    # third field: hours -> seconds
    print if $seconds > 300;
}
__DATA__
lite on 1
lite on 01
lite on 5|22
lite on 23|14
lite on 1|14|23
This prints:
lite on 5|22
lite on 23|14
lite on 1|14|23
You can one-liner-ify this as:
perl -ne 'my $t = 0; for ( m/(\d+)/g ) { $t *= 60; $t += $_ }; print if $t > 300;'
For bonus points, this copes with fairly arbitrary validation criteria without too much difficulty, and doesn't require anywhere near as much mucking around if you decide to grep for a different value one day.
The above works as follows:
- m/(\d+)/g, as a g (global) match, selects repeated instances of "one or more digits" into a list (@numbers in the script, or just a self-contained iterator in the for loop of the one-liner).
- It converts that chain of digits to seconds by multiplying up by 60. (This won't work so well if you ever add days to it!)
- It then tests whether that number is greater than 300, which is 5 minutes in seconds.
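The same convert-then-compare steps can also be sketched with awk, which several commenters suggested. This is only an illustration under stated assumptions: the time is assumed to be the last whitespace-separated field of each line, and `longer_than` is a hypothetical helper name whose first argument is the threshold in seconds.

```shell
#!/bin/sh
# Sketch only: longer_than is a hypothetical helper; $1 is the limit in seconds.
longer_than() {
    awk -v limit="$1" '{
        # Assumption: the time is the last whitespace-separated field.
        n = split($NF, part, "|")
        total = 0                          # reset for every input line
        for (i = 1; i <= n; i++)
            total = total * 60 + part[i]   # multiply up by 60 per field
        if (total > limit) print
    }'
}

# Sample input copied from the question; 300 seconds is five minutes.
printf '%s\n' 'lite on 1' 'lite on 01' 'lite on 5|22' 'lite on 23|14' 'lite on 1|14|23' |
    longer_than 300
```

Because the comparison is strictly greater than the limit, a time of exactly `5|00` is not printed here. Changing the threshold only means passing a different number, for example `longer_than 600` for ten minutes.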
edited Oct 1 '15 at 20:56
answered Oct 1 '15 at 20:47
Sobrique