How to extract logs between two time stamps
I want to extract all log lines between two timestamps. Some lines may not have a timestamp, but I want those lines as well. In short, I want every line that falls between the two timestamps. My log structure looks like this:
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
Suppose I want to extract everything between 2014-04-07 23:00 and 2014-04-08 02:00.
Please note that the start or end timestamp may not itself appear in the log, but I still want every line that falls between these two timestamps.
Tags: text-processing, sed, awk, grep
Possible duplicate of stackoverflow.com/questions/7575267/…
– Ramesh
Apr 9 '14 at 20:16
Do you just need to do this just once or programmatically at various times?
– Bratchley
Apr 9 '14 at 20:23
Reason I ask is because you can do two contextual greps (one to grab everything after the starting delimiter and another to stop printing at the ending delimiter) if you know the literal values. If the dates/times can change, you can easily generate these on the fly by feeding user input through the date -d command and using that to construct the search pattern.
– Bratchley
Apr 9 '14 at 20:28
@Ramesh, the referenced question is too broad.
– maxschlepzig
Apr 9 '14 at 20:33
@JoelDavis: I want to do it programmatically, so every time I just need to enter the desired timestamps to extract the logs between those timestamps into my /tmp location.
– Amit
Apr 9 '14 at 21:36
4 Answers
Answer by maxschlepzig (score 18):
You can use awk for this:
$ awk -F'[][]' '$0 ~ /^\[/ && $2 >= "2014-04-07 23:00" { p=1 }
     $0 ~ /^\[/ && $2 >= "2014-04-08 02:00" { p=0 }
     p { print $0 }' log
Where:

- -F specifies the characters [ and ] as field separators, using a regular expression
- $0 references the complete line
- $2 references the date field
- p is used as a boolean variable that guards the actual printing
- $0 ~ /regex/ is true if the regex matches $0
- >= is used to compare strings lexicographically (equivalent to e.g. strcmp())
Variations
The above command line implements right-open time interval matching. To get closed interval semantics just increment your right date, e.g.:
$ awk -F'[][]' '$0 ~ /^\[/ && $2 >= "2014-04-07 23:00" { p=1 }
     $0 ~ /^\[/ && $2 >= "2014-04-08 02:00:01" { p=0 }
     p { print $0 }' log
If you want to match timestamps in another format, you have to modify the $0 ~ /^\[/ sub-expression. Note that it is used to exclude lines without any timestamp from the print on/off logic.
For example, for a timestamp format like YYYY-MM-DD HH24:MI:SS (without brackets) you could modify the command like this:
$ awk '$0 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ {
         if ($1" "$2 >= "2014-04-07 23:00")    p=1;
         if ($1" "$2 >= "2014-04-08 02:00:01") p=0;
       }
       p { print $0 }' log
(Note that the field separator also changes here, back to the default: fields separated by runs of whitespace.)
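Since the asker mentions in the comments wanting to run this repeatedly with different stamps, a minimal wrapper around the command above might look like the following sketch (the script name, log file name and output path are placeholders, not part of the answer):

#!/bin/sh
# Sketch: pass the start/end stamps as arguments and reuse the awk logic above.
# Hypothetical usage: ./logrange.sh '2014-04-07 23:00' '2014-04-08 02:00' app.log > /tmp/extract.log
start=$1
end=$2
file=$3

awk -F'[][]' -v start="$start" -v end="$end" '
  $0 ~ /^\[/ && $2 >= start { p = 1 }   # turn printing on at the first stamp >= start
  $0 ~ /^\[/ && $2 >= end   { p = 0 }   # turn it off again at the first stamp >= end
  p
' "$file"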
Thanks for sharing the script, but it's not checking the end timestamp. Can you please check? Also, let me know what to do if I have logs like 2014-04-07 23:59:58, i.e. without brackets.
– Amit
Apr 9 '14 at 21:47
@Amit, updated the answer
– maxschlepzig
Apr 10 '14 at 7:15
Although I don't think this is a string problem (see my answer), you could make yours much more readable, and probably a bit faster, by not repeating all of the tests: $1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}/ && $2 ~ /[0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ { Time = $1" "$2; if (Time >= "2014-04-07 23:00" ) { p=1 } if (Time >= "2014-04-08 02:00:01" ) { p=0 } } p
– user61786
Apr 10 '14 at 8:34
Hi Max, one more small doubt: if I have something like Apr-07-2014 10:51:17, what changes do I need to make? I tried $0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 11:00" { p=1 } $0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 12:00:01" { p=0 } but it's not working.
– Amit
Apr 10 '14 at 13:30
@awk_FTW, changed the code such that the regex is explicitly shared.
– maxschlepzig
Apr 10 '14 at 20:04
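For the Apr-07-2014 10:51:17 format raised in the comments above, plain string comparison no longer follows chronological order (month names do not sort like numbers), so one option is to convert each stamp to epoch seconds first. A rough sketch, assuming GNU awk (for mktime) and GNU date; the log file name is a placeholder:

start=$(date -d '2014-04-07 11:00' +%s)
end=$(date -d '2014-04-07 12:00' +%s)

gawk -v start="$start" -v end="$end" '
  BEGIN {
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
    for (i = 1; i <= 12; i++) mon[m[i]] = i          # month name -> number
  }
  # lines that begin with e.g. "Apr-07-2014 10:51:17" update the print flag
  $1 ~ /^[A-Za-z]{3}-[0-9]{2}-[0-9]{4}$/ && $2 ~ /^[0-9]{2}:[0-9]{2}:[0-9]{2}$/ {
    split($1, d, "-"); split($2, t, ":")
    ts = mktime(d[3] " " mon[d[1]] " " d[2] " " t[1] " " t[2] " " t[3])
    if (ts >= start) p = 1
    if (ts >  end)   p = 0                           # a line stamped exactly "end" is still printed
  }
  p
' app.log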
Answer by cpugeniusmv (score 10):
Check out dategrep at https://github.com/mdom/dategrep
Description:
dategrep searches the named input files for lines matching a date range and prints them to stdout.
If dategrep works on a seekable file, it can do a binary search to find the first and last line to print pretty efficiently. dategrep can also read from stdin if one of the filename arguments is just a hyphen, but in this case it has to parse every single line, which will be slower.
Usage examples:
dategrep --start "12:00" --end "12:15" --format "%b %d %H:%M:%S" syslog
dategrep --end "12:15" --format "%b %d %H:%M:%S" syslog
dategrep --last-minutes 5 --format "%b %d %H:%M:%S" syslog
dategrep --last-minutes 5 --format rsyslog syslog
cat syslog | dategrep --end "12:15" -
Although this limitation may make this unsuitable for your exact question:
At the moment dategrep will die as soon as it finds a line that is not parsable. In a future version this will be configurable.
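Applied to the timestamps in the question, an invocation along the lines of the usage examples above might look like this (hedged: the exact --format string, and whether the surrounding brackets need to be part of it, depends on how dategrep locates the date in a line; the file name is a placeholder):

dategrep --start "2014-04-07 23:00" --end "2014-04-08 02:00" --format "%Y-%m-%d %H:%M:%S" app.log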
I learned about this command only a couple days ago courtesy of onethingwell.org/post/81991115668/dategrep so, kudos to him!
– cpugeniusmv
Apr 9 '14 at 20:53
Answer by Bratchley (score 3):
One alternative to awk or a non-standard tool is to use GNU grep for its contextual greps. GNU grep lets you specify the number of lines to print after a positive match with -A and the number of preceding lines to print with -B. For example:
[davisja5@xxxxxxlp01 ~]$ cat test.txt
Ignore this line, please.
This one too while you're at it...
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
we don't
want these lines.
[davisja5@xxxxxxlp01 ~]$ egrep "^\[2014-04-07 23:59:58\]" test.txt -A 10000 | egrep "^\[2014-04-08 00:00:03\]" -B 10000
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
The above essentially tells grep to print the 10,000 lines that follow the line matching the starting pattern, so the output starts where you want it to and runs (hopefully) until the end, while the second egrep in the pipeline prints only the line with the ending delimiter and the 10,000 lines before it. The net result is output that starts where you want it and does not go past where you told it to stop.
10,000 is just a number I came up with; feel free to change it to a million if you think your output is going to be too long.
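Tying this to the date -d idea from the question's comments, the two literal delimiters can be generated from user input and matched as fixed strings. A sketch, which still relies on lines carrying exactly these stamps being present in the file (see the comments below); the input file name is taken from the example above:

start=$(date -d '2014-04-07 23:59:58' '+[%F %T]')   # -> "[2014-04-07 23:59:58]"
end=$(date -d '2014-04-08 00:00:03' '+[%F %T]')     # -> "[2014-04-08 00:00:03]"
grep -F -A 10000 "$start" test.txt | grep -F -B 10000 "$end"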
How is this going to work if there is no log entry for the start and end ranges? If OP wants everything between 14:00 and 15:00, but there's no log entry for 14:00, then?
– user61786
Apr 10 '14 at 5:31
It will work about as well as the sed, which is also searching for literal matches. dategrep is probably the most correct answer of all the ones given (since you need to be able to get "fuzzy" on what timestamps you'll accept) but, like the answer says, I was just mentioning it as an alternative. That said, if the log is active enough to generate enough output to warrant cutting, it's probably also going to have some kind of entry for the given time period.
– Bratchley
Apr 10 '14 at 13:32
Answer by UnX (score 0):
Using sed:
#!/bin/bash
E_BADARGS=23
if [ $# -ne "3" ]
then
echo "Usage: `basename $0` "<start_date>" "<end_date>" file"
echo "NOTE:Make sure to put dates in between double quotes"
exit $E_BADARGS
fi
isDatePresent(){
#check if given date exists in file.
local date=$1
local file=$2
grep -q "$date" "$file"
return $?
}
convertToEpoch(){
#converts to epoch time
local _date=$1
local epoch_date=`date --date="$_date" +%s`
echo $epoch_date
}
convertFromEpoch(){
#converts to date/time format from epoch
local epoch_date=$1
local _date=`date --date="@$epoch_date" +"%F %T"`
echo $_date
}
getDates(){
# collects all dates at beginning of lines in a file, converts them to epoch and returns a sequence of numbers
local file="$1"
local state="$2"
local i=0
local date_array=( )
if [[ "$state" -eq "S" ]];then
datelist=`cat "$file" | sed -r -e "s/^\[([^]]+)\].*/\1/" | egrep "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}"`
elif [[ "$state" -eq "E" ]];then
datelist=`tac "$file" | sed -r -e "s/^\[([^]]+)\].*/\1/" | egrep "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}"`
else
echo "Something went wrong while getting dates..." 1>&2
exit 500
fi
while read _date
do
epoch_date=`convertToEpoch "$_date"`
date_array[$i]=$epoch_date
#echo "$_date" "$epoch_date" 1>&2
(( i++ ))
done<<<"$datelist"
echo ${date_array[@]}
}
findneighbours(){
# search next best date if date is not in the file using recursivity
IFS="$old_IFS"
local elt=$1
shift
local state="$1"
shift
local -a array=( "$@" )
index_pivot=`expr ${#array[@]} / 2`
echo "#array="${#array[@]} ";array="${array[@]} ";index_pivot="$index_pivot 1>&2
if [ "$index_pivot" -eq 1 -a ${#array[@]} -eq 2 ];then
if [ "$state" == "E" ];then
echo ${array[0]}
elif [ "$state" == "S" ];then
echo ${array[(( ${#array[@]} - 1 ))]}
else
echo "State" $state "undefined" 1>&2
exit 100
fi
else
echo "elt with index_pivot="$index_pivot":"${array[$index_pivot]} 1>&2
if [ $elt -lt ${array[$index_pivot]} ];then
echo "elt is smaller than pivot" 1>&2
array=( ${array[@]:0:(($index_pivot + 1)) } )
else
echo "elt is bigger than pivot" 1>&2
array=( ${array[@]:$index_pivot:(( ${#array[@]} - 1 ))} )
fi
findneighbours "$elt" "$state" "${array[@]}"
fi
}
findFirstDate(){
local file="$1"
echo "Looking for first date in file" 1>&2
while read line
do
echo "$line" | egrep -q "^[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}]" &>/dev/null
if [ "$?" -eq "0" ]
then
#echo "line=" "$line" 1>&2
firstdate=`echo "$line" | sed -r -e "s/^\[([^]]+)\].*/\1/"`
echo "$firstdate"
break
else
echo $? 1>&2
fi
done< <( cat "$file" )
}
findLastDate(){
local file="$1"
echo "Looking for last date in file" 1>&2
while read line
do
echo "$line" | egrep -q "^[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}]" &>/dev/null
if [ "$?" -eq "0" ]
then
#echo "line=" "$line" 1>&2
lastdate=`echo "$line" | sed -r -e "s/^\[([^]]+)\].*/\1/"`
echo "$lastdate"
break
else
echo $? 1>&2
fi
done< <( tac "$file" )
}
findBestDate(){
IFS="$old_IFS"
local initdate="$1"
local file="$2"
local state="$3"
local first_elts="$4"
local last_elts="$5"
local date_array=( )
local initdate_epoch=`convertToEpoch "$initdate"`
if [[ $initdate_epoch -lt $first_elt ]];then
echo `convertFromEpoch "$first_elt"`
elif [[ $initdate_epoch -gt $last_elt ]];then
echo `convertFromEpoch "$last_elt"`
else
date_array=( `getDates "$file" "$state"` )
echo "date_array="${date_array[@]} 1>&2
#first_elt=${date_array[0]}
#last_elt=${date_array[(( ${#date_array[@]} - 1 ))]}
echo `convertFromEpoch $(findneighbours "$initdate_epoch" "$state" "${date_array[@]}")`
fi
}
main(){
init_date_start="$1"
init_date_end="$2"
filename="$3"
echo "problem start.." 1>&2
date_array=( "$init_date_start","$init_date_end" )
flag_array=( 0 0 )
i=0
#echo "$IFS" | cat -vte
old_IFS="$IFS"
#changing separator to avoid whitespace issue in date/time format
IFS=,
for _date in ${date_array[@]}
do
#IFS="$old_IFS"
#echo "$IFS" | cat -vte
if isDatePresent "$_date" "$filename";then
if [ "$i" -eq 0 ];then
echo "Starting date exists" 1>&2
#echo "date_start=""$_date" 1>&2
date_start="$_date"
else
echo "Ending date exists" 1>&2
#echo "date_end=""$_date" 1>&2
date_end="$_date"
fi
else
if [ "$i" -eq 0 ];then
echo "start date $_date not found" 1>&2
else
echo "end date $_date not found" 1>&2
fi
flag_array[$i]=1
fi
#IFS=,
(( i++ ))
done
IFS="$old_IFS"
if [ ${flag_array[0]} -eq 1 -o ${flag_array[1]} -eq 1 ];then
first_elt=`convertToEpoch "$(findFirstDate "$filename")"`
last_elt=`convertToEpoch "$(findLastDate "$filename")"`
border_dates_array=( "$first_elt","$last_elt" )
#echo "first_elt=" $first_elt "last_elt=" $last_elt 1>&2
i=0
IFS=,
for _date in ${date_array[@]}
do
if [ $i -eq 0 -a ${flag_array[$i]} -eq 1 ];then
date_start=`findBestDate "$_date" "$filename" "S" "${border_dates_array[@]}"`
elif [ $i -eq 1 -a ${flag_array[$i]} -eq 1 ];then
date_end=`findBestDate "$_date" "$filename" "E" "${border_dates_array[@]}"`
fi
(( i++ ))
done
fi
sed -r -n "/^\[${date_start}\]/,/^\[${date_end}\]/p" "$filename"
}
main "$1" "$2" "$3"
Copy this into a file. Debugging information is sent to stderr, so if you don't want to see it, just add 2>/dev/null.
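A hypothetical invocation (the script and log file names are placeholders; the dates are quoted as the usage message requires):

./extract_range.sh "2014-04-07 23:00:00" "2014-04-08 02:00:00" /tmp/app.log 2>/dev/null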
This won't display the log lines which don't have a timestamp.
– Amit
Apr 9 '14 at 21:57
@Amit, yes it will, have you tried?
– UnX
Apr 10 '14 at 0:36
@rMistero, it won't work because if there is no log entry at 22:30, the range won't be terminated. As OP mentioned, the start and stop times may not be in the logs. You can tweak your regex for it to work, but you'll lose resolution and never be guaranteed in advance that the range will terminate at the right time.
– user61786
Apr 10 '14 at 5:21
@awk_FTW this was an example, I didn't use the timestamps provided by Amit. Again, regex can be used. I agree though that it won't work if the timestamp doesn't exist when provided explicitly, or if no timestamp regex matches. I'll improve it soon.
– UnX
Apr 10 '14 at 14:59
"As OP mentioned, the start and stop times may not be in the logs." No, read the OP again. OP says those WILL be present but intervening lines won't necessarily start with a timestamp. It doesn't even make sense to say the stop times might not be present. How could you ever tell any tool where to stop if the termination marker isn't guaranteed to be there? There would be no criteria to give the tool to tell it where to stop processing.
– Bratchley
Apr 10 '14 at 17:47
|
show 5 more comments
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
18
down vote
You can use awk
for this:
$ awk -F']|['
'$0 ~ /^[/ && $2 >= "2014-04-07 23:00" { p=1 }
$0 ~ /^[/ && $2 >= "2014-04-08 02:00" { p=0 }
p { print $0 }' log
Where:
-F
specifies the characters[
and]
as field separators using a regular expression
$0
references a complete line
$2
references the date field
p
is used as boolean variable that guards the actual printing
$0 ~ /regex/
is true if regex matches$0
>=
is used for lexicographically comparing string (equivalent to e.g.strcmp()
)
Variations
The above command line implements right-open time interval matching. To get closed interval semantics just increment your right date, e.g.:
$ awk -F']|['
'$0 ~ /^[/ && $2 >= "2014-04-07 23:00" { p=1 }
$0 ~ /^[/ && $2 >= "2014-04-08 02:00:01" { p=0 }
p { print $0 }' log
In case you want to match timestamps in another format you have to modify the $0 ~ /^[/
sub-expression. Note that it used to ignore lines without any timestamps from print on/off logic.
For example for a timestamp format like YYYY-MM-DD HH24:MI:SS
(without braces) you could modify the command like this:
$ awk
'$0 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/
{
if ($1" "$2 >= "2014-04-07 23:00") p=1;
if ($1" "$2 >= "2014-04-08 02:00:01") p=0;
}
p { print $0 }' log
(note that also the field separator is changed - to blank/non-blank transition, the default)
Thanx for sharing the script but its not checking the end timestamp.. Can you please check. Also let me know what if i have the logs like 2014-04-07 23:59:58 . I mean without braces
– Amit
Apr 9 '14 at 21:47
@Amit, updated the answer
– maxschlepzig
Apr 10 '14 at 7:15
Although I don't think this is a string problem (see my answer), you could make yours much more readable, and probably a bit faster, by not repeating all of the tests:$1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}/ && $2 ~/[0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ { Time = $1" "$2; if (Time >= "2014-04-07 23:00" ) { p=1 } if (Time >= "2014-04-08 02:00:01" ) { p=0 } } p
– user61786
Apr 10 '14 at 8:34
Hi Max, One more small doubt.. If i have something like Apr-07-2014 10:51:17 . Then what changes i need to do.. I triedcode
$0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 11:00" { p=1 } $0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 12:00:01" { p=0 }code
but its not working
– Amit
Apr 10 '14 at 13:30
@awk_FTW, changed the code such that the regex is explicitly shared.
– maxschlepzig
Apr 10 '14 at 20:04
|
show 1 more comment
up vote
18
down vote
You can use awk
for this:
$ awk -F']|['
'$0 ~ /^[/ && $2 >= "2014-04-07 23:00" { p=1 }
$0 ~ /^[/ && $2 >= "2014-04-08 02:00" { p=0 }
p { print $0 }' log
Where:
-F
specifies the characters[
and]
as field separators using a regular expression
$0
references a complete line
$2
references the date field
p
is used as boolean variable that guards the actual printing
$0 ~ /regex/
is true if regex matches$0
>=
is used for lexicographically comparing string (equivalent to e.g.strcmp()
)
Variations
The above command line implements right-open time interval matching. To get closed interval semantics just increment your right date, e.g.:
$ awk -F']|['
'$0 ~ /^[/ && $2 >= "2014-04-07 23:00" { p=1 }
$0 ~ /^[/ && $2 >= "2014-04-08 02:00:01" { p=0 }
p { print $0 }' log
In case you want to match timestamps in another format you have to modify the $0 ~ /^[/
sub-expression. Note that it used to ignore lines without any timestamps from print on/off logic.
For example for a timestamp format like YYYY-MM-DD HH24:MI:SS
(without braces) you could modify the command like this:
$ awk
'$0 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/
{
if ($1" "$2 >= "2014-04-07 23:00") p=1;
if ($1" "$2 >= "2014-04-08 02:00:01") p=0;
}
p { print $0 }' log
(note that also the field separator is changed - to blank/non-blank transition, the default)
Thanx for sharing the script but its not checking the end timestamp.. Can you please check. Also let me know what if i have the logs like 2014-04-07 23:59:58 . I mean without braces
– Amit
Apr 9 '14 at 21:47
@Amit, updated the answer
– maxschlepzig
Apr 10 '14 at 7:15
Although I don't think this is a string problem (see my answer), you could make yours much more readable, and probably a bit faster, by not repeating all of the tests:$1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}/ && $2 ~/[0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ { Time = $1" "$2; if (Time >= "2014-04-07 23:00" ) { p=1 } if (Time >= "2014-04-08 02:00:01" ) { p=0 } } p
– user61786
Apr 10 '14 at 8:34
Hi Max, One more small doubt.. If i have something like Apr-07-2014 10:51:17 . Then what changes i need to do.. I triedcode
$0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 11:00" { p=1 } $0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 12:00:01" { p=0 }code
but its not working
– Amit
Apr 10 '14 at 13:30
@awk_FTW, changed the code such that the regex is explicitly shared.
– maxschlepzig
Apr 10 '14 at 20:04
|
show 1 more comment
up vote
18
down vote
up vote
18
down vote
You can use awk
for this:
$ awk -F']|['
'$0 ~ /^[/ && $2 >= "2014-04-07 23:00" { p=1 }
$0 ~ /^[/ && $2 >= "2014-04-08 02:00" { p=0 }
p { print $0 }' log
Where:
-F
specifies the characters[
and]
as field separators using a regular expression
$0
references a complete line
$2
references the date field
p
is used as boolean variable that guards the actual printing
$0 ~ /regex/
is true if regex matches$0
>=
is used for lexicographically comparing string (equivalent to e.g.strcmp()
)
Variations
The above command line implements right-open time interval matching. To get closed interval semantics just increment your right date, e.g.:
$ awk -F']|['
'$0 ~ /^[/ && $2 >= "2014-04-07 23:00" { p=1 }
$0 ~ /^[/ && $2 >= "2014-04-08 02:00:01" { p=0 }
p { print $0 }' log
In case you want to match timestamps in another format you have to modify the $0 ~ /^[/
sub-expression. Note that it used to ignore lines without any timestamps from print on/off logic.
For example for a timestamp format like YYYY-MM-DD HH24:MI:SS
(without braces) you could modify the command like this:
$ awk
'$0 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/
{
if ($1" "$2 >= "2014-04-07 23:00") p=1;
if ($1" "$2 >= "2014-04-08 02:00:01") p=0;
}
p { print $0 }' log
(note that also the field separator is changed - to blank/non-blank transition, the default)
You can use awk
for this:
$ awk -F']|['
'$0 ~ /^[/ && $2 >= "2014-04-07 23:00" { p=1 }
$0 ~ /^[/ && $2 >= "2014-04-08 02:00" { p=0 }
p { print $0 }' log
Where:
-F
specifies the characters[
and]
as field separators using a regular expression
$0
references a complete line
$2
references the date field
p
is used as boolean variable that guards the actual printing
$0 ~ /regex/
is true if regex matches$0
>=
is used for lexicographically comparing string (equivalent to e.g.strcmp()
)
Variations
The above command line implements right-open time interval matching. To get closed interval semantics just increment your right date, e.g.:
$ awk -F']|['
'$0 ~ /^[/ && $2 >= "2014-04-07 23:00" { p=1 }
$0 ~ /^[/ && $2 >= "2014-04-08 02:00:01" { p=0 }
p { print $0 }' log
In case you want to match timestamps in another format you have to modify the $0 ~ /^[/
sub-expression. Note that it used to ignore lines without any timestamps from print on/off logic.
For example for a timestamp format like YYYY-MM-DD HH24:MI:SS
(without braces) you could modify the command like this:
$ awk
'$0 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/
{
if ($1" "$2 >= "2014-04-07 23:00") p=1;
if ($1" "$2 >= "2014-04-08 02:00:01") p=0;
}
p { print $0 }' log
(note that also the field separator is changed - to blank/non-blank transition, the default)
edited Apr 10 '14 at 20:02
answered Apr 9 '14 at 20:27
maxschlepzig
33k32135208
33k32135208
Thanx for sharing the script but its not checking the end timestamp.. Can you please check. Also let me know what if i have the logs like 2014-04-07 23:59:58 . I mean without braces
– Amit
Apr 9 '14 at 21:47
@Amit, updated the answer
– maxschlepzig
Apr 10 '14 at 7:15
Although I don't think this is a string problem (see my answer), you could make yours much more readable, and probably a bit faster, by not repeating all of the tests:$1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}/ && $2 ~/[0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ { Time = $1" "$2; if (Time >= "2014-04-07 23:00" ) { p=1 } if (Time >= "2014-04-08 02:00:01" ) { p=0 } } p
– user61786
Apr 10 '14 at 8:34
Hi Max, One more small doubt.. If i have something like Apr-07-2014 10:51:17 . Then what changes i need to do.. I triedcode
$0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 11:00" { p=1 } $0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 12:00:01" { p=0 }code
but its not working
– Amit
Apr 10 '14 at 13:30
@awk_FTW, changed the code such that the regex is explicitly shared.
– maxschlepzig
Apr 10 '14 at 20:04
|
show 1 more comment
Thanx for sharing the script but its not checking the end timestamp.. Can you please check. Also let me know what if i have the logs like 2014-04-07 23:59:58 . I mean without braces
– Amit
Apr 9 '14 at 21:47
@Amit, updated the answer
– maxschlepzig
Apr 10 '14 at 7:15
Although I don't think this is a string problem (see my answer), you could make yours much more readable, and probably a bit faster, by not repeating all of the tests:$1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}/ && $2 ~/[0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ { Time = $1" "$2; if (Time >= "2014-04-07 23:00" ) { p=1 } if (Time >= "2014-04-08 02:00:01" ) { p=0 } } p
– user61786
Apr 10 '14 at 8:34
Hi Max, One more small doubt.. If i have something like Apr-07-2014 10:51:17 . Then what changes i need to do.. I triedcode
$0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 11:00" { p=1 } $0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 12:00:01" { p=0 }code
but its not working
– Amit
Apr 10 '14 at 13:30
@awk_FTW, changed the code such that the regex is explicitly shared.
– maxschlepzig
Apr 10 '14 at 20:04
Thanx for sharing the script but its not checking the end timestamp.. Can you please check. Also let me know what if i have the logs like 2014-04-07 23:59:58 . I mean without braces
– Amit
Apr 9 '14 at 21:47
Thanx for sharing the script but its not checking the end timestamp.. Can you please check. Also let me know what if i have the logs like 2014-04-07 23:59:58 . I mean without braces
– Amit
Apr 9 '14 at 21:47
@Amit, updated the answer
– maxschlepzig
Apr 10 '14 at 7:15
@Amit, updated the answer
– maxschlepzig
Apr 10 '14 at 7:15
Although I don't think this is a string problem (see my answer), you could make yours much more readable, and probably a bit faster, by not repeating all of the tests:
$1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}/ && $2 ~/[0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ { Time = $1" "$2; if (Time >= "2014-04-07 23:00" ) { p=1 } if (Time >= "2014-04-08 02:00:01" ) { p=0 } } p
– user61786
Apr 10 '14 at 8:34
Although I don't think this is a string problem (see my answer), you could make yours much more readable, and probably a bit faster, by not repeating all of the tests:
$1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}/ && $2 ~/[0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ { Time = $1" "$2; if (Time >= "2014-04-07 23:00" ) { p=1 } if (Time >= "2014-04-08 02:00:01" ) { p=0 } } p
– user61786
Apr 10 '14 at 8:34
Hi Max, One more small doubt.. If i have something like Apr-07-2014 10:51:17 . Then what changes i need to do.. I tried
code
$0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 11:00" { p=1 } $0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 12:00:01" { p=0 } code
but its not working– Amit
Apr 10 '14 at 13:30
Hi Max, One more small doubt.. If i have something like Apr-07-2014 10:51:17 . Then what changes i need to do.. I tried
code
$0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 11:00" { p=1 } $0 ~ /^[a-z|A-Z]{4}-[0-9]{2}-[0-9]{4} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ && $1" "$2 >= "Apr-07-2014 12:00:01" { p=0 } code
but its not working– Amit
Apr 10 '14 at 13:30
@awk_FTW, changed the code such that the regex is explicitly shared.
– maxschlepzig
Apr 10 '14 at 20:04
@awk_FTW, changed the code such that the regex is explicitly shared.
– maxschlepzig
Apr 10 '14 at 20:04
|
show 1 more comment
up vote
10
down vote
Check out dategrep
at https://github.com/mdom/dategrep
Description:
dategrep searches the named input files for lines matching a date range and prints them to stdout.
If dategrep works on a seekable file, it can do a binary search to find the first and last line to print pretty efficiently. dategrep can also read from stdin if one the filename arguments is just a hyphen, but in this case it has to parse every single line which will be slower.
Usage examples:
dategrep --start "12:00" --end "12:15" --format "%b %d %H:%M:%S" syslog
dategrep --end "12:15" --format "%b %d %H:%M:%S" syslog
dategrep --last-minutes 5 --format "%b %d %H:%M:%S" syslog
dategrep --last-minutes 5 --format rsyslog syslog
cat syslog | dategrep --end "12:15" -
Although this limitation may make this unsuitable for your exact question:
At the moment dategrep will die as soon as it finds a line that is not parsable. In a future version this will be configurable.
I learned about this command only a couple days ago courtesy of onethingwell.org/post/81991115668/dategrep so, kudos to him!
– cpugeniusmv
Apr 9 '14 at 20:53
add a comment |
up vote
10
down vote
Check out dategrep
at https://github.com/mdom/dategrep
Description:
dategrep searches the named input files for lines matching a date range and prints them to stdout.
If dategrep works on a seekable file, it can do a binary search to find the first and last line to print pretty efficiently. dategrep can also read from stdin if one the filename arguments is just a hyphen, but in this case it has to parse every single line which will be slower.
Usage examples:
dategrep --start "12:00" --end "12:15" --format "%b %d %H:%M:%S" syslog
dategrep --end "12:15" --format "%b %d %H:%M:%S" syslog
dategrep --last-minutes 5 --format "%b %d %H:%M:%S" syslog
dategrep --last-minutes 5 --format rsyslog syslog
cat syslog | dategrep --end "12:15" -
Although this limitation may make this unsuitable for your exact question:
At the moment dategrep will die as soon as it finds a line that is not parsable. In a future version this will be configurable.
I learned about this command only a couple days ago courtesy of onethingwell.org/post/81991115668/dategrep so, kudos to him!
– cpugeniusmv
Apr 9 '14 at 20:53
add a comment |
up vote
10
down vote
up vote
10
down vote
Check out dategrep
at https://github.com/mdom/dategrep
Description:
dategrep searches the named input files for lines matching a date range and prints them to stdout.
If dategrep works on a seekable file, it can do a binary search to find the first and last line to print pretty efficiently. dategrep can also read from stdin if one the filename arguments is just a hyphen, but in this case it has to parse every single line which will be slower.
Usage examples:
dategrep --start "12:00" --end "12:15" --format "%b %d %H:%M:%S" syslog
dategrep --end "12:15" --format "%b %d %H:%M:%S" syslog
dategrep --last-minutes 5 --format "%b %d %H:%M:%S" syslog
dategrep --last-minutes 5 --format rsyslog syslog
cat syslog | dategrep --end "12:15" -
Although this limitation may make this unsuitable for your exact question:
At the moment dategrep will die as soon as it finds a line that is not parsable. In a future version this will be configurable.
Check out dategrep
at https://github.com/mdom/dategrep
Description:
dategrep searches the named input files for lines matching a date range and prints them to stdout.
If dategrep works on a seekable file, it can do a binary search to find the first and last line to print pretty efficiently. dategrep can also read from stdin if one the filename arguments is just a hyphen, but in this case it has to parse every single line which will be slower.
Usage examples:
dategrep --start "12:00" --end "12:15" --format "%b %d %H:%M:%S" syslog
dategrep --end "12:15" --format "%b %d %H:%M:%S" syslog
dategrep --last-minutes 5 --format "%b %d %H:%M:%S" syslog
dategrep --last-minutes 5 --format rsyslog syslog
cat syslog | dategrep --end "12:15" -
Although this limitation may make this unsuitable for your exact question:
At the moment dategrep will die as soon as it finds a line that is not parsable. In a future version this will be configurable.
answered Apr 9 '14 at 20:48
cpugeniusmv
2,0621024
2,0621024
I learned about this command only a couple days ago courtesy of onethingwell.org/post/81991115668/dategrep so, kudos to him!
– cpugeniusmv
Apr 9 '14 at 20:53
add a comment |
I learned about this command only a couple days ago courtesy of onethingwell.org/post/81991115668/dategrep so, kudos to him!
– cpugeniusmv
Apr 9 '14 at 20:53
I learned about this command only a couple days ago courtesy of onethingwell.org/post/81991115668/dategrep so, kudos to him!
– cpugeniusmv
Apr 9 '14 at 20:53
I learned about this command only a couple days ago courtesy of onethingwell.org/post/81991115668/dategrep so, kudos to him!
– cpugeniusmv
Apr 9 '14 at 20:53
add a comment |
up vote
3
down vote
One alternative to awk
or a non-standard tool is to use GNU grep
for its contextual greps. GNU's grep
will let you specify the number of lines after a positive match to print with -A
and the preceding lines to print with -B
For example:
[davisja5@xxxxxxlp01 ~]$ cat test.txt
Ignore this line, please.
This one too while you're at it...
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
we don't
want these lines.
[davisja5@xxxxxxlp01 ~]$ egrep "^[2014-04-07 23:59:58]" test.txt -A 10000 | egrep "^[2014-04-08 00:00:03]" -B 10000
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
The above essentially tells grep
to print the 10,000 lines that follow the line that matches the pattern you're wanting to start at, effectively making your output start where you're wanting it to and go until the end (hopefully) whereas the second egrep
in the pipeline tells it to only print the line with the ending delimiter and the 10,000 lines before it. The end result of these two is starting where you're wanting and not going passed where you told it to stop.
10,000 is just a number I came up with, feel free to change it to a million if you think your output is going to be too long.
How is this going to work if there is no log entry for the start and end ranges? If OP wants everything between 14:00 and 15:00, but there's no log entry for 14:00, then?
– user61786
Apr 10 '14 at 5:31
It will word about as well as thesed
which is also searching for literal matches.dategrep
is probably the most correct answer of all the ones given (since you need to be able to get "fuzzy" on what timestamps you'll accept) but like the answer says, I was just mentioning it as an alternative. That said, if the log is active enough to generate enough output to warrant cutting it's probably also going to have some kind of entry for the given timeperiod.
– Bratchley
Apr 10 '14 at 13:32
add a comment |
up vote
3
down vote
One alternative to awk
or a non-standard tool is to use GNU grep
for its contextual greps. GNU's grep
will let you specify the number of lines after a positive match to print with -A
and the preceding lines to print with -B
For example:
[davisja5@xxxxxxlp01 ~]$ cat test.txt
Ignore this line, please.
This one too while you're at it...
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
we don't
want these lines.
[davisja5@xxxxxxlp01 ~]$ egrep "^[2014-04-07 23:59:58]" test.txt -A 10000 | egrep "^[2014-04-08 00:00:03]" -B 10000
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
The above essentially tells grep
to print the 10,000 lines that follow the line that matches the pattern you're wanting to start at, effectively making your output start where you're wanting it to and go until the end (hopefully) whereas the second egrep
in the pipeline tells it to only print the line with the ending delimiter and the 10,000 lines before it. The end result of these two is starting where you're wanting and not going passed where you told it to stop.
10,000 is just a number I came up with, feel free to change it to a million if you think your output is going to be too long.
How is this going to work if there is no log entry for the start and end ranges? If OP wants everything between 14:00 and 15:00, but there's no log entry for 14:00, then?
– user61786
Apr 10 '14 at 5:31
It will word about as well as thesed
which is also searching for literal matches.dategrep
is probably the most correct answer of all the ones given (since you need to be able to get "fuzzy" on what timestamps you'll accept) but like the answer says, I was just mentioning it as an alternative. That said, if the log is active enough to generate enough output to warrant cutting it's probably also going to have some kind of entry for the given timeperiod.
– Bratchley
Apr 10 '14 at 13:32
add a comment |
up vote
3
down vote
up vote
3
down vote
One alternative to awk
or a non-standard tool is to use GNU grep
for its contextual greps. GNU's grep
will let you specify the number of lines after a positive match to print with -A
and the preceding lines to print with -B
For example:
[davisja5@xxxxxxlp01 ~]$ cat test.txt
Ignore this line, please.
This one too while you're at it...
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
we don't
want these lines.
[davisja5@xxxxxxlp01 ~]$ egrep "^[2014-04-07 23:59:58]" test.txt -A 10000 | egrep "^[2014-04-08 00:00:03]" -B 10000
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
The above essentially tells grep
to print the 10,000 lines that follow the line that matches the pattern you're wanting to start at, effectively making your output start where you're wanting it to and go until the end (hopefully) whereas the second egrep
in the pipeline tells it to only print the line with the ending delimiter and the 10,000 lines before it. The end result of these two is starting where you're wanting and not going passed where you told it to stop.
10,000 is just a number I came up with, feel free to change it to a million if you think your output is going to be too long.
One alternative to awk
or a non-standard tool is to use GNU grep
for its contextual greps. GNU's grep
will let you specify the number of lines after a positive match to print with -A
and the preceding lines to print with -B
For example:
[davisja5@xxxxxxlp01 ~]$ cat test.txt
Ignore this line, please.
This one too while you're at it...
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
we don't
want these lines.
[davisja5@xxxxxxlp01 ~]$ egrep "^[2014-04-07 23:59:58]" test.txt -A 10000 | egrep "^[2014-04-08 00:00:03]" -B 10000
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
The above essentially tells grep
to print the 10,000 lines that follow the line that matches the pattern you're wanting to start at, effectively making your output start where you're wanting it to and go until the end (hopefully) whereas the second egrep
in the pipeline tells it to only print the line with the ending delimiter and the 10,000 lines before it. The end result of these two is starting where you're wanting and not going passed where you told it to stop.
10,000 is just a number I came up with, feel free to change it to a million if you think your output is going to be too long.
edited Jul 21 '14 at 15:46
answered Apr 9 '14 at 20:58
Bratchley
11.8k64388
11.8k64388
How is this going to work if there is no log entry for the start and end ranges? If OP wants everything between 14:00 and 15:00, but there's no log entry for 14:00, then?
– user61786
Apr 10 '14 at 5:31
It will word about as well as thesed
which is also searching for literal matches.dategrep
is probably the most correct answer of all the ones given (since you need to be able to get "fuzzy" on what timestamps you'll accept) but like the answer says, I was just mentioning it as an alternative. That said, if the log is active enough to generate enough output to warrant cutting it's probably also going to have some kind of entry for the given timeperiod.
– Bratchley
Apr 10 '14 at 13:32
add a comment |
How is this going to work if there is no log entry for the start and end ranges? If OP wants everything between 14:00 and 15:00, but there's no log entry for 14:00, then?
– user61786
Apr 10 '14 at 5:31
It will word about as well as thesed
which is also searching for literal matches.dategrep
is probably the most correct answer of all the ones given (since you need to be able to get "fuzzy" on what timestamps you'll accept) but like the answer says, I was just mentioning it as an alternative. That said, if the log is active enough to generate enough output to warrant cutting it's probably also going to have some kind of entry for the given timeperiod.
– Bratchley
Apr 10 '14 at 13:32
How is this going to work if there is no log entry for the start and end ranges? If OP wants everything between 14:00 and 15:00, but there's no log entry for 14:00, then?
– user61786
Apr 10 '14 at 5:31
How is this going to work if there is no log entry for the start and end ranges? If OP wants everything between 14:00 and 15:00, but there's no log entry for 14:00, then?
– user61786
Apr 10 '14 at 5:31
It will word about as well as the
sed
which is also searching for literal matches. dategrep
is probably the most correct answer of all the ones given (since you need to be able to get "fuzzy" on what timestamps you'll accept) but like the answer says, I was just mentioning it as an alternative. That said, if the log is active enough to generate enough output to warrant cutting it's probably also going to have some kind of entry for the given timeperiod.– Bratchley
Apr 10 '14 at 13:32
It will word about as well as the
sed
which is also searching for literal matches. dategrep
is probably the most correct answer of all the ones given (since you need to be able to get "fuzzy" on what timestamps you'll accept) but like the answer says, I was just mentioning it as an alternative. That said, if the log is active enough to generate enough output to warrant cutting it's probably also going to have some kind of entry for the given timeperiod.– Bratchley
Apr 10 '14 at 13:32
add a comment |
up vote
0
down vote
Using sed :
#!/bin/bash
E_BADARGS=23
if [ $# -ne "3" ]
then
echo "Usage: `basename $0` "<start_date>" "<end_date>" file"
echo "NOTE:Make sure to put dates in between double quotes"
exit $E_BADARGS
fi
isDatePresent(){
#check if given date exists in file.
local date=$1
local file=$2
grep -q "$date" "$file"
return $?
}
convertToEpoch(){
#converts to epoch time
local _date=$1
local epoch_date=`date --date="$_date" +%s`
echo $epoch_date
}
convertFromEpoch(){
#converts to date/time format from epoch
local epoch_date=$1
local _date=`date --date="@$epoch_date" +"%F %T"`
echo $_date
}
getDates(){
# collects all dates at beginning of lines in a file, converts them to epoch and returns a sequence of numbers
local file="$1"
local state="$2"
local i=0
local date_array=( )
if [[ "$state" -eq "S" ]];then
datelist=`cat "$file" | sed -r -e "s/^[([^+)].*/1/" | egrep "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}"`
elif [[ "$state" -eq "E" ]];then
datelist=`tac "$file" | sed -r -e "s/^[([^+)].*/1/" | egrep "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}"`
else
echo "Something went wrong while getting dates..." 1>&2
exit 500
fi
while read _date
do
epoch_date=`convertToEpoch "$_date"`
date_array[$i]=$epoch_date
#echo "$_date" "$epoch_date" 1>&2
(( i++ ))
done<<<"$datelist"
echo ${date_array[@]}
}
findneighbours(){
# search next best date if date is not in the file using recursivity
IFS="$old_IFS"
local elt=$1
shift
local state="$1"
shift
local -a array=( "$@" )
index_pivot=`expr ${#array[@]} / 2`
echo "#array="${#array[@]} ";array="${array[@]} ";index_pivot="$index_pivot 1>&2
if [ "$index_pivot" -eq 1 -a ${#array[@]} -eq 2 ];then
if [ "$state" == "E" ];then
echo ${array[0]}
elif [ "$state" == "S" ];then
echo ${array[(( ${#array[@]} - 1 ))]}
else
echo "State" $state "undefined" 1>&2
exit 100
fi
else
echo "elt with index_pivot="$index_pivot":"${array[$index_pivot]} 1>&2
if [ $elt -lt ${array[$index_pivot]} ];then
echo "elt is smaller than pivot" 1>&2
array=( ${array[@]:0:(($index_pivot + 1)) } )
else
echo "elt is bigger than pivot" 1>&2
array=( ${array[@]:$index_pivot:(( ${#array[@]} - 1 ))} )
fi
findneighbours "$elt" "$state" "${array[@]}"
fi
}
findFirstDate(){
local file="$1"
echo "Looking for first date in file" 1>&2
while read line
do
echo "$line" | egrep -q "^[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}]" &>/dev/null
if [ "$?" -eq "0" ]
then
#echo "line=" "$line" 1>&2
firstdate=`echo "$line" | sed -r -e "s/^[([^+)].*/1/"`
echo "$firstdate"
break
else
echo $? 1>&2
fi
done< <( cat "$file" )
}
findLastDate(){
local file="$1"
echo "Looking for last date in file" 1>&2
while read line
do
echo "$line" | egrep -q "^[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}]" &>/dev/null
if [ "$?" -eq "0" ]
then
#echo "line=" "$line" 1>&2
lastdate=`echo "$line" | sed -r -e "s/^[([^+)].*/1/"`
echo "$lastdate"
break
else
echo $? 1>&2
fi
done< <( tac "$file" )
}
findBestDate(){
IFS="$old_IFS"
local initdate="$1"
local file="$2"
local state="$3"
local first_elts="$4"
local last_elts="$5"
local date_array=( )
local initdate_epoch=`convertToEpoch "$initdate"`
if [[ $initdate_epoch -lt $first_elt ]];then
echo `convertFromEpoch "$first_elt"`
elif [[ $initdate_epoch -gt $last_elt ]];then
echo `convertFromEpoch "$last_elt"`
else
date_array=( `getDates "$file" "$state"` )
echo "date_array="${date_array[@]} 1>&2
#first_elt=${date_array[0]}
#last_elt=${date_array[(( ${#date_array[@]} - 1 ))]}
echo `convertFromEpoch $(findneighbours "$initdate_epoch" "$state" "${date_array[@]}")`
fi
}
main(){
init_date_start="$1"
init_date_end="$2"
filename="$3"
echo "problem start.." 1>&2
date_array=( "$init_date_start","$init_date_end" )
flag_array=( 0 0 )
i=0
#echo "$IFS" | cat -vte
old_IFS="$IFS"
#changing separator to avoid whitespace issue in date/time format
IFS=,
for _date in ${date_array[@]}
do
#IFS="$old_IFS"
#echo "$IFS" | cat -vte
if isDatePresent "$_date" "$filename";then
if [ "$i" -eq 0 ];then
echo "Starting date exists" 1>&2
#echo "date_start=""$_date" 1>&2
date_start="$_date"
else
echo "Ending date exists" 1>&2
#echo "date_end=""$_date" 1>&2
date_end="$_date"
fi
else
if [ "$i" -eq 0 ];then
echo "start date $_date not found" 1>&2
else
echo "end date $_date not found" 1>&2
fi
flag_array[$i]=1
fi
#IFS=,
(( i++ ))
done
IFS="$old_IFS"
if [ ${flag_array[0]} -eq 1 -o ${flag_array[1]} -eq 1 ];then
first_elt=`convertToEpoch "$(findFirstDate "$filename")"`
last_elt=`convertToEpoch "$(findLastDate "$filename")"`
border_dates_array=( "$first_elt","$last_elt" )
#echo "first_elt=" $first_elt "last_elt=" $last_elt 1>&2
i=0
IFS=,
for _date in ${date_array[@]}
do
if [ $i -eq 0 -a ${flag_array[$i]} -eq 1 ];then
date_start=`findBestDate "$_date" "$filename" "S" "${border_dates_array[@]}"`
elif [ $i -eq 1 -a ${flag_array[$i]} -eq 1 ];then
date_end=`findBestDate "$_date" "$filename" "E" "${border_dates_array[@]}"`
fi
(( i++ ))
done
fi
sed -r -n "/^[${date_start}]/,/^[${date_end}]/p" "$filename"
}
main "$1" "$2" "$3"
Copy this in a file. If you don't want to see debugging info, debugging is sent to stderr so just add "2>/dev/null"
1
This wont display the log files which dont have time stamp.
– Amit
Apr 9 '14 at 21:57
@Amit, yes it will, have you tried?
– UnX
Apr 10 '14 at 0:36
@rMistero, it won't work because if there is no log entry at 22:30, the range won't be terminated. As OP mentioned, the start and stop times may not be in the logs. You can tweak your regex for it to work, but you'll loose resolution and never be guaranteed in advance that the range will terminate at the right time.
– user61786
Apr 10 '14 at 5:21
@awk_FTW this was an example, I didn't use the timestamps provided by Amit. Again regex can be used. I agree thought it won't work if timestamp doesn't exists when provided explicitely or no timestamp regex matches. I ll improve it soon..
– UnX
Apr 10 '14 at 14:59
"As OP mentioned, the start and stop times may not be in the logs." No, read the OP again. OP says those WILL be present but intervening lines won't necessarily start with a timestamp. It doesn't even make sense to say the stop times might not be present. How could you ever tell any tool where to stop if the termination marker isn't guaranteed to be there? There would be no criteria to give the tool to tell it where to stop processing.
– Bratchley
Apr 10 '14 at 17:47
|
show 5 more comments
up vote
0
down vote
Using sed :
#!/bin/bash
E_BADARGS=23
if [ $# -ne "3" ]
then
echo "Usage: `basename $0` "<start_date>" "<end_date>" file"
echo "NOTE:Make sure to put dates in between double quotes"
exit $E_BADARGS
fi
isDatePresent(){
#check if given date exists in file.
local date=$1
local file=$2
grep -q "$date" "$file"
return $?
}
convertToEpoch(){
#converts to epoch time
local _date=$1
local epoch_date=`date --date="$_date" +%s`
echo $epoch_date
}
convertFromEpoch(){
#converts to date/time format from epoch
local epoch_date=$1
local _date=`date --date="@$epoch_date" +"%F %T"`
echo $_date
}
getDates(){
# collects all dates at beginning of lines in a file, converts them to epoch and returns a sequence of numbers
local file="$1"
local state="$2"
local i=0
local date_array=( )
if [[ "$state" -eq "S" ]];then
datelist=`cat "$file" | sed -r -e "s/^[([^+)].*/1/" | egrep "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}"`
elif [[ "$state" -eq "E" ]];then
datelist=`tac "$file" | sed -r -e "s/^[([^+)].*/1/" | egrep "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}"`
else
echo "Something went wrong while getting dates..." 1>&2
exit 500
fi
while read _date
do
epoch_date=`convertToEpoch "$_date"`
date_array[$i]=$epoch_date
#echo "$_date" "$epoch_date" 1>&2
(( i++ ))
done<<<"$datelist"
echo ${date_array[@]}
}
findneighbours(){
# search next best date if date is not in the file using recursivity
IFS="$old_IFS"
local elt=$1
shift
local state="$1"
shift
local -a array=( "$@" )
index_pivot=`expr ${#array[@]} / 2`
echo "#array="${#array[@]} ";array="${array[@]} ";index_pivot="$index_pivot 1>&2
if [ "$index_pivot" -eq 1 -a ${#array[@]} -eq 2 ];then
if [ "$state" == "E" ];then
echo ${array[0]}
elif [ "$state" == "S" ];then
echo ${array[(( ${#array[@]} - 1 ))]}
else
echo "State" $state "undefined" 1>&2
exit 100
fi
else
echo "elt with index_pivot="$index_pivot":"${array[$index_pivot]} 1>&2
if [ $elt -lt ${array[$index_pivot]} ];then
echo "elt is smaller than pivot" 1>&2
array=( ${array[@]:0:(($index_pivot + 1)) } )
else
echo "elt is bigger than pivot" 1>&2
array=( ${array[@]:$index_pivot:(( ${#array[@]} - 1 ))} )
fi
findneighbours "$elt" "$state" "${array[@]}"
fi
}
findFirstDate(){
local file="$1"
echo "Looking for first date in file" 1>&2
while read line
do
echo "$line" | egrep -q "^[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}]" &>/dev/null
if [ "$?" -eq "0" ]
then
#echo "line=" "$line" 1>&2
firstdate=`echo "$line" | sed -r -e "s/^[([^+)].*/1/"`
echo "$firstdate"
break
else
echo $? 1>&2
fi
done< <( cat "$file" )
}
findLastDate(){
local file="$1"
echo "Looking for last date in file" 1>&2
while read line
do
echo "$line" | egrep -q "^[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}]" &>/dev/null
if [ "$?" -eq "0" ]
then
#echo "line=" "$line" 1>&2
lastdate=`echo "$line" | sed -r -e "s/^[([^+)].*/1/"`
echo "$lastdate"
break
else
echo $? 1>&2
fi
done< <( tac "$file" )
}
findBestDate(){
IFS="$old_IFS"
local initdate="$1"
local file="$2"
local state="$3"
local first_elts="$4"
local last_elts="$5"
local date_array=( )
local initdate_epoch=`convertToEpoch "$initdate"`
if [[ $initdate_epoch -lt $first_elt ]];then
echo `convertFromEpoch "$first_elt"`
elif [[ $initdate_epoch -gt $last_elt ]];then
echo `convertFromEpoch "$last_elt"`
else
date_array=( `getDates "$file" "$state"` )
echo "date_array="${date_array[@]} 1>&2
#first_elt=${date_array[0]}
#last_elt=${date_array[(( ${#date_array[@]} - 1 ))]}
echo `convertFromEpoch $(findneighbours "$initdate_epoch" "$state" "${date_array[@]}")`
fi
}
main(){
init_date_start="$1"
init_date_end="$2"
filename="$3"
echo "problem start.." 1>&2
date_array=( "$init_date_start","$init_date_end" )
flag_array=( 0 0 )
i=0
#echo "$IFS" | cat -vte
old_IFS="$IFS"
#changing separator to avoid whitespace issue in date/time format
IFS=,
for _date in ${date_array[@]}
do
#IFS="$old_IFS"
#echo "$IFS" | cat -vte
if isDatePresent "$_date" "$filename";then
if [ "$i" -eq 0 ];then
echo "Starting date exists" 1>&2
#echo "date_start=""$_date" 1>&2
date_start="$_date"
else
echo "Ending date exists" 1>&2
#echo "date_end=""$_date" 1>&2
date_end="$_date"
fi
else
if [ "$i" -eq 0 ];then
echo "start date $_date not found" 1>&2
else
echo "end date $_date not found" 1>&2
fi
flag_array[$i]=1
fi
#IFS=,
(( i++ ))
done
IFS="$old_IFS"
if [ ${flag_array[0]} -eq 1 -o ${flag_array[1]} -eq 1 ];then
first_elt=`convertToEpoch "$(findFirstDate "$filename")"`
last_elt=`convertToEpoch "$(findLastDate "$filename")"`
border_dates_array=( "$first_elt","$last_elt" )
#echo "first_elt=" $first_elt "last_elt=" $last_elt 1>&2
i=0
IFS=,
for _date in ${date_array[@]}
do
if [ $i -eq 0 -a ${flag_array[$i]} -eq 1 ];then
date_start=`findBestDate "$_date" "$filename" "S" "${border_dates_array[@]}"`
elif [ $i -eq 1 -a ${flag_array[$i]} -eq 1 ];then
date_end=`findBestDate "$_date" "$filename" "E" "${border_dates_array[@]}"`
fi
(( i++ ))
done
fi
sed -r -n "/^\[${date_start}\]/,/^\[${date_end}\]/p" "$filename"
}
main "$1" "$2" "$3"
Copy this into a file and make it executable. Debugging output is sent to stderr, so if you don't want to see it, just append 2>/dev/null to the command.
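As a usage illustration (the script name and log path below are placeholders, not part of the answer), you would save the script, make it executable, and pass it the start date, the end date and the log file, as required by the argument check at the top:
chmod +x extract_logs.sh
./extract_logs.sh "2014-04-07 23:00:00" "2014-04-08 02:00:00" /tmp/application.log 2>/dev/null
If one of the two dates has no exact match in the file, the script falls back to the nearest timestamp it can find (findBestDate) before running the final sed range.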
edited Apr 12 '14 at 19:50
answered Apr 9 '14 at 21:38
UnX
72648
1
This won't display the log lines which don't have a timestamp.
– Amit
Apr 9 '14 at 21:57
@Amit, yes it will, have you tried?
– UnX
Apr 10 '14 at 0:36
@rMistero, it won't work because if there is no log entry at exactly 22:30, the range won't be terminated. As the OP mentioned, the start and stop times may not be in the logs. You can tweak your regex to make it work, but you'll lose resolution and can never be guaranteed in advance that the range will terminate at the right time.
– user61786
Apr 10 '14 at 5:21
@awk_FTW this was an example; I didn't use the timestamps provided by Amit. Again, a regex can be used. I agree, though, that it won't work if the timestamp doesn't exist when provided explicitly, or if no timestamp regex matches. I'll improve it soon.
– UnX
Apr 10 '14 at 14:59
"As OP mentioned, the start and stop times may not be in the logs." No, read the OP again. OP says those WILL be present but intervening lines won't necessarily start with a timestamp. It doesn't even make sense to say the stop times might not be present. How could you ever tell any tool where to stop if the termination marker isn't guaranteed to be there? There would be no criteria to give the tool to tell it where to stop processing.
– Bratchley
Apr 10 '14 at 17:47
|
show 5 more comments
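Side note on the limitation discussed in the comments above: a literal range match can only terminate if a line carrying the exact end timestamp actually exists. Below is a minimal sketch of an alternative that sidesteps this by comparing epoch seconds instead of matching the boundary lines literally. It is not part of the answer above; it assumes GNU awk for mktime(), and the log file name is a placeholder.
gawk -v from="2014-04-07 23:00:00" -v to="2014-04-08 02:00:00" '
function epoch(ts) {                        # "[YYYY-MM-DD HH:MM:SS]" -> seconds since the epoch
    gsub(/[][]/, "", ts)                    # drop surrounding brackets, if any
    gsub(/[-:]/, " ", ts)                   # mktime() expects "YYYY MM DD HH MM SS"
    return mktime(ts)
}
BEGIN { f = epoch(from); t = epoch(to) }
/^\[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\]/ {
    ts = epoch(substr($0, 1, 21))           # the bracketed timestamp is the first 21 characters
    inrange = (ts >= f && ts <= t)
}
inrange                                     # print timestamped and untimestamped lines inside the range
' /tmp/application.log
Lines without a timestamp inherit the state of the last timestamped line, so everything between the two boundaries is printed even when the exact start or end time never appears in the log.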