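Recently, while doing batch data loads, I ran the PL/SQL embedded in a shell script.
After the script runs, it produces a log in roughly the following format: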
Get Dump file for APP_TMP.TESTRESS_NAME_LINK...
Elapsed: 00:00:00.64
.
DB details is accessible from source schema ...
.
DB details is accessible from target schema ...
.
Directory ext_datapump_dir has read,write permission ,proceed...
.
SYNONYM TESTEEMENT exists in CONNECT account,proceed...
.


Get Dump file for APP_TMP.TESTEEMENT...
Elapsed: 00:00:00.49
.
DB details is accessible from source schema ...
.
DB details is accessible from target schema ...
.
Directory ext_datapump_dir has read,write permission ,proceed...
.
SYNONYM TESTEEMENT_RESOURCE exists in CONNECT account,proceed...
.


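Because there are many tables, the log is hard to read. I wanted a report-style view showing how long each table took, so the timings could be seen at a glance.

The desired result looks something like this: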

#########################################################################
Table_name                                          Elapsed        time
#########################################################################
APP_TMP.TESTRESS_DATA...                           Elapsed: 00:00:01.13
APP_TMP.TESTRESS_NAME_LINK...                      Elapsed: 00:00:00.64
APP_TMP.TESTEEMENT...                              Elapsed: 00:00:00.49
APP_TMP.TESTEEMENT_RESOURCE...                     Elapsed: 00:00:00.74
APP_TMP.TEST_RES_HISTORY...                        Elapsed: 00:00:00.82
APP_TMP.TEST_ACCOUNT...                            Elapsed: 00:00:01.03
APP_TMP.TEST_ADDRESS_NAME...                       Elapsed: 00:00:00.78
APP_TMP.TEST_AGED_TRIAL_BALANCE...                 Elapsed: 00:00:01.16
APP_TMP.TEST_BILLING_ARRANGEMENT...                Elapsed: 00:00:00.61
APP_TMP.TEST_CHARGE_GROUP...                       Elapsed: 00:00:01.66
APP_TMP.TEST_CHARGES...                            Elapsed: 00:00:06.73
APP_TMP.TEST_CREDIT_DEBIT_LINK...                  Elapsed: 00:00:01.67
APP_TMP.TEST_CUSTOMER_CREDIT...                    Elapsed: 00:00:00.40
APP_TMP.TEST_DEPOSIT_REQUEST...                    Elapsed: 00:00:00.10
APP_TMP.TEST_DIRECT_DEBIT_REQUEST...               Elapsed: 00:00:00.67
APP_TMP.TEST_INVOICE...                            Elapsed: 00:00:01.98
APP_TMP.TEST_PAY_CHANNEL...                        Elapsed: 00:00:00.53
APP_TMP.TEST_PAYMENT...                            Elapsed: 00:00:01.28
APP_TMP.TEST_PAYMENT_ACTIVITY...                   Elapsed: 00:00:00.19


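The first step is to find the lines containing the keyword; the line right after each one holds the elapsed time. I wanted to do this with a simple command, and grep turned out to be a big help: the -A1 option prints each matching line together with the one line that follows it.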
grep -A1 --color=auto "Get Dump file for " extract.log
The output looks like this:
--
Get Dump file for APP_TMP.TESTTOMER...
Elapsed: 00:00:00.91
--
Get Dump file for APP_TMP.TESTNT_DISTRIBUTE...
Elapsed: 00:00:00.84
--
Get Dump file for APP_TMP.TEST_MEMO...
Elapsed: 00:00:22.27
--
Get Dump file for APP_TMP.TESTE_DATA...
Elapsed: 00:00:01.55
--
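
To try this step without the real log, a few lines can be generated on the fly. The sketch below assumes GNU grep; the table names TABLE_A and TABLE_B and the temporary file are made up for illustration. It also shows that the "--" separator appears only between groups, never after the last one.

```shell
#!/bin/sh
# Build a tiny, made-up log modelled on the format shown earlier.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
Get Dump file for APP_TMP.TABLE_A...
Elapsed: 00:00:00.64
.
DB details is accessible from source schema ...
.
Get Dump file for APP_TMP.TABLE_B...
Elapsed: 00:00:00.49
.
EOF

# -A1 prints each matching line plus the one line after it;
# groups that are not adjacent are separated by a "--" line.
matches=$(grep -A1 "Get Dump file for " "$sample_log")
printf '%s\n' "$matches"

rm -f "$sample_log"
```

With this input the result should be five lines: two name/time pairs with a single "--" between them.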

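That gets partway there. Next, the redundant text "Get Dump file for " has to be removed, and each elapsed time has to be moved onto the same line as its table name.
sed can do both.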
sed 's/Get Dump file for //' |sed '$!N;$!N;s/\n/ /g'

APP_TMP.TESTRESS_DATA... Elapsed: 00:00:01.13 --
APP_TMP.TESTRESS_NAME_LINK... Elapsed: 00:00:00.64 --
APP_TMP.TESTEEMENT... Elapsed: 00:00:00.49 --
APP_TMP.TESTEEMENT_RESOURCE... Elapsed: 00:00:00.74 --
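
The joining step can be tried on a few made-up lines. This is only a sketch: the table names are hypothetical, and the input imitates grep -A1 output. It uses `$!N`, which appends the next line only when the current line is not the last one, so the final group (which has no trailing "--") is joined as well.

```shell
#!/bin/sh
# Made-up grep -A1 style input: two groups separated by "--", none after the last.
input='Get Dump file for APP_TMP.TABLE_A...
Elapsed: 00:00:00.64
--
Get Dump file for APP_TMP.TABLE_B...
Elapsed: 00:00:00.49'

# 1st sed: strip the fixed prefix.
# 2nd sed: pull up to two following lines into the pattern space ($!N twice),
#          then turn the embedded newlines into spaces.
joined=$(printf '%s\n' "$input" |
  sed 's/Get Dump file for //' |
  sed '$!N;$!N;s/\n/ /g')
printf '%s\n' "$joined"
```

Each group ends up on one line; only the groups that were followed by a separator carry the trailing "--".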

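That is basically the desired result, but the output is rough and the columns do not line up. This is where awk helps.
The following script formats the output: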
awk '
BEGIN {
    print "#########################################################################"
    printf "%-50s %8s %11s \n", "Table_name", "Elapsed", "time"
    print "#########################################################################"
}
# $4 (the "--" separator, when present) is left out
{ printf "%-50s %8s %11s \n", $1, $2, $3 }'
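
To see what the field widths do, the same printf format can be run on two made-up joined lines (the table names are hypothetical). `%-50s` left-justifies the name in 50 columns, while `%8s` and `%11s` right-justify the label and the time; the trailing space in the format is dropped here.

```shell
#!/bin/sh
# Two made-up lines in the shape produced by the sed step.
rows='APP_TMP.TABLE_A... Elapsed: 00:00:00.64 --
APP_TMP.TABLE_B... Elapsed: 00:00:00.49'

# awk splits on whitespace: $1 = table name, $2 = "Elapsed:", $3 = the time.
report=$(printf '%s\n' "$rows" |
  awk '{ printf "%-50s %8s %11s\n", $1, $2, $3 }')
printf '%s\n' "$report"
```

Every row comes out 71 characters wide (50 + 1 + 8 + 1 + 11), with "Elapsed:" starting at column 52 regardless of the name length.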


Chained together, these steps produce a basically complete report. The full command is:
grep -A1 --color=auto "Get Dump file for " extract.log |
sed 's/Get Dump file for //' |
sed '$!N;$!N;s/\n/ /g' |
awk '
BEGIN {
    print "#########################################################################"
    printf "%-50s %8s %11s \n", "Table_name", "Elapsed", "time"
    print "#########################################################################"
}
# $4 (the "--" separator, when present) is left out
{ printf "%-50s %8s %11s \n", $1, $2, $3 }'

Output:
#########################################################################
Table_name                                          Elapsed        time
#########################################################################
APP_TMP.TESTRESS_DATA...                           Elapsed: 00:00:01.13
APP_TMP.TESTRESS_NAME_LINK...                      Elapsed: 00:00:00.64
APP_TMP.TESTEEMENT...                              Elapsed: 00:00:00.49
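
As an alternative to the grep | sed | awk chain, the same report could probably be produced by awk alone in a single pass. This is a sketch rather than the approach used above: `report_from_log` is a hypothetical helper name, the demo log is made up, and it assumes each "Get Dump file for" line is followed by an "Elapsed:" line before the next table starts.

```shell
#!/bin/sh
# report_from_log is a hypothetical helper name used only in this sketch.
report_from_log() {
  awk '
    BEGIN {
      rule = "#########################################################################"
      print rule
      printf "%-50s %8s %11s\n", "Table_name", "Elapsed", "time"
      print rule
    }
    # Remember the table name (5th field) from each "Get Dump file for" line.
    /^Get Dump file for / { table = $5; next }
    # The next "Elapsed:" line carries the time for that table.
    /^Elapsed:/ && table != "" {
      printf "%-50s %8s %11s\n", table, $1, $2
      table = ""
    }
  ' "$1"
}

# Demo on a made-up log; with the real file: report_from_log extract.log
demo_log=$(mktemp)
printf '%s\n' \
  'Get Dump file for APP_TMP.TABLE_A...' 'Elapsed: 00:00:00.64' '.' \
  'Get Dump file for APP_TMP.TABLE_B...' 'Elapsed: 00:00:00.49' > "$demo_log"
single_pass=$(report_from_log "$demo_log")
printf '%s\n' "$single_pass"
rm -f "$demo_log"
```

Because the table name is tracked in a variable, this version does not depend on the "--" separator or on how many lines each group contains.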