diff -Nru pgbadger-2.0/ChangeLog pgbadger-5.0/ChangeLog
--- pgbadger-2.0/ChangeLog 2012-09-18 09:56:31.000000000 +0000
+++ pgbadger-5.0/ChangeLog 2014-02-06 15:29:04.000000000 +0000
@@ -1,56 +1,809 @@
+2014-02-05 version 5.0
+
+This new major release adds some new features such as an incremental mode and
+a histogram of SQL query times. There is also an hourly graphical
+representation of the count and average duration of the top normalized
+queries. The same is available for errors and events, so you can see
+graphically at which hours they occur most often.
+
+The incremental mode is an old request issued at PgCon Ottawa 2012 that
+concerns the ability to build incremental reports with successive runs of
+pgBadger. It is now possible to run pgBadger every day, or even every hour,
+and have cumulative reports per day and per week. A top index page allows you
+to go directly to the weekly and daily reports.
+
+This mode has been built with simplicity in mind, so running pgBadger from
+cron as follows:
+
+    0 23 * * * pgbadger -q -I -O /var/www/pgbadger/ /var/log/postgresql.log
+
+is enough to have daily and weekly reports viewable in your browser.
+
+You can take a look at a sample report at http://dalibo.github.io/pgbadger/demov5/index.html
+
+There's also a useful improvement that allows pgBadger to seek directly to
+the last position in the same log file on a subsequent run. This feature is
+only available when using the incremental mode or the -l option and parsing a
+single log file. Let's say you have a weekly rotated log file and want to run
+pgBadger each day. With 2GB of log per day, pgBadger was spending 5 minutes
+per block of 2 GB to reach the last position in the log, so at the end of the
+week this feature will save you 35 minutes. Now pgBadger starts parsing new
+log entries immediately. This feature is compatible with the multiprocess
+mode using the -j option (n processes for one log file).
+
+The histogram of query times is a new report in the top queries slide that
+shows the distribution of query times over the analyzed period. For example:
+
+    Range          Count       Percentage
+    --------------------------------------------
+    0-1ms          10,367,313  53.52%
+    1-5ms             799,883   4.13%
+    5-10ms            451,646   2.33%
+    10-25ms         2,965,883  15.31%
+    25-50ms         4,510,258  23.28%
+    50-100ms          180,975   0.93%
+    100-500ms          87,613   0.45%
+    500-1000ms          5,856   0.03%
+    1000-10000ms        2,697   0.01%
+    > 10000ms              74   0.00%
+
+There are also some graphic and report improvements, such as the mouse
+tracker formatting, which has been reworked. It now shows a vertical
+crosshair and all dataset values at once when the mouse pointer moves over a
+series. Automatic query formatting has also been changed: it is now done on a
+double-click event, as a simple click was painful when you wanted to copy
+part of a query.
+
+The report "Simultaneous Connections" has been relabeled "Established
+Connections", which is less confusing, as many people think it shows the
+number of simultaneous sessions, which is not the case. It only counts the
+number of connections established at the same time.
+
+Autovacuum reports now associate the database name with the autovacuum and
+autoanalyze entries. Statistics now refer to "dbname.schema.table"; previous
+versions only showed the pair "schema.table".
+
+This release also adds Session peak information and a report about
+Simultaneous sessions. The log_connections and log_disconnections parameters
+must be enabled in postgresql.conf.
+
+Complete ChangeLog:
+
+    - Fix size of SQL queries columns to prevent exceeding screen width.
+    - Add new histogram reports on top normalized queries and top errors
+      or events. They show at which hours and in what quantity the queries
+      or errors appear.
+    - Add seeking to the last parser position in the log file in incremental
+      mode. This prevents parsing the whole file again to find the last line
+      parsed in the previous run. This only works when parsing a single flat
+      file; the -j option is permitted. Thanks to ioguix for the kick.
+    - Rewrite reloading of last log time from binary files.
+    - Fix missing statistics of last parsed queries in incremental mode.
+    - Fix bug in incremental mode that prevented reindexing a previous day.
+      Thanks to Martin Prochazka for the great help.
+    - Fix missing label "Avg duration" on column header in details of Most
+      frequent queries (N).
+    - Add vertical crosshair on graphs.
+    - Fix case where queries and events were not updated when using the -b
+      and -e command line options. Thanks to Nicolas Thauvin for the report.
+    - Fix week sorting on incremental report main index page. Thanks to
+      Martin Prochazka for the report.
+    - Add "Histogram of query times" report to show statistics like
+      0-100ms: 80%, 100-500ms: 14%, 500-1000ms: 3%, > 1000ms: 1%.
+      Thanks to tmihail for the feature request.
+    - Format mouse tracker on graphs to show all dataset values at once.
+    - Add control of the -o vs -O options in incremental mode to prevent
+      wrong use.
+    - Change log level of missing LAST_PARSED.tmp file to WARNING and
+      add a HINT.
+    - Update copyright date to 2014.
+    - Fix empty reports of connections. Thanks to Reeshna Ramakrishnan
+      for the report.
+    - Fix display of connections peak when no connection was reported.
+    - Fix warning on META_MERGE for ExtUtils::MakeMaker < 6.46. Thanks
+      to Julien Rouhaud for the patch.
+    - Add documentation about automatic incremental mode.
+    - Add incremental mode to pgBadger. This mode builds a report per day
+      and a cumulative report per week. It also creates an index interface
+      for easier access to the different reports. Must be run, for example,
+      as:
+        pgbadger /var/log/postgresql.log.1 -I -O /var/www/pgbadger/
+      after a daily PostgreSQL log file rotation.
+    - Add -O | --outdir path to specify the directory where output files
+      must be saved.
+    - Automatic query formatting is now done on a double-click event; a
+      simple click was painful when you wanted to copy part of a query.
+      Thanks to Guillaume Smet for the feature request.
+    - Remove calls to binmode used to force HTML file output to utf8, as
+      there were some bad side effects. Thanks to akorotkov for the report.
+    - Remove use of the Time::HiRes Perl module, as some distributions do
+      not include this module in the core Perl install by default.
+    - Fix "Wide character in print" Perl message by setting binmode
+      to :utf8. Thanks to Casey Allen Shobe for the report.
+    - Fix application name search regex to handle application names with
+      spaces like "pgAdmin III - Query Tool".
+    - Fix wrong timestamps saved with top queries. Thanks to Herve Werner
+      for the report.
+    - Fix missing log type statistics when using binary mode. Thanks to
+      Herve Werner for the report.
+    - Fix Queries by application table column header: Database replaced
+      by Application. Thanks to Herve Werner for the report.
+    - Add "Max number of times the same event was reported" report in
+      Global stats Events tab.
+    - Replace "Number of errors" by "Number of ERROR entries" and add
+      "Number of FATAL entries".
+    - Replace "Number of errors" by "Number of events" and "Total errors
+      found" by "Total events found" in Events reports.
Thanks to Herve + Werner for the report. + - Fix title error in Sessions per database. + - Fix clicking on the info link to not go back to the top of the page. + Thanks to Guillaume Smet for the report and solution. + - Fix incremental report from binary output where binary data was not + loaded if no queries were present in log file. Thanks to Herve Werner + for the report. + - Fix parsing issue when log_error_verbosity = verbose. Thanks to vorko + for the report. + - Add Session peak information and a report about Simultaneous sessions. + log_connections+log_disconnections must be enabled in postgresql.conf. + - Fix wrong requests number in Queries by user and by host. Thanks to + Jehan-Guillaume de Rorthais for the report. + - Fix issue with rsyslog format failing to parse logs. Thanks to Tim + Sampson for the report. + - Associate autovacuum and autoanalyze log entry to the corresponding + database name. Thanks to Herve Werner for the feature request. + - Change "Simultaneous Connections" label into "Established Connections", + it is less confusing as many people think that this is the number of + simultaneous sessions, which is not the case. It only count the number + of connections established at same time. Thanks to Ronan Dunklau for + the report. + +2013-11-08 version 4.1 + +This release fixes two major bugs and some others minor issues. There's also a +new command line option --exclude-appname that allow exclusion from the report +of queries generated by a specific program, like pg_dump. Documentation have +been updated with a new chapter about building incremental reports. + + - Add log_autovacuum_min_duration into documentation in chapter about + postgresql configuration directives. Thanks to Herve Werner for the + report. + - Add chapter about "Incremental reports" into documentation. + - Fix reports with per minutes average where last time fraction was + not reported. Thanks to Ludovic Levesque and Vincent Laborie for the + report. + - Fix unterminated comment in information popup. Thanks to Ronan + Dunklau for the patch. + - Add --exclude-appname command line option to eliminate unwanted + traffic generated by a specific application. Thanks to Steve Crawford + for the feature request. + - Allow external links use into URL to go to a specific report. Thanks + to Hubert depesz Lubaczewski for the feature request. + - Fix empty reports when parsing compressed files with the -j option + which is not allowed with compressed file. Thanks to Vincent Laborie + for the report. + - Prevent progress bar length to increase after 100% when real size is + greater than estimated size (issue found with huge compressed file). + - Correct some spelling and grammar in ChangeLog and pgbadger. Thanks + to Thom Brown for the patch. + - Fix major bug on SQL traffic reports with wrong min value and bad + average value on select reports, add min/max for select queries. + Thanks to Vincent Laborie for the report. + +2013-10-31 - Version 4.0 + +This major release is the "Say goodbye to the fouine" release. With a full +rewrite of the reports design, pgBadger has now turned the HTML reports into +a more intuitive user experience and professional look. + +The report is now driven by a dynamic menu with the help of the embedded +Bootstrap library. Every main menu corresponds to a hidden slide that is +revealed when the menu or one of its submenus is activated. There's +also the embedded font Font Awesome webfont to beautify the report. 
+ +Every statistics report now includes a key value section that immediately +shows you some of the relevant information. Pie charts have also been +separated from their data tables using two tabs, one for the chart and the +other one for the data. + +Tables reporting hourly statistics have been moved to a multiple tabs report +following the data. This is used with General (queries, connections, sessions), +Checkpoints (buffer, files, warnings), Temporary files and Vacuums activities. + +There's some new useful information shown in the key value sections. Peak +information shows the number and datetime of the highest activity. Here is the +list of those reports: + + - Queries peak + - Read queries peak + - Write queries peak + - Connections peak + - Checkpoints peak + - WAL files usage Peak + - Checkpoints warnings peak + - Temporary file size peak + - Temporary file number peak + +Reports about Checkpoints and Restartpoints have been merged into a single report. +These are almost one in the same event, except that restartpoints occur on a slave +cluster, so there was no need to distinguish between the two. + +Recent PostgreSQL versions add additional information about checkpoints, the +number of synced files, the longest sync and the average of sync time per file. +pgBadger collects and shows this information in the Checkpoint Activity report. + +There's also some new reports: + + - Prepared queries ratio (execute vs prepare) + - Prepared over normal queries + - Queries (select, insert, update, delete) per user/host/application + - Pie charts for tables with the most tuples and pages removed during vacuum. + +The vacuum report will now highlight the costly tables during a vacuum or +analyze of a database. + +The errors are now highlighted by a different color following the level. +A LOG level will be green, HINT will be yellow, WARNING orange, ERROR red +and FATAL dark red. + +Some changes in the binary format are not backward compatible and the option +--client has been removed as it has been superseded by --dbclient for a long time now. + +If you are running a pg_dump or some batch process with very slow queries, your +report analysis will be hindered by those queries having unwanted prominence in the +report. Before this release it was a pain to exclude those queries from the +report. Now you can use the --exclude-time command line option to exclude all +traces matching the given time regexp from the report. For example, let's say +you have a pg_dump at 13:00 each day during half an hour, you can use pgbadger +as follows: + + pgbadger --exclude-time "2013-09-.* 13:.*" postgresql.log + +If you are also running a pg_dump at night, let's say 22:00, you can write it +as follows: + + pgbadger --exclude-time '2013-09-\d+ 13:[0-3]' --exclude-time '2013-09-\d+ 22:[0-3]' postgresql.log + +or more shortly: + + pgbadger --exclude-time '2013-09-\d+ (13|22):[0-3]' postgresql.log + +Exclude time always requires the iso notation yyyy-mm-dd hh:mm:ss, even if log +format is syslog. This is the same for all time-related options. Use this option +with care as it has a high cost on the parser performance. + + +2013-09-17 - version 3.6 + +Still an other version in 3.x branch to fix two major bugs in vacuum and checkpoint +graphs. Some other minors bugs has also been fixed. + + - Fix grammar in --quiet usage. Thanks to stephen-a-ingram for the report. + - Fix reporting period to starts after the last --last-parsed value instead + of the first log line. Thanks to Keith Fiske for the report. 
+ - Add --csv-separator command line usage to documentation. + - Fix CSV log parser and add --csv-separator command line option to allow + change of the default csv field separator, coma, in any other character. + - Avoid "negative look behind not implemented" errors on perl 5.16/5.18. + Thanks to Marco Baringer for the patch. + - Support timestamps for begin/end with fractional seconds (so it'll handle + postgresql's normal string representation of timestamps). + - When using negative look behind set sub-regexp to -i (not case insensitive) + to avoid issues where some upper case letter sequence, like SS or ST. + - Change shebang from /usr/bin/perl to /usr/bin/env perl so that user-local + (perlbrew) perls will get used. + - Fix empty graph of autovacuum and autoanalyze. + - Fix checkpoint graphs that was not displayed any more. + + +2013-07-11 - Version 3.5 + +Last release of the 3.x branch, this is a bug fix release that also adds some +pretty print of Y axis number on graphs and a new graph that groups queries +duration series that was shown as second Y axis on graphs, as well as a new +graph with number of temporary file that was also used as second Y axis. + + - Split temporary files report into two graphs (files size and number + of file) to no more used a second Y axis with flotr2 - mouse tracker + is not working as expected. + - Duration series representing the second Y axis in queries graph have + been removed and are now drawn in a new "Average queries duration" + independant graph. + - Add pretty print of numbers in Y axis and mouse tracker output with + PB, TB, GB, KB, B units, and seconds, microseconds. Number without + unit are shown with P, T, M, K suffix for easiest very long number + reading. + - Remove Query type reports when log only contains duration. + - Fix display of checkpoint hourly report with no entry. + - Fix count in Query type report. + - Fix minimal statistics output when nothing was load from log file. + Thanks to Herve Werner for the report. + - Fix several bug in log line parser. Thanks to Den Untevskiy for the + report. + - Fix bug in last parsed storage when log files was not provided in the + right order. Thanks to Herve Werner for the report. + - Fix orphan lines wrongly associated to previous queries instead of + temporary file and lock logged statement. Thanks to Den Untevskiy for + the report. + - Fix number of different samples shown in events report. + - Escape HTML tags on error messages examples. Thanks to Mael Rimbault + for the report. + - Remove some temporary debug informations used with some LOG messages + reported as events. + - Fix several issues with restartpoint and temporary files reports. + Thanks to Guillaume Lelarge for the report. + - Fix issue when an absolute path was given to the incremental file. + Thanks to Herve Werner for the report. + - Remove creation of incremental temp file $tmp_last_parsed when not + running in multiprocess mode. Thanks to Herve Werner for the report. + + +2013-06-18 - Version 3.4 + +This release adds lot of graphic improvements and a better rendering with logs +over few hours. There's also some bug fixes especially on report of queries that +generate the most temporary files. + + - Update flotr2.min.js to latest github code. + - Add mouse tracking over y2axis. + - Add label/legend information to ticks displayed on mouseover graphs. + - Fix documentation about log_statement and log_min_duration_statement. + Thanks to Herve Werner for the report. 
+ - Fix missing top queries for locks and temporary files in multiprocess + mode. + - Cleanup code to remove storage of unused information about connection. + - Divide the huge dump_as_html() method with one method per each report. + - Checkpoints, restart points and temporary files are now drawn using a + period of 5 minutes per default instead of one hour. Thanks to Josh + Berkus for the feature request. + - Change fixed increment of one hour to five minutes on queries graphs + "SELECT queries" and "Write queries". Remove graph "All queries" as, + with a five minutes increment, it duplicates the "Queries per second". + Thanks to Josh Berkus for the feature request. + - Fix typos. Thanks to Arsen Stasic for the patch. + - Add default HTML charset to utf-8 and a command line option --charset + to be able to change the default. Thanks to thomas hankeuhh for the + feature request. + - Fix missing temporary files query reports in some conditions. Thanks + to Guillaume Lelarge and Thomas Reiss for the report. + - Fix some parsing issue with log generated by pg 7.4. + - Update documentation about missing new reports introduced in previous + version 3.3. + +Note that it should be the last release of the 3.x branch unless there's major +bug fixes, but next one will be a major release with a completely new design. + + +2013-05-01 - Version 3.3 + +This release adds four more useful reports about queries that generate locks and +temporary files. An other new report about restart point on slaves and several +bugs fix or cosmetic change. Support to parallel processing under Windows OS has +been removed. + + - Remove parallel processing under Windows platform, the use of waitpid + is freezing pgbadger. Thanks to Saurabh Agrawal for the report. I'm + not comfortable with that OS this is why support have been removed, + if someone know how to fix that, please submit a patch. + - Fix Error in tempfile() under Windows. Thanks to Saurabh Agrawal for + the report. + - Fix wrong queries storage with lock and temporary file reports. Thanks + to Thomas Reiss for the report. + - Add samples queries to "Most frequent waiting queries" and "Queries + generating the most temporary files" report. + - Add two more reports about locks: 'Most frequent waiting queries (N)", + and "Queries that waited the most". Thanks to Thomas Reiss for the + patch. + - Add two reports about temporary files: "Queries generating the most + temporary files (N)" and "Queries generating the largest temporary + files". Thanks to Thomas Reiss for the patch. + - Cosmetic change to the Min/Max/Avg duration columns. + - Fix report of samples error with csvlog format. Thanks to tpoindessous + for the report. + - Add --disable-autovacuum to the documentation. Thanks to tpoindessous + for the report. + - Fix unmatched ) in regex when using %s in prefix. + - Fix bad average size of temporary file in Overall statistics report. + Thanks to Jehan Guillaume de Rorthais for the report. + - Add restartpoint reporting. Thanks to Guillaume Lelarge for the patch. + - Made some minor change in CSS. + - Replace %% in log line prefix internally by a single % so that it + could be exactly the same than in log_line_prefix. Thanks to Cal + Heldenbrand for the report. + - Fix perl documentation header, thanks to Cyril Bouthors for the patch. + +2013-04-07 - Version 3.2 + +This is mostly a bug fix release, it also adds escaping of HTML code inside +queries and the adds Min/Max reports with Average duration in all queries +reports. 
+ + - In multiprocess mode, fix case where pgbadger does not update + the last-parsed file and do not take care of the previous run. + Thanks to Kong Man for the report. + - Fix case where pgbadger does not update the last-parsed file. + Thanks to Kong Man for the report. + - Add CDATA to make validator happy. Thanks to Euler Taveira de + Oliveira for the patch. + - Some code review by Euler Taveira de Oliveira, thanks for the + patch. + - Fix case where stat were multiplied by N when -J was set to N. + Thanks to thegnorf for the report. + - Add a line in documentation about log_statement that disable + log_min_duration_statement when it is set to all. + - Add quick note on how to contribute, thanks to Damien Clochard + for the patch. + - Fix issue with logs read from stdin. Thanks to hubert depesz + lubaczewski for the report. + - Force pgbadger to not try to beautify queries bigger than 10kb, + this will take too much time. This value can be reduce in the + future if hang with long queries still happen. Thanks to John + Rouillard for the report. + - Fix an other issue in replacing bind param when the bind value + is alone on a single line. Thanks to Kjeld Peters for the report. + - Fix parsing of compressed files together with uncompressed files + using the the -j option. Uncompressed files are now processed using + split method and compressed ones are parsed per one dedicated process. + - Replace zcat by gunzip -c to fix an issue on MacOsx. Thanks to + Kjeld Peters for the report. + - Escape HTML code inside queries. Thanks to denstark for the report. + - Add Min/Max in addition to Average duration values in queries reports. + Thanks to John Rouillard fot the feature request. + - Fix top slowest array size with binary format. + - Fix an other case with bind parameters with value in next line and + the top N slowest queries that was repeated until N even if the real + number of queries was lower. Thanks to Kjeld Peters for the reports. + - Fix non replacement of bind parameters where there is line breaks in + the parameters, aka multiline bind parameters. Thanks to Kjeld Peters + for the report. + - Fix error with seekable export tag with Perl v5.8. Thanks to Jeff Bohmer + for the report. + - Fix parsing of non standard syslog lines begining with a timestamp like + "2013-02-28T10:35:11-05:00". Thanks to Ryan P. Kelly for the report. + - Fix issue #65 where using -c | --dbclient with csvlog was broken. Thanks + to Jaime Casanova for the report. + - Fix empty report in watchlog mode (-w option). + +2013-02-21 - Version 3.1 + +This is a quick release to fix missing reports of most frequent errors and slowest +normalized queries in previous version published yesterday. + + - Fix empty report in watchlog mode (-w option). + - Force immediat die on command line options error. + - Fix missing report of most frequent events/errors report. Thanks to + Vincent Laborie for the report. + - Fix missing report of slowest normalized queries. Thanks to Vincent + Laborie for the report. + - Fix display of last print of progress bar when quiet mode is enabled. + +2013-02-20 - Version 3.0 + +This new major release adds parallel log processing by using as many cores as +wanted to parse log files, the performances gain is directly related to the +number of cores specified. There's also new reports about autovacuum/autoanalyze +informations and many bugs have been fixed. + + - Update documentation about log_duration, log_min_duration_statement + and log_statement. 
+ - Rewrite dirty code around log timestamp comparison to find timestamp + of the specified begin or ending date. + - Remove distinction between logs with duration enabled from variables + log_min_duration_statement and log_duration. Commands line options + --enable-log_duration and --enable-log_min_duration have been removed. + - Update documentation about parallel processing. + - Remove usage of Storable::file_magic to autodetect binary format file, + it is not include in core perl 5.8. Thanks to Marc Cousin for the + report. + - Force multiprocess per file when files are compressed. Thanks to + Julien Rouhaud for the report. + - Add progress bar logger for multiprocess by forking a dedicated + process and using pipe. Also fix some bugs in using binary format + that duplicate query/error samples per process. + - chmod 755 pgbadger + - Fix checkpoint reports when there is no checkpoint warnings. + - Fix non report of hourly connections/checkpoint/autovacuum when not + query is found in log file. Thanks to Guillaume Lelarge for the + report. + - Add better handling of signals in multiprocess mode. + - Add -J|--job_per_file command line option to force pgbadger to use + one process per file instead of using all to parse one file. Useful + to have better performances with lot of small log file. + - Fix parsing of orphan lines with stderr logs and log_line_prefix + without session information into the prefix (%l). + - Update documentation about -j | --jobs option. + - Allow pgbadger to use several cores, aka multiprocessing. Add options + -j | --jobs option to specify the number of core to use. + - Add autovacuum and autoanalyze infos to binary format. + - Fix case in SQL code highlighting where QQCODE temp keyword was not + replaced. Thanks to Julien Ruhaud for the report. + - Fix CSS to draw autovacuum graph and change legend opacity. + - Add pie graph to show repartition of number of autovacuum per table + and number of tuples removed by autovacuum per table. + - Add debug information about selected type of log duration format. + - Add report of tuples/pages removed in report of Vacuums by table. + - Fix major bug on syslog parser where years part of the date was + wrongly extracted from current date with logs generated in 2012. + - Fix issue with Perl 5.16 that do not allow "ss" inside look-behind + assertions. Thanks to Cedric for the report. + - New vacuum and analyze hourly reports and graphs. Thanks to Guillaume + Lelarge for the patch. + +UPGRADE: if you are running pgbadger by cron take care if you were using one of +the following option: --enable-log_min_duration and --enable-log_duration, they +have been removed and pgbadger will refuse to start. + +2013-01-17 - Version 2.3 + +This release fixes several major issues especially with csvlog and a memory leak +with log parsing using a start date. There's also several improvement like new +reports of number of queries by database and application. Mouse over reported +queries will show database, user, remote client and application name where they +are executed. + +A new binary input/output format have been introduced to allow saving or reading +precomputed statistics. This will allow incremental reports based on periodical +runs of pgbader. This is a work in progress fully available with next coming +major release. + +Several SQL code beautifier improvement from pgFormatter have also been merged. + + - Clarify misleading statement about log_duration: log_duration may be + turned on depending on desired information. 
Only log_statement must + not be on. Thanks to Matt Romaine for the patch. + - Fix --dbname and --dbuser not working with csvlog format. Thanks to + Luke Cyca for the report. + - Fix issue in SQL formatting that prevent left back indentation when + major keywords were found. Thanks to Kevin Brannen for the report. + - Display 3 decimals in time report so that ms can be seen. Thanks to + Adam Schroder for the request. + - Force the parser to not insert a new line after the SET keyword when + the query begin with it. This is to preserve the single line with + queries like SET client_encoding TO "utf8"; + - Add better SQL formatting of update queries by adding a new line + after the SET keyword. Thanks to pilat66 for the report. + - Update copyright and documentation. + - Queries without application name are now stored under others + application name. + - Add report of number of queries by application if %a is specified in + the log_line_prefix. + - Add link menu to the request per database and limit the display of + this information when there is more than one database. + - Add report of requests per database. + - Add report of user,remote client and application name to all request + info. + - Fix memory leak with option -b (--begin) and in incremental log + parsing mode. + - Remove duration part from log format auto-detection. Thanks to + Guillaume Lelarge for the report. + - Fix a performance issue on prettifying SQL queries that makes pgBagder + several time slower that usual to generate the HTML output. Thanks to + Vincent Laborie for the report. + - Add missing SQL::Beautify paternity. + - Add 'binary' format as input/output format. The binary output format + allows to save log statistics in a non human readable file instead of + an HTML or text file. These binary files might then be used as regular + input files, combined or not, to produce a html or txt report. Thanks + to Jehan Guillaume de Rorthais for the patch. + - Remove port from the session regex pattern to match all lines. + - Fix the progress bar. It was trying to use gunzip to get real file + size for all formats (by default). Unbreak the bz2 format (that does + not report real size) and add support for zip format. Thanks to Euler + Taveira de Oliveira fort the patch. + - Fix some typos and grammatical issues. Thanks to Euler Taveira de + Oliveira fort the patch. + - Improve SQL code highlighting and keywords detection merging change + from pgFormatter project. + - Add support to hostname or ip address in the client detection. Thanks + to stuntmunkee for the report. + - pgbadger will now only reports execute statement of the extended + protocol (parse/bind/execute). Thanks to pierrestroh for the report. + - Fix numerous typos as well as formatting and grammatical issues. + Thanks to Thom Brown for the patch. + - Add backward compatibility to obsolete --client command line option. + If you were using the short option -c nothing is changed. + - Fix issue with --dbclient and %h in log_line_prefix. Thanks to Julien + Rouhaud for the patch. + - Fix multiline progress bar output. + - Allow usage of a dash into database, user and application names when + prefix is used. Thanks to Vipul for the report. + - Mouse over queries will now show in which database they are executed + in the overviews (Slowest queries, Most frequent queries, etc. ). + Thank to Dirk-Jan Bulsink for the feature request. + - Fix missing keys on %cur_info hash. Thanks to Marc Cousin for the + report. 
+ - Move opening file handle to log file into a dedicated function. + Thanks to Marc Cousin for the patch. + - Replace Ctrl+M by printable \r. Thanks to Marc Cousin for the report. + + +2012-11-13 - Version 2.2 + +This release add some major features like tsung output, speed improvement with +csvlog, report of shut down events, new command line options to generate report +excluding some user(s), to build report based on select queries only, to specify +regex of the queries that must only be included in the report and to remove +comments from queries. Lot of bug fixes, please upgrade. + + - Update PostgreSQL keywords list for 9.2 + - Fix number of queries in progress bar with tsung output. + - Remove obsolete syslog-ng and temporary syslog-ll log format added to + fix some syslog autodetection issues. There is now just one syslog + format: syslog, differences between syslog formats are detected and + the log parser is adaptive. + - Add comment about the check_incremental_position() method + - Fix reports with empty graphs when log files were not in chronological + order. + - Add report of current total of queries and events parsed in progress + bar. Thanks to Jehan-Guillaume de Rorthais for the patch. + - Force pgBadger to use an require the XS version of Text::CSV instead + of the Pure Perl implementation. It is a good bit faster thanks to + David Fetter for the patch. Note that using csvlog is still a bit + slower than syslog or stderr log format. + - Fix several issue with tsung output. + - Add report of shut down events + - Add debug information on command line used to pipe compressed log + file when -v is provide. + - Add -U | --exclude-user command line option to generate report + excluded user. Thanks to Birta Levente for the feature request. + - Allow some options to be specified multiple time or be written as a + coma separated list of value, here are these options: --dbname, + --dbuser, --dbclient, --dbappname, --exclude_user. + - Add -S | --select-only option to build report only on select queries. + - Add first support to tsung output, see usage. Thanks to Guillaume + Lelarge for the feature request. + - Add --include-query and --include-file to specify regex of the queries + that must only be included in the report. Thanks to Marc Cousin for + the feature request. + - Fix auto detection of log_duration and log_min_duration_statement + format. + - Fix parser issue with Windows logs without timezone information. + Thanks to Nicolas Thauvin for the report. + - Fix bug in %r = remote host and port log line prefix detection. + Thanks to Hubert Depesz Lubaczewski for the report. + - Add -C | --nocomment option to remove comment like /* ... */ from + queries. Thanks to Hubert Depesz Lubaczewski for the feature request. + - Fix escaping of log_line_prefix. Thanks to Hubert Depesz Lubaczewski + for the patch. + - Fix wrong detection of update queries when a query has a object names + containing update and set. Thanks to Vincent Laborie for the report. + +2012-10-10 - Version 2.1 + +This release add a major feature by allowing any custom log_line_prefix to be +used by pgBadger. With stderr output you at least need to log the timestamp (%t) +the pid (%p) and the session/line number (%l). Support to log_duration instead +of log_min_duration_statement to allow reports simply based on duration and +count report without query detail and report. Lot of bug fixes, please upgrade +asap. 
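+
+For example (the prefix below is only an illustration and must match the
+log_line_prefix actually set in your postgresql.conf; the log file path is
+also illustrative), a custom stderr prefix carrying the required %t, %p and
+%l patterns can be declared and then passed to pgBadger with -p | --prefix:
+
+    # postgresql.conf (illustrative value)
+    log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,client=%h '
+
+    # matching pgbadger invocation
+    perl pgbadger --prefix '%t [%p]: [%l-1] user=%u,db=%d,client=%h' /var/log/postgresql.log
+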
+ + - Add new --enable-log_min_duration option to force pgbadger to use lines + generated by the log_min_duration_statement even if the log_duration + format is autodetected. Useful if you use both but do not log all queries. + Thanks to Vincent Laborie for the feature request. + - Add syslog-ng format to better handle syslog traces with notation like: + [ID * local2.info]. It is autodetected but can be forced in the -f option + with value set to: syslog-ng. + - Add --enable-log_duration command line option to force pgbadger to only + use the log_duration trace even if log_min_duration_statement traces are + autodetected. + - Fix display of empty hourly graph when no data were found. + - Remove query type report when log_duration is enabled. + - Fix a major bug in query with bind parameter. Thanks to Marc Cousin for + the report. + - Fix detection of compressed log files and allow automatic detection + and uncompress of .gz, .bz2 and .zip files. + - Add gunzip -l command to find the real size of a gzip compressed file. + - Fix log_duration only reports to not take care about query detail but + just count and duration. + - Fix issue with compressed csvlog. Thanks to Philip Freeman for the + report. + - Allow usage of log_duration instead of log_min_duration_statement to + just collect statistics about the number of queries and their time. + Thanks to Vincent Laborie for the feature request. + - Fix issue on syslog format and autodetect with additional info like: + [ID * local2.info]. Thanks to kapsalar for the report. + - Removed unrecognized log line generated by deadlock_timeout. + - Add missing information about unsupported csv log input from stdin. + It must be read from a file. Thank to Philip Freeman for the report. + - Fix issue #28: Illegal division by zero with log file without query + and txt output. Thanks to rlowe for the report. + - Update documentation about the -N | --appname option. + - Rename --name option into --appname. Thanks to Guillaume Lellarge for + the patch. + - Fix min/max value in xasis that was always represented 2 days by + default. Thanks to Casey Allen Shobe for the report. + - Fix major bug when running pgbadger with the -e option. Thanks to + Casey Allen Shobe for the report and the great help + - Change project url to http://dalibo.github.com/pgbadger/. Thanks to + Damien Clochard for this new hosting. + - Fix lot of issues in CSV parser and force locale to be C. Thanks to + Casey Allen Shobe for the reports. + - Improve speed with custom log_line_prefix. + - Merge pull request #26 from elementalvoid/helpdoc-fix + - Fixed help text for --exclude-file. Old help text indicated that the + option name was --exclude_file which was incorrect. + - Remove the obsolete --regex-user and --regex-db options that was used + to specify a search pattern in the log_line_prefix to find the user + and db name. This is replaced by the --prefix option. + - Replace Time column report header by Hour. + - Fix another issue in log_line_prefix parser with stderr format + - Add a more complex example using log_line_prefix + - Fix log_line_prefix issue when using timepstamp with millisecond. + - Add support to use any custom log_line_prefix with new option -p or + --prefix. See README for an example. + - Fix false autodetection of CSV format when log_statement is enable or + in possible other cases. This was resulting in error: "FATAL: cannot + use CSV". Thanks to Thomas Reiss for the report. 
+ - Fix display of empty graph of connections per seconds + - Allow character : in log line prefix, it will no more break the log + parsing. Thanks to John Rouillard for the report. + - Add report of configuration parameter changes into the errors report + and change errors report by events report to handle important messages + that are not errors. + - Allow pgbadger to recognize " autovacuum launcher" messages. + 2012-08-21 - version 2.0 -This major version adds some changes not backward compatible with previous versions. Options --p and -g are not more used as progress bar and graphs generation are enabled by default now. -The obsolete -l option use to specify the log file to parse has been reused to specify an -incremental file. Outside these changes and some bug fix there's also new features: - - * Using an incremental file with -l option allow to parse multiple time a single log file - and to "seek" at the last line parsed during the previous run. Useful if you have a log - rotation not sync with your pgbadger run. For exemple you can run somthing like this: +This major version adds some changes not backward compatible with previous +versions. Options -p and -g are not more used as progress bar and graphs +generation are enabled by default now. + +The obsolete -l option use to specify the log file to parse has been reused to +specify an incremental file. Outside these changes and some bug fix there's +also new features: + + * Using an incremental file with -l option allow to parse multiple time a + single log file and to "seek" at the last line parsed during the previous + run. Useful if you have a log rotation not sync with your pgbadger run. + For exemple you can run somthing like this: pgbadger `find /var/log/postgresql/ -name "postgresql*" -mtime -7 -type f` \ -o report_`date +%F`.html -l /var/run/pgbadger/last_run.log - * All queries diplayed in the HTML report are now clickable to display or hide a nice - SQL query format. This is called SQL format beautifier. + * All queries diplayed in the HTML report are now clickable to display or + hide a nice SQL query format. This is called SQL format beautifier. * CSV log parser have been entirely rewritten to handle csv with multiline. Every one should upgrade. - - Change license from BSD like to PostgreSQL license. Request from Robert Treat. - - Fix wrong pointer on Connections per host menu. Reported by Jean-Paul Argudo. - - Small fix for sql formatting adding scrollbars. Patch by Julien Rouhaud. - - Add SQL format beautifier on SQL queries. When you will click on a query it - will be beautified. Patch by Gilles Darold - - The progress bar is now enabled by default, the -p option has been removed. - Use -q | --quiet to disable it. Patch by Gilles Darold - - Graphs are now generated by default for HTML output, option -g as been remove - and option -G added to allow disabling graph generation. Request from Julien - Rouhaud, patch by Gilles Darold. + - Change license from BSD like to PostgreSQL license. Request from + Robert Treat. + - Fix wrong pointer on Connections per host menu. Reported by Jean-Paul + Argudo. + - Small fix for sql formatting adding scrollbars. Patch by Julien + Rouhaud. + - Add SQL format beautifier on SQL queries. When you will click on a + query it will be beautified. Patch by Gilles Darold + - The progress bar is now enabled by default, the -p option has been + removed. Use -q | --quiet to disable it. Patch by Gilles Darold. 
+ - Graphs are now generated by default for HTML output, option -g as + been remove and option -G added to allow disabling graph generation. + Request from Julien Rouhaud, patch by Gilles Darold. - Remove option -g and -p to the documentation. Patch by Gilles Darold. - Fix case sensitivity in command line options. Patch by Julien Rouhaud. - Add -T|--title option to change report title. Patch by Yury Bushmelev. - - Add new option --exclude-file to exclude specific commands with regex stated - in a file. This is a rewrite by Gilles Darold of the neoeahit (Vipul) patch. - - CSV log parser have been entirely rewritten to handle csv with multiline, it - also adds approximative duration for csvlog. Reported by Ludhimila Kendrick, - patch by Gilles Darold. - - Alphabetical reordering of options list in method usage() and documentation. + - Add new option --exclude-file to exclude specific commands with regex + stated in a file. This is a rewrite by Gilles Darold of the neoeahit + (Vipul) patch. + - CSV log parser have been entirely rewritten to handle csv with multi + line, it also adds approximative duration for csvlog. Reported by + Ludhimila Kendrick, patch by Gilles Darold. + - Alphabetical reordering of options list in method usage() and + documentation. Patch by Gilles Darold. + - Remove obsolete -l | --logfile command line option, the -l option + will be reused to specify an incremental file. Patch by Gilles Darold. + - Add -l | --last-parsed options to allow incremental run of pgbadger. Patch by Gilles Darold. - - Remove obsolete -l | --logfile command line option, the -l option will be reused - to specify an incremental file. Patch by Gilles Darold. - - Add -l | --last-parsed options to allow incremental run of pgbadger. Patch - by Gilles Darold. - - Replace call to timelocal_nocheck by timegm_nocheck, to convert date/time into - second from the epoch. This should fix timezone issue. Patch by Gilles Darold. - - Change regex on log parser to allow missing ending space in log_line_prefix. - - This seems a common mistake. Patch by Gilles Darold. - - print a warning when an empty log file is found. Patch by Gilles Darold. + - Replace call to timelocal_nocheck by timegm_nocheck, to convert date + time into second from the epoch. This should fix timezone issue. + Patch by Gilles Darold. + - Change regex on log parser to allow missing ending space in + log_line_prefix. This seems a common mistake. Patch by Gilles Darold. + - print warning when an empty log file is found. Patch by Gilles Darold. - Add perltidy rc file to format pgbadger Perl code. Patch from depesz. - 2012-07-15 - version 1.2 This version adds some reports and fixes a major issue in log parser. Every one diff -Nru pgbadger-2.0/CONTRIBUTING.md pgbadger-5.0/CONTRIBUTING.md --- pgbadger-2.0/CONTRIBUTING.md 1970-01-01 00:00:00.000000000 +0000 +++ pgbadger-5.0/CONTRIBUTING.md 2014-02-06 15:29:04.000000000 +0000 @@ -0,0 +1,9 @@ +# How to contribute # + +##Before Submitting an issue## + +1. Upgrade to the latest version of pgBadger and see if the problem remains + +2. Look at the [closed issues](https://github.com/dalibo/pgbadger/issues?state=closed), we may have alreayd answered to a similar problem + +3. [Read the doc](http://dalibo.github.com/pgbadger/documentation.html). It is short and useful. 
diff -Nru pgbadger-2.0/debian/bzr-builder.manifest pgbadger-5.0/debian/bzr-builder.manifest --- pgbadger-2.0/debian/bzr-builder.manifest 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/debian/bzr-builder.manifest 2014-03-19 15:11:13.000000000 +0000 @@ -1,3 +1,2 @@ -# bzr-builder format 0.3 deb-version {debupstream}-0~133 -lp:pgbadger revid:git-v1:abf49e1a6b9f8547643551d07c28777adf346f88 -nest packaging lp:~guilhem-fr/pgbadger/debian debian revid:guilhem+ubuntu@lettron.fr-20120905150220-gu7iqgldnokqhi7w +# bzr-builder format 0.3 deb-version {debupstream}-0~5 +lp:~stub/ubuntu/precise/pgbadger/devel revid:stuart@stuartbishop.net-20140217090428-83ciqqie8u8l5b3d diff -Nru pgbadger-2.0/debian/changelog pgbadger-5.0/debian/changelog --- pgbadger-2.0/debian/changelog 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/debian/changelog 2014-03-19 15:11:13.000000000 +0000 @@ -1,11 +1,37 @@ -pgbadger (2.0-0~133~quantal1) quantal; urgency=low +pgbadger (5.0-0~5~ubuntu14.04.1) trusty; urgency=low * Auto build. - -- Guilhem Lettron Tue, 18 Sep 2012 09:56:31 +0000 + -- Stuart Bishop Mon, 17 Feb 2014 09:13:47 +0000 -pgbadger (2.0-1) unstable; urgency=low +pgbadger (5.0-0) precise; urgency=low - * Initial Release. (Closes: #679362) + * New upstream release. - -- Guilhem Lettron Wed, 5 Sep 2012 16:56:22 +0200 + -- Stuart Bishop (Work) Mon, 17 Feb 2014 16:02:24 +0700 + +pgbadger (3.3-2) unstable; urgency=low + + * Fixed debian/copyright file + + -- Cyril Bouthors Wed, 01 May 2013 22:23:30 +0200 + +pgbadger (3.3-1) unstable; urgency=low + + * New upstream release + + -- Cyril Bouthors Wed, 01 May 2013 22:11:23 +0200 + +pgbadger (3.2-1) unstable; urgency=low + + * New upstream release + * Fixed dependencies thanks to Marc Fournier + (closes: #704996). + + -- Cyril Bouthors Mon, 08 Apr 2013 18:56:55 +0200 + +pgbadger (1.0-1) unstable; urgency=low + + * Initial release (closes: #679362). + + -- Cyril Bouthors Sun, 27 Jan 2013 02:06:14 +0100 diff -Nru pgbadger-2.0/debian/compat pgbadger-5.0/debian/compat --- pgbadger-2.0/debian/compat 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/debian/compat 2014-03-19 15:11:13.000000000 +0000 @@ -1 +1 @@ -8 +5 diff -Nru pgbadger-2.0/debian/control pgbadger-5.0/debian/control --- pgbadger-2.0/debian/control 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/debian/control 2014-03-19 15:11:13.000000000 +0000 @@ -1,31 +1,24 @@ Source: pgbadger -Section: perl +Section: admin Priority: optional -Maintainer: Guilhem Lettron -Build-Depends: debhelper (>= 8) -Build-Depends-Indep: perl -Standards-Version: 3.9.2 -Homepage: http://search.cpan.org/dist/pgBadger/ +Maintainer: Cyril Bouthors +Uploaders: Cyril Bouthors , + Cyril Bouthors +Build-Depends: debhelper (>= 5) +Standards-Version: 3.9.4 Package: pgbadger Architecture: all Depends: ${misc:Depends}, ${perl:Depends} -Description: unknown - pgBadger is a PostgreSQL log analyzer built for speed with fully detailed - reports from your PostgreSQL log file. It's a single and small Perl script - that aims to replace and outperform the old php script pgFouine. +Suggests: libtext-csv-xs-perl +Description: Fast PostgreSQL log analysis report + pgBadger is a PostgreSQL log analyzer build for speed with fully detailed + reports from your PostgreSQL log file. It's a single and small script written + in pure Perl language. . - By the way, we would like to thank Guillaume Smet for all the work he has - done on this really nice tool. We've been using it a long time, it is a - really great tool! 
+ It uses a javascript library to draw graphs so that you don't need additional + Perl modules or any other package to install. Furthermore, this library gives + us more features such as zooming. . - pgBadger is written in pure Perl language. It uses a javascript library to - draw graphs so that you don't need additional Perl modules or any other - package to install. Furthermore, this library gives us additional features, - such as zooming. - . - pgBadger is able to autodetect your log file format (syslog, stderr or - csvlog). It is designed to parse huge log files, as well as gzip compressed - file. See a complete list of features below. - . - This description was automagically extracted from the module by dh-make-perl. + pgBadger is able to autodetect your log file format (syslog, stderr or csvlog). + It is designed to parse huge log files as well as gzip compressed file. diff -Nru pgbadger-2.0/debian/copyright pgbadger-5.0/debian/copyright --- pgbadger-2.0/debian/copyright 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/debian/copyright 2014-03-19 15:11:13.000000000 +0000 @@ -1,42 +1,32 @@ -Format-Specification: http://anonscm.debian.org/viewvc/dep/web/deps/dep5.mdwn?view=markup&pathrev=135 -Maintainer: unknown -Source: http://search.cpan.org/dist/pgBadger/ -Name: pgBadger -DISCLAIMER: This copyright info was automatically extracted - from the perl module. It may not be accurate, so you better - check the module sources in order to ensure the module for its - inclusion in Debian or for general legal information. Please, - if licensing information is incorrectly generated, file a bug - on dh-make-perl. - NOTE: Don't forget to remove this disclaimer once you are happy - with this file. - -Files: * -Copyright: unknown -License: unparsable - -Files: debian/* -Copyright: 2012, Guilhem Lettron -License: unparsable or Artistic or GPL-1+ - -License: unparsable - No known license could be automatically determined for this module. - If this module conforms to a commonly used license, please report this - as a bug in dh-make-perl. In any case, please find the proper license - and fix this file! - -License: Artistic - This program is free software; you can redistribute it and/or modify - it under the terms of the Artistic License, which comes with Perl. - . - On Debian systems, the complete text of the Artistic License can be - found in `/usr/share/common-licenses/Artistic'. - -License: GPL-1+ - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 1, or (at your option) - any later version. - . - On Debian systems, the complete text of version 1 of the GNU General - Public License can be found in `/usr/share/common-licenses/GPL-1'. +This package was debianized by Cyril Bouthors on +Sun Jan 27 02:09:25 CET 2013 + +It was downloaded from + +Upstream Author: + + Gilles Darold + +Copyright: + + + +License: + +Copyright (c) 2012-2013, Dalibo + +Permission to use, copy, modify, and distribute this software and its +documentation for any purpose, without fee, and without a written agreement +is hereby granted, provided that the above copyright notice and this +paragraph and the following two paragraphs appear in all copies. 
+ +IN NO EVENT SHALL Dalibo BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, +SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, +ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF +Dalibo HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Dalibo SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED +TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND Dalibo +HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, +OR MODIFICATIONS. diff -Nru pgbadger-2.0/debian/pgbadger.docs pgbadger-5.0/debian/pgbadger.docs --- pgbadger-2.0/debian/pgbadger.docs 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/debian/pgbadger.docs 1970-01-01 00:00:00.000000000 +0000 @@ -1 +0,0 @@ -README diff -Nru pgbadger-2.0/debian/rules pgbadger-5.0/debian/rules --- pgbadger-2.0/debian/rules 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/debian/rules 2014-03-19 15:11:13.000000000 +0000 @@ -1,4 +1,4 @@ #!/usr/bin/make -f - +# -*- makefile -*- %: dh $@ diff -Nru pgbadger-2.0/debian/source/format pgbadger-5.0/debian/source/format --- pgbadger-2.0/debian/source/format 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/debian/source/format 1970-01-01 00:00:00.000000000 +0000 @@ -1 +0,0 @@ -3.0 (native) diff -Nru pgbadger-2.0/debian/watch pgbadger-5.0/debian/watch --- pgbadger-2.0/debian/watch 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/debian/watch 1970-01-01 00:00:00.000000000 +0000 @@ -1,2 +0,0 @@ -version=3 -http://search.cpan.org/dist/pgBadger/ .*/pgBadger-v?(\d[\d.-]+)\.(?:tar(?:\.gz|\.bz2)?|tgz|zip)$ diff -Nru pgbadger-2.0/doc/pgBadger.pod pgbadger-5.0/doc/pgBadger.pod --- pgbadger-2.0/doc/pgBadger.pod 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/doc/pgBadger.pod 2014-02-06 15:29:03.000000000 +0000 @@ -1,4 +1,4 @@ -=head1 ABSTRACT +=head1 NAME pgBadger - a fast PostgreSQL log analysis report @@ -11,65 +11,89 @@ Arguments: logfile can be a single log file, a list of files, or a shell command - returning a list of file. If you want to pass log content from stdin - use - as filename. - + returning a list of files. If you want to pass log content from stdin + use - as filename. Note that input from stdin will not work with csvlog. Options: -a | --average minutes : number of minutes to build the average graphs of queries and connections. -b | --begin datetime : start date/time for the data to be parsed in log. - -d | --dbname database : only report what concern the given database + -c | --dbclient host : only report on entries for the given client host. + -C | --nocomment : remove comments like /* ... */ from queries. + -d | --dbname database : only report on entries for the given database. -e | --end datetime : end date/time for the data to be parsed in log. -f | --format logtype : possible values: syslog,stderr,csv. Default: stderr -G | --nograph : disable graphs on HTML output. Enable by default. -h | --help : show this message and exit. -i | --ident name : programname used as syslog ident. Default: postgres + -I | --incremental : use incremental mode, reports will be generated by + days in a separate directory, --outdir must be set. + -j | --jobs number : number of jobs to run on parallel on each log file. + Default is 1, run as single process. + -J | --Jobs number : number of log file to parse in parallel. Default + is 1, run as single process. 
-l | --last-parsed file: allow incremental log parsing by registering the last datetime and line parsed. Useful if you want to watch errors since last run or if you want one report per day with a log rotated each week. - -m | --maxlength size : maximum length of a query, it will be cutted above + -m | --maxlength size : maximum length of a query, it will be restricted to the given size. Default: no truncate -n | --nohighlight : disable SQL code highlighting. - -N | --appname name : only report what concern the given application name - -o | --outfile filename: define the filename for the output. Default depends - of the output format: out.html or out.txt. To dump - output to stdout use - as filename. + -N | --appname name : only report on entries for given application name + -o | --outfile filename: define the filename for output. Default depends on + the output format: out.html, out.txt or out.tsung. + To dump output to stdout use - as filename. + -O | --outdir path : directory where out file must be saved. -p | --prefix string : give here the value of your custom log_line_prefix - defined in your postgresql.conf. You may used it only - if don't have the default allowed formats ot to use - other custom variables like client ip or application - name. See examples below. + defined in your postgresql.conf. Only use it if you + aren't using one of the standard prefixes specified + in the pgBadger documentation, such as if your prefix + includes additional variables like client ip or + application name. See examples below. -P | --no-prettify : disable SQL queries prettify formatter. - -q | --quiet : don't print anything to stdout, even not a progress bar. - -s | --sample number : number of query sample to store/display. Default: 3 - -t | --top number : number of query to store/display. Default: 20 + -q | --quiet : don't print anything to stdout, not even a progress bar. + -s | --sample number : number of query samples to store/display. Default: 3 + -S | --select-only : use it if you want to report select queries only. + -t | --top number : number of queries to store/display. Default: 20 -T | --title string : change title of the HTML page report. - -u | --dbuser username : only report what concern the given user + -u | --dbuser username : only report on entries for the given user. + -U | --exclude-user username : exclude entries for the specified user from report. -v | --verbose : enable verbose or debug mode. Disabled by default. -V | --version : show pgBadger version and exit. -w | --watch-mode : only report errors just like logwatch could do. - -x | --extension : output format. Values: text or html. Default: html + -x | --extension : output format. Values: text, html or tsung. Default: html -z | --zcat exec_path : set the full path to the zcat program. Use it if - zcat is not on your path or you want to use gzcat. + zcat or bzcat or unzip is not on your path. --pie-limit num : pie data lower than num% will show a sum instead. --exclude-query regex : any query matching the given regex will be excluded from the report. For example: "^(VACUUM|COMMIT)" - you can use this option multiple time. + You can use this option multiple times. --exclude-file filename: path of the file which contains all the regex to use to exclude queries from the report. One regex per line. + --include-query regex : any query that does not match the given regex will be + excluded from the report. For example: "(table_1|table_2)" + You can use this option multiple times. 
+ --include-file filename: path of the file which contains all the regex of the
+                          queries to include in the report. One regex per line.
 --disable-error : do not generate error report.
- --disable-hourly : do not generate hourly reports.
+ --disable-hourly : do not generate hourly report.
 --disable-type : do not generate query type report.
- --disable-query : do not generate queries reports (slowest, most
+ --disable-query : do not generate query reports (slowest, most
 frequent, ...).
 --disable-session : do not generate session report.
 --disable-connection : do not generate connection report.
 --disable-lock : do not generate lock report.
 --disable-temporary : do not generate temporary report.
 --disable-checkpoint : do not generate checkpoint report.
+ --disable-autovacuum : do not generate autovacuum report.
+ --charset : used to set the HTML charset to be used. Default: utf-8.
+ --csv-separator : used to set the CSV field separator, default: ,
+ --exclude-time regex : any timestamp matching the given regex will be
+                        excluded from the report. Example: "2013-04-12 .*"
+                        You can use this option multiple times.
+ --exclude-appname name : exclude entries for the specified application name
+                          from the report. Example: "pg_dump".

Examples:

@@ -78,7 +102,7 @@
 pgbadger /var/log/postgresql/postgresql-2012-05-*
 pgbadger --exclude-query="^(COPY|COMMIT)" /var/log/postgresql.log
 pgbadger -b "2012-06-25 10:56:11" -e "2012-06-25 10:59:11" /var/log/postgresql.log
- cat /var/log/postgres.log | pgbadger -
+ cat /var/log/postgres.log | pgbadger -

 # log prefix with stderr log output
 perl pgbadger --prefix '%t [%p]: [%l-1] user=%u,db=%d,client=%h' \
 /pglog/postgresql-2012-08-21*
@@ -87,6 +111,14 @@
 perl pgbadger --prefix 'user=%u,db=%d,client=%h,appname=%a' \
 /pglog/postgresql-2012-08-21*

+Use all 8 CPUs to parse a 10GB log file much faster:
+
+ perl pgbadger -j 8 /pglog/postgresql-9.1-main.log
+
+Generate a Tsung sessions XML file with select queries only:
+
+ perl pgbadger -S -o sessions.tsung --prefix '%t [%p]: [%l-1] user=%u,db=%d ' /pglog/postgresql-9.1.log
+
 Reporting errors every week by cron job:

 30 23 * * 1 /usr/bin/pgbadger -q -w /var/log/postgresql.log -o /var/reports/pg_errors.html
@@ -96,68 +128,141 @@
 0 4 * * 1 /usr/bin/pgbadger -q `find /var/log/ -mtime -7 -name "postgresql.log*"` \
 -o /var/reports/pg_errors-`date +%F`.html -l /var/reports/pgbadger_incremental_file.dat

-This suppose that your log file and HTML report are also rotated every week.
+This assumes that your log file and HTML report are also rotated every week.
+
+Or better, use the auto-generated incremental reports:
+
+ 0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 \
+          -O /var/www/pg_reports/
+
+will generate a report per day and per week in the given output directory.
+
+If you have a pg_dump at 23:00 and 13:00 each day lasting half an hour, you can
+use pgbadger as follows to exclude these periods from the report:
+
+ pgbadger --exclude-time "2013-09-.* (23|13):.*" postgresql.log
+
+This helps to avoid having all the COPY orders at the top of the slowest queries. You can
+also use --exclude-appname "pg_dump" to solve this problem in a simpler way.

 =head1 DESCRIPTION

-pgBadger is a PostgreSQL log analyzer built for speed with fully detailed reports from your PostgreSQL log file. It's a single and small Perl script that aims to replace and outperform the old php script pgFouine.
+pgBadger is a PostgreSQL log analyzer built for speed with fully detailed reports from your PostgreSQL log file.
It's a single and small Perl script that outperforms any other PostgreSQL log analyzer.
+
+It is written in pure Perl and uses a javascript library (flotr2) to draw graphs, so you don't need to install any additional Perl modules or other packages. Furthermore, this library gives us more features, such as zooming. pgBadger also uses the Bootstrap javascript library and the FontAwesome webfont for a better design. Everything is embedded.
+
+pgBadger is able to autodetect your log file format (syslog, stderr or csvlog). It is designed to parse huge log files as well as gzip compressed files. See a complete list of features below.

-By the way, we would like to thank Guillaume Smet for all the work he has done on this really nice tool. We've been using it a long time, it is a really great tool!
+All charts are zoomable and can be saved as PNG images.

-pgBadger is written in pure Perl language. It uses a javascript library to draw graphs so that you don't need additional Perl modules or any other package to install. Furthermore, this library gives us additional features, such as zooming.
+You can also limit pgBadger to only report errors or remove any part of the report using command line options.

-pgBadger is able to autodetect your log file format (syslog, stderr or csvlog). It is designed to parse huge log files, as well as gzip compressed file. See a complete list of features below.
+pgBadger supports any custom format set into log_line_prefix of your postgresql.conf file provided that you use the %t, %p and %l patterns.
+
+pgBadger allows parallel processing of a single log file and of multiple files through the use of the -j option with the number of CPUs as value.
+
+If you want to limit the performance impact of logging you can also use log_duration instead of log_min_duration_statement to have reports on the duration and number of queries only.

 =head1 FEATURE

 pgBadger reports everything about your SQL queries:

- Overall statistics.
+ Overall statistics
+ The most frequent waiting queries.
+ Queries that waited the most.
+ Queries generating the most temporary files.
+ Queries generating the largest temporary files.
 The slowest queries.
 Queries that took up the most time.
 The most frequent queries.
 The most frequent errors.
+ Histogram of query times.
+
+The following reports are also available with hourly charts divided into periods of
+five minutes:
+
+ SQL queries statistics.
+ Temporary file statistics.
+ Checkpoints statistics.
+ Autovacuum and autoanalyze statistics.

-The following reports are also available with hourly charts:
+There are also some pie charts showing the distribution of:

- Hourly queries statistics.
- Hourly temporary file statistics.
- Hourly checkpoints statistics.
 Locks statistics.
- Queries by type (select/insert/update/delete).
+ Queries by type (select/insert/update/delete).
+ Queries by type per database/application.
 Sessions per database/user/client.
 Connections per database/user/client.
+ Autovacuum and autoanalyze per table.

 All charts are zoomable and can be saved as PNG images. SQL queries reported are highlighted and beautified automatically.

+You can also have incremental reports with one report per day and a cumulative report per week.
+
 =head1 REQUIREMENT

-PgBadger comes as a single Perl script- you do not need anything else than a modern Perl distribution. Charts are rendered using a Javascript library so you don't need anything. Your browser will do all the work.
+pgBadger comes as a single Perl script - you do not need anything other than a modern Perl distribution. Charts are rendered using a Javascript library, so there is nothing else to install. Your browser will do all the work.

 If you planned to parse PostgreSQL CSV log files you might need some Perl Modules:

- Text::CSV - to parse PostgreSQL CSV log files.
+ Text::CSV_XS - to parse PostgreSQL CSV log files.

 This module is optional, if you don't have PostgreSQL log in the CSV format you don't need to install it.

-Under Windows OS you may not be able to use gzipped log files unless you have a
-zcat like utility that could uncompress the log file and send content to stdout.
-If you have such an utility or in other OSes you want to use other compression
-utility like bzip2 or Zip, use the --zcat comand line option as follow:
+The compressed log file format is autodetected from the file extension. If pgBadger finds a gz extension
+it will use the zcat utility, with a bz2 extension it will use bzcat, and if the file extension is zip
+then the unzip utility will be used.
+
+If those utilities are not found in the PATH environment variable then use the --zcat command line option
+to change this path. For example:
+
+ --zcat="/usr/local/bin/gunzip -c" or --zcat="/usr/local/bin/bzip2 -dc"
+ --zcat="C:\tools\unzip -p"
+
+By default pgBadger will use the zcat, bzcat or unzip utility according to the
+file extension. If you rely on the default autodetection of the compression format you can
+mix gz, bz2 or zip files. Specifying a custom value for the --zcat option will
+remove this support for mixed compression formats.
+
+Note that multiprocessing cannot be used with compressed files or CSV files,
+nor on the Windows platform.
+
+=head1 INSTALLATION
+
+Download the tarball from github and unpack the archive as follows:
+
+ tar xzf pgbadger-4.x.tar.gz
+ cd pgbadger-4.x/
+ perl Makefile.PL
+ make && sudo make install

- --zcat="unzip -p" or --zcat="gunzip -c" or --zcat="bzip2 -dc"
+This will copy the Perl script pgbadger to /usr/local/bin/pgbadger by default and the
+man page into /usr/local/share/man/man1/pgbadger.1. Those are the default installation
+directories for the 'site' install.
+
+If you want to install everything under the /usr/ location, use INSTALLDIRS='perl' as an argument
+of Makefile.PL. The script will be installed into /usr/bin/pgbadger and the manpage
+into /usr/share/man/man1/pgbadger.1.

-the last example can also be used like this: --zcat="bzcat"
+For example, to install everything just like Debian does, proceed as follows:
+
+ perl Makefile.PL INSTALLDIRS=vendor
+
+By default INSTALLDIRS is set to site.

 =head1 POSTGRESQL CONFIGURATION

-You must enable some configuration directives in your postgresql.conf before starting.
+You must enable and set some configuration directives in your postgresql.conf
+before starting.

 You must first enable SQL query logging to have something to parse:

 log_min_duration_statement = 0

-Note that pgBadger is not compatible with statements logs provided by log_statement and log_duration.
+Here every statement will be logged; on a busy server you may want to increase
+this value so that only queries with a longer duration are logged. Note that if you have
+log_statement set to 'all', nothing will be logged through log_min_duration_statement.
+See the next chapter for more information.
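+
+For illustration only (the 250 ms threshold below is just an example, not a
+pgBadger requirement), a busy server could raise the value so that only the
+slower queries are logged:
+
+ log_min_duration_statement = 250    # log only queries running for 250 ms or more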
 With 'stderr' log format, log_line_prefix must be at least:

@@ -179,47 +284,166 @@

 log_line_prefix = 'db=%d,user=%u '

-You need to enable other parameters in postgresql.conf to get more informations from your log files:
+You need to enable other parameters in postgresql.conf to get more information from your log files:

 log_checkpoints = on
 log_connections = on
 log_disconnections = on
 log_lock_waits = on
 log_temp_files = 0
+ log_autovacuum_min_duration = 0

-Do not enable log_statement and log_duration, their log format will not be parsed by pgBadger.
+Do not enable log_statement as its log format will not be parsed by pgBadger.

-Of course your log messages should be in english without locale support:
+Of course your log messages should be in English without locale support:

 lc_messages='C'

-but this is not only recommanded by pgbadger.
+but this is not only recommended by pgBadger.

+=head1 log_min_duration_statement, log_duration and log_statement

-=head1 INSTALLATION
+If you want full statistics reports you must set log_min_duration_statement
+to 0 or more milliseconds.

-Download the tarball from github and unpack the archive as follow:
+If you just want to report duration and number of queries and don't want all
+details about queries, set log_min_duration_statement to -1 to disable it and
+enable log_duration in your postgresql.conf file. If you want to add the most
+common queries report you can either choose to set log_min_duration_statement
+to a higher value or choose to enable log_statement.

- tar xzf pgbadger-1.x.tar.gz
- cd pgbadger-1.x/
- perl Makefile.PL
- make && sudo make install
+Enabling log_min_duration_statement will add reports about the slowest queries and
+the queries that took up the most time. Take care that if you have log_statement
+set to 'all' nothing will be logged through log_min_duration_statement.

-This will copy the Perl script pgbadger in /usr/local/bin/pgbadger directory by default and the man page into /usr/local/share/man/man1/pgbadger.1. Those are the default installation directory for 'site' install.
-If you want to install all under /usr/ location, use INSTALLDIRS='perl' as argument of Makefile.PL. The script will be installed into /usr/bin/pgbadger and the manpage into /usr/share/man/man1/pgbadger.1.
+=head1 PARALLEL PROCESSING

-For example, to install everything just like Debian does, proceed as follow:
+To enable parallel processing you just have to use the -j N option where N is
+the number of cores you want to use.

- perl Makefile.PL INSTALLDIRS=vendor
+pgbadger will then proceed as follows:

-By default INSTALLDIRS is set to site.
+ for each log file
+     chunk size = int(file size / N)
+     look at start/end offsets of these chunks
+     fork N processes and seek to the start offset of each chunk
+     each process will terminate when the parser reaches the end offset
+     of its chunk
+     each process writes stats into a binary temporary file
+     wait for all children to terminate
+ All binary temporary files generated will then be read and loaded into
+ memory to build the html output.
+
+With that method, at the start/end of chunks pgbadger may truncate or omit a
+maximum of N queries per log file, which is an insignificant gap if you have
+millions of queries in your log file. The chance that the query you were
+looking for is lost is near 0, so this gap is acceptable. Most
+of the time such a query is counted twice, but truncated.
+
+When you have many small log files and many CPUs, it is faster to dedicate
+one core to one log file at a time.
To enable this behavior you have to use
+the -J N option instead. With 200 log files of 10MB each, the use of the -J option
+starts being really interesting with 8 cores. Using this method you will be sure
+not to lose any queries in the reports.
+
+Here is a benchmark done on a server with 8 CPUs and a single file of 9.5GB.
+
+        Option  |  1 CPU  | 2 CPU | 4 CPU | 8 CPU
+        --------+---------+-------+-------+------
+        -j      | 1h41m18 | 50m25 | 25m39 | 15m58
+        -J      | 1h41m18 | 54m28 | 41m16 | 34m45
+
+With 200 log files of 10MB each and a total of 2GB the results are slightly
+different:
+
+        Option  | 1 CPU | 2 CPU | 4 CPU | 8 CPU
+        --------+-------+-------+-------+------
+        -j      | 20m15 | 9m56  | 5m20  | 4m20
+        -J      | 20m15 | 9m49  | 5m00  | 2m40
+
+So it is recommended to use -j unless you have hundreds of small log files
+and can use at least 8 CPUs.
+
+IMPORTANT: when you are using parallel parsing pgbadger will generate a lot
+of temporary files in the /tmp directory and will remove them at the end, so do
+not remove those files while pgbadger is running. They are all named
+with the following template tmp_pgbadgerXXXX.bin so they can be easily identified.
+
+=head1 INCREMENTAL REPORTS
+
+pgBadger includes an automatic incremental report mode using option -I or
+--incremental. When running in this mode, pgBadger will generate one report
+per day and a cumulative report per week. Output is first written in binary
+format into the mandatory output directory (see option -O or --outdir),
+then in HTML format for daily and weekly reports with a main index file.
+
+The main index file will show a dropdown menu per week with a link to the weekly
+report and links to the daily reports of that week.
+
+For example, if you run pgBadger as follows, based on a daily rotated log file:
+
+ 0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 \
+          -O /var/www/pg_reports/
+
+you will have all daily and weekly reports for the full running period.
+
+In this mode pgBadger will create an automatic incremental file in the
+output directory, so you don't have to use the -l option unless you want
+to change the path of that file. This means that you can run pgBadger in
+this mode each day on a log file rotated each week; it will not count
+the log entries twice.
+
+=head1 BINARY FORMAT
+
+Using the binary format it is possible to create custom incremental and
+cumulative reports. For example, if you want to refresh a pgbadger report
+each hour from a daily PostgreSQL log file, you can run the following
+command each hour:
+
+ pgbadger --last-parsed .pgbadger_last_state_file -o sunday/hourX.bin /var/log/pgsql/postgresql-Sun.log
+
+to generate the incremental data files in binary format. Then, to generate a fresh HTML
+report from those binary files:
+
+ pgbadger sunday/*.bin
+
+Another example: if you have one log file per hour and you want the report to be
+rebuilt each time the log file is switched, proceed as follows:
+
+ pgbadger -o day1/hour01.bin /var/log/pgsql/pglog/postgresql-2012-03-23_10.log
+ pgbadger -o day1/hour02.bin /var/log/pgsql/pglog/postgresql-2012-03-23_11.log
+ pgbadger -o day1/hour03.bin /var/log/pgsql/pglog/postgresql-2012-03-23_12.log
+ ...
+
+When you want to refresh the HTML report, for example each time after a new binary file
+is generated, just do the following:
+
+ pgbadger -o day1_report.html day1/*.bin
+
+Adjust the commands to suit your needs.

 =head1 AUTHORS

-pgBadger is an original work from Gilles Darold.
It is maintained by the good folks at Dalibo and every one who wants to contribute.
+pgBadger is an original work from Gilles Darold.
+
+The pgBadger logo is an original creation of Damien Clochard.
+
+The pgBadger v4.x design comes from the "Art is code" company.
+
+This web site is a work of Gilles Darold.
+
+pgBadger is maintained by Gilles Darold, the good folks at Dalibo, and everyone who wants to contribute.
+
+Many people have contributed to pgBadger; they are all listed in the ChangeLog file.

 =head1 LICENSE

 pgBadger is free software distributed under the PostgreSQL Licence.

+Copyright (c) 2012-2014, Dalibo
+
+A modified version of the SQL::Beautify Perl Module is embedded in pgBadger
+with copyright (C) 2009 by Jonas Kramer and is published under the terms of
+the Artistic License 2.0.
+
diff -Nru pgbadger-2.0/.gitignore pgbadger-5.0/.gitignore
--- pgbadger-2.0/.gitignore 1970-01-01 00:00:00.000000000 +0000
+++ pgbadger-5.0/.gitignore 2014-02-06 15:29:03.000000000 +0000
@@ -0,0 +1,2 @@
+# Swap files
+*.swp
diff -Nru pgbadger-2.0/LICENSE pgbadger-5.0/LICENSE
--- pgbadger-2.0/LICENSE 2012-09-18 09:56:31.000000000 +0000
+++ pgbadger-5.0/LICENSE 2014-02-06 15:29:03.000000000 +0000
@@ -1,4 +1,4 @@
-Copyright (c) 2012, Dalibo
+Copyright (c) 2012-2014, Dalibo

 Permission to use, copy, modify, and distribute this software and its
 documentation for any purpose, without fee, and without a written agreement
diff -Nru pgbadger-2.0/Makefile.PL pgbadger-5.0/Makefile.PL
--- pgbadger-2.0/Makefile.PL 2012-09-18 09:56:31.000000000 +0000
+++ pgbadger-5.0/Makefile.PL 2014-03-19 15:11:13.000000000 +0000
@@ -30,11 +30,11 @@
 'AUTHOR' => 'Gilles Darold (gilles@darold.net)',
 'ABSTRACT' => 'pgBadger - PostgreSQL log analysis report',
 'EXE_FILES' => [ qw(pgbadger) ],
- 'MAN1PODS' => { 'doc/pgBadger.pod' => 'blib/man1/pgbadger.1' },
+ 'MAN1PODS' => { 'doc/pgBadger.pod' => 'blib/man1/pgbadger.1p' },
 'DESTDIR' => $DESTDIR,
 'INSTALLDIRS' => $INSTALLDIRS,
 'clean' => {},
- 'META_MERGE' => {
+ ($ExtUtils::MakeMaker::VERSION < 6.46 ? () : 'META_MERGE' => {
 resources => {
 homepage => 'http://projects.dalibo.org/pgbadger',
 repository => {
@@ -44,5 +44,6 @@
 },
 },
 }
+ )
 );
diff -Nru pgbadger-2.0/META.yml pgbadger-5.0/META.yml
--- pgbadger-2.0/META.yml 2012-09-18 09:56:31.000000000 +0000
+++ pgbadger-5.0/META.yml 2014-02-06 15:29:03.000000000 +0000
@@ -5,7 +5,7 @@
 version_from: pgbadger
 installdirs: site
 recommends:
- Text::CSV: 0
+ Text::CSV_XS: 0
 distribution_type: script
 generated_by: ExtUtils::MakeMaker version 6.17
diff -Nru pgbadger-2.0/pgbadger pgbadger-5.0/pgbadger
--- pgbadger-2.0/pgbadger 2012-09-18 09:56:31.000000000 +0000
+++ pgbadger-5.0/pgbadger 2014-02-06 15:29:03.000000000 +0000
@@ -1,27 +1,29 @@
-#!/usr/bin/perl
+#!/usr/bin/env perl
 #------------------------------------------------------------------------------
 #
-# PgBadger - An other PostgreSQL log analyzer that aims to replace and
-# outperforms pgFouine
+# pgBadger - Advanced PostgreSQL log analyzer
 #
 # This program is open source, licensed under the PostgreSQL Licence.
 # For license terms, see the LICENSE file.
#------------------------------------------------------------------------------ # -# You must enable SQL query logging : log_min_duration_statement = 0 +# Settings in postgresql.conf +# +# You should enable SQL query logging with log_min_duration_statement >= 0 # With stderr output -# Log line prefix should be : log_line_prefix = '%t [%p]: [%l-1] ' -# Log line prefix should be : log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d ' -# Log line prefix should be : log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u ' +# Log line prefix should be: log_line_prefix = '%t [%p]: [%l-1] ' +# Log line prefix should be: log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d ' +# Log line prefix should be: log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u ' # With syslog output -# Log line prefix should be : log_line_prefix = 'db=%d,user=%u ' +# Log line prefix should be: log_line_prefix = 'db=%d,user=%u ' # -# Additional informations that could be collected and reported -# log_checkpoints = on -# log_connections = on -# log_disconnections = on -# log_lock_waits = on -# log_temp_files = 0 +# Additional information that could be collected and reported +# log_checkpoints = on +# log_connections = on +# log_disconnections = on +# log_lock_waits = on +# log_temp_files = 0 +# log_autovacuum_min_duration = 0 #------------------------------------------------------------------------------ use vars qw($VERSION); @@ -31,65 +33,129 @@ use IO::File; use Benchmark; use File::Basename; +use Storable qw(store_fd fd_retrieve); use Time::Local 'timegm_nocheck'; -use POSIX qw(locale_h); +use POSIX qw(locale_h sys_wait_h _exit strftime); setlocale(LC_NUMERIC, ''); -setlocale(LC_ALL, 'C'); +setlocale(LC_ALL, 'C'); +use File::Spec qw/ tmpdir /; +use File::Temp qw/ tempfile /; +use IO::Handle; +use IO::Pipe; + +$VERSION = '5.0'; + +$SIG{'CHLD'} = 'DEFAULT'; + +my $TMP_DIR = File::Spec->tmpdir() || '/tmp'; +my %RUNNING_PIDS = (); +my @tempfiles = (); +my $parent_pid = $$; +my $interrupt = 0; +my $tmp_last_parsed = ''; +my @SQL_ACTION = ('SELECT', 'INSERT', 'UPDATE', 'DELETE'); +my $graphid = 1; +my $NODATA = '
NO DATASET
'; + +my $pgbadger_logo = + ''; +my $pgbadger_ico = + 'data:image/x-icon;base64,AAABAAEAEBAAAAEAIABoBAAAFgAAACgAAAAQAAAAIAAAAAEAIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIiIiAAAAAAA////BxQUFH0FBQWyExMT4wQEBM4ODg7DFRUVe/T09AsAAAAAg4ODAAAAAAAAAAAAAAAAAFJSUgAAAAAAKioqcQICAv9aWlr/MjIy/omJif6Li4v+HR0d/ldXV/8DAwP/Ly8veAAAAABZWVkAAAAAAF9fXwAAAAAAKioqvh4eHv/n5+f/qqqq/s3Nzf4FBQX/AAAA/5KSkv+Kior/6urq/h0dHf8uLi7DAAAAAG1tbQAAAAAAJiYmeCEhIf//////6enp/iMjI//t7e3/l5eX/2dnZ//U1NT/S0tL/9XV1f7/////Hx8f/y8vL34AAAAAvb29HAEBAf/s7Oz+3d3d/j09Pf9TU1P/JCQk//z8/P//////oaGh/xsbG/95eXn/+Pj4/unp6f4DAwP/vLy8GiQkJJJjY2P95+fn/1NTU/9ubm7/UFBQ/yEhIf/m5ub//f39/yEhIf8SEhL/h4eH/4yMjP//////YWFh/w4ODn8CAgLLv7+//09PT/68vLz/ZGRk/zQ0NP8sLCz/9/f3/+Hh4f9gYGD/hYWF/0VFRf9nZ2f//v7+/rm5uf4CAgLLCwsL9efn5/9zc3P/dHR0/wUFBf8MDAz/ampq///////9/f3/FBQU/0BAQP88PDz/WFhY//Ly8v/j4+P+AAAA1QAAANyFhYX+5+fn/09PT/8AAAD/AgIC/9jY2P/+/v7//////y4uLv8AAAD/AAAA/5ycnP+Kior/6Ojo/gQEBOkAAADKVVVV/xgYGP8AAAD/AAAA/yAgIP///////v7+//////87Ozv/AAAA/wAAAP8wMDD/OTk5/7+/v/4AAADMJiYmn5mZmf0AAAD+AAAA/wAAAP9BQUH///////7+/v//////KCgo/wAAAP8AAAD/AAAA/5CQkP9ubm7+MjIynzMzMyFFRUX/zc3N/zw8PP4DAwP/BgYG/9bW1v//////mJiY/wAAAP8AAAD/AAAA/wAAAP5gYGD+BAQE/0JCQiMAAAAAIiIishEREf719fX+6Ojo/3Jycv9XV1f/ZGRk/3d3d/+5ubn/WVlZ/2dnZ/7MzMz+xMTE/h4eHrIAAAAATU1NAAAAAAALCwvhSUlJ/v7+/v7////+//////////////////////////+YmJj/QUFB/g8PD+EAAAAAVVVVAAAAAAAAAAAA////AR8fH70GBgb/goKC/tzc3P7////+/////9zc3P6CgoL+BgYG/xsbG7YAAAAAAAAAAAAAAAAAAAAAAAAAAD8/PwAAAAAARkZGNBkZGa4ODg73AAAA/wAAAP8NDQ31GhoasWFhYUAAAAAAOzs7AAAAAAAAAAAA/D8AAPAPAADAAwAAwAMAAIABAAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAQAAgAEAAMADAADgBwAA+B8AAA'; + +#### +# method used to fork as many child as wanted +## +sub spawn +{ + my $coderef = shift; -$VERSION = '2.0'; + unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') { + print "usage: spawn CODEREF"; + exit 0; + } -$| = 1; + my $pid; + if (!defined($pid = fork)) { + print STDERR "Error: cannot fork: $!\n"; + return; + } elsif ($pid) { + $RUNNING_PIDS{$pid} = $pid; + return; # the parent + } + # the child -- go spawn + $< = $>; + $( = $); # suid progs only -# Global variables overriden during install -my $JQGRAPH = 1; + exit &$coderef(); +} # Command line options -my $zcat = 'zcat'; -my $format = ''; -my $outfile = ''; -my $outdir = ''; -my $help = ''; -my $ver = ''; -my $dbname = ''; -my $dbuser = ''; -my $dbclient = ''; -my $dbappname = ''; -my $ident = ''; -my $top = 0; -my $sample = 0; -my $extension = ''; -my $maxlength = 0; -my $graph = 1; -my $nograph = 0; -my $debug = 0; -my $nohighlight = 0; -my $noprettify = 0; -my $from = ''; -my $to = ''; -my $quiet = 0; -my $progress = 1; -my $error_only = 0; -my @exclude_query = (); -my $exclude_file = ''; -my $disable_error = 0; -my $disable_hourly = 0; -my $disable_type = 0; -my $disable_query = 0; -my $disable_session = 0; -my $disable_connection = 0; -my $disable_lock = 0; -my $disable_temporary = 0; -my $disable_checkpoint = 0; -my $avg_minutes = 5; -my $last_parsed = ''; -my $report_title = 'PgBadger: PostgreSQL log analyzer'; -my $log_line_prefix = ''; -my $compiled_prefix = ''; -my $project_url = 'http://dalibo.github.com/pgbadger/'; -my $t_min = 0; -my $t_max = 0; -my $t_min_hour = 0; -my $t_max_hour = 0; +my $zcat_cmd = 'gunzip -c'; +my $zcat = $zcat_cmd; +my $bzcat = 'bunzip2 -c'; +my $ucat = 'unzip -p'; +my $gzip_uncompress_size = "gunzip -l %f | grep -E '^\\s*[0-9]+' | awk '{print \$2}'"; +my $zip_uncompress_size = "unzip -l %f | awk '{if (NR==4) print \$1}'"; +my $format = ''; +my $outfile = ''; +my $outdir = ''; +my $incremental = ''; 
+my $help = ''; +my $ver = ''; +my @dbname = (); +my @dbuser = (); +my @dbclient = (); +my @dbappname = (); +my @exclude_user = (); +my @exclude_appname = (); +my $ident = ''; +my $top = 0; +my $sample = 0; +my $extension = ''; +my $maxlength = 0; +my $graph = 1; +my $nograph = 0; +my $debug = 0; +my $nohighlight = 0; +my $noprettify = 0; +my $from = ''; +my $to = ''; +my $quiet = 0; +my $progress = 1; +my $error_only = 0; +my @exclude_query = (); +my @exclude_time = (); +my $exclude_file = ''; +my @include_query = (); +my $include_file = ''; +my $disable_error = 0; +my $disable_hourly = 0; +my $disable_type = 0; +my $disable_query = 0; +my $disable_session = 0; +my $disable_connection = 0; +my $disable_lock = 0; +my $disable_temporary = 0; +my $disable_checkpoint = 0; +my $disable_autovacuum = 0; +my $avg_minutes = 5; +my $last_parsed = ''; +my $report_title = 'PostgreSQL log analyzer'; +my $log_line_prefix = ''; +my $compiled_prefix = ''; +my $project_url = 'http://dalibo.github.com/pgbadger/'; +my $t_min = 0; +my $t_max = 0; +my $remove_comment = 0; +my $select_only = 0; +my $tsung_queries = 0; +my $queue_size = 0; +my $job_per_file = 0; +my $charset = 'utf-8'; +my $csv_sep_char = ','; +my %current_sessions = (); +my $incr_date = ''; +my $last_incr_date = ''; my $NUMPROGRESS = 10000; my @DIMENSIONS = (800, 300); @@ -98,8 +164,13 @@ my @log_files = (); my %prefix_vars = (); +# Load the DATA part of the script +my @jscode = ; + +my $sql_prettified; + # Do not display data in pie where percentage is lower than this value -# to avoid label overlaping. +# to avoid label overlapping. my $pie_percentage_limit = 2; # Get the decimal separator @@ -107,48 +178,96 @@ my $num_sep = ','; $num_sep = ' ' if ($n =~ /,/); +# Inform the parent that it should stop iterate on parsing other files +sub stop_parsing +{ + $interrupt = 1; +} + +# With multiprocess we need to wait all childs +sub wait_child +{ + my $sig = shift; + print STDERR "Received terminating signal ($sig).\n"; + if ($^O !~ /MSWin32|dos/i) { + 1 while wait != -1; + $SIG{INT} = \&wait_child; + $SIG{TERM} = \&wait_child; + foreach my $f (@tempfiles) { + unlink("$f->[1]") if (-e "$f->[1]"); + } + } + if ($last_parsed && -e $tmp_last_parsed) { + unlink("$tmp_last_parsed"); + } + if ($last_parsed && -e "$last_parsed.tmp") { + unlink("$last_parsed.tmp"); + } + _exit(0); +} +$SIG{INT} = \&wait_child; +$SIG{TERM} = \&wait_child; +$SIG{USR2} = \&stop_parsing; + +$| = 1; + # get the command line parameters my $result = GetOptions( - "a|average=i" => \$avg_minutes, - "b|begin=s" => \$from, - "c|client=s" => \$dbclient, - "d|dbname=s" => \$dbname, - "e|end=s" => \$to, - "f|format=s" => \$format, - "G|nograph!" => \$nograph, - "h|help!" => \$help, - "i|ident=s" => \$ident, - "l|last-parsed=s" => \$last_parsed, - "m|maxlength=i" => \$maxlength, - "N|appname=s" => \$dbappname, - "n|nohighlight!" => \$nohighlight, - "o|outfile=s" => \$outfile, - "p|prefix=s" => \$log_line_prefix, - "P|no-prettify!" => \$noprettify, - "q|quiet!" => \$quiet, - "s|sample=i" => \$sample, - "t|top=i" => \$top, - "T|title=s" => \$report_title, - "u|dbuser=s" => \$dbuser, - "v|verbose!" => \$debug, - "V|version!" => \$ver, - "w|watch-mode!" => \$error_only, - "x|extension=s" => \$extension, - "z|zcat=s" => \$zcat, - "pie-limit=i" => \$pie_percentage_limit, - "image-format=s" => \$img_format, - "exclude-query=s" => \@exclude_query, - "exclude-file=s" => \$exclude_file, - "disable-error!" => \$disable_error, - "disable-hourly!" => \$disable_hourly, - "disable-type!" 
=> \$disable_type, - "disable-query!" => \$disable_query, - "disable-session!" => \$disable_session, - "disable-connection!" => \$disable_connection, - "disable-lock!" => \$disable_lock, - "disable-temporary!" => \$disable_temporary, - "disable-checkpoint!" => \$disable_checkpoint, + "a|average=i" => \$avg_minutes, + "b|begin=s" => \$from, + "c|dbclient=s" => \@dbclient, + "C|nocomment!" => \$remove_comment, + "d|dbname=s" => \@dbname, + "e|end=s" => \$to, + "f|format=s" => \$format, + "G|nograph!" => \$nograph, + "h|help!" => \$help, + "i|ident=s" => \$ident, + "I|incremental!" => \$incremental, + "j|jobs=i" => \$queue_size, + "J|job_per_file=i" => \$job_per_file, + "l|last-parsed=s" => \$last_parsed, + "m|maxlength=i" => \$maxlength, + "N|appname=s" => \@dbappname, + "n|nohighlight!" => \$nohighlight, + "o|outfile=s" => \$outfile, + "O|outdir=s" => \$outdir, + "p|prefix=s" => \$log_line_prefix, + "P|no-prettify!" => \$noprettify, + "q|quiet!" => \$quiet, + "s|sample=i" => \$sample, + "S|select-only!" => \$select_only, + "t|top=i" => \$top, + "T|title=s" => \$report_title, + "u|dbuser=s" => \@dbuser, + "U|exclude-user=s" => \@exclude_user, + "v|verbose!" => \$debug, + "V|version!" => \$ver, + "w|watch-mode!" => \$error_only, + "x|extension=s" => \$extension, + "z|zcat=s" => \$zcat, + "pie-limit=i" => \$pie_percentage_limit, + "image-format=s" => \$img_format, + "exclude-query=s" => \@exclude_query, + "exclude-file=s" => \$exclude_file, + "exclude-appname=s" => \@exclude_appname, + "include-query=s" => \@include_query, + "include-file=s" => \$include_file, + "disable-error!" => \$disable_error, + "disable-hourly!" => \$disable_hourly, + "disable-type!" => \$disable_type, + "disable-query!" => \$disable_query, + "disable-session!" => \$disable_session, + "disable-connection!" => \$disable_connection, + "disable-lock!" => \$disable_lock, + "disable-temporary!" => \$disable_temporary, + "disable-checkpoint!" => \$disable_checkpoint, + "disable-autovacuum!" 
=> \$disable_autovacuum, + "charset=s" => \$charset, + "csv-separator=s" => \$csv_sep_char, + "exclude-time=s" => \@exclude_time, ); +die "FATAL: use pgbadger --help\n" if (not $result); if ($ver) { print "pgBadger version $VERSION\n"; @@ -156,11 +275,14 @@ } &usage() if ($help); +# Rewrite some command line argument as lists +&compute_arg_list(); + # Log file to be parsed are passed as command line argument if ($#ARGV >= 0) { foreach my $file (@ARGV) { if ($file ne '-') { - die "FATAL: logfile $file must exist!\n" if (!-f $file); + die "FATAL: logfile $file must exist!\n" if not -f $file; if (-z $file) { print "WARNING: file $file is empty\n"; next; @@ -183,11 +305,31 @@ $avg_minutes ||= 5; $avg_minutes = 60 if ($avg_minutes > 60); $avg_minutes = 1 if ($avg_minutes < 1); +my @avgs = (); +for (my $i = 0 ; $i < 60 ; $i += $avg_minutes) { + push(@avgs, sprintf("%02d", $i)); +} +# Set error like log level regex +my $parse_regex = qr/^(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|HINT|STATEMENT|CONTEXT)/; +my $full_error_regex = qr/^(WARNING|ERROR|FATAL|PANIC|DETAIL|HINT|STATEMENT|CONTEXT)/; +my $main_error_regex = qr/^(WARNING|ERROR|FATAL|PANIC)/; + +# Set syslog prefix regex +my $other_syslog_line = + qr/^(...)\s+(\d+)\s(\d+):(\d+):(\d+)(?:\s[^\s]+)?\s([^\s]+)\s([^\s\[]+)\[(\d+)\]:(?:\s\[[^\]]+\])?\s\[(\d+)\-\d+\]\s*(.*)/; +my $orphan_syslog_line = qr/^(...)\s+(\d+)\s(\d+):(\d+):(\d+)(?:\s[^\s]+)?\s([^\s]+)\s([^\s\[]+)\[(\d+)\]:/; +my $orphan_stderr_line = ''; # Set default format $format ||= &autodetect_format($log_files[0]); +if ($format eq 'syslog2') { + $other_syslog_line = + qr/^(\d+-\d+)-(\d+)T(\d+):(\d+):(\d+)(?:.[^\s]+)?\s([^\s]+)\s(?:[^\s]+\s)?([^\s\[]+)\[(\d+)\]:(?:\s\[[^\]]+\])?\s\[(\d+)\-\d+\]\s*(.*)/; + $orphan_syslog_line = qr/^(\d+-\d+)-(\d+)T(\d+):(\d+):(\d+)(?:.[^\s]+)?\s([^\s]+)\s(?:[^\s]+\s)?([^\s\[]+)\[(\d+)\]:/; +} + # Set default top query $top ||= 20; @@ -196,7 +338,11 @@ # Set the default extension and output format if (!$extension) { - if ($outfile =~ /\.htm[l]*/i) { + if ($outfile =~ /\.bin/i) { + $extension = 'binary'; + } elsif ($outfile =~ /\.tsung/i) { + $extension = 'tsung'; + } elsif ($outfile =~ /\.htm[l]*/i) { $extension = 'html'; } elsif ($outfile) { $extension = 'txt'; @@ -223,23 +369,61 @@ $img_format = 'png' if ($img_format ne 'jpeg'); # Extract the output directory from outfile so that graphs will -# be created in the same directoty -my @infs = fileparse($outfile); -$outdir = $infs[1] . '/'; +# be created in the same directory +if ($outfile ne '-') { + if (!$outdir) { + my @infs = fileparse($outfile); + if ($infs[0] ne '') { + $outdir = $infs[1]; + } else { + # maybe a confusion between -O and -o + die "FATAL: output file $outfile is a directory, should be a file\nor maybe you want to use -O | --outdir option instead.\n"; + } + } elsif (!-d "$outdir") { + # An output directory have been passed as command line parameter + die "FATAL: $outdir is not a directory or doesn't exist.\n"; + } + $outfile = basename($outfile); + $outfile = $outdir . '/' . 
$outfile; +} # Remove graph support if output is not html -$graph = 0 if ($extension ne 'html'); +$graph = 0 unless ($extension eq 'html' or $extension eq 'binary' ); $graph = 0 if ($nograph); +# Set some default values my $end_top = $top - 1; +$queue_size ||= 1; +$job_per_file ||= 1; + +if ($^O =~ /MSWin32|dos/i) { + if ( ($queue_size > 1) || ($job_per_file > 1) ) { + print STDERR "WARNING: parallel processing is not supported on this platform.\n"; + $queue_size = 1; + $job_per_file = 1; + } +} + +if ($extension eq 'tsung') { + + # Open filehandle + my $fh = new IO::File ">$outfile"; + if (not defined $fh) { + die "FATAL: can't write to $outfile, $!\n"; + } + print $fh "\n"; + $fh->close(); + +} else { -# Test file creation before going to parse log -my $tmpfh = new IO::File ">$outfile"; -if (not defined $tmpfh) { - die "FATAL: can't write to $outfile, $!\n"; + # Test file creation before going to parse log + my $tmpfh = new IO::File ">$outfile"; + if (not defined $tmpfh) { + die "FATAL: can't write to $outfile, $!\n"; + } + $tmpfh->close(); + unlink($outfile) if (-e $outfile); } -$tmpfh->close(); -unlink($outfile) if (-e $outfile); # -w and --disable-error can't go together if ($error_only && $disable_error) { @@ -256,46 +440,99 @@ my @exclq = ; close(IN); chomp(@exclq); - map {s/ //;} @exclq; + map {s/\r//;} @exclq; foreach my $r (@exclq) { &check_regex($r, '--exclude-file'); } push(@exclude_query, @exclq); } -# Testing regex syntaxe +# Testing regex syntax if ($#exclude_query >= 0) { foreach my $r (@exclude_query) { &check_regex($r, '--exclude-query'); } } -my $other_syslog_line = qr/^(...)\s+(\d+)\s(\d+):(\d+):(\d+)\s([^\s]+)\s([^\[]+)\[(\d+)\]:\s\[(\d+)\-\d+\]\s*(.*)/; -my $orphan_syslog_line = qr/^...\s+\d+\s\d+:\d+:\d+\s[^\s]+\s[^\[]+\[\d+\]:/; -my $orphan_stderr_line = qr/[^']*\d+-\d+-\d+\s\d+:\d+:\d+[\.\d]*\s[^\s]+[^']*/; +# Testing regex syntax +if ($#exclude_time >= 0) { + foreach my $r (@exclude_time) { + &check_regex($r, '--exclude-time'); + } +} + +# Loading included query from file if any +if ($include_file) { + open(IN, "$include_file") or die "FATAL: can't read file $include_file: $!\n"; + my @exclq = ; + close(IN); + chomp(@exclq); + map {s/\r//;} @exclq; + foreach my $r (@exclq) { + &check_regex($r, '--include-file'); + } + push(@include_query, @exclq); +} + +# Testing regex syntax +if ($#include_query >= 0) { + foreach my $r (@include_query) { + &check_regex($r, '--include-query'); + } +} -# Compile custom log line prefie prefix +my @action_regex = ( + qr/^\s*(delete) from/is, + qr/^\s*(insert) into/is, + qr/^\s*(update) .*\bset\b/is, + qr/^\s*(select) /is +); + +# Compile custom log line prefix prefix my @prefix_params = (); if ($log_line_prefix) { - $log_line_prefix =~ s/([\[\]\|\/])/\\$1/g; + # Build parameters name that will be extracted from the prefix regexp @prefix_params = &build_log_line_prefix_regex(); &check_regex($log_line_prefix, '--prefix'); if ($format eq 'syslog') { - $log_line_prefix = '^(...)\s+(\d+)\s(\d+):(\d+):(\d+)\s([^\s]+)\s([^\[]+)\[(\d+)\]:\s\[(\d+)\-\d+\]\s*' . $log_line_prefix . '\s*(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+(.*)'; + $log_line_prefix = + '^(...)\s+(\d+)\s(\d+):(\d+):(\d+)(?:\s[^\s]+)?\s([^\s]+)\s([^\s\[]+)\[(\d+)\]:(?:\s\[[^\]]+\])?\s\[(\d+)\-\d+\]\s*' + . $log_line_prefix + . 
'\s*(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+(?:[0-9A-Z]{5}:\s+)?(.*)'; $compiled_prefix = qr/$log_line_prefix/; unshift(@prefix_params, 't_month', 't_day', 't_hour', 't_min', 't_sec', 't_host', 't_ident', 't_pid', 't_session_line'); push(@prefix_params, 't_loglevel', 't_query'); + } elsif ($format eq 'syslog2') { + $format = 'syslog'; + $log_line_prefix = + '^(\d+)-(\d+)-(\d+)T(\d+):(\d+):(\d+)(?:.[^\s]+)?\s([^\s]+)\s(?:[^\s]+\s)?([^\s\[]+)\[(\d+)\]:(?:\s\[[^\]]+\])?\s\[(\d+)\-\d+\]\s*' + . $log_line_prefix + . '\s*(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+(?:[0-9A-Z]{5}:\s+)?(.*)'; + $compiled_prefix = qr/$log_line_prefix/; + unshift(@prefix_params, 't_year', 't_month', 't_day', 't_hour', 't_min', 't_sec', 't_host', 't_ident', 't_pid', 't_session_line'); + push(@prefix_params, 't_loglevel', 't_query'); } elsif ($format eq 'stderr') { - $log_line_prefix = '^' . $log_line_prefix . '\s*(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+(.*)'; + $orphan_stderr_line = qr/$log_line_prefix/; + $log_line_prefix = '^' . $log_line_prefix . '\s*(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+(?:[0-9A-Z]{5}:\s+)?(.*)'; $compiled_prefix = qr/$log_line_prefix/; push(@prefix_params, 't_loglevel', 't_query'); } } elsif ($format eq 'syslog') { - $compiled_prefix = qr/^(...)\s+(\d+)\s(\d+):(\d+):(\d+)\s([^\s]+)\s([^\[]+)\[(\d+)\]:\s\[(\d+)\-\d+\]\s*(.*?)\s*(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+(.*)/; - push(@prefix_params, 't_month', 't_day', 't_hour', 't_min', 't_sec', 't_host', 't_ident', 't_pid', 't_session_line', 't_logprefix', 't_loglevel', 't_query'); + $compiled_prefix = +qr/^(...)\s+(\d+)\s(\d+):(\d+):(\d+)(?:\s[^\s]+)?\s([^\s]+)\s([^\s\[]+)\[(\d+)\]:(?:\s\[[^\]]+\])?\s\[(\d+)\-\d+\]\s*(.*?)\s*(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+(?:[0-9A-Z]{5}:\s+)?(.*)/; + push(@prefix_params, 't_month', 't_day', 't_hour', 't_min', 't_sec', 't_host', 't_ident', 't_pid', 't_session_line', + 't_logprefix', 't_loglevel', 't_query'); +} elsif ($format eq 'syslog2') { + $format = 'syslog'; + $compiled_prefix = +qr/^(\d+)-(\d+)-(\d+)T(\d+):(\d+):(\d+)(?:.[^\s]+)?\s([^\s]+)\s(?:[^\s]+\s)?([^\s\[]+)\[(\d+)\]:(?:\s\[[^\]]+\])?\s\[(\d+)\-\d+\]\s*(.*?)\s*(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+(?:[0-9A-Z]{5}:\s+)?(.*)/; + push(@prefix_params, 't_year', 't_month', 't_day', 't_hour', 't_min', 't_sec', 't_host', 't_ident', 't_pid', 't_session_line', + 't_logprefix', 't_loglevel', 't_query'); } elsif ($format eq 'stderr') { - $compiled_prefix = qr/^(\d+-\d+-\d+\s\d+:\d+:\d+)[\.\d]*\s[^\s]+\s\[(\d+)\]:\s\[(\d+)\-\d+\]\s*(.*?)\s*(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+(.*)/; + $compiled_prefix = +qr/^(\d+-\d+-\d+\s\d+:\d+:\d+)[\.\d]*(?: [A-Z\d]{3,6})?\s\[(\d+)\]:\s\[(\d+)\-\d+\]\s*(.*?)\s*(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+(?:[0-9A-Z]{5}:\s+)?(.*)/; push(@prefix_params, 't_timestamp', 't_pid', 't_session_line', 't_logprefix', 't_loglevel', 't_query'); + $orphan_stderr_line = qr/^(\d+-\d+-\d+\s\d+:\d+:\d+)[\.\d]*(?: [A-Z\d]{3,6})?\s\[(\d+)\]:\s\[(\d+)\-\d+\]\s*(.*?)\s*/; } sub check_regex @@ -310,22 +547,21 @@ # Check start/end date time if ($from) { - if ($from =~ /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/) { - $from = "$1$2$3$4$5$6"; - } elsif ($from =~ /^(\d{4})-(\d{2})-(\d{2})$/) { - $from = "$1$2$3" . 
"000000"; + if ($from !~ /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})([.]\d+([+-]\d+)?)?$/) { + die "FATAL: bad format for begin datetime, should be yyyy-mm-dd hh:mm:ss.l+tz\n"; } else { - die "FATAL: bad format for begin datetime, shoud be yyyy-mm-dd hh:mm:ss\n"; - } + my $fractional_seconds = $7 || "0"; + $from = "$1-$2-$3 $4:$5:$6.$7" + } + } if ($to) { - if ($to =~ /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/) { - $to = "$1$2$3$4$5$6"; - } elsif ($to =~ /^(\d{4})-(\d{2})-(\d{2})$/) { - $to = "$1$2$3" . "000000"; + if ($to !~ /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})([.]\d+([+-]\d+)?)?$/) { + die "FATAL: bad format for ending datetime, should be yyyy-mm-dd hh:mm:ss.l+tz\n"; } else { - die "FATAL: bad format for ending datetime, shoud be yyyy-mm-dd hh:mm:ss\n"; - } + my $fractional_seconds = $7 || "0"; + $to = "$1-$2-$3 $4:$5:$6.$7" + } } # Stores the last parsed line from log file to allow incremental parsing @@ -345,33 +581,57 @@ '07' => 'Jul', '08' => 'Aug', '09' => 'Sep', '10' => 'Oct', '11' => 'Nov', '12' => 'Dec' ); +# Keywords variable +my @pg_keywords = qw( + ALL ANALYSE ANALYZE AND ANY ARRAY AS ASC ASYMMETRIC AUTHORIZATION BINARY BOTH CASE + CAST CHECK COLLATE COLLATION COLUMN CONCURRENTLY CONSTRAINT CREATE CROSS + CURRENT_DATE CURRENT_ROLE CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER + DEFAULT DEFERRABLE DESC DISTINCT DO ELSE END EXCEPT FALSE FETCH FOR FOREIGN FREEZE FROM + FULL GRANT GROUP HAVING ILIKE IN INITIALLY INNER INTERSECT INTO IS ISNULL JOIN LEADING + LEFT LIKE LIMIT LOCALTIME LOCALTIMESTAMP NATURAL NOT NOTNULL NULL ON ONLY OPEN OR + ORDER OUTER OVER OVERLAPS PLACING PRIMARY REFERENCES RETURNING RIGHT SELECT SESSION_USER + SIMILAR SOME SYMMETRIC TABLE THEN TO TRAILING TRUE UNION UNIQUE USER USING VARIADIC + VERBOSE WHEN WHERE WINDOW WITH +); + + # Highlight variables -my @KEYWORDS1 = ( - 'ALL', 'ASC', 'AS', 'ALTER', 'AND', 'ADD', 'AUTO_INCREMENT', 'ANY', 'ANALYZE', - 'BETWEEN', 'BINARY', 'BOTH', 'BY', 'BOOLEAN', 'BEGIN', - 'CHANGE', 'CHECK', 'COLUMNS', 'COLUMN', 'CROSS', 'CREATE', 'CASE', 'COMMIT', 'COALESCE', 'CLUSTER', 'COPY', - 'DATABASES', 'DATABASE', 'DATA', 'DELAYED', 'DESCRIBE', 'DESC', 'DISTINCT', 'DELETE', 'DROP', 'DEFAULT', - 'ENCLOSED', 'ESCAPED', 'EXISTS', 'EXPLAIN', 'ELSE', 'END', 'EXCEPT', - 'FIELDS', 'FIELD', 'FLUSH', 'FOR', 'FOREIGN', 'FUNCTION', 'FROM', 'FULL', - 'GROUP', 'GRANT', 'GREATEST', - 'HAVING', - 'IGNORE', 'INDEX', 'INFILE', 'INSERT', 'INNER', 'INTO', 'IDENTIFIED', 'IN', 'IS', 'IF', 'INTERSECT', 'INHERIT', - 'JOIN', - 'KEYS', 'KILL', 'KEY', - 'LEADING', 'LIKE', 'LIMIT', 'LINES', 'LOAD', 'LOCAL', 'LOCK', 'LOW_PRIORITY', 'LEFT', 'LANGUAGE', 'LEAST', 'LOGIN', - 'MODIFY', - 'NATURAL', 'NOT', 'NULL', 'NEXTVAL', 'NULLIF', 'NOSUPERUSER', 'NOCREATEDB', 'NOCREATEROLE', - 'OPTIMIZE', 'OPTION', 'OPTIONALLY', 'ORDER', 'OUTFILE', 'OR', 'OUTER', 'ON', 'OVERLAPS', 'OWNER', - 'PROCEDURE', 'PROCEDURAL', 'PRIMARY', - 'READ', 'REFERENCES', 'REGEXP', 'RENAME', 'REPLACE', 'RETURN', 'REVOKE', 'RLIKE', 'RIGHT', 'ROLE', 'ROLLBACK', - 'SHOW', 'SONAME', 'STATUS', 'STRAIGHT_JOIN', 'SELECT', 'SETVAL', 'SET', 'SOME', 'SEQUENCE', - 'TABLES', 'TEMINATED', 'TO', 'TRAILING', 'TRUNCATE', 'TABLE', 'TEMPORARY', 'TRIGGER', 'TRUSTED', 'THEN', - 'UNIQUE', 'UNLOCK', 'USE', 'USING', 'UPDATE', 'UNSIGNED', 'UNION', - 'VALUES', 'VARIABLES', 'VIEW', 'VACUUM', 'VERBOSE', - 'WITH', 'WRITE', 'WHERE', 'WHEN', - 'ZEROFILL', - 'XOR', +my @KEYWORDS1 = qw( + ALTER ADD AUTO_INCREMENT BETWEEN BY BOOLEAN BEGIN CHANGE COLUMNS COMMIT COALESCE CLUSTER + COPY DATABASES DATABASE 
DATA DELAYED DESCRIBE DELETE DROP ENCLOSED ESCAPED EXISTS EXPLAIN + FIELDS FIELD FLUSH FUNCTION GREATEST IGNORE INDEX INFILE INSERT IDENTIFIED IF INHERIT + KEYS KILL KEY LINES LOAD LOCAL LOCK LOW_PRIORITY LANGUAGE LEAST LOGIN MODIFY + NULLIF NOSUPERUSER NOCREATEDB NOCREATEROLE OPTIMIZE OPTION OPTIONALLY OUTFILE OWNER PROCEDURE + PROCEDURAL READ REGEXP RENAME RETURN REVOKE RLIKE ROLE ROLLBACK SHOW SONAME STATUS + STRAIGHT_JOIN SET SEQUENCE TABLES TEMINATED TRUNCATE TEMPORARY TRIGGER TRUSTED UN$filenumLOCK + USE UPDATE UNSIGNED VALUES VARIABLES VIEW VACUUM WRITE ZEROFILL XOR + ABORT ABSOLUTE ACCESS ACTION ADMIN AFTER AGGREGATE ALSO ALWAYS ASSERTION ASSIGNMENT AT ATTRIBUTE + BACKWARD BEFORE BIGINT CACHE CALLED CASCADE CASCADED CATALOG CHAIN CHARACTER CHARACTERISTICS + CHECKPOINT CLOSE COMMENT COMMENTS COMMITTED CONFIGURATION CONNECTION CONSTRAINTS CONTENT + CONTINUE CONVERSION COST CSV CURRENT CURSOR CYCLE DAY DEALLOCATE DEC DECIMAL DECLARE DEFAULTS + DEFERRED DEFINER DELIMITER DELIMITERS DICTIONARY DISABLE DISCARD DOCUMENT DOMAIN DOUBLE EACH + ENABLE ENCODING ENCRYPTED ENUM ESCAPE EXCLUDE EXCLUDING EXCLUSIVE EXECUTE EXTENSION EXTERNAL + FIRST FLOAT FOLLOWING FORCE FORWARD FUNCTIONS GLOBAL GRANTED HANDLER HEADER HOLD + HOUR IDENTITY IMMEDIATE IMMUTABLE IMPLICIT INCLUDING INCREMENT INDEXES INHERITS INLINE INOUT INPUT + INSENSITIVE INSTEAD INT INTEGER INVOKER ISOLATION LABEL LARGE LAST LC_COLLATE LC_CTYPE + LEAKPROOF LEVEL LISTEN LOCATION LOOP MAPPING MATCH MAXVALUE MINUTE MINVALUE MODE MONTH MOVE NAMES + NATIONAL NCHAR NEXT NO NONE NOTHING NOTIFY NOWAIT NULLS OBJECT OF OFF OIDS OPERATOR OPTIONS + OUT OWNED PARSER PARTIAL PARTITION PASSING PASSWORD PLANS PRECEDING PRECISION PREPARE + PREPARED PRESERVE PRIOR PRIVILEGES QUOTE RANGE REAL REASSIGN RECHECK RECURSIVE REF REINDEX RELATIVE + RELEASE REPEATABLE REPLICA RESET RESTART RESTRICT RETURNS ROW ROWS RULE SAVEPOINT SCHEMA SCROLL SEARCH + SECOND SECURITY SEQUENCES SERIALIZABLE SERVER SESSION SETOF SHARE SIMPLE SMALLINT SNAPSHOT STABLE + STANDALONE START STATEMENT STATISTICS STORAGE STRICT SYSID SYSTEM TABLESPACE TEMP + TEMPLATE TRANSACTION TREAT TYPE TYPES UNBOUNDED UNCOMMITTED UNENCRYPTED + UNKNOWN UNLISTEN UNLOGGED UNTIL VALID VALIDATE VALIDATOR VALUE VARYING VOLATILE + WHITESPACE WITHOUT WORK WRAPPER XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLEXISTS XMLFOREST XMLPARSE + XMLPI XMLROOT XMLSERIALIZE YEAR YES ZONE ); + +foreach my $k (@pg_keywords) { + push(@KEYWORDS1, $k) if (!grep(/^$k$/i, @KEYWORDS1)); +} + + my @KEYWORDS2 = ( 'ascii', 'age', 'bit_length', 'btrim', @@ -400,318 +660,256 @@ my @BRACKETS = ('(', ')'); map {$_ = quotemeta($_)} @BRACKETS; -# Where statistic are stored -my %STATS = (); -my $first_log_date = ''; -my $last_log_date = ''; -my %overall_stat = (); -my @top_slowest = (); -my %normalyzed_info = (); -my %error_info = (); -my %logs_type = (); -my %per_hour_info = (); -my %per_minute_info = (); -my %lock_info = (); -my %tempfile_info = (); -my %connection_info = (); -my %session_info = (); -my %conn_received = (); -my %checkpoint_info = (); -my @graph_values = (); -my %cur_info = (); -my $nlines = 0; -my %last_line = (); -my %saved_last_line = (); +# Inbounds of query times histogram +my @histogram_query_time = (0, 1, 5, 10, 25, 50, 100, 500, 1000, 10000); + +# Get inbounds of query times histogram +sub get_hist_inbound +{ + my $duration = shift; + + for (my $i = 0; $i <= $#histogram_query_time; $i++) { + return $histogram_query_time[$i-1] if ($histogram_query_time[$i] > $duration); + } + + return -1; +} + +# Where statistics are 
stored +my %overall_stat = (); +my %overall_checkpoint = (); +my @top_slowest = (); +my %normalyzed_info = (); +my %error_info = (); +my %logs_type = (); +my %per_minute_info = (); +my %lock_info = (); +my %tempfile_info = (); +my %connection_info = (); +my %database_info = (); +my %application_info = (); +my %user_info = (); +my %host_info = (); +my %session_info = (); +my %conn_received = (); +my %checkpoint_info = (); +my %autovacuum_info = (); +my %autoanalyze_info = (); +my @graph_values = (); +my %cur_info = (); +my %cur_temp_info = (); +my %cur_lock_info = (); +my $nlines = 0; +my %last_line = (); +our %saved_last_line = (); +my %tsung_session = (); +my @top_locked_info = (); +my @top_tempfile_info = (); +my %drawn_graphs = (); my $t0 = Benchmark->new; +# Automatically set parameters with incremental mode +if ($incremental) { + # In incremental mode an output directory must be set + if (!$outdir) { + die "FATAL: you must specify an output directory with incremental mode, see -O or --outdir.\n" + } + # Ensure this is not a relative path + if (dirname($outdir) eq '.') { + die "FATAL: output directory ($outdir) is not an absolute path.\n"; + } + # Ensure that the directory already exists + if (!-d $outdir) { + die "FATAL: output directory $outdir does not exists\n"; + } + # Set default last parsed file in incremental mode + if (!$last_parsed) { + $last_parsed = $outdir . '/LAST_PARSED'; + } + $outfile = 'index.html'; + # Set default output format + $extension = 'binary'; +} + # Reading last line parsed if ($last_parsed && -e $last_parsed) { if (open(IN, "$last_parsed")) { my $line = ; close(IN); - ($saved_last_line{datetime}, $saved_last_line{orig}) = split(/\t/, $line, 2); + ($saved_last_line{datetime}, $saved_last_line{current_pos}, $saved_last_line{orig}) = split(/\t/, $line, 3); + # Preserve backward compatibility with version < 5 + if ($saved_last_line{current_pos} =~ /\D/) { + $saved_last_line{orig} = $saved_last_line{current_pos} . "\t" . $saved_last_line{orig}; + $saved_last_line{current_pos} = 0; + } + if ( ($format eq 'binary') || ($format eq 'csv') ) { + $saved_last_line{current_pos} = 0; + } + } else { die "FATAL: can't read last parsed line from $last_parsed, $!\n"; } } +$tmp_last_parsed = 'tmp_' . basename($last_parsed) if ($last_parsed); -# Main loop reading log files -foreach my $logfile (@log_files) { - &logmsg('DEBUG', "Starting to parse log file: $logfile"); - - my $curdate = localtime(time); - - # Syslog do not have year information, so take care of year overlapping - my ($gsec, $gmin, $ghour, $gmday, $gmon, $gyear, $gwday, $gyday, $gisdst) = localtime(time); - $gyear += 1900; - my $CURRENT_DATE = $gyear . sprintf("%02d", $gmon + 1) . sprintf("%02d", $gmday); - - # Get size of the file - my $totalsize = (stat("$logfile"))[7] || 0; - my $cursize = 0; - - if ($format eq 'csv') { - require Text::CSV; - my $csv = Text::CSV->new({binary => 1, eol => $/}); - open(my $io, "<", $logfile) or die "FATAL: cannot read csvlog file $logfile. 
$!\n"; - - # Parse csvlog lines - my $getout = 0; - while (my $row = $csv->getline($io)) { - - # Set progress statistics - $cursize += length(join(',', @$row)); - $nlines++; - if ($progress && (($nlines % $NUMPROGRESS) == 0)) { - if ($totalsize) { - print progress_bar($cursize, $totalsize, 25, '='); - } else { - print "."; - } - } - # Process only relevant lines - next if ($row->[11] !~ /^(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT)$/); - # Extract the date - $row->[0] =~ m/^(\d+)-(\d+)-(\d+)\s+(\d+):(\d+):(\d+)\.(\d+)/; - my $milli = $7 || 0; - ($prefix_vars{'t_year'}, $prefix_vars{'t_month'}, $prefix_vars{'t_day'}, $prefix_vars{'t_hour'}, $prefix_vars{'t_min'}, $prefix_vars{'t_sec'}) = ($1, $2, $3, $4, $5, $6); - $prefix_vars{'t_date'} = $prefix_vars{'t_year'} . $prefix_vars{'t_month'} . $prefix_vars{'t_day'} . $prefix_vars{'t_hour'} . $prefix_vars{'t_min'} . $prefix_vars{'t_sec'}; - $prefix_vars{'t_timestamp'} = "$prefix_vars{'t_year'}-$prefix_vars{'t_month'}-$prefix_vars{'t_day'} $prefix_vars{'t_hour'}:$prefix_vars{'t_min'}:$prefix_vars{'t_sec'}"; - # Skip unwanted lines - next if ($from && ($from > $prefix_vars{'t_date'})); - $getout = 1, last if ($to && ($to < $prefix_vars{'t_date'})); - - # Jump to the last line parsed if required - next if (!&check_incremental_position($prefix_vars{'t_date'}, join(',', @$row))); - - # Set approximative session duration - if (($row->[11] eq 'LOG') && $row->[13] && ($row->[13] !~ m/^duration: \d+\.\d+ ms/)) { - my $end_time = timegm_nocheck($6, $5, $4, $3, $2, $1); - $row->[8] =~ m/^(\d+)-(\d+)-(\d+)\s+(\d+):(\d+):(\d+)/; - my $start_time = timegm_nocheck($6, $5, $4, $3, $2, $1); - my $duration = (($end_time - $start_time) * 1000) - $milli; - $duration = 0 if ($duration < 0); - $row->[13] = "duration: $duration ms $row->[13]"; - } - - $prefix_vars{'t_dbuser'} = $row->[1] || ''; - $prefix_vars{'t_dbname'} = $row->[2] || ''; - $prefix_vars{'t_appname'} = $row->[2] || ''; - $prefix_vars{'t_client'} = $row->[2] || ''; - $prefix_vars{'t_host'} = 'csv'; - $prefix_vars{'t_pid'} = $row->[3]; - $prefix_vars{'t_session_line'} = $row->[5]; - $prefix_vars{'t_session_line'} =~ s/\..*//; - $prefix_vars{'t_loglevel'} = $row->[11]; - $prefix_vars{'t_query'} = $row->[13]; - &parse_query(); - if ($row->[14]) { - if ($row->[11] eq 'LOG') { - if ($row->[13] =~ /^(duration: \d+\.\d+ ms)/) { - $row->[14] = "$1 $row->[14]"; - } - } - $prefix_vars{'t_loglevel'} = 'DETAIL'; - $prefix_vars{'t_query'} = $row->[14]; - &parse_query(); - } - if ($row->[15]) { - $prefix_vars{'t_query'} = $row->[15]; - $prefix_vars{'t_loglevel'} = 'HINT'; - &parse_query(); - } - } - if (!$getout) { - $csv->eof or warn "FATAL: cannot use CSV, " . $csv->error_diag() . "\n"; - } - close $io; - } else { - - # Open log file for reading - my $lfile = new IO::File; - if ($logfile !~ /\.gz/) { - $lfile->open($logfile) || die "FATAL: cannot read log file $logfile. $!\n"; - } else { - - # Open a pipe to zcat program for compressed log - $lfile->open("$zcat $logfile |") || die "FATAL: cannot read from pipe to $zcat $logfile. 
$!\n"; - - # Real size of the file is unknow - $totalsize = 0; - } - - my $time_pattern = qr/(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})/; - - my $cur_pid = ''; - while (my $line = <$lfile>) { - $cursize += length($line); - chomp($line); - $line =~ s/ //; - $nlines++; - next if (!$line); - - if ($progress && (($nlines % $NUMPROGRESS) == 0)) { - if ($totalsize) { - print progress_bar($cursize, $totalsize, 25, '='); - } else { - print "."; - } - } - - # Parse syslog lines - if ($format eq 'syslog') { - my @matches = ($line =~ $compiled_prefix); - if ($#matches >= 0) { +# Main loop reading log files +my $global_totalsize = 0; +my @given_log_files = ( @log_files ); - for (my $i = 0; $i <= $#prefix_params; $i++) { - $prefix_vars{$prefix_params[$i]} = $matches[$i]; - } - # skip non postgresql lines - next if ($prefix_vars{'t_ident'} ne $ident); +# Verify that the file have not changed for incremental move +if ( ($saved_last_line{current_pos} > 0) && ($#given_log_files == 0)) { + $saved_last_line{current_pos} = 0 if (&check_file_changed($given_log_files[0], $saved_last_line{datetime})); + $saved_last_line{current_pos}++ if ($saved_last_line{current_pos} > 0); +} - # Syslog do not have year information, so take care of year overlapping - $prefix_vars{'t_year'} = $gyear; - $prefix_vars{'t_day'} = sprintf("%02d", $prefix_vars{'t_day'}); - $prefix_vars{'t_month'} = $month_abbr{$prefix_vars{'t_month'}}; - if ("$prefix_vars{'t_year'}$prefix_vars{'t_month'}$prefix_vars{'t_day'}" > $CURRENT_DATE) { - $prefix_vars{'t_year'} = substr($CURRENT_DATE, 1, 4) - 1; - } +# log files must be erase when loading stats from binary format +if ($format eq 'binary') { + $queue_size = 1; + $job_per_file = 1; + @log_files = (); +} - $prefix_vars{'t_date'} = $prefix_vars{'t_year'} . $prefix_vars{'t_month'} . $prefix_vars{'t_day'} . $prefix_vars{'t_hour'} . $prefix_vars{'t_min'} . 
$prefix_vars{'t_sec'}; - $prefix_vars{'t_timestamp'} = "$prefix_vars{'t_year'}-$prefix_vars{'t_month'}-$prefix_vars{'t_day'} $prefix_vars{'t_hour'}:$prefix_vars{'t_min'}:$prefix_vars{'t_sec'}"; +my $pipe; - # Skip unwanted lines - next if ($from && ($from > $prefix_vars{'t_date'})); - last if ($to && ($to < $prefix_vars{'t_date'})); +# Seeking to an old log position is not possible when multiple file are provided +$saved_last_line{current_pos} = 0 if (!$last_parsed && ($#given_log_files > 0)); - # Jump to the last line parsed if required - next if (!&check_incremental_position($prefix_vars{'t_date'}, $line)); - $cur_pid = $prefix_vars{'t_pid'}; +# Start parsing all given files using multiprocess +if ( ($queue_size > 1) || ($job_per_file > 1) ) { - # Extract information from log line prefix - if (!$log_line_prefix) { - &parse_log_prefix($prefix_vars{'t_logprefix'}); - } - # Check if the log line shoud be exclude from the report - if (&validate_log_line($prefix_vars{'t_pid'})) { - # Process the log line - &parse_query(); - } + # Number of running process + my $child_count = 0; + # Set max number of parallel process + my $parallel_process = $queue_size; + if ($job_per_file > 1) { + $parallel_process = $job_per_file; + } + # Store total size of the log files + foreach my $logfile ( @given_log_files ) { + $global_totalsize += &get_log_file($logfile); + } - } elsif ($line =~ $other_syslog_line) { + # Open a pipe for interprocess communication + my $reader = new IO::Handle; + my $writer = new IO::Handle; + $pipe = IO::Pipe->new($reader, $writer); + $writer->autoflush(1); - $cur_pid = $8; - my $t_query = $10; - $t_query =~ s/#011/\t/g; - if ($cur_info{$cur_pid}{statement}) { - $cur_info{$cur_pid}{statement} .= "\n" . $t_query; - } elsif ($cur_info{$cur_pid}{context}) { - $cur_info{$cur_pid}{context} .= "\n" . $t_query; - } elsif ($cur_info{$cur_pid}{detail}) { - $cur_info{$cur_pid}{detail} .= "\n" . $t_query; - } else { - $cur_info{$cur_pid}{query} .= "\n" . $t_query; - } + # Fork the logger process + if ($progress) { + spawn sub { + &multiprocess_progressbar($global_totalsize); + }; + } - # Collect orphans lines of multiline queries - } elsif ($line !~ $orphan_syslog_line) { + # Parse each log file following the multiprocess mode chosen (-j or -J) + foreach my $logfile ( @given_log_files ) { - if ($cur_info{$cur_pid}{statement}) { - $cur_info{$cur_pid}{statement} .= "\n" . $line; - } elsif ($cur_info{$cur_pid}{context}) { - $cur_info{$cur_pid}{context} .= "\n" . $line; - } elsif ($cur_info{$cur_pid}{detail}) { - $cur_info{$cur_pid}{detail} .= "\n" . $line; - } else { - $cur_info{$cur_pid}{query} .= "\n" . 
$line; + while ($child_count >= $parallel_process) { + my $kid = waitpid(-1, WNOHANG); + if ($kid > 0) { + $child_count--; + delete $RUNNING_PIDS{$kid}; + } + sleep(1); + } + # Do not use split method with compressed files + if ( ($queue_size > 1) && ($logfile !~ /\.(gz|bz2|zip)/i) ) { + # Create multiple process to parse one log file by chunks of data + my @chunks = &split_logfile($logfile); + for (my $i = 0; $i < $#chunks; $i++) { + while ($child_count >= $parallel_process) { + my $kid = waitpid(-1, WNOHANG); + if ($kid > 0) { + $child_count--; + delete $RUNNING_PIDS{$kid}; } - - } else { - &logmsg('DEBUG', "Unknown syslog line format: $line"); + sleep(1); } + push(@tempfiles, [ tempfile('tmp_pgbadgerXXXX', SUFFIX => '.bin', DIR => $TMP_DIR, UNLINK => 1 ) ]); + spawn sub { + &process_file($logfile, $tempfiles[-1]->[0], $chunks[$i], $chunks[$i+1]); + }; + $child_count++; + } - } elsif ($format eq 'stderr') { - - my @matches = ($line =~ $compiled_prefix); - if ($#matches >= 0) { - for (my $i = 0; $i <= $#prefix_params; $i++) { - $prefix_vars{$prefix_params[$i]} = $matches[$i]; - } - if (!$prefix_vars{'t_timestamp'} && $prefix_vars{'t_mtimestamp'}) { - $prefix_vars{'t_timestamp'} = $prefix_vars{'t_mtimestamp'}; - } elsif (!$prefix_vars{'t_timestamp'} && $prefix_vars{'t_session_timestamp'}) { - $prefix_vars{'t_timestamp'} = $prefix_vars{'t_session_timestamp'}; - } - ($prefix_vars{'t_year'}, $prefix_vars{'t_month'}, $prefix_vars{'t_day'}, $prefix_vars{'t_hour'}, $prefix_vars{'t_min'}, $prefix_vars{'t_sec'}) = ($prefix_vars{'t_timestamp'} =~ $time_pattern); - $prefix_vars{'t_date'} = $prefix_vars{'t_year'} . $prefix_vars{'t_month'} . $prefix_vars{'t_day'} . $prefix_vars{'t_hour'} . $prefix_vars{'t_min'} . $prefix_vars{'t_sec'}; - - # Skip unwanted lines - next if ($from && ($from > $prefix_vars{'t_date'})); - last if ($to && ($to < $prefix_vars{'t_date'})); - - # Jump to the last line parsed if required - next if (!&check_incremental_position($prefix_vars{'t_date'}, $line)); - $cur_pid = $prefix_vars{'t_pid'}; + } else { - # Extract information from log line prefix - if (!$log_line_prefix) { - &parse_log_prefix($prefix_vars{'t_logprefix'}); - } + # Start parsing one file per parallel process + push(@tempfiles, [ tempfile('tmp_pgbadgerXXXX', SUFFIX => '.bin', DIR => $TMP_DIR, UNLINK => 1 ) ]); + spawn sub { + &process_file($logfile, $tempfiles[-1]->[0]); + }; + $child_count++; + + } + last if ($interrupt); + } + + my $minproc = 1; + $minproc = 0 if (!$progress); + # Wait for all child dies less the logger + while (scalar keys %RUNNING_PIDS > $minproc) { + my $kid = waitpid(-1, WNOHANG); + if ($kid > 0) { + delete $RUNNING_PIDS{$kid}; + } + sleep(1); + } + # Terminate the process logger + foreach my $k (keys %RUNNING_PIDS) { + kill(10, $k); + %RUNNING_PIDS = (); + } + + # Clear previous statistics + &init_stats_vars(); + + # Load all data gathered by all the differents processes + foreach my $f (@tempfiles) { + next if (!-e "$f->[1]" || -z "$f->[1]"); + my $fht = new IO::File; + $fht->open("< $f->[1]") or die "FATAL: can't open file $f->[1], $!\n"; + &load_stats($fht); + $fht->close(); + } - # Check if the log line shoud be exclude from the report - if (&validate_log_line($prefix_vars{'t_pid'})) { - $prefix_vars{'t_host'} = 'stderr'; - # Process the log line - &parse_query(); - } - - # Collect orphans lines of multiline queries - } elsif ($line !~ $orphan_stderr_line) { +} else { - if ($cur_info{$cur_pid}{statement}) { - $cur_info{$cur_pid}{statement} .= "\n" . 
$line; - } elsif ($cur_info{$cur_pid}{context}) { - $cur_info{$cur_pid}{context} .= "\n" . $line; - } elsif ($cur_info{$cur_pid}{detail}) { - $cur_info{$cur_pid}{detail} .= "\n" . $line; - } else { - $cur_info{$cur_pid}{query} .= "\n" . $line; - } + # Multiprocessing disabled, parse log files one by one + foreach my $logfile ( @given_log_files ) { + last if (&process_file($logfile, '', $saved_last_line{current_pos})); + } +}
+# Get last line parsed from all processes +if ($last_parsed) { + if (open(IN, "$tmp_last_parsed") ) { + while (my $line = <IN>) { + chomp($line); + my ($d, $p, $l) = split(/\t/, $line, 3); + if (!$last_line{datetime} || ($d gt $last_line{datetime})) { + $last_line{datetime} = $d; + if ($p =~ /^\d+$/) { + $last_line{orig} = $l; + $last_line{current_pos} = $p; } else { - $cur_info{$cur_pid}{query} .= "\n" . $line if ($cur_info{$cur_pid}{query}); + $last_line{orig} = $p . "\t" . $l; } - - } else { - - # unknown format - &logmsg('DEBUG', "Unknown line format: $line"); } } - $lfile->close(); - } - - # Get stats from all pending temporary storage - foreach my $pid (sort {$cur_info{$a}{date} <=> $cur_info{$b}{date}} %cur_info) { - &store_queries($pid); - } - %cur_info = (); - - if ($progress) { - if ($totalsize) { - print progress_bar($cursize, $totalsize, 25, '='); - } - print STDERR "\n"; + close(IN); } - + unlink("$tmp_last_parsed"); }
# Save last line parsed if ($last_parsed && scalar keys %last_line) { if (open(OUT, ">$last_parsed")) { - print OUT "$last_line{datetime}\t$last_line{orig}\n"; + $last_line{current_pos} ||= 0; + print OUT "$last_line{datetime}\t$last_line{current_pos}\t$last_line{orig}\n"; close(OUT); } else { &logmsg('ERROR', "can't save last parsed line into $last_parsed, $!"); @@ -720,41 +918,288 @@ my $t1 = Benchmark->new; my $td = timediff($t1, $t0); -&logmsg('DEBUG', "the log statistics gathering tooks:" . timestr($td)); +&logmsg('DEBUG', "the log statistics gathering took:" . 
timestr($td)); + +# Global output filehandle +my $fh = undef; -&logmsg('DEBUG', "Ok, generating $extension report..."); +if (!$incremental) { -# Open filehandle -my $fh = new IO::File ">$outfile"; -if (not defined $fh) { - die "FATAL: can't write to $outfile, $!\n"; -} -if (($extension eq 'text') || ($extension eq 'txt')) { - if ($error_only) { - &dump_error_as_text(); + &logmsg('LOG', "Ok, generating $extension report..."); + + if ($extension ne 'tsung') { + $fh = new IO::File ">$outfile"; + if (not defined $fh) { + die "FATAL: can't write to $outfile, $!\n"; + } + if (($extension eq 'text') || ($extension eq 'txt')) { + if ($error_only) { + &dump_error_as_text(); + } else { + &dump_as_text(); + } + } elsif ($extension eq 'binary') { + &dump_as_binary($fh); + } else { + # Create instance to prettify SQL query + if (!$noprettify) { + $sql_prettified = SQL::Beautify->new(keywords => \@pg_keywords); + } + &dump_as_html(); + } + $fh->close; } else { - &dump_as_text(); + + # Open filehandle + $fh = new IO::File ">>$outfile"; + if (not defined $fh) { + die "FATAL: can't write to $outfile, $!\n"; + } + print $fh "\n"; + $fh->close(); } + } else { - if ($error_only) { - &dump_error_as_html(); + + # Build a report per day + my %weeks_directories = (); + my @build_directories = (); + if (open(IN, "$last_parsed.tmp")) { + while (my $l = <IN>) { + chomp($l); + push(@build_directories, $l) if (!grep(/^$l$/, @build_directories)); + } + close(IN); + unlink("$last_parsed.tmp"); } else { + &logmsg('WARNING', "can't read file $last_parsed.tmp, $!"); + &logmsg('HINT', "maybe there are no new entries in your log since the last run."); + } + foreach $incr_date (@build_directories) { + + $last_incr_date = $incr_date; + + # Set the path to binary files + my $bpath = $incr_date; + $bpath =~ s/\-/\//g; + $incr_date =~ /^(\d+)-(\d+)\-(\d+)$/; + + # Get the week number following the date + my $wn = &get_week_number($1, $2, $3); + $weeks_directories{$wn} = "$1-$2" if (!exists $weeks_directories{$wn}); + + # First clear previous stored statistics + &init_stats_vars(); + + # Load all data gathered by all the different processes + unless(opendir(DIR, "$outdir/$bpath")) { + die "Error: can't opendir $outdir/$bpath: $!"; + } + my @mfiles = grep { !/^\./ && ($_ =~ /\.bin$/) } readdir(DIR); + closedir DIR; + foreach my $f (@mfiles) { + my $fht = new IO::File; + $fht->open("< $outdir/$bpath/$f") or die "FATAL: can't open file $outdir/$bpath/$f, $!\n"; + &load_stats($fht); + $fht->close(); + } + + &logmsg('LOG', "Ok, generating HTML daily report into $outdir/$bpath/..."); + + $fh = new IO::File ">$outdir/$bpath/$outfile"; + if (not defined $fh) { + die "FATAL: can't write to $outdir/$bpath/$outfile, $!\n"; + } + # Create instance to prettify SQL query + if (!$noprettify) { + $sql_prettified = SQL::Beautify->new(keywords => \@pg_keywords); + } &dump_as_html(); + $fh->close; } -} -$fh->close; -my $t2 = Benchmark->new; -$td = timediff($t2, $t1); -&logmsg('DEBUG', "the generating of reports tooks:" . timestr($td)); -$td = timediff($t2, $t0); -&logmsg('DEBUG', "the total execution time tooks:" . 
timestr($td)); + # Build a report per week + foreach my $wn (sort { $a <=> $b } keys %weeks_directories) { + &init_stats_vars(); -exit 0; + # Get all days of the current week + my @wdays = &get_wdays_per_month($wn - 1, $weeks_directories{$wn}); + my $wdir = ''; + + # Load data per day + foreach $incr_date (@wdays) { + my $bpath = $incr_date; + $bpath =~ s/\-/\//g; + $incr_date =~ /^(\d+)-(\d+)\-(\d+)$/; + $wdir = "$1/week-$wn"; + + # Load all data gathered by all the differents processes + if (-e "$outdir/$bpath") { + unless(opendir(DIR, "$outdir/$bpath")) { + die "Error: can't opendir $outdir/$bpath: $!"; + } + my @mfiles = grep { !/^\./ && ($_ =~ /\.bin$/) } readdir(DIR); + closedir DIR; + foreach my $f (@mfiles) { + my $fht = new IO::File; + $fht->open("< $outdir/$bpath/$f") or die "FATAL: can't open file $outdir/$bpath/$f, $!\n"; + &load_stats($fht); + $fht->close(); + } + } + } + + &logmsg('LOG', "Ok, generating HTML weekly report into $outdir/$wdir/..."); + if (!-d "$outdir/$wdir") { + mkdir("$outdir/$wdir"); + } + $fh = new IO::File ">$outdir/$wdir/$outfile"; + if (not defined $fh) { + die "FATAL: can't write to $outdir/$wdir/$outfile, $!\n"; + } + # Create instance to prettify SQL query + if (!$noprettify) { + $sql_prettified = SQL::Beautify->new(keywords => \@pg_keywords); + } + &dump_as_html(); + $fh->close; + + } + + &logmsg('LOG', "Ok, generating global index to access incremental reports..."); + + $fh = new IO::File ">$outdir/index.html"; + if (not defined $fh) { + die "FATAL: can't write to $outdir/index.html, $!\n"; + } + my $date = localtime(time); + print $fh qq{ + + +pgBadger :: Global Index on incremental reports + + + + + + +@jscode + + + + + + +


+
+ + + + +}; + # get years directories + unless(opendir(DIR, "$outdir")) { + die "Error: can't opendir $outdir: $!"; + } + my @dyears = grep { !/^\./ && /^\d{4}$/ } readdir(DIR); + closedir DIR; + my @day_names = ('Mon','Tue','Wed','Thu','Fri','Sat','Sun'); + my @month_names = ('Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sept','Oct','Nov','Dec'); + foreach my $y (sort { $b <=> $a } @dyears) { + print $fh qq{ +

Year $y

+
+
    +}; + # foreach year directory look for week directories + unless(opendir(DIR, "$outdir/$y")) { + die "Error: can't opendir $outdir/$y: $!"; + } + + my @yweeks = grep { !/^\./ && /^week-\d+$/ } readdir(DIR); + closedir DIR; + my %ywdays = &get_wdays_per_year($y); + foreach my $w (sort { &sort_by_week($a, $b); } @yweeks) { + $w =~ /week\-(\d+)/; + my $week = "Week $1"; + # foreach week add link to daily reports + my $wn = sprintf("%02d", $1 - 1); + my @wdays = @{$ywdays{$wn}}; + my $data_content = ''; + for (my $i = 0; $i <= $#wdays; $i++) { + my $bpath = $wdays[$i]; + $bpath =~ s/(\d+)\-(\d+)\-(\d+)/$1\/$2\/$3/g; + my $mmonth = $month_names[$2 - 1]; + my $dday = $3; + if (-e "$outdir/$bpath/index.html") { + $data_content .= "$day_names[$i] $mmonth $dday $y
    "; + } else { + $data_content .= "$day_names[$i] $mmonth $dday $y
    "; + } + } + print $fh qq{ +

  • $week
  • +}; + } + print $fh qq{ +
+
+}; + } + print $fh qq{ +
+ + + +
+ +
+ + + +}; + + + + $fh->close; + +} + +my $t2 = Benchmark->new; +$td = timediff($t2, $t1); +&logmsg('DEBUG', "building reports took:" . timestr($td)); +$td = timediff($t2, $t0); +&logmsg('DEBUG', "the total execution time took:" . timestr($td)); + +exit 0; #------------------------------------------------------------------------------- -# Show PgBadger command line usage +# Show pgBadger command line usage sub usage { print qq{ @@ -764,66 +1209,88 @@ Arguments: - logfile can be a single log file, a list of files or a shell command - returning a list of file. If you want to pass log content from stdin - use - as filename. + logfile can be a single log file, a list of files, or a shell command + returning a list of files. If you want to pass log content from stdin + use - as filename. Note that input from stdin will not work with csvlog. Options: -a | --average minutes : number of minutes to build the average graphs of - queries and connections. + queries and connections. -b | --begin datetime : start date/time for the data to be parsed in log. - -c | --dbclient host : only report what concern the given client host. - -d | --dbname database : only report what concern the given database. + -c | --dbclient host : only report on entries for the given client host. + -C | --nocomment : remove comments like /* ... */ from queries. + -d | --dbname database : only report on entries for the given database. -e | --end datetime : end date/time for the data to be parsed in log. - -f | --format logtype : possible values: syslog,stderr,csv. Default: stderr + -f | --format logtype : possible values: syslog,stderr,csv. Default: stderr. -G | --nograph : disable graphs on HTML output. Enable by default. -h | --help : show this message and exit. -i | --ident name : programname used as syslog ident. Default: postgres + -I | --incremental : use incremental mode, reports will be generated by + days in a separate directory, --outdir must be set. + -j | --jobs number : number of jobs to run at same time. Default is 1, + run as single process. -l | --last-parsed file: allow incremental log parsing by registering the - last datetime and line parsed. Useful if you want - to watch errors since last run or if you want one - report per day with a log rotated each week. - -m | --maxlength size : maximum length of a query, it will be cutted above - the given size. Default: no truncate + last datetime and line parsed. Useful if you want + to watch errors since last run or if you want one + report per day with a log rotated each week. + -m | --maxlength size : maximum length of a query, it will be restricted to + the given size. Default: no truncate -n | --nohighlight : disable SQL code highlighting. - -N | --appname name : only report what concern the given application name + -N | --appname name : only report on entries for given application name -o | --outfile filename: define the filename for the output. Default depends - of the output format: out.html or out.txt. To dump - output to stdout use - as filename. + on the output format: out.html, out.txt or out.tsung. + To dump output to stdout use - as filename. + -O | --outdir path : directory where out file must be saved. -p | --prefix string : give here the value of your custom log_line_prefix - defined in your postgresql.conf. You may used it only - if don't have the default allowed formats ot to use - other custom variables like client ip or application - name. See examples below. + defined in your postgresql.conf. 
Only use it if you + aren't using one of the standard prefixes specified + in the pgBadger documentation, such as if your prefix + includes additional variables like client ip or + application name. See examples below. -P | --no-prettify : disable SQL queries prettify formatter. - -q | --quiet : don't print anything to stdout, even not a progress bar. - -s | --sample number : number of query sample to store/display. Default: 3 - -t | --top number : number of query to store/display. Default: 20 + -q | --quiet : don't print anything to stdout, not even a progress bar. + -s | --sample number : number of query samples to store/display. Default: 3 + -S | --select-only : use it if you want to report select queries only. + -t | --top number : number of queries to store/display. Default: 20 -T | --title string : change title of the HTML page report. - -u | --dbuser username : only report what concern the given user + -u | --dbuser username : only report on entries for the given user. + -U | --exclude-user username : exclude entries for the specified user from report. -v | --verbose : enable verbose or debug mode. Disabled by default. -V | --version : show pgBadger version and exit. - -w | --watch-mode : only report events/errors just like logwatch could do. - -x | --extension : output format. Values: text or html. Default: html + -w | --watch-mode : only report errors just like logwatch could do. + -x | --extension : output format. Values: text, html or tsung. Default: html -z | --zcat exec_path : set the full path to the zcat program. Use it if - zcat is not on your path or you want to use gzcat. + zcat or bzcat or unzip is not on your path. --pie-limit num : pie data lower than num% will show a sum instead. --exclude-query regex : any query matching the given regex will be excluded - from the report. For example: "^(VACUUM|COMMIT)" - you can use this option multiple time. + from the report. For example: "^(VACUUM|COMMIT)" + You can use this option multiple times. --exclude-file filename: path of the file which contains all the regex to use - to exclude queries from the report. One regex per line. + to exclude queries from the report. One regex per line. + --include-query regex : any query that does not match the given regex will be + excluded from the report. For example: "(table_1|table_2)" + You can use this option multiple times. + --include-file filename: path of the file which contains all the regex of the + queries to include from the report. One regex per line. --disable-error : do not generate error report. - --disable-hourly : do not generate hourly reports. - --disable-type : do not generate query type report. - --disable-query : do not generate queries reports (slowest, most - frequent, ...). + --disable-hourly : do not generate hourly report. + --disable-type : do not generate report of queries by type, database... + --disable-query : do not generate query reports (slowest, most + frequent, queries by users, by database, ...). --disable-session : do not generate session report. --disable-connection : do not generate connection report. --disable-lock : do not generate lock report. --disable-temporary : do not generate temporary report. - --disable-checkpoint : do not generate checkpoint report. + --disable-checkpoint : do not generate checkpoint/restartpoint report. + --disable-autovacuum : do not generate autovacuum report. + --charset : used to set the HTML charset to be used. Default: utf-8. 
+ --csv-separator : used to set the CSV field separator, default: , + --exclude-time regex : any timestamp matching the given regex will be + excluded from the report. Example: "2013-04-12 .*" + You can use this option multiple times. + --exclude-appname name : exclude entries for the specified application name + from report. Example: "pg_dump".
Examples: @@ -832,6 +1299,8 @@ /var/log/postgres.log pgbadger /var/log/postgresql/postgresql-2012-05-* pgbadger --exclude-query="^(COPY|COMMIT)" /var/log/postgresql.log + pgbadger -b "2012-06-25 10:56:11" -e "2012-06-25 10:59:11" \ + /var/log/postgresql.log cat /var/log/postgres.log | pgbadger - # log prefix with stderr log output perl pgbadger --prefix '%t [%p]: [%l-1] user=%u,db=%d,client=%h' \ @@ -840,6 +1309,13 @@ # Log line prefix with syslog log output perl pgbadger --prefix 'user=%u,db=%d,client=%h,appname=%a' \ /pglog/postgresql-2012-08-21* + # Use my 8 CPUs to parse my 10GB file faster, really faster + perl pgbadger -j 8 /pglog/postgresql-9.1-main.log + + +Generate Tsung sessions XML file with select queries only: + + perl pgbadger -S -o sessions.tsung --prefix '%t [%p]: [%l-1] user=%u,db=%d ' /pglog/postgresql-9.1.log
Reporting errors every week by cron job: @@ -850,2208 +1326,6404 @@ 0 4 * * 1 /usr/bin/pgbadger -q `find /var/log/ -mtime -7 -name "postgresql.log*"` \ -o /var/reports/pg_errors-`date +%F`.html -l /var/reports/pgbadger_incremental_file.dat -This supposes that your log file and HTML report are also rotated every weeks. +This supposes that your log file and HTML report are also rotated every week. + +Or better, use the auto-generated incremental reports: + + 0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 \ + -O /var/www/pg_reports/ + +will generate a report per day and per week. + +If you have a pg_dump that runs at 23:00 and 13:00 each day for half an hour, you can +use pgbadger as follows to exclude these periods from the report: + + pgbadger --exclude-time "2013-09-.* (23|13):.*" postgresql.log + +This will help to avoid having all the COPY statements at the top of the slowest queries. You can +also use --exclude-appname "pg_dump" to solve this problem in a simpler way. 
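+
+The timestamp exclusion is a plain Perl regular expression match against the
+log line timestamp. As an illustration only (this snippet is not part of
+pgBadger and the sample timestamps are invented), the following standalone
+Perl code shows which entries such a pattern would filter out:
+
+    use strict;
+    use warnings;
+
+    # Same kind of pattern as the one passed to --exclude-time
+    my @exclude_time = ('2013-09-.* (23|13):.*');
+
+    foreach my $ts ('2013-09-15 23:10:02', '2013-09-15 14:00:00') {
+        my $excluded = grep { $ts =~ /$_/ } @exclude_time;
+        print "$ts => ", ($excluded ? 'excluded' : 'kept'), "\n";
+    }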
}; exit 0; } -sub check_incremental_position +sub sort_by_week { - my ($cur_date, $line) = @_; + my $curr = shift; + my $next = shift; - if ($last_parsed) { - if ($saved_last_line{datetime}) { - if ($cur_date < $saved_last_line{datetime}) { - return 0; - } elsif (!$last_line{datetime} && ($cur_date == $saved_last_line{datetime})) { - return 0 if ($line ne $saved_last_line{orig}); - } - } - $last_line{datetime} = $cur_date; - $last_line{orig} = $line; - } + $a =~ /week\-(\d+)/; + $curr = $1; + $b =~ /week\-(\d+)/; + $next = $1; - return 1; + return $next <=> $curr; }
-# Display message following the log level -sub logmsg +sub init_stats_vars { - my ($level, $str) = @_; - return if ($quiet && ($level ne 'FATAL')); - return if (!$debug && ($level eq 'DEBUG')); + # Empty the global structures where statistics are stored + %overall_stat = (); + %overall_checkpoint = (); + @top_slowest = (); + @top_tempfile_info = (); + @top_locked_info = (); + %normalyzed_info = (); + %error_info = (); + %logs_type = (); + %per_minute_info = (); + %lock_info = (); + %tempfile_info = (); + %connection_info = (); + %database_info = (); + %application_info = (); + %session_info = (); + %conn_received = (); + %checkpoint_info = (); + %autovacuum_info = (); + %autoanalyze_info = (); + @graph_values = (); + %cur_info = (); + $nlines = 0; + %tsung_session = (); +}
+ +#### +# Function called by the dedicated logger process to display the progress bar +#### +sub multiprocess_progressbar +{ + my $totalsize = shift; - if ($level =~ /(\d+)/) { - print STDERR "\t" x $1; + &logmsg('DEBUG', "Starting progressbar writer process"); + + $0 = 'pgbadger logger'; + + # Terminate the process when we haven't read the complete file but must exit + local $SIG{USR1} = sub { + print STDERR "\n"; + exit 0; + }; + my $timeout = 3; + my $cursize = 0; + my $nqueries = 0; + my $nerrors = 0; + $pipe->reader(); + while (my $r = <$pipe>) { + chomp($r); + my @infos = split(/\s+/, $r); + $cursize += $infos[0]; + $nqueries += $infos[1]; + $nerrors += $infos[2]; + $cursize = $totalsize if ($cursize > $totalsize); + print STDERR &progress_bar($cursize, $totalsize, 25, '=', $nqueries, $nerrors); + last if ($cursize >= $totalsize); } + print STDERR "\n"; - print STDERR "$level: $str\n"; + exit 0; }
-# Normalyze SQL queries by removing parameters -sub normalize_query +#### +# Main function called per each parser process +#### +sub process_file { - my $orig_query = shift; + my ($logfile, $tmpoutfile, $start_offset, $stop_offset) = @_; - return if (!$orig_query); + my $old_queries_count = 0; + my $old_errors_count = 0; + my $current_offset = $start_offset || 0; + my $getout = 0; - $orig_query = lc($orig_query); + $0 = 'pgbadger parser'; - # Remove extra space, new line and tab caracters by a single space - $orig_query =~ s/[\t\s\r\n]+/ /gs; + &init_stats_vars() if ($tmpoutfile); - # Remove string content - $orig_query =~ s/\\'//g; - $orig_query =~ s/'[^']*'/''/g; - $orig_query =~ s/''('')+/''/g; + &logmsg('DEBUG', "Starting to parse log file: $logfile"); - # Remove NULL parameters - $orig_query =~ s/=\s*NULL/=''/g; + my $terminate = 0; + local $SIG{INT} = sub { $terminate = 1 }; + local $SIG{TERM} = sub { $terminate = 1 }; - # Remove numbers - $orig_query =~ s/([^a-z_\$-])-?([0-9]+)/${1}0/g; + my $curdate = localtime(time); - # Remove hexadecimal numbers - $orig_query =~ s/([^a-z_\$-])0x[0-9a-f]{1,10}/${1}0x/g; + $pipe->writer() if (defined $pipe); - # Remove IN values - $orig_query =~ s/in\s*\([\'0x,\s]*\)/in (...)/g; + # Syslog does not have year information, so take care of year overlapping + my ($gsec, 
$gmin, $ghour, $gmday, $gmon, $gyear, $gwday, $gyday, $gisdst) = localtime(time); + $gyear += 1900; + my $CURRENT_DATE = $gyear . sprintf("%02d", $gmon + 1) . sprintf("%02d", $gmday); - return $orig_query; -} + my $cursize = 0; -# Format numbers with comma for better reading -sub comma_numbers -{ - return 0 if ($#_ < 0); + # Get file handle and size of the file + my ($lfile, $totalsize) = &get_log_file($logfile); + if ($stop_offset > 0) { + $totalsize = $stop_offset - $start_offset; + } - my $text = reverse $_[0]; + &logmsg('DEBUG', "Starting reading file $logfile..."); - $text =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1$num_sep/g; + if ($format eq 'csv') { - return scalar reverse $text; -} + require Text::CSV_XS; + my $csv = Text::CSV_XS->new({binary => 1, eol => $/, sep_char => $csv_sep_char}); -# Format duration -sub convert_time -{ - my $time = shift; + # Parse csvlog lines + while (my $row = $csv->getline($lfile)) { - return '0s' if (!$time); + # We received a signal + last if ($terminate); - my $days = int($time / 86400000); - $time -= ($days * 86400000); - my $hours = int($time / 3600000); - $time -= ($hours * 3600000); - my $minutes = int($time / 60000); - $time -= ($minutes * 60000); - my $seconds = sprintf("%0.2f", $time / 1000); + # Set progress statistics + $cursize += length(join(',', @$row)); + $nlines++; + if (!$tmpoutfile) { + if ($progress && (($nlines % $NUMPROGRESS) == 0)) { + if ($totalsize) { + print STDERR &progress_bar($cursize, $totalsize, 25, '='); + } else { + print STDERR "."; + } + } + } else { + if ($progress && (($nlines % $NUMPROGRESS) == 0)) { + $pipe->print("$cursize " . ($overall_stat{'queries_number'} - $old_queries_count) . " " . ($overall_stat{'errors_number'} - $old_errors_count) . "\n"); + $old_queries_count = $overall_stat{'queries_number'}; + $old_errors_count = $overall_stat{'errors_number'}; + $cursize = 0; + } + } + next if ($row->[11] !~ $parse_regex); - $days = $days < 1 ? '' : $days . 'd'; - $hours = $hours < 1 ? '' : $hours . 'h'; - $minutes = $minutes < 1 ? '' : $minutes . 'm'; - $time = $days . $hours . $minutes . $seconds . 's'; + # Extract the date + $row->[0] =~ m/^(\d+)-(\d+)-(\d+)\s+(\d+):(\d+):(\d+)\.(\d+)/; + my $milli = $7 || 0; + ($prefix_vars{'t_year'}, $prefix_vars{'t_month'}, $prefix_vars{'t_day'}, $prefix_vars{'t_hour'}, $prefix_vars{'t_min'}, $prefix_vars{'t_sec'}) = ($1, $2, $3, $4, $5, $6); + $prefix_vars{'t_timestamp'} = "$1-$2-$3 $4:$5:$6"; - return $time; -} + # Skip unwanted lines + next if ($from && ($from gt $prefix_vars{'t_timestamp'})); + if ($to && ($to lt $prefix_vars{'t_timestamp'})) { + if ($tmpoutfile) { + $pipe->print("$cursize " . ($overall_stat{'queries_number'} - $old_queries_count) . " " . ($overall_stat{'errors_number'} - $old_errors_count) . "\n"); + $old_queries_count = $overall_stat{'queries_number'}; + $old_errors_count = $overall_stat{'errors_number'}; + $cursize = 0; + } + $getout = 2; + last; + } -# Stores the top N slowest queries -sub set_top_slowest -{ - my ($q, $dt, $date) = @_; + # Jump to the last line parsed if required + next if (!&check_incremental_position($prefix_vars{'t_timestamp'}, join(',', @$row))); - push(@top_slowest, [($dt, $date, $q)]); + # Store the current timestamp of the log line + &store_current_timestamp($prefix_vars{'t_timestamp'}); - @top_slowest = (sort {$b->[0] <=> $a->[0]} @top_slowest)[0 .. 
$end_top]; + # Set query parameters as global variables + $prefix_vars{'t_dbuser'} = $row->[1] || ''; + $prefix_vars{'t_dbname'} = $row->[2] || ''; + $prefix_vars{'t_appname'} = $row->[22] || ''; + $prefix_vars{'t_client'} = $row->[4] || ''; + $prefix_vars{'t_client'} =~ s/:.*//; + $prefix_vars{'t_host'} = 'csv'; + $prefix_vars{'t_pid'} = $row->[3]; + $prefix_vars{'t_session_line'} = $row->[5]; + $prefix_vars{'t_session_line'} =~ s/\..*//; + $prefix_vars{'t_loglevel'} = $row->[11]; + $prefix_vars{'t_query'} = $row->[13]; + # Set ERROR additional information + $prefix_vars{'t_detail'} = $row->[14]; + $prefix_vars{'t_hint'} = $row->[15]; + $prefix_vars{'t_context'} = $row->[18]; + $prefix_vars{'t_statement'} = $row->[19]; -} + # Check if the log line should be excluded from the report + if (&validate_log_line($prefix_vars{'t_pid'})) { -# Stores top N slowest sample queries -sub set_top_sample -{ - my ($norm, $q, $dt, $date) = @_; + # Parse the query now + &parse_query(); + &store_queries($prefix_vars{'t_pid'}); + delete $cur_info{$prefix_vars{'t_pid'}}; + } + } + if (!$getout) { + $csv->eof or warn "FATAL: cannot use CSV, " . $csv->error_diag() . "\n"; + } - $normalyzed_info{$norm}{samples}{$dt}{query} = $q; - $normalyzed_info{$norm}{samples}{$dt}{date} = $date; + } + elsif ($format eq 'binary') { + &load_stats($lfile); + } + else { # Format is not CSV. - my $i = 1; - foreach my $k (sort {$b <=> $a} keys %{$normalyzed_info{$norm}{samples}}) { - if ($i > $sample) { - delete $normalyzed_info{$norm}{samples}{$k}; + my $time_pattern = qr/(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})/; + my $cur_pid = ''; + my @matches = (); + my $goon = 0; + &logmsg('DEBUG', "Start parsing at offset $start_offset of file $logfile"); + if ($start_offset) { + $lfile->seek($start_offset, 0); } - $i++; - } -} + while (my $line = <$lfile>) { -# Stores top N error sample queries -sub set_top_error_sample -{ - my ($q, $date, $real_error, $detail, $context, $statement, $hint) = @_; + # We received a signal + last if ($terminate); - # Stop when we have our number of samples - if (!exists $error_info{$q}{date} || ($#{$error_info{$q}{date}} < $sample)) { - if (($q =~ /deadlock detected/) || !grep(/\Q$real_error\E/, @{$error_info{$q}{error}})) { - push(@{$error_info{$q}{date}}, $date); - push(@{$error_info{$q}{detail}}, $detail); - push(@{$error_info{$q}{context}}, $context); - push(@{$error_info{$q}{statement}}, $statement); - push(@{$error_info{$q}{hint}}, $hint); - push(@{$error_info{$q}{error}}, $real_error); - } - } -} + $cursize += length($line); + $current_offset += length($line); + chomp($line); + $line =~ s/\r//; + $nlines++; + next if (!$line); -sub dump_as_text -{ + if (!$tmpoutfile) { + if ($progress && (($nlines % $NUMPROGRESS) == 0)) { + if ($totalsize) { + if ($stop_offset > 0) { + print STDERR &progress_bar($cursize - $start_offset, $stop_offset, 25, '='); + } else { + print STDERR &progress_bar($cursize, $totalsize, 25, '='); + } + } else { + print STDERR "."; + } + } + } else { + if ($progress && (($nlines % $NUMPROGRESS) == 0)) { + $pipe->print("$cursize " . ($overall_stat{'queries_number'} - $old_queries_count) . " " . ($overall_stat{'errors_number'} - $old_errors_count) . 
"\n"); + $old_queries_count = $overall_stat{'queries_number'}; + $old_errors_count = $overall_stat{'errors_number'}; + $cursize = 0; + } + } - # Global informations - my $curdate = localtime(time); - my $fmt_nlines = &comma_numbers($nlines); - my $total_time = timestr($td); - $total_time =~ s/^([\.0-9]+) wallclock.*/$1/; - $total_time = &convert_time($total_time * 1000); - my $logfile_str = $log_files[0]; - if ($#log_files > 0) { - $logfile_str .= ', ..., ' . $log_files[-1]; - } - print $fh qq{ -$report_title + %prefix_vars = (); -- Global informations -------------------------------------------------- + # Parse syslog lines + if ($format =~ /syslog/) { -Generated on $curdate -Log file: $logfile_str -Parsed $fmt_nlines log entries in $total_time -Log start from $first_log_date to $last_log_date -}; + @matches = ($line =~ $compiled_prefix); - # Overall statistics - my $fmt_unique = &comma_numbers(scalar keys %normalyzed_info) || 0; - my $fmt_queries = &comma_numbers($overall_stat{'queries_number'}) || 0; - my $fmt_duration = &convert_time($overall_stat{'queries_duration'}) || 0; - print $fh qq{ + if ($#matches >= 0) { -- Overall statistics --------------------------------------------------- + for (my $i = 0 ; $i <= $#prefix_params ; $i++) { + $prefix_vars{$prefix_params[$i]} = $matches[$i]; + } -Number of unique normalized queries: $fmt_unique -Number of queries: $fmt_queries -Total query duration: $fmt_duration -First query: $overall_stat{'first_query'} -Last query: $overall_stat{'last_query'} -}; - foreach (sort {$overall_stat{'query_peak'}{$b} <=> $overall_stat{'query_peak'}{$a}} keys %{$overall_stat{'query_peak'}}) { - print $fh "Query peak: ", &comma_numbers($overall_stat{'query_peak'}{$_}), " queries/s at $_"; - last; - } - if (!$disable_error) { - my $fmt_errors = &comma_numbers($overall_stat{'errors_number'}) || 0; - my $fmt_unique_error = &comma_numbers(scalar keys %{$overall_stat{'unique_normalized_errors'}}) || 0; + # skip non postgresql lines + next if ($prefix_vars{'t_ident'} ne $ident); + + # Stores temporary files and locks information + &store_temporary_and_lock_infos($cur_pid); + + # Standard syslog format does not have year information, months are + # three letters and day are not always with 2 digit. + if ($prefix_vars{'t_month'} !~ /\d/) { + $prefix_vars{'t_year'} = $gyear; + $prefix_vars{'t_day'} = sprintf("%02d", $prefix_vars{'t_day'}); + $prefix_vars{'t_month'} = $month_abbr{$prefix_vars{'t_month'}}; + # Take care of year overlapping + if ("$prefix_vars{'t_year'}$prefix_vars{'t_month'}$prefix_vars{'t_day'}" > $CURRENT_DATE) { + $prefix_vars{'t_year'} = substr($CURRENT_DATE, 0, 4) - 1; + } + } + $prefix_vars{'t_timestamp'} = +"$prefix_vars{'t_year'}-$prefix_vars{'t_month'}-$prefix_vars{'t_day'} $prefix_vars{'t_hour'}:$prefix_vars{'t_min'}:$prefix_vars{'t_sec'}"; + + # Skip unwanted lines + if ($#exclude_time >= 0) { + foreach (@exclude_time) { + if ($prefix_vars{'t_timestamp'} =~ /$_/) { + return; + } + } + } + + next if ($from && ($from gt $prefix_vars{'t_timestamp'})); + if ($to && ($to lt $prefix_vars{'t_timestamp'})) { + if ($tmpoutfile) { + $pipe->print("$cursize " . ($overall_stat{'queries_number'} - $old_queries_count) . " " . ($overall_stat{'errors_number'} - $old_errors_count) . 
"\n"); + $old_queries_count = $overall_stat{'queries_number'}; + $old_errors_count = $overall_stat{'errors_number'}; + $cursize = 0; + } + $getout = 2; + last; + } + + # Jump to the last line parsed if required + next if (!&check_incremental_position($prefix_vars{'t_timestamp'}, $line)); + $cur_pid = $prefix_vars{'t_pid'}; + $goon = 1; + + # Store the current timestamp of the log line + &store_current_timestamp($prefix_vars{'t_timestamp'}); + + # Extract information from log line prefix + if (!$log_line_prefix) { + &parse_log_prefix($prefix_vars{'t_logprefix'}); + } + + # Check if the log line should be excluded from the report + if (&validate_log_line($prefix_vars{'t_pid'})) { + + # Process the log line + &parse_query(); + } + + } elsif ($goon && ($line =~ $other_syslog_line)) { + + $cur_pid = $8; + my $t_query = $10; + $t_query =~ s/#011/\t/g; + next if ($t_query eq "\t"); + + if ($cur_info{$cur_pid}{vacuum} && ($t_query =~ /^\t(pages|tuples|buffer usage|avg read rate|system usage):/)) { + if ($t_query =~ /^\t(pages|tuples): (\d+) removed, (\d+) remain/) { + $autovacuum_info{tables}{$cur_info{$cur_pid}{vacuum}}{$1}{removed} += $2; + } + if ($t_query =~ m#^\tsystem usage: CPU .* sec elapsed (.*) sec#) { + if ($1 > $autovacuum_info{peak}{system_usage}{elapsed}) { + $autovacuum_info{peak}{system_usage}{elapsed} = $1; + $autovacuum_info{peak}{system_usage}{table} = $cur_info{$cur_pid}{vacuum}; + $autovacuum_info{peak}{system_usage}{date} = + "$cur_info{$cur_pid}{year}-$cur_info{$cur_pid}{month}-$cur_info{$cur_pid}{day} " . + "$cur_info{$cur_pid}{hour}:$cur_info{$cur_pid}{min}:$cur_info{$cur_pid}{sec}"; + } + } + next; + } elsif ( $cur_info{$cur_pid}{parameters} && (($t_query =~ /[,\s]*\$(\d+)\s=\s/) || ($t_query =~ /^('[^']*')$/)) ) { + # stores bind parameters if any + $cur_info{$cur_pid}{parameters} .= " $t_query"; + next; + } + + if (exists $cur_temp_info{$cur_pid}{query}) { + $cur_temp_info{$cur_pid}{query} .= "\n" . $t_query; + } elsif (exists $cur_lock_info{$cur_pid}{query}) { + $cur_lock_info{$cur_pid}{query} .= "\n" . $t_query; + } elsif (exists $cur_info{$cur_pid}{statement}) { + $cur_info{$cur_pid}{statement} .= "\n" . $t_query; + } elsif (exists $cur_info{$cur_pid}{context}) { + $cur_info{$cur_pid}{context} .= "\n" . $t_query; + } elsif (exists $cur_info{$cur_pid}{detail}) { + $cur_info{$cur_pid}{detail} .= "\n" . $t_query; + } elsif (exists $cur_info{$cur_pid}{query}) { + $cur_info{$cur_pid}{query} .= "\n" . $t_query; + } + + # Collect orphans lines of multiline queries + } elsif ($cur_pid && ($line !~ $orphan_syslog_line)) { + + if (exists $cur_temp_info{$cur_pid}{query}) { + $cur_temp_info{$cur_pid}{query} .= "\n" . $line; + } elsif (exists $cur_lock_info{$cur_pid}{query}) { + $cur_lock_info{$cur_pid}{query} .= "\n" . $line; + } elsif (exists $cur_info{$cur_pid}{statement}) { + $cur_info{$cur_pid}{statement} .= "\n" . $line; + } elsif (exists $cur_info{$cur_pid}{context}) { + $cur_info{$cur_pid}{context} .= "\n" . $line; + } elsif (exists $cur_info{$cur_pid}{detail}) { + $cur_info{$cur_pid}{detail} .= "\n" . $line; + } elsif (exists $cur_info{$cur_pid}{query}) { + $cur_info{$cur_pid}{query} .= "\n" . 
$line; + } + + } else { + &logmsg('DEBUG', "Unknown syslog line format: $line"); + } + + } elsif ($format eq 'stderr') { + + @matches = ($line =~ $compiled_prefix); + if ($#matches >= 0) { + for (my $i = 0 ; $i <= $#prefix_params ; $i++) { + $prefix_vars{$prefix_params[$i]} = $matches[$i]; + } + + # Stores temporary files and locks information + &store_temporary_and_lock_infos($cur_pid); + + if (!$prefix_vars{'t_timestamp'} && $prefix_vars{'t_mtimestamp'}) { + $prefix_vars{'t_timestamp'} = $prefix_vars{'t_mtimestamp'}; + } elsif (!$prefix_vars{'t_timestamp'} && $prefix_vars{'t_session_timestamp'}) { + $prefix_vars{'t_timestamp'} = $prefix_vars{'t_session_timestamp'}; + } + ($prefix_vars{'t_year'}, $prefix_vars{'t_month'}, $prefix_vars{'t_day'}, $prefix_vars{'t_hour'}, + $prefix_vars{'t_min'}, $prefix_vars{'t_sec'}) = ($prefix_vars{'t_timestamp'} =~ $time_pattern); + + # Skip unwanted lines + if ($#exclude_time >= 0) { + foreach (@exclude_time) { + if ($prefix_vars{'t_timestamp'} =~ /$_/) { + return; + } + } + } + next if ($from && ($from gt $prefix_vars{'t_timestamp'})); + if ($to && ($to lt $prefix_vars{'t_timestamp'})) { + if ($tmpoutfile) { + $pipe->print("$cursize " . ($overall_stat{'queries_number'} - $old_queries_count) . " " . ($overall_stat{'errors_number'} - $old_errors_count) . "\n"); + $old_queries_count = $overall_stat{'queries_number'}; + $old_errors_count = $overall_stat{'errors_number'}; + $cursize = 0; + } + $getout = 2; + last; + } + + # Jump to the last line parsed if required + next if (!&check_incremental_position($prefix_vars{'t_timestamp'}, $line)); + $cur_pid = $prefix_vars{'t_pid'}; + + # Store the current timestamp of the log line + &store_current_timestamp($prefix_vars{'t_timestamp'}); + + # Extract information from log line prefix + if (!$log_line_prefix) { + &parse_log_prefix($prefix_vars{'t_logprefix'}); + } + + # Check if the log line should be excluded from the report + if (&validate_log_line($prefix_vars{'t_pid'})) { + $prefix_vars{'t_host'} = 'stderr'; + + # Process the log line + &parse_query(); + } + + # Collect additional query information + } elsif ($cur_pid && ($line !~ $orphan_stderr_line)) { + + if ($line =~ s/^(STATEMENT|DETAIL|HINT):\s+//) { + $line =~ s/ERROR:\s+//; + $cur_info{$cur_pid}{"\L$1\E"} = $line; + next; + } elsif ($cur_info{$cur_pid}{vacuum} && ($line =~ /^\t(pages|tuples|buffer usage|avg read rate|system usage):/)) { + if ($line =~ /^\t(pages|tuples): (\d+) removed, (\d+) remain/) { + $autovacuum_info{tables}{$cur_info{$cur_pid}{vacuum}}{$1}{removed} += $2; + } + if ($line =~ m#^\tsystem usage: CPU .* sec elapsed (.*) sec#) { + if ($1 > $autovacuum_info{peak}{system_usage}{elapsed}) { + $autovacuum_info{peak}{system_usage}{elapsed} = $1; + $autovacuum_info{peak}{system_usage}{table} = $cur_info{$cur_pid}{vacuum}; + $autovacuum_info{peak}{system_usage}{date} = + "$cur_info{$cur_pid}{year}-$cur_info{$cur_pid}{month}-$cur_info{$cur_pid}{day} " . + "$cur_info{$cur_pid}{hour}:$cur_info{$cur_pid}{min}:$cur_info{$cur_pid}{sec}"; + } + } + next; + } elsif ( $cur_info{$cur_pid}{parameters} && (($line =~ /[,\s]*\$(\d+)\s=\s/) || ($line =~ /^'[^']*'$/)) ) { + # stores bind parameters if any + $cur_info{$cur_pid}{parameters} .= " $line"; + next; + } + if (exists $cur_temp_info{$cur_pid}{query}) { + $cur_temp_info{$cur_pid}{query} .= "\n" . $line; + } elsif (exists $cur_lock_info{$cur_pid}{query}) { + $cur_lock_info{$cur_pid}{query} .= "\n" . $line; + } elsif (exists $cur_info{$cur_pid}{statement}) { + $cur_info{$cur_pid}{statement} .= "\n" . 
$line; + } elsif (exists $cur_info{$cur_pid}{context}) { + $cur_info{$cur_pid}{context} .= "\n" . $line; + } elsif (exists $cur_info{$cur_pid}{detail}) { + $cur_info{$cur_pid}{detail} .= "\n" . $line; + } elsif (exists $cur_info{$cur_pid}{query}) { + $cur_info{$cur_pid}{query} .= "\n" . $line; + } + + # Collect orphans lines of multiline queries + } elsif ($cur_pid && ($cur_info{$cur_pid}{query})) { + + $cur_info{$cur_pid}{detail} .= "\n" . $line; + + } + + } else { + + # unknown format + &logmsg('DEBUG', "Unknown line format: $line"); + } + last if (($stop_offset > 0) && ($current_offset > $stop_offset)); + } + $last_line{current_pos} = $current_offset if ($last_parsed && ($#given_log_files == 0)); + + } + close $lfile; + + # Get stats from all pending temporary storage + foreach my $pid (sort {$cur_info{$a}{date} <=> $cur_info{$b}{date}} keys %cur_info) { + # Stores last queries information + &store_queries($pid); + + } + # Stores last temporary files and locks information + foreach my $pid (keys %cur_temp_info) { + &store_temporary_and_lock_infos($pid); + } + # Stores last temporary files and locks information + foreach my $pid (keys %cur_lock_info) { + &store_temporary_and_lock_infos($pid); + } + + if ($extension eq 'tsung') { + foreach my $pid (sort {$a <=> $b} keys %tsung_session) { + &store_tsung_session($pid); + } + } + + if ($progress && ($getout != 1)) { + if (!$tmpoutfile) { + if ($totalsize) { + if (($stop_offset > 0) && ($format ne 'csv')) { + print STDERR &progress_bar($cursize - $start_offset, $stop_offset, 25, '=',$overall_stat{'queries_number'},$overall_stat{'errors_number'}, $logfile); + } elsif ($extension eq 'tsung') { + print STDERR &progress_bar($cursize, $totalsize, 25, '=', $logfile); + } else { + print STDERR &progress_bar($cursize, $totalsize, 25, '=', $overall_stat{'queries_number'},$overall_stat{'errors_number'}, $logfile); + } + print STDERR "\n"; + } + } else { + $pipe->print("$cursize " . ($overall_stat{'queries_number'} - $old_queries_count) . " " . ($overall_stat{'errors_number'} - $old_errors_count) . 
"\n"); + } + } + + %cur_info = (); + + # In incremental mode data are saved to disk per day + if ($incremental && $last_line{datetime}) { + $incr_date = $last_line{datetime}; + $incr_date =~ s/\s.*$//; + # set path and create subdirectories + my $bpath = $incr_date; + while ($bpath =~ s/([^\-]+)\-/$1\//) { + mkdir("$outdir/$1") if (!-d "$outdir/$1"); + } + mkdir("$outdir/$bpath") if (!-d "$outdir/$bpath"); + + # Mark the directory as needing index update + if (open(OUT, ">>$last_parsed.tmp")) { + flock(OUT, 2) || return $getout; + print OUT "$incr_date\n"; + close(OUT); + } else { + &logmsg('ERROR', "can't save last parsed line into $last_parsed.tmp, $!"); + } + + # Save binary data + my $filenum = $$; + $filenum++ while (-e "$outdir/$bpath/$incr_date-$filenum.bin"); + my $fhb = new IO::File ">$outdir/$bpath/$incr_date-$filenum.bin"; + if (not defined $fhb) { + die "FATAL: can't write to $outdir/$bpath/$incr_date-$filenum.bin, $!\n"; + } + &dump_as_binary($fhb); + $fhb->close; + &init_stats_vars(); + + } elsif ($tmpoutfile) { + + &dump_as_binary($tmpoutfile); + $tmpoutfile->close(); + + } + + # Inform the parent that it should stop parsing other files + if ($getout) { + kill(12, $parent_pid); + } + + # Save last line into temporary file + if ($last_parsed && scalar keys %last_line) { + if (open(OUT, ">>$tmp_last_parsed")) { + flock(OUT, 2) || return $getout; + $last_line{current_pos} ||= 0; + print OUT "$last_line{datetime}\t$last_line{current_pos}\t$last_line{orig}\n"; + close(OUT); + } else { + &logmsg('ERROR', "can't save last parsed line into $tmp_last_parsed, $!"); + } + } + + return $getout; +} + +# Store the current timestamp of the log line +sub store_current_timestamp +{ + my $t_timestamp = shift; + + $prefix_vars{'t_date'} = $t_timestamp; + $prefix_vars{'t_date'} =~ s/\D+//g; + + if (!$overall_stat{'first_log_ts'} || ($overall_stat{'first_log_ts'} gt $t_timestamp)) { + $overall_stat{'first_log_ts'} = $t_timestamp; + } + if (!$overall_stat{'last_log_ts'} || ($overall_stat{'last_log_ts'} lt $t_timestamp)) { + $overall_stat{'last_log_ts'} = $t_timestamp; + } +} + +# Method used to check if the file stores logs after the last incremental position or not +# This position should have been saved in the incremental file and read in the $last_parsed at +# start up. Here we just verify that the first date in file is before the last incremental date. +sub check_file_changed +{ + my ($file, $saved_date) = @_; + + my ($lfile, $totalsize, $iscompressed) = &get_log_file($file); + + # Compressed files do not allow seeking + if ($iscompressed) { + close($lfile); + return 1; + # do not seek if filesize is smaller than the seek position + } elsif ($saved_last_line{current_pos} > $totalsize) { + close($lfile); + return 1; + } + + my ($gsec, $gmin, $ghour, $gmday, $gmon, $gyear, $gwday, $gyday, $gisdst) = localtime(time); + $gyear += 1900; + my $CURRENT_DATE = $gyear . sprintf("%02d", $gmon + 1) . sprintf("%02d", $gmday); + + %prefix_vars = (); + while (my $line = <$lfile>) { + + if ($format =~ /syslog/) { + + my @matches = ($line =~ $compiled_prefix); + if ($#matches >= 0) { + + for (my $i = 0 ; $i <= $#prefix_params ; $i++) { + $prefix_vars{$prefix_params[$i]} = $matches[$i]; + } + # Standard syslog format does not have year information, months are + # three letters and day are not always with 2 digit. 
+ if ($prefix_vars{'t_month'} !~ /\d/) { + $prefix_vars{'t_year'} = $gyear; + $prefix_vars{'t_day'} = sprintf("%02d", $prefix_vars{'t_day'}); + $prefix_vars{'t_month'} = $month_abbr{$prefix_vars{'t_month'}}; + # Take care of year overlapping + if ("$prefix_vars{'t_year'}$prefix_vars{'t_month'}$prefix_vars{'t_day'}" > $CURRENT_DATE) { + $prefix_vars{'t_year'} = substr($CURRENT_DATE, 0, 4) - 1; + } + } + $prefix_vars{'t_timestamp'} = +"$prefix_vars{'t_year'}-$prefix_vars{'t_month'}-$prefix_vars{'t_day'} $prefix_vars{'t_hour'}:$prefix_vars{'t_min'}:$prefix_vars{'t_sec'}"; + if ($saved_date gt $prefix_vars{'t_timestamp'}) { + close($lfile); + return 0; + } else { + last; + } + } + + } elsif ($format eq 'stderr') { + + my @matches = ($line =~ $compiled_prefix); + if ($#matches >= 0) { + for (my $i = 0 ; $i <= $#prefix_params ; $i++) { + $prefix_vars{$prefix_params[$i]} = $matches[$i]; + } + if (!$prefix_vars{'t_timestamp'} && $prefix_vars{'t_mtimestamp'}) { + $prefix_vars{'t_timestamp'} = $prefix_vars{'t_mtimestamp'}; + } elsif (!$prefix_vars{'t_timestamp'} && $prefix_vars{'t_session_timestamp'}) { + $prefix_vars{'t_timestamp'} = $prefix_vars{'t_session_timestamp'}; + } + } + if ($saved_date gt $prefix_vars{'t_timestamp'}) { + close($lfile); + return 0; + } else { + last; + } + } + } + close($lfile); + + return 1; +} + + +# Method used to check if we have already reach the last parsing position in incremental mode +# This position should have been saved in the incremental file and read in the $last_parsed at +# start up. +sub check_incremental_position +{ + my ($cur_date, $line) = @_; + + if ($last_parsed) { + if ($saved_last_line{datetime}) { + if ($cur_date lt $saved_last_line{datetime}) { + return 0; + } elsif (!$last_line{datetime} && ($cur_date eq $saved_last_line{datetime})) { + return 0 if ($line ne $saved_last_line{orig}); + } + } + $last_line{datetime} = $cur_date; + $last_line{orig} = $line; + } + + # In incremental mode data are saved to disk per day + if ($incremental) { + $cur_date =~ s/\s.*$//; + # Check if the current day has changed, if so save data + $incr_date = $cur_date if (!$incr_date); + if ($cur_date gt $incr_date) { + + # Get stats from all pending temporary storage + foreach my $pid (sort {$cur_info{$a}{date} <=> $cur_info{$b}{date}} keys %cur_info) { + # Stores last queries information + &store_queries($pid); + } + # Stores last temporary files and locks information + foreach my $pid (keys %cur_temp_info) { + &store_temporary_and_lock_infos($pid); + } + # Stores last temporary files and locks information + foreach my $pid (keys %cur_lock_info) { + &store_temporary_and_lock_infos($pid); + } + + if ($extension eq 'tsung') { + foreach my $pid (sort {$a <=> $b} keys %tsung_session) { + &store_tsung_session($pid); + } + } + + # set path and create subdirectories + my $bpath = $incr_date; + while ($bpath =~ s/([^\-]+)\-/$1\//) { + mkdir("$outdir/$1") if (!-d "$outdir/$1"); + } + mkdir("$outdir/$bpath") if (!-d "$outdir/$bpath"); + + # Mark this directory as needing a reindex + if (open(OUT, ">>$last_parsed.tmp")) { + flock(OUT, 2) || return 1; + print OUT "$incr_date\n"; + close(OUT); + } else { + &logmsg('ERROR', "can't save last parsed line into $last_parsed.tmp, $!"); + } + + # Save binary data + my $filenum = $$; + $filenum++ while (-e "$outdir/$bpath/$incr_date-$filenum.bin"); + my $fhb = new IO::File ">$outdir/$bpath/$incr_date-$filenum.bin"; + if (not defined $fhb) { + die "FATAL: can't write to $outdir/$bpath/$incr_date-$filenum.bin, $!\n"; + } + 
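# Each finished day is dumped to a binary data file named after the date and
+ # the process id, e.g. 2013/09/15/2013-09-15-12345.bin under the --outdir
+ # directory; these files are reloaded and merged again when the daily and
+ # weekly HTML reports are built.
+ 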
&dump_as_binary($fhb); + $fhb->close; + $incr_date = $cur_date; + &init_stats_vars(); + } + } + + return 1; +} + +# Display message following the log level +sub logmsg +{ + my ($level, $str) = @_; + + return if ($quiet && ($level ne 'FATAL')); + return if (!$debug && ($level eq 'DEBUG')); + + if ($level =~ /(\d+)/) { + print STDERR "\t" x $1; + } + + print STDERR "$level: $str\n"; +} + +# Normalize SQL queries by removing parameters +sub normalize_query +{ + my $orig_query = shift; + + return if (!$orig_query); + + # Remove comments + $orig_query =~ s/\/\*(.*?)\*\///gs; + + $orig_query = lc($orig_query); + + # Remove extra space, new line and tab characters by a single space + $orig_query =~ s/[\t\s\r\n]+/ /gs; + + # Remove string content + $orig_query =~ s/\\'//g; + $orig_query =~ s/'[^']*'/''/g; + $orig_query =~ s/''('')+/''/g; + + # Remove NULL parameters + $orig_query =~ s/=\s*NULL/=''/g; + + # Remove numbers + $orig_query =~ s/([^a-z_\$-])-?([0-9]+)/${1}0/g; + + # Remove hexadecimal numbers + $orig_query =~ s/([^a-z_\$-])0x[0-9a-f]{1,10}/${1}0x/g; + + # Remove IN values + $orig_query =~ s/in\s*\([\'0x,\s]*\)/in (...)/g; + + return $orig_query; +} + +# Format numbers with comma for better reading +sub comma_numbers +{ + return 0 if ($#_ < 0); + + return 0 if (!$_[0]); + + my $text = reverse $_[0]; + + $text =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1$num_sep/g; + + return scalar reverse $text; +} + +# Format numbers with comma for better reading +sub pretty_print_size +{ + my $val = shift; + return 0 if (!$val); + + if ($val >= 1125899906842624) { + $val = ($val / 1125899906842624); + $val = sprintf("%0.2f", $val) . " PiB"; + } elsif ($val >= 1099511627776) { + $val = ($val / 1099511627776); + $val = sprintf("%0.2f", $val) . " TiB"; + } elsif ($val >= 1073741824) { + $val = ($val / 1073741824); + $val = sprintf("%0.2f", $val) . " GiB"; + } elsif ($val >= 1048576) { + $val = ($val / 1048576); + $val = sprintf("%0.2f", $val) . " MiB"; + } elsif ($val >= 1024) { + $val = ($val / 1024); + $val = sprintf("%0.2f", $val) . " KiB"; + } else { + $val = $val . " B"; + } + + return $val; +} + + +# Format duration +sub convert_time +{ + my $time = shift; + + return '0s' if (!$time); + + my $days = int($time / 86400000); + $time -= ($days * 86400000); + my $hours = int($time / 3600000); + $time -= ($hours * 3600000); + my $minutes = int($time / 60000); + $time -= ($minutes * 60000); + my $seconds = sprintf("%0.3f", $time / 1000); + + $days = $days < 1 ? '' : $days . 'd'; + $hours = $hours < 1 ? '' : $hours . 'h'; + $minutes = $minutes < 1 ? '' : $minutes . 'm'; + $seconds =~ s/\.\d+$// if ($minutes); + $time = $days . $hours . $minutes . $seconds . 
's'; + + return $time; +} + +# Stores the top N queries generating the biggest temporary file +sub set_top_tempfile_info +{ + my ($q, $sz, $date, $db, $user, $remote, $app) = @_; + + push(@top_tempfile_info, [($sz, $date, $q, $db, $user, $remote, $app)]); + + my @tmp_top_tempfile_info = sort {$b->[0] <=> $a->[0]} @top_tempfile_info; + @top_tempfile_info = (); + for (my $i = 0; $i <= $#tmp_top_tempfile_info; $i++) { + push(@top_tempfile_info, $tmp_top_tempfile_info[$i]); + last if ($i == $end_top); + } +} + +# Stores the top N queries waiting the most +sub set_top_locked_info +{ + my ($q, $dt, $date, $db, $user, $remote, $app) = @_; + + push(@top_locked_info, [($dt, $date, $q, $db, $user, $remote, $app)]); + + my @tmp_top_locked_info = sort {$b->[0] <=> $a->[0]} @top_locked_info; + @top_locked_info = (); + for (my $i = 0; $i <= $#tmp_top_locked_info; $i++) { + push(@top_locked_info, $tmp_top_locked_info[$i]); + last if ($i == $end_top); + } +} + +# Stores the top N slowest queries +sub set_top_slowest +{ + my ($q, $dt, $date, $db, $user, $remote, $app) = @_; + + push(@top_slowest, [($dt, $date, $q, $db, $user, $remote, $app)]); + + my @tmp_top_slowest = sort {$b->[0] <=> $a->[0]} @top_slowest; + @top_slowest = (); + for (my $i = 0; $i <= $#tmp_top_slowest; $i++) { + push(@top_slowest, $tmp_top_slowest[$i]); + last if ($i == $end_top); + } + +} + +# Stores top N slowest sample queries +sub set_top_sample +{ + my ($norm, $q, $dt, $date, $db, $user, $remote, $app) = @_; + + $normalyzed_info{$norm}{samples}{$dt}{query} = $q; + $normalyzed_info{$norm}{samples}{$dt}{date} = $date; + $normalyzed_info{$norm}{samples}{$dt}{db} = $db; + $normalyzed_info{$norm}{samples}{$dt}{user} = $user; + $normalyzed_info{$norm}{samples}{$dt}{remote} = $remote; + $normalyzed_info{$norm}{samples}{$dt}{app} = $app; + + my $i = 1; + foreach my $k (sort {$b <=> $a} keys %{$normalyzed_info{$norm}{samples}}) { + if ($i > $sample) { + delete $normalyzed_info{$norm}{samples}{$k}; + } + $i++; + } +} + +# Stores top N error sample queries +sub set_top_error_sample +{ + my ($q, $date, $real_error, $detail, $context, $statement, $hint, $db) = @_; + + # Stop when we have our number of samples + if (!exists $error_info{$q}{date} || ($#{$error_info{$q}{date}} < $sample)) { + if ( ($q =~ /deadlock detected/) || ($real_error && !grep(/\Q$real_error\E/, @{$error_info{$q}{error}})) ) { + push(@{$error_info{$q}{date}}, $date); + push(@{$error_info{$q}{detail}}, $detail); + push(@{$error_info{$q}{context}}, $context); + push(@{$error_info{$q}{statement}}, $statement); + push(@{$error_info{$q}{hint}}, $hint); + push(@{$error_info{$q}{error}}, $real_error); + push(@{$error_info{$q}{db}}, $db); + } + } +} + +sub dump_as_text +{ + + # Global information + my $curdate = localtime(time); + my $fmt_nlines = &comma_numbers($nlines); + my $total_time = timestr($td); + $total_time =~ s/^([\.0-9]+) wallclock.*/$1/; + $total_time = &convert_time($total_time * 1000); + my $logfile_str = $log_files[0]; + if ($#log_files > 0) { + $logfile_str .= ', ..., ' . 
$log_files[-1]; + } + print $fh qq{ +pgBadger :: $report_title + +- Global information --------------------------------------------------- + +Generated on $curdate +Log file: $logfile_str +Parsed $fmt_nlines log entries in $total_time +Log start from $overall_stat{'first_log_ts'} to $overall_stat{'last_log_ts'} +}; + + # Overall statistics + my $fmt_unique = &comma_numbers(scalar keys %normalyzed_info); + my $fmt_queries = &comma_numbers($overall_stat{'queries_number'}); + my $fmt_duration = &convert_time($overall_stat{'queries_duration'}); + $overall_stat{'first_query_ts'} ||= '-'; + $overall_stat{'last_query_ts'} ||= '-'; + print $fh qq{ + +- Overall statistics --------------------------------------------------- + +Number of unique normalized queries: $fmt_unique +Number of queries: $fmt_queries +Total query duration: $fmt_duration +First query: $overall_stat{'first_query_ts'} +Last query: $overall_stat{'last_query_ts'} +}; + foreach (sort {$overall_stat{'peak'}{$b}{query} <=> $overall_stat{'peak'}{$a}{query}} keys %{$overall_stat{'peak'}}) { + print $fh "Query peak: ", &comma_numbers($overall_stat{'peak'}{$_}{query}), " queries/s at $_"; + last; + } + if (!$disable_error) { + my $fmt_errors = &comma_numbers($overall_stat{'errors_number'}); + my $fmt_unique_error = &comma_numbers(scalar keys %error_info); + print $fh qq{ +Number of events: $fmt_errors +Number of unique normalized events: $fmt_unique_error +}; + } + if ($tempfile_info{count}) { + my $fmt_temp_maxsise = &comma_numbers($tempfile_info{maxsize}); + my $fmt_temp_avsize = &comma_numbers(sprintf("%.2f", ($tempfile_info{size} / $tempfile_info{count}))); + print $fh qq{Number temporary files: $tempfile_info{count} +Max size of temporary files: $fmt_temp_maxsise +Average size of temporary files: $fmt_temp_avsize +}; + } + if (!$disable_session && $session_info{count}) { + my $avg_session_duration = &convert_time($session_info{duration} / $session_info{count}); + my $tot_session_duration = &convert_time($session_info{duration}); + print $fh qq{Total number of sessions: $session_info{count} +Total duration of sessions: $tot_session_duration +Average duration of sessions: $avg_session_duration +}; + foreach (sort {$overall_stat{'peak'}{$b}{session} <=> $overall_stat{'peak'}{$a}{session}} keys %{$overall_stat{'peak'}}) { + print $fh "Session peak: ", &comma_numbers($overall_stat{'peak'}{$_}{session}), " sessions at $_"; + last; + } + } + if (!$disable_connection && $connection_info{count}) { + print $fh "Total number of connections: $connection_info{count}\n"; + foreach (sort {$overall_stat{'peak'}{$b}{connection} <=> $overall_stat{'peak'}{$a}{connection}} keys %{$overall_stat{'peak'}}) { + if ($overall_stat{'peak'}{$_}{connection} > 0) { + print $fh "Connection peak: ", &comma_numbers($overall_stat{'peak'}{$_}{connection}), " conn/s at $_"; + } + last; + } + } + if (scalar keys %database_info > 1) { + print $fh "Total number of databases: ", scalar keys %database_info, "\n"; + } + if (!$disable_hourly && $overall_stat{'queries_number'}) { + print $fh qq{ + +- Hourly statistics ---------------------------------------------------- + +Report not supported by text format + +}; + } + + # INSERT/DELETE/UPDATE/SELECT repartition + my $totala = 0; + foreach my $a (@SQL_ACTION) { + $totala += $overall_stat{$a}; + } + if (!$disable_type && $totala) { + + my $total = $overall_stat{'queries_number'} || 1; + print $fh "\n- Queries by type ------------------------------------------------------\n\n"; + print $fh "Type Count Percentage\n"; + foreach 
my $a (@SQL_ACTION) { + print $fh "$a: ", &comma_numbers($overall_stat{$a}), " ", sprintf("%0.2f", ($overall_stat{$a} * 100) / $total), "%\n"; + } + print $fh "OTHERS: ", &comma_numbers($total - $totala), " ", sprintf("%0.2f", (($total - $totala) * 100) / $total), "%\n" + if (($total - $totala) > 0); + print $fh "\n"; + + # Show request per database statistics + if (scalar keys %database_info > 1) { + print $fh "\n- Request per database ------------------------------------------------------\n\n"; + print $fh "Database Request type Count\n"; + foreach my $d (sort keys %database_info) { + print $fh "$d - ", &comma_numbers($database_info{$d}{count}), "\n"; + foreach my $r (sort keys %{$database_info{$d}}) { + next if ($r eq 'count'); + print $fh "\t$r ", &comma_numbers($database_info{$d}{$r}), "\n"; + } + } + } + + # Show request per application statistics + if (scalar keys %application_info > 1) { + print $fh "\n- Request per application ------------------------------------------------------\n\n"; + print $fh "Application Request type Count\n"; + foreach my $d (sort keys %application_info) { + print $fh "$d - ", &comma_numbers($application_info{$d}{count}), "\n"; + foreach my $r (sort keys %{$application_info{$d}}) { + next if ($r eq 'count'); + print $fh "\t$r ", &comma_numbers($application_info{$d}{$r}), "\n"; + } + } + } + + # Show request per user statistics + if (scalar keys %user_info > 1) { + print $fh "\n- Request per user ------------------------------------------------------\n\n"; + print $fh "User Request type Count\n"; + foreach my $d (sort keys %user_info) { + print $fh "$d - ", &comma_numbers($user_info{$d}{count}), "\n"; + foreach my $r (sort keys %{$user_info{$d}}) { + next if ($r eq 'count'); + print $fh "\t$r ", &comma_numbers($user_info{$d}{$r}), "\n"; + } + } + } + + # Show request per user statistics + if (scalar keys %user_info > 1) { + print $fh "\n- Request per user ------------------------------------------------------\n\n"; + print $fh "Host Request type Count\n"; + foreach my $d (sort keys %user_info) { + print $fh "$d - ", &comma_numbers($user_info{$d}{count}), "\n"; + foreach my $r (sort keys %{$user_info{$d}}) { + next if ($r eq 'count'); + print $fh "\t$r ", &comma_numbers($user_info{$d}{$r}), "\n"; + } + } + } + + + + } + + if (!$disable_lock && scalar keys %lock_info > 0) { + print $fh "\n- Locks by type ------------------------------------------------------\n\n"; + print $fh "Type Object Count Total Duration Avg duration (s)\n"; + my $total_count = 0; + my $total_duration = 0; + foreach my $t (sort keys %lock_info) { + print $fh "$t\t\t", &comma_numbers($lock_info{$t}{count}), " ", &convert_time($lock_info{$t}{duration}), " ", + &convert_time($lock_info{$t}{duration} / $lock_info{$t}{count}), "\n"; + foreach my $o (sort keys %{$lock_info{$t}}) { + next if (($o eq 'count') || ($o eq 'duration') || ($o eq 'chronos')); + print $fh "\t$o\t", &comma_numbers($lock_info{$t}{$o}{count}), " ", &convert_time($lock_info{$t}{$o}{duration}), " ", + &convert_time($lock_info{$t}{$o}{duration} / $lock_info{$t}{$o}{count}), "\n"; + } + $total_count += $lock_info{$t}{count}; + $total_duration += $lock_info{$t}{duration}; + } + print $fh "Total:\t\t\t", &comma_numbers($total_count), " ", &convert_time($total_duration), " ", + &convert_time($total_duration / ($total_count || 1)), "\n"; + + } + + # Show session per database statistics + if (!$disable_session && exists $session_info{database}) { + print $fh "\n- Sessions per database 
------------------------------------------------------\n\n"; + print $fh "Database Count Total Duration Avg duration (s)\n"; + foreach my $d (sort keys %{$session_info{database}}) { + print $fh "$d - ", &comma_numbers($session_info{database}{$d}{count}), " ", + &convert_time($session_info{database}{$d}{duration}), " ", + &convert_time($session_info{database}{$d}{duration} / $session_info{database}{$d}{count}), "\n"; + } + } + + # Show session per user statistics + if (!$disable_session && exists $session_info{user}) { + print $fh "\n- Sessions per user ------------------------------------------------------\n\n"; + print $fh "User Count Total Duration Avg duration (s)\n"; + foreach my $d (sort keys %{$session_info{user}}) { + print $fh "$d - ", &comma_numbers($session_info{user}{$d}{count}), " ", &convert_time($session_info{user}{$d}{duration}), + " ", &convert_time($session_info{user}{$d}{duration} / $session_info{user}{$d}{count}), "\n"; + } + } + + # Show session per host statistics + if (!$disable_session && exists $session_info{host}) { + print $fh "\n- Sessions per host ------------------------------------------------------\n\n"; + print $fh "User Count Total Duration Avg duration (s)\n"; + foreach my $d (sort keys %{$session_info{host}}) { + print $fh "$d - ", &comma_numbers($session_info{host}{$d}{count}), " ", &convert_time($session_info{host}{$d}{duration}), + " ", &convert_time($session_info{host}{$d}{duration} / $session_info{host}{$d}{count}), "\n"; + } + } + + # Show connection per database statistics + if (!$disable_connection && exists $connection_info{database}) { + print $fh "\n- Connections per database ------------------------------------------------------\n\n"; + print $fh "Database User Count\n"; + foreach my $d (sort keys %{$connection_info{database}}) { + print $fh "$d - ", &comma_numbers($connection_info{database}{$d}), "\n"; + foreach my $u (sort keys %{$connection_info{user}}) { + next if (!exists $connection_info{database_user}{$d}{$u}); + print $fh "\t$u ", &comma_numbers($connection_info{database_user}{$d}{$u}), "\n"; + } + } + } + + # Show connection per user statistics + if (!$disable_connection && exists $connection_info{user}) { + print $fh "\n- Connections per user ------------------------------------------------------\n\n"; + print $fh "User Count\n"; + foreach my $d (sort keys %{$connection_info{user}}) { + print $fh "$d - ", &comma_numbers($connection_info{user}{$d}), "\n"; + } + } + + # Show connection per host statistics + if (!$disable_connection && exists $connection_info{host}) { + print $fh "\n- Connections per host ------------------------------------------------------\n\n"; + print $fh "User Count\n"; + foreach my $d (sort keys %{$connection_info{host}}) { + print $fh "$d - ", &comma_numbers($connection_info{host}{$d}), "\n"; + } + } + + # Show lock wait detailed information + if (!$disable_lock && scalar keys %lock_info > 0) { + + my @top_locked_queries; + foreach my $h (keys %normalyzed_info) { + if (exists($normalyzed_info{$h}{locks})) { + push (@top_locked_queries, [$h, $normalyzed_info{$h}{locks}{count}, $normalyzed_info{$h}{locks}{wait}, + $normalyzed_info{$h}{locks}{minwait}, $normalyzed_info{$h}{locks}{maxwait}]); + } + } + + # Most frequent waiting queries (N) + @top_locked_queries = sort {$b->[2] <=> $a->[2]} @top_locked_queries; + print $fh "\n- Most frequent waiting queries (N) -----------------------------------------\n\n"; + print $fh "Rank Count Total wait time (s) Min/Max/Avg duration (s) Query\n"; + for (my $i = 0 ; $i <= 
$#top_locked_queries ; $i++) { + last if ($i > $end_top); + print $fh ($i + 1), ") ", $top_locked_queries[$i]->[1], " - ", &convert_time($top_locked_queries[$i]->[2]), + " - ", &convert_time($top_locked_queries[$i]->[3]), "/", &convert_time($top_locked_queries[$i]->[4]), "/", + &convert_time(($top_locked_queries[$i]->[4] / $top_locked_queries[$i]->[1])), + " - ", $top_locked_queries[$i]->[0], "\n"; + print $fh "--\n"; + my $k = $top_locked_queries[$i]->[0]; + my $j = 1; + foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { + my $ttl = $top_locked_info[$i]->[1] || ''; + my $db = " - $normalyzed_info{$k}{samples}{$d}{date} - database: $normalyzed_info{$k}{samples}{$d}{db}" if ($normalyzed_info{$k}{samples}{$d}{db}); + $db .= ", user: $normalyzed_info{$k}{samples}{$d}{user}" if ($normalyzed_info{$k}{samples}{$d}{user}); + $db .= ", remote: $normalyzed_info{$k}{samples}{$d}{remote}" if ($normalyzed_info{$k}{samples}{$d}{remote}); + $db .= ", app: $normalyzed_info{$k}{samples}{$d}{app}" if ($normalyzed_info{$k}{samples}{$d}{app}); + $db =~ s/^, / - /; + print $fh "\t- Example $j: ", &convert_time($d), "$db - ", $normalyzed_info{$k}{samples}{$d}{query}, "\n"; + $j++; + } + } + print $fh "\n"; + @top_locked_queries = (); + + # Queries that waited the most + @top_locked_info = sort {$b->[1] <=> $a->[1]} @top_locked_info; + print $fh "\n- Queries that waited the mosts ---------------------------------------------\n\n"; + print $fh "Rank Wait time (s) Query\n"; + for (my $i = 0 ; $i <= $#top_locked_info ; $i++) { + my $ttl = $top_locked_info[$i]->[1] || ''; + my $db = " - database: $top_locked_info[$i]->[3]" if ($top_locked_info[$i]->[3]); + $db .= ", user: $top_locked_info[$i]->[4]" if ($top_locked_info[$i]->[4]); + $db .= ", remote: $top_locked_info[$i]->[5]" if ($top_locked_info[$i]->[5]); + $db .= ", app: $top_locked_info[$i]->[6]" if ($top_locked_info[$i]->[6]); + $db =~ s/^, / - /; + print $fh ($i + 1), ") ", &convert_time($top_locked_info[$i]->[0]), + " $ttl$db - ", $top_locked_info[$i]->[2], "\n"; + print $fh "--\n"; + } + print $fh "\n"; + } + + # Show temporary files detailed information + if (!$disable_temporary && scalar keys %tempfile_info > 0) { + + my @top_temporary; + foreach my $h (keys %normalyzed_info) { + if (exists($normalyzed_info{$h}{tempfiles})) { + push (@top_temporary, [$h, $normalyzed_info{$h}{tempfiles}{count}, $normalyzed_info{$h}{tempfiles}{size}, + $normalyzed_info{$h}{tempfiles}{minsize}, $normalyzed_info{$h}{tempfiles}{maxsize}]); + } + } + + # Queries generating the most temporary files (N) + @top_temporary = sort {$b->[1] <=> $a->[1]} @top_temporary; + print $fh "\n- Queries generating the most temporary files (N) ---------------------------\n\n"; + print $fh "Rank Count Total size Min/Max/Avg size Query\n"; + my $idx = 1; + for (my $i = 0 ; $i <= $#top_temporary ; $i++) { + last if ($i > $end_top); + print $fh $idx, ") ", + $top_temporary[$i]->[1], " - ", &comma_numbers($top_temporary[$i]->[2]), + " - ", &comma_numbers($top_temporary[$i]->[3]), + "/", &comma_numbers($top_temporary[$i]->[4]), "/", + &comma_numbers(sprintf("%.2f", $top_temporary[$i]->[2] / $top_temporary[$i]->[1])), + " - ", $top_temporary[$i]->[0], "\n"; + print $fh "--\n"; + my $k = $top_temporary[$i]->[0]; + if ($normalyzed_info{$k}{count} > 1) { + my $j = 1; + foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { + my $db = "$normalyzed_info{$k}{samples}{$d}{date} - database: $normalyzed_info{$k}{samples}{$d}{db}" if 
($normalyzed_info{$k}{samples}{$d}{db}); + $db .= ", user: $normalyzed_info{$k}{samples}{$d}{user}" if ($normalyzed_info{$k}{samples}{$d}{user}); + $db .= ", remote: $normalyzed_info{$k}{samples}{$d}{remote}" if ($normalyzed_info{$k}{samples}{$d}{remote}); + $db .= ", app: $normalyzed_info{$k}{samples}{$d}{app}" if ($normalyzed_info{$k}{samples}{$d}{app}); + $db =~ s/^, / - /; + print $fh "\t- Example $j: ", &convert_time($d), " - $db - ", $normalyzed_info{$k}{samples}{$d}{query}, "\n"; + $j++; + } + } + $idx++; + } + @top_temporary = (); + + # Top queries generating the largest temporary files + @top_tempfile_info = sort {$b->[1] <=> $a->[1]} @top_tempfile_info; + + print $fh "\n- Queries generating the largest temporary files ----------------------------\n\n"; + print $fh "Rank Size Query\n"; + for (my $i = 0 ; $i <= $#top_tempfile_info ; $i++) { + my $ttl = $top_tempfile_info[$i]->[1] || ''; + my $db = " - database: $top_tempfile_info[$i]->[3]" if ($top_tempfile_info[$i]->[3]); + $db .= ", user: $top_tempfile_info[$i]->[4]" if ($top_tempfile_info[$i]->[4]); + $db .= ", remote: $top_tempfile_info[$i]->[5]" if ($top_tempfile_info[$i]->[5]); + $db .= ", app: $top_tempfile_info[$i]->[6]" if ($top_tempfile_info[$i]->[6]); + $db =~ s/^, / - /; + print $fh ($i + 1), ") ", &comma_numbers($top_tempfile_info[$i]->[0]), + " - $ttl$db - ", $top_tempfile_info[$i]->[2], "\n"; + } + print $fh "\n"; + } + + # Show top information + if (!$disable_query && ($#top_slowest >= 0)) { + print $fh "\n- Slowest queries ------------------------------------------------------\n\n"; + print $fh "Rank Duration (s) Query\n"; + for (my $i = 0 ; $i <= $#top_slowest ; $i++) { + my $db = " database: $top_slowest[$i]->[3]" if ($top_slowest[$i]->[3]); + $db .= ", user: $top_slowest[$i]->[4]" if ($top_slowest[$i]->[4]); + $db .= ", remote: $top_slowest[$i]->[5]" if ($top_slowest[$i]->[5]); + $db .= ", app: $top_slowest[$i]->[6]" if ($top_slowest[$i]->[6]); + $db =~ s/^, //; + print $fh $i + 1, ") " . &convert_time($top_slowest[$i]->[0]) . "$db - $top_slowest[$i]->[2]\n"; + print $fh "--\n"; + } + + print $fh "\n- Queries that took up the most time (N) -------------------------------\n\n"; + print $fh "Rank Total duration Times executed Min/Max/Avg duration (s) Query\n"; + my $idx = 1; + foreach my $k (sort {$normalyzed_info{$b}{duration} <=> $normalyzed_info{$a}{duration}} keys %normalyzed_info) { + next if (!$normalyzed_info{$k}{count}); + last if ($idx > $top); + my $q = $k; + if ($normalyzed_info{$k}{count} == 1) { + foreach (keys %{$normalyzed_info{$k}{samples}}) { + $q = $normalyzed_info{$k}{samples}{$_}{query}; + last; + } + } + $normalyzed_info{$k}{average} = $normalyzed_info{$k}{duration} / $normalyzed_info{$k}{count}; + print $fh "$idx) " + . &convert_time($normalyzed_info{$k}{duration}) . " - " + . &comma_numbers($normalyzed_info{$k}{count}) . " - " + . &convert_time($normalyzed_info{$k}{min}) . "/" + . &convert_time($normalyzed_info{$k}{max}) . "/" + . &convert_time($normalyzed_info{$k}{average}) + . 
" - $q\n"; + print $fh "--\n"; + my $i = 1; + foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { + my $db = " - database: $normalyzed_info{$k}{samples}{$d}{db}" if ($normalyzed_info{$k}{samples}{$d}{db}); + $db .= ", user: $normalyzed_info{$k}{samples}{$d}{user}" if ($normalyzed_info{$k}{samples}{$d}{user}); + $db .= ", remote: $normalyzed_info{$k}{samples}{$d}{remote}" if ($normalyzed_info{$k}{samples}{$d}{remote}); + $db .= ", app: $normalyzed_info{$k}{samples}{$d}{app}" if ($normalyzed_info{$k}{samples}{$d}{app}); + $db =~ s/^, / - /; + print $fh "\t- Example $i: ", &convert_time($d), "$db - ", $normalyzed_info{$k}{samples}{$d}{query}, "\n"; + $i++; + } + $idx++; + } + } + if (!$disable_query && (scalar keys %normalyzed_info > 0)) { + print $fh "\n- Most frequent queries (N) --------------------------------------------\n\n"; + print $fh "Rank Times executed Total duration Min/Max/Avg duration (s) Query\n"; + my $idx = 1; + foreach my $k (sort {$normalyzed_info{$b}{count} <=> $normalyzed_info{$a}{count}} keys %normalyzed_info) { + next if (!$normalyzed_info{$k}{count}); + last if ($idx > $top); + my $q = $k; + if ($normalyzed_info{$k}{count} == 1) { + foreach (keys %{$normalyzed_info{$k}{samples}}) { + $q = $normalyzed_info{$k}{samples}{$_}{query}; + last; + } + } + print $fh "$idx) " + . &comma_numbers($normalyzed_info{$k}{count}) . " - " + . &convert_time($normalyzed_info{$k}{duration}) . " - " + . &convert_time($normalyzed_info{$k}{min}) . "/" + . &convert_time($normalyzed_info{$k}{max}) . "/" + . &convert_time($normalyzed_info{$k}{duration} / $normalyzed_info{$k}{count}) + . " - $q\n"; + print $fh "--\n"; + my $i = 1; + foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { + my $db = " - database: $normalyzed_info{$k}{samples}{$d}{db}" if ($normalyzed_info{$k}{samples}{$d}{db}); + $db .= ", user: $normalyzed_info{$k}{samples}{$d}{user}" if ($normalyzed_info{$k}{samples}{$d}{user}); + $db .= ", remote: $normalyzed_info{$k}{samples}{$d}{remote}" if ($normalyzed_info{$k}{samples}{$d}{remote}); + $db .= ", app: $normalyzed_info{$k}{samples}{$d}{app}" if ($normalyzed_info{$k}{samples}{$d}{app}); + $db =~ s/^, / - /; + print $fh "\tExample $i: ", &convert_time($d), "$db - ", $normalyzed_info{$k}{samples}{$d}{query}, "\n"; + $i++; + } + $idx++; + } + } + + if (!$disable_query && ($#top_slowest >= 0)) { + print $fh "\n- Slowest queries (N) --------------------------------------------------\n\n"; + print $fh "Rank Min/Max/Avg duration (s) Times executed Total duration Query\n"; + my $idx = 1; + foreach my $k (sort {$normalyzed_info{$b}{average} <=> $normalyzed_info{$a}{average}} keys %normalyzed_info) { + next if (!$normalyzed_info{$k}{count}); + last if ($idx > $top); + my $q = $k; + if ($normalyzed_info{$k}{count} == 1) { + foreach (keys %{$normalyzed_info{$k}{samples}}) { + $q = $normalyzed_info{$k}{samples}{$_}{query}; + last; + } + } + print $fh "$idx) " + . &convert_time($normalyzed_info{$k}{min}) . "/" + . &convert_time($normalyzed_info{$k}{max}) . "/" + . &convert_time($normalyzed_info{$k}{average}) . " - " + . &comma_numbers($normalyzed_info{$k}{count}) . " - " + . &convert_time($normalyzed_info{$k}{duration}) + . 
" - $q\n"; + print $fh "--\n"; + my $i = 1; + foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { + my $db = " - database: $normalyzed_info{$k}{samples}{$d}{db}" if ($normalyzed_info{$k}{samples}{$d}{db}); + $db .= ", user: $normalyzed_info{$k}{samples}{$d}{user}" if ($normalyzed_info{$k}{samples}{$d}{user}); + $db .= ", remote: $normalyzed_info{$k}{samples}{$d}{remote}" if ($normalyzed_info{$k}{samples}{$d}{remote}); + $db .= ", app: $normalyzed_info{$k}{samples}{$d}{app}" if ($normalyzed_info{$k}{samples}{$d}{app}); + $db =~ s/^, / - /; + print $fh "\tExample $i: ", &convert_time($d), "$db - ", $normalyzed_info{$k}{samples}{$d}{query}, "\n"; + $i++; + } + $idx++; + } + } + @top_slowest = (); + + if (!$disable_error) { + &show_error_as_text(); + } + + print $fh "\n\n"; + print $fh "Report generated by pgBadger $VERSION ($project_url).\n"; + +} + +sub dump_error_as_text +{ + + # Global information + my $curdate = localtime(time); + my $fmt_nlines = &comma_numbers($nlines); + my $total_time = timestr($td); + $total_time =~ s/^([\.0-9]+) wallclock.*/$1/; + $total_time = &convert_time($total_time * 1000); + my $logfile_str = $log_files[0]; + if ($#log_files > 0) { + $logfile_str .= ', ..., ' . $log_files[-1]; + } + print $fh qq{ +pgBadger :: $report_title + +- Global information --------------------------------------------------- + +Generated on $curdate +Log file: $logfile_str +Parsed $fmt_nlines log entries in $total_time +Log start from $overall_stat{'first_log_ts'} to $overall_stat{'last_log_ts'} +}; + + &show_error_as_text(); + + print $fh "\n\n"; + print $fh "Report generated by pgBadger $VERSION ($project_url).\n"; +} + +sub show_error_as_text +{ + return if (scalar keys %error_info == 0); + + print $fh "\n- Most frequent events (N) ---------------------------------------------\n\n"; + my $idx = 1; + foreach my $k (sort {$error_info{$b}{count} <=> $error_info{$a}{count}} keys %error_info) { + next if (!$error_info{$k}{count}); + last if ($idx > $top); + if ($error_info{$k}{count} > 1) { + my $msg = $k; + $msg =~ s/ERROR: (parameter "[^"]+" changed to)/LOG: $1/; + $msg =~ s/ERROR: (database system was shut down)/LOG: $1/; + $msg =~ s/ERROR: (recovery has paused)/LOG: $1/; + $msg =~ s/ERROR: (database system was interrupted while in recovery)/LOG: $1/; + print $fh "$idx) " . &comma_numbers($error_info{$k}{count}) . 
" - $msg\n"; + print $fh "--\n"; + my $j = 1; + for (my $i = 0 ; $i <= $#{$error_info{$k}{date}} ; $i++) { + if ( ($error_info{$k}{error}[$i] =~ s/ERROR: (parameter "[^"]+" changed to)/LOG: $1/) + || ($error_info{$k}{error}[$i] =~ s/ERROR: (database system was shut down)/LOG: $1/) + || ($error_info{$k}{error}[$i] =~ s/ERROR: (database system was interrupted while in recovery)/LOG: $1/) + || ($error_info{$k}{error}[$i] =~ s/ERROR: (recovery has paused)/LOG: $1/)) + { + $logs_type{ERROR}--; + $logs_type{LOG}++; + } + print $fh "\t- Example $j: $error_info{$k}{date}[$i] - $error_info{$k}{error}[$i]\n"; + print $fh "\t\tDetail: $error_info{$k}{detail}[$i]\n" if ($error_info{$k}{detail}[$i]); + print $fh "\t\tContext: $error_info{$k}{context}[$i]\n" if ($error_info{$k}{context}[$i]); + print $fh "\t\tHint: $error_info{$k}{hint}[$i]\n" if ($error_info{$k}{hint}[$i]); + print $fh "\t\tStatement: $error_info{$k}{statement}[$i]\n" if ($error_info{$k}{statement}[$i]); + print $fh "\t\tDatabase: $error_info{$k}{db}[$i]\n" if ($error_info{$k}{db}[$i]); + $j++; + } + } else { + if ( ($error_info{$k}{error}[0] =~ s/ERROR: (parameter "[^"]+" changed to)/LOG: $1/) + || ($error_info{$k}{error}[0] =~ s/ERROR: (database system was shut down)/LOG: $1/) + || ($error_info{$k}{error}[0] =~ s/ERROR: (database system was interrupted while in recovery)/LOG: $1/) + || ($error_info{$k}{error}[0] =~ s/ERROR: (recovery has paused)/LOG: $1/)) + { + $logs_type{ERROR}--; + $logs_type{LOG}++; + } + print $fh "$idx) " . &comma_numbers($error_info{$k}{count}) . " - $error_info{$k}{error}[0]\n"; + print $fh "--\n"; + print $fh "\t- Date: $error_info{$k}{date}[0]\n"; + print $fh "\t\tDetail: $error_info{$k}{detail}[0]\n" if ($error_info{$k}{detail}[0]); + print $fh "\t\tContext: $error_info{$k}{context}[0]\n" if ($error_info{$k}{context}[0]); + print $fh "\t\tHint: $error_info{$k}{hint}[0]\n" if ($error_info{$k}{hint}[0]); + print $fh "\t\tStatement: $error_info{$k}{statement}[0]\n" if ($error_info{$k}{statement}[0]); + print $fh "\t\tDatabase: $error_info{$k}{db}[0]\n" if ($error_info{$k}{db}[0]); + } + $idx++; + } + + if (scalar keys %logs_type > 0) { + print $fh "\n- Logs per type ---------------------------------------------\n\n"; + + my $total_logs = 0; + foreach my $d (keys %logs_type) { + $total_logs += $logs_type{$d}; + } + print $fh "Logs type Count Percentage\n"; + foreach my $d (sort keys %logs_type) { + next if (!$logs_type{$d}); + print $fh "$d\t\t", &comma_numbers($logs_type{$d}), "\t", sprintf("%0.2f", ($logs_type{$d} * 100) / $total_logs), "%\n"; + } + } +} + +sub html_header +{ + my $date = localtime(time); + my $global_info = &print_global_information(); + + print $fh qq{ + + +pgBadger :: $report_title + + + + + + +@jscode + + + + +


+
+ +
    +}; + +} + +sub html_footer +{ + print $fh qq{ + +
+
+ + + +
+ +
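# Illustrative sketch, not part of this patch: print_global_information() below
# reduces Benchmark's timestr() output to the wallclock seconds before feeding it
# to &convert_time(). A self-contained version of that extraction (the 250 ms
# sleep only stands in for the real parsing work):
use strict;
use warnings;
use Benchmark qw(timediff timestr);

my $t0 = Benchmark->new;
select(undef, undef, undef, 0.25);      # pretend we parsed a log for ~250 ms
my $td = timediff(Benchmark->new, $t0);

my $total_time = timestr($td);          # e.g. " 0 wallclock secs ( 0.00 usr ...)"
$total_time =~ s/^\s*([\.0-9]+) wallclock.*/$1/;
print "Parsed in $total_time seconds\n";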
+ + + +}; + +} + + +# Create global information section +sub print_global_information +{ + + my $curdate = localtime(time); + my $fmt_nlines = &comma_numbers($nlines); + my $total_time = timestr($td); + $total_time =~ s/^([\.0-9]+) wallclock.*/$1/; + $total_time = &convert_time($total_time * 1000); + my $logfile_str = $log_files[0]; + if ($#log_files > 0) { + $logfile_str .= ', ..., ' . $log_files[-1]; + } + return qq{ +
    +
  • Generated on $curdate
  • +
  • Log file: $logfile_str
  • +
  • Parsed $fmt_nlines log entries in $total_time
  • +
  • Log start from $overall_stat{'first_log_ts'} to $overall_stat{'last_log_ts'}
  • +
+}; + +} + +sub print_overall_statistics +{ + + my $fmt_unique = &comma_numbers(scalar keys %normalyzed_info); + my $fmt_queries = &comma_numbers($overall_stat{'queries_number'}); + my $fmt_duration = &convert_time($overall_stat{'queries_duration'}); + $overall_stat{'first_query_ts'} ||= '-'; + $overall_stat{'last_query_ts'} ||= '-'; + my $query_peak = 0; + my $query_peak_date = ''; + foreach (sort {$overall_stat{'peak'}{$b}{query} <=> $overall_stat{'peak'}{$a}{query}} keys %{$overall_stat{'peak'}}) { + $query_peak = &comma_numbers($overall_stat{'peak'}{$_}{query}); + $query_peak_date = $_; + last; + } + my $fmt_errors = &comma_numbers($overall_stat{'errors_number'}); + my $fmt_unique_error = &comma_numbers(scalar keys %error_info); + my $autovacuum_count = &comma_numbers($autovacuum_info{count}); + my $autoanalyze_count = &comma_numbers($autoanalyze_info{count}); + my $tempfile_count = &comma_numbers($tempfile_info{count}); + my $fmt_temp_maxsise = &comma_numbers($tempfile_info{maxsize}); + my $fmt_temp_avsize = &comma_numbers(sprintf("%.2f", $tempfile_info{size} / ($tempfile_info{count} || 1))); + my $session_count = &comma_numbers($session_info{count}); + my $avg_session_duration = &convert_time($session_info{duration} / ($session_info{count} || 1)); + my $tot_session_duration = &convert_time($session_info{duration}); + my $connection_count = &comma_numbers($connection_info{count}); + my $connection_peak = 0; + my $connection_peak_date = ''; + my $session_peak = 0; + my $session_peak_date = ''; + foreach (sort {$overall_stat{'peak'}{$b}{connection} <=> $overall_stat{'peak'}{$a}{connection}} keys %{$overall_stat{'peak'}}) { + $connection_peak = &comma_numbers($overall_stat{'peak'}{$_}{connection}); + $connection_peak_date = $_; + last; + } + foreach (sort {$overall_stat{'peak'}{$b}{session} <=> $overall_stat{'peak'}{$a}{session}} keys %{$overall_stat{'peak'}}) { + $session_peak = &comma_numbers($overall_stat{'peak'}{$_}{session}); + $session_peak_date = $_; + last; + } + my $main_error = 0; + my $total = 0; + foreach my $k (sort {$error_info{$b}{count} <=> $error_info{$a}{count}} keys %error_info) { + next if (!$error_info{$k}{count}); + $main_error = &comma_numbers($error_info{$k}{count}) if (!$main_error); + $total += $error_info{$k}{count}; + } + $total = &comma_numbers($total); + + my $db_count = scalar keys %database_info; + print $fh qq{ +

Overview

+ +
+

Global Stats

+
+ +
+
+
    +
  • $fmt_unique Number of unique normalized queries
  • +
  • $fmt_queries Number of queries
  • +
  • $fmt_duration Total query duration
  • +
  • $overall_stat{'first_query_ts'} First query
  • +
  • $overall_stat{'last_query_ts'} Last query
  • +
  • $query_peak queries/s at $query_peak_date Query peak
  • +
+
+
+
    +
  • $fmt_errors Number of events
  • +
  • $fmt_unique_error Number of unique normalized events
  • +
  • $main_error Max number of times the same event was reported
  • +
+
+
+
    +
  • $autovacuum_count Total number of automatic vacuums
  • +
  • $autoanalyze_count Total number of automatic analyzes
  • +
+
+
+
    +
  • $tempfile_count Number of temporary files
  • +
  • $fmt_temp_maxsise Max size of temporary files
  • +
  • $fmt_temp_avsize Average size of temporary files
  • +
+
+
+
    +
  • $session_count Total number of sessions
  • +
  • $session_peak sessions at $session_peak_date Session peak
  • +
  • $tot_session_duration Total duration of sessions
  • +
  • $avg_session_duration Average duration of sessions
  • +
+
+
+
    +
  • $connection_count Total number of connections
  • +}; + if ($connection_count) { + print $fh qq{ +
  • $connection_peak connections/s at $connection_peak_date Connection peak
  • +}; + } + print $fh qq{ +
  • $db_count Total number of databases
  • +
+
+
+
+
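# Illustrative sketch, not part of this patch: every "peak" figure in the overview
# above (query, session and connection peaks) is obtained the same way, by sorting
# the per-timestamp hash in descending order and keeping only the first key. With
# made-up sample data the idiom reduces to:
use strict;
use warnings;

my %peak = (
    '2014-02-05 10:15:03' => { query => 120 },
    '2014-02-05 10:15:04' => { query => 310 },
    '2014-02-05 10:15:05' => { query =>  95 },
);

# Highest counter first; the first key is the peak timestamp.
my ($peak_date) = sort { $peak{$b}{query} <=> $peak{$a}{query} } keys %peak;
printf "Query peak: %d queries/s at %s\n", $peak{$peak_date}{query}, $peak_date;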
+}; + +} + +sub print_general_activity +{ + my $queries = ''; + my $select_queries = ''; + my $write_queries = ''; + my $prepared_queries = ''; + my $connections = ''; + my $sessions = ''; + foreach my $d (sort {$a <=> $b} keys %per_minute_info) { + my $c = 1; + $d =~ /^\d{4}(\d{2})(\d{2})$/; + my $zday = "$abbr_month{$1} $2"; + foreach my $h (sort {$a <=> $b} keys %{$per_minute_info{$d}}) { + my %cur_period_info = (); + my $write_average_duration = 0; + my $write_average_count = 0; + foreach my $m (keys %{$per_minute_info{$d}{$h}}) { + $cur_period_info{count} += ($per_minute_info{$d}{$h}{$m}{query}{count} || 0); + $cur_period_info{duration} += ($per_minute_info{$d}{$h}{$m}{query}{duration} || 0); + $cur_period_info{min} = $per_minute_info{$d}{$h}{$m}{query}{duration} if (!exists $cur_period_info{min} || ($per_minute_info{$d}{$h}{$m}{query}{duration} < $cur_period_info{min})); + $cur_period_info{max} = $per_minute_info{$d}{$h}{$m}{query}{duration} if (!exists $cur_period_info{max} || ($per_minute_info{$d}{$h}{$m}{query}{duration} > $cur_period_info{max})); + foreach my $a (@SQL_ACTION) { + $cur_period_info{$a}{count} += ($per_minute_info{$d}{$h}{$m}{$a}{count} || 0); + $cur_period_info{$a}{duration} += ($per_minute_info{$d}{$h}{$m}{$a}{duration} || 0); + $cur_period_info{usual} += ($per_minute_info{$d}{$h}{$m}{$a}{count} || 0); + } + $cur_period_info{prepare} += ($per_minute_info{$d}{$h}{$m}{prepare} || 0); + $cur_period_info{execute} += ($per_minute_info{$d}{$h}{$m}{execute} || 0); + } + + $cur_period_info{average} = $cur_period_info{duration} / ($cur_period_info{count} || 1); + $cur_period_info{'SELECT'}{average} = $cur_period_info{'SELECT'}{duration} / ($cur_period_info{'SELECT'}{count} || 1); + $write_average_duration = ($cur_period_info{'INSERT'}{duration} + + $cur_period_info{'UPDATE'}{duration} + + $cur_period_info{'DELETE'}{duration}); + $write_average_count = ($cur_period_info{'INSERT'}{count} + + $cur_period_info{'UPDATE'}{count} + + $cur_period_info{'DELETE'}{count}); + $zday = " " if ($c > 1); + $c++; + + my $count = &comma_numbers($cur_period_info{count}); + my $min = &convert_time($cur_period_info{min}); + my $max = &convert_time($cur_period_info{max}); + my $average = &convert_time($cur_period_info{average}); + $queries .= qq{ + + $zday + $h + $count + $min + $max + $average + }; + $count = &comma_numbers($cur_period_info{'SELECT'}{count}); + $average = &convert_time($cur_period_info{'SELECT'}{average}); + $select_queries .= qq{ + + $zday + $h + $count + $average + }; + my $insert_count = &comma_numbers($cur_period_info{'INSERT'}{count}); + my $update_count = &comma_numbers($cur_period_info{'UPDATE'}{count}); + my $delete_count = &comma_numbers($cur_period_info{'DELETE'}{count}); + my $write_average = &convert_time($write_average_duration / ($write_average_count || 1)); + $write_queries .= qq{ + + $zday + $h + $insert_count + $update_count + $delete_count + $write_average + }; + my $prepare_count = &comma_numbers($cur_period_info{prepare}); + my $execute_count = &comma_numbers($cur_period_info{execute}); + my $bind_prepare = &comma_numbers(sprintf("%.2f", $cur_period_info{execute}/($cur_period_info{prepare}||1))); + my $prepare_usual = &comma_numbers(sprintf("%.2f", ($cur_period_info{prepare}/($cur_period_info{usual}||1)) * 100)) . 
"%"; + $prepared_queries .= qq{ + + $zday + $h + $prepare_count + $execute_count + $bind_prepare + $prepare_usual + }; + $count = &comma_numbers($connection_info{chronos}{"$d"}{"$h"}{count}); + $average = &comma_numbers(sprintf("%0.2f", $connection_info{chronos}{"$d"}{"$h"}{count} / 3600)); + $connections .= qq{ + + $zday + $h + $count + $average/s + }; + $count = &comma_numbers($session_info{chronos}{"$d"}{"$h"}{count}); + $cur_period_info{'session'}{average} = + $session_info{chronos}{"$d"}{"$h"}{duration} / ($session_info{chronos}{"$d"}{"$h"}{count} || 1); + $average = &convert_time($cur_period_info{'session'}{average}); + $sessions .= qq{ + + $zday + $h + $count + $average + }; + } + } + + # Set default values + $queries = qq{$NODATA} if (!$queries); + $select_queries = qq{$NODATA} if (!$select_queries); + $write_queries = qq{$NODATA} if (!$write_queries); + $prepared_queries = qq{$NODATA} if (!$prepared_queries); + $connections = qq{$NODATA} if (!$connections); + $sessions = qq{$NODATA} if (!$sessions); + + print $fh qq{ +
+

General Activity

+
+ +
+
+ + + + + + + + + + + + $queries + +
Day Hour Count Min duration Max duration Avg duration
+
+
+ + + + + + + + + + $select_queries + +
Day Hour Count Average Duration
+
+
+ + + + + + + + + + + + $write_queries + +
Day Hour INSERT UPDATE DELETE Average Duration
+
+
+ + + + + + + + + + + + $prepared_queries + +
Day Hour Prepare Bind Bind/Prepare Percentage of prepare
+
+
+ + + + + + + + + + $connections + +
Day Hour Count Average / Second
+
+
+ + + + + + + + + + $sessions + +
Day Hour Count Average Duration
+
+
+ Back to the top of the General Activity table +
+ +
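# Illustrative sketch, not part of this patch: the General Activity tables above are
# built by folding the per-minute counters into one row per hour. Assuming the same
# %per_minute_info{day}{hour}{minute} layout as the patch (sample values are made up),
# the aggregation step looks like:
use strict;
use warnings;

my %per_minute_info = (
    '20140205' => {
        '10' => {
            '00' => { query => { count => 42, duration => 180 } },
            '01' => { query => { count => 10, duration =>  35 } },
        },
    },
);

foreach my $day (sort keys %per_minute_info) {
    foreach my $hour (sort keys %{ $per_minute_info{$day} }) {
        my ($count, $duration) = (0, 0);
        foreach my $min (keys %{ $per_minute_info{$day}{$hour} }) {
            $count    += $per_minute_info{$day}{$hour}{$min}{query}{count}    || 0;
            $duration += $per_minute_info{$day}{$hour}{$min}{query}{duration} || 0;
        }
        # "|| 1" guards against an empty hour, as the patch does elsewhere.
        printf "%s %sh: %d queries, avg %.2f ms\n",
            $day, $hour, $count, $duration / ($count || 1);
    }
}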
+}; + +} + +sub print_sql_traffic +{ + + my $bind_vs_prepared = sprintf("%.2f", $overall_stat{'execute'} / ($overall_stat{'prepare'} || 1)); + my $total_usual_queries = 0; + map { $total_usual_queries += $overall_stat{$_}; } @SQL_ACTION; + my $prepared_vs_normal = sprintf("%.2f", ($overall_stat{'execute'} / ($total_usual_queries || 1))*100); + + my $query_peak = 0; + my $query_peak_date = ''; + foreach (sort {$overall_stat{'peak'}{$b}{query} <=> $overall_stat{'peak'}{$a}{query}} keys %{$overall_stat{'peak'}}) { + $query_peak = &comma_numbers($overall_stat{'peak'}{$_}{query}); + $query_peak_date = $_; + last; + } + + my $select_peak = 0; + my $select_peak_date = ''; + foreach (sort {$overall_stat{'peak'}{$b}{select} <=> $overall_stat{'peak'}{$a}{select}} keys %{$overall_stat{'peak'}}) { + $select_peak = &comma_numbers($overall_stat{'peak'}{$_}{select}); + $select_peak_date = $_; + last; + } + + my $write_peak = 0; + my $write_peak_date = ''; + foreach (sort {$overall_stat{'peak'}{$b}{write} <=> $overall_stat{'peak'}{$a}{write}} keys %{$overall_stat{'peak'}}) { + $write_peak = &comma_numbers($overall_stat{'peak'}{$_}{write}); + $write_peak_date = $_; + last; + } + my $fmt_duration = &convert_time($overall_stat{'queries_duration'}); + + print $fh qq{ +
+

SQL Traffic

+
+

Key values

+
+
    +
  • $query_peak queries/s Query Peak
  • +
  • $query_peak_date Date
  • +
+
+
+
+

Queries per second ($avg_minutes minutes average)

+$drawn_graphs{queriespersecond_graph} +
+
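# Illustrative sketch, not part of this patch: the "Queries per second" series plotted
# above is simply the per-bucket query count divided by the bucket length in seconds
# (60 * $avg_minutes). With 5-minute buckets, 15000 queries in one bucket show up as
# 50 queries/s (the sample count is made up):
use strict;
use warnings;

my $avg_minutes  = 5;
my $bucket_count = 15_000;
my $qps          = int($bucket_count / (60 * $avg_minutes));
print "$qps queries/s\n";    # prints 50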
+}; + delete $drawn_graphs{queriespersecond_graph}; + + print $fh qq{ +
+

SELECT Traffic

+
+

Key values

+
+
    +
  • $select_peak queries/s Query Peak
  • +
  • $select_peak_date Date
  • +
+
+
+
+

SELECT queries per second ($avg_minutes minutes average)

+$drawn_graphs{selectqueries_graph} +
+
+}; + delete $drawn_graphs{selectqueries_graph}; + + print $fh qq{ +
+

INSERT/UPDATE/DELETE Traffic

+
+

Key values

+
+
    +
  • $write_peak queries/s Query Peak
  • +
  • $write_peak_date Date
  • +
+
+
+
+

Write queries per second ($avg_minutes minutes average)

+$drawn_graphs{writequeries_graph} +
+
+}; + delete $drawn_graphs{writequeries_graph}; + + print $fh qq{ +
+

Queries duration

+
+

Key values

+
+
    +
  • $fmt_duration Total query duration
  • +
+
+
+
+

Average queries duration ($avg_minutes minutes average)

+$drawn_graphs{durationqueries_graph} +
+
+}; + delete $drawn_graphs{durationqueries_graph}; + + print $fh qq{ +
+

Prepared queries ratio

+
+

Key values

+
+
    +
  • $bind_vs_prepared Ratio of bind vs prepare
  • +
  • $prepared_vs_normal % Ratio between prepared and "usual" statements
  • +
+
+
+
+

Ratio of bind vs prepare statements ($avg_minutes minutes average)

+$drawn_graphs{bindpreparequeries_graph} +
+
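# Illustrative sketch, not part of this patch: the two figures in the "Prepared
# queries ratio" block above come down to the guarded divisions below; the sample
# counter values are made up.
use strict;
use warnings;

my %overall_stat = ( prepare => 1_200, execute => 5_400 );
my $total_usual_queries = 25_000;    # sum of the SELECT/INSERT/UPDATE/DELETE counts

# "|| 1" avoids a division by zero when a counter never fired.
my $bind_vs_prepared   = sprintf("%.2f", $overall_stat{execute} / ($overall_stat{prepare} || 1));
my $prepared_vs_normal = sprintf("%.2f", ($overall_stat{execute} / ($total_usual_queries || 1)) * 100);

print "Ratio of bind vs prepare: $bind_vs_prepared\n";
print "Ratio between prepared and \"usual\" statements: $prepared_vs_normal%\n";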
+}; + delete $drawn_graphs{bindpreparequeries_graph}; + +} + +sub compute_query_graphs +{ + my %graph_data = (); + if ($graph) { + foreach my $tm (sort {$a <=> $b} keys %per_minute_info) { + $tm =~ /(\d{4})(\d{2})(\d{2})/; + my $y = $1 - 1900; + my $mo = $2 - 1; + my $d = $3; + foreach my $h ("00" .. "23") { + next if (!exists $per_minute_info{$tm}{$h}); + my %q_dataavg = (); + my %a_dataavg = (); + my %c_dataavg = (); + my %s_dataavg = (); + my %p_dataavg = (); + foreach my $m ("00" .. "59") { + next if (!exists $per_minute_info{$tm}{$h}{$m}); + + my $rd = &average_per_minutes($m, $avg_minutes); + + $p_dataavg{prepare}{"$rd"} += $per_minute_info{$tm}{$h}{$m}{prepare} + if (exists $per_minute_info{$tm}{$h}{$m}{prepare}); + $p_dataavg{prepare}{"$rd"} += $per_minute_info{$tm}{$h}{$m}{prepare} + if (exists $per_minute_info{$tm}{$h}{$m}{parse}); + $p_dataavg{execute}{"$rd"} += $per_minute_info{$tm}{$h}{$m}{execute} + if (exists $per_minute_info{$tm}{$h}{$m}{execute}); + + if (exists $per_minute_info{$tm}{$h}{$m}{query}) { + + # Average per minute + $q_dataavg{count}{"$rd"} += $per_minute_info{$tm}{$h}{$m}{query}{count}; + if (exists $per_minute_info{$tm}{$h}{$m}{query}{duration}) { + $q_dataavg{duration}{"$rd"} += $per_minute_info{$tm}{$h}{$m}{query}{duration}; + } + + # Search minimum and maximum during this minute + $q_dataavg{max}{"$rd"} = 0 if (!$q_dataavg{max}{"$rd"}); + $q_dataavg{min}{"$rd"} = 0 if (!$q_dataavg{min}{"$rd"}); + foreach my $s (keys %{$per_minute_info{$tm}{$h}{$m}{query}{second}}) { + $q_dataavg{max}{"$rd"} = $per_minute_info{$tm}{$h}{$m}{query}{second}{$s} + if ($per_minute_info{$tm}{$h}{$m}{query}{second}{$s} > $q_dataavg{max}{"$rd"}); + $q_dataavg{min}{"$rd"} = $per_minute_info{$tm}{$h}{$m}{query}{second}{$s} + if ($per_minute_info{$tm}{$h}{$m}{query}{second}{$s} < $q_dataavg{min}{"$rd"}); + } + + if (!$disable_query) { + foreach my $action (@SQL_ACTION) { + next if (!$per_minute_info{$tm}{$h}{$m}{$action}{count}); + $a_dataavg{$action}{count}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{$action}{count} || 0); + $a_dataavg{$action}{duration}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{$action}{duration} || 0); + if ( ($action ne 'SELECT') && exists $per_minute_info{$tm}{$h}{$m}{$action}{count}) { + $a_dataavg{write}{count}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{$action}{count} || 0); + $a_dataavg{write}{duration}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{$action}{duration} || 0); + } + # Search minimum and maximum during this minute + $a_dataavg{$action}{max}{"$rd"} = 0 if (! exists $a_dataavg{$action}{max}{"$rd"}); + $a_dataavg{$action}{min}{"$rd"} = 0 if (! 
exists $a_dataavg{$action}{min}{"$rd"}); + foreach my $s (keys %{$per_minute_info{$tm}{$h}{$m}{$action}{second}}) { + $a_dataavg{$action}{max}{"$rd"} = $per_minute_info{$tm}{$h}{$m}{$action}{second}{$s} + if ($per_minute_info{$tm}{$h}{$m}{$action}{second}{$s} > $a_dataavg{$action}{max}{"$rd"}); + $a_dataavg{$action}{min}{"$rd"} = $per_minute_info{$tm}{$h}{$m}{$action}{second}{$s} + if ($per_minute_info{$tm}{$h}{$m}{$action}{second}{$s} < $a_dataavg{$action}{min}{"$rd"}); + } + } + } + } + + if (exists $per_minute_info{$tm}{$h}{$m}{connection}) { + + # Average per minute + $c_dataavg{average}{"$rd"} += $per_minute_info{$tm}{$h}{$m}{connection}{count}; + + # Search minimum and maximum during this minute + $c_dataavg{max}{"$rd"} = 0 if (!$c_dataavg{max}{"$rd"}); + $c_dataavg{min}{"$rd"} = 0 if (!$c_dataavg{min}{"$rd"}); + foreach my $s (keys %{$per_minute_info{$tm}{$h}{$m}{connection}{second}}) { + $c_dataavg{max}{"$rd"} = $per_minute_info{$tm}{$h}{$m}{connection}{second}{$s} + if ($per_minute_info{$tm}{$h}{$m}{connection}{second}{$s} > $c_dataavg{max}{"$rd"}); + $c_dataavg{min}{"$rd"} = $per_minute_info{$tm}{$h}{$m}{connection}{second}{$s} + if ($per_minute_info{$tm}{$h}{$m}{connection}{second}{$s} < $c_dataavg{min}{"$rd"}); + } + delete $per_minute_info{$tm}{$h}{$m}{connection}; + } + + if (exists $per_minute_info{$tm}{$h}{$m}{session}) { + + # Average per minute + $s_dataavg{average}{"$rd"} += $per_minute_info{$tm}{$h}{$m}{session}{count}; + + # Search minimum and maximum during this minute + $s_dataavg{max}{"$rd"} = 0 if (!$s_dataavg{max}{"$rd"}); + $s_dataavg{min}{"$rd"} = 0 if (!$s_dataavg{min}{"$rd"}); + foreach my $s (keys %{$per_minute_info{$tm}{$h}{$m}{session}{second}}) { + $s_dataavg{max}{"$rd"} = $per_minute_info{$tm}{$h}{$m}{session}{second}{$s} + if ($per_minute_info{$tm}{$h}{$m}{session}{second}{$s} > $s_dataavg{max}{"$rd"}); + $s_dataavg{min}{"$rd"} = $per_minute_info{$tm}{$h}{$m}{session}{second}{$s} + if ($per_minute_info{$tm}{$h}{$m}{session}{second}{$s} < $s_dataavg{min}{"$rd"}); + } + delete $per_minute_info{$tm}{$h}{$m}{session}; + } + } + + foreach my $rd (@avgs) { + my $t = timegm_nocheck(0, $rd, $h, $d, $mo, $y) * 1000; + + next if ($t < $t_min); + last if ($t > $t_max); + + if (exists $q_dataavg{count}) { + # Average queries per minute + $graph_data{query} .= "[$t, " . int(($q_dataavg{count}{"$rd"} || 0) / (60 * $avg_minutes)) . "],"; + # Maxi queries per minute + $graph_data{'query-max'} .= "[$t, " . ($q_dataavg{max}{"$rd"} || 0) . "],"; + # Mini queries per minute + $graph_data{'query-min'} .= "[$t, " . ($q_dataavg{min}{"$rd"} || 0) . "],"; + # Average duration per minute + $graph_data{query4} .= "[$t, " . sprintf("%.3f", ($q_dataavg{duration}{"$rd"} || 0) / ($q_dataavg{count}{"$rd"} || 1)) . "],"; + } + if (scalar keys %c_dataavg) { + # Average connections per minute + $graph_data{conn_avg} .= "[$t, " . int(($c_dataavg{average}{"$rd"} || 0) / (60 * $avg_minutes)) . "],"; + # Maxi connections per minute + $graph_data{conn_max} .= "[$t, " . ($c_dataavg{max}{"$rd"} || 0) . "],"; + + # Mini connections per minute + $graph_data{conn_min} .= "[$t, " . ($c_dataavg{min}{"$rd"} || 0) . "],"; + } + if (scalar keys %s_dataavg) { + # Average connections per minute + $graph_data{sess_avg} .= "[$t, " . int(($s_dataavg{average}{"$rd"} || 0) / (60 * $avg_minutes)) . "],"; + # Maxi connections per minute + $graph_data{sess_max} .= "[$t, " . ($s_dataavg{max}{"$rd"} || 0) . "],"; + + # Mini connections per minute + $graph_data{sess_min} .= "[$t, " . ($s_dataavg{min}{"$rd"} || 0) . 
"],"; + } + if (!$disable_query && (scalar keys %a_dataavg > 0)) { + foreach my $action (@SQL_ACTION) { + next if ($select_only && ($action ne 'SELECT')); + + # Average queries per minute + $graph_data{"$action"} .= "[$t, " . int(($a_dataavg{$action}{count}{"$rd"} || 0) / (60 * $avg_minutes)) . "],"; + if ($action eq 'SELECT') { + # Maxi queries per minute + $graph_data{"$action-max"} .= "[$t, " . ($a_dataavg{$action}{max}{"$rd"} || 0) . "],"; + # Mini queries per minute + $graph_data{"$action-min"} .= "[$t, " . ($a_dataavg{$action}{min}{"$rd"} || 0) . "],"; + # Average query duration + $graph_data{"$action-2"} .= "[$t, " . sprintf("%.3f", ($a_dataavg{$action}{duration}{"$rd"} || 0) / ($a_dataavg{$action}{count}{"$rd"} || 1)) . "]," if ($action eq 'SELECT'); + } else { + # Average query duration + $graph_data{"write"} .= "[$t, " . sprintf("%.3f", ($a_dataavg{write}{duration}{"$rd"} || 0) / ($a_dataavg{write}{count}{"$rd"} || 1)) . "],"; + } + } + } + if (!$disable_query && (scalar keys %p_dataavg> 0)) { + $graph_data{prepare} .= "[$t, " . ($p_dataavg{prepare}{"$rd"} || 0) . "],"; + $graph_data{execute} .= "[$t, " . ($p_dataavg{execute}{"$rd"} || 0) . "],"; + $graph_data{ratio_bind_prepare} .= "[$t, " . sprintf("%.2f", ($p_dataavg{execute}{"$rd"} || 0) / ($p_dataavg{prepare}{"$rd"} || 1)) . "],"; + } + } + } + } + foreach (keys %graph_data) { + $graph_data{$_} =~ s/,$//; + } + } + $drawn_graphs{'queriespersecond_graph'} = &flotr2_graph( $graphid++, 'queriespersecond_graph', $graph_data{'query-max'}, + $graph_data{query}, $graph_data{'query-min'}, 'Queries per second (' . $avg_minutes . ' minutes average)', + 'Queries per second', 'Maximum', 'Average', 'Minimum' + ); + + $drawn_graphs{'connectionspersecond_graph'} = &flotr2_graph( $graphid++, 'connectionspersecond_graph', $graph_data{conn_max}, + $graph_data{conn_avg}, $graph_data{conn_min}, 'Connections per second (' . $avg_minutes . ' minutes average)', + 'Connections per second', 'Maximum', 'Average', 'Minimum' + ); + + $drawn_graphs{'sessionspersecond_graph'} = &flotr2_graph( $graphid++, 'sessionspersecond_graph', $graph_data{sess_max}, + $graph_data{sess_avg}, $graph_data{sess_min}, 'Number of sessions (' . $avg_minutes . ' minutes average)', + 'Sessions', 'Maximum', 'Average', 'Minimum' + ); + + $drawn_graphs{'selectqueries_graph'} = &flotr2_graph( $graphid++, 'selectqueries_graph', $graph_data{"SELECT-max"}, + $graph_data{"SELECT"}, $graph_data{"SELECT-min"}, + 'SELECT queries (' . $avg_minutes . ' minutes period)', + 'Queries per second', 'Maximum', 'Average', 'Minimum' + ); + + $drawn_graphs{'writequeries_graph'} = &flotr2_graph( + $graphid++, 'writequeries_graph', $graph_data{"DELETE"}, $graph_data{"INSERT"}, $graph_data{"UPDATE"}, 'Write queries (' . $avg_minutes . ' minutes period)', + 'Queries', 'DELETE queries', 'INSERT queries', 'UPDATE queries' + ); + + if (!$select_only) { + $drawn_graphs{'durationqueries_graph'} = &flotr2_graph( + $graphid++, 'durationqueries_graph', $graph_data{query4}, $graph_data{"SELECT-2"}, $graph_data{write}, 'Average queries duration (' . $avg_minutes . ' minutes average)', + 'Duration', 'All queries', 'Select queries', 'Write queries' + ); + } else { + $drawn_graphs{'durationqueries_graph'} = &flotr2_graph( + $graphid++, 'durationqueries_graph', $graph_data{query4}, '', '', 'Average queries duration (' . $avg_minutes . 
' minutes average)', + 'Duration', 'Select queries' + ); + } + + $drawn_graphs{'bindpreparequeries_graph'} = &flotr2_graph( + $graphid++, 'bindpreparequeries_graph', $graph_data{prepare}, $graph_data{"execute"}, $graph_data{ratio_bind_prepare}, 'Bind versus prepare statements (' . $avg_minutes . ' minutes average)', + 'Number of statements', 'Prepare/Parse', 'Execute/Bind', 'Bind vs prepare' + ); + +} + +sub print_established_connection +{ + + my $connection_peak = 0; + my $connection_peak_date = ''; + foreach (sort {$overall_stat{'peak'}{$b}{connection} <=> $overall_stat{'peak'}{$a}{connection}} keys %{$overall_stat{'peak'}}) { + $connection_peak = &comma_numbers($overall_stat{'peak'}{$_}{connection}); + $connection_peak_date = $_; + last; + } + + print $fh qq{ +
+

Established Connections

+
+

Key values

+
+
    +
  • $connection_peak connections Connection Peak
  • +
  • $connection_peak_date Date
  • +
+
+
+
+

Connections per second ($avg_minutes minutes average)

+$drawn_graphs{connectionspersecond_graph} +
+
+}; + delete $drawn_graphs{connectionspersecond_graph}; + +} + +sub print_user_connection +{ + + my %infos = (); + my $total_count = 0; + my $c = 0; + my $conn_user_info = ''; + my @main_user = ('unknown',0); + foreach my $u (sort keys %{$connection_info{user}}) { + $conn_user_info .= "$u" . + &comma_numbers($connection_info{user}{$u}) . ""; + $total_count += $connection_info{user}{$u}; + if ($main_user[1] < $connection_info{user}{$u}) { + $main_user[0] = $u; + $main_user[1] = $connection_info{user}{$u}; + } + } + if ($graph) { + my @small = (); + foreach my $d (sort keys %{$connection_info{user}}) { + if ((($connection_info{user}{$d} * 100) / ($total_count||1)) > $pie_percentage_limit) { + $infos{$d} = $connection_info{user}{$d} || 0; + } else { + $infos{"Sum connections < $pie_percentage_limit%"} += $connection_info{user}{$d} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum connections < $pie_percentage_limit%"}; + delete $infos{"Sum connections < $pie_percentage_limit%"}; + } + } + $drawn_graphs{userconnections_graph} = &flotr2_piegraph($graphid++, 'userconnections_graph', 'Connections per user', %infos); + $total_count = &comma_numbers($total_count); + print $fh qq{ +
+

Connections per user

+
+

Key values

+
+
    +
  • $main_user[0] Main User
  • +
  • $total_count connections Total
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{userconnections_graph} +
+
+ + + + + + + + + $conn_user_info + +
User Count
+
+
+
+
+
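# Illustrative sketch, not part of this patch: the per-user, per-host and per-database
# pie charts above fold every slice whose share is below $pie_percentage_limit into a
# single "Sum connections < N%" slice, and restore the real name when only one small
# item was folded. With made-up data:
use strict;
use warnings;

my $pie_percentage_limit = 2;
my %connections = ( alice => 980, bob => 15, carol => 5 );

my $total = 0;
$total += $_ for values %connections;

my (%slices, @small);
foreach my $who (sort keys %connections) {
    if ((($connections{$who} * 100) / ($total || 1)) > $pie_percentage_limit) {
        $slices{$who} = $connections{$who};
    } else {
        $slices{"Sum connections < $pie_percentage_limit%"} += $connections{$who};
        push @small, $who;
    }
}
# A single small item keeps its own label instead of the generic one.
if (@small == 1) {
    $slices{ $small[0] } = delete $slices{"Sum connections < $pie_percentage_limit%"};
}
printf "%-35s %d\n", $_, $slices{$_} for sort keys %slices;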
+}; + delete $drawn_graphs{userconnections_graph}; +} + +sub print_host_connection +{ + my %infos = (); + my $total_count = 0; + my $c = 0; + my $conn_host_info = ''; + my @main_host = ('unknown',0); + foreach my $h (sort keys %{$connection_info{host}}) { + $conn_host_info .= "$h" . + &comma_numbers($connection_info{host}{$h}) . ""; + $total_count += $connection_info{host}{$h}; + if ($main_host[1] < $connection_info{host}{$h}) { + $main_host[0] = $h; + $main_host[1] = $connection_info{host}{$h}; + } + } + if ($graph) { + my @small = (); + foreach my $d (sort keys %{$connection_info{host}}) { + if ((($connection_info{host}{$d} * 100) / ($total_count||1)) > $pie_percentage_limit) { + $infos{$d} = $connection_info{host}{$d} || 0; + } else { + $infos{"Sum connections < $pie_percentage_limit%"} += $connection_info{host}{$d} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum connections < $pie_percentage_limit%"}; + delete $infos{"Sum connections < $pie_percentage_limit%"}; + } + } + $drawn_graphs{hostconnections_graph} = &flotr2_piegraph($graphid++, 'hostconnections_graph', 'Connections per host', %infos); + $total_count = &comma_numbers($total_count); + print $fh qq{ +
+

Connections per host

+
+

Key values

+
+
    +
  • $main_host[0] Main Host
  • +
  • $total_count connections Total
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{hostconnections_graph} +
+
+ + + + + + + + + $conn_host_info + +
Host Count
+
+
+
+
+
+}; + + delete $drawn_graphs{hostconnections_graph}; +} + +sub print_database_connection +{ + my %infos = (); + my $total_count = 0; + my $conn_database_info = ''; + my @main_database = ('unknown',0); + foreach my $d (sort keys %{$connection_info{database}}) { + $conn_database_info .= "$d " . + &comma_numbers($connection_info{database}{$d}) . ""; + $total_count += $connection_info{database}{$d}; + if ($main_database[1] < $connection_info{database}{$d}) { + $main_database[0] = $d; + $main_database[1] = $connection_info{database}{$d}; + } + foreach my $u (sort keys %{$connection_info{user}}) { + next if (!exists $connection_info{database_user}{$d}{$u}); + $conn_database_info .= " $u" . + &comma_numbers($connection_info{database_user}{$d}{$u}) . ""; + } + } + if ($graph) { + my @small = (); + foreach my $d (sort keys %{$connection_info{database}}) { + if ((($connection_info{database}{$d} * 100) / ($total_count||1)) > $pie_percentage_limit) { + $infos{$d} = $connection_info{database}{$d} || 0; + } else { + $infos{"Sum connections < $pie_percentage_limit%"} += $connection_info{database}{$d} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum connections < $pie_percentage_limit%"}; + delete $infos{"Sum connections < $pie_percentage_limit%"}; + } + } + $drawn_graphs{databaseconnections_graph} = &flotr2_piegraph($graphid++, 'databaseconnections_graph', 'Connections per database', %infos); + $total_count = &comma_numbers($total_count); + print $fh qq{ +
+

Connections per database

+
+

Key values

+
+
    +
  • $main_database[0] Main Database
  • +
  • $total_count connections Total
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{databaseconnections_graph} +
+
+ + + + + + + + + + $conn_database_info + +
Database User Count
+
+
+
+
+
+}; + delete $drawn_graphs{databaseconnections_graph}; +} + +sub print_simultaneous_session +{ + + my $session_peak = 0; + my $session_peak_date = ''; + foreach (sort {$overall_stat{'peak'}{$b}{session} <=> $overall_stat{'peak'}{$a}{session}} keys %{$overall_stat{'peak'}}) { + $session_peak = &comma_numbers($overall_stat{'peak'}{$_}{session}); + $session_peak_date = $_; + last; + } + + print $fh qq{ +
+

Simultaneous sessions

+
+

Key values

+
+
    +
  • $session_peak sessions Session Peak
  • +
  • $session_peak_date Date
  • +
+
+
+
+

Number of sessions ($avg_minutes minutes average)

+$drawn_graphs{sessionspersecond_graph} +
+
+}; + delete $drawn_graphs{sessionspersecond_graph}; + +} + + +sub print_user_session +{ + my %infos = (); + my $total_count = 0; + my $c = 0; + my $sess_user_info = ''; + my @main_user = ('unknown',0); + foreach my $u (sort keys %{$session_info{user}}) { + $sess_user_info .= "$u" . &comma_numbers($session_info{user}{$u}{count}) . + "" . &convert_time($session_info{user}{$u}{duration}), "" . + &convert_time($session_info{user}{$u}{duration} / $session_info{user}{$u}{count}) . + ""; + $total_count += $session_info{user}{$u}{count}; + if ($main_user[1] < $session_info{user}{$u}{count}) { + $main_user[0] = $u; + $main_user[1] = $session_info{user}{$u}{count}; + } + } + if ($graph) { + my @small = (); + foreach my $d (sort keys %{$session_info{user}}) { + if ((($session_info{user}{$d}{count} * 100) / ($total_count||1)) > $pie_percentage_limit) { + $infos{$d} = $session_info{user}{$d}{count} || 0; + } else { + $infos{"Sum sessions < $pie_percentage_limit%"} += $session_info{user}{$d}{count} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum sessions < $pie_percentage_limit%"}; + delete $infos{"Sum sessions < $pie_percentage_limit%"}; + } + } + $drawn_graphs{usersessions_graph} = &flotr2_piegraph($graphid++, 'usersessions_graph', 'Connections per user', %infos); + $sess_user_info = qq{$NODATA} if (!$total_count); + $total_count = &comma_numbers($total_count); + print $fh qq{ +
+

Sessions per user

+
+

Key values

+
+
    +
  • $main_user[0] Main User
  • +
  • $total_count sessions Total
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{usersessions_graph} +
+
+ + + + + + + + + + + $sess_user_info + +
User Count Total Duration Average Duration
+
+
+
+
+
+}; + delete $drawn_graphs{usersessions_graph}; +} + +sub print_host_session +{ + my %infos = (); + my $total_count = 0; + my $c = 0; + my $sess_host_info = ''; + my @main_host = ('unknown',0); + foreach my $h (sort keys %{$session_info{host}}) { + $sess_host_info .= "$h" . &comma_numbers($session_info{host}{$h}{count}) . + "" . &convert_time($session_info{host}{$h}{duration}) . "" . + &convert_time($session_info{host}{$h}{duration} / $session_info{host}{$h}{count}) . + ""; + $total_count += $session_info{host}{$h}{count}; + if ($main_host[1] < $session_info{host}{$h}{count}) { + $main_host[0] = $h; + $main_host[1] = $session_info{host}{$h}{count}; + } + } + if ($graph) { + my @small = (); + foreach my $d (sort keys %{$session_info{host}}) { + if ((($session_info{host}{$d}{count} * 100) / ($total_count||1)) > $pie_percentage_limit) { + $infos{$d} = $session_info{host}{$d}{count} || 0; + } else { + $infos{"Sum sessions < $pie_percentage_limit%"} += $session_info{host}{$d}{count} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum sessions < $pie_percentage_limit%"}; + delete $infos{"Sum sessions < $pie_percentage_limit%"}; + } + } + $drawn_graphs{hostsessions_graph} = &flotr2_piegraph($graphid++, 'hostsessions_graph', 'Connections per host', %infos); + $sess_host_info = qq{$NODATA} if (!$total_count); + $total_count = &comma_numbers($total_count); + print $fh qq{ +
+

Sessions per host

+
+

Key values

+
+
    +
  • $main_host[0] Main Host
  • +
  • $total_count sessions Total
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{hostsessions_graph} +
+
+ + + + + + + + + + + $sess_host_info + +
Host Count Total Duration Average Duration
+
+
+
+
+
}; + + delete $drawn_graphs{hostsessions_graph}; +} + +sub print_database_session +{ + my %infos = (); + my $total_count = 0; + my $sess_database_info = ''; + my @main_database = ('unknown',0); + foreach my $d (sort keys %{$session_info{database}}) { + $sess_database_info .= "$d" . &comma_numbers($session_info{database}{$d}{count}) . + "" . &convert_time($session_info{database}{$d}{duration}) . "" . + &convert_time($session_info{database}{$d}{duration} / $session_info{database}{$d}{count}) . + ""; + $total_count += $session_info{database}{$d}{count}; + if ($main_database[1] < $session_info{database}{$d}{count}) { + $main_database[0] = $d; + $main_database[1] = $session_info{database}{$d}{count}; + } } - if ($tempfile_info{count}) { - my $fmt_temp_maxsise = &comma_numbers($tempfile_info{maxsize}) || 0; - my $fmt_temp_avsize = &comma_numbers(sprintf("%.2f", ($tempfile_info{maxsize} / $tempfile_info{count}))); - print $fh qq{Number temporary file: $tempfile_info{count} -Max size of temporary file: $fmt_temp_maxsise -Average size of temporary file: $fmt_temp_avsize + if ($graph) { + my @small = (); + foreach my $d (sort keys %{$session_info{database}}) { + if ((($session_info{database}{$d}{count} * 100) / ($total_count||1)) > $pie_percentage_limit) { + $infos{$d} = $session_info{database}{$d}{count} || 0; + } else { + $infos{"Sum sessions < $pie_percentage_limit%"} += $session_info{database}{$d}{count} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum sessions < $pie_percentage_limit%"}; + delete $infos{"Sum sessions < $pie_percentage_limit%"}; + } + } + $drawn_graphs{databasesessions_graph} = &flotr2_piegraph($graphid++, 'databasesessions_graph', 'Connections per database', %infos); + $sess_database_info = qq{$NODATA} if (!$total_count); + + $total_count = &comma_numbers($total_count); + print $fh qq{ +
+

Sessions per database

+
+

Key values

+
+
    +
  • $main_database[0] Main Database
  • +
  • $total_count sessions Total
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{databasesessions_graph} +
+
+ + + + + + + + + + + + $sess_database_info + +
Database User Count Total Duration Average Duration
+
+
+
+
+
}; + delete $drawn_graphs{databasesessions_graph}; +} + +sub print_checkpoint +{ + + # checkpoint + my %graph_data = (); + if ($graph) { + foreach my $tm (sort {$a <=> $b} keys %per_minute_info) { + $tm =~ /(\d{4})(\d{2})(\d{2})/; + my $y = $1 - 1900; + my $mo = $2 - 1; + my $d = $3; + foreach my $h ("00" .. "23") { + next if (!exists $per_minute_info{$tm}{$h}); + my %chk_dataavg = (); + my %t_dataavg = (); + my %v_dataavg = (); + foreach my $m ("00" .. "59") { + next if (!exists $per_minute_info{$tm}{$h}{$m}); + + my $rd = &average_per_minutes($m, $avg_minutes); + + if ($checkpoint_info{wbuffer}) { + if (exists $per_minute_info{$tm}{$h}{$m}{checkpoint}) { + $chk_dataavg{wbuffer}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{checkpoint}{wbuffer} || 0); + $chk_dataavg{file_added}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{checkpoint}{file_added} || 0); + $chk_dataavg{file_removed}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{checkpoint}{file_removed} || 0); + $chk_dataavg{file_recycled}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{checkpoint}{file_recycled} || 0); + } + } + } + + foreach my $rd (@avgs) { + my $t = timegm_nocheck(0, $rd, $h, $d, $mo, $y) * 1000; + + next if ($t < $t_min); + last if ($t > $t_max); + + # Average of written checkpoints buffers and wal files + if (exists $chk_dataavg{wbuffer}) { + $graph_data{wbuffer} .= "[$t, " . ($chk_dataavg{wbuffer}{"$rd"} || 0) . "],"; + $graph_data{file_added} .= "[$t, " . ($chk_dataavg{file_added}{"$rd"} || 0) . "],"; + $graph_data{file_removed} .= "[$t, " . ($chk_dataavg{file_removed}{"$rd"} || 0) . "],"; + $graph_data{file_recycled} .= "[$t, " . ($chk_dataavg{file_recycled}{"$rd"} || 0) . "],"; + } + } + } + } + foreach (keys %graph_data) { + $graph_data{$_} =~ s/,$//; + } } - if (!$disable_session && $session_info{count}) { - my $avg_session_duration = &convert_time($session_info{duration} / $session_info{count}); - my $tot_session_duration = &convert_time($session_info{duration}); - print $fh qq{Total number of sessions: $session_info{count} -Total duration of sessions: $tot_session_duration -Average duration of sessions: $avg_session_duration + # Checkpoints buffers and files + $drawn_graphs{checkpointwritebuffers_graph} = + &flotr2_graph($graphid++, 'checkpointwritebuffers_graph', $graph_data{wbuffer}, '', '', + 'Checkpoint write buffers (' . $avg_minutes . ' minutes period)', + 'Buffers', 'Write buffers', '', '' + ); + $drawn_graphs{checkpointfiles_graph} = + &flotr2_graph($graphid++, 'checkpointfiles_graph', $graph_data{file_added}, + $graph_data{file_removed}, $graph_data{file_recycled}, 'Checkpoint Wal files usage', + 'Number of files', 'Added', 'Removed', 'Recycled' + ); + + my $checkpoint_wbuffer_peak = 0; + my $checkpoint_wbuffer_peak_date = ''; + foreach (sort {$overall_checkpoint{'peak'}{$b}{checkpoint_wbuffer} <=> $overall_checkpoint{'peak'}{$a}{checkpoint_wbuffer}} keys %{$overall_checkpoint{'peak'}}) { + $checkpoint_wbuffer_peak = &comma_numbers($overall_checkpoint{'peak'}{$_}{checkpoint_wbuffer}); + $checkpoint_wbuffer_peak_date = $_; + last; + } + my $walfile_usage_peak = 0; + my $walfile_usage_peak_date = ''; + foreach (sort {$overall_checkpoint{'peak'}{$b}{walfile_usage} <=> $overall_checkpoint{'peak'}{$a}{walfile_usage}} keys %{$overall_checkpoint{'peak'}}) { + $walfile_usage_peak = &comma_numbers($overall_checkpoint{'peak'}{$_}{walfile_usage}); + $walfile_usage_peak_date = $_; + last; + } + + print $fh qq{ +

Checkpoints / Restartpoints

+ +
+

Checkpoints Buffers

+
+

Key values

+
+
    +
  • $checkpoint_wbuffer_peak buffers Checkpoint Peak
  • +
  • $checkpoint_wbuffer_peak_date Date
  • +
  • $overall_checkpoint{checkpoint_write} seconds Highest write time
  • +
  • $overall_checkpoint{checkpoint_sync} seconds Highest sync time
  • +
+
+
+
+

Checkpoint write buffers ($avg_minutes minutes average)

+$drawn_graphs{checkpointwritebuffers_graph} +
+
+}; + delete $drawn_graphs{checkpointwritebuffers_graph}; + + print $fh qq{ +
+

Checkpoints Wal files

+
+

Key values

+
+
    +
  • $walfile_usage_peak files Wal files usage Peak
  • +
  • $walfile_usage_peak_date Date
  • +
+
+
+
+

Checkpoint Wal files usage

+$drawn_graphs{checkpointfiles_graph} +
+
}; + delete $drawn_graphs{checkpointfiles_graph}; + + my $buffers = ''; + my $files = ''; + my $warnings = ''; + foreach my $d (sort {$a <=> $b} keys %per_minute_info) { + $d =~ /^\d{4}(\d{2})(\d{2})$/; + my $zday = "$abbr_month{$1} $2"; + foreach my $h (sort {$a <=> $b} keys %{$per_minute_info{$d}}) { + $buffers .= "$zday$h"; + $files .= "$zday$h"; + $warnings .= "$zday$h"; + $zday = ''; + my %cinf = (); + my %rinf = (); + my %cainf = (); + my %rainf = (); + foreach my $m (keys %{$per_minute_info{$d}{$h}}) { + + if (exists $per_minute_info{$d}{$h}{$m}{checkpoint}) { + $cinf{wbuffer} += $per_minute_info{$d}{$h}{$m}{checkpoint}{wbuffer}; + $cinf{file_added} += $per_minute_info{$d}{$h}{$m}{checkpoint}{file_added}; + $cinf{file_removed} += $per_minute_info{$d}{$h}{$m}{checkpoint}{file_removed}; + $cinf{file_recycled} += $per_minute_info{$d}{$h}{$m}{checkpoint}{file_recycled}; + $cinf{write} += $per_minute_info{$d}{$h}{$m}{checkpoint}{write}; + $cinf{sync} += $per_minute_info{$d}{$h}{$m}{checkpoint}{sync}; + $cinf{total} += $per_minute_info{$d}{$h}{$m}{checkpoint}{total}; + $cainf{sync_files} += $per_minute_info{$d}{$h}{$m}{checkpoint}{sync_files}; + $cainf{sync_avg} += $per_minute_info{$d}{$h}{$m}{checkpoint}{sync_avg}; + $cainf{sync_longest} = $per_minute_info{$d}{$h}{$m}{checkpoint}{sync_longest} + if ($per_minute_info{$d}{$h}{$m}{checkpoint}{sync_longest} > $cainf{sync_longest}); + } + if (exists $per_minute_info{$d}{$h}{$m}{checkpoint}{warning}) { + $cinf{warning} += $per_minute_info{$d}{$h}{$m}{checkpoint}{warning}; + $cinf{warning_seconds} += $per_minute_info{$d}{$h}{$m}{checkpoint}{warning_seconds}; + } + } + if (scalar keys %cinf) { + $buffers .= "" . &comma_numbers($cinf{wbuffer}) . + "" . &comma_numbers($cinf{write}) . + "" . &comma_numbers($cinf{sync}) . + "" . &comma_numbers($cinf{total}) . + ""; + $files .= "" . &comma_numbers($cinf{file_added}) . + "" . &comma_numbers($cinf{file_removed}) . + "" . &comma_numbers($cinf{file_recycled}) . + "" . &comma_numbers($cainf{sync_files}) . + "" . &comma_numbers($cainf{sync_longest}) . + "" . &comma_numbers($cainf{sync_avg}) . + ""; + } else { + $buffers .= "0000"; + $files .= "000000"; + } + if (exists $cinf{warning}) { + $warnings .= "" . &comma_numbers($cinf{warning}) . "" . + &comma_numbers(sprintf( "%.2f", ($cinf{warning_seconds} || 0) / ($cinf{warning} || 1))) . + ""; + } else { + $warnings .= "00"; + } + } } - if (!$disable_connection && $connection_info{count}) { - print $fh "Total number of connections: $connection_info{count}\n"; + + $buffers = qq{$NODATA} if (!$buffers); + $files = qq{$NODATA} if (!$files); + $warnings = qq{$NODATA} if (!$warnings); + + print $fh qq{ +
+

Checkpoints Activity

+
+ +
+
+ + + + + + + + + + + + $buffers + +
<th>Day</th><th>Hour</th><th>Written buffers</th><th>Write time</th><th>Sync time</th><th>Total time</th>
+
+
+ + + + + + + + + + + + + + $files + +
<th>Day</th><th>Hour</th><th>Added</th><th>Removed</th><th>Recycled</th><th>Synced files</th><th>Longest sync</th><th>Average sync</th>
+
+
+ + + + + + + + + + $warnings + +
<th>Day</th><th>Hour</th><th>Count</th><th>Avg time (sec)</th>
+
+
+ Back to the top of the Checkpoint Activity table +
+ +
+}; + +} + +sub print_temporary_file +{ + + # checkpoint + my %graph_data = (); + if ($graph) { + foreach my $tm (sort {$a <=> $b} keys %per_minute_info) { + $tm =~ /(\d{4})(\d{2})(\d{2})/; + my $y = $1 - 1900; + my $mo = $2 - 1; + my $d = $3; + foreach my $h ("00" .. "23") { + next if (!exists $per_minute_info{$tm}{$h}); + my %chk_dataavg = (); + my %t_dataavg = (); + my %v_dataavg = (); + foreach my $m ("00" .. "59") { + next if (!exists $per_minute_info{$tm}{$h}{$m}); + my $rd = &average_per_minutes($m, $avg_minutes); + if ($tempfile_info{count}) { + if (exists $per_minute_info{$tm}{$h}{$m}{tempfile}) { + $t_dataavg{size}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{tempfile}{size} || 0); + $t_dataavg{count}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{tempfile}{count} || 0); + } + } + } + + foreach my $rd (@avgs) { + my $t = timegm_nocheck(0, $rd, $h, $d, $mo, $y) * 1000; + + next if ($t < $t_min); + last if ($t > $t_max); + + if (exists $t_dataavg{size}) { + $graph_data{size} .= "[$t, " . ($t_dataavg{size}{"$rd"} || 0) . "],"; + $graph_data{count} .= "[$t, " . ($t_dataavg{count}{"$rd"} || 0) . "],"; + } + } + } + } + foreach (keys %graph_data) { + $graph_data{$_} =~ s/,$//; + } } - if (!$disable_hourly) { - print $fh qq{ + # Temporary file size + $drawn_graphs{temporarydata_graph} = + &flotr2_graph($graphid++, 'temporarydata_graph', $graph_data{size}, '', '', + 'Size of temporary files (' . $avg_minutes . ' minutes period)', + 'Size of files', 'Size of files' + ); + # Temporary file number + $drawn_graphs{temporaryfile_graph} = + &flotr2_graph($graphid++, 'temporaryfile_graph', $graph_data{count}, '', '', + 'Number of temporary files (' . $avg_minutes . ' minutes period)', + 'Number of files', 'Number of files' + ); -- Hourly statistics ---------------------------------------------------- + my $tempfile_size_peak = 0; + my $tempfile_size_peak_date = ''; + foreach (sort {$overall_stat{'peak'}{$b}{tempfile_size} <=> $overall_stat{'peak'}{$a}{tempfile_size}} keys %{$overall_stat{'peak'}}) { + $tempfile_size_peak = &pretty_print_size($overall_stat{'peak'}{$_}{tempfile_size}); + $tempfile_size_peak_date = $_; + last; + } + print $fh qq{ +

Temporary Files

-Report not supported by text format +
+

Size of temporary files

+
+

Key values

+
+
    +
  • $tempfile_size_peak Temp Files Peak
  • +
  • $tempfile_size_peak_date Date
  • +
+
+
+
+

Size of temporary files ($avg_minutes minutes average)

+$drawn_graphs{temporarydata_graph} +
+
+}; + delete $drawn_graphs{temporarydata_graph}; + my $tempfile_count_peak = 0; + my $tempfile_count_peak_date = ''; + foreach (sort {$overall_stat{'peak'}{$b}{tempfile_count} <=> $overall_stat{'peak'}{$a}{tempfile_count}} keys %{$overall_stat{'peak'}}) { + $tempfile_count_peak = &comma_numbers($overall_stat{'peak'}{$_}{tempfile_count}); + $tempfile_count_peak_date = $_; + last; + } + print $fh qq{ +
+

Number of temporary files

+
+

Key values

+
+
    +
  • $tempfile_count_peak per second Temp Files Peak
  • +
  • $tempfile_count_peak_date Date
  • +
+
+
+
+

Number of temporary files ($avg_minutes minutes average)

+$drawn_graphs{temporaryfile_graph} +
+
}; + delete $drawn_graphs{temporaryfile_graph}; + + my $tempfiles_activity = ''; + foreach my $d (sort {$a <=> $b} keys %per_minute_info) { + $d =~ /^\d{4}(\d{2})(\d{2})$/; + my $zday = "$abbr_month{$1} $2"; + foreach my $h (sort {$a <=> $b} keys %{$per_minute_info{$d}}) { + $tempfiles_activity .= "$zday$h"; + $zday = ""; + my %tinf = (); + foreach my $m (keys %{$per_minute_info{$d}{$h}}) { + if (exists $per_minute_info{$d}{$h}{$m}{tempfile}) { + $tinf{size} += $per_minute_info{$d}{$h}{$m}{tempfile}{size}; + $tinf{count} += $per_minute_info{$d}{$h}{$m}{tempfile}{count}; + } + } + if (scalar keys %tinf) { + my $temp_average = &pretty_print_size(sprintf("%.2f", $tinf{size} / $tinf{count})); + $tempfiles_activity .= "" . &comma_numbers($tinf{count}) . + "$temp_average"; + } else { + $tempfiles_activity .= "00"; + } + } } - if (!$disable_type) { + $tempfiles_activity = qq{$NODATA} if (!$tempfiles_activity); - # INSERT/DELETE/UPDATE/SELECT repartition - my $totala = $overall_stat{'SELECT'} + $overall_stat{'INSERT'} + $overall_stat{'UPDATE'} + $overall_stat{'DELETE'}; - my $total = $overall_stat{'queries_number'}; - print $fh "\n- Queries by type ------------------------------------------------------\n\n"; - print $fh "Type Count Percentage\n"; - print $fh "SELECT: ", &comma_numbers($overall_stat{'SELECT'}) || 0, " ", - sprintf("%0.2f", ($overall_stat{'SELECT'} * 100) / $total), "%\n"; - print $fh "INSERT: ", &comma_numbers($overall_stat{'INSERT'}) || 0, " ", - sprintf("%0.2f", ($overall_stat{'INSERT'} * 100) / $total), "%\n"; - print $fh "UPDATE: ", &comma_numbers($overall_stat{'UPDATE'}) || 0, " ", - sprintf("%0.2f", ($overall_stat{'UPDATE'} * 100) / $total), "%\n"; - print $fh "DELETE: ", &comma_numbers($overall_stat{'DELETE'}) || 0, " ", - sprintf("%0.2f", ($overall_stat{'DELETE'} * 100) / $total), "%\n"; - print $fh "OTHERS: ", &comma_numbers($total - $totala) || 0, " ", sprintf("%0.2f", (($total - $totala) * 100) / $total), "%\n" - if (($total - $totala) > 0); - print $fh "\n"; + print $fh qq{ +
+

Temporary Files Activity

+
+ +
+
+ + + + + + + + + + $tempfiles_activity + +
<th>Day</th><th>Hour</th><th>Count</th><th>Average size</th>
+
+
+ Back to the top of the Temporary Files Activity table +
+ +
+}; + +} + +sub print_analyze_per_table +{ + # ANALYZE stats per table + my %infos = (); + my $total_count = 0; + my $analyze_info = ''; + my @main_analyze = ('unknown',0); + foreach my $t (sort {$autoanalyze_info{tables}{$b}{analyzes} <=> $autoanalyze_info{tables}{$a}{analyzes}} keys %{$autoanalyze_info{tables}}) { + $analyze_info .= "$t" . $autoanalyze_info{tables}{$t}{analyzes} . + ""; + $total_count += $autoanalyze_info{tables}{$t}{analyzes}; + if ($main_analyze[1] < $autoanalyze_info{tables}{$t}{analyzes}) { + $main_analyze[0] = $t; + $main_analyze[1] = $autoanalyze_info{tables}{$t}{analyzes}; + } + } + $analyze_info .= "Total" . &comma_numbers($total_count) . ""; + + if ($graph) { + my @small = (); + foreach my $d (sort keys %{$autoanalyze_info{tables}}) { + if ((($autoanalyze_info{tables}{$d}{analyzes} * 100) / ($total_count||1)) > $pie_percentage_limit) { + $infos{$d} = $autoanalyze_info{tables}{$d}{analyzes} || 0; + } else { + $infos{"Sum analyzes < $pie_percentage_limit%"} += $autoanalyze_info{tables}{$d}{analyzes} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum analyzes < $pie_percentage_limit%"}; + delete $infos{"Sum analyzes < $pie_percentage_limit%"}; + } + } + $drawn_graphs{tableanalyzes_graph} = &flotr2_piegraph($graphid++, 'tableanalyzes_graph', 'Analyzes per tables', %infos); + $total_count = &comma_numbers($total_count); + my $database = ''; + if ($main_analyze[0] =~ s/^([^\.]+)\.//) { + $database = $1; } - if (!$disable_lock && scalar keys %lock_info > 0) { - print $fh "\n- Locks by type ------------------------------------------------------\n\n"; - print $fh "Type Object Count Total Duration Av. duration (s)\n"; - my $total_count = 0; - my $total_duration = 0; - foreach my $t (sort keys %lock_info) { - print $fh "$t\t\t", &comma_numbers($lock_info{$t}{count}), " ", &convert_time($lock_info{$t}{duration}), " ", - &convert_time($lock_info{$t}{duration} / $lock_info{$t}{count}), "\n"; - foreach my $o (sort keys %{$lock_info{$t}}) { - next if (($o eq 'count') || ($o eq 'duration') || ($o eq 'chronos')); - print $fh "\t$o\t", &comma_numbers($lock_info{$t}{$o}{count}), " ", &convert_time($lock_info{$t}{$o}{duration}), " ", - &convert_time($lock_info{$t}{$o}{duration} / $lock_info{$t}{$o}{count}), "\n"; + $analyze_info = qq{$NODATA} if (!$total_count); + + print $fh qq{ +
+

Analyses per table

+
+

Key values

+
+
    +
  • $main_analyze[0] ($main_analyze[1]) Main table analyzed (database $database)
  • +
  • $total_count analyzes Total
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{tableanalyzes_graph} +
+
+ + + + + + + + + $analyze_info + +
<th>Table</th><th>Number of analyzes</th>
+
+
+
+
+
+}; + delete $drawn_graphs{tableanalyzes_graph}; + +} + +sub print_vacuum +{ + + # checkpoint + my %graph_data = (); + foreach my $tm (sort {$a <=> $b} keys %per_minute_info) { + $tm =~ /(\d{4})(\d{2})(\d{2})/; + my $y = $1 - 1900; + my $mo = $2 - 1; + my $d = $3; + foreach my $h ("00" .. "23") { + next if (!exists $per_minute_info{$tm}{$h}); + my %chk_dataavg = (); + my %t_dataavg = (); + my %v_dataavg = (); + foreach my $m ("00" .. "59") { + next if (!exists $per_minute_info{$tm}{$h}{$m}); + + my $rd = &average_per_minutes($m, $avg_minutes); + + if (exists $per_minute_info{$tm}{$h}{$m}{autovacuum}) { + $v_dataavg{vcount}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{autovacuum}{count} || 0); + } + if (exists $per_minute_info{$tm}{$h}{$m}{autoanalyze}) { + $v_dataavg{acount}{"$rd"} += ($per_minute_info{$tm}{$h}{$m}{autoanalyze}{count} || 0); + } + } + + foreach my $rd (@avgs) { + my $t = timegm_nocheck(0, $rd, $h, $d, $mo, $y) * 1000; + + next if ($t < $t_min); + last if ($t > $t_max); + + if (exists $v_dataavg{vcount}) { + $graph_data{vcount} .= "[$t, " . ($v_dataavg{vcount}{"$rd"} || 0) . "],"; + } + if (exists $v_dataavg{acount}) { + $graph_data{acount} .= "[$t, " . ($v_dataavg{acount}{"$rd"} || 0) . "],"; + } + } - $total_count += $lock_info{$t}{count}; - $total_duration += $lock_info{$t}{duration}; } - print $fh "Total:\t\t\t", &comma_numbers($total_count), " ", &convert_time($total_duration), " ", - &convert_time($total_duration / ($total_count || 1)), "\n"; + } + foreach (keys %graph_data) { + $graph_data{$_} =~ s/,$//; } - # Show session per database statistics - if (!$disable_session && exists $session_info{database}) { - print $fh "\n- Sessions per database ------------------------------------------------------\n\n"; - print $fh "Database Count Total Duration Av. duration (s)\n"; - foreach my $d (sort keys %{$session_info{database}}) { - print $fh "$d - ", &comma_numbers($session_info{database}{$d}{count}), " ", - &convert_time($session_info{database}{$d}{duration}), " ", - &convert_time($session_info{database}{$d}{duration} / $session_info{database}{$d}{count}), "\n"; + # VACUUMs vs ANALYZEs chart + $drawn_graphs{autovacuum_graph} = + &flotr2_graph($graphid++, 'autovacuum_graph', $graph_data{vcount}, $graph_data{acount}, + '', 'Autovacuum actions (' . $avg_minutes . ' minutes period)', '', 'VACUUMs', 'ANALYZEs' + ); + + my $vacuum_size_peak = 0; + my $vacuum_size_peak_date = ''; + foreach (sort {$overall_stat{'peak'}{$b}{vacuum_size} <=> $overall_stat{'peak'}{$a}{vacuum_size}} keys %{$overall_stat{'peak'}}) { + $vacuum_size_peak = &comma_numbers($overall_stat{'peak'}{$_}{vacuum_size}); + $vacuum_size_peak_date = $_; + last; + } + my $autovacuum_peak_system_usage_db = ''; + if ($autovacuum_info{peak}{system_usage}{table} =~ s/^([^\.]+)\.//) { + $autovacuum_peak_system_usage_db = $1; + } + my $autoanalyze_peak_system_usage_db = ''; + if ($autoanalyze_info{peak}{system_usage}{table} =~ s/^([^\.]+)\.//) { + $autoanalyze_peak_system_usage_db = $1; + } + $autovacuum_info{peak}{system_usage}{elapsed} ||= 0; + $autoanalyze_info{peak}{system_usage}{elapsed} ||= 0; + print $fh qq{ +

Vacuums

+ +
+

Vacuums / Analyzes Distribution

+
+

Key values

+
+
    +
  • $autovacuum_info{peak}{system_usage}{elapsed} sec Most CPU costly vacuum
    Table $autovacuum_info{peak}{system_usage}{table}
    Database $autovacuum_peak_system_usage_db
  • +
  • $autovacuum_info{peak}{system_usage}{date} Date
  • +
  • $autoanalyze_info{peak}{system_usage}{elapsed} sec Most CPU costly analyze
    Table $autoanalyze_info{peak}{system_usage}{table}
    Database $autoanalyze_peak_system_usage_db
  • +
  • $autoanalyze_info{peak}{system_usage}{date} Date
  • +
+
+
+
+

Autovacuum actions ($avg_minutes minutes average)

+$drawn_graphs{autovacuum_graph} +
+
+}; + delete $drawn_graphs{autovacuum_graph}; + + # ANALYZE stats per table + &print_analyze_per_table(); + + # VACUUM stats per table + &print_vacuum_per_table(); + + # Show tuples and pages removed per table + &print_vacuum_tuple_removed; + &print_vacuum_page_removed; + + my $vacuum_activity = ''; + foreach my $d (sort {$a <=> $b} keys %per_minute_info) { + my $c = 1; + $d =~ /^\d{4}(\d{2})(\d{2})$/; + my $zday = "$abbr_month{$1} $2"; + foreach my $h (sort {$a <=> $b} keys %{$per_minute_info{$d}}) { + $vacuum_activity .= "$zday$h"; + $zday = ""; + my %ainf = (); + foreach my $m (keys %{$per_minute_info{$d}{$h}}) { + + if (exists $per_minute_info{$d}{$h}{$m}{autovacuum}{count}) { + $ainf{vcount} += $per_minute_info{$d}{$h}{$m}{autovacuum}{count}; + } + if (exists $per_minute_info{$d}{$h}{$m}{autoanalyze}{count}) { + $ainf{acount} += $per_minute_info{$d}{$h}{$m}{autoanalyze}{count}; + } + + } + if (scalar keys %ainf) { + $vacuum_activity .= "" . &comma_numbers($ainf{vcount}) . ""; + } else { + $vacuum_activity .= "0"; + } + if (scalar keys %ainf) { + $vacuum_activity .= "" . &comma_numbers($ainf{acount}) . ""; + } else { + $vacuum_activity .= "0"; + } } } - # Show session per user statistics - if (!$disable_session && exists $session_info{user}) { - print $fh "\n- Sessions per user ------------------------------------------------------\n\n"; - print $fh "User Count Total Duration Av. duration (s)\n"; - foreach my $d (sort keys %{$session_info{user}}) { - print $fh "$d - ", &comma_numbers($session_info{user}{$d}{count}), " ", &convert_time($session_info{user}{$d}{duration}), - " ", &convert_time($session_info{user}{$d}{duration} / $session_info{user}{$d}{count}), "\n"; + $vacuum_activity = qq{$NODATA} if (!$vacuum_activity); + + print $fh qq{ +
+

Autovacuum Activity

+
+ +
+
+ + + + + + + + + + $vacuum_activity + +
<th>Day</th><th>Hour</th><th>VACUUMs</th><th>ANALYZEs</th>
+
+
+ Back to the top of the Autovacuum Activity table +
+ +
+}; + +} + +sub print_vacuum_per_table +{ + # VACUUM stats per table + my $total_count = 0; + my $total_idxscan = 0; + my $vacuum_info = ''; + my @main_vacuum = ('unknown',0); + foreach my $t (sort {$autovacuum_info{tables}{$b}{vacuums} <=> $autovacuum_info{tables}{$a}{vacuums}} keys %{$autovacuum_info{tables}}) { + $vacuum_info .= "$t" . $autovacuum_info{tables}{$t}{vacuums} . + "" . $autovacuum_info{tables}{$t}{idxscans} . + ""; + $total_count += $autovacuum_info{tables}{$t}{vacuums}; + $total_idxscan += $autovacuum_info{tables}{$t}{idxscans}; + if ($main_vacuum[1] < $autovacuum_info{tables}{$t}{vacuums}) { + $main_vacuum[0] = $t; + $main_vacuum[1] = $autovacuum_info{tables}{$t}{vacuums}; + } + } + $vacuum_info .= "Total" . &comma_numbers($total_count) . "" . &comma_numbers($total_idxscan) . ""; + + my %infos = (); + my @small = (); + foreach my $d (sort keys %{$autovacuum_info{tables}}) { + if ((($autovacuum_info{tables}{$d}{vacuums} * 100) / ($total_count||1)) > $pie_percentage_limit) { + $infos{$d} = $autovacuum_info{tables}{$d}{vacuums} || 0; + } else { + $infos{"Sum vacuums < $pie_percentage_limit%"} += $autovacuum_info{tables}{$d}{vacuums} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum vacuums < $pie_percentage_limit%"}; + delete $infos{"Sum vacuums < $pie_percentage_limit%"}; + } + $drawn_graphs{tablevacuums_graph} = &flotr2_piegraph($graphid++, 'tablevacuums_graph', 'Analyzes per tables', %infos); + $vacuum_info = qq{$NODATA} if (!$total_count); + $total_count = &comma_numbers($total_count); + my $database = ''; + if ($main_vacuum[0] =~ s/^([^\.]+)\.//) { + $database = $1; + } + print $fh qq{ +
+

Vacuums per table

+
+

Key values

+
+
    +
  • $main_vacuum[0] ($main_vacuum[1]) Main table vacuumed on database $database
  • +
  • $total_count vacuums Total
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{tablevacuums_graph} +
+
+ + + + + + + + + + $vacuum_info + +
<th>Table</th><th>Number of vacuums</th><th>Index scans</th>
+
+
+
+
+
+}; + delete $drawn_graphs{tablevacuums_graph}; + +} + +sub print_vacuum_tuple_removed +{ + # VACUUM stats per table + my $total_count = 0; + my $total_idxscan = 0; + my $total_tuple = 0; + my $total_page = 0; + my $vacuum_info = ''; + my @main_tuple = ('unknown',0); + foreach my $t (sort {$autovacuum_info{tables}{$b}{tuples}{removed} <=> $autovacuum_info{tables}{$a}{tuples}{removed}} keys %{$autovacuum_info{tables}}) { + $vacuum_info .= "$t" . $autovacuum_info{tables}{$t}{vacuums} . + "" . $autovacuum_info{tables}{$t}{idxscans} . + "" . $autovacuum_info{tables}{$t}{tuples}{removed} . + "" . $autovacuum_info{tables}{$t}{pages}{removed} . + ""; + $total_count += $autovacuum_info{tables}{$t}{vacuums}; + $total_idxscan += $autovacuum_info{tables}{$t}{idxscans}; + $total_tuple += $autovacuum_info{tables}{$t}{tuples}{removed}; + $total_page += $autovacuum_info{tables}{$t}{pages}{removed}; + if ($main_tuple[1] < $autovacuum_info{tables}{$t}{tuples}{removed}) { + $main_tuple[0] = $t; + $main_tuple[1] = $autovacuum_info{tables}{$t}{tuples}{removed}; + } + } + $vacuum_info .= "Total" . &comma_numbers($total_count) . "" . &comma_numbers($total_idxscan) . + "" . &comma_numbers($total_tuple) . "" . &comma_numbers($total_page) . ""; + + my %infos_tuple = (); + my @small = (); + foreach my $d (sort keys %{$autovacuum_info{tables}}) { + if ((($autovacuum_info{tables}{$d}{tuples}{removed} * 100) / ($total_tuple||1)) > $pie_percentage_limit) { + $infos_tuple{$d} = $autovacuum_info{tables}{$d}{tuples}{removed} || 0; + } else { + $infos_tuple{"Sum tuples removed < $pie_percentage_limit%"} += $autovacuum_info{tables}{$d}{tuples}{removed} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos_tuple{$small[0]} = $infos_tuple{"Sum tuples removed < $pie_percentage_limit%"}; + delete $infos_tuple{"Sum tuples removed < $pie_percentage_limit%"}; + } + $drawn_graphs{tuplevacuums_graph} = &flotr2_piegraph($graphid++, 'tuplevacuums_graph', 'Tuples removed per tables', %infos_tuple); + $vacuum_info = qq{$NODATA} if (!$total_count); + $total_count = &comma_numbers($total_count); + my $database = ''; + if ($main_tuple[0] =~ s/^([^\.]+)\.//) { + $database = $1; + } + print $fh qq{ +
+

Tuples removed per table

+
+

Key values

+
+
    +
  • $main_tuple[0] ($main_tuple[1]) Main table with removed tuples on database $database
  • +
  • $total_tuple tuples Total removed
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{tuplevacuums_graph} +
+
+ + + + + + + + + + + + $vacuum_info + +
<th>Table</th><th>Number of vacuums</th><th>Index scans</th><th>Tuples removed</th><th>Pages removed</th>
+
+
+
+
+
+}; + delete $drawn_graphs{tuplevacuums_graph}; + +} + +sub print_vacuum_page_removed +{ + # VACUUM stats per table + my $total_count = 0; + my $total_idxscan = 0; + my $total_tuple = 0; + my $total_page = 0; + my $vacuum_info = ''; + my @main_tuple = ('unknown',0); + my @main_page = ('unknown',0); + foreach my $t (sort {$autovacuum_info{tables}{$b}{pages}{removed} <=> $autovacuum_info{tables}{$a}{pages}{removed}} keys %{$autovacuum_info{tables}}) { + $vacuum_info .= "$t" . $autovacuum_info{tables}{$t}{vacuums} . + "" . $autovacuum_info{tables}{$t}{idxscans} . + "" . $autovacuum_info{tables}{$t}{tuples}{removed} . + "" . $autovacuum_info{tables}{$t}{pages}{removed} . + ""; + $total_count += $autovacuum_info{tables}{$t}{vacuums}; + $total_idxscan += $autovacuum_info{tables}{$t}{idxscans}; + $total_tuple += $autovacuum_info{tables}{$t}{tuples}{removed}; + $total_page += $autovacuum_info{tables}{$t}{pages}{removed}; + if ($main_page[1] < $autovacuum_info{tables}{$t}{pages}{removed}) { + $main_page[0] = $t; + $main_page[1] = $autovacuum_info{tables}{$t}{pages}{removed}; + } + } + $vacuum_info .= "Total" . &comma_numbers($total_count) . "" . &comma_numbers($total_idxscan) . + "" . &comma_numbers($total_tuple) . "" . &comma_numbers($total_page) . ""; + + my %infos_page = (); + my @small = (); + foreach my $d (sort keys %{$autovacuum_info{tables}}) { + if ((($autovacuum_info{tables}{$d}{pages}{removed} * 100) / ($total_page || 1)) > $pie_percentage_limit) { + $infos_page{$d} = $autovacuum_info{tables}{$d}{pages}{removed} || 0; + } else { + $infos_page{"Sum pages removed < $pie_percentage_limit%"} += $autovacuum_info{tables}{$d}{pages}{removed} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos_page{$small[0]} = $infos_page{"Sum pages removed < $pie_percentage_limit%"}; + delete $infos_page{"Sum pages removed < $pie_percentage_limit%"}; + } + $drawn_graphs{pagevacuums_graph} = &flotr2_piegraph($graphid++, 'pagevacuums_graph', 'Tuples removed per tables', %infos_page); + $vacuum_info = qq{$NODATA} if (!$total_count); + $total_count = &comma_numbers($total_count); + my $database = ''; + if ($main_page[0] =~ s/^([^\.]+)\.//) { + $database = $1; + } + print $fh qq{ +
+

Pages removed per table

+
+

Key values

+
+
    +
  • $main_page[0] ($main_page[1]) Main table with removed pages on database $database
  • +
  • $total_page pages Total removed
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{pagevacuums_graph} +
+
+ + + + + + + + + + + + $vacuum_info + +
<th>Table</th><th>Number of vacuums</th><th>Index scans</th><th>Tuples removed</th><th>Pages removed</th>
+
+
+
+
+
+}; + delete $drawn_graphs{pagevacuums_graph}; + +} + +sub print_lock_type +{ + my %locktype = (); + my $total_count = 0; + my $total_duration = 0; + my $locktype_info = ''; + my @main_locktype = ('unknown',0); + foreach my $t (sort keys %lock_info) { + $locktype_info .= "$t" . &comma_numbers($lock_info{$t}{count}) . + "" . &convert_time($lock_info{$t}{duration}) . "" . + &convert_time($lock_info{$t}{duration} / ($lock_info{$t}{count} || 1)) . ""; + $total_count += $lock_info{$t}{count}; + $total_duration += $lock_info{$t}{duration}; + if ($main_locktype[1] < $lock_info{$t}{count}) { + $main_locktype[0] = $t; + $main_locktype[1] = $lock_info{$t}{count}; + } + foreach my $o (sort keys %{$lock_info{$t}}) { + next if (($o eq 'count') || ($o eq 'duration') || ($o eq 'chronos')); + $locktype_info .= "$o" . &comma_numbers($lock_info{$t}{$o}{count}) . + "" . &convert_time($lock_info{$t}{$o}{duration}) . "" . + &convert_time($lock_info{$t}{$o}{duration} / $lock_info{$t}{$o}{count}) . + "\n"; + } + } + if ($total_count > 0) { + $locktype_info .= "Total" . &comma_numbers($total_count) . + "" . &convert_time($total_duration) . "" . + &convert_time($total_duration / ($total_count || 1)) . ""; + } else { + $locktype_info = qq{$NODATA}; + } + if ($graph) { + my @small = (); + foreach my $d (sort keys %lock_info) { + if ((($lock_info{$d}{count} * 100) / ($total_count||1)) > $pie_percentage_limit) { + $locktype{$d} = $lock_info{$d}{count} || 0; + } else { + $locktype{"Sum lock types < $pie_percentage_limit%"} += $lock_info{$d}{count} || 0; + push(@small, $d); + + } + } + if ($#small == 0) { + $locktype{$small[0]} = $locktype{"Sum types < $pie_percentage_limit%"}; + delete $locktype{"Sum lock types < $pie_percentage_limit%"}; + } + } + $drawn_graphs{lockbytype_graph} = &flotr2_piegraph($graphid++, 'lockbytype_graph', 'Type of locks', %locktype); + $total_count = &comma_numbers($total_count); + print $fh qq{ +

Locks

+
+

Locks by types

+
+

Key values

+
+
    +
  • $main_locktype[0] Main Lock Type
  • +
  • $total_count locks Total
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{lockbytype_graph} +
+
+ + + + + + + + + + + + $locktype_info + +
<th>Type</th><th>Object</th><th>Count</th><th>Total Duration</th><th>Average Duration (s)</th>
+
+
+
+
+
+}; + delete $drawn_graphs{lockbytype_graph}; +} + +sub print_query_type +{ + + my %data = (); + my $total_queries = 0; + my $total_select = 0; + my $total_write = 0; + foreach my $a (@SQL_ACTION) { + $total_queries += $overall_stat{$a}; + if ($a eq 'SELECT') { + $total_select += $overall_stat{$a}; + } elsif ($a ne 'OTHERS') { + $total_write += $overall_stat{$a}; + } + } + my $total = $overall_stat{'queries_number'}; + + my $querytype_info = ''; + foreach my $a (@SQL_ACTION) { + $querytype_info .= "$a" . &comma_numbers($overall_stat{$a}) . + "" . sprintf("%0.2f", ($overall_stat{$a} * 100) / ($total||1)) . "%"; + } + if (($total - $total_queries) > 0) { + $querytype_info .= "OTHERS" . &comma_numbers($total - $total_queries) . + "" . sprintf("%0.2f", (($total - $total_queries) * 100) / ($total||1)) . "%"; + } + $querytype_info = qq{$NODATA} if (!$total); + + if ($graph && $total) { + foreach my $t (@SQL_ACTION) { + if ((($overall_stat{$t} * 100) / ($total||1)) > $pie_percentage_limit) { + $data{$t} = $overall_stat{$t} || 0; + } else { + $data{"Sum query types < $pie_percentage_limit%"} += $overall_stat{$t} || 0; + } + } + if (((($total - $total_queries) * 100) / ($total||1)) > $pie_percentage_limit) { + $data{'Others'} = $total - $total_queries; + } else { + $data{"Sum query types < $pie_percentage_limit%"} += $total - $total_queries; } } + $drawn_graphs{queriesbytype_graph} = &flotr2_piegraph($graphid++, 'queriesbytype_graph', 'Type of queries', %data); + + $total_select = &comma_numbers($total_select); + $total_write = &comma_numbers($total_write); + print $fh qq{ +

Queries

+
+

Queries by type

+
+

Key values

+
+
    +
  • $total_select Total read queries
  • +
  • $total_write Total write queries
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{queriesbytype_graph} +
+
+ + + + + + + + + + $querytype_info + +
<th>Type</th><th>Count</th><th>Percentage</th>
+
+
+
+
+
+}; + delete $drawn_graphs{queriesbytype_graph}; + +} - # Show session per host statistics - if (!$disable_session && exists $session_info{host}) { - print $fh "\n- Sessions per host ------------------------------------------------------\n\n"; - print $fh "User Count Total Duration Av. duration (s)\n"; - foreach my $d (sort keys %{$session_info{host}}) { - print $fh "$d - ", &comma_numbers($session_info{host}{$d}{count}), " ", &convert_time($session_info{host}{$d}{duration}), - " ", &convert_time($session_info{host}{$d}{duration} / $session_info{host}{$d}{count}), "\n"; +sub print_query_per_database +{ + my %infos = (); + my $total_count = 0; + my $query_database_info = ''; + my @main_database = ('unknown', 0); + foreach my $d (sort keys %database_info) { + $query_database_info .= "$dTotal" . + &comma_numbers($database_info{$d}{count}) . ""; + $total_count += $database_info{$d}{count}; + if ($main_database[1] < $database_info{$d}{count}) { + $main_database[0] = $d; + $main_database[1] = $database_info{$d}{count}; + } + foreach my $r (sort keys %{$database_info{$d}}) { + next if ($r eq 'count'); + $query_database_info .= "$r" . + &comma_numbers($database_info{$d}{$r}) . ""; } } - # Show connection per database statistics - if (!$disable_connection && exists $connection_info{database}) { - print $fh "\n- Connections per database ------------------------------------------------------\n\n"; - print $fh "Database User Count\n"; - foreach my $d (sort keys %{$connection_info{database}}) { - print $fh "$d - ", &comma_numbers($connection_info{database}{$d}), "\n"; - foreach my $u (sort keys %{$connection_info{user}}) { - next if (!exists $connection_info{database_user}{$d}{$u}); - print $fh "\t$u ", &comma_numbers($connection_info{database_user}{$d}{$u}), "\n"; + $query_database_info = qq{$NODATA} if (!$total_count); + + if ($graph) { + my @small = (); + foreach my $d (sort keys %database_info) { + if ((($database_info{$d}{count} * 100) / ($total_count || 1)) > $pie_percentage_limit) { + $infos{$d} = $database_info{$d}{count} || 0; + } else { + $infos{"Sum queries per databases < $pie_percentage_limit%"} += $database_info{$d}{count} || 0; + push(@small, $d); } } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum queries per databases < $pie_percentage_limit%"}; + delete $infos{"Sum queries per databases < $pie_percentage_limit%"}; + } } + $drawn_graphs{queriesbydatabase_graph} = &flotr2_piegraph($graphid++, 'queriesbydatabase_graph', 'Queries per database', %infos); - # Show connection per user statistics - if (!$disable_connection && exists $connection_info{user}) { - print $fh "\n- Connections per user ------------------------------------------------------\n\n"; - print $fh "User Count\n"; - foreach my $d (sort keys %{$connection_info{user}}) { - print $fh "$d - ", &comma_numbers($connection_info{user}{$d}), "\n"; + $main_database[1] = &comma_numbers($main_database[1]); + print $fh qq{ +
+

Queries by database

+
+

Key values

+
+
    +
  • $main_database[0] Main database
  • +
  • $main_database[1] Requests
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{queriesbydatabase_graph} +
+
+ + + + + + + + + + $query_database_info + +
<th>Database</th><th>Request type</th><th>Count</th>
+
+
+
+
+
+}; + delete $drawn_graphs{queriesbydatabase_graph}; + + +} + +sub print_query_per_application +{ + my %infos = (); + my $total_count = 0; + my $query_application_info = ''; + my @main_application = ('unknown', 0); + foreach my $d (sort keys %application_info) { + $query_application_info .= "$dTotal" . + &comma_numbers($application_info{$d}{count}) . ""; + $total_count += $application_info{$d}{count}; + if ($main_application[1] < $application_info{$d}{count}) { + $main_application[0] = $d; + $main_application[1] = $application_info{$d}{count}; + } + foreach my $r (sort keys %{$application_info{$d}}) { + next if ($r eq 'count'); + $query_application_info .= "$r" . + &comma_numbers($application_info{$d}{$r}) . ""; + } + } + $query_application_info = qq{$NODATA} if (!$total_count); + if ($graph) { + my @small = (); + foreach my $d (sort keys %application_info) { + if ((($application_info{$d}{count} * 100) / ($total_count || 1)) > $pie_percentage_limit) { + $infos{$d} = $application_info{$d}{count} || 0; + } else { + $infos{"Sum queries per applications < $pie_percentage_limit%"} += $application_info{$d}{count} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum queries per applications < $pie_percentage_limit%"}; + delete $infos{"Sum queries per applications < $pie_percentage_limit%"}; } } + $drawn_graphs{queriesbyapplication_graph} = &flotr2_piegraph($graphid++, 'queriesbyapplication_graph', 'Queries per application', %infos); - # Show connection per host statistics - if (!$disable_connection && exists $connection_info{host}) { - print $fh "\n- Connections per host ------------------------------------------------------\n\n"; - print $fh "User Count\n"; - foreach my $d (sort keys %{$connection_info{host}}) { - print $fh "$d - ", &comma_numbers($connection_info{host}{$d}), "\n"; + $main_application[1] = &comma_numbers($main_application[1]); + print $fh qq{ +
+

Queries by application

+
+

Key values

+
+
    +
  • $main_application[0] Main application
  • +
  • $main_application[1] Requests
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{queriesbyapplication_graph} +
+
+ + + + + + + + + + $query_application_info + +
<th>Application</th><th>Request type</th><th>Count</th>
+
+
+
+
+
+}; + delete $drawn_graphs{queriesbyapplication_graph}; + +} + +sub print_query_per_user +{ + my %infos = (); + my $total_count = 0; + my $query_user_info = ''; + my @main_user = ('unknown', 0); + foreach my $d (sort keys %user_info) { + $query_user_info .= "$dTotal" . + &comma_numbers($user_info{$d}{count}) . ""; + $total_count += $user_info{$d}{count}; + if ($main_user[1] < $user_info{$d}{count}) { + $main_user[0] = $d; + $main_user[1] = $user_info{$d}{count}; + } + foreach my $r (sort keys %{$user_info{$d}}) { + next if ($r eq 'count'); + $query_user_info .= "$r" . + &comma_numbers($user_info{$d}{$r}) . ""; + } + } + $query_user_info = qq{$NODATA} if (!$total_count); + + if ($graph) { + my @small = (); + foreach my $d (sort keys %user_info) { + if ((($user_info{$d}{count} * 100) / ($total_count || 1)) > $pie_percentage_limit) { + $infos{$d} = $user_info{$d}{count} || 0; + } else { + $infos{"Sum queries per users < $pie_percentage_limit%"} += $user_info{$d}{count} || 0; + push(@small, $d); + } + } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum queries per users < $pie_percentage_limit%"}; + delete $infos{"Sum queries per users < $pie_percentage_limit%"}; } } + $drawn_graphs{queriesbyuser_graph} = &flotr2_piegraph($graphid++, 'queriesbyuser_graph', 'Queries per user', %infos); - # Show top informations - if (!$disable_query) { - print $fh "\n- Slowest queries ------------------------------------------------------\n\n"; - print $fh "Rank Duration (s) Query\n"; - for (my $i = 0 ; $i <= $#top_slowest ; $i++) { - print $fh $i + 1, ") " . &convert_time($top_slowest[$i]->[0]) . " - $top_slowest[$i]->[2]\n"; - print $fh "--\n"; - } - @top_slowest = (); + $main_user[1] = &comma_numbers($main_user[1]); + print $fh qq{ +
+

Queries by user

+
+

Key values

+
+
    +
  • $main_user[0] Main user
  • +
  • $main_user[1] Requests
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{queriesbyuser_graph} +
+
+ + + + + + + + + + $query_user_info + +
<th>User</th><th>Request type</th><th>Count</th>
+
+
+
+
+
+}; + delete $drawn_graphs{queriesbyuser_graph}; - print $fh "\n- Queries that took up the most time (N) -------------------------------\n\n"; - print $fh "Rank Total duration Times executed Av. duration (s) Query\n"; - my $idx = 1; - foreach my $k (sort {$normalyzed_info{$b}{duration} <=> $normalyzed_info{$a}{duration}} keys %normalyzed_info) { - next if (!$normalyzed_info{$k}{count}); - last if ($idx > $top); - my $q = $k; - if ($normalyzed_info{$k}{count} == 1) { - foreach (keys %{$normalyzed_info{$k}{samples}}) { - $q = $normalyzed_info{$k}{samples}{$_}{query}; - last; - } - } - $normalyzed_info{$k}{average} = $normalyzed_info{$k}{duration} / $normalyzed_info{$k}{count}; - print $fh "$idx) " - . &convert_time($normalyzed_info{$k}{duration}) . " - " - . &comma_numbers($normalyzed_info{$k}{count}) . " - " - . &convert_time($normalyzed_info{$k}{average}) - . " - $q\n"; - print $fh "--\n"; - my $i = 1; - foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { - print $fh "\t- Example $i: ", &convert_time($d), " - ", $normalyzed_info{$k}{samples}{$d}{query}, "\n"; - $i++; +} + +sub print_query_per_host +{ + my %infos = (); + my $total_count = 0; + my $query_host_info = ''; + my @main_host = ('unknown', 0); + foreach my $d (sort keys %host_info) { + $query_host_info .= "$dTotal" . + &comma_numbers($host_info{$d}{count}) . ""; + $total_count += $host_info{$d}{count}; + if ($main_host[1] < $host_info{$d}{count}) { + $main_host[0] = $d; + $main_host[1] = $host_info{$d}{count}; + } + foreach my $r (sort keys %{$host_info{$d}}) { + next if ($r eq 'count'); + $query_host_info .= "$r" . + &comma_numbers($host_info{$d}{$r}) . ""; + } + } + $query_host_info = qq{$NODATA} if (!$total_count); + + if ($graph) { + my @small = (); + foreach my $d (sort keys %host_info) { + if ((($host_info{$d}{count} * 100) / ($total_count || 1)) > $pie_percentage_limit) { + $infos{$d} = $host_info{$d}{count} || 0; + } else { + $infos{"Sum queries per hosts < $pie_percentage_limit%"} += $host_info{$d}{count} || 0; + push(@small, $d); } - $idx++; } + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum queries per hosts < $pie_percentage_limit%"}; + delete $infos{"Sum queries per hosts < $pie_percentage_limit%"}; + } + } + $drawn_graphs{queriesbyhost_graph} = &flotr2_piegraph($graphid++, 'queriesbyhost_graph', 'Queries per host', %infos); - print $fh "\n- Most frequent queries (N) --------------------------------------------\n\n"; - print $fh "Rank Times executed Total duration Av. duration (s) Query\n"; - $idx = 1; - foreach my $k (sort {$normalyzed_info{$b}{count} <=> $normalyzed_info{$a}{count}} keys %normalyzed_info) { - next if (!$normalyzed_info{$k}{count}); - last if ($idx > $top); - my $q = $k; - if ($normalyzed_info{$k}{count} == 1) { - foreach (keys %{$normalyzed_info{$k}{samples}}) { - $q = $normalyzed_info{$k}{samples}{$_}{query}; - last; - } - } - print $fh "$idx) " - . &comma_numbers($normalyzed_info{$k}{count}) . " - " - . &convert_time($normalyzed_info{$k}{duration}) . " - " - . &convert_time($normalyzed_info{$k}{duration} / $normalyzed_info{$k}{count}) - . " - $q\n"; - print $fh "--\n"; - my $i = 1; - foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { - print $fh "\tExample $i: ", &convert_time($d), " - ", $normalyzed_info{$k}{samples}{$d}{query}, "\n"; - $i++; - } - $idx++; + $main_host[1] = &comma_numbers($main_host[1]); + print $fh qq{ +
+

Queries by host

+
+

Key values

+
+
    +
  • $main_host[0] Main host
  • +
  • $main_host[1] Requests
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{queriesbyhost_graph} +
+
+ + + + + + + + + + $query_host_info + +
<th>Host</th><th>Request type</th><th>Count</th>
+
+
+
+
+
+}; + delete $drawn_graphs{queriesbyhost_graph}; + +} + + +sub print_lock_queries_report +{ + my @top_locked_queries; + foreach my $h (keys %normalyzed_info) { + if (exists($normalyzed_info{$h}{locks})) { + push (@top_locked_queries, [$h, $normalyzed_info{$h}{locks}{count}, $normalyzed_info{$h}{locks}{wait}, + $normalyzed_info{$h}{locks}{minwait}, $normalyzed_info{$h}{locks}{maxwait}]); } + } - print $fh "\n- Slowest queries (N) --------------------------------------------------\n\n"; - print $fh "Rank Av. duration (s) Times executed Total duration Query\n"; - $idx = 1; - foreach my $k (sort {$normalyzed_info{$b}{average} <=> $normalyzed_info{$a}{average}} keys %normalyzed_info) { - next if (!$normalyzed_info{$k}{count}); - last if ($idx > $top); - my $q = $k; - if ($normalyzed_info{$k}{count} == 1) { - foreach (keys %{$normalyzed_info{$k}{samples}}) { - $q = $normalyzed_info{$k}{samples}{$_}{query}; - last; - } - } - print $fh "$idx) " - . &convert_time($normalyzed_info{$k}{average}) . " - " - . &comma_numbers($normalyzed_info{$k}{count}) . " - " - . &convert_time($normalyzed_info{$k}{duration}) - . " - $q\n"; - print $fh "--\n"; - my $i = 1; + # Most frequent waiting queries (N) + @top_locked_queries = sort {$b->[2] <=> $a->[2]} @top_locked_queries; + print $fh qq{ +
+

Most frequent waiting queries (N)

+
+ + + + + + + + + + + + + +}; + + my $rank = 1; + for (my $i = 0 ; $i <= $#top_locked_queries ; $i++) { + my $count = &comma_numbers($top_locked_queries[$i]->[1]); + my $total_time = &convert_time($top_locked_queries[$i]->[2]); + my $min_time = &convert_time($top_locked_queries[$i]->[3]); + my $max_time = &convert_time($top_locked_queries[$i]->[4]); + my $avg_time = &convert_time($top_locked_queries[$i]->[4] / ($top_locked_queries[$i]->[1] || 1)); + my $query = &highlight_code($top_locked_queries[$i]->[0]); + my $k = $top_locked_queries[$i]->[0]; + my $example = qq{

}; + $example = '' if (scalar keys %{$normalyzed_info{$k}{samples}} <= 1); + print $fh qq{ + + + + + + + + + +}; } + $rank++; } - - if (!$disable_error) { - &show_error_as_text(); + if ($#top_locked_queries == -1) { + print $fh qq{}; } + print $fh qq{ + +
<th>Rank</th><th>Count</th><th>Total time</th><th>Min time</th><th>Max time</th><th>Avg duration</th><th>Query</th>
$rank$count$total_time$min_time$max_time$avg_time +
$query
+ $example + +
+
+}; + if ($normalyzed_info{$k}{count} > 1) { foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { - print $fh "\tExample $i: ", &convert_time($d), " - ", $normalyzed_info{$k}{samples}{$d}{query}, "\n"; - $i++; + $query = &highlight_code($normalyzed_info{$k}{samples}{$d}{query}); + my $details = "[ Date: $normalyzed_info{$k}{samples}{$d}{date}"; + $details .= " - Duration: " . &convert_time($d) if ($normalyzed_info{$k}{samples}{$d}{duration}); + $details .= " - Database: $normalyzed_info{$k}{samples}{$d}{db}" if ($normalyzed_info{$k}{samples}{$d}{db}); + $details .= " - User: $normalyzed_info{$k}{samples}{$d}{user}" if ($normalyzed_info{$k}{samples}{$d}{user}); + $details .= " - Remote: $normalyzed_info{$k}{samples}{$d}{remote}" if ($normalyzed_info{$k}{samples}{$d}{remote}); + $details .= " - Application: $normalyzed_info{$k}{samples}{$d}{app}" if ($normalyzed_info{$k}{samples}{$d}{app}); + $details .= " ]"; + print $fh qq{ +
$query
+
$details
+}; + } - $idx++; + print $fh qq{ +
+

+
+ +
$NODATA
+
+
+}; - print $fh "\n\n"; - print $fh "Report generated by PgBadger $VERSION ($project_url).\n"; + @top_locked_queries = (); + + # Queries that waited the most + @top_locked_info = sort {$b->[1] <=> $a->[1]} @top_locked_info; + print $fh qq{ +
+

Queries that waited the most

+
+ + + + + + + + + +}; + + $rank = 1; + for (my $i = 0 ; $i <= $#top_locked_info ; $i++) { + my $query = &highlight_code($top_locked_info[$i]->[2]); + my $details = "[ Date: " . ($top_locked_info[$i]->[1] || ''); + $details .= " - Database: $top_locked_info[$i]->[3]" if ($top_locked_info[$i]->[3]); + $details .= " - User: $top_locked_info[$i]->[4]" if ($top_locked_info[$i]->[4]); + $details .= " - Remote: $top_locked_info[$i]->[5]" if ($top_locked_info[$i]->[5]); + $details .= " - Application: $top_locked_info[$i]->[6]" if ($top_locked_info[$i]->[6]); + $details .= " ]"; + my $time = &convert_time($top_locked_info[$i]->[0]); + print $fh qq{ + + + + + +}; + $rank++; + } + if ($#top_locked_info == -1) { + print $fh qq{}; + } + print $fh qq{ + +
<th>Rank</th><th>Wait time</th><th>Query</th>
$rank$time +
$query
+
$details
+
$NODATA
+
+
+}; } -sub dump_error_as_text +sub print_tempfile_report { - - # Global informations - my $curdate = localtime(time); - my $fmt_nlines = &comma_numbers($nlines); - my $total_time = timestr($td); - $total_time =~ s/^([\.0-9]+) wallclock.*/$1/; - $total_time = &convert_time($total_time * 1000); - my $logfile_str = $log_files[0]; - if ($#log_files > 0) { - $logfile_str .= ', ..., ' . $log_files[-1]; + my @top_temporary = (); + foreach my $h (keys %normalyzed_info) { + if (exists($normalyzed_info{$h}{tempfiles})) { + push (@top_temporary, [$h, $normalyzed_info{$h}{tempfiles}{count}, $normalyzed_info{$h}{tempfiles}{size}, + $normalyzed_info{$h}{tempfiles}{minsize}, $normalyzed_info{$h}{tempfiles}{maxsize}]); + } } - print $fh qq{ -$report_title -- Global informations -------------------------------------------------- + # Queries generating the most temporary files (N) + if ($#top_temporary >= 0) { + @top_temporary = sort {$b->[1] <=> $a->[1]} @top_temporary; + print $fh qq{ +
+

Queries generating the most temporary files (N)

+
+ + + + + + + + + + + + + +}; + my $rank = 1; + for (my $i = 0 ; $i <= $#top_temporary ; $i++) { + my $count = &comma_numbers($top_temporary[$i]->[1]); + my $total_size = &pretty_print_size($top_temporary[$i]->[2]); + my $min_size = &pretty_print_size($top_temporary[$i]->[3]); + my $max_size = &pretty_print_size($top_temporary[$i]->[4]); + my $avg_size = &pretty_print_size($top_temporary[$i]->[2] / ($top_temporary[$i]->[1] || 1)); + my $query = &highlight_code($top_temporary[$i]->[0]); + my $example = qq{

}; + $example = '' if ($count <= 1); + print $fh qq{ + + + + + + + + + +}; + } + $rank++; + } + print $fh qq{ + +
<th>Rank</th><th>Count</th><th>Total size</th><th>Min size</th><th>Max size</th><th>Avg size</th><th>Query</th>
$rank$count$total_size$min_size$max_size$avg_size +
$query
+ $example + +
+
+}; + my $k = $top_temporary[$i]->[0]; + if ($normalyzed_info{$k}{count} > 1) { + foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { + $query = &highlight_code($normalyzed_info{$k}{samples}{$d}{query}); + my $details = "Duration: " . &convert_time($d) . "
"; + $details .= "Database: $normalyzed_info{$k}{samples}{$d}{db}
" if ($normalyzed_info{$k}{samples}{$d}{db}); + $details .= "User: $normalyzed_info{$k}{samples}{$d}{user}
" if ($normalyzed_info{$k}{samples}{$d}{user}); + $details .= "Remote: $normalyzed_info{$k}{samples}{$d}{remote}
" if ($normalyzed_info{$k}{samples}{$d}{remote}); + $details .= "Application: $normalyzed_info{$k}{samples}{$d}{app}
" if ($normalyzed_info{$k}{samples}{$d}{app}); + print $fh qq{ +
$query
+
$details
+}; -Generated on $curdate -Log file: $logfile_str -Parsed $fmt_nlines log entries in $total_time -Log start from $first_log_date to $last_log_date + } + print $fh qq{ +
+

+
+ +
+
+
}; - &show_error_as_text(); + @top_temporary = (); + } + + # Top queries generating the largest temporary files + if ($#top_tempfile_info >= 0) { + @top_tempfile_info = sort {$b->[0] <=> $a->[0]} @top_tempfile_info; + my $largest = &comma_numbers($top_temporary[0]->[0]); + print $fh qq{ +
+

Queries generating the largest temporary files

+
+ + + + + + + + + +}; + my $rank = 1; + for (my $i = 0 ; $i <= $#top_tempfile_info ; $i++) { + my $size = &pretty_print_size($top_tempfile_info[$i]->[0]); + my $details = "[ Date: $top_tempfile_info[$i]->[1]"; + $details .= " - Database: $top_tempfile_info[$i]->[3]" if ($top_tempfile_info[$i]->[3]); + $details .= " - User: $top_tempfile_info[$i]->[4]" if ($top_tempfile_info[$i]->[4]); + $details .= " - Remote: $top_tempfile_info[$i]->[5]" if ($top_tempfile_info[$i]->[5]); + $details .= " - Application: $top_tempfile_info[$i]->[6]" if ($top_tempfile_info[$i]->[6]); + $details .= " ]"; + my $query = &highlight_code($top_tempfile_info[$i]->[2]); + print $fh qq{ + + + + + +}; + $rank++; + } + print $fh qq{ + +
<th>Rank</th><th>Size</th><th>Query</th>
$rank$size +
$query
+
$details
+
+
+
+}; + @top_tempfile_info = (); + } - print $fh "\n\n"; - print $fh "Report generated by pgBadger $VERSION ($project_url).\n"; } -sub show_error_as_text +sub print_histogram_query_times { - print $fh "\n- Most frequent events (N) ---------------------------------------------\n\n"; - my $idx = 1; - foreach my $k (sort {$error_info{$b}{count} <=> $error_info{$a}{count}} keys %error_info) { - next if (!$error_info{$k}{count}); - last if ($idx > $top); - if ($error_info{$k}{count} > 1) { - my $msg = $k; - $msg =~ s/HINT: (parameter "[^"]+" changed to)/LOG: $1/; - print $fh "$idx) " . &comma_numbers($error_info{$k}{count}) . " - $msg\n"; - print $fh "--\n"; - my $j = 1; - for (my $i = 0 ; $i <= $#{$error_info{$k}{date}} ; $i++) { - if ($error_info{$k}{error}[$i] =~ s/HINT: (parameter "[^"]+" changed to)/LOG: $1/) { - $logs_type{HINT}--; - $logs_type{LOG}++; - } - print $fh "\t- Example $j: $error_info{$k}{date}[$i] - $error_info{$k}{error}[$i]\n"; - print $fh "\t\tDetail: $error_info{$k}{detail}[$i]\n" if ($error_info{$k}{detail}[$i]); - print $fh "\t\tContext: $error_info{$k}{context}[$i]\n" if ($error_info{$k}{context}[$i]); - print $fh "\t\tHint: $error_info{$k}{hint}[$i]\n" if ($error_info{$k}{hint}[$i]); - print $fh "\t\tStatement: $error_info{$k}{statement}[$i]\n" if ($error_info{$k}{statement}[$i]); - $j++; - } - } else { - print $fh "$idx) " . &comma_numbers($error_info{$k}{count}) . " - $error_info{$k}{error}[0]\n"; - print $fh "--\n"; - print $fh "\t- Date: $error_info{$k}{date}[0]\n"; - print $fh "\t\tDetail: $error_info{$k}{detail}[0]\n" if ($error_info{$k}{detail}[0]); - print $fh "\t\tContext: $error_info{$k}{context}[0]\n" if ($error_info{$k}{context}[0]); - print $fh "\t\tHint: $error_info{$k}{hint}[0]\n" if ($error_info{$k}{hint}[0]); - print $fh "\t\tStatement: $error_info{$k}{statement}[0]\n" if ($error_info{$k}{statement}[0]); + my %data = (); + my $histogram_info = ''; + my $most_range = ''; + my $most_range_value = ''; + + for (my $i = 1; $i <= $#histogram_query_time; $i++) { + $histogram_info .= "$histogram_query_time[$i-1]-$histogram_query_time[$i]ms" . &comma_numbers($overall_stat{histogram}{query_time}{$histogram_query_time[$i-1]}) . + "" . sprintf("%0.2f", ($overall_stat{histogram}{query_time}{$histogram_query_time[$i-1]} * 100) / ($overall_stat{histogram}{total}||1)) . "%"; + $data{"$histogram_query_time[$i-1]-$histogram_query_time[$i]ms"} = $overall_stat{histogram}{query_time}{$histogram_query_time[$i-1]} if ($overall_stat{histogram}{query_time}{$histogram_query_time[$i-1]} > 0); + if ($overall_stat{histogram}{query_time}{$histogram_query_time[$i-1]} > $most_range_value) { + $most_range = "$histogram_query_time[$i-1]-$histogram_query_time[$i]ms"; + $most_range_value = $overall_stat{histogram}{query_time}{$histogram_query_time[$i-1]}; + } + } + if ($overall_stat{histogram}{total} > 0) { + $histogram_info .= " > $histogram_query_time[-1]ms" . &comma_numbers($overall_stat{histogram}{query_time}{'-1'}) . + "" . sprintf("%0.2f", ($overall_stat{histogram}{query_time}{'-1'} * 100) / ($overall_stat{histogram}{total}||1)) . 
"%"; + $data{"> $histogram_query_time[-1]ms"} = $overall_stat{histogram}{query_time}{"-1"} if ($overall_stat{histogram}{query_time}{"-1"} > 0); + if ($overall_stat{histogram}{query_time}{"-1"} > $most_range_value) { + $most_range = "> $histogram_query_time[-1]ms"; + $most_range_value = $overall_stat{histogram}{query_time}{"-1"}; } - $idx++; + } else { + $histogram_info = qq{$NODATA}; } - if (scalar keys %logs_type > 0) { - print $fh "\n- Logs per type ---------------------------------------------\n\n"; + $drawn_graphs{histogram_query_times_graph} = &flotr2_piegraph($graphid++, 'histogram_query_times_graph', 'Histogram of query times', %data); - my $total_logs = 0; - foreach my $d (keys %logs_type) { - $total_logs += $logs_type{$d}; - } - print $fh "Logs type Count Percentage\n"; - foreach my $d (sort keys %logs_type) { - next if (!$logs_type{$d}); - print $fh "$d\t\t", &comma_numbers($logs_type{$d}), "\t", sprintf("%0.2f", ($logs_type{$d} * 100) / $total_logs), "%\n"; - } + $most_range_value = &comma_numbers($most_range_value) if ($most_range_value); + + print $fh qq{ +

Top Queries

+
+

Histogram of query times

+
+

Key values

+
+
    +
  • $most_range_value $most_range duration
  • +
+
+
+
+
+ +
+
+ $drawn_graphs{histogram_query_times_graph} +
+
+ + + + + + + + + + $histogram_info + +
<th>Range</th><th>Count</th><th>Percentage</th>
+
+
+
+
+
+}; + delete $drawn_graphs{histogram_query_times_graph}; +} + +sub print_slowest_individual_queries +{ + + print $fh qq{ +
+

Slowest individual queries

+
+ + + + + + + + + +}; + + for (my $i = 0 ; $i <= $#top_slowest ; $i++) { + my $rank = $i + 1; + my $duration = &convert_time($top_slowest[$i]->[0]); + my $date = $top_slowest[$i]->[1] || ''; + my $details = "[ Date: " . ($top_slowest[$i]->[1] || ''); + $details .= " - Database: $top_slowest[$i]->[3]" if ($top_slowest[$i]->[3]); + $details .= " - User: $top_slowest[$i]->[4]" if ($top_slowest[$i]->[4]); + $details .= " - Remote: $top_slowest[$i]->[5]" if ($top_slowest[$i]->[5]); + $details .= " - Application: $top_slowest[$i]->[6]" if ($top_slowest[$i]->[6]); + $details .= " ]"; + my $query = &highlight_code($top_slowest[$i]->[2]); + print $fh qq{ + + + + + + + }; } + if ($#top_slowest == -1) { + print $fh qq{}; + } + print $fh qq{ + +
<th>Rank</th><th>Duration</th><th>Query</th>
$rank$duration +
$query
+
$details
+
$NODATA
+
+
+}; + } -sub html_header +sub print_time_consuming { - my $date = localtime(time); - print $fh qq{ - - -$report_title - - - - - - -}; - if ($JQGRAPH) { - my @jslib = ; - print $fh < - -
- -

$report_title

+
+

Normalized slowest queries (N)

+
+ + + + + + + + + + + + + }; - print $fh qq{ -"; + $zday = ""; + } + } + # Set graph dataset + my %graph_data = (); + foreach my $h ("00" .. "23") { + $graph_data{count} .= "[$h, " . (int($hourly_count{"$h"}/$days) || 0) . "],"; + $graph_data{duration} .= "[$h, " . (int($hourly_duration{"$h"} / ($hourly_count{"$h"} || 1)) || 0) . "],"; + } + $graph_data{count} =~ s/,$//; + $graph_data{duration} =~ s/,$//; + %hourly_count = (); + %hourly_duration = (); + + my $query_histo = + &flotr2_histograph($graphid++, 'normalizedslowest_graph_'.$rank, $graph_data{count}, $graph_data{duration}); + + print $fh qq{ + + + + + + + + + +}; + $rank++; } - if (!$disable_error) { - print $fh qq{Most frequent events (N) | }; - print $fh qq{Logs per type}; + if (scalar keys %normalyzed_info == 0) { + print $fh qq{}; } - print $fh "\n"; - print $fh "

Normalized reports are marked with a \"(N)\".

\n"; + print $fh qq{ + +
<th>Rank</th><th>Min duration</th><th>Max duration</th><th>Avg duration</th><th>Times executed</th><th>Total duration</th><th>Query</th>
$zday$h" . + &comma_numbers($normalyzed_info{$k}{chronos}{$d}{$h}{count}) . "" . + &convert_time($normalyzed_info{$k}{chronos}{$d}{$h}{duration}) . "" . + &convert_time($normalyzed_info{$k}{chronos}{$d}{$h}{average}) . "
$rank$min$max$avg$count +

Details

+
$duration +
$query
+ +
+

Times Reported Time consuming queries #$rank

+ $query_histo + + + + + + + + + + + + $details + +
<th>Day</th><th>Hour</th><th>Count</th><th>Duration</th><th>Avg duration</th>
+

+
+

+ +
+
}; - if (!$error_only) { - if (!$disable_hourly) { - print $fh qq{Hourly statistics | }; - } - if (!$disable_type) { - print $fh qq{Queries by type | }; - } - if (!$disable_query) { + + foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { + my $details = "[ Date: $normalyzed_info{$k}{samples}{$d}{date}"; + $details .= " - Duration: " . &convert_time($d); + $details .= " - Database: $normalyzed_info{$k}{samples}{$d}{details}" if ($normalyzed_info{$k}{samples}{$d}{details}); + $details .= " - User: $normalyzed_info{$k}{samples}{$d}{user}" if ($normalyzed_info{$k}{samples}{$d}{user}); + $details .= " - Remote: $normalyzed_info{$k}{samples}{$d}{remote}" if ($normalyzed_info{$k}{samples}{$d}{remote}); + $details .= " - Application: $normalyzed_info{$k}{samples}{$d}{app}" if ($normalyzed_info{$k}{samples}{$d}{app}); + $details .= " ]"; + $query = &highlight_code($normalyzed_info{$k}{samples}{$d}{query}); print $fh qq{ -Slowest queries | -Queries that took up the most time (N) | -Most frequent queries (N) | -Slowest queries (N)
- } - } - if (!$disable_lock && scalar keys %lock_info > 0) { - print $fh qq{Locks by type |}; - } - if (!$disable_session) { - if (exists $session_info{database}) { - print $fh qq{Sessions per database |}; - } - if (exists $session_info{user}) { - print $fh qq{Sessions per user |}; - } - if (exists $session_info{host}) { - print $fh qq{Sessions per host |}; - } - } - if (!$disable_connection) { - if (exists $connection_info{database}) { - print $fh qq{Connections per database |}; - } - if (exists $connection_info{user}) { - print $fh qq{Connections per user |}; - } - if (exists $connection_info{host}) { - print $fh qq{Connections per host
}; - } +
$query
+
$details
+}; } + print $fh qq{ +
+

+
+ +
$NODATA
+
+
+}; } -sub html_footer +sub dump_as_html { - print $fh qq{ -

 

- -
-}; - print $fh qq{ -
-
-
    -
  • ^ Back to top
  • Overall statistics
  • -}; + # Dump the html header + &html_header(); + if (!$error_only) { + + # Overall statistics + print $fh qq{ +
  • +}; + &print_overall_statistics(); + + # Set graphs limits + $overall_stat{'first_log_ts'} =~ /(\d+)-(\d+)-(\d+) (\d+):(\d+):(\d+)/; + $t_min = timegm_nocheck(0, $5, $4, $3, $2 - 1, $1) * 1000; + $t_min -= ($avg_minutes * 60000); + + $overall_stat{'last_log_ts'} =~ /(\d+)-(\d+)-(\d+) (\d+):(\d+):(\d+)/; + $t_max = timegm_nocheck(59, $5, $4, $3, $2 - 1, $1) * 1000; + $t_max += ($avg_minutes * 60000); + if (!$disable_hourly) { - print $fh qq{
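# Sketch of the graph-boundary computation above: the first/last log
# timestamps are parsed with a regexp and converted to epoch milliseconds
# (the time axis used by the flotr2 graphs) with timegm_nocheck from
# Time::Local, then padded by the averaging interval. The timestamp and
# interval below are hypothetical sample values.
use Time::Local qw(timegm_nocheck);

my $first_ts    = '2014-02-05 10:15:42';
my $avg_minutes = 5;
$first_ts =~ /(\d+)-(\d+)-(\d+) (\d+):(\d+):(\d+)/;
my $t_min = timegm_nocheck(0, $5, $4, $3, $2 - 1, $1) * 1000;
$t_min -= $avg_minutes * 60_000;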
  • Hourly statistics
  • }; - } - if (!$disable_type) { - print $fh qq{
  • Queries by type
  • }; + # Build graphs based on hourly stat + &compute_query_graphs(); + + # Show global SQL traffic + &print_sql_traffic(); + + # Show hourly statistics + &print_general_activity(); + } - if (!$disable_lock && scalar keys %lock_info > 0) { - print $fh qq{
  • Locks by type
  • }; + + if (!$disable_connection) { + print $fh qq{ + +
  • +

    Connections

    + +}; + + # Draw connections information + &print_established_connection() if (!$disable_hourly); + + # Show per database/user connections + &print_database_connection() if (exists $connection_info{database}); + + # Show per user connections + &print_user_connection() if (exists $connection_info{user}); + + # Show per client ip connections + &print_host_connection() if (exists $connection_info{host}); } + + + # Show session per database statistics if (!$disable_session) { - if (exists $session_info{database}) { - print $fh qq{
  • Sessions per database
  • }; - } - if (exists $session_info{user}) { - print $fh qq{
  • Sessions per user
  • }; - } - if (exists $session_info{host}) { - print $fh qq{
  • Sessions per host
  • }; - } + print $fh qq{ +
  • +
  • +

    Sessions

    + +}; + # Show number of simultaneous sessions + &print_simultaneous_session(); + # Show per database sessions + &print_database_session(); + # Show per user sessions + &print_user_session(); + # Show per host sessions + &print_host_session(); } - if (!$disable_connection) { - if (exists $connection_info{database}) { - print $fh qq{
  • Connections per database
  • }; - } - if (exists $connection_info{user}) { - print $fh qq{
  • Connections per user
  • }; - } - if (exists $connection_info{host}) { - print $fh qq{
  • Connections per host
  • }; - } + + + # Display checkpoint and temporary files report + if (!$disable_checkpoint) { + print $fh qq{ +
  • +
  • + }; + &print_checkpoint(); + } + + if (!$disable_temporary) { + print $fh qq{ +
  • +
  • +}; + # Show temporary files detailed information + &print_temporary_file(); + + # Show information about queries generating temporary files + &print_tempfile_report(); } + + if (!$disable_autovacuum) { + print $fh qq{ +
  • +
  • +}; + # Show detailed vacuum/analyse information + &print_vacuum(); + + } + + if (!$disable_lock) { + print $fh qq{ +
  • +
  • +}; + # Lock stats per type + &print_lock_type(); + + # Show lock wait detailed information + &print_lock_queries_report(); + } + + %per_minute_info = (); + if (!$disable_query) { - print $fh -qq{Slowest queries
  • Queries that took up the most time (N)
  • Most frequent queries (N)
  • Slowest queries (N)
  • }; + print $fh qq{ + +
  • +}; + # INSERT/DELETE/UPDATE/SELECT repartition + if (!$disable_type) { + &print_query_type(); + + # Show requests per database + &print_query_per_database(); + + # Show requests per user + &print_query_per_user(); + + # Show requests per host + &print_query_per_host(); + + # Show requests per application + &print_query_per_application(); + } + + print $fh qq{ +
  • +
  • +}; + # Show histogram for query times + &print_histogram_query_times(); + + # Show top information + &print_slowest_individual_queries(); + + # Show queries that took up the most time + &print_time_consuming(); + + # Show most frequent queries + &print_most_frequent(); + + # Print normalized slowest queries + &print_slowest_queries } + } + + # Show errors report if (!$disable_error) { - print $fh "
  • Most frequent events (N)
  • \n"; - print $fh qq{
  • Logs per type
  • \n}; - } - print $fh qq{
-
-
Table of contents
-
+ + if (!$error_only) { + print $fh qq{ + +
  • }; - print $fh qq{ - - + } else { + print $fh qq{ +
  • }; + } + # Show log level distribution + &print_log_level(); + + # Show Most Frequent Errors/Events + &show_error_as_html(); + } + + + # Dump the html footer + &html_footer(); } -sub dump_as_html +sub escape_html { + $_[0] =~ s/<([\/a-zA-Z][\s\t\>]*)/\<$1/sg; - # Dump the html header - &html_header(); + return $_[0]; +} - # Global informations - my $curdate = localtime(time); - my $fmt_nlines = &comma_numbers($nlines); - my $total_time = timestr($td); - $total_time =~ s/^([\.0-9]+) wallclock.*/$1/; - $total_time = &convert_time($total_time * 1000); - my $logfile_str = $log_files[0]; - if ($#log_files > 0) { - $logfile_str .= ', ..., ' . $log_files[-1]; +sub print_log_level +{ + my %infos = (); + + # Some messages have seen their log level change during log parsing. + # Set the real log level count back + foreach my $k (sort {$error_info{$b}{count} <=> $error_info{$a}{count}} keys %error_info) { + next if (!$error_info{$k}{count}); + if ($error_info{$k}{count} > 1) { + for (my $i = 0 ; $i <= $#{$error_info{$k}{date}} ; $i++) { + if ( ($error_info{$k}{error}[$i] =~ s/ERROR: (parameter "[^"]+" changed to)/LOG: $1/) + || ($error_info{$k}{error}[$i] =~ s/ERROR: (database system was shut down)/LOG: $1/) + || ($error_info{$k}{error}[$i] =~ s/ERROR: (database system was interrupted while in recovery)/LOG: $1/) + || ($error_info{$k}{error}[$i] =~ s/ERROR: (recovery has paused)/LOG: $1/)) + { + $logs_type{ERROR}--; + $logs_type{LOG}++; + } + } + } else { + if ( ($error_info{$k}{error}[0] =~ s/ERROR: (parameter "[^"]+" changed to)/LOG: $1/) + || ($error_info{$k}{error}[0] =~ s/ERROR: (database system was shut down)/LOG: $1/) + || ($error_info{$k}{error}[0] =~ s/ERROR: (database system was interrupted while in recovery)/LOG: $1/) + || ($error_info{$k}{error}[0] =~ s/ERROR: (recovery has paused)/LOG: $1/)) + { + $logs_type{ERROR}--; + $logs_type{LOG}++; + } + } } - print $fh qq{ -
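# Sketch of the log-level correction done in print_log_level() above: a few
# messages that PostgreSQL emits at ERROR level during startup/recovery are
# counted as LOG entries instead, so the per-level totals match what the
# report displays. The message below is a hypothetical example.
my %logs_type = (ERROR => 4, LOG => 10);
my $msg = 'ERROR: database system was shut down';
if ($msg =~ s/ERROR: (database system was shut down)/LOG: $1/) {
    $logs_type{ERROR}--;
    $logs_type{LOG}++;
}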
    -
      -
    • Generated on $curdate
    • -
    • Log file: $logfile_str
    • -
    • Parsed $fmt_nlines log entries in $total_time
    • -
    • Log start from $first_log_date to $last_log_date
    • -
    -
    -}; - # Overall statistics - my $fmt_unique = &comma_numbers(scalar keys %normalyzed_info) || 0; - my $fmt_queries = &comma_numbers($overall_stat{'queries_number'}) || 0; - my $fmt_duration = &convert_time($overall_stat{'queries_duration'}) || 0; - print $fh qq{ -
    -

    Overall statistics ^

    -
    -"; } - if ($tempfile_info{count}) { - my $fmt_temp_maxsise = &comma_numbers($tempfile_info{maxsize}) || 0; - my $fmt_temp_avsize = &comma_numbers(sprintf("%.2f", $tempfile_info{maxsize} / $tempfile_info{count})); - print $fh qq{ -
  • Number temporary file: $tempfile_info{count}
  • -
  • Max size of temporary file: $fmt_temp_maxsise
  • -
  • Average size of temporary file: $fmt_temp_avsize
  • -}; + if ($graph) { + my @small = (); + foreach my $d (sort keys %logs_type) { + if ((($logs_type{$d} * 100) / ($total_logs || 1)) > $pie_percentage_limit) { + $infos{$d} = $logs_type{$d} || 0; + } else { + $infos{"Sum log types < $pie_percentage_limit%"} += $logs_type{$d} || 0; + push(@small, $d); + } + } + + if ($#small == 0) { + $infos{$small[0]} = $infos{"Sum log types < $pie_percentage_limit%"}; + delete $infos{"Sum log types < $pie_percentage_limit%"}; + } } - if (!$disable_session && $session_info{count}) { - my $avg_session_duration = &convert_time($session_info{duration} / $session_info{count}); - my $tot_session_duration = &convert_time($session_info{duration}); - print $fh qq{ -
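# Sketch of the pie-chart grouping rule applied above: any category whose
# share is below $pie_percentage_limit is folded into a single
# "Sum ... < N%" slice, and if only one small category was folded it keeps
# its own label instead. The sample counts are hypothetical.
my $pie_percentage_limit = 2;
my %logs_type = (LOG => 9000, ERROR => 950, FATAL => 30, PANIC => 20);
my $total     = 10000;
my (%infos, @small);
foreach my $d (sort keys %logs_type) {
    if (($logs_type{$d} * 100) / $total > $pie_percentage_limit) {
        $infos{$d} = $logs_type{$d};
    } else {
        $infos{"Sum log types < $pie_percentage_limit%"} += $logs_type{$d};
        push(@small, $d);
    }
}
if ($#small == 0) {
    $infos{$small[0]} = delete $infos{"Sum log types < $pie_percentage_limit%"};
}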
  • Total number of sessions: $session_info{count}
  • -
  • Total duration of sessions: $tot_session_duration
  • -
  • Average duration of sessions: $avg_session_duration
  • -}; + $drawn_graphs{logstype_graph} = &flotr2_piegraph($graphid++, 'logstype_graph', 'Logs per type', %infos); + if (!$total_logs) { + $logtype_info = qq{}; } - if (!$disable_connection && $connection_info{count}) { - print $fh qq{ -
  • Total number of connections: $connection_info{count}
  • + $logs_type{ERROR} ||= 0; + $logs_type{FATAL} ||= 0; + $total_logs = &comma_numbers($total_logs); + print $fh qq{ +

    Events

    +
    +

    Log levels

    +
    +

    Key values

    +
    +
      +
    • $total_logs Log entries
    • +
    • $logs_type{ERROR} Number of ERROR entries
    • +
    • $logs_type{FATAL} Number of FATAL entries
    • +
    +
    +
    +
    +
    + +
    +
    + $drawn_graphs{logstype_graph} +
    +
    +
    -
      -
    • Number of unique normalized queries: $fmt_unique
    • -
    • Number of queries: $fmt_queries
    • -
    • Total query duration: $fmt_duration
    • -
    • First query: $overall_stat{'first_query'}
    • -
    • Last query: $overall_stat{'last_query'}
    • -}; - foreach (sort {$overall_stat{'query_peak'}{$b} <=> $overall_stat{'query_peak'}{$a}} keys %{$overall_stat{'query_peak'}}) { - print $fh "
    • Query peak: ", &comma_numbers($overall_stat{'query_peak'}{$_}), " queries/s at $_
    • "; - last; + # Show log types + my $total_logs = 0; + foreach my $d (sort keys %logs_type) { + $total_logs += $logs_type{$d}; } - if (!$disable_error) { - my $fmt_errors = &comma_numbers($overall_stat{'errors_number'}) || 0; - my $fmt_unique_error = &comma_numbers(scalar keys %{$overall_stat{'unique_normalized_errors'}}) || 0; - print $fh qq{ -
    • Number of events: $fmt_errors
    • -
    • Number of unique normalized events: $fmt_unique_error
    • -
    -
    -
      -}; + + my $logtype_info = ''; + foreach my $d (sort keys %logs_type) { + next if (!$logs_type{$d}); + $logtype_info .= "
    $d" . &comma_numbers($logs_type{$d}) . + "" . sprintf("%0.2f", ($logs_type{$d} * 100) / ($total_logs||1)) . "%
    $NODATA
    + + + + + + + + + $logtype_info + +
<th>Type</th> <th>Count</th> <th>Percentage</th>
    +
    +
    + + + +}; + delete $drawn_graphs{logstype_graph}; + } + +sub show_error_as_html +{ + + my $main_error = 0; + my $total = 0; + foreach my $k (sort {$error_info{$b}{count} <=> $error_info{$a}{count}} keys %error_info) { + next if (!$error_info{$k}{count}); + $main_error = &comma_numbers($error_info{$k}{count}) if (!$main_error); + $total += $error_info{$k}{count}; } + $total = &comma_numbers($total); + print $fh qq{ - - - +
    +

    Most Frequent Errors/Events

    +
    +

    Key values

    +
    +
      +
    • $main_error Max number of times the same event was reported
    • +
    • $total Total events found
    • +
    +
    +
    +
    + + + + + + + + + }; + my $rank = 1; + foreach my $k (sort {$error_info{$b}{count} <=> $error_info{$a}{count}} keys %error_info) { + next if (!$error_info{$k}{count}); + my $count = &comma_numbers($error_info{$k}{count}); + my $msg = $k; + $msg =~ s/ERROR: (parameter "[^"]+" changed to)/LOG: $1/; + $msg =~ s/ERROR: (database system was shut down)/LOG: $1/; + $msg =~ s/ERROR: (database system was interrupted while in recovery)/LOG: $1/; + $msg =~ s/ERROR: (recovery has paused)/LOG: $1/; + my $error_level_class = 'text-error'; + if ($msg =~ /^WARNING: /) { + $error_level_class = 'text-warning'; + } elsif ($msg =~ /^LOG: /) { + $error_level_class = 'text-success'; + } elsif ($msg =~ /^HINT: /) { + $error_level_class = 'text-info'; + } elsif ($msg =~ /^FATAL: /) { + $error_level_class = 'text-fatal'; + } elsif ($msg =~ /^PANIC: /) { + $error_level_class = 'text-panic'; + } + my $details = ''; + my %hourly_count = (); + my $days = 0; + foreach my $d (sort keys %{$error_info{$k}{chronos}}) { + my $c = 1; + $d =~ /^\d{4}(\d{2})(\d{2})$/; + $days++; + my $zday = "$abbr_month{$1} $2"; + foreach my $h (sort keys %{$error_info{$k}{chronos}{$d}}) { + $details .= ""; + $hourly_count{"$h"} += $error_info{$k}{chronos}{$d}{$h}{count}; + $zday = ""; + } + } + # Set graph dataset + my %graph_data = (); + foreach my $h ("00" .. "23") { + $graph_data{count} .= "[$h, " . (int($hourly_count{"$h"}/$days) || 0) . "],"; + } + $graph_data{count} =~ s/,$//; + %hourly_count = (); + + my $error_histo = + &flotr2_histograph($graphid++, 'error_graph_'.$rank, $graph_data{count}); - if (!$disable_hourly) { + # Escape HTML code in error message + $msg = &escape_html($msg); print $fh qq{ -

    Hourly statistics ^

    + + + + "; - if (exists $connection_info{chronos}) { - print $fh ""; - } - if (exists $session_info{chronos}) { - $per_hour_info{$d}{$h}{'session'}{average} = - $session_info{chronos}{"$d"}{"$h"}{duration} / ($session_info{chronos}{"$d"}{"$h"}{count} || 1); - print $fh ""; - } - print $fh "\n"; - $c++; - } - } +} - print $fh "
<th>Rank</th> <th>Times reported</th> <th>Error</th>
    $zday$h" . + &comma_numbers($error_info{$k}{chronos}{$d}{$h}{count}) . "
    $rank$count +

    Details

    +
    +
    $msg
    + +
    +

    Times Reported Most Frequent Error / Event #$rank

    + $error_histo + + + + + + + + + + $details + +
<th>Day</th> <th>Hour</th> <th>Count</th>
    +

    +
    +

    + +
    +
    +}; - - - - - - - + for (my $i = 0 ; $i <= $#{$error_info{$k}{date}} ; $i++) { + # Escape HTML code in error message + my $message = &escape_html($error_info{$k}{error}[$i]); + my $details = "Date: " . $error_info{$k}{date}[$i] . "\n"; + if ($error_info{$k}{detail}[$i]) { + $details .= "Detail: " . &escape_html($error_info{$k}{detail}[$i]) . "
    "; + } + if ($error_info{$k}{context}[$i]) { + $details .= "Context: " . &escape_html($error_info{$k}{context}[$i]) . "
    "; + } + if ($error_info{$k}{hint}[$i]) { + $details .= "Hint: " . &escape_html($error_info{$k}{hint}[$i]) . "
    "; + } + if ($error_info{$k}{statement}[$i]) { + $details .= "Statement: " . &escape_html($error_info{$k}{statement}[$i]) . "
    "; + } + if ($error_info{$k}{db}[$i]) { + $details .= "Database: " . $error_info{$k}{db}[$i] . "
    "; + } + print $fh qq{ +
    $message
    +
    $details
    }; - if (exists $connection_info{chronos}) { - print $fh " \n"; - } - if (exists $session_info{chronos}) { - print $fh " \n"; } print $fh qq{ - - - - - - - - - - + +

    + + + + }; - if (exists $connection_info{chronos}) { - print $fh " \n"; - } - if (exists $session_info{chronos}) { - print $fh " \n"; - } - print $fh qq{ - + $rank++; + } + if (scalar keys %error_info == 0) { + print $fh qq{}; + } + + print $fh qq{ + +
    DayHourQueriesSELECT queriesWrite queriesConnectionsSessions
    CountAv. duration CountAv. duration INSERTUPDATEDELETEAv. duration 
    CountAv./sCountAv. duration 
    $NODATA
    +
    + }; - foreach my $d (sort {$a <=> $b} keys %per_hour_info) { - my $c = 1; - $d =~ /^\d{4}(\d{2})(\d{2})$/; - my $zday = "$abbr_month{$1} $2"; - foreach my $h (sort {$a <=> $b} keys %{$per_hour_info{$d}}) { - my $colb = $c % 2; - $zday = " " if ($c > 1); - $per_hour_info{$d}{$h}{average} = $per_hour_info{$d}{$h}{duration} / ($per_hour_info{$d}{$h}{count} || 1); - $per_hour_info{$d}{$h}{'SELECT'}{average} = - $per_hour_info{$d}{$h}{'SELECT'}{duration} / ($per_hour_info{$d}{$h}{'SELECT'}{count} || 1); - my $write_average = ( - ( - $per_hour_info{$d}{$h}{'INSERT'}{duration} + - $per_hour_info{$d}{$h}{'UPDATE'}{duration} + - $per_hour_info{$d}{$h}{'DELETE'}{duration} - ) - || 0 - ) / ( - ( - $per_hour_info{$d}{$h}{'INSERT'}{count} + - $per_hour_info{$d}{$h}{'UPDATE'}{count} + - $per_hour_info{$d}{$h}{'DELETE'}{count} - ) - || 1 - ); - print $fh "
    $zday$h", - &comma_numbers($per_hour_info{$d}{$h}{count}), "", - &convert_time($per_hour_info{$d}{$h}{average}), "", - &comma_numbers($per_hour_info{$d}{$h}{'SELECT'}{count} || 0), "", - &convert_time($per_hour_info{$d}{$h}{'SELECT'}{average} || 0), "", - &comma_numbers($per_hour_info{$d}{$h}{'INSERT'}{count} || 0), "", - &comma_numbers($per_hour_info{$d}{$h}{'UPDATE'}{count} || 0), "", - &comma_numbers($per_hour_info{$d}{$h}{'DELETE'}{count} || 0), "", - &convert_time($write_average), "", &comma_numbers($connection_info{chronos}{"$d"}{"$h"}{count} || 0), - "", - &comma_numbers(sprintf("%0.2f", $connection_info{chronos}{"$d"}{"$h"}{count} / 3600)), "/s", &comma_numbers($session_info{chronos}{"$d"}{"$h"}{count} || 0), - "", &convert_time($per_hour_info{$d}{$h}{'session'}{average}), "
    \n"; +sub load_stats +{ - if ($graph) { - $first_log_date =~ /(\d+)-(\d+)-(\d+) (\d+):(\d+):(\d+)/; - $t_min = timegm_nocheck(0, $5, $4, $3, $2-1, $1) * 1000; - $t_min -= ($avg_minutes*60000); - $t_min_hour = timegm_nocheck(0, 0, $4, $3, $2-1, $1) * 1000; - $last_log_date =~ /(\d+)-(\d+)-(\d+) (\d+):(\d+):(\d+)/; - $t_max = timegm_nocheck(59, $5, $4, $3, $2-1, $1) * 1000; - $t_max += ($avg_minutes*60000); - $t_max_hour = timegm_nocheck(0, 0, $4, $3, $2-1, $1) * 1000; - my @labels = (); - my @data1 = (); - my @data2 = (); - my @data3 = (); - my $d1 = ''; - my $d2 = ''; - my $d3 = ''; - my @avgs = (); - for (my $i = 0 ; $i < 59 ; $i += $avg_minutes) { - push(@avgs, sprintf("%02d", $i)); - } - push(@avgs, 59); - foreach my $tm (sort {$a <=> $b} keys %{$per_minute_info{query}}) { - $tm =~ /(\d{4})(\d{2})(\d{2})/; - my $y = $1 - 1900; - my $mo = $2 - 1; - my $d = $3; - foreach my $h ("00" .. "23") { - my %dataavg = (); - foreach my $m ("00" .. "59") { - my $rd = &average_per_minutes($m, $avg_minutes); - if (exists $per_minute_info{query}{$tm}{$h}{$m}) { - - # Average per minute - $dataavg{average}{"$rd"} += $per_minute_info{query}{$tm}{$h}{$m}{count}; - - # Search minimum and maximum during this minute - foreach my $s (keys %{$per_minute_info{query}{$tm}{$h}{$m}{second}}) { - $dataavg{max}{"$rd"} = $per_minute_info{query}{$tm}{$h}{$m}{second}{$s} - if ($per_minute_info{query}{$tm}{$h}{$m}{second}{$s} > $dataavg{max}{"$rd"}); - $dataavg{min}{"$rd"} = $per_minute_info{query}{$tm}{$h}{$m}{second}{$s} - if (not exists $dataavg{min}{"$rd"} - || ($per_minute_info{query}{$tm}{$h}{$m}{second}{$s} < $dataavg{min}{"$rd"})); - } - } - } - foreach my $rd (@avgs) { - my $t = timegm_nocheck(0, $rd, $h, $d, $mo, $y) * 1000; + my $fd = shift; - next if ($t < $t_min); - last if ($t > $t_max); - # Average per minutes - $d2 .= "[$t, " . int(($dataavg{average}{"$rd"} || 0) / (60 * $avg_minutes)) . "],"; + my %stats = %{ fd_retrieve($fd) }; + my %_overall_stat = %{$stats{overall_stat}}; + my %_overall_checkpoint = %{$stats{overall_checkpoint}}; + my %_normalyzed_info = %{$stats{normalyzed_info}}; + my %_error_info = %{$stats{error_info}}; + my %_connection_info = %{$stats{connection_info}}; + my %_database_info = %{$stats{database_info}}; + my %_application_info = %{$stats{application_info}}; + my %_user_info = %{$stats{user_info}}; + my %_host_info = %{$stats{host_info}}; + my %_checkpoint_info = %{$stats{checkpoint_info}}; + my %_session_info = %{$stats{session_info}}; + my %_tempfile_info = %{$stats{tempfile_info}}; + my %_logs_type = %{$stats{logs_type}}; + my %_lock_info = %{$stats{lock_info}}; + my %_per_minute_info = %{$stats{per_minute_info}}; + my @_top_slowest = @{$stats{top_slowest}}; + my $_nlines = $stats{nlines}; + my $_first_log_timestamp = $stats{first_log_timestamp}; + my $_last_log_timestamp = $stats{last_log_timestamp}; + my @_log_files = @{$stats{log_files}}; + my %_autovacuum_info = %{$stats{autovacuum_info}}; + my %_autoanalyze_info = %{$stats{autoanalyze_info}}; + my @_top_locked_info = @{$stats{top_locked_info}}; + my @_top_tempfile_info = @{$stats{top_tempfile_info}}; - # Maxi per minute - $d1 .= "[$t, " . ($dataavg{max}{"$rd"} || 0) . "],"; + ### overall_stat ### - # Mini per minute - $d3 .= "[$t, " . ($dataavg{min}{"$rd"} || 0) . "],"; - } - } - } - delete $per_minute_info{query}; - $d1 =~ s/,$//; - $d2 =~ s/,$//; - $d3 =~ s/,$//; - &flotr2_graph( - 1, 'queriespersecond_graph', $d1, $d2, $d3, 'Queries per second (' . $avg_minutes . 
' minutes average)', - 'Queries per second', 'Maximum', 'Average', 'Minimum' - ); - $d1 = ''; - $d2 = ''; - $d3 = ''; - - if (exists $per_minute_info{connection}) { - foreach my $tm (sort {$a <=> $b} keys %{$per_minute_info{connection}}) { - $tm =~ /(\d{4})(\d{2})(\d{2})/; - my $y = $1 - 1900; - my $mo = $2 - 1; - my $d = $3; - foreach my $h ("00" .. "23") { - my %dataavg = (); - foreach my $m ("00" .. "59") { - my $rd = &average_per_minutes($m, $avg_minutes); - if (exists $per_minute_info{connection}{$tm}{$h}{$m}) { + $overall_stat{queries_number} += $_overall_stat{queries_number}; - # Average per minute - $dataavg{average}{"$rd"} += $per_minute_info{connection}{$tm}{$h}{$m}{count}; + if ($_overall_stat{'first_log_ts'}) { + $overall_stat{'first_log_ts'} = $_overall_stat{'first_log_ts'} + if (!$overall_stat{'first_log_ts'} || + ($overall_stat{'first_log_ts'} gt $_overall_stat{'first_log_ts'})); + } - # Search minimum and maximum during this minute - foreach my $s (keys %{$per_minute_info{connection}{$tm}{$h}{$m}{second}}) { - $dataavg{max}{"$rd"} = $per_minute_info{connection}{$tm}{$h}{$m}{second}{$s} - if ($per_minute_info{connection}{$tm}{$h}{$m}{second}{$s} > $dataavg{max}{"$rd"}); - $dataavg{min}{"$rd"} = $per_minute_info{connection}{$tm}{$h}{$m}{second}{$s} - if (not exists $dataavg{min}{"$rd"} - || ($per_minute_info{connection}{$tm}{$h}{$m}{second}{$s} < $dataavg{min}{"$rd"})); - } - } - } - foreach my $rd (@avgs) { - my $t = timegm_nocheck(0, $rd, $h, $d, $mo, $y) * 1000; + $overall_stat{'last_log_ts'} = $_overall_stat{'last_log_ts'} + if not $overall_stat{'last_log_ts'} + or $overall_stat{'last_log_ts'} lt $_overall_stat{'last_log_ts'}; - next if ($t < $t_min); - last if ($t > $t_max); - # Average per minutes - $d2 .= "[$t, " . int(($dataavg{average}{"$rd"} || 0) / (60 * $avg_minutes)) . "],"; + if ($_overall_stat{'first_query_ts'}) { + $overall_stat{'first_query_ts'} = $_overall_stat{'first_query_ts'} + if (!$overall_stat{'first_query_ts'} || + ($overall_stat{'first_query_ts'} gt $_overall_stat{'first_query_ts'})); + } - # Maxi per minute - $d1 .= "[$t, " . ($dataavg{max}{"$rd"} || 0) . "],"; + $overall_stat{'last_query_ts'} = $_overall_stat{'last_query_ts'} + if not $overall_stat{'last_query_ts'} + or $overall_stat{'last_query_ts'} lt $_overall_stat{'last_query_ts'}; - # Mini per minute - $d3 .= "[$t, " . ($dataavg{min}{"$rd"} || 0) . "],"; - } - } - } - delete $per_minute_info{connection}; - $d1 =~ s/,$//; - $d2 =~ s/,$//; - $d3 =~ s/,$//; - &flotr2_graph( - 2, 'connectionspersecond_graph', $d1, $d2, $d3, 'Connections per second (' . $avg_minutes . ' minutes average)', - 'Connections per second', 'Maximum', 'Average', 'Minimum' - ); - $d1 = ''; - $d2 = ''; - $d3 = ''; - } - - # All queries - foreach my $tm (sort {$a <=> $b} keys %per_hour_info) { - $tm =~ /(\d{4})(\d{2})(\d{2})/; - my $y = $1 - 1900; - my $mo = $2 - 1; - my $d = $3; - foreach my $h ("00" .. "23") { - my $t = timegm_nocheck(0, 0, $h, $d, $mo, $y) * 1000; - next if ($t < $t_min_hour); - last if ($t > $t_max_hour); - $d1 .= "[$t, " . ($per_hour_info{$tm}{$h}{count} || 0) . "],"; - $d2 .= "[$t, " - . sprintf("%.2f", (($per_hour_info{$tm}{$h}{duration} || 0) / ($per_hour_info{$tm}{$h}{count} || 1)) / 1000) - . 
"],"; - } - } - $d1 =~ s/,$//; - $d2 =~ s/,$//; - &flotr2_graph( - 3, 'allqueries_graph', $d1, '', '', 'All queries', - 'Queries', 'Number of queries', '', '', 'Duration', $d2, 'Average duration (s)' - ); - $d1 = ''; - $d2 = ''; - - # Select queries - foreach my $tm (sort {$a <=> $b} keys %per_hour_info) { - $tm =~ /(\d{4})(\d{2})(\d{2})/; - my $y = $1 - 1900; - my $mo = $2 - 1; - my $d = $3; - foreach my $h ("00" .. "23") { - my $t = timegm_nocheck(0, 0, $h, $d, $mo, $y) * 1000; - next if ($t < $t_min_hour); - last if ($t > $t_max_hour); - $d1 .= "[$t, " . ($per_hour_info{$tm}{$h}{'SELECT'}{count} || 0) . "],"; - $d2 .= "[$t, " - . sprintf( - "%.2f", - (($per_hour_info{$tm}{$h}{'SELECT'}{duration} || 0) / ($per_hour_info{$tm}{$h}{'SELECT'}{count} || 1)) / 1000 - ) . "],"; - } - } - $d1 =~ s/,$//; - $d2 =~ s/,$//; - &flotr2_graph( - 4, 'selectqueries_graph', $d1, '', '', 'SELECT queries', - 'Queries', 'Number of queries', '', '', 'Duration', $d2, 'Average duration (s)' - ); - $d1 = ''; - $d2 = ''; - - # Write queries - my $d4 = ''; - foreach my $tm (sort {$a <=> $b} keys %per_hour_info) { - $tm =~ /(\d{4})(\d{2})(\d{2})/; - my $y = $1 - 1900; - my $mo = $2 - 1; - my $d = $3; - foreach my $h ("00" .. "23") { - my $t = timegm_nocheck(0, 0, $h, $d, $mo, $y) * 1000; - next if ($t < $t_min_hour); - last if ($t > $t_max_hour); - my $wcount = - $per_hour_info{$tm}{$h}{'UPDATE'}{count} + - $per_hour_info{$tm}{$h}{'DELETE'}{count} + - $per_hour_info{$tm}{$h}{'INSERT'}{count}; - my $wduration = - $per_hour_info{$tm}{$h}{'UPDATE'}{duration} + - $per_hour_info{$tm}{$h}{'DELETE'}{duration} + - $per_hour_info{$tm}{$h}{'INSERT'}{duration}; - $d1 .= "[$t, " . ($per_hour_info{$tm}{$h}{'DELETE'}{count} || 0) . "],"; - $d2 .= "[$t, " . ($per_hour_info{$tm}{$h}{'INSERT'}{count} || 0) . "],"; - $d3 .= "[$t, " . ($per_hour_info{$tm}{$h}{'UPDATE'}{count} || 0) . "],"; - $d4 .= "[$t, " . sprintf("%.2f", (($wduration || 0) / ($wcount || 1)) / 1000) . 
"],"; - } - } - $d1 =~ s/,$//; - $d2 =~ s/,$//; - $d3 =~ s/,$//; - $d4 =~ s/,$//; - &flotr2_graph( - 5, 'writequeries_graph', $d1, $d2, $d3, 'Write queries', - 'Queries', 'DELETE queries', 'INSERT queries', 'UPDATE queries', 'Duration', $d4, 'Average duration (s)' - ); - $d1 = ''; - $d2 = ''; - $d3 = ''; - $d4 = ''; - - if ($tempfile_info{count} || exists $checkpoint_info{chronos}) { - print $fh qq{}; - } - if ($tempfile_info{count}) { - print $fh qq{}; - } - if (exists $checkpoint_info{chronos}) { - print $fh qq{}; - } - if (exists $checkpoint_info{warning}) { - print $fh qq{}; - } - if ($tempfile_info{count} || exists $checkpoint_info{chronos}) { - print $fh qq{}; - } - if ($tempfile_info{count}) { - print $fh qq{}; - } - if (exists $checkpoint_info{chronos}) { - print $fh -qq{}; - } - if (exists $checkpoint_info{warning}) { - print $fh qq{}; - } - if ($tempfile_info{count} || exists $checkpoint_info{chronos}) { - print $fh qq{}; - foreach my $d (sort {$a <=> $b} keys %per_hour_info) { - my $c = 1; - $d =~ /^\d{4}(\d{2})(\d{2})$/; - my $zday = "$abbr_month{$1} $2"; - foreach my $h (sort {$a <=> $b} keys %{$per_hour_info{$d}}) { - my $colb = $c % 2; - $zday = " " if ($c > 1); - print $fh ""; - if ($tempfile_info{count}) { - my $temp_average = '0'; - if ($tempfile_info{chronos}{$d}{$h}{count}) { - $temp_average = &comma_numbers( - sprintf("%.2f", $tempfile_info{chronos}{$d}{$h}{size} / $tempfile_info{chronos}{$d}{$h}{count})); - } - print $fh ""; - } - if (exists $checkpoint_info{chronos}) { - if (exists $checkpoint_info{chronos}{$d}{$h}) { - print $fh ""; - if ($checkpoint_info{warning}) { - print $fh ""; - } - } else { - print $fh -""; - if ($checkpoint_info{warning}) { - print $fh ""; - } - } - } - print $fh "\n"; - $c++; - } - } - print $fh "
    DayHourTemporary filesCheckpointsCheckpoint warning
    CountAv. sizeWrote buffersAddedRemovedRecycledWrite time (sec)Sync time (sec)Total time (sec)CountAv. time (sec)
    $zday$h", &comma_numbers($tempfile_info{chronos}{$d}{$h}{count} || 0), - "$temp_average", &comma_numbers($checkpoint_info{chronos}{$d}{$h}{wbuffer}) || 0, - "", &comma_numbers($checkpoint_info{chronos}{$d}{$h}{file_added}) || 0, - "", &comma_numbers($checkpoint_info{chronos}{$d}{$h}{file_removed}) || 0, - "", - &comma_numbers($checkpoint_info{chronos}{$d}{$h}{file_recycled}) || 0, - "", &comma_numbers($checkpoint_info{chronos}{$d}{$h}{write}) || 0, - "", &comma_numbers($checkpoint_info{chronos}{$d}{$h}{sync}) || 0, - "", &comma_numbers($checkpoint_info{chronos}{$d}{$h}{total}) || 0, - "", &comma_numbers($checkpoint_info{chronos}{$d}{$h}{warning}) || 0, - "", - &comma_numbers( - sprintf( - "%.2f", - ($checkpoint_info{chronos}{$d}{$h}{warning_seconds} || 0) / - ($checkpoint_info{chronos}{$d}{$h}{warning} || 1) - ) - ) || 0, "000000000
    \n"; - } + $overall_stat{errors_number} += $_overall_stat{errors_number}; + $overall_stat{queries_duration} += $_overall_stat{queries_duration}; - # checkpoint size - if (exists $checkpoint_info{chronos}) { - foreach my $tm (sort {$a <=> $b} keys %{$checkpoint_info{chronos}}) { - $tm =~ /(\d{4})(\d{2})(\d{2})/; - my $y = $1 - 1900; - my $mo = $2 - 1; - my $d = $3; - foreach my $h ("00" .. "23") { - my $t = timegm_nocheck(0, 0, $h, $d, $mo, $y) * 1000; - next if ($t < $t_min_hour); - last if ($t > $t_max_hour); - $d1 .= "[$t, " . ($checkpoint_info{chronos}{$tm}{$h}{wbuffer} || 0) . "],"; - } - } - $d1 =~ s/,$//; - &flotr2_graph( - 6, 'checkpointwritebuffers_graph', $d1, '', '', 'Checkpoint write buffers', - 'Buffers', 'Write buffers', '', '' - ); - $d1 = ''; - - foreach my $tm (sort {$a <=> $b} keys %{$checkpoint_info{chronos}}) { - $tm =~ /(\d{4})(\d{2})(\d{2})/; - my $y = $1 - 1900; - my $mo = $2 - 1; - my $d = $3; - foreach my $h ("00" .. "23") { - my $t = timegm_nocheck(0, 0, $h, $d, $mo, $y) * 1000; - next if ($t < $t_min_hour); - last if ($t > $t_max_hour); - $d1 .= "[$t, " . ($checkpoint_info{chronos}{$tm}{$h}{file_added} || 0) . "],"; - $d2 .= "[$t, " . ($checkpoint_info{chronos}{$tm}{$h}{file_removed} || 0) . "],"; - $d3 .= "[$t, " . ($checkpoint_info{chronos}{$tm}{$h}{file_recycled} || 0) . "],"; - } - } - $d1 =~ s/,$//; - $d2 =~ s/,$//; - $d3 =~ s/,$//; - &flotr2_graph( - 7, 'checkpointfiles_graph', $d1, $d2, $d3, 'Checkpoint Wal files usage', - 'Number of files', 'Added', 'Removed', 'Recycled' - ); - $d1 = ''; - $d2 = ''; - $d3 = ''; - } - - # Temporary file size - if (exists $tempfile_info{chronos}) { - foreach my $tm (sort {$a <=> $b} keys %{$tempfile_info{chronos}}) { - $tm =~ /(\d{4})(\d{2})(\d{2})/; - my $y = $1 - 1900; - my $mo = $2 - 1; - my $d = $3; - foreach my $h ("00" .. "23") { - my $t = timegm_nocheck(0, 0, $h, $d, $mo, $y) * 1000; - next if ($t < $t_min_hour); - last if ($t > $t_max_hour); - $d1 .= "[$t, " . ($tempfile_info{chronos}{$tm}{$h}{size} || 0) . "],"; - $d2 .= "[$t, " . ($tempfile_info{chronos}{$tm}{$h}{count} || 0) . 
"],"; - } - } - $d1 =~ s/,$//; - $d2 =~ s/,$//; - &flotr2_graph( - 8, 'temporaryfile_graph', $d1, '', '', 'Temporary files', - 'Size of files', 'Size of files', '', '', 'Number of files', $d2, 'Number of files' - ); - $d1 = ''; - $d2 = ''; - } - } + $overall_stat{DELETE} += $_overall_stat{DELETE} + if exists $_overall_stat{DELETE}; + $overall_stat{UPDATE} += $_overall_stat{UPDATE} + if exists $_overall_stat{UPDATE}; + $overall_stat{INSERT} += $_overall_stat{INSERT} + if exists $_overall_stat{INSERT}; + $overall_stat{SELECT} += $_overall_stat{SELECT} + if exists $_overall_stat{SELECT}; + + $overall_checkpoint{checkpoint_warning} += $_overall_checkpoint{checkpoint_warning}; + $overall_checkpoint{checkpoint_write} = $_overall_checkpoint{checkpoint_write} + if ($_overall_checkpoint{checkpoint_write} > $overall_checkpoint{checkpoint_write}); + $overall_checkpoint{checkpoint_sync} = $_overall_checkpoint{checkpoint_sync} + if ($_overall_checkpoint{checkpoint_sync} > $overall_checkpoint{checkpoint_sync}); + foreach my $k (keys %{$_overall_stat{peak}}) { + $overall_stat{peak}{$k}{query} += $_overall_stat{peak}{$k}{query}; + $overall_stat{peak}{$k}{select} += $_overall_stat{peak}{$k}{select}; + $overall_stat{peak}{$k}{write} += $_overall_stat{peak}{$k}{write}; + $overall_stat{peak}{$k}{connection} += $_overall_stat{peak}{$k}{connection}; + $overall_stat{peak}{$k}{session} += $_overall_stat{peak}{$k}{session}; + $overall_stat{peak}{$k}{tempfile_size} += $_overall_stat{peak}{$k}{tempfile_size}; + $overall_stat{peak}{$k}{tempfile_count} += $_overall_stat{peak}{$k}{tempfile_count}; } - # INSERT/DELETE/UPDATE/SELECT repartition - if (!$disable_type) { - print $fh qq{ -

    Queries by type ^

    - -
    - - - - - - - -}; - $overall_stat{'SELECT'} ||= 0; - $overall_stat{'INSERT'} ||= 0; - $overall_stat{'UPDATE'} ||= 0; - $overall_stat{'DELETE'} ||= 0; - my $totala = $overall_stat{'SELECT'} + $overall_stat{'INSERT'} + $overall_stat{'UPDATE'} + $overall_stat{'DELETE'}; - my $total = $overall_stat{'queries_number'} || 1; + foreach my $k (keys %{$_overall_stat{histogram}{query_time}}) { + $overall_stat{histogram}{query_time}{$k} += $_overall_stat{histogram}{query_time}{$k}; + } + $overall_stat{histogram}{total} += $_overall_stat{histogram}{total}; - print $fh "\n"; - print $fh "\n"; - print $fh "\n"; - print $fh "\n"; - print $fh "\n" - if (($total - $totala) > 0); - print $fh "
<th>Type</th> <th>Count</th> <th>Percentage</th>
    SELECT", &comma_numbers($overall_stat{'SELECT'}), - "", sprintf("%0.2f", ($overall_stat{'SELECT'} * 100) / $total), "%
    INSERT", &comma_numbers($overall_stat{'INSERT'}), - "", sprintf("%0.2f", ($overall_stat{'INSERT'} * 100) / $total), "%
    UPDATE", &comma_numbers($overall_stat{'UPDATE'}), - "", sprintf("%0.2f", ($overall_stat{'UPDATE'} * 100) / $total), "%
    DELETE", &comma_numbers($overall_stat{'DELETE'}), - "", sprintf("%0.2f", ($overall_stat{'DELETE'} * 100) / $total), "%
    OTHERS", &comma_numbers($total - $totala), - "", sprintf("%0.2f", (($total - $totala) * 100) / $total), "%
    \n"; + foreach my $k ('prepare', 'bind','execute') { + $overall_stat{$k} += $_overall_stat{$k}; + } - if ($graph && $totala) { - my %data = (); - foreach my $t ('SELECT', 'INSERT', 'UPDATE', 'DELETE') { - if ((($overall_stat{$t} * 100) / $total) > $pie_percentage_limit) { - $data{$t} = $overall_stat{$t} || 0; - } else { - $data{"Sum types < $pie_percentage_limit%"} += $overall_stat{$t} || 0; - } - } - if (((($total - $totala) * 100) / $total) > $pie_percentage_limit) { - $data{'Others'} = $total - $totala; - } else { - $data{"Sum types < $pie_percentage_limit%"} += $total - $totala; - } - &flotr2_piegraph(9, 'queriesbytype_graph', 'Type of queries', %data); - } - print $fh "
    \n"; + foreach my $k (keys %{$_overall_checkpoint{peak}}) { + $overall_checkpoint{peak}{$k}{checkpoint_wbuffer} += $_overall_checkpoint{peak}{$k}{checkpoint_wbuffer}; + $overall_checkpoint{peak}{$k}{walfile_usage} += $_overall_checkpoint{peak}{$k}{walfile_usage}; } - # Lock stats per type - if (!$disable_lock && scalar keys %lock_info > 0) { - print $fh qq{ -

    Locks by type ^

    - -
    - - - - - - - - -}; - my $total_count = 0; - my $total_duration = 0; - foreach my $t (sort keys %lock_info) { - print $fh "\n"; - foreach my $o (sort keys %{$lock_info{$t}}) { - next if (($o eq 'count') || ($o eq 'duration') || ($o eq 'chronos')); - print $fh "\n"; - } - $total_count += $lock_info{$t}{count}; - $total_duration += $lock_info{$t}{duration}; - } - print $fh "\n"; - print $fh "
<th>Type</th> <th>Object</th> <th>Count</th> <th>Total Duration</th> <th>Av. duration (s)</th>
    $t", &comma_numbers($lock_info{$t}{count}), - "", &convert_time($lock_info{$t}{duration}), "", - &convert_time($lock_info{$t}{duration} / $lock_info{$t}{count}), "
    $o", - &comma_numbers($lock_info{$t}{$o}{count}), "", - &convert_time($lock_info{$t}{$o}{duration}), "", - &convert_time($lock_info{$t}{$o}{duration} / $lock_info{$t}{$o}{count}), "
    Total", &comma_numbers($total_count), - "", &convert_time($total_duration), "", - &convert_time($total_duration / ($total_count || 1)), "
    \n"; - if ($graph && $total_count) { - my %locktype = (); - my @small = (); - foreach my $d (sort keys %lock_info) { - if ((($lock_info{$d}{count} * 100) / $total_count) > $pie_percentage_limit) { - $locktype{$d} = $lock_info{$d}{count} || 0; - } else { - $locktype{"Sum types < $pie_percentage_limit%"} += $lock_info{$d}{count} || 0; - push(@small, $d); + ### Logs level ### + foreach my $l (qw(LOG WARNING ERROR FATAL PANIC DETAIL HINT STATEMENT CONTEXT)) { + $logs_type{$l} += $_logs_type{$l} if exists $_logs_type{$l}; + } - } - } - if ($#small == 0) { - $locktype{$small[0]} = $locktype{"Sum types < $pie_percentage_limit%"}; - delete $locktype{"Sum types < $pie_percentage_limit%"}; - } - &flotr2_piegraph(10, 'lockbytype_graph', 'Type of locks', %locktype); + ### database_info ### + + foreach my $db (keys %_database_info) { + foreach my $k (keys %{ $_database_info{$db} }) { + $database_info{$db}{$k} += $_database_info{$db}{$k}; } - print $fh "
    \n"; } - # Show session per database statistics - if (!$disable_session && exists $session_info{database}) { - print $fh qq{ -

    Sessions per database ^

    - -
    - - - - - - - -}; - my $total_count = 0; - my $c = 0; - foreach my $d (sort keys %{$session_info{database}}) { - my $colb = $c % 2; - print $fh "\n"; - $total_count += $session_info{database}{$d}{count}; - $c++; - } - print $fh "
<th>Database</th> <th>Count</th> <th>Total Duration</th> <th>Av. duration (s)</th>
    $d", &comma_numbers($session_info{database}{$d}{count}), - "", &convert_time($session_info{database}{$d}{duration}), "", - &convert_time($session_info{database}{$d}{duration} / $session_info{database}{$d}{count}), "
    \n"; - if ($graph && $total_count) { - my %infos = (); - my @small = (); - foreach my $d (sort keys %{$session_info{database}}) { - if ((($session_info{database}{$d}{count} * 100) / $total_count) > $pie_percentage_limit) { - $infos{$d} = $session_info{database}{$d}{count} || 0; - } else { - $infos{"Sum sessions < $pie_percentage_limit%"} += $session_info{database}{$d}{count} || 0; - push(@small, $d); - } - } - if ($#small == 0) { - $infos{$small[0]} = $infos{"Sum sessions < $pie_percentage_limit%"}; - delete $infos{"Sum sessions < $pie_percentage_limit%"}; - } - &flotr2_piegraph(11, 'databasesessions_graph', 'Sessions per database', %infos); + ### application_info ### + + foreach my $app (keys %_application_info) { + foreach my $k (keys %{ $_application_info{$app} }) { + $application_info{$app}{$k} += $_application_info{$app}{$k}; } - print $fh "
    \n"; } - # Show session per user statistics - if (!$disable_session && exists $session_info{user}) { - print $fh qq{ -

    Sessions per user ^

    - -
    - - - - - - - -}; - my $total_count = 0; - my $c = 0; - foreach my $d (sort keys %{$session_info{user}}) { - my $colb = $c % 2; - $total_count += $session_info{user}{$d}{count}; - print $fh "\n"; - $c++; - } - print $fh "
<th>User</th> <th>Count</th> <th>Total Duration</th> <th>Av. duration (s)</th>
    $d", &comma_numbers($session_info{user}{$d}{count}), - "", &convert_time($session_info{user}{$d}{duration}), "", - &convert_time($session_info{user}{$d}{duration} / $session_info{user}{$d}{count}), "
    \n"; - if ($graph && $total_count) { - my %infos = (); - my @small = (); - foreach my $d (sort keys %{$session_info{user}}) { - if ((($session_info{user}{$d}{count} * 100) / $total_count) > $pie_percentage_limit) { - $infos{$d} = $session_info{user}{$d}{count} || 0; - } else { - $infos{"Sum sessions < $pie_percentage_limit%"} += $session_info{user}{$d}{count} || 0; - push(@small, $d); - } - } - if ($#small == 0) { - $infos{$small[0]} = $infos{"Sum sessions < $pie_percentage_limit%"}; - delete $infos{"Sum sessions < $pie_percentage_limit%"}; - } - &flotr2_piegraph(12, 'usersessions_graph', 'Sessions per user', %infos); + ### user_info ### + + foreach my $u (keys %_user_info) { + foreach my $k (keys %{ $_user_info{$u} }) { + $user_info{$u}{$k} += $_user_info{$u}{$k}; } - print $fh "
    \n"; } - # Show session per host statistics - if (!$disable_session && exists $session_info{host}) { - print $fh qq{ -

    Sessions per host ^

    - -
    - - - - - - - -}; - my $total_count = 0; - my $c = 0; - foreach my $d (sort keys %{$session_info{host}}) { - my $colb = $c % 2; - $total_count += $session_info{host}{$d}{count}; - print $fh "\n"; - $c++; + ### host_info ### + + foreach my $h (keys %_host_info) { + foreach my $k (keys %{ $_host_info{$h} }) { + $host_info{$h}{$k} += $_host_info{$h}{$k}; } - print $fh "
<th>Host</th> <th>Count</th> <th>Total Duration</th> <th>Av. duration (s)</th>
    $d", &comma_numbers($session_info{host}{$d}{count}), - "", &convert_time($session_info{host}{$d}{duration}), "", - &convert_time($session_info{host}{$d}{duration} / $session_info{host}{$d}{count}), "
    \n"; - if ($graph && $total_count) { - my %infos = (); - my @small = (); - foreach my $d (sort keys %{$session_info{host}}) { - if ((($session_info{host}{$d}{count} * 100) / $total_count) > $pie_percentage_limit) { - $infos{$d} = $session_info{host}{$d}{count} || 0; - } else { - $infos{"Sum sessions < $pie_percentage_limit%"} += $session_info{host}{$d}{count} || 0; - push(@small, $d); - } - } - if ($#small == 0) { - $infos{$small[0]} = $infos{"Sum sessions < $pie_percentage_limit%"}; - delete $infos{"Sum sessions < $pie_percentage_limit%"}; - } - &flotr2_piegraph(13, 'hostsessions_graph', 'Sessions per host', %infos); + } + + + ### connection_info ### + + foreach my $db (keys %{ $_connection_info{database} }) { + $connection_info{database}{$db} += $_connection_info{database}{$db}; + } + + foreach my $db (keys %{ $_connection_info{database_user} }) { + foreach my $user (keys %{ $_connection_info{database_user}{$db} }) { + $connection_info{database_user}{$db}{$user} += $_connection_info{database_user}{$db}{$user}; } - print $fh "
    \n"; } - # Show connection per database statistics - if (!$disable_connection && exists $connection_info{database}) { - print $fh qq{ -

    Connections per database ^

    - -
    - - - - - - -}; - my $total_count = 0; - foreach my $d (sort keys %{$connection_info{database}}) { - print $fh "\n"; - $total_count += $connection_info{database}{$d}; - foreach my $u (sort keys %{$connection_info{user}}) { - next if (!exists $connection_info{database_user}{$d}{$u}); - print $fh "\n"; - } + foreach my $user (keys %{ $_connection_info{user} }) { + $connection_info{user}{$user} += $_connection_info{user}{$user}; + } + + foreach my $host (keys %{ $_connection_info{host} }) { + $connection_info{host}{$host} += $_connection_info{host}{$host}; + } + + $connection_info{count} += $_connection_info{count}; + + foreach my $day (keys %{ $_connection_info{chronos} }) { + foreach my $hour (keys %{ $_connection_info{chronos}{$day} }) { + + $connection_info{chronos}{$day}{$hour}{count} += $_connection_info{chronos}{$day}{$hour}{count} + +############################################################################### +# May be used in the future to display more detailed information on connection +# +# foreach my $db (keys %{ $_connection_info{chronos}{$day}{$hour}{database} }) { +# $connection_info{chronos}{$day}{$hour}{database}{$db} += $_connection_info{chronos}{$day}{$hour}{database}{$db}; +# } +# +# foreach my $db (keys %{ $_connection_info{chronos}{$day}{$hour}{database_user} }) { +# foreach my $user (keys %{ $_connection_info{chronos}{$day}{$hour}{database_user}{$db} }) { +# $connection_info{chronos}{$day}{$hour}{database_user}{$db}{$user} += +# $_connection_info{chronos}{$day}{$hour}{database_user}{$db}{$user}; +# } +# } +# +# foreach my $user (keys %{ $_connection_info{chronos}{$day}{$hour}{user} }) { +# $connection_info{chronos}{$day}{$hour}{user}{$user} += +# $_connection_info{chronos}{$day}{$hour}{user}{$user}; +# } +# +# foreach my $host (keys %{ $_connection_info{chronos}{$day}{$hour}{host} }) { +# $connection_info{chronos}{$day}{$hour}{host}{$host} += +# $_connection_info{chronos}{$day}{$hour}{host}{$host}; +# } +############################################################################### } - print $fh "
<th>Database</th> <th>User</th> <th>Count</th>
    $d", - &comma_numbers($connection_info{database}{$d}), "
    $u", - &comma_numbers($connection_info{database_user}{$d}{$u}), "
    \n"; - if ($graph && $total_count) { - my %infos = (); - my @small = (); - foreach my $d (sort keys %{$connection_info{database}}) { - if ((($connection_info{database}{$d} * 100) / $total_count) > $pie_percentage_limit) { - $infos{$d} = $connection_info{database}{$d} || 0; - } else { - $infos{"Sum connections < $pie_percentage_limit%"} += $connection_info{database}{$d} || 0; - push(@small, $d); + } + + ### log_files ### + + foreach my $f (@_log_files) { + push(@log_files, $f) if (!grep(m#^$f$#, @_log_files)); + } + + ### error_info ### + + foreach my $q (keys %_error_info) { + $error_info{$q}{count} += $_error_info{$q}{count}; + # Keep only the wanted sample number + if (!exists $error_info{$q}{date} || ($#{$error_info{$q}{date}} < $sample)) { + push(@{$error_info{$q}{date}}, @{$_error_info{$q}{date}}); + push(@{$error_info{$q}{detail}}, @{$_error_info{$q}{detail}}); + push(@{$error_info{$q}{context}}, @{$_error_info{$q}{context}}); + push(@{$error_info{$q}{statement}}, @{$_error_info{$q}{statement}}); + push(@{$error_info{$q}{hint}}, @{$_error_info{$q}{hint}}); + push(@{$error_info{$q}{error}}, @{$_error_info{$q}{error}}); + push(@{$error_info{$q}{db}}, @{$_error_info{$q}{db}}); + foreach my $day (keys %{ $_error_info{$q}{chronos} }) { + foreach my $hour (keys %{$_error_info{$q}{chronos}{$day}}) { + $error_info{$q}{chronos}{$day}{$hour}{count} += $_error_info{$q}{chronos}{$day}{$hour}{count}; } } - if ($#small == 0) { - $infos{$small[0]} = $infos{"Sum connections < $pie_percentage_limit%"}; - delete $infos{"Sum connections < $pie_percentage_limit%"}; - } - &flotr2_piegraph(14, 'databaseconnections_graph', 'Connections per database', %infos); } - print $fh "
    \n"; } - # Show connection per user statistics - if (!$disable_connection && exists $connection_info{user}) { - print $fh qq{ -

    Connections per user ^

    - -
    - - - - - -}; + ### per_minute_info ### - my $total_count = 0; - my $c = 0; - foreach my $u (sort keys %{$connection_info{user}}) { - my $colb = $c % 2; - print $fh "\n"; - $total_count += $connection_info{user}{$u}; - $c++; - } - print $fh "
<th>User</th> <th>Count</th>
    $u", &comma_numbers($connection_info{user}{$u}), - "
    \n"; - if ($graph && $total_count) { - my %infos = (); - my @small = (); - foreach my $d (sort keys %{$connection_info{user}}) { - if ((($connection_info{user}{$d} * 100) / $total_count) > $pie_percentage_limit) { - $infos{$d} = $connection_info{user}{$d} || 0; - } else { - $infos{"Sum connections < $pie_percentage_limit%"} += $connection_info{user}{$d} || 0; - push(@small, $d); + foreach my $day (keys %_per_minute_info) { + foreach my $hour (keys %{ $_per_minute_info{$day} }) { + foreach my $min (keys %{ $_per_minute_info{$day}{$hour} }) { + $per_minute_info{$day}{$hour}{$min}{connection}{count} += + ($_per_minute_info{$day}{$hour}{$min}{connection}{count} || 0); + $per_minute_info{$day}{$hour}{$min}{session}{count} += + ($_per_minute_info{$day}{$hour}{$min}{session}{count} || 0); + $per_minute_info{$day}{$hour}{$min}{query}{count} += + ($_per_minute_info{$day}{$hour}{$min}{query}{count} || 0); + $per_minute_info{$day}{$hour}{$min}{query}{duration} += $_per_minute_info{$day}{$hour}{$min}{query}{duration}; + + foreach my $sec (keys %{ $_per_minute_info{$day}{$hour}{$min}{connection}{second} }) { + $per_minute_info{$day}{$hour}{$min}{connection}{second}{$sec} += + ($_per_minute_info{$day}{$hour}{$min}{connection}{second}{$sec} || 0); } + foreach my $sec (keys %{ $_per_minute_info{$day}{$hour}{$min}{session}{second} }) { + $per_minute_info{$day}{$hour}{$min}{session}{second}{$sec} += + ($_per_minute_info{$day}{$hour}{$min}{session}{second}{$sec} || 0); + } + foreach my $sec (keys %{ $_per_minute_info{$day}{$hour}{$min}{query}{second} }) { + $per_minute_info{$day}{$hour}{$min}{query}{second}{$sec} += + ($_per_minute_info{$day}{$hour}{$min}{query}{second}{$sec} || 0); + } + foreach my $action (@SQL_ACTION) { + if (exists $_per_minute_info{$day}{$hour}{$min}{$action}) { + $per_minute_info{$day}{$hour}{$min}{$action}{count} += $_per_minute_info{$day}{$hour}{$min}{$action}{count}; + $per_minute_info{$day}{$hour}{$min}{$action}{duration} += $_per_minute_info{$day}{$hour}{$min}{$action}{duration}; + foreach my $sec (keys %{ $_per_minute_info{$day}{$hour}{$min}{$action}{second} }) { + $per_minute_info{$day}{$hour}{$min}{$action}{second}{$sec} += + ($_per_minute_info{$day}{$hour}{$min}{$action}{second}{$sec} || 0); + } + } + } + foreach my $k ('prepare', 'bind','execute') { + if (exists $_per_minute_info{$day}{$hour}{$min}{$k}) { + $per_minute_info{$day}{$hour}{$min}{$k} += $_per_minute_info{$day}{$hour}{$min}{$k}; + } + } + + $per_minute_info{$day}{$hour}{$min}{tempfile}{count} += $_per_minute_info{$day}{$hour}{$min}{tempfile}{count} + if defined $_per_minute_info{$day}{$hour}{$min}{tempfile}{count}; + $per_minute_info{$day}{$hour}{$min}{tempfile}{size} += $_per_minute_info{$day}{$hour}{$min}{tempfile}{size} + if defined $_per_minute_info{$day}{$hour}{$min}{tempfile}{size}; + + $per_minute_info{$day}{$hour}{$min}{checkpoint}{file_removed} += $_per_minute_info{$day}{$hour}{$min}{checkpoint}{file_removed}; + $per_minute_info{$day}{$hour}{$min}{checkpoint}{sync} += $_per_minute_info{$day}{$hour}{$min}{checkpoint}{sync}; + $per_minute_info{$day}{$hour}{$min}{checkpoint}{wbuffer} += $_per_minute_info{$day}{$hour}{$min}{checkpoint}{wbuffer}; + $per_minute_info{$day}{$hour}{$min}{checkpoint}{file_recycled} += $_per_minute_info{$day}{$hour}{$min}{checkpoint}{file_recycled}; + $per_minute_info{$day}{$hour}{$min}{checkpoint}{total} += $_per_minute_info{$day}{$hour}{$min}{checkpoint}{total}; + $per_minute_info{$day}{$hour}{$min}{checkpoint}{file_added} += 
$_per_minute_info{$day}{$hour}{$min}{checkpoint}{file_added}; + $per_minute_info{$day}{$hour}{$min}{checkpoint}{write} += $_per_minute_info{$day}{$hour}{$min}{checkpoint}{write}; + $per_minute_info{$day}{$hour}{$min}{autovacuum}{count} += $_per_minute_info{$day}{$hour}{$min}{autovacuum}{count}; + $per_minute_info{$day}{$hour}{$min}{autoanalyze}{count} += $_per_minute_info{$day}{$hour}{$min}{autoanalyze}{count}; + + $per_minute_info{$day}{$hour}{$min}{checkpoint}{sync_files} += $_per_minute_info{$day}{$hour}{$min}{checkpoint}{sync_files}; + $per_minute_info{$day}{$hour}{$min}{checkpoint}{sync_avg} += $_per_minute_info{$day}{$hour}{$min}{checkpoint}{sync_avg}; + $per_minute_info{$day}{$hour}{$min}{checkpoint}{sync_longest} = $_per_minute_info{$day}{$hour}{$min}{checkpoint}{sync_longest} + if ($_per_minute_info{$day}{$hour}{$min}{checkpoint}{sync_longest} > $per_minute_info{$day}{$hour}{$min}{checkpoint}{sync_longest}); } - if ($#small == 0) { - $infos{$small[0]} = $infos{"Sum connections < $pie_percentage_limit%"}; - delete $infos{"Sum connections < $pie_percentage_limit%"}; - } - &flotr2_piegraph(15, 'userconnections_graph', 'Connections per user', %infos); } - print $fh "
    \n"; } - # Show connection per host statistics - if (!$disable_connection && exists $connection_info{host}) { - print $fh qq{ -

    Connections per host ^

    - -
    - - - - - -}; - - my $total_count = 0; - my $c = 0; - foreach my $h (sort keys %{$connection_info{host}}) { - my $colb = $c % 2; - print $fh "\n"; - $total_count += $connection_info{host}{$h}; - $c++; - } - print $fh "
<th>Host</th> <th>Count</th>
    $h", &comma_numbers($connection_info{host}{$h}), - "
    \n"; - if ($graph && $total_count) { - my %infos = (); - my @small = (); - foreach my $d (sort keys %{$connection_info{host}}) { - if ((($connection_info{host}{$d} * 100) / $total_count) > $pie_percentage_limit) { - $infos{$d} = $connection_info{host}{$d} || 0; - } else { - $infos{"Sum connections < $pie_percentage_limit%"} += $connection_info{host}{$d} || 0; - push(@small, $d); - } - } - if ($#small == 0) { - $infos{$small[0]} = $infos{"Sum connections < $pie_percentage_limit%"}; - delete $infos{"Sum connections < $pie_percentage_limit%"}; + ### lock_info ### + + foreach my $lock (keys %_lock_info) { + $lock_info{$lock}{count} += $_lock_info{$lock}{count}; + + foreach my $day (keys %{ $_lock_info{chronos} }) { + foreach my $hour (keys %{ $_lock_info{chronos}{$day} }) { + $lock_info{chronos}{$day}{$hour}{count} += $_lock_info{chronos}{$day}{$hour}{count}; + $lock_info{chronos}{$day}{$hour}{duration} += $_lock_info{chronos}{$day}{$hour}{duration}; } - &flotr2_piegraph(16, 'hostconnections_graph', 'Connections per host', %infos); } - print $fh "
    \n"; - } - # Show top informations - if (!$disable_query) { - print $fh qq{ -

    Slowest queries ^

    - - - - - - - -}; - for (my $i = 0 ; $i <= $#top_slowest ; $i++) { - my $col = $i % 2; - my $ttl = $top_slowest[$i]->[1] || ''; - print $fh "\n"; + $lock_info{$lock}{duration} += $_lock_info{$lock}{duration}; + + foreach my $type (keys %{$_lock_info{$lock}}) { + next if $type =~ /^(count|chronos|duration)$/; + + $lock_info{$lock}{$type}{count} += $_lock_info{$lock}{$type}{count}; + $lock_info{$lock}{$type}{duration} += $_lock_info{$lock}{$type}{duration}; } - print $fh "
<th>Rank</th> <th>Duration (s)</th> <th>Query</th>
    ", $i + 1, "", - &convert_time($top_slowest[$i]->[0]), - "
    ", - &highlight_code($top_slowest[$i]->[2]), "
    \n"; - @top_slowest = (); + } - print $fh qq{ -

    Queries that took up the most time (N) ^

    - - - - - - - - - -}; - my $idx = 1; - foreach my $k (sort {$normalyzed_info{$b}{duration} <=> $normalyzed_info{$a}{duration}} keys %normalyzed_info) { - next if (!$normalyzed_info{$k}{count}); - last if ($idx > $top); - my $q = $k; - if ($normalyzed_info{$k}{count} == 1) { - foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { - $q = $normalyzed_info{$k}{samples}{$d}{query}; - last; - } - } - $normalyzed_info{$k}{average} = $normalyzed_info{$k}{duration} / $normalyzed_info{$k}{count}; - my $col = $idx % 2; - print $fh ""; - print $fh "\n"; - $idx++; + $nlines += $_nlines; + + ### normalyzed_info ### + + foreach my $stmt (keys %_normalyzed_info) { + + foreach my $dt (keys %{$_normalyzed_info{$stmt}{samples}} ) { + $normalyzed_info{$stmt}{samples}{$dt} = $_normalyzed_info{$stmt}{samples}{$dt}; } - print $fh "
<th>Rank</th> <th>Total duration</th> <th>Times executed</th> <th>Av. duration (s)</th> <th>Query</th>
    $idx", - &convert_time($normalyzed_info{$k}{duration}), - "
    ", - &comma_numbers($normalyzed_info{$k}{count}), -"
    "; - foreach my $d (sort keys %{$normalyzed_info{$k}{chronos}}) { - my $c = 1; - $d =~ /^\d{4}(\d{2})(\d{2})$/; - my $zday = "$abbr_month{$1} $2"; - foreach my $h (sort keys %{$normalyzed_info{$k}{chronos}{$d}}) { - $normalyzed_info{$k}{chronos}{$d}{$h}{average} = - $normalyzed_info{$k}{chronos}{$d}{$h}{duration} / $normalyzed_info{$k}{chronos}{$d}{$h}{count}; - my $colb = $c % 2; - $zday = " " if ($c > 1); - print $fh ""; - $c++; - } - } - print $fh "
<th>Day</th> <th>Hour</th> <th>Count</th> <th>Duration</th> <th>Av. Duration</th>
    $zday$h", - &comma_numbers($normalyzed_info{$k}{chronos}{$d}{$h}{count}), "", - &convert_time($normalyzed_info{$k}{chronos}{$d}{$h}{duration}), "", - &convert_time($normalyzed_info{$k}{chronos}{$d}{$h}{average}), "
    ", &convert_time($normalyzed_info{$k}{average}), - "
    ", - &highlight_code($q), "
    "; + ### nlines ### - if ($normalyzed_info{$k}{count} > 1) { - print $fh -"
    "; - my $i = 0; - foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { - my $colb = $i % 2; - print $fh -"
    ", - &convert_time($d), " | ", &highlight_code($normalyzed_info{$k}{samples}{$d}{query}), "
    "; - $i++; - } - print $fh "
    "; - } - print $fh "
    \n"; - print $fh qq{ -

    Most frequent queries (N) ^

    - - - - - - - - -}; - $idx = 1; - foreach my $k (sort {$normalyzed_info{$b}{count} <=> $normalyzed_info{$a}{count}} keys %normalyzed_info) { - next if (!$normalyzed_info{$k}{count}); - last if ($idx > $top); - my $q = $k; - if ($normalyzed_info{$k}{count} == 1) { - foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { - $q = $normalyzed_info{$k}{samples}{$d}{query}; - last; - } + # Keep only the top N samples + my $i = 1; + foreach my $k (sort {$b <=> $a} keys %{$normalyzed_info{$stmt}{samples}}) { + if ($i > $sample) { + delete $normalyzed_info{$stmt}{samples}{$k}; } - my $col = $idx % 2; - print $fh -""; - print $fh "\n"; - $idx++; } - print $fh "
<th>Rank</th> <th>Times executed</th> <th>Total duration</th> <th>Av. duration (s)</th> <th>Query</th>
    $idx
    ", - &comma_numbers($normalyzed_info{$k}{count}), -"
    "; - foreach my $d (sort keys %{$normalyzed_info{$k}{chronos}}) { - my $c = 1; - $d =~ /^\d{4}(\d{2})(\d{2})$/; - my $zday = "$abbr_month{$1} $2"; - foreach my $h (sort keys %{$normalyzed_info{$k}{chronos}{$d}}) { - $normalyzed_info{$k}{chronos}{$d}{$h}{average} = - $normalyzed_info{$k}{chronos}{$d}{$h}{duration} / $normalyzed_info{$k}{chronos}{$d}{$h}{count}; - my $colb = $c % 2; - $zday = " " if ($c > 1); - print $fh ""; - $c++; - } - } - print $fh "
<th>Day</th> <th>Hour</th> <th>Count</th> <th>Duration</th> <th>Av. Duration</th>
    $zday$h", - &comma_numbers($normalyzed_info{$k}{chronos}{$d}{$h}{count}), "", - &convert_time($normalyzed_info{$k}{chronos}{$d}{$h}{duration}), "", - &convert_time($normalyzed_info{$k}{chronos}{$d}{$h}{average}), "
    ", &convert_time($normalyzed_info{$k}{duration}), "", - &convert_time($normalyzed_info{$k}{average}), "
    ", - &highlight_code($q), "
    "; + $i++; + } - if ($normalyzed_info{$k}{count} > 1) { - print $fh -"
    "; - my $i = 0; - foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { - my $colb = $i % 2; - print $fh -"
    ", - &convert_time($d), " | ", &highlight_code($normalyzed_info{$k}{samples}{$d}{query}), "
    "; - $i++; - } - print $fh "
    "; + $normalyzed_info{$stmt}{count} += $_normalyzed_info{$stmt}{count}; + + # Set min / max duration for this query + if (!exists $normalyzed_info{$stmt}{min} || ($normalyzed_info{$stmt}{min} > $_normalyzed_info{$stmt}{min})) { + $normalyzed_info{$stmt}{min} = $_normalyzed_info{$stmt}{min}; + } + if (!exists $normalyzed_info{$stmt}{max} || ($normalyzed_info{$stmt}{max} < $_normalyzed_info{$stmt}{max})) { + $normalyzed_info{$stmt}{max} = $_normalyzed_info{$stmt}{max}; + } + + foreach my $day (keys %{$_normalyzed_info{$stmt}{chronos}} ) { + foreach my $hour (keys %{$_normalyzed_info{$stmt}{chronos}{$day}} ) { + $normalyzed_info{$stmt}{chronos}{$day}{$hour}{count} += + $_normalyzed_info{$stmt}{chronos}{$day}{$hour}{count}; + $normalyzed_info{$stmt}{chronos}{$day}{$hour}{duration} += + $_normalyzed_info{$stmt}{chronos}{$day}{$hour}{duration}; } - print $fh "
    \n"; - print $fh qq{ -

    Slowest queries (N) ^

    - - - - - - - - - -}; - $idx = 1; - foreach my $k (sort {$normalyzed_info{$b}{average} <=> $normalyzed_info{$a}{average}} keys %normalyzed_info) { - next if (!$k || !$normalyzed_info{$k}{count}); - last if ($idx > $top); - my $q = $k; - if ($normalyzed_info{$k}{count} == 1) { - foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { - $q = $normalyzed_info{$k}{samples}{$d}{query}; - last; - } + $normalyzed_info{$stmt}{duration} += $_normalyzed_info{$stmt}{duration}; + + if (exists $_normalyzed_info{$stmt}{locks}) { + $normalyzed_info{$stmt}{locks}{count} += $_normalyzed_info{$stmt}{locks}{count}; + $normalyzed_info{$stmt}{locks}{wait} += $_normalyzed_info{$stmt}{locks}{wait}; + if (!exists $normalyzed_info{$stmt}{locks}{minwait} || ($normalyzed_info{$stmt}{locks}{minwait} > $_normalyzed_info{$stmt}{locks}{minwait})) { + $normalyzed_info{$stmt}{locks}{minwait} = $_normalyzed_info{$stmt}{locks}{minwait}; + } + if (!exists $normalyzed_info{$stmt}{locks}{maxwait} || ($normalyzed_info{$stmt}{locks}{maxwait} < $_normalyzed_info{$stmt}{locks}{maxwait})) { + $normalyzed_info{$stmt}{locks}{maxwait} = $_normalyzed_info{$stmt}{locks}{maxwait}; } - my $col = $idx % 2; - print $fh ""; - print $fh "\n"; - $idx++; } - print $fh "
Rank | Av. duration (s) | Times executed | Total duration | Query
    $idx", - &convert_time($normalyzed_info{$k}{average}), - "
    ", - &comma_numbers($normalyzed_info{$k}{count}), -"
    "; - foreach my $d (sort keys %{$normalyzed_info{$k}{chronos}}) { - my $c = 1; - $d =~ /^\d{4}(\d{2})(\d{2})$/; - my $zday = "$abbr_month{$1} $2"; - foreach my $h (sort keys %{$normalyzed_info{$k}{chronos}{$d}}) { - $normalyzed_info{$k}{chronos}{$d}{$h}{average} = - $normalyzed_info{$k}{chronos}{$d}{$h}{duration} / $normalyzed_info{$k}{chronos}{$d}{$h}{count}; - my $colb = $c % 2; - $zday = " " if ($c > 1); - print $fh ""; - $c++; - } - } - print $fh "
Day | Hour | Count | Duration | Av. Duration
    $zday$h", - &comma_numbers($normalyzed_info{$k}{chronos}{$d}{$h}{count}), "", - &convert_time($normalyzed_info{$k}{chronos}{$d}{$h}{duration}), "", - &convert_time($normalyzed_info{$k}{chronos}{$d}{$h}{average}), "
    ", &convert_time($normalyzed_info{$k}{duration}), - "
    ", - &highlight_code($q), "
    "; - if ($normalyzed_info{$k}{count} > 1) { - print $fh -"
    "; - my $i = 0; - foreach my $d (sort {$b <=> $a} keys %{$normalyzed_info{$k}{samples}}) { - my $colb = $i % 2; - print $fh -"
    ", - &convert_time($d), " | ", &highlight_code($normalyzed_info{$k}{samples}{$d}{query}), "
    "; - $i++; - } - print $fh "
    "; + } + + if (exists $_normalyzed_info{$stmt}{tempfiles}) { + $normalyzed_info{$stmt}{tempfiles}{count} += $_normalyzed_info{$stmt}{tempfiles}{count}; + $normalyzed_info{$stmt}{tempfiles}{size} += $_normalyzed_info{$stmt}{tempfiles}{size}; + if (!exists $normalyzed_info{$stmt}{tempfiles}{minsize} || ($normalyzed_info{$stmt}{tempfiles}{minsize} > $_normalyzed_info{$stmt}{tempfiles}{minsize})) { + $normalyzed_info{$stmt}{tempfiles}{minsize} = $_normalyzed_info{$stmt}{tempfiles}{minsize}; + } + if (!exists $normalyzed_info{$stmt}{tempfiles}{maxsize} || ($normalyzed_info{$stmt}{tempfiles}{maxsize} < $_normalyzed_info{$stmt}{tempfiles}{maxsize})) { + $normalyzed_info{$stmt}{tempfiles}{maxsize} = $_normalyzed_info{$stmt}{tempfiles}{maxsize}; } - print $fh "
    \n"; } - if (!$disable_error) { - &show_error_as_html(); + ### session_info ### + + foreach my $db (keys %{ $_session_info{database}}) { + $session_info{database}{$db}{count} += $_session_info{database}{$db}{count}; + $session_info{database}{$db}{duration} += $_session_info{database}{$db}{duration}; } - # Dump the html footer - &html_footer(); + $session_info{count} += $_session_info{count}; -} + foreach my $day (keys %{ $_session_info{chronos}}) { + foreach my $hour (keys %{ $_session_info{chronos}{$day}}) { + $session_info{chronos}{$day}{$hour}{count} += $_session_info{chronos}{$day}{$hour}{count}; + $session_info{chronos}{$day}{$hour}{duration} += $_session_info{chronos}{$day}{$hour}{duration}; + } + } -sub dump_error_as_html -{ + foreach my $user (keys %{ $_session_info{user}}) { + $session_info{user}{$user}{count} += $_session_info{user}{$user}{count}; + $session_info{user}{$user}{duration} += $_session_info{user}{$user}{duration}; + } - # Dump the html header - &html_header(); + $session_info{duration} += $_session_info{duration}; - # Global informations - my $curdate = localtime(time); - my $fmt_nlines = &comma_numbers($nlines); - my $total_time = timestr($td); - $total_time =~ s/^([\.0-9]+) wallclock.*/$1/; - $total_time = &convert_time($total_time * 1000); - my $logfile_str = $log_files[0]; - if ($#log_files > 0) { - $logfile_str .= ', ..., ' . $log_files[-1]; + foreach my $host (keys %{ $_session_info{host}}) { + $session_info{host}{$host}{count} += $_session_info{host}{$host}{count}; + $session_info{host}{$host}{duration} += $_session_info{host}{$host}{duration}; } - print $fh qq{ -
    -
      -
    • Generated on $curdate
    • -
    • Log file: $logfile_str
    • -
    • Parsed $fmt_nlines log entries in $total_time
    • -
    • Log start from $first_log_date to $last_log_date
    • -
    -
    -}; - my $fmt_errors = &comma_numbers($overall_stat{'errors_number'}) || 0; - my $fmt_unique_error = &comma_numbers(scalar keys %{$overall_stat{'unique_normalized_errors'}}) || 0; - print $fh qq{ -
    -

    Overall statistics ^

    -
    -
    -
      -
    • Number of events: $fmt_errors
    • -
    • Number of unique normalized events: $fmt_unique_error
    • -
    -
    -
    -}; - &show_error_as_html(); + ### tempfile_info ### - # Dump the html footer - &html_footer(); -} + $tempfile_info{count} += $_tempfile_info{count} + if defined $_tempfile_info{count}; + $tempfile_info{size} += $_tempfile_info{size} + if defined $_tempfile_info{size}; + $tempfile_info{maxsize} = $_tempfile_info{maxsize} + if defined $_tempfile_info{maxsize} and ( not defined $tempfile_info{maxsize} + or $tempfile_info{maxsize} < $_tempfile_info{maxsize} ); -sub show_error_as_html -{ - print $fh qq{ -

    Most frequent events (N) ^

    - - - - - + ### top_slowest ### + my @tmp_top_slowest = sort {$b->[0] <=> $a->[0]} (@top_slowest, @_top_slowest); + @top_slowest = (); + for (my $i = 0; $i <= $#tmp_top_slowest; $i++) { + push(@top_slowest, $tmp_top_slowest[$i]); + last if ($i == $end_top); + } - -}; - my $idx = 1; - foreach my $k (sort {$error_info{$b}{count} <=> $error_info{$a}{count}} keys %error_info) { - next if (!$error_info{$k}{count}); - last if ($idx > $top); - my $col = $idx % 2; - print $fh -"\n"; - if ($error_info{$k}{count} > 1) { - my $msg = $k; - $msg =~ s/HINT: (parameter "[^"]+" changed to)/LOG: $1/; - print $fh "\n"; - $idx++; + ### top_locked ### + my @tmp_top_locked_info = sort {$b->[0] <=> $a->[0]} (@top_locked_info, @_top_locked_info); + @top_locked_info = (); + for (my $i = 0; $i <= $#tmp_top_locked_info; $i++) { + push(@top_locked_info, $tmp_top_locked_info[$i]); + last if ($i == $end_top); } - print $fh "
Rank | Times reported | Error
    $idx
    ", - &comma_numbers($error_info{$k}{count}), ""; - print $fh "
    "; - foreach my $d (sort keys %{$error_info{$k}{chronos}}) { - my $c = 1; - $d =~ /^\d{4}(\d{2})(\d{2})$/; - my $zday = "$abbr_month{$1} $2"; - foreach my $h (sort keys %{$error_info{$k}{chronos}{$d}}) { - my $colb = $c % 2; - $zday = " " if ($c > 1); - print $fh ""; - $c++; - } - } - print $fh "
Day | Hour | Count
    $zday$h", - &comma_numbers($error_info{$k}{chronos}{$d}{$h}{count}), "
    $msg
    "; - print $fh -"
    "; - for (my $i = 0 ; $i <= $#{$error_info{$k}{date}} ; $i++) { - if ($error_info{$k}{error}[$i] =~ s/HINT: (parameter "[^"]+" changed to)/LOG: $1/) { - $logs_type{HINT}--; - $logs_type{LOG}++; - } - my $c = $i % 2; - print $fh "
    $error_info{$k}{error}[$i]
    \n"; - print $fh "
    Detail: $error_info{$k}{detail}[$i]
    \n" - if ($error_info{$k}{detail}[$i]); - print $fh "
    Context: $error_info{$k}{context}[$i]
    \n" - if ($error_info{$k}{context}[$i]); - print $fh "
    Hint: $error_info{$k}{hint}[$i]
    \n" if ($error_info{$k}{hint}[$i]); - print $fh "
    Statement: $error_info{$k}{statement}[$i]
    \n" - if ($error_info{$k}{statement}[$i]); - } - print $fh "
    "; - } else { - print $fh "
    $error_info{$k}{error}[0]
    "; - print $fh "
    Detail: $error_info{$k}{detail}[0]
    \n" if ($error_info{$k}{detail}[0]); - print $fh "
    Context: $error_info{$k}{context}[0]
    \n" if ($error_info{$k}{context}[0]); - print $fh "
    Hint: $error_info{$k}{hint}[0]
    \n" if ($error_info{$k}{hint}[0]); - print $fh "
    Statement: $error_info{$k}{statement}[0]
    \n" - if ($error_info{$k}{statement}[0]); - } - print $fh "
    \n"; - if (scalar keys %logs_type > 0) { + ### top_tempfile ### + my @tmp_top_tempfile_info = sort {$b->[0] <=> $a->[0]} (@top_tempfile_info, @_top_tempfile_info); + @top_tempfile_info = (); + for (my $i = 0; $i <= $#tmp_top_tempfile_info; $i++) { + push(@top_tempfile_info, $tmp_top_tempfile_info[$i]); + last if ($i == $end_top); + } - # Show log' types - print $fh qq{ -

    Logs per type ^

    - -
    - - - - - - - }; + ### checkpoint_info ### + $checkpoint_info{file_removed} += $_checkpoint_info{file_removed}; + $checkpoint_info{sync} += $_checkpoint_info{sync}; + $checkpoint_info{wbuffer} += $_checkpoint_info{wbuffer}; + $checkpoint_info{file_recycled} += $_checkpoint_info{file_recycled}; + $checkpoint_info{total} += $_checkpoint_info{total}; + $checkpoint_info{file_added} += $_checkpoint_info{file_added}; + $checkpoint_info{write} += $_checkpoint_info{write}; - my $total_logs = 0; - foreach my $d (sort keys %logs_type) { - $total_logs += $logs_type{$d}; - } + #### Autovacuum info #### - my $c = 0; + $autovacuum_info{count} += $_autovacuum_info{count}; - foreach my $d (sort keys %logs_type) { - next if (!$logs_type{$d}); - my $colb = $c % 2; - print $fh "\n"; - $c++; + foreach my $day (keys %{ $_autovacuum_info{chronos} }) { + foreach my $hour (keys %{ $_autovacuum_info{chronos}{$day} }) { + $autovacuum_info{chronos}{$day}{$hour}{count} += $_autovacuum_info{chronos}{$day}{$hour}{count}; } + } + foreach my $table (keys %{ $_autovacuum_info{tables} }) { + $autovacuum_info{tables}{$table}{vacuums} += $_autovacuum_info{tables}{$table}{vacuums}; + $autovacuum_info{tables}{$table}{idxscans} += $_autovacuum_info{tables}{$table}{idxscans}; + $autovacuum_info{tables}{$table}{tuples}{removed} += $_autovacuum_info{tables}{$table}{tuples}{removed}; + $autovacuum_info{tables}{$table}{pages}{removed} += $_autovacuum_info{tables}{$table}{pages}{removed}; + } + if ($_autovacuum_info{peak}{system_usage}{elapsed} > $autovacuum_info{peak}{system_usage}{elapsed}) { + $autovacuum_info{peak}{system_usage}{elapsed} = $_autovacuum_info{peak}{system_usage}{elapsed}; + $autovacuum_info{peak}{system_usage}{table} = $_autovacuum_info{peak}{system_usage}{table}; + $autovacuum_info{peak}{system_usage}{date} = $_autovacuum_info{peak}{system_usage}{date}; + } + #### Autoanalyze info #### - print $fh "
Type | Count | Percentage
    $d", &comma_numbers($logs_type{$d}), - "", sprintf("%0.2f", ($logs_type{$d} * 100) / $total_logs), "%
    \n"; - if ($graph && $total_logs) { - my %infos = (); - my @small = (); - foreach my $d (sort keys %logs_type) { - if ((($logs_type{$d} * 100) / $total_logs) > $pie_percentage_limit) { - $infos{$d} = $logs_type{$d} || 0; - } else { - $infos{"Sum log types < $pie_percentage_limit%"} += $logs_type{$d} || 0; - push(@small, $d); - } - } + $autoanalyze_info{count} += $_autoanalyze_info{count}; - if ($#small == 0) { - $infos{$small[0]} = $infos{"Sum log types < $pie_percentage_limit%"}; - delete $infos{"Sum log types < $pie_percentage_limit%"}; - } - &flotr2_piegraph(17, 'logstype_graph', 'Logs per type', %infos); + foreach my $day (keys %{ $_autoanalyze_info{chronos} }) { + foreach my $hour (keys %{ $_autoanalyze_info{chronos}{$day} }) { + $autoanalyze_info{chronos}{$day}{$hour}{count} += $_autoanalyze_info{chronos}{$day}{$hour}{count}; } - print $fh "
    \n"; + } + foreach my $table (keys %{ $_autoanalyze_info{tables} }) { + $autoanalyze_info{tables}{$table}{analyzes} += $_autoanalyze_info{tables}{$table}{analyzes}; + } + if ($_autoanalyze_info{peak}{system_usage}{elapsed} > $autoanalyze_info{peak}{system_usage}{elapsed}) { + $autoanalyze_info{peak}{system_usage}{elapsed} = $_autoanalyze_info{peak}{system_usage}{elapsed}; + $autoanalyze_info{peak}{system_usage}{table} = $_autoanalyze_info{peak}{system_usage}{table}; + $autoanalyze_info{peak}{system_usage}{date} = $_autoanalyze_info{peak}{system_usage}{date}; } + return; +} + +sub dump_as_binary +{ + my $lfh = shift(); + + store_fd({ + 'overall_stat' => \%overall_stat, + 'overall_checkpoint' => \%overall_checkpoint, + 'normalyzed_info' => \%normalyzed_info, + 'error_info' => \%error_info, + 'connection_info' => \%connection_info, + 'database_info' => \%database_info, + 'application_info' => \%application_info, + 'user_info' => \%user_info, + 'host_info' => \%host_info, + 'checkpoint_info' => \%checkpoint_info, + 'session_info' => \%session_info, + 'tempfile_info' => \%tempfile_info, + 'error_info' => \%error_info, + 'logs_type' => \%logs_type, + 'lock_info' => \%lock_info, + 'per_minute_info' => \%per_minute_info, + 'top_slowest' => \@top_slowest, + 'nlines' => $nlines, + 'log_files' => \@log_files, + 'autovacuum_info' => \%autovacuum_info, + 'autoanalyze_info' => \%autoanalyze_info, + 'top_tempfile_info' => \@top_tempfile_info, + 'top_locked_info' => \@top_locked_info, + }, $lfh) || die ("Couldn't save binary data to «$outfile»!\n"); } # Highlight SQL code @@ -3059,95 +7731,184 @@ { my $code = shift; + # Escape HTML code into SQL values + $code = &escape_html($code); + + # Do not try to prettify queries longer + # than 10KB as this will take too much time + return $code if (length($code) > 10240); + # prettify SQL query if (!$noprettify) { - my $sql = SQL::Beautify->new; - $sql->query($code); - $code = $sql->beautify; + $sql_prettified->query($code); + $code = $sql_prettified->beautify; } return $code if ($nohighlight); - foreach my $x (keys %SYMBOLS) { - $code =~ s/$x/\$\$CLASSSY0A\$\$$SYMBOLS{$x}\$\$CLASSSY0B\$\$/gs; + my $i = 0; + my @qqcode = (); + while ($code =~ s/("[^\"]*")/QQCODEY${i}A/s) { + push(@qqcode, $1); + $i++; + } + $i = 0; + my @qcode = (); + while ($code =~ s/('[^\']*')/QCODEY${i}B/s) { + push(@qcode, $1); + $i++; } - #$code =~ s/("[^"]*")/$1<\/span>/igs; - $code =~ s/('[^']*')/$1<\/span>/gs; - $code =~ s/(`[^`]*`)/$1<\/span>/gs; - + foreach my $x (keys %SYMBOLS) { + $code =~ s/$x/\$\$STYLESY0A\$\$$SYMBOLS{$x}\$\$STYLESY0B\$\$/gs; + } for (my $x = 0 ; $x <= $#KEYWORDS1 ; $x++) { - - #$code =~ s/\b($KEYWORDS1[$x])\b/$1<\/span>/igs; $code =~ s/\b$KEYWORDS1[$x]\b/$KEYWORDS1[$x]<\/span>/igs; + $code =~ s/(?$KEYWORDS1[$x]<\/span>/igs; } for (my $x = 0 ; $x <= $#KEYWORDS2 ; $x++) { - - #$code =~ s/\b($KEYWORDS2[$x])\b/$1<\/span>/igs; - $code =~ s/\b$KEYWORDS2[$x]\b/$KEYWORDS2[$x]<\/span>/igs; + $code =~ s/(?$KEYWORDS2[$x]<\/span>/igs; } for (my $x = 0 ; $x <= $#KEYWORDS3 ; $x++) { - - #$code =~ s/\b($KEYWORDS3[$x])\b/$1<\/span>/igs; $code =~ s/\b$KEYWORDS3[$x]\b/$KEYWORDS3[$x]<\/span>/igs; } for (my $x = 0 ; $x <= $#BRACKETS ; $x++) { $code =~ s/($BRACKETS[$x])/$1<\/span>/igs; } - $code =~ s/\$\$CLASSSY0A\$\$([^\$]+)\$\$CLASSSY0B\$\$/$1<\/span>/gs; + + $code =~ s/\$\$STYLESY0A\$\$([^\$]+)\$\$STYLESY0B\$\$/$1<\/span>/gs; $code =~ s/\b(\d+)\b/$1<\/span>/igs; + for (my $x = 0; $x <= $#qcode; $x++) { + $code =~ s/QCODEY${x}B/$qcode[$x]/s; + } + for (my $x = 0; $x <= $#qqcode; 
$x++) { + $code =~ s/QQCODEY${x}A/$qqcode[$x]/s; + } + + $code =~ s/('[^']*')/$1<\/span>/gs; + $code =~ s/(`[^`]*`)/$1<\/span>/gs; + return $code; } +sub compute_arg_list +{ + + # Some command line arguments can be used multiple times or written + # as a comma-separated list. + # For example: --dbuser=postgres --dbuser=joe or --dbuser=postgres,joe + # So we have to aggregate all the possible values + my @tmp = (); + foreach my $v (@exclude_user) { + push(@tmp, split(/,/, $v)); + } + @exclude_user = (); + push(@exclude_user, @tmp); + + @tmp = (); + foreach my $v (@dbname) { + push(@tmp, split(/,/, $v)); + } + @dbname = (); + push(@dbname, @tmp); + + @tmp = (); + foreach my $v (@dbuser) { + push(@tmp, split(/,/, $v)); + } + @dbuser = (); + push(@dbuser, @tmp); + + @tmp = (); + foreach my $v (@dbclient) { + push(@tmp, split(/,/, $v)); + } + @dbclient = (); + push(@dbclient, @tmp); + + @tmp = (); + foreach my $v (@dbappname) { + push(@tmp, split(/,/, $v)); + } + @dbappname = (); + push(@dbappname, @tmp); + + @tmp = (); + foreach my $v (@exclude_appname) { + push(@tmp, split(/,/, $v)); + } + @exclude_appname = (); + push(@exclude_appname, @tmp); + +} + sub validate_log_line { my ($t_pid) = @_; + # Look at particular cas of vacuum/analyze that have the database + # name inside the log message so that they could be associated + if ($prefix_vars{'t_query'} =~ / of table "([^\.]+)\.[^\.]+\.[^\.]+":/) { + $prefix_vars{'t_dbname'} = $1; + } + # Check user and/or database if require - if ($dbname) { + if ($#dbname >= 0) { - # Log line do not match the required dbname - if (!$prefix_vars{'t_dbname'} || ($dbname ne $prefix_vars{'t_dbname'})) { - delete $cur_info{$t_pid}; + # Log line does not match the required dbname + if (!$prefix_vars{'t_dbname'} || !grep(/^$prefix_vars{'t_dbname'}$/i, @dbname)) { return 0; } } - if ($dbuser) { + if ($#dbuser >= 0) { - # Log line do not match the required dbuser - if (!$prefix_vars{'t_dbuser'} || ($dbuser ne $prefix_vars{'t_dbuser'})) { - delete $cur_info{$t_pid}; + # Log line does not match the required dbuser + if (!$prefix_vars{'t_dbuser'} || !grep(/^$prefix_vars{'t_dbuser'}$/i, @dbuser)) { return 0; } } - if ($dbclient) { + if ($#dbclient >= 0) { + + # Log line does not match the required dbclient $prefix_vars{'t_client'} ||= $prefix_vars{'t_hostport'}; - # Log line do not match the required dbclient - if (!$prefix_vars{'t_client'} || ($dbclient ne $prefix_vars{'t_client'})) { - delete $cur_info{$t_pid}; + if (!$prefix_vars{'t_client'} || !grep(/^$prefix_vars{'t_client'}$/i, @dbclient)) { return 0; } } - if ($dbappname) { + if ($#dbappname >= 0) { - # Log line do not match the required dbname - if (!$prefix_vars{'t_appname'} || ($dbappname ne $prefix_vars{'t_appname'})) { - delete $cur_info{$t_pid}; + # Log line does not match the required dbname + if (!$prefix_vars{'t_appname'} || !grep(/^$prefix_vars{'t_appname'}$/i, @dbappname)) { + return 0; + } + } + if ($#exclude_user >= 0) { + + # Log line matches the excluded dbuser + if ($prefix_vars{'t_dbuser'} && grep(/^$prefix_vars{'t_dbuser'}$/i, @exclude_user)) { + return 0; + } + } + if ($#exclude_appname >= 0) { + + # Log line matches the excluded appname + if ($prefix_vars{'t_appname'} && grep(/^$prefix_vars{'t_appname'}$/i, @exclude_appname)) { return 0; } } + return 1; } - sub parse_log_prefix { my ($t_logprefix) = @_; # Extract user and database information from the logprefix part if ($t_logprefix) { + # Search for database user if ($t_logprefix =~ $regex_prefix_dbuser) { $prefix_vars{'t_dbuser'} = $1; @@ 
-3165,44 +7926,67 @@ my $t_pid = $prefix_vars{'t_pid'}; - # Force parameter change to be a hint message so that it can appear - # in the event/error/warning messages report part. - if (($prefix_vars{'t_loglevel'} eq 'LOG') && ($prefix_vars{'t_query'} =~ /parameter "[^"]+" changed to "[^"]+"/)) { - $prefix_vars{'t_loglevel'} = 'HINT'; + # Force some LOG messages to be ERROR messages so that they will appear + # in the event/error/warning messages report. + if ($prefix_vars{'t_loglevel'} eq 'LOG') { + if ($prefix_vars{'t_query'} =~ /parameter "[^"]+" changed to "[^"]+"/) { + $prefix_vars{'t_loglevel'} = 'ERROR'; + } elsif ($prefix_vars{'t_query'} =~ /database system was shut down at /) { + $prefix_vars{'t_loglevel'} = 'ERROR'; + } elsif ($prefix_vars{'t_query'} =~ /database system was interrupted while in recovery/) { + $prefix_vars{'t_loglevel'} = 'ERROR'; + } elsif ($prefix_vars{'t_query'} =~ /recovery has paused/) { + $prefix_vars{'t_loglevel'} = 'ERROR'; + } } - # Do not parse lines that are not an error like message - if ($error_only && ($prefix_vars{'t_loglevel'} !~ /(WARNING|ERROR|FATAL|PANIC|DETAIL|HINT|STATEMENT|CONTEXT)/)) { - if (exists $cur_info{$t_pid} && ($prefix_vars{'t_session_line'} != $cur_info{$t_pid}{session})) { - &store_queries($t_pid); - delete $cur_info{$t_pid}; - } + # Do not parse lines that are not an error message when error only report is requested + if ($error_only && ($prefix_vars{'t_loglevel'} !~ $full_error_regex)) { return; } - # Do not parse lines that are an error like message - if ($disable_error && ($prefix_vars{'t_loglevel'} =~ /WARNING|ERROR|FATAL|PANIC|HINT|CONTEXT|DETAIL|STATEMENT/)) { - if (exists $cur_info{$t_pid} && ($prefix_vars{'t_session_line'} != $cur_info{$t_pid}{session})) { - &store_queries($t_pid); - delete $cur_info{$t_pid}; - } + # Do not parse lines that are an error-like message when error reports are not wanted + if ($disable_error && ($prefix_vars{'t_loglevel'} =~ $full_error_regex)) { return; } - # Store the current timestamp of the log line - $first_log_date = $prefix_vars{'t_timestamp'} if (!$first_log_date); - $last_log_date = $prefix_vars{'t_timestamp'}; - # Store a counter of logs type $logs_type{$prefix_vars{'t_loglevel'}}++; - # Replace syslog tablulation rewrite - $prefix_vars{'t_query'} =~ s/#011/\t/g if ($format eq 'syslog'); + # Replace syslog tabulation rewrite + if ($format =~ /syslog/) { + $prefix_vars{'t_query'} =~ s/#011/\t/g; + } + + # Reject lines generated by debug tool + if ( ($prefix_vars{'t_loglevel'} eq 'CONTEXT') && ($prefix_vars{'t_query'} =~ /SQL statement "/) ) { + return; + } + + # Stores the error's detail if previous line was an error + if ($cur_info{$t_pid}{loglevel} =~ $main_error_regex) { + # and current one is a detailed information + if ($prefix_vars{'t_loglevel'} =~ /(DETAIL|STATEMENT|CONTEXT|HINT)/) { + $cur_info{$t_pid}{"\L$1\E"} .= $prefix_vars{'t_query'}; + return; + } + } my $date_part = "$prefix_vars{'t_year'}$prefix_vars{'t_month'}$prefix_vars{'t_day'}"; + my $cur_last_log_timestamp = "$prefix_vars{'t_year'}-$prefix_vars{'t_month'}-$prefix_vars{'t_day'} " . 
+ "$prefix_vars{t_hour}:$prefix_vars{t_min}:$prefix_vars{t_sec}"; + + # set current session workload + if (!$disable_session) { + my $sess_count = scalar keys %current_sessions; + $overall_stat{'peak'}{$cur_last_log_timestamp}{session} = $sess_count; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{"$prefix_vars{'t_min'}"}{session}{count} = $sess_count; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{"$prefix_vars{'t_min'}"}{session}{second}{$prefix_vars{'t_sec'}} = $sess_count; + } # Stores lock activity - if (($prefix_vars{'t_loglevel'} eq 'LOG') && ($prefix_vars{'t_query'} =~ /acquired ([^\s]+) on ([^\s]+) .* after ([0-9\.]+) ms/)) { + if (($prefix_vars{'t_loglevel'} eq 'LOG') && ($prefix_vars{'t_query'} =~ /acquired ([^\s]+) on ([^\s]+) .* after ([0-9\.]+) ms/)) + { return if ($disable_lock); $lock_info{$1}{count}++; $lock_info{$1}{duration} += $3; @@ -3210,6 +7994,28 @@ $lock_info{$1}{$2}{duration} += $3; $lock_info{$1}{chronos}{$date_part}{$prefix_vars{'t_hour'}}{count}++; $lock_info{$1}{chronos}{$date_part}{$prefix_vars{'t_hour'}}{duration}++; + # Store current lock information that will be used later + # when we will parse the query responsible of the locks + $cur_lock_info{$t_pid}{wait} = $3; + if ($format eq 'csv') { + $cur_lock_info{$t_pid}{query} = $prefix_vars{'t_statement'}; + $cur_lock_info{$t_pid}{timestamp} = $prefix_vars{'t_timestamp'}; + $cur_lock_info{$t_pid}{dbname} = $prefix_vars{'t_dbname'}; + $cur_lock_info{$t_pid}{dbuser} = $prefix_vars{'t_dbuser'}; + $cur_lock_info{$t_pid}{dbclient} = $prefix_vars{'t_client'}; + $cur_lock_info{$t_pid}{dbappname} = $prefix_vars{'t_appname'}; + } + return; + } + + # Stores query related to last lock information + if (($prefix_vars{'t_loglevel'} eq 'STATEMENT') && exists $cur_lock_info{$t_pid}) { + $cur_lock_info{$t_pid}{query} = $prefix_vars{'t_query'}; + $cur_lock_info{$t_pid}{timestamp} = $prefix_vars{'t_timestamp'}; + $cur_lock_info{$t_pid}{dbname} = $prefix_vars{'t_dbname'}; + $cur_lock_info{$t_pid}{dbuser} = $prefix_vars{'t_dbuser'}; + $cur_lock_info{$t_pid}{dbclient} = $prefix_vars{'t_client'}; + $cur_lock_info{$t_pid}{dbappname} = $prefix_vars{'t_appname'}; return; } @@ -3218,55 +8024,117 @@ return if ($disable_temporary); $tempfile_info{count}++; $tempfile_info{size} += $1; - $tempfile_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{count}++; - $tempfile_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{size} += $1; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{tempfile}{count}++; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{tempfile}{size} += $1; $tempfile_info{maxsize} = $1 if ($tempfile_info{maxsize} < $1); + # Store current temporary file information that will be used later + # when we will parse the query responsible of the tempfile + $cur_temp_info{$t_pid}{size} = $1; + $overall_stat{'peak'}{$cur_last_log_timestamp}{tempfile_size} += $1; + $overall_stat{'peak'}{$cur_last_log_timestamp}{tempfile_count}++; + if ($format eq 'csv') { + $cur_temp_info{$t_pid}{query} = $prefix_vars{'t_statement'}; + $cur_temp_info{$t_pid}{timestamp} = $prefix_vars{'t_timestamp'}; + $cur_temp_info{$t_pid}{dbname} = $prefix_vars{'t_dbname'}; + $cur_temp_info{$t_pid}{dbuser} = $prefix_vars{'t_dbuser'}; + $cur_temp_info{$t_pid}{dbclient} = $prefix_vars{'t_client'}; + $cur_temp_info{$t_pid}{dbappname} = $prefix_vars{'t_appname'}; + } + return; + } + + # Stores query related to last created temporary file + if (($prefix_vars{'t_loglevel'} eq 'STATEMENT') && 
$cur_temp_info{$t_pid}{size}) { + $cur_temp_info{$t_pid}{query} = $prefix_vars{'t_query'}; + $cur_temp_info{$t_pid}{timestamp} = $prefix_vars{'t_timestamp'}; + $cur_temp_info{$t_pid}{dbname} = $prefix_vars{'t_dbname'}; + $cur_temp_info{$t_pid}{dbuser} = $prefix_vars{'t_dbuser'}; + $cur_temp_info{$t_pid}{dbclient} = $prefix_vars{'t_client'}; + $cur_temp_info{$t_pid}{dbappname} = $prefix_vars{'t_appname'}; return; } - # Stores pre connection activity + # Stores pre-connection activity if (($prefix_vars{'t_loglevel'} eq 'LOG') && ($prefix_vars{'t_query'} =~ /connection received: host=([^\s]+) port=(\d+)/)) { return if ($disable_connection); + $current_sessions{$prefix_vars{'t_pid'}} = 1; $conn_received{$t_pid} = $1; return; } # Stores connection activity - if (($prefix_vars{'t_loglevel'} eq 'LOG') && ($prefix_vars{'t_query'} =~ /connection authorized: user=([^\s]+) database=([^\s]+)/)) { + if ( ($prefix_vars{'t_loglevel'} eq 'LOG') + && ($prefix_vars{'t_query'} =~ /connection authorized: user=([^\s]+) /)) + { return if ($disable_connection); + $current_sessions{$prefix_vars{'t_pid'}} = 1; my $usr = $1; - my $db = $2; + my $db = 'unknown'; + my $host = ''; + if ($prefix_vars{'t_query'} =~ / database=([^\s]+)/) { + $db = $1; + } elsif ($prefix_vars{'t_dbname'}) { + $db = $prefix_vars{'t_dbname'}; + } + if ($prefix_vars{'t_query'} =~ / host=([^\s]+)/) { + $host = $1; + } + if ($extension eq 'tsung') { + $tsung_session{$prefix_vars{'t_pid'}}{connection}{database} = $db; + $tsung_session{$prefix_vars{'t_pid'}}{connection}{user} = $usr; + $tsung_session{$prefix_vars{'t_pid'}}{connection}{date} = $prefix_vars{'t_date'}; + return; + + } + $overall_stat{'peak'}{$cur_last_log_timestamp}{connection}++; + $connection_info{count}++; $connection_info{user}{$usr}++; $connection_info{database}{$db}++; $connection_info{database_user}{$db}{$usr}++; $connection_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{count}++; - $connection_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{user}{$usr}++; - $connection_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{database}{$db}++; - $connection_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{database_user}{$db}{$usr}++; +############################################################################### +# May be used in the future to display more detailed information on connection +# $connection_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{user}{$usr}++; +# $connection_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{database}{$db}++; +# $connection_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{database_user}{$db}{$usr}++; +############################################################################### if ($graph) { - $per_minute_info{connection}{$date_part}{$prefix_vars{'t_hour'}}{"$prefix_vars{'t_min'}"}{count}++; - $per_minute_info{connection}{$date_part}{$prefix_vars{'t_hour'}}{"$prefix_vars{'t_min'}"}{second}{$prefix_vars{'t_sec'}}++; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{"$prefix_vars{'t_min'}"}{connection}{count}++; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{"$prefix_vars{'t_min'}"}{connection}{second}{$prefix_vars{'t_sec'}}++; } if (exists $conn_received{$t_pid}) { $connection_info{host}{$conn_received{$t_pid}}++; - $connection_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{host}{$conn_received{$t_pid}}++; + #$connection_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{host}{$conn_received{$t_pid}}++; delete $conn_received{$t_pid}; + } elsif ($host) { + $connection_info{host}{$host}++; + 
#$connection_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{host}{$host}++; } return; } - # Stores session duration - if ( ($prefix_vars{'t_loglevel'} eq 'LOG') - && ($prefix_vars{'t_query'} =~ /disconnection: session time: ([^\s]+) user=([^\s]+) database=([^\s]+) host=([^\s]+) port=(\d+)/)) + # Store session duration + if (($prefix_vars{'t_loglevel'} eq 'LOG') + && ($prefix_vars{'t_query'} =~ + /disconnection: session time: ([^\s]+) user=([^\s]+) database=([^\s]+) host=([^\s]+)/)) { return if ($disable_session); + if ($extension eq 'tsung') { + $tsung_session{$prefix_vars{'t_pid'}}{disconnection}{date} = $prefix_vars{'t_timestamp'}; + } + delete $current_sessions{$prefix_vars{'t_pid'}}; my $time = $1; my $usr = $2; my $db = $3; my $host = $4; + if ($extension eq 'tsung') { + &store_tsung_session($prefix_vars{'t_pid'}); + return; + } + # Store time in millisecond $time =~ /(\d+):(\d+):(\d+\.\d+)/; $time = ($3 * 1000) + ($2 * 60 * 1000) + ($1 * 60 * 60 * 1000); @@ -3283,57 +8151,147 @@ return; } - # Store checkpoint information + # Store autovacuum information + if ( + ($prefix_vars{'t_loglevel'} eq 'LOG') + && ($prefix_vars{'t_query'} =~ +/automatic vacuum of table "([^\s]+)": index scans: (\d+)/ + ) + ) + { + return if ($disable_autovacuum); + $autovacuum_info{count}++; + $autovacuum_info{tables}{$1}{vacuums} += 1; + $autovacuum_info{tables}{$1}{idxscans} += $2; + $autovacuum_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{count}++; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{autovacuum}{count}++; + $cur_info{$t_pid}{vacuum} = $1; + $cur_info{$t_pid}{year} = $prefix_vars{'t_year'}; + $cur_info{$t_pid}{month} = $prefix_vars{'t_month'}; + $cur_info{$t_pid}{day} = $prefix_vars{'t_day'}; + $cur_info{$t_pid}{hour} = $prefix_vars{'t_hour'}; + $cur_info{$t_pid}{min} = $prefix_vars{'t_min'}; + $cur_info{$t_pid}{sec} = $prefix_vars{'t_sec'}; + return; + } + + # Store autoanalyze information + if ( + ($prefix_vars{'t_loglevel'} eq 'LOG') + && ($prefix_vars{'t_query'} =~ +/automatic analyze of table "([^\s]+)"/ + ) + ) + { + return if ($disable_autovacuum); + my $table = $1; + $autoanalyze_info{count}++; + $autoanalyze_info{tables}{$table}{analyzes} += 1; + $autoanalyze_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{count}++; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{autoanalyze}{count}++; + if ($prefix_vars{'t_query'} =~ m#system usage: CPU .* sec elapsed (.*) sec#) { + if ($1 > $autoanalyze_info{peak}{system_usage}{elapsed}) { + $autoanalyze_info{peak}{system_usage}{elapsed} = $1; + $autoanalyze_info{peak}{system_usage}{table} = $table; + $autoanalyze_info{peak}{system_usage}{date} = $cur_last_log_timestamp; + } + } + } + + # Store checkpoint or restartpoint information if ( ($prefix_vars{'t_loglevel'} eq 'LOG') && ($prefix_vars{'t_query'} =~ -/checkpoint complete: wrote (\d+) buffers \(([^\)]+)\); (\d+) transaction log file\(s\) added, (\d+) removed, (\d+) recycled; write=([0-9\.]+) s, sync=([0-9\.]+) s, total=([0-9\.]+) s/ +/point complete: wrote (\d+) buffers \(([^\)]+)\); (\d+) transaction log file\(s\) added, (\d+) removed, (\d+) recycled; write=([0-9\.]+) s, sync=([0-9\.]+) s, total=([0-9\.]+) s/ ) ) { + # Example: LOG: checkpoint complete: wrote 175 buffers (5.7%); 0 transaction log file(s) added, 1 removed, 2 recycled; write=17.437 s, sync=0.722 s, total=18.259 s; sync files=2, longest=0.708 s, average=0.361 s return if ($disable_checkpoint); + $checkpoint_info{wbuffer} += $1; #$checkpoint_info{percent_wbuffer} 
+= $2; $checkpoint_info{file_added} += $3; $checkpoint_info{file_removed} += $4; $checkpoint_info{file_recycled} += $5; + $overall_checkpoint{'peak'}{$cur_last_log_timestamp}{walfile_usage} += ($3 + $5); $checkpoint_info{write} += $6; $checkpoint_info{sync} += $7; $checkpoint_info{total} += $8; - - $checkpoint_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{wbuffer} += $1; - - #$checkpoint_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{percent_wbuffer} += $2; - $checkpoint_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{file_added} += $3; - $checkpoint_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{file_removed} += $4; - $checkpoint_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{file_recycled} += $5; - $checkpoint_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{write} += $6; - $checkpoint_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{sync} += $7; - $checkpoint_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{total} += $8; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{wbuffer} += $1; + #$per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{percent_wbuffer} += $2; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{file_added} += $3; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{file_removed} += $4; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{file_recycled} += $5; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{write} += $6; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{sync} += $7; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{total} += $8; + + $overall_checkpoint{'peak'}{$cur_last_log_timestamp}{checkpoint_wbuffer} += $1; + if ($6 > $overall_checkpoint{checkpoint_write}) { + $overall_checkpoint{checkpoint_write} = $6; + $overall_checkpoint{checkpoint_sync} = $7; + } + + if ($prefix_vars{'t_query'} =~ /sync files=(\d+), longest=([0-9\.]+) s, average=([0-9\.]+) s/) { + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{sync_files} += $1; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{sync_longest} = $2 + if ($2 > $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{sync_longest}); + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{sync_avg} += $3; + } return; } - if (($prefix_vars{'t_loglevel'} eq 'LOG') && ($prefix_vars{'t_query'} =~ /checkpoints are occurring too frequently \((\d+) seconds apart\)/)) { + + # Store checkpoint warning information + if ( ($prefix_vars{'t_loglevel'} eq 'LOG') + && ($prefix_vars{'t_query'} =~ /checkpoints are occurring too frequently \((\d+) seconds apart\)/)) + { return if ($disable_checkpoint); $checkpoint_info{warning}++; $checkpoint_info{warning_seconds} += $1; - $checkpoint_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{warning}++; - $checkpoint_info{chronos}{$date_part}{$prefix_vars{'t_hour'}}{warning_seconds} += $1; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{warning}++; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{warning_seconds} += $1; + $overall_checkpoint{checkpoint_warning}++; return; } - # Store the detail of the error - if ($cur_info{$t_pid}{loglevel} =~ 
/WARNING|ERROR|FATAL|PANIC/) { - if ($prefix_vars{'t_loglevel'} =~ /(DETAIL|STATEMENT|CONTEXT|HINT)/) { - $cur_info{$t_pid}{"\L$1\E"} .= $prefix_vars{'t_query'}; - return; + # Store old restartpoint information + if ( + ($prefix_vars{'t_loglevel'} eq 'LOG') + && ($prefix_vars{'t_query'} =~ +/restartpoint complete: wrote (\d+) buffers \(([^\)]+)\); write=([0-9\.]+) s, sync=([0-9\.]+) s, total=([0-9\.]+) s/ + ) + ) + { + # Example: LOG: restartpoint complete: wrote 1568 buffers (0.3%); write=146.237 s, sync=0.251 s, total=146.489 s + return if ($disable_checkpoint); + + $checkpoint_info{wbuffer} += $1; + + #$checkpoint_info{percent_wbuffer} += $2; + $checkpoint_info{write} += $6; + $checkpoint_info{sync} += $7; + $checkpoint_info{total} += $8; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{wbuffer} += $1; + #$per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{percent_wbuffer} += $2; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{write} += $6; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{sync} += $7; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{checkpoint}{total} += $8; + + $overall_checkpoint{'peak'}{$cur_last_log_timestamp}{checkpoint_wbuffer} += $1; + if ($6 > $overall_checkpoint{checkpoint_write}) { + $overall_checkpoint{checkpoint_write} = $6; + $overall_checkpoint{checkpoint_sync} = $7; } + + return; } - # Process current query following context + # Look at bind parameters if any if ($cur_info{$t_pid}{query}) { - # Remove obsolete connexion storage + # Remove obsolete connection storage delete $conn_received{$cur_info{$t_pid}{pid}}; # The query is complete but we are missing some debug/info/bind parameter logs @@ -3341,227 +8299,473 @@ # Apply bind parameters if any if (($prefix_vars{'t_loglevel'} eq 'DETAIL') && ($prefix_vars{'t_query'} =~ /parameters: (.*)/)) { - my @t_res = split(/[,\s]*\$(\d+)\s=\s/, $1); - shift(@t_res); - for (my $i = 0 ; $i < $#t_res ; $i += 2) { - $cur_info{$t_pid}{query} =~ s/\$$t_res[$i]\b/$t_res[$i+1]/s; - } - &store_queries($t_pid); - delete $cur_info{$t_pid}; + $cur_info{$t_pid}{parameters} = "$1"; + # go look at other params + return; } } + } - # When we are ready to overwrite the last storage, add it to the global stats - if ( ($prefix_vars{'t_loglevel'} =~ /LOG|FATAL|PANIC|ERROR|WARNING|HINT/) - && exists $cur_info{$t_pid} - && (($format eq 'csv') || ($prefix_vars{'t_session_line'} != $cur_info{$t_pid}{session}))) - { + # Apply bind parameters if any + if ($prefix_vars{'t_detail'} =~ /parameters: (.*)/) { + $cur_info{$t_pid}{parameters} = "$1"; + # go look at other params + } + + #### + # Registrer previous query storage into global statistics before starting to store current query + #### + if (exists $cur_info{$t_pid}{query}) { + # when switching to a new log message + if ( ($prefix_vars{'t_loglevel'} eq 'LOG') || ($format eq 'csv') || ($prefix_vars{'t_loglevel'} =~ $main_error_regex) ) { &store_queries($t_pid); delete $cur_info{$t_pid}; } } - # Extract the duration and the query parts from the entry - my $duration_required = 1; - if ($error_only || ($disable_hourly && $disable_query)) { - $duration_required = 0; + #### + # Store current query information + #### + + # Log lines with duration only, generated by log_duration = on in postgresql.conf + if ($prefix_vars{'t_query'} =~ s/duration: ([0-9\.]+) ms$//s) { + $prefix_vars{'t_duration'} = $1; + 
$prefix_vars{'t_query'} = ''; + my $k = &get_hist_inbound($1); + $overall_stat{histogram}{query_time}{$k}++; + $overall_stat{histogram}{total}++; + &set_current_infos($t_pid); + return; + } + + # Store info as tsung session following the output file extension + if (($extension eq 'tsung') && !exists $tsung_session{$prefix_vars{'t_pid'}}{connection} && $prefix_vars{'t_dbname'}) { + $tsung_session{$prefix_vars{'t_pid'}}{connection}{database} = $prefix_vars{'t_dbname'}; + $tsung_session{$prefix_vars{'t_pid'}}{connection}{user} = $prefix_vars{'t_dbuser'}; + $tsung_session{$prefix_vars{'t_pid'}}{connection}{date} = $prefix_vars{'t_date'}; } - my $t_action = ''; - my $t_duration = ''; + + my $t_action = ''; + # Store query duration generated by log_min_duration >= 0 in postgresql.conf if ($prefix_vars{'t_query'} =~ s/duration: ([0-9\.]+) ms (query|statement): //is) { - $t_duration = $1; - $t_action = $2; - } elsif ($prefix_vars{'t_query'} =~ s/duration: ([0-9\.]+) ms (prepare|parse|bind|execute|execute from fetch)\s+[^:]+:\s//is) { - $t_duration = $1; + $prefix_vars{'t_duration'} = $1; $t_action = $2; - - # Skiping parse and bind logs + my $k = &get_hist_inbound($1); + $overall_stat{histogram}{query_time}{$k}++; + $overall_stat{histogram}{total}++; + if (($t_action eq 'statement') && $prefix_vars{'t_query'} =~ /^(PREPARE|EXECUTE)\b/i) { + $overall_stat{lc($1)}++; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{lc($1)}++; + } + # Log line with duration and statement from prepared queries + } elsif ($prefix_vars{'t_query'} =~ s/duration: ([0-9\.]+) ms (prepare|parse|bind|execute from fetch|execute)\s+[^:]+:\s//is) + { + $prefix_vars{'t_duration'} = $1; + $t_action = $2; + my $k = &get_hist_inbound($1); + $overall_stat{histogram}{query_time}{$k}++; + $overall_stat{histogram}{total}++; + $t_action =~ s/ from fetch//; + $t_action = 'prepare' if ($t_action eq 'parse'); + $overall_stat{$t_action}++; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{$t_action}++; + # Skipping parse and bind logs return if ($t_action !~ /query|statement|execute/); - } elsif (!$duration_required && ($prefix_vars{'t_query'} =~ s/(query|statement): //is)) { + # Log line without duration at all + } elsif ($prefix_vars{'t_query'} =~ s/(query|statement): //is) { $t_action = $1; - } elsif (!$duration_required && ($prefix_vars{'t_query'} =~ s/(prepare|parse|bind|execute|execute from fetch)\s+[^:]+:\s//is)) { + if (($t_action eq 'statement') && $prefix_vars{'t_query'} =~ /^(PREPARE|EXECUTE)\b/i) { + $overall_stat{lc($1)}++; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{lc($1)}++; + } + # Log line without duration at all from prepared queries + } elsif ($prefix_vars{'t_query'} =~ s/(prepare|parse|bind|execute from fetch|execute)\s+[^:]+:\s//is) + { $t_action = $1; - - # Skiping parse and bind logs + $t_action =~ s/ from fetch//; + $t_action = 'prepare' if ($t_action eq 'parse'); + $overall_stat{$t_action}++; + $per_minute_info{$date_part}{$prefix_vars{'t_hour'}}{$prefix_vars{'t_min'}}{$t_action}++; + # Skipping parse and bind logs return if ($t_action !~ /query|statement|execute/); + # Log line that would not be parse } elsif ($prefix_vars{'t_loglevel'} eq 'LOG') { if ($prefix_vars{'t_query'} !~ -/incomplete startup packet|connection|receive|unexpected EOF|still waiting for [^\s]+Lock|checkpoint starting:|could not send data to client|parameter .*configuration file|autovacuum launcher|automatic (analyze|vacuum)/ +/incomplete startup 
packet|connection|receive|unexpected EOF|still waiting for [^\s]+Lock|checkpoint starting:|could not send data to client|parameter .*configuration file|autovacuum launcher|automatic (analyze|vacuum)|detected deadlock while waiting for/ ) { &logmsg('DEBUG', "Unrecognized line: $prefix_vars{'t_loglevel'}: $prefix_vars{'t_query'} at line $nlines"); } - if (exists $cur_info{$t_pid} && ($prefix_vars{'t_session_line'} != $cur_info{$t_pid}{session})) { - &store_queries($t_pid); - delete $cur_info{$t_pid}; + return; + } + + if ( ($format eq 'csv') && ($prefix_vars{'t_loglevel'} ne 'LOG')) { + $cur_info{$t_pid}{detail} = $prefix_vars{'t_detail'}; + $cur_info{$t_pid}{hint} = $prefix_vars{'t_hint'}; + $cur_info{$t_pid}{context} = $prefix_vars{'t_context'}; + $cur_info{$t_pid}{statement} = $prefix_vars{'t_statement'} + } + &set_current_infos($t_pid); + + return 1; +} + +sub set_current_infos +{ + + my $t_pid = shift; + + $cur_info{$t_pid}{year} = $prefix_vars{'t_year'}; + $cur_info{$t_pid}{month} = $prefix_vars{'t_month'}; + $cur_info{$t_pid}{day} = $prefix_vars{'t_day'}; + $cur_info{$t_pid}{hour} = $prefix_vars{'t_hour'}; + $cur_info{$t_pid}{min} = $prefix_vars{'t_min'}; + $cur_info{$t_pid}{sec} = $prefix_vars{'t_sec'}; + $cur_info{$t_pid}{timestamp} = $prefix_vars{'t_timestamp'}; + $cur_info{$t_pid}{ident} = $prefix_vars{'t_ident'}; + $cur_info{$t_pid}{query} = $prefix_vars{'t_query'}; + $cur_info{$t_pid}{duration} = $prefix_vars{'t_duration'}; + $cur_info{$t_pid}{pid} = $prefix_vars{'t_pid'}; + $cur_info{$t_pid}{session} = $prefix_vars{'t_session_line'}; + $cur_info{$t_pid}{loglevel} = $prefix_vars{'t_loglevel'}; + $cur_info{$t_pid}{dbname} = $prefix_vars{'t_dbname'}; + $cur_info{$t_pid}{dbuser} = $prefix_vars{'t_dbuser'}; + $cur_info{$t_pid}{dbclient} = $prefix_vars{'t_client'}; + $cur_info{$t_pid}{dbappname} = $prefix_vars{'t_appname'}; + $cur_info{$t_pid}{date} = $prefix_vars{'t_date'}; + +} + +sub store_tsung_session +{ + my $pid = shift; + + return if ($#{$tsung_session{$pid}{dates}} < 0); + + # Open filehandle + my $fh = new IO::File ">>$outfile"; + if (not defined $fh) { + die "FATAL: can't write to $outfile, $!\n"; + } + if ($pid) { + print $fh " \n"; + if (exists $tsung_session{$pid}{connection}{database}) { + print $fh qq{ + + + +}; + } + if ($#{$tsung_session{$pid}{dates}} >= 0) { + my $sec = 0; + if ($tsung_session{$pid}{connection}{date}) { + $sec = $tsung_session{$pid}{dates}[0] - $tsung_session{$pid}{connection}{date}; + } + print $fh " \n" if ($sec > 0); + print $fh " \n"; + for (my $i = 0 ; $i <= $#{$tsung_session{$pid}{queries}} ; $i++) { + $tsung_queries++; + $sec = 0; + if ($i > 0) { + $sec = $tsung_session{$pid}{dates}[$i] - $tsung_session{$pid}{dates}[$i - 1]; + print $fh " \n" if ($sec > 0); + } + print $fh " \n"; + } + print $fh " \n"; + } + if ($#{$tsung_session{$pid}{dates}} >= 0) { + my $sec = $tsung_session{$pid}{disconnection}{date} - $tsung_session{$pid}{dates}[-1]; + print $fh " \n" if ($sec > 0); } - return; + if (exists $tsung_session{$pid}{connection}{database}) { + print $fh " \n"; + } + print $fh " \n\n"; + delete $tsung_session{$pid}; } - - $cur_info{$t_pid}{year} = $prefix_vars{'t_year'}; - $cur_info{$t_pid}{month} = $prefix_vars{'t_month'}; - $cur_info{$t_pid}{day} = $prefix_vars{'t_day'}; - $cur_info{$t_pid}{hour} = $prefix_vars{'t_hour'}; - $cur_info{$t_pid}{min} = $prefix_vars{'t_min'}; - $cur_info{$t_pid}{sec} = $prefix_vars{'t_sec'}; - $cur_info{$t_pid}{date} = $prefix_vars{'t_date'}; - $cur_info{$t_pid}{ident} = $prefix_vars{'t_ident'}; - 
$cur_info{$t_pid}{query} = $prefix_vars{'t_query'}; - $cur_info{$t_pid}{duration} = $t_duration; - $cur_info{$t_pid}{pid} = $prefix_vars{'t_pid'}; - $cur_info{$t_pid}{session} = $prefix_vars{'t_session_line'}; - $cur_info{$t_pid}{loglevel} = $prefix_vars{'t_loglevel'}; - $cur_info{$t_pid}{dbname} = $prefix_vars{'t_dbname'}; - $cur_info{$t_pid}{dbuser} = $prefix_vars{'t_dbuser'}; - $cur_info{$t_pid}{dbclient} = $prefix_vars{'t_client'}; - $cur_info{$t_pid}{dbappname}= $prefix_vars{'t_appname'}; - - return 1; + $fh->close; } sub store_queries { - my $t_pid = shift; + my $t_pid = shift; + + # Remove comments if required + if ($remove_comment) { + $cur_info{$t_pid}{query} =~ s/\/\*(.*?)\*\///gs; + } + + # Stores temporary files and locks information + &store_temporary_and_lock_infos($t_pid); + + return if (!exists $cur_info{$t_pid}); + return if (!$cur_info{$t_pid}{year}); # Cleanup and normalize the current query - $cur_info{$t_pid}{query} =~ s/^[\t\s]+//s; - $cur_info{$t_pid}{query} =~ s/[\t\s]+$//s; + $cur_info{$t_pid}{query} =~ s/^[\t\s\r\n]+//s; + $cur_info{$t_pid}{query} =~ s/[\t\s\r\n;]+$//s; - # Should we have to exclude some queries - if ($#exclude_query >= 0) { - foreach (@exclude_query) { - if ($cur_info{$t_pid}{query} =~ /$_/i) { - $cur_info{$t_pid}{query} = ''; - return; + # Replace bind parameters values in the query if any + if (exists $cur_info{$t_pid}{parameters}) { + my @t_res = split(/[,\s]*\$(\d+)\s=\s/, $cur_info{$t_pid}{parameters}); + shift(@t_res); + for (my $i = 0 ; $i < $#t_res ; $i += 2) { + $cur_info{$t_pid}{query} =~ s/\$$t_res[$i]\b/$t_res[$i+1]/s; + } + } + + # We only process stored object with query here + if ($cur_info{$t_pid}{query}) { + # Should we just want select queries + if ($select_only) { + return if (($cur_info{$t_pid}{query} !~ /^SELECT/is) || ($cur_info{$t_pid}{query} =~ /FOR UPDATE/is)); + } + + # Should we have to exclude some queries + if ($#exclude_query >= 0) { + foreach (@exclude_query) { + if ($cur_info{$t_pid}{query} =~ /$_/i) { + $cur_info{$t_pid}{query} = ''; + return; + } + } + } + + # Should we have to include only some queries + if ($#include_query >= 0) { + foreach (@include_query) { + if ($cur_info{$t_pid}{query} !~ /$_/i) { + $cur_info{$t_pid}{query} = ''; + return; + } + } + } + + # Truncate the query if requested by the user + $cur_info{$t_pid}{query} = substr($cur_info{$t_pid}{query}, 0, $maxlength) . '[...]' + if (($maxlength > 0) && (length($cur_info{$t_pid}{query}) > $maxlength)); + + # Dump queries as tsung request and return + if ($extension eq 'tsung') { + if ($cur_info{$t_pid}{loglevel} eq 'LOG') { + push(@{$tsung_session{$t_pid}{queries}}, $cur_info{$t_pid}{query}); + push(@{$tsung_session{$t_pid}{dates}}, $cur_info{$t_pid}{date}); + if (!exists $tsung_session{$t_pid}{connection} && $cur_info{$t_pid}{dbname}) { + $tsung_session{$t_pid}{connection}{database} = $cur_info{$t_pid}{dbname}; + $tsung_session{$t_pid}{connection}{user} = $cur_info{$t_pid}{dbuser}; + $tsung_session{$t_pid}{connection}{date} = $cur_info{$t_pid}{date}; + } } + return; } } - # Truncate the query if requested by the user - $cur_info{$t_pid}{query} = substr($cur_info{$t_pid}{query}, 0, $maxlength) . 
'[...]' - if (($maxlength > 0) && (length($cur_info{$t_pid}{query}) > $maxlength)); + my $cur_day_str = "$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"; + my $cur_hour_str = "$cur_info{$t_pid}{hour}"; - # Store the collected informations into global statistics - if ($cur_info{$t_pid}{loglevel} =~ /WARNING|ERROR|FATAL|PANIC|HINT/) { + # Store the collected information into global statistics + if ($cur_info{$t_pid}{loglevel} =~ $main_error_regex) { - # Add log level at beginning of the query and normalyze it + # Add log level at beginning of the query and normalize it $cur_info{$t_pid}{query} = $cur_info{$t_pid}{loglevel} . ": " . $cur_info{$t_pid}{query}; my $normalized_error = &normalize_error($cur_info{$t_pid}{query}); - # Stores total and normalyzed error count + # Stores total and normalized error count $overall_stat{'errors_number'}++; - $overall_stat{'unique_normalized_errors'}{"$normalized_error"}++; $error_info{$normalized_error}{count}++; - # Stores normalyzed error count per time - $error_info{$normalized_error}{chronos}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {"$cur_info{$t_pid}{hour}"}{count}++; - - # Stores normalyzed query samples - my $cur_last_log_date = -"$cur_info{$t_pid}{year}-$cur_info{$t_pid}{month}-$cur_info{$t_pid}{day} $cur_info{$t_pid}{hour}:$cur_info{$t_pid}{min}:$cur_info{$t_pid}{sec}"; + # Stores normalized error count per time + $error_info{$normalized_error}{chronos}{"$cur_day_str"}{"$cur_hour_str"}{count}++; + + # Stores normalized query samples + my $cur_last_log_timestamp = "$cur_info{$t_pid}{year}-$cur_info{$t_pid}{month}-$cur_info{$t_pid}{day} " . + "$cur_info{$t_pid}{hour}:$cur_info{$t_pid}{min}:$cur_info{$t_pid}{sec}"; &set_top_error_sample( - $normalized_error, $cur_last_log_date, $cur_info{$t_pid}{query}, $cur_info{$t_pid}{detail}, - $cur_info{$t_pid}{context}, $cur_info{$t_pid}{statement}, $cur_info{$t_pid}{hint} + $normalized_error, $cur_last_log_timestamp, $cur_info{$t_pid}{query}, $cur_info{$t_pid}{detail}, + $cur_info{$t_pid}{context}, $cur_info{$t_pid}{statement}, $cur_info{$t_pid}{hint}, $cur_info{$t_pid}{dbname} ); } elsif ($cur_info{$t_pid}{loglevel} eq 'LOG') { - # Add a semi-colon at end of the query - $cur_info{$t_pid}{query} .= ';' if (substr($cur_info{$t_pid}{query}, -1, 1) ne ';'); - - # Normalyze query - my $normalized = &normalize_query($cur_info{$t_pid}{query}); - # Stores global statistics - my $cur_last_log_date = -"$cur_info{$t_pid}{year}-$cur_info{$t_pid}{month}-$cur_info{$t_pid}{day} $cur_info{$t_pid}{hour}:$cur_info{$t_pid}{min}:$cur_info{$t_pid}{sec}"; + $overall_stat{'queries_number'}++; - $overall_stat{'queries_duration'} += $cur_info{$t_pid}{duration}; - $overall_stat{'first_query'} = $cur_last_log_date if (!$overall_stat{'first_query'}); - $overall_stat{'last_query'} = $cur_last_log_date; - $overall_stat{'query_peak'}{$cur_last_log_date}++; - $per_hour_info{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{"$cur_info{$t_pid}{hour}"}{count}++; - $per_hour_info{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{"$cur_info{$t_pid}{hour}"} - {duration} += $cur_info{$t_pid}{duration}; + $overall_stat{'queries_duration'} += $cur_info{$t_pid}{duration} if ($cur_info{$t_pid}{duration}); + + my $cur_last_log_timestamp = "$cur_info{$t_pid}{year}-$cur_info{$t_pid}{month}-$cur_info{$t_pid}{day} " . 
+ "$cur_info{$t_pid}{hour}:$cur_info{$t_pid}{min}:$cur_info{$t_pid}{sec}"; + if (!$overall_stat{'first_query_ts'} || ($overall_stat{'first_query_ts'} gt $cur_last_log_timestamp)) { + $overall_stat{'first_query_ts'} = $cur_last_log_timestamp; + } + if (!$overall_stat{'last_query_ts'} || ($overall_stat{'last_query_ts'} lt $cur_last_log_timestamp)) { + $overall_stat{'last_query_ts'} = $cur_last_log_timestamp; + } + $overall_stat{'peak'}{$cur_last_log_timestamp}{query}++; if ($graph) { - $per_minute_info{query}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{$cur_info{$t_pid}{hour}} - {$cur_info{$t_pid}{min}}{count}++; - $per_minute_info{query}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{$cur_info{$t_pid}{hour}} - {$cur_info{$t_pid}{min}}{second}{$cur_info{$t_pid}{sec}}++; - $per_minute_info{query}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{$cur_info{$t_pid}{hour}} - {$cur_info{$t_pid}{min}}{duration} += $cur_info{$t_pid}{duration}; - } - if ($normalized =~ /delete from/) { - $overall_stat{'DELETE'}++; - $per_hour_info{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{"$cur_info{$t_pid}{hour}"} - {'DELETE'}{count}++; - $per_hour_info{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{"$cur_info{$t_pid}{hour}"} - {'DELETE'}{duration} += $cur_info{$t_pid}{duration}; - if ($graph) { - $per_minute_info{delete}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {$cur_info{$t_pid}{hour}}{$cur_info{$t_pid}{min}}{count}++; - $per_minute_info{delete}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {$cur_info{$t_pid}{hour}}{$cur_info{$t_pid}{min}}{duration} += $cur_info{$t_pid}{duration}; - } - } elsif ($normalized =~ /insert into/) { - $overall_stat{'INSERT'}++; - $per_hour_info{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{"$cur_info{$t_pid}{hour}"} - {'INSERT'}{count}++; - $per_hour_info{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{"$cur_info{$t_pid}{hour}"} - {'INSERT'}{duration} += $cur_info{$t_pid}{duration}; - if ($graph) { - $per_minute_info{insert}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {"$cur_info{$t_pid}{hour}"}{"$cur_info{$t_pid}{min}"}{count}++; - $per_minute_info{insert}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {"$cur_info{$t_pid}{hour}"}{"$cur_info{$t_pid}{min}"}{duration} += $cur_info{$t_pid}{duration}; - } - } elsif ($normalized =~ /update.*set\b/) { - $overall_stat{'UPDATE'}++; - $per_hour_info{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{"$cur_info{$t_pid}{hour}"} - {'UPDATE'}{count}++; - $per_hour_info{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{"$cur_info{$t_pid}{hour}"} - {'UPDATE'}{duration} += $cur_info{$t_pid}{duration}; - if ($graph) { - $per_minute_info{update}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {"$cur_info{$t_pid}{hour}"}{"$cur_info{$t_pid}{min}"}{count}++; - $per_minute_info{update}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {"$cur_info{$t_pid}{hour}"}{"$cur_info{$t_pid}{min}"}{duration} += $cur_info{$t_pid}{duration}; - } - } elsif ($normalized =~ /\bselect\b/is) { - $overall_stat{'SELECT'}++; - $per_hour_info{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{"$cur_info{$t_pid}{hour}"} - {'SELECT'}{count}++; - 
$per_hour_info{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"}{"$cur_info{$t_pid}{hour}"} - {'SELECT'}{duration} += $cur_info{$t_pid}{duration}; - if ($graph) { - $per_minute_info{select}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {"$cur_info{$t_pid}{hour}"}{"$cur_info{$t_pid}{min}"}{count}++; - $per_minute_info{select}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {"$cur_info{$t_pid}{hour}"}{"$cur_info{$t_pid}{min}"}{duration} += $cur_info{$t_pid}{duration}; - } - } - &set_top_slowest($cur_info{$t_pid}{query}, $cur_info{$t_pid}{duration}, $cur_last_log_date); - - # Store normalyzed query count - $normalyzed_info{$normalized}{count}++; - - # Store normalyzed query total duration - $normalyzed_info{$normalized}{duration} += $cur_info{$t_pid}{duration}; - - # Store normalyzed query count and duration per time - $normalyzed_info{$normalized}{chronos}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {"$cur_info{$t_pid}{hour}"}{count}++; - $normalyzed_info{$normalized}{chronos}{"$cur_info{$t_pid}{year}$cur_info{$t_pid}{month}$cur_info{$t_pid}{day}"} - {"$cur_info{$t_pid}{hour}"}{duration} += $cur_info{$t_pid}{duration}; + $per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{query}{count}++; + $per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{query}{second}{$cur_info{$t_pid}{sec}}++; + $per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{query}{duration} += $cur_info{$t_pid}{duration} if ($cur_info{$t_pid}{duration}); + # Store min / max duration + if (!exists $per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{min} || ($per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{min} > $cur_info{$t_pid}{duration})) { + $per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{min} = $cur_info{$t_pid}{duration}; + } + if (!exists $per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{max} || ($per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{max} < $cur_info{$t_pid}{duration})) { + $per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{max} = $cur_info{$t_pid}{duration}; + } + } + + # Counter per database and application name + if ($cur_info{$t_pid}{dbname}) { + $database_info{$cur_info{$t_pid}{dbname}}{count}++; + } else { + $database_info{'unknown'}{count}++; + } + if ($cur_info{$t_pid}{dbappname}) { + $application_info{$cur_info{$t_pid}{dbappname}}{count}++; + } else { + $application_info{'unknown'}{count}++; + } + if ($cur_info{$t_pid}{dbuser}) { + $user_info{$cur_info{$t_pid}{dbuser}}{count}++; + } else { + $user_info{'unknown'}{count}++; + } + if ($cur_info{$t_pid}{dbclient}) { + $host_info{$cur_info{$t_pid}{dbclient}}{count}++; + } else { + $host_info{'unknown'}{count}++; + } + + if ($cur_info{$t_pid}{query}) { + # Add a semi-colon at end of the query + $cur_info{$t_pid}{query} .= ';' if (substr($cur_info{$t_pid}{query}, -1, 1) ne ';'); + + # Normalize query + my $normalized = &normalize_query($cur_info{$t_pid}{query}); + + foreach my $act (@action_regex) { + if ($normalized =~ $act) { + my $action = uc($1); + $overall_stat{$action}++; + if ($action eq 'SELECT') { + $overall_stat{'peak'}{$cur_last_log_timestamp}{select}++; + } else { + $overall_stat{'peak'}{$cur_last_log_timestamp}{write}++; + } + $per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{$action}{count}++; + 
$per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{$action}{second}{$cur_info{$t_pid}{sec}}++; + $per_minute_info{"$cur_day_str"}{"$cur_hour_str"}{$cur_info{$t_pid}{min}}{$action}{duration} += $cur_info{$t_pid}{duration} if ($cur_info{$t_pid}{duration}); + if ($cur_info{$t_pid}{dbname}) { + $database_info{$cur_info{$t_pid}{dbname}}{$action}++; + } else { + $database_info{'unknown'}{$action}++; + } + if ($cur_info{$t_pid}{dbappname}) { + $application_info{$cur_info{$t_pid}{dbappname}}{$action}++; + } else { + $application_info{'unknown'}{$action}++; + } + if ($cur_info{$t_pid}{dbuser}) { + $user_info{$cur_info{$t_pid}{dbuser}}{$action}++; + } else { + $user_info{'unknown'}{$action}++; + } + if ($cur_info{$t_pid}{dbclient}) { + $host_info{$cur_info{$t_pid}{dbclient}}{$action}++; + } else { + $host_info{'unknown'}{$action}++; + } + last; + } + } + + # Store normalized query count + $normalyzed_info{$normalized}{count}++; + + # Store normalized query count and duration per time + $normalyzed_info{$normalized}{chronos}{"$cur_day_str"}{"$cur_hour_str"}{count}++; + if ($cur_info{$t_pid}{duration}) { + + # Updtate top slowest queries statistics + &set_top_slowest($cur_info{$t_pid}{query}, $cur_info{$t_pid}{duration}, $cur_last_log_timestamp, $cur_info{$t_pid}{dbname}, $cur_info{$t_pid}{dbuser}, $cur_info{$t_pid}{dbclient},$cur_info{$t_pid}{dbappname}); + + # Store normalized query total duration + $normalyzed_info{$normalized}{duration} += $cur_info{$t_pid}{duration}; + # Store min / max duration + if (!exists $normalyzed_info{$normalized}{min} || ($normalyzed_info{$normalized}{min} > $cur_info{$t_pid}{duration})) { + $normalyzed_info{$normalized}{min} = $cur_info{$t_pid}{duration}; + } + if (!exists $normalyzed_info{$normalized}{max} || ($normalyzed_info{$normalized}{max} < $cur_info{$t_pid}{duration})) { + $normalyzed_info{$normalized}{max} = $cur_info{$t_pid}{duration}; + } + + # Store normalized query count and duration per time + $normalyzed_info{$normalized}{chronos}{"$cur_day_str"}{"$cur_hour_str"}{duration} += $cur_info{$t_pid}{duration}; + + # Store normalized query samples + &set_top_sample($normalized, $cur_info{$t_pid}{query}, $cur_info{$t_pid}{duration}, $cur_last_log_timestamp, $cur_info{$t_pid}{dbname}, $cur_info{$t_pid}{dbuser}, $cur_info{$t_pid}{dbclient},$cur_info{$t_pid}{dbappname}); + } + } + } + +} + +sub store_temporary_and_lock_infos +{ + my $t_pid = shift; + + return if (!$t_pid); + + # Store normalized query temp file size if required + if (exists $cur_temp_info{$t_pid} && ($cur_temp_info{$t_pid}{query} ne '') && $cur_temp_info{$t_pid}{size}) { + + # Add a semi-colon at end of the query + $cur_temp_info{$t_pid}{query} .= ';' if (substr($cur_temp_info{$t_pid}{query}, -1, 1) ne ';'); + + # Normalize query + my $normalized = &normalize_query($cur_temp_info{$t_pid}{query}); + + $normalyzed_info{$normalized}{tempfiles}{size} += $cur_temp_info{$t_pid}{size}; + $normalyzed_info{$normalized}{tempfiles}{count}++; + + if ($normalyzed_info{$normalized}{tempfiles}{maxsize} < $cur_temp_info{$t_pid}{size}) { + $normalyzed_info{$normalized}{tempfiles}{maxsize} = $cur_temp_info{$t_pid}{size}; + } + if (!exists($normalyzed_info{$normalized}{tempfiles}{minsize}) + || $normalyzed_info{$normalized}{tempfiles}{minsize} > $cur_temp_info{$t_pid}{size}) { + $normalyzed_info{$normalized}{tempfiles}{minsize} = $cur_temp_info{$t_pid}{size}; + } + &set_top_tempfile_info($cur_temp_info{$t_pid}{query}, $cur_temp_info{$t_pid}{size}, $cur_temp_info{$t_pid}{timestamp}, 
$cur_temp_info{$t_pid}{dbname}, $cur_temp_info{$t_pid}{dbuser}, $cur_temp_info{$t_pid}{dbclient}, $cur_temp_info{$t_pid}{dbappname}); + delete $cur_temp_info{$t_pid}; + } + + # Store normalized query that waited the most if required + if (exists $cur_lock_info{$t_pid} && ($cur_lock_info{$t_pid}{query} ne '') && $cur_lock_info{$t_pid}{wait}) { + + # Add a semi-colon at end of the query + $cur_lock_info{$t_pid}{query} .= ';' if (substr($cur_lock_info{$t_pid}{query}, -1, 1) ne ';'); + + # Normalize query + my $normalized = &normalize_query($cur_lock_info{$t_pid}{query}); - # Store normalyzed query samples - &set_top_sample($normalized, $cur_info{$t_pid}{query}, $cur_info{$t_pid}{duration}, $last_log_date); + $normalyzed_info{$normalized}{locks}{wait} += $cur_lock_info{$t_pid}{wait}; + $normalyzed_info{$normalized}{locks}{count}++; + if ($normalyzed_info{$normalized}{locks}{maxwait} < $cur_lock_info{$t_pid}{wait}) { + $normalyzed_info{$normalized}{locks}{maxwait} = $cur_lock_info{$t_pid}{wait}; + } + if (!exists($normalyzed_info{$normalized}{locks}{minwait}) + || $normalyzed_info{$normalized}{locks}{minwait} > $cur_lock_info{$t_pid}{wait}) { + $normalyzed_info{$normalized}{locks}{minwait} = $cur_lock_info{$t_pid}{wait}; + } + &set_top_locked_info($cur_lock_info{$t_pid}{query}, $cur_lock_info{$t_pid}{wait}, $cur_lock_info{$t_pid}{timestamp}, $cur_lock_info{$t_pid}{dbname}, $cur_lock_info{$t_pid}{dbuser}, $cur_lock_info{$t_pid}{dbclient}, $cur_lock_info{$t_pid}{dbappname}); + delete $cur_lock_info{$t_pid}; } + } -# Normalyze error messages +# Normalize error messages sub normalize_error { my $orig_query = shift; @@ -3578,9 +8782,11 @@ $orig_query =~ s/"[^"]*"/"..."/g; $orig_query =~ s/\(.*\)/\(...\)/g; $orig_query =~ s/column .* does not exist/column "..." does not exist/; + $orig_query =~ s/(database system was shut down at).*/$1 .../; + $orig_query =~ s/(relation) \d+ (deleted while still in use)/$1 ... $2/g; + $orig_query =~ s/[0-9A-F]{24}/.../g; # Remove WAL filename # Need more normalization stuff here - return $orig_query; } @@ -3590,16 +8796,15 @@ my $idx = shift; my @avgs = (); - for (my $i = 0 ; $i < 59 ; $i += $idx) { + for (my $i = 0 ; $i < 60 ; $i += $idx) { push(@avgs, sprintf("%02d", $i)); } - push(@avgs, 59); for (my $i = 0 ; $i <= $#avgs ; $i++) { if ($val == $avgs[$i]) { return "$avgs[$i]"; - } elsif ($avgs[$i] == $avgs[-1]) { - return "$avgs[$i-1]"; + } elsif ($i == $#avgs) { + return "$avgs[$i]"; } elsif (($val > $avgs[$i]) && ($val < $avgs[$i + 1])) { return "$avgs[$i]"; } @@ -3615,48 +8820,68 @@ my $nfound = 0; my $nline = 0; my $fmt = ''; - my $tfile = new IO::File; - if ($file !~ /\.gz/) { - $tfile->open($file) || die "FATAL: cannot read logfile $file. $!\n"; - } else { - # Open a pipe to zcat program for compressed log - $tfile->open("$zcat $file |") || die "FATAL: cannot read from pipe to $zcat $file. $!\n"; - } - my $duration = 'duration:'; - if ($error_only || ($disable_hourly && $disable_query)) { - $duration = ''; - } - my %ident_name = (); - while (my $line = <$tfile>) { - chomp($line); - $line =~ s/ //; - next if (!$line); - $nline++; - - # Is syslog lines ? 
-            if ($line =~ /^[A-Z][a-z]{2}\s+\d+\s\d+:\d+:\d+\s[^\s]+\s([^\[]+)\[\d+\]:\s\[[0-9\-]+\](.*?)(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+$duration/) {
-                $fmt = 'syslog';
-                $nfound++;
-                $ident_name{$1}++;
-
-            # Is stderr lines
-            } elsif ( ($line =~ /^\d+-\d+-\d+ \d+:\d+:\d+\.\d+ [A-Z\d]{3,6},.*,(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT),/) && ($line =~ tr/,/,/ >= 12) ) {
-                $fmt = 'csv';
-                $nfound++;
-            } elsif ($line =~ /\d+-\d+-\d+ \d+:\d+:\d+[\.0-9]* [A-Z\d]{3,6}(.*?)(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+$duration/) {
-                $fmt = 'stderr';
-                $nfound++;
-            }
-            last if (($nfound > 10) || ($nline > 5000));
-        }
-        $tfile->close();
-        if (!$fmt || ($nfound < 10)) {
-            die "FATAL: unable to detect log file format from $file, please use -f option.\n";
-        }
+    die "FATAL: can't open file $file, $!\n" unless(open(TESTFILE, $file));
+    my $fltf = <TESTFILE>;
+    close($fltf);
+    # is file in binary format ?
+    if ( $fltf =~ /^pst\d/ ) {
+        $fmt = 'binary';
+    }
+    else { # try to detect syslogs or csv
+        my ($tfile, $totalsize) = &get_log_file($file);
+        my %ident_name = ();
+        while (my $line = <$tfile>) {
+            chomp($line);
+            $line =~ s/\r//;
+            next if (!$line);
+            $nline++;
+
+            # Are syslog lines ?
+            if ($line =~
+                /^[A-Z][a-z]{2}\s+\d+\s\d+:\d+:\d+(?:\s[^\s]+)?\s[^\s]+\s([^\s\[]+)\[\d+\]:(?:\s\[[^\]]+\])?\s\[\d+\-\d+\].*?(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):/
+                )
+            {
+                $fmt = 'syslog';
+                $nfound++;
+                $ident_name{$1}++;
+
+            } elsif ($line =~
+                /^\d+-\d+-\d+T\d+:\d+:\d+(?:.[^\s]+)?\s[^\s]+\s(?:[^\s]+\s)?([^\s\[]+)\[\d+\]:(?:\s\[[^\]]+\])?\s\[\d+\-\d+\].*?(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):/
+                )
+            {
+                $fmt = 'syslog2';
+                $nfound++;
+                $ident_name{$1}++;
+
+                # Are stderr lines ?
+            } elsif (
+                (
+                    $line =~
+                    /^\d+-\d+-\d+ \d+:\d+:\d+\.\d+(?: [A-Z\d]{3,6})?,.*,(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT),/
+                )
+                && ($line =~ tr/,/,/ >= 12)
+                )
+            {
+                $fmt = 'csv';
+                $nfound++;
+            } elsif ($line =~
+                /\d+-\d+-\d+ \d+:\d+:\d+[\.0-9]*(?: [A-Z\d]{3,6})?(.*?)(LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|STATEMENT|HINT|CONTEXT):\s+/
+                )
+            {
+                $fmt = 'stderr';
+                $nfound++;
+            }
+            last if (($nfound > 10) || ($nline > 5000));
+        }
+        $tfile->close();
+        if (!$fmt || ($nfound < 10)) {
+            die "FATAL: unable to detect log file format from $file, please use -f option.\n";
+        }
-    if (($fmt eq 'syslog') && !$ident && (scalar keys %ident_name == 1)) {
-        $ident = (keys %ident_name)[0];
+        if (($fmt =~ /syslog/) && !$ident && (scalar keys %ident_name == 1)) {
+            $ident = (keys %ident_name)[0];
+        }
     }
     &logmsg('DEBUG', "Autodetected log format '$fmt' from $file");
@@ -3666,50 +8891,103 @@
 sub progress_bar
 {
-    my ($got, $total, $width, $char) = @_;
+    my ($got, $total, $width, $char, $queries, $errors) = @_;
     $width ||= 25;
     $char ||= '=';
     my $num_width = length $total;
-    sprintf(
-        "[%-${width}s] Parsed %${num_width}s bytes of %s (%.2f%%)\r",
-        $char x (($width - 1) * $got / $total) . '>',
-        $got, $total, 100 * $got / +$total
-    );
+    my $nchars = (($width - 1) * $got / $total);
+    $nchars = ($width - 1) if ($nchars >= $width);
+    if ($extension eq 'tsung') {
+        sprintf(
+            "[%-${width}s] Parsed %${num_width}s bytes of %s (%.2f%%), queries: %d\r",
+            $char x $nchars .
'>', + $got, $total, 100 * $got / +$total, ($queries || $tsung_queries) + ); + } elsif ($format eq 'binary') { + my $file = $_[-1]; + sprintf( + "Loaded %d queries and %d events from binary file %s...\r", + $overall_stat{'queries_number'}, $overall_stat{'errors_number'}, $file + ); + } else { + sprintf( + "[%-${width}s] Parsed %${num_width}s bytes of %s (%.2f%%), queries: %d, events: %d\r", + $char x $nchars . '>', + $got, $total, 100 * $got / +$total, ($queries || $overall_stat{'queries_number'}), ($errors || $overall_stat{'errors_number'}) + ); + } } sub flotr2_graph { my ($buttonid, $divid, $data1, $data2, $data3, $title, $ytitle, $legend1, $legend2, $legend3, $ytitle2, $data4, $legend4) = @_; - $data1 = "var d1 = [$data1];" if ($data1); - $data2 = "var d2 = [$data2];" if ($data2); - $data3 = "var d3 = [$data3];" if ($data3); - $data4 = "var d4 = [$data4];" if ($data4); - - $legend1 = "{ data: d1, label: \"$legend1\" }," if ($legend1); - $legend2 = "{ data: d2, label: \"$legend2\" }," if ($legend2); - $legend3 = "{ data: d3, label: \"$legend3\" }," if ($legend3); - $legend4 = "{ data: d4, label: \"$legend4\",yaxis: 2 }," if ($legend4); - + if (!$data1) { + return qq{ +
    NO DATASET
    +}; + } + my $dateTracker_lblopts = ''; + if ($legend1) { + $dateTracker_lblopts .= "'$legend1',"; + $legend1 = "{ data: d1, label: \"$legend1\", color: \"#6e9dc9\", mouse:{track:true}},"; + } + if ($legend2) { + $dateTracker_lblopts .= "'$legend2',"; + $legend2 = "{ data: d2, label: \"$legend2\", color: \"#f4ab3a\", mouse:{track:true}},"; + } + if ($legend3) { + $dateTracker_lblopts .= "'$legend3',"; + $legend3 = "{ data: d3, label: \"$legend3\", color: \"#ac7fa8\", mouse:{track:true}},"; + } + if ($legend4) { + $dateTracker_lblopts .= "'$legend4',"; + $legend4 = "{ data: d4, label: \"$legend4\", color: \"#8dbd0f\",yaxis: 2},"; + } + $dateTracker_lblopts =~ s/,$//; + $dateTracker_lblopts = "[$dateTracker_lblopts]"; + + my $dateTracker_dataopts = ''; + if ($data1) { + $data1 = "var d1 = [$data1];"; + $dateTracker_dataopts .= "d1,"; + } + if ($data2) { + $data2 = "var d2 = [$data2];"; + $dateTracker_dataopts .= "d2,"; + } + if ($data3) { + $data3 = "var d3 = [$data3];"; + $dateTracker_dataopts .= "d3,"; + } + if ($data4) { + $data4 = "var d4 = [$data4];"; + $dateTracker_dataopts .= "d4,"; + } + $dateTracker_dataopts =~ s/,$//; + $dateTracker_dataopts = "[$dateTracker_dataopts]"; + my $yaxis2 = ''; if ($ytitle2) { - $yaxis2 = "y2axis: { title: \"$ytitle2\", min: 0, color: \"#4DA74D\" },"; + $yaxis2 = "y2axis: { mode: \"normal\", title: \"$ytitle2\", min: 0, color: \"#8dbd0f\" },"; } - my $min = $t_min; - my $max = $t_max; - if ($divid !~ /persecond/) { - $min = $t_min_hour; - $max = $t_max_hour; + my $type = ''; + if ($ytitle eq 'Size of files') { + $type = 'size'; + } elsif ($ytitle eq 'Duration') { + $type = 'duration'; } - print $fh <
    + + return <
    EOF @@ -3810,30 +9099,40 @@ { my ($buttonid, $divid, $title, %data) = @_; + if (scalar keys %data == 0) { + return qq{ +
    NO DATASET
    +}; + + } + my @colors = ("#6e9dc9", "#f4ab3a", "#ac7fa8", "#8dbd0f"); my @datadef = (); my @contdef = (); my $i = 1; foreach my $k (sort keys %data) { push(@datadef, "var d$i = [ [0,$data{$k}] ];\n"); - push(@contdef, "{ data: d$i, label: \"$k\" },\n"); + my $color = ''; + $color = ", color: \"$colors[$i-1]\"" if (($i-1) <= $#colors); + push(@contdef, "{ data: d$i, label: \"$k\"$color },\n"); $i++; } - print $fh <
    + + return < +EOF + +} + +sub flotr2_histograph +{ + my ($buttonid, $divid, $data1, $data2) = @_; + + if (!$data1) { + return qq{ +
    NO DATASET
    +}; + } + my $legend1 = 'Avg. queries'; + my $legend2 = ''; + my $dateTracker_lblopts = "'$legend1',"; + $legend1 = "{ data: d1, label: \"$legend1\", color: \"#6e9dc9\", mouse:{track:true}, bars: {show: true,shadowSize: 0}, },"; + if ($data2) { + $legend2 = 'Avg. duration'; + $dateTracker_lblopts .= "'$legend2'"; + $legend2 = "{ data: d2, label: \"$legend2\", color: \"#8dbd0f\", yaxis: 2},"; + } + $dateTracker_lblopts =~ s/,$//; + $dateTracker_lblopts = "[$dateTracker_lblopts]"; + + my $dateTracker_dataopts = ''; + if ($data1) { + $data1 = "var d1 = [$data1];"; + $dateTracker_dataopts .= "d1,"; + } + if ($data2) { + $data2 = "var d2 = [$data2];"; + $dateTracker_dataopts .= "d2"; + } + $dateTracker_dataopts =~ s/,$//; + $dateTracker_dataopts = "[$dateTracker_dataopts]"; + + my $yaxis2 = "y2axis: { mode: \"normal\", title: \"Duration\", min: 0, color: \"#8dbd0f\", tickFormatter: function(val){ return pretty_print_number(val,'duration') }, },"; + $yaxis2 = '' if (!$data2); + + return < + EOF @@ -3891,34 +9338,38 @@ sub build_log_line_prefix_regex { -#'%m %u@%d %p %r %a : ' -#2012-09-09 10:32:26.810 CEST [unknown]@[unknown] 21111 [unknown] : my %regex_map = ( - '%a' => [ ('t_appname', '([0-9a-zA-Z\.\-\_\/\[\]]*)') ], # application name - '%u' => [ ('t_dbuser', '([0-9a-zA-Z\_\[\]]*)') ], # user name - '%d' => [ ('t_dbname', '([0-9a-zA-Z\_\[\]]*)') ], # database name - '%r' => [ ('t_hostport', '(\[local\]|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})?[:\d]*') ], # remote host and port - '%h' => [ ('t_client', '(\[local\]|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})?') ], # remote host - '%p' => [ ('t_pid', '(\d+)') ], # process ID - '%t' => [ ('t_timestamp', '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) [A-Z\d]{3,6}') ], # timestamp without milliseconds - '%m' => [ ('t_mtimestamp', '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ [A-Z\d]{3,6}') ], # timestamp with milliseconds - '%l' => [ ('t_session_line', '(\d+)') ], # session line number - '%s' => [ ('t_session_timestamp', '(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}) [A-Z\d]{3,6}') ], # session start timestamp - '%c' => [ ('t_session_id', '([0-9a-f\.]*)') ], # session ID - '%v' => [ ('t_virtual_xid', '([0-9a-f\.\/]*)') ], # virtual transaction ID - '%x' => [ ('t_xid', '([0-9a-f\.\/]*)') ], # transaction ID - '%i' => [ ('t_command', '([0-9a-zA-Z\.\-\_]*)') ], # command tag - '%e' => [ ('t_sqlstate', '([0-9a-zA-Z]+)') ], # SQL state + #'%a' => [('t_appname', '([0-9a-zA-Z\.\-\_\/\[\]]*)')], # application name + '%a' => [('t_appname', '(.*)')], # application name + '%u' => [('t_dbuser', '([0-9a-zA-Z\_\[\]\-]*)')], # user name + '%d' => [('t_dbname', '([0-9a-zA-Z\_\[\]\-]*)')], # database name + '%r' => [('t_hostport', '([a-zA-Z0-9\-\.]+|\[local\]|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|[0-9a-fA-F:]+)?[\(\d\)]*')], # remote host and port + '%h' => [('t_client', '([a-zA-Z0-9\-\.]+|\[local\]|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|[0-9a-fA-F:]+)?')], # remote host + '%p' => [('t_pid', '(\d+)')], # process ID + '%t' => [('t_timestamp', '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?: [A-Z\d]{3,6})?')], # timestamp without milliseconds + '%m' => [('t_mtimestamp', '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+(?: [A-Z\d]{3,6})?')], # timestamp with milliseconds + '%l' => [('t_session_line', '(\d+)')], # session line number + '%s' => [('t_session_timestamp', '(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}(?: [A-Z\d]{3,6})?')], # session start timestamp + '%c' => [('t_session_id', '([0-9a-f\.]*)')], # session ID + '%v' => [('t_virtual_xid', '([0-9a-f\.\/]*)')], # virtual transaction ID + '%x' => [('t_xid', 
'([0-9a-f\.\/]*)')], # transaction ID + '%i' => [('t_command', '([0-9a-zA-Z\.\-\_]*)')], # command tag + '%e' => [('t_sqlstate', '([0-9a-zA-Z]+)')], # SQL state ); my @param_list = (); + $log_line_prefix =~ s/([\[\]\|\(\)\{\}])/\\$1/g; $log_line_prefix =~ s/\%l([^\d])\d+/\%l$1\\d\+/; while ($log_line_prefix =~ s/(\%[audrhptmlscvxie])/$regex_map{"$1"}->[1]/) { push(@param_list, $regex_map{"$1"}->[0]); } + # replace %% by a single % + $log_line_prefix =~ s/\%\%/\%/; return @param_list; } -# Inclusion of package SQL::Beautify +# Inclusion of Perl package SQL::Beautify +# Copyright (C) 2009 by Jonas Kramer +# Published under the terms of the Artistic License 2.0. { package SQL::Beautify; @@ -4000,7 +9451,7 @@ (?:[\w:@]+(?:\.(?:\w+|\*)?)*) # words, standard named placeholders, db.table.*, db.* | - (?: \$_\$ | \$\d+ | \${1,2} ) + (?: \$_\$ | \$\d+ | \${1,2}) # dollar expressions - eg $_$ $3 $$ | \n # newline @@ -4072,7 +9523,7 @@ $self->{_level_stack} = []; $self->{_new_line} = 1; - my $last; + my $last = ''; $self->{_tokens} = [tokenize_sql($self->query, 1)]; while (defined(my $token = $self->_token)) { @@ -4091,10 +9542,15 @@ } elsif ($token eq ')') { - $self->_new_line; +# $self->_new_line; $self->{_level} = pop(@{$self->{_level_stack}}) || 0; $self->_add_token($token); - $self->_new_line; + $self->_new_line if ($self->_next_token + and $self->_next_token !~ /^AS$/i + and $self->_next_token ne ')' + and $self->_next_token !~ /::/ + and $self->_next_token ne ';' + ); } elsif ($token eq ',') { @@ -4111,11 +9567,11 @@ $self->{_level} = 0; } - elsif ($token =~ /^(?:SELECT|FROM|WHERE|HAVING)$/i) { - $self->_back unless $last and $last eq '('; + elsif ($token =~ /^(?:SELECT|FROM|WHERE|HAVING|BEGIN|SET)$/i) { + $self->_back if ($last and $last ne '(' and $last ne 'FOR'); $self->_new_line; $self->_add_token($token); - $self->_new_line if ($self->_next_token and $self->_next_token ne '('); + $self->_new_line if ((($token ne 'SET') || $last) and $self->_next_token and $self->_next_token ne '(' and $self->_next_token ne ';'); $self->_over; } @@ -4131,10 +9587,33 @@ $self->_over; } - elsif ($token =~ /^(?:UNION|INTERSECT|EXCEPT)$/i) { + elsif ($token =~ /^(?:CASE)$/i) { + $self->_add_token($token); + $self->_over; + } + + elsif ($token =~ /^(?:WHEN)$/i) { + $self->_new_line; + $self->_add_token($token); + } + + elsif ($token =~ /^(?:ELSE)$/i) { + $self->_new_line; + $self->_add_token($token); + } + + elsif ($token =~ /^(?:END)$/i) { + $self->_back; $self->_new_line; $self->_add_token($token); + } + + elsif ($token =~ /^(?:UNION|INTERSECT|EXCEPT)$/i) { + $self->_back unless $last and $last eq '('; $self->_new_line; + $self->_add_token($token); + $self->_new_line if ($self->_next_token and $self->_next_token ne '('); + $self->_over; } elsif ($token =~ /^(?:LEFT|RIGHT|INNER|OUTER|CROSS)$/i) { @@ -4152,10 +9631,26 @@ $self->_add_token($token); } - elsif ($token =~ /^(?:AND|OR)$/i) { - $self->_new_line; - $self->_add_token($token); - $self->_new_line; + elsif ($token =~ /^(?:AND|OR)$/i) { + $self->_new_line; + $self->_add_token($token); +# $self->_new_line; + } + + elsif ($token =~ /^--/) { + if (!$self->{no_comments}) { + $self->_add_token($token); + $self->_new_line; + } + } + + elsif ($token =~ /^\/\*.*\*\/$/s) { + if (!$self->{no_comments}) { + $token =~ s/\n[\s\t]+\*/\n\*/gs; + $self->_new_line; + $self->_add_token($token); + $self->_new_line; + } } else { @@ -4338,9 +9833,236 @@ } +sub get_log_file +{ + my $logf = shift; + + my $lfile = undef; + + # get file size + my $totalsize = (stat("$logf"))[7] || 
0; + my $iscompressed = 1; + + # Open a file handle + if ($logf !~ /\.(gz|bz2|zip)/i) { + open($lfile, $logf) || die "FATAL: cannot read log file $logf. $!\n"; + $totalsize = 0 if ($lfile eq '-'); + $iscompressed = 0; + } else { + my $uncompress = $zcat; + if (($logf =~ /\.bz2/i) && ($zcat =~ /^$zcat_cmd$/)) { + $uncompress = $bzcat; + } elsif (($logf =~ /\.zip/i) && ($zcat =~ /^$zcat_cmd$/)) { + $uncompress = $ucat; + } + &logmsg('DEBUG', "Compressed log file, will use command: $uncompress \"$logf\""); + + # Open a pipe to zcat program for compressed log + open($lfile,"$uncompress \"$logf\" |") || die "FATAL: cannot read from pipe to $uncompress \"$logf\". $!\n"; + + # Real size of the file is unknown, try to find it + # bz2 does not report real size + $totalsize = 0; + if ($logf =~ /\.(gz|zip)/i) { + my $cmd_file_size = $gzip_uncompress_size; + if ($logf =~ /\.zip/i) { + $cmd_file_size = $zip_uncompress_size; + } + $cmd_file_size =~ s/\%f/$logf/g; + $totalsize = `$cmd_file_size`; + chomp($totalsize); + } + $queue_size = 0; + } + + # In list context returns the filehandle and the size of the file + if (wantarray()) { + return ($lfile, $totalsize, $iscompressed); + } + # In scalar context return size only + close($lfile); + return $totalsize; +} + +sub split_logfile +{ + my $logf = shift; + + # CSV file can't be parsed using multiprocessing + return (0, -1) if ( $format eq 'csv' ); + + # get file size + my $totalsize = (stat("$logf"))[7] || 0; + + # Real size of the file is unknown, try to find it + # bz2 does not report real size + if ($logf =~ /\.(gz|zip)/i) { + $totalsize = 0; + my $cmd_file_size = $gzip_uncompress_size; + if ($logf =~ /\.zip/i) { + $cmd_file_size = $zip_uncompress_size; + } + $cmd_file_size =~ s/\%f/$logf/g; + $totalsize = `$cmd_file_size`; + chomp($totalsize); + $queue_size = 0; + } elsif ($logf =~ /\.bz2/i) { + $totalsize = 0; + $queue_size = 0; + } + + return (0, -1) if (!$totalsize); + + my @chunks = (0); + my $i = 1; + if ($last_parsed && $saved_last_line{current_pos} && ($#given_log_files == 0)) { + $chunks[0] = $saved_last_line{current_pos}; + $i = $saved_last_line{current_pos}; + } + while ($i < $queue_size) { + push(@chunks, int(($totalsize/$queue_size) * $i)); + $i++; + } + push(@chunks, $totalsize); + + return @chunks; +} + +# Return the week number of the year for a given date +sub get_week_number +{ + my ($year, $month, $day) = @_; + +# %U The week number of the current year as a decimal number, range 00 to 53, starting with the first +# Sunday as the first day of week 01. +# %V The ISO 8601 week number (see NOTES) of the current year as a decimal number, range 01 to 53, +# where week 1 is the first week that has at least 4 days in the new year. +# %W The week number of the current year as a decimal number, range 00 to 53, starting with the first +# Monday as the first day of week 01. + + # Check if the date is valide first + my $datefmt = POSIX::strftime("%F", 1, 1, 1, $day, $month - 1, $year - 1900); + if ($datefmt ne "$year-$month-$day") { + return -1; + } + my $weekNumber = POSIX::strftime("%W", 1, 1, 1, $day, $month - 1, $year - 1900); + + return sprintf("%02d", $weekNumber+1); +} + +# Returns day number of the week of a given days +sub get_day_of_week +{ + my ($year, $month, $day) = @_; + +# %u The day of the week as a decimal, range 1 to 7, Monday being 1. +# %w The day of the week as a decimal, range 0 to 6, Sunday being 0. 
+ + my $weekDay = POSIX::strftime("%u", 1,1,1,$day,--$month,$year-1900); + + return $weekDay; +} + +# Returns all days following the week number +sub get_wdays_per_month +{ + my $wn = shift; + my ($year, $month) = split(/\-/, shift); + my @months = (); + my @retdays = (); + + $month ||= '01'; + push(@months, "$year$month"); + my $start_month = $month; + if ($month eq '01') { + unshift(@months, ($year - 1) . "12"); + } else { + unshift(@months, $year . sprintf("%02d", $month - 1)); + } + if ($month == 12) { + push(@months, ($year+1) . "01"); + } else { + push(@months, $year . sprintf("%02d", $month + 1)); + } + + foreach my $d (@months) { + $d =~ /^(\d{4})(\d{2})$/; + my $y = $1; + my $m = $2; + foreach my $day ("01" .. "31") { + # Check if the date is valide first + my $datefmt = POSIX::strftime("%F", 1, 1, 1, $day, $m - 1, $y - 1900); + if ($datefmt ne "$y-$m-$day") { + next; + } + my $weekNumber = POSIX::strftime("%W", 1, 1, 1, $day, $m - 1, $y - 1900); + if ( ($weekNumber == $wn) || ( ($weekNumber eq '00') && (($wn == 1) || ($wn >= 52)) ) ) { + push(@retdays, "$year-$m-$day"); + return @retdays if ($#retdays == 6); + } + next if ($weekNumber > $wn); + } + } + + return @retdays; +} + +# Returns all days following the week number +sub get_wdays_per_year +{ + my $y = shift; + my %result = (); + + foreach my $m ("01" .. "12") { + foreach my $day ("01" .. "31") { + # Check if the date is valide first + my $datefmt = POSIX::strftime("%F", 1, 1, 1, $day, $m - 1, $y - 1900); + if ($datefmt ne "$y-$m-$day") { + next; + } + my $weekNumber = POSIX::strftime("%W", 1, 1, 1, $day, $m - 1, $y - 1900); + push(@{$result{$weekNumber}}, "$y-$m-$day"); + } + } + + return %result; +} + + + __DATA__ + + + + + + + + + + diff -Nru pgbadger-2.0/README pgbadger-5.0/README --- pgbadger-2.0/README 2012-09-18 09:56:31.000000000 +0000 +++ pgbadger-5.0/README 2014-02-06 15:29:04.000000000 +0000 @@ -1,4 +1,4 @@ -ABSTRACT +NAME pgBadger - a fast PostgreSQL log analysis report SYNOPSIS @@ -9,64 +9,89 @@ Arguments: logfile can be a single log file, a list of files, or a shell command - returning a list of file. If you want to pass log content from stdin - use - as filename. + returning a list of files. If you want to pass log content from stdin + use - as filename. Note that input from stdin will not work with csvlog. Options: -a | --average minutes : number of minutes to build the average graphs of queries and connections. -b | --begin datetime : start date/time for the data to be parsed in log. - -d | --dbname database : only report what concern the given database + -c | --dbclient host : only report on entries for the given client host. + -C | --nocomment : remove comments like /* ... */ from queries. + -d | --dbname database : only report on entries for the given database. -e | --end datetime : end date/time for the data to be parsed in log. -f | --format logtype : possible values: syslog,stderr,csv. Default: stderr -G | --nograph : disable graphs on HTML output. Enable by default. -h | --help : show this message and exit. -i | --ident name : programname used as syslog ident. Default: postgres + -I | --incremental : use incremental mode, reports will be generated by + days in a separate directory, --outdir must be set. + -j | --jobs number : number of jobs to run on parallel on each log file. + Default is 1, run as single process. + -J | --Jobs number : number of log file to parse in parallel. Default + is 1, run as single process. 
-l | --last-parsed file: allow incremental log parsing by registering the last datetime and line parsed. Useful if you want to watch errors since last run or if you want one report per day with a log rotated each week. - -m | --maxlength size : maximum length of a query, it will be cutted above + -m | --maxlength size : maximum length of a query, it will be restricted to the given size. Default: no truncate -n | --nohighlight : disable SQL code highlighting. - -N | --appname name : only report what concern the given application name - -o | --outfile filename: define the filename for the output. Default depends - of the output format: out.html or out.txt. To dump - output to stdout use - as filename. + -N | --appname name : only report on entries for given application name + -o | --outfile filename: define the filename for output. Default depends on + the output format: out.html, out.txt or out.tsung. + To dump output to stdout use - as filename. + -O | --outdir path : directory where out file must be saved. -p | --prefix string : give here the value of your custom log_line_prefix - defined in your postgresql.conf. You may used it only - if don't have the default allowed formats ot to use - other custom variables like client ip or application - name. See examples below. + defined in your postgresql.conf. Only use it if you + aren't using one of the standard prefixes specified + in the pgBadger documentation, such as if your prefix + includes additional variables like client ip or + application name. See examples below. -P | --no-prettify : disable SQL queries prettify formatter. - -q | --quiet : don't print anything to stdout, even not a progress bar. - -s | --sample number : number of query sample to store/display. Default: 3 - -t | --top number : number of query to store/display. Default: 20 + -q | --quiet : don't print anything to stdout, not even a progress bar. + -s | --sample number : number of query samples to store/display. Default: 3 + -S | --select-only : use it if you want to report select queries only. + -t | --top number : number of queries to store/display. Default: 20 -T | --title string : change title of the HTML page report. - -u | --dbuser username : only report what concern the given user + -u | --dbuser username : only report on entries for the given user. + -U | --exclude-user username : exclude entries for the specified user from report. -v | --verbose : enable verbose or debug mode. Disabled by default. -V | --version : show pgBadger version and exit. -w | --watch-mode : only report errors just like logwatch could do. - -x | --extension : output format. Values: text or html. Default: html + -x | --extension : output format. Values: text, html or tsung. Default: html -z | --zcat exec_path : set the full path to the zcat program. Use it if - zcat is not on your path or you want to use gzcat. + zcat or bzcat or unzip is not on your path. --pie-limit num : pie data lower than num% will show a sum instead. --exclude-query regex : any query matching the given regex will be excluded from the report. For example: "^(VACUUM|COMMIT)" - you can use this option multiple time. + You can use this option multiple times. --exclude-file filename: path of the file which contains all the regex to use to exclude queries from the report. One regex per line. + --include-query regex : any query that does not match the given regex will be + excluded from the report. For example: "(table_1|table_2)" + You can use this option multiple times. 
+    --include-file filename: path of the file which contains all the regex of
+                             the queries to include from the report. One regex
+                             per line.
     --disable-error        : do not generate error report.
-    --disable-hourly       : do not generate hourly reports.
+    --disable-hourly       : do not generate hourly report.
     --disable-type         : do not generate query type report.
-    --disable-query        : do not generate queries reports (slowest, most
+    --disable-query        : do not generate query reports (slowest, most
                              frequent, ...).
     --disable-session      : do not generate session report.
     --disable-connection   : do not generate connection report.
     --disable-lock         : do not generate lock report.
     --disable-temporary    : do not generate temporary report.
     --disable-checkpoint   : do not generate checkpoint report.
+    --disable-autovacuum   : do not generate autovacuum report.
+    --charset              : used to set the HTML charset to be used.
+                             Default: utf-8.
+    --csv-separator        : used to set the CSV field separator, default: ,
+    --exclude-time regex   : any timestamp matching the given regex will be
+                             excluded from the report. Example: "2013-04-12 .*"
+                             You can use this option multiple times.
+    --exclude-appname name : exclude entries for the specified application
+                             name from report. Example: "pg_dump".

 Examples:

@@ -84,6 +109,14 @@
         perl pgbadger --prefix 'user=%u,db=%d,client=%h,appname=%a' \
                 /pglog/postgresql-2012-08-21*

+    Use my 8 CPUs to parse my 10GB file faster, really faster
+
+        perl pgbadger -j 8 /pglog/postgresql-9.1-main.log
+
+    Generate Tsung sessions XML file with select queries only:
+
+        perl pgbadger -S -o sessions.tsung --prefix '%t [%p]: [%l-1] user=%u,db=%d ' /pglog/postgresql-9.1.log
+
     Reporting errors every week by cron job:

         30 23 * * 1 /usr/bin/pgbadger -q -w /var/log/postgresql.log -o /var/reports/pg_errors.html
@@ -93,83 +126,163 @@
         0 4 * * 1 /usr/bin/pgbadger -q `find /var/log/ -mtime -7 -name "postgresql.log*"` \
            -o /var/reports/pg_errors-`date +%F`.html -l /var/reports/pgbadger_incremental_file.dat

-    This suppose that your log file and HTML report are also rotated every
+    This supposes that your log file and HTML report are also rotated every
     week.

+    Or better, use the auto-generated incremental reports:
+
+        0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 \
+           -O /var/www/pg_reports/
+
+    will generate a report per day and per week in the given output
+    directory.
+
+    If you have a pg_dump at 23:00 and 13:00 each day during half an hour,
+    you can use pgbadger as follows to exclude these periods from the report:
+
+        pgbadger --exclude-time "2013-09-.* (23|13):.*" postgresql.log
+
+    This will help to not have all COPY orders on top of the slowest queries.
+    You can also use --exclude-appname "pg_dump" to solve this problem in a
+    simpler way.
+
 DESCRIPTION
-    pgBadger is a PostgreSQL log analyzer built for speed with fully
+    pgBadger is a PostgreSQL log analyzer built for speed, with fully
     detailed reports from your PostgreSQL log file. It's a single and small
-    Perl script that aims to replace and outperform the old php script
-    pgFouine.
+    Perl script that outperforms any other PostgreSQL log analyzer.

-    By the way, we would like to thank Guillaume Smet for all the work he
-    has done on this really nice tool. We've been using it a long time, it
-    is a really great tool!
-
-    pgBadger is written in pure Perl language. It uses a javascript library
-    to draw graphs so that you don't need additional Perl modules or any
-    other package to install. Furthermore, this library gives us additional
-    features, such as zooming.
+    It is written in pure Perl language and uses a javascript library
+    (flotr2) to draw graphs so that you don't need to install any additional
+    Perl modules or other packages. Furthermore, this library gives us more
+    features such as zooming. pgBadger also uses the Bootstrap javascript
+    library and the FontAwesome webfont for better design. Everything is
+    embedded.

     pgBadger is able to autodetect your log file format (syslog, stderr or
-    csvlog). It is designed to parse huge log files, as well as gzip
+    csvlog). It is designed to parse huge log files as well as gzip
     compressed file. See a complete list of features below.

+    All charts are zoomable and can be saved as PNG images.
+
+    You can also limit pgBadger to only report errors or remove any part of
+    the report using command line options.
+
+    pgBadger supports any custom format set into log_line_prefix of your
+    postgresql.conf file provided that you use the %t, %p and %l patterns.
+
+    pgBadger allows parallel processing on a single log file and multiple
+    files through the use of the -j option and the number of CPUs as value.
+
+    If you want to save system performance you can also use log_duration
+    instead of log_min_duration_statement to have reports on duration and
+    number of queries only.
+
 FEATURE
     pgBadger reports everything about your SQL queries:

-        Overall statistics.
+        Overall statistics
+        The most frequent waiting queries.
+        Queries that waited the most.
+        Queries generating the most temporary files.
+        Queries generating the largest temporary files.
         The slowest queries.
         Queries that took up the most time.
         The most frequent queries.
         The most frequent errors.
+        Histogram of query times.
+
+    The following reports are also available with hourly charts divided by
+    periods of five minutes:
+
+        SQL queries statistics.
+        Temporary file statistics.
+        Checkpoints statistics.
+        Autovacuum and autoanalyze statistics.

-    The following reports are also available with hourly charts:
+    There are also some pie reports showing the distribution of:

-        Hourly queries statistics.
-        Hourly temporary file statistics.
-        Hourly checkpoints statistics.
         Locks statistics.
-        Queries by type (select/insert/update/delete).
+        Queries by type (select/insert/update/delete).
+        Distribution of query types per database/application
         Sessions per database/user/client.
         Connections per database/user/client.
+        Autovacuum and autoanalyze per table.

     All charts are zoomable and can be saved as PNG images. SQL queries
     reported are highlighted and beautified automatically.

+    You can also have incremental reports with one report per day and a
+    cumulative report per week.
+
 REQUIREMENT
-    PgBadger comes as a single Perl script- you do not need anything else
+    pgBadger comes as a single Perl script - you do not need anything other
     than a modern Perl distribution. Charts are rendered using a Javascript
     library so you don't need anything. Your browser will do all the work.

     If you planned to parse PostgreSQL CSV log files you might need some
     Perl Modules:

-        Text::CSV - to parse PostgreSQL CSV log files.
+        Text::CSV_XS - to parse PostgreSQL CSV log files.

     This module is optional, if you don't have PostgreSQL log in the CSV
     format you don't need to install it.

-    Under Windows OS you may not be able to use gzipped log files unless you
-    have a zcat like utility that could uncompress the log file and send
-    content to stdout.
-    If you have such an utility or in other OSes you want
-    to use other compression utility like bzip2 or Zip, use the --zcat
-    comand line option as follow:
+    Compressed log file format is autodetected from the file extension.
+    If pgBadger finds a gz extension it will use the zcat utility, with a
+    bz2 extension it will use bzcat and if the file extension is zip then
+    the unzip utility will be used.
+
+    If those utilities are not found in the PATH environment variable then
+    use the --zcat command line option to change this path. For example:
+
+        --zcat="/usr/local/bin/gunzip -c" or --zcat="/usr/local/bin/bzip2 -dc"
+        --zcat="C:\tools\unzip -p"
+
+    By default pgBadger will use the zcat, bzcat and unzip utilities
+    following the file extension. If you use the default autodetection of
+    the compressed format you can mix gz, bz2 or zip files. Specifying a
+    custom value to the --zcat option will remove this support for mixed
+    compressed formats.
+
+    Note that multiprocessing cannot be used with compressed files or CSV
+    files, nor under the Windows platform.
+
+INSTALLATION
+    Download the tarball from github and unpack the archive as follows:
+
+        tar xzf pgbadger-4.x.tar.gz
+        cd pgbadger-4.x/
+        perl Makefile.PL
+        make && sudo make install
+
+    This will copy the Perl script pgbadger to /usr/local/bin/pgbadger by
+    default and the man page into /usr/local/share/man/man1/pgbadger.1.
+    Those are the default installation directories for 'site' install.
-        --zcat="unzip -p" or --zcat="gunzip -c" or --zcat="bzip2 -dc"
+    If you want to install all under the /usr/ location, use
+    INSTALLDIRS='perl' as an argument of Makefile.PL. The script will be
+    installed into /usr/bin/pgbadger and the manpage into
+    /usr/share/man/man1/pgbadger.1.
+
+    For example, to install everything just like Debian does, proceed as
+    follows:
-    the last example can also be used like this: --zcat="bzcat"
+        perl Makefile.PL INSTALLDIRS=vendor
+
+    By default INSTALLDIRS is set to site.

 POSTGRESQL CONFIGURATION
-    You must enable some configuration directives in your postgresql.conf
-    before starting.
+    You must enable and set some configuration directives in your
+    postgresql.conf before starting.

     You must first enable SQL query logging to have something to parse:

         log_min_duration_statement = 0

-    Note that pgBadger is not compatible with statements logs provided by
-    log_statement and log_duration.
+    Here every statement will be logged, on a busy server you may want to
+    increase this value to only log queries with a longer duration.
+    Note that if you have log_statement set to 'all' nothing will be logged
+    through log_min_duration_statement. See the next chapter for more
+    information.

     With 'stderr' log format, log_line_prefix must be at least:
@@ -192,51 +305,170 @@
         log_line_prefix = 'db=%d,user=%u '

     You need to enable other parameters in postgresql.conf to get more
-    informations from your log files:
+    information from your log files:

         log_checkpoints = on
         log_connections = on
         log_disconnections = on
         log_lock_waits = on
         log_temp_files = 0
+        log_autovacuum_min_duration = 0

-    Do not enable log_statement and log_duration, their log format will not
-    be parsed by pgBadger.
+    Do not enable log_statement as its log format will not be parsed by
+    pgBadger.

-    Of course your log messages should be in english without locale support:
+    Of course your log messages should be in English without locale support:

         lc_messages='C'

-    but this is not only recommanded by pgbadger.
+    but this is not only recommended by pgBadger.
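For reference, the directives recommended in the POSTGRESQL CONFIGURATION section above can be collected into a single illustrative postgresql.conf fragment. This is only a sketch: the log_line_prefix value shown is one of the example prefixes used elsewhere in this README, and log_min_duration_statement = 0 logs every statement, so you may want to raise it on a busy server:

    log_min_duration_statement = 0
    log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
    log_checkpoints = on
    log_connections = on
    log_disconnections = on
    log_lock_waits = on
    log_temp_files = 0
    log_autovacuum_min_duration = 0
    lc_messages = 'C'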
-INSTALLATION
-    Download the tarball from github and unpack the archive as follow:
-
-        tar xzf pgbadger-1.x.tar.gz
-        cd pgbadger-1.x/
-        perl Makefile.PL
-        make && sudo make install
+log_min_duration_statement, log_duration and log_statement
+    If you want full statistics reports you must set
+    log_min_duration_statement to 0 or more milliseconds.
+
+    If you just want to report duration and number of queries and don't want
+    all details about queries, set log_min_duration_statement to -1 to
+    disable it and enable log_duration in your postgresql.conf file. If you
+    want to add the most common request report you can either choose to set
+    log_min_duration_statement to a higher value or choose to enable
+    log_statement.
+
+    Enabling log_min_duration_statement will add reports about slowest
+    queries and queries that took up the most time. Take care that if you
+    have log_statement set to 'all' nothing will be logged with
+    log_line_prefix.
+
+PARALLEL PROCESSING
+    To enable parallel processing you just have to use the -j N option where
+    N is the number of cores you want to use.
+
+    pgbadger will then proceed as follows:
+
+        for each log file
+            chunk size = int(file size / N)
+            look at start/end offsets of these chunks
+            fork N processes and seek to the start offset of each chunk
+            each process will terminate when the parser reaches the end
+            offset of its chunk
+            each process writes stats into a binary temporary file
+            wait for all children to terminate
+        All binary temporary files generated will then be read and loaded
+        into memory to build the html output.
+
+    With that method, at the start/end of chunks pgbadger may truncate or
+    omit a maximum of N queries per log file, which is an insignificant gap
+    if you have millions of queries in your log file. The chance that the
+    query you were looking for is lost is near 0, this is why I think this
+    gap is acceptable. Most of the time the query is counted twice but
+    truncated.
+
+    When you have lots of small log files and lots of CPUs it is speedier to
+    dedicate one core to one log file at a time. To enable this behavior you
+    have to use option -J N instead. With 200 log files of 10MB each the use
+    of the -J option starts being really interesting with 8 cores. Using
+    this method you will be sure not to lose any queries in the reports.
+
+    Here is a benchmark done on a server with 8 CPUs and a single file of
+    9.5GB.
+
+        Option  |  1 CPU  | 2 CPU | 4 CPU | 8 CPU
+        --------+---------+-------+-------+------
+        -j      | 1h41m18 | 50m25 | 25m39 | 15m58
+        -J      | 1h41m18 | 54m28 | 41m16 | 34m45
+
+    With 200 log files of 10MB each and a total of 2GB the results are
+    slightly different:
+
+        Option  | 1 CPU | 2 CPU | 4 CPU | 8 CPU
+        --------+-------+-------+-------+------
+        -j      | 20m15 |  9m56 |  5m20 |  4m20
+        -J      | 20m15 |  9m49 |  5m00 |  2m40
+
+    So it is recommended to use -j unless you have hundreds of small log
+    files and can use at least 8 CPUs.
+
+    IMPORTANT: when you are using parallel parsing pgbadger will generate a
+    lot of temporary files in the /tmp directory and will remove them at the
+    end, so do not remove those files while pgbadger is running. They are
+    all named with the following template tmp_pgbadgerXXXX.bin so they can
+    be easily identified.
+
+INCREMENTAL REPORTS
+    pgBadger includes an automatic incremental report mode using option -I
+    or --incremental. When running in this mode, pgBadger will generate one
+    report per day and a cumulative report per week.
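The chunking described in the PARALLEL PROCESSING section above can be sketched in a few lines of Perl. This is only an illustrative sketch under assumed names (compute_chunks and its arguments are invented here); the actual implementation is the split_logfile function added earlier in this diff:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Split a log file into N byte ranges, one per worker process, as
    # described in the PARALLEL PROCESSING section. Names are illustrative.
    sub compute_chunks
    {
        my ($file, $nprocs) = @_;

        my $totalsize = (stat($file))[7] || 0;
        return (0, $totalsize) if ($nprocs <= 1);

        my @offsets = (0);
        for my $i (1 .. $nprocs - 1) {
            push(@offsets, int(($totalsize / $nprocs) * $i));
        }
        push(@offsets, $totalsize);

        return @offsets;
    }

    # Each worker would seek() to its start offset, parse until the next
    # offset, and dump its statistics into a temporary binary file.
    my @chunks = compute_chunks('/var/log/postgresql/postgresql.log', 8);
    print join(' ', @chunks), "\n";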
+    Output is first done in binary format into the mandatory output
+    directory (see option -O or --outdir), then in HTML format for daily
+    and weekly reports with a main index file.
+
+    The main index file will show a dropdown menu per week with a link to
+    the week report and links to the daily reports of that week.
+
+    For example, if you run pgBadger as follows based on a daily rotated
+    file:
+
+        0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 \
+           -O /var/www/pg_reports/
+
+    you will have all daily and weekly reports for the full running period.
+
+    In this mode pgBadger will create an automatic incremental file in the
+    output directory, so you don't have to use the -l option unless you want
+    to change the path of that file. This means that you can run pgBadger in
+    this mode each day on a log file rotated each week, and it will not
+    count the log entries twice.
+
+BINARY FORMAT
+    Using the binary format it is possible to create custom incremental and
+    cumulative reports. For example, if you want to refresh a pgbadger
+    report each hour from a daily PostgreSQL log file, you can proceed by
+    running each hour the following commands:
+
+        pgbadger --last-parsed .pgbadger_last_state_file -o sunday/hourX.bin /var/log/pgsql/postgresql-Sun.log
-
-    This will copy the Perl script pgbadger in /usr/local/bin/pgbadger
-    directory by default and the man page into
-    /usr/local/share/man/man1/pgbadger.1. Those are the default installation
-    directory for 'site' install.
+    to generate the incremental data files in binary format. And to generate
+    the fresh HTML report from that binary file:
-
-    If you want to install all under /usr/ location, use INSTALLDIRS='perl'
-    as argument of Makefile.PL. The script will be installed into
-    /usr/bin/pgbadger and the manpage into /usr/share/man/man1/pgbadger.1.
+        pgbadger sunday/*.bin
-
-    For example, to install everything just like Debian does, proceed as
+    Or another example: if you have one log file per hour and you want a
+    report to be rebuilt each time the log file is switched. Proceed as
     follow:
-
-        perl Makefile.PL INSTALLDIRS=vendor
+        pgbadger -o day1/hour01.bin /var/log/pgsql/pglog/postgresql-2012-03-23_10.log
+        pgbadger -o day1/hour02.bin /var/log/pgsql/pglog/postgresql-2012-03-23_11.log
+        pgbadger -o day1/hour03.bin /var/log/pgsql/pglog/postgresql-2012-03-23_12.log
+        ...
-
-    By default INSTALLDIRS is set to site.
+    When you want to refresh the HTML report, for example each time after a
+    new binary file is generated, just do the following:
+
+        pgbadger -o day1_report.html day1/*.bin
+
+    Adjust the commands according to your needs.

 AUTHORS
-    pgBadger is an original work from Gilles Darold. It is maintained by the
-    good folks at Dalibo and every one who wants to contribute.
+    pgBadger is an original work from Gilles Darold.
+
+    The pgBadger logo is an original creation of Damien Clochard.
+
+    The pgBadger v4.x design comes from the "Art is code" company.
+
+    This web site is a work of Gilles Darold.
+
+    pgBadger is maintained by Gilles Darold, the good folks at Dalibo, and
+    everyone who wants to contribute.
+
+    Many people have contributed to pgBadger, they are all quoted in the
+    Changelog file.

 LICENSE
     pgBadger is free software distributed under the PostgreSQL Licence.
+
+    Copyright (c) 2012-2014, Dalibo
+
+    A modified version of the SQL::Beautify Perl Module is embedded in
+    pgBadger with copyright (C) 2009 by Jonas Kramer and is published under
+    the terms of the Artistic License 2.0.
+
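To tie together the BINARY FORMAT workflow described above, here is a minimal Perl wrapper that could be run from cron. Only the pgbadger options shown in this README (-o, --last-parsed) are used; the directory, log path, state file name and hourly naming scheme are taken from the examples above and are otherwise hypothetical:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Parse the current hour of a daily log into a .bin data file, then
    # rebuild the cumulative HTML report from every .bin collected so far.
    my $day_dir = '/var/reports/sunday';
    my $log     = '/var/log/pgsql/postgresql-Sun.log';
    my $hour    = sprintf("hour%02d", (localtime)[2]);

    mkdir $day_dir unless -d $day_dir;

    # Incremental parse: --last-parsed avoids re-reading entries already seen
    system('pgbadger', '--last-parsed', "$day_dir/.pgbadger_last_state_file",
           '-o', "$day_dir/$hour.bin", $log) == 0
        or die "pgbadger parse failed\n";

    # Rebuild the HTML report from all binary files of the day
    system('pgbadger', '-o', "$day_dir/report.html", glob("$day_dir/*.bin")) == 0
        or die "pgbadger report build failed\n";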