Environment
MySQL 5.7
Loki 3.5.3
Promtail 3.5.3
Search Approaches
MySQL 5.7 cannot emit its logs as JSON, and the raw slow-log format is awkward to search: without preprocessing, every query falls back to regex matching, which is both tedious and expensive.
Two approaches are shown below:
- Extract a few chosen fields as labels and search by label; this keeps the original log format
- Extract every field and re-emit the log as JSON
Approach 1: Extract Selected Fields as Labels (Keep the Original Format)
Note: do not create too many labels. Loki caps a stream at 15 label names by default, and 3-5 is a sensible target in practice. Also avoid high-cardinality labels (IP addresses, user IDs, timestamps), which cause performance problems.
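For reference, the cap is enforced on the Loki side and is configurable; a minimal sketch of the corresponding setting in Loki's limits_config (the field name and its default of 15 are per Loki's documentation):

limits_config:
  max_label_names_per_series: 15   # Loki default; streams carrying more label names are rejected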
Regex Testing
Below is a Percona MySQL slow-log entry; the exact format differs across MySQL versions and builds.
# Time: 2025-07-30T15:11:02.141824+08:00
# User@Host: sunday_mall[sunday_mall] @ [192.168.1.42] Id: 338614801
# Schema: sunday_mall Last_errno: 0 Killed: 0
# Query_time: 14.194599 Lock_time: 0.000029 Rows_sent: 1 Rows_examined: 5868820 Rows_affected: 0
# Bytes_sent: 264 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
# InnoDB_trx_id: 0
# QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: No Filesort_on_disk: No Merge_passes: 0
# InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 8191
# Log_slow_rate_type: query Log_slow_rate_limit: 100
SET timestamp=1753858862;
SELECT ope.order_id,max(ope.id) as max_id,ord.platform_shop_id,count(*) as count from mall_order_operate as ope left join mall_order as ord on ope.order_id = ord.id where ope.user_id != '-1' and ope.deleted_at is null and ope.created_at <= '2025-07-30 15:00:34' and ope.created_at > '2025-07-30 14:54:34' and ope.end_status = 4 and ope.start_status = 2 and ord.platform_shop_id = 27 and ope.id > if(ord.platform_shop_id = 27, 10404807, 0) order by ope.id desc;
Regular expressions can be tested online at https://regex101.com/
The regex below extracts the time, schema, and query_time fields:
'#\s*Time:\s*(?P<time>\S+)\n.*\n#\sSchema:\s(?P<schema>\S+).*\n#\sQuery_time:\s(?P<query_time>\d+)[\s\S]+'
Here query_time is captured with \d+, which keeps only the integer part and drops the fractional seconds, avoiding high label cardinality in Loki.
Quick regex reference
- * : quantifier meaning zero or more occurrences
- + : quantifier meaning one or more occurrences
- .* : zero or more of any character except newline
- \s* : zero or more whitespace characters (spaces, tabs, newlines, etc.)
- \S+ : one or more non-whitespace characters
- (?P<schema>\S+) : named capture group schema; the text matched by \S+ can later be referenced by the name schema
- [\s\S]+ : one or more of any character at all, whitespace or not; since it also matches newlines, it is used here to consume the remaining multi-line content
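As a sanity check, here is a minimal Go sketch that runs the expression through Go's regexp package (RE2, the same engine Promtail's regex stage uses) against a trimmed copy of the sample entry; it should print time=2025-07-30T15:11:02.141824+08:00, schema=sunday_mall, query_time=14.

package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Same expression as above; the named groups become Promtail's extracted fields.
	re := regexp.MustCompile(`#\s*Time:\s*(?P<time>\S+)\n.*\n#\sSchema:\s(?P<schema>\S+).*\n#\sQuery_time:\s(?P<query_time>\d+)[\s\S]+`)
	// Trimmed sample entry: only the lines the regex actually needs.
	entry := "# Time: 2025-07-30T15:11:02.141824+08:00\n" +
		"# User@Host: sunday_mall[sunday_mall] @ [192.168.1.42] Id: 338614801\n" +
		"# Schema: sunday_mall Last_errno: 0 Killed: 0\n" +
		"# Query_time: 14.194599 Lock_time: 0.000029 Rows_sent: 1\n"
	m := re.FindStringSubmatch(entry)
	for i, name := range re.SubexpNames() {
		if m != nil && name != "" {
			fmt.Printf("%s = %s\n", name, m[i])
		}
	}
}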
Configuration File
vim /etc/promtail/config.yml
limits_config:
  readline_rate: 2048    # default 10000
  readline_burst: 4096   # default 10000
scrape_configs:
  - job_name: mysql_slow
    static_configs:
      - labels:
          instance: db91
          #project_name: sunday_mall
          job: mysql_slow_log
          __path__: /data/logs/mysql/slow_*.log
    pipeline_stages:
      - multiline:
          firstline: '^#\sTime:.'   # regex marking the first line of each entry; following lines are merged into it
          max_lines: 300
          max_wait_time: 3s
      - match:
          selector: '{job="mysql_slow_log"}'
          stages:
            # extract the time, schema, and query_time fields
            - regex:
                expression: '#\s*Time:\s*(?P<time>\S+)\n.*\n#\sSchema:\s(?P<schema>\S+).*\n#\sQuery_time:\s(?P<query_time>\d+)[\s\S]+'
            - labels:
                schema:
                query_time:
            - timestamp:   # use the parsed time field as the entry timestamp instead of the time Promtail scraped the line
                source: time
                format: RFC3339
Check the Syntax and Test in the Foreground
promtail -config.file=/etc/promtail/config.yml -check-syntax
promtail -config.file=/etc/promtail/config.yml
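If Promtail normally runs as a service, restart it once the syntax check passes (this assumes a systemd unit named promtail; adjust to your environment):

systemctl restart promtail
journalctl -u promtail -f   # watch for pipeline errors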
Label Search and Regex Search
The Fields panel now shows two additional labels, schema and query_time.
Label queries
{job="mysql_slow_log", schema="sunday_mall"} | query_time > 13
{job="mysql_slow_log" } | schema="sunday_mall" | query_time > 13
Regex query (without using the labels); note that \ must be escaped as \\:
{job="mysql_slow_log"} | regexp "Query_time: (?P<query_time>\\S+) Schema: (?P<schema>\\w+)" | query_time >= 10 | schema = "sunday_mall"
Approach 2: Extract All Fields and Re-emit as JSON
Regex Testing
Connections over an IP address and over localhost produce different User@Host lines:
# User@Host: sunday_mall[sunday_mall] @ [192.168.1.42] Id: 338614801
or
# User@Host: backup[backup] @ localhost [] Id: 381613365
The regex below matches only the address inside the square brackets; when it is empty (the localhost case), Promtail fills in localhost as a default via the template stage shown later.
#\s+User@Host:\s+\w+\[(?P<user>\S+)\]\s+@\s+[\w\s]*\[(?P<host>\S*)\]\s+Id:\s+(?P<connection_id>\d+)\n
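A quick Go check of this fragment against both variants (the trailing \n is dropped here because the test strings are single lines); host should come out empty for the localhost line:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	re := regexp.MustCompile(`#\s+User@Host:\s+\w+\[(?P<user>\S+)\]\s+@\s+[\w\s]*\[(?P<host>\S*)\]\s+Id:\s+(?P<connection_id>\d+)`)
	lines := []string{
		"# User@Host: sunday_mall[sunday_mall] @ [192.168.1.42] Id: 338614801",
		"# User@Host: backup[backup] @ localhost [] Id: 381613365",
	}
	for _, l := range lines {
		m := re.FindStringSubmatch(l)
		// Indexes 1..3 follow the group order: user, host, connection_id.
		fmt.Printf("user=%q host=%q connection_id=%q\n", m[1], m[2], m[3])
	}
}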
The full expression can be tested online at https://regex101.com/
^#\s+Time:\s+(?P<time>\S+)\n#\s+User@Host:\s+\w+\[(?P<user>\S+)\]\s+@\s+[\w\s]*\[(?P<host>\S*)\]\s+Id:\s+(?P<connection_id>\d+)\n#\s+Schema:\s(?P<schema>\S+)\s+Last_errno:\s+(?P<last_errno>\d+)\s+Killed:\s+(?P<killed>\d+)\n#\s+Query_time:\s+(?P<query_time>\S+)\s+Lock_time:\s+(?P<lock_time>\S+)\s+Rows_sent:\s+(?P<rows_sent>\d+)\s+Rows_examined:\s+(?P<rows_examined>\d+)\s+Rows_affected:\s+(?P<rows_affected>\d+)\n#\s+Bytes_sent:\s+(?P<bytes_sent>\d+)\s+Tmp_tables:\s+(?P<tmp_tables>\d+)\s+Tmp_disk_tables:\s+(?P<tmp_disk_tables>\d+)\s+Tmp_table_sizes:\s(?P<tmp_table_sizes>\d+)\n#\s+InnoDB_trx_id:\s(?P<innodb_trx>\d+)\n#\s+QC_Hit:\s+(?P<qc_hit>\S+)\s+Full_scan:\s(?P<full_scan>\S+)\s+Full_join:\s+(?P<full_join>\S+)\s+Tmp_table:\s(?P<tmp_table>\S+)\s+Tmp_table_on_disk:\s+(?P<tmp_table_on_disk>\S+)\n#\s+Filesort:\s+(?P<filesort>\S+)\s+Filesort_on_disk:\s+(?P<filesort_on_disk>\S+)\s+Merge_passes:\s+(?P<merge_passes>\d+)\n#\s+InnoDB_IO_r_ops:\s+(?P<innodb_io_r_ops>\d+)\s+InnoDB_IO_r_bytes:\s+(?P<innodb_io_r_bytes>\S+)\s+InnoDB_IO_r_wait:\s+(?P<innodb_io_r_wait>\S+)\n#\s+InnoDB_rec_lock_wait:\s+(?P<innodb_rec_lock_wait>\S+)\s+InnoDB_queue_wait:\s+(?P<innodb_queue_wait>\S+)\n#\s+InnoDB_pages_distinct:\s+(?P<innodb_pages_distinct>\S+)\n#\s+Log_slow_rate_type:\s+(?P<log_slow_rate_type>\S+)\s+Log_slow_rate_limit:\s+(?P<log_slow_rate_limit>\d+)\nSET\s+timestamp=(?P<timestamp>\d+);\n(?P<query>[\s\S]*)
Configuration File
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://192.168.1.54:3100/loki/api/v1/push

limits_config:
  readline_rate: 10000    # max lines per second per file
  readline_burst: 10000

scrape_configs:
  - job_name: mysql_slow
    static_configs:
      - labels:
          instance: db91
          job: mysql_slow_log
          __path__: /data/logs/mysql/slow_*.log
    pipeline_stages:
      # multiline: detect the first line of each entry
      - multiline:
          firstline: '^#\sTime:.'
          max_lines: 1000
          max_wait_time: 3s
      - match:
          selector: '{job="mysql_slow_log"}'
          stages:
            - regex:
                #expression: '#\s*Time:\s*(?P<time>\S+)\n.*\n#\sSchema:\s(?P<schema>\S+).*\n#\sQuery_time:\s(?P<query_time>\d+)[\s\S]+'
                expression: '^#\s+Time:\s+(?P<time>\S+)\n#\s+User@Host:\s+\w+\[(?P<user>\S+)\]\s+@\s+[\w\s]*\[(?P<host>\S*)\]\s+Id:\s+(?P<connection_id>\d+)\n#\s+Schema:\s(?P<schema>\S+)\s+Last_errno:\s+(?P<last_errno>\d+)\s+Killed:\s+(?P<killed>\d+)\n#\s+Query_time:\s+(?P<query_time>\S+)\s+Lock_time:\s+(?P<lock_time>\S+)\s+Rows_sent:\s+(?P<rows_sent>\d+)\s+Rows_examined:\s+(?P<rows_examined>\d+)\s+Rows_affected:\s+(?P<rows_affected>\d+)\n#\s+Bytes_sent:\s+(?P<bytes_sent>\d+)\s+Tmp_tables:\s+(?P<tmp_tables>\d+)\s+Tmp_disk_tables:\s+(?P<tmp_disk_tables>\d+)\s+Tmp_table_sizes:\s(?P<tmp_table_sizes>\d+)\n#\s+InnoDB_trx_id:\s(?P<innodb_trx>\d+)\n#\s+QC_Hit:\s+(?P<qc_hit>\S+)\s+Full_scan:\s(?P<full_scan>\S+)\s+Full_join:\s+(?P<full_join>\S+)\s+Tmp_table:\s(?P<tmp_table>\S+)\s+Tmp_table_on_disk:\s+(?P<tmp_table_on_disk>\S+)\n#\s+Filesort:\s+(?P<filesort>\S+)\s+Filesort_on_disk:\s+(?P<filesort_on_disk>\S+)\s+Merge_passes:\s+(?P<merge_passes>\d+)\n#\s+InnoDB_IO_r_ops:\s+(?P<innodb_io_r_ops>\d+)\s+InnoDB_IO_r_bytes:\s+(?P<innodb_io_r_bytes>\S+)\s+InnoDB_IO_r_wait:\s+(?P<innodb_io_r_wait>\S+)\n#\s+InnoDB_rec_lock_wait:\s+(?P<innodb_rec_lock_wait>\S+)\s+InnoDB_queue_wait:\s+(?P<innodb_queue_wait>\S+)\n#\s+InnoDB_pages_distinct:\s+(?P<innodb_pages_distinct>\S+)\n#\s+Log_slow_rate_type:\s+(?P<log_slow_rate_type>\S+)\s+Log_slow_rate_limit:\s+(?P<log_slow_rate_limit>\d+)\nSET\s+timestamp=(?P<timestamp>\d+);\n(?P<query>[\s\S]*)'
            # truncate query_time to its integer part
            - regex:
                expression: '(?P<query_time_int>\d+)(\.\d+)?'
                source: query_time
            - labels:
                schema:
                query_time: query_time_int
            - timestamp:
                source: time
                #format: RFC3339
                format: "2006-01-02T15:04:05Z07:00"
                location: "Asia/Shanghai"
            - template:
                source: json_log_message
                template: |
                  {
                    "time": "{{ .time }}",
                    "user": "{{ .user }}",
                    "host": "{{ if .host }}{{ .host }}{{ else }}localhost{{ end }}",
                    "connection_id": {{ .connection_id }},
                    "schema": "{{ .schema }}",
                    "last_errno": {{ .last_errno }},
                    "killed": {{ .killed }},
                    "query_time": "{{ .query_time }}",
                    "lock_time": "{{ .lock_time }}",
                    "rows_sent": {{ .rows_sent }},
                    "rows_examined": {{ .rows_examined }},
                    "rows_affected": {{ .rows_affected }},
                    "bytes_sent": {{ .bytes_sent }},
                    "tmp_tables": {{ .tmp_tables }},
                    "tmp_disk_tables": {{ .tmp_disk_tables }},
                    "tmp_table_sizes": {{ .tmp_table_sizes }},
                    "innodb_trx": "{{ .innodb_trx }}",
                    "qc_hit": "{{ .qc_hit }}",
                    "full_scan": "{{ .full_scan }}",
                    "full_join": "{{ .full_join }}",
                    "tmp_table": "{{ .tmp_table }}",
                    "tmp_table_on_disk": "{{ .tmp_table_on_disk }}",
                    "filesort": "{{ .filesort }}",
                    "filesort_on_disk": "{{ .filesort_on_disk }}",
                    "merge_passes": {{ .merge_passes }},
                    "innodb_io_r_ops": {{ .innodb_io_r_ops }},
                    "innodb_io_r_bytes": "{{ .innodb_io_r_bytes }}",
                    "innodb_io_r_wait": "{{ .innodb_io_r_wait }}",
                    "innodb_rec_lock_wait": "{{ .innodb_rec_lock_wait }}",
                    "innodb_queue_wait": "{{ .innodb_queue_wait }}",
                    "innodb_pages_distinct": "{{ .innodb_pages_distinct }}",
                    "log_slow_rate_type": "{{ .log_slow_rate_type }}",
                    "log_slow_rate_limit": {{ .log_slow_rate_limit }},
                    "timestamp": {{ if .timestamp }}"{{ .timestamp }}"{{ else }}null{{ end }},
                    "query": "{{ .query }}"
                  }
            - output:
                source: json_log_message
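Run against the sample entry from the regex-testing section, the re-emitted line looks roughly like this (flattened to one line and truncated with ... for readability; the actual output keeps the template's multi-line layout):

{"time": "2025-07-30T15:11:02.141824+08:00", "user": "sunday_mall", "host": "192.168.1.42", "connection_id": 338614801, "schema": "sunday_mall", "last_errno": 0, "killed": 0, "query_time": "14.194599", "lock_time": "0.000029", "rows_sent": 1, "rows_examined": 5868820, "rows_affected": 0, "bytes_sent": 264, ..., "timestamp": "1753858862", "query": "SELECT ope.order_id,max(ope.id) as max_id,..."}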
JSON Search
{job="mysql_slow_log"} |= `` | json | schema = "sunday_mall" | query_time >= 10