Most of us have had a server room lose power without warning. When Zabbix comes back up, the log is full of PGRES_FATAL_ERROR messages. How do you fix this?
 [select clock,ns,value from history_uint where itemid=36570 and clock<=1662337221 and clock>1661732421 order by clock desc limit 2]
134751:20220906:092021.356 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR:  could not read block 619 in file "base/17376/55998": read only 0 of 32768 bytes
 [select clock,ns,value from history_uint where itemid=36570 and clock<=1662337221 and clock>1661732421 order by clock desc limit 2]
134751:20220906:092021.359 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR:  could not read block 873 in file "base/17376/55991": read only 0 of 32768 bytes
 [select clock,ns,value from history_uint where itemid=36567 and clock<=1662337221 and clock>1661732421 order by clock desc limit 2]
134751:20220906:092021.361 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR:  could not read block 873 in file "base/17376/55991": read only 0 of 32768 bytes
[Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: could not read block 874
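The cause turned out to be the sudden power loss, which damaged some of the indexes on tables in the PostgreSQL database, so they need to be repaired. However, because the Zabbix data uses the TimescaleDB time-series extension with hypertable partitioning, the repair cannot be pointed at a single table.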
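Even if the repair cannot be limited to one table, it can help to know which relation a damaged file belongs to. The sketch below pulls the filenode out of a log line shaped like the ones above and builds a lookup query with PostgreSQL's `pg_filenode_relation` function. The helper names `filenode_from_log` and `filenode_query` are made up for this illustration, and the database name `zabbix` and the use of the default tablespace (the `0` argument) are assumptions.

```shell
#!/bin/bash
# Pull the filenode (the last path component) out of a PostgreSQL
# "could not read block ... in file base/<db_oid>/<filenode>" message.
# Hypothetical helper, not part of Zabbix or PostgreSQL.
filenode_from_log() {
  sed -n 's|.*base/[0-9][0-9]*/\([0-9][0-9]*\).*|\1|p'
}

# Build the lookup query. pg_filenode_relation(0, n) resolves filenode n
# in the database's default tablespace to a relation name (or NULL).
filenode_query() {
  printf 'SELECT pg_filenode_relation(0, %s);\n' "$1"
}

# Example with a log line shaped like the ones above.
line='ERROR: could not read block 619 in file "base/17376/55998": read only 0 of 32768 bytes'
node=$(printf '%s\n' "$line" | filenode_from_log)
filenode_query "$node"

# To run it against the database (the name "zabbix" is an assumption):
#   filenode_query "$node" | psql zabbix
```

With TimescaleDB the result may well be an internal chunk or one of its indexes rather than `history_uint` itself, which is consistent with the observation that a single named table cannot simply be targeted.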
After searching online, I found a similar error and a way to repair it on the following site:
http://lxadm.com/Repairing_broken_PostgreSQL_databases_/_tables
If your server happened to crash and the PostgreSQL database is corrupted, but it didn't contain especially precious information, you may try the following fix.
The typical symptoms of a corrupted PostgreSQL database look like this:
2013-03-05 11:29:50 GMT ERROR:  invalid page header in block 608102 of relation base/16385/16615
2013-03-05 11:29:50 GMT STATEMENT:  COPY public.history (itemid, clock, value) TO stdout;
2013-03-05 11:29:50 GMT LOG:  could not send data to client: Broken pipe
Or
Query failed: [0] PGRES_FATAL_ERROR:ERROR:  right sibling's left-link doesn't match:
block 149266 links to 70823 instead of expected 71357 in index "history_uint_1"
The actual fix is quite easy: set "zero_damaged_pages = on", then run VACUUM and REINDEX to rebuild the indexes. With zero_damaged_pages on, PostgreSQL zeroes out any page it finds damaged, so the rows on that page are lost, which is why this approach only suits data you can afford to lose.
DATABASE=yourdatabase
TABLES=$(echo \\d | psql "$DATABASE" | grep "^ public" | awk '{print $3}')
for TABLE in $TABLES; do
  echo "$TABLE"
  echo "SET zero_damaged_pages = on; VACUUM FULL $TABLE; REINDEX TABLE $TABLE;" | psql "$DATABASE"
done
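On the Zabbix server or on the PostgreSQL database server, create a shell script, paste the content above into it, and change the database name to the one you are using.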
vim pg_repair_index.sh
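#!/bin/bash
DATABASE=zabbix   # name of the database reporting the errors
# List every relation in the public schema from psql's \d output
TABLES=$(echo \\d | psql "$DATABASE" | grep "^ public" | awk '{print $3}')
for TABLE in $TABLES; do
  echo "$TABLE"
  # Zero out damaged pages, rewrite the table, then rebuild its indexes
  echo "SET zero_damaged_pages = on; VACUUM FULL $TABLE; REINDEX TABLE $TABLE;" | psql "$DATABASE"
done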
Make the script executable:
chmod +x pg_repair_index.sh
Run the script with ./pg_repair_index.sh; it automatically rebuilds the indexes of every table.
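Once the script has finished, one way to check the result is to see whether new PGRES_FATAL_ERROR entries are still being written to the Zabbix server log. The following is a minimal sketch: `count_pg_errors` is a hypothetical helper, and the default log path is an assumption that should be replaced with the `LogFile` value from your own zabbix_server.conf.

```shell
#!/bin/bash
# Count PGRES_FATAL_ERROR entries in a Zabbix server log.
# The default path is an assumption; pass your own LogFile path as $1.
count_pg_errors() {
  local logfile="${1:-/var/log/zabbix/zabbix_server.log}"
  # grep -c prints 0 and returns non-zero when nothing matches,
  # so "|| true" keeps the function from reporting a failure.
  grep -c 'PGRES_FATAL_ERROR' "$logfile" || true
}

# Usage: note the count, wait a few minutes, and compare.
#   count_pg_errors /var/log/zabbix/zabbix_server.log
```

The count includes entries written before the repair, so what matters is that the number stops growing afterwards.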