62 Commits

Author SHA1 Message Date
luckky
2c3b8e9fa3 make debug msg clear 2024-11-05 19:22:52 +08:00
znzjugod
5bd58eff9d update nvme config 2024-11-05 14:45:04 +08:00
gaoruoshu
f7a5a43e12 change avg_block_io config 2024-11-05 10:37:32 +08:00
luckky
abafccd640 fix write file return code bug 2024-11-04 20:21:53 +08:00
luckky
0eb406ab73 fix uint8 bug and change isolation default value 2024-11-01 15:24:56 +08:00
jinsaihang
d84ff78577 excessive CPU usage
Signed-off-by: jinsaihang <jinsaihang@h-partners.com>
2024-11-01 11:01:46 +08:00
znzjugod
78e793b944 modify logrotate rule 2024-10-31 16:38:16 +08:00
jinsaihang
bcfb68bfe7 get_alarm -d abnormal display
Signed-off-by: jinsaihang <jinsaihang@h-partners.com>
2024-10-31 15:05:04 +08:00
luckky
28cc04e447 fix hbm online repair logic 2024-10-30 10:47:50 +08:00
luckky
6a84f3c770 merge master
Signed-off-by: luckky <guodashun1@huawei.com>
2024-10-28 10:01:26 +08:00
jinsaihang
2ef4d65dfb fix newline break error
Signed-off-by: jinsaihang <jinsaihang@h-partners.com>
2024-10-26 16:00:57 +08:00
luckky
4852ce4790 add hbm online repair 2024-10-26 14:20:33 +08:00
znzjugod
44bec2ee03 remove extra dependency 2024-10-26 11:20:52 +08:00
jinsaihang
134c4d9d18 fix get_alarm error
Signed-off-by: jinsaihang <jinsaihang@h-partners.com>
2024-10-23 10:57:16 +08:00
贺有志
6885e05395
ai_block_io support iodump
Signed-off-by: 贺有志 <1037617413@qq.com>
2024-10-22 11:33:56 +00:00
贺有志
4a84f5bae0
fix frequency param check bug
Signed-off-by: 贺有志 <1037617413@qq.com>
2024-10-22 01:30:36 +00:00
zhuofeng
b8770a6f94 update collect plugin period max 2024-10-21 19:25:51 +08:00
PshySimon
37e4e51192 fix xalarm non-uniform logging format 2024-10-21 17:39:42 +08:00
贺有志
a7571573cc ai_block_io lack section exit 2024-10-21 14:35:10 +08:00
贺有志
ca856c005a
enrich alert info about kernel stack
Signed-off-by: 贺有志 <1037617413@qq.com>
2024-10-16 12:20:59 +00:00
jinsaihang
039410b619 optimize log printing
Signed-off-by: jinsaihang <jinsaihang@h-partners.com>
2024-10-16 17:11:49 +08:00
zhuofeng
0337360f50 listen thread of collect module exits occasionally 2024-10-16 14:11:07 +08:00
贺有志
b3ac391178 fix ai_block_io root cause bug
Signed-off-by: 贺有志 <1037617413@qq.com>
2024-10-16 12:42:01 +08:00
gaoruoshu
7e035b92d0 refactor config.py and bugfix incorrect slow io report
get_io_data failed won't stop avg_block_io and del disk not support

Signed-off-by: gaoruoshu <gaoruoshu@huawei.com>
2024-10-15 21:40:07 +08:00
贺有志
5f2e3dd4e4
ai_block_io fix some bugs
Signed-off-by: 贺有志 <1037617413@qq.com>
2024-10-14 15:21:13 +00:00
zhuofeng
c1da6e295d add pysentry_collect package and update collect log
modify abnormal stack when the disk field is not configured
2024-10-14 09:16:20 +08:00
贺有志
36a07c1468
add root cause analysis
Signed-off-by: 贺有志 <1037617413@qq.com>
2024-10-12 14:02:18 +00:00
zhuofeng
010a9386a8 fix io_dump for collect module 2024-10-12 14:39:46 +08:00
贺有志
43f4e9ae10
ai_block_io support stage and iotype
Signed-off-by: 贺有志 <1037617413@qq.com>
2024-10-11 13:54:50 +00:00
PhsySimon
e694d3a9d2 fix xalarm upgrade not return val, not refuse to send msg when length exceeds 8192, cleanup invalid socket periodically 2024-10-11 20:18:10 +08:00
jinsaihang
2532c6971d add parameters validation
Signed-off-by: jinsaihang <jinsaihang@h-partners.com>
2024-10-11 17:57:34 +08:00
gaoruoshu
73c9e3809b avg_block_io adapt different type of disk, use different config 2024-10-11 11:52:05 +08:00
zhuofeng
26ee44cd37 add get_disk_type and fix some bugs
add log for improving maintainability
2024-10-11 09:15:36 +08:00
贺有志
5eb6aaf745 ai_block_io adapt alarm module.patch
Signed-off-by: 贺有志 <1037617413@qq.com>

ai_block_io adapt alarm module

Signed-off-by: 贺有志 <1037617413@qq.com>
2024-10-10 21:19:16 +08:00
PhsySimon
312ba1d6c5 xalarm add alarm msg length to 8192 2024-10-10 17:34:30 +08:00
jinsaihang
5db6829da9 add dependency for sysSentry and avg_block_io
Signed-off-by: jinsaihang <jinsaihang@h-partners.com>
2024-10-10 15:25:50 +08:00
jinsaihang
522fde6dd5 fix get_alarm length and timestamp
Signed-off-by: jinsaihang <jinsaihang@h-partners.com>
2024-10-10 10:49:20 +08:00
zhuofeng
e032d94603 update log when it is not lock collect 2024-10-09 16:49:36 +08:00
贺有志
cd573a07aa
add fix-config-relative-some-issues.patch
Signed-off-by: 贺有志 <1037617413@qq.com>
2024-10-09 08:35:22 +00:00
zhuofeng
066cfe307e avg_block_io send alarm to xalarmd 2024-10-09 15:17:12 +08:00
PshySimon
8d5caea382 fix python 3.7 not support list bool type 2024-10-09 14:26:01 +08:00
jinsaihang
ae5556ff59 add alarm event query function
Signed-off-by: jinsaihang <jinsaihang@h-partners.com>
2024-10-08 20:12:17 +08:00
PshySimon
a620ff721e add pyxalarm and pysentry_notify lib and xalarmd support for multi users 2024-10-08 18:56:31 +08:00
贺有志
4db9027149 add fix-ai-block-io-issues.patch.
Signed-off-by: 贺有志 <1037617413@qq.com>

update sysSentry.spec.

Signed-off-by: 贺有志 <1037617413@qq.com>

rename fix-ai-block-io-issues.patch to fix-ai_block_io-some-issues.patch.

Signed-off-by: 贺有志 <1037617413@qq.com>

update sysSentry.spec.

Signed-off-by: 贺有志 <1037617413@qq.com>
2024-09-30 09:21:35 +08:00
zhuofeng
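e35e45b1e0 modify related log formats and log printing
remove related redundant code
handle the case where collection does not take effect when the disk field is set to default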
2024-09-27 16:25:50 +08:00
zhuofeng
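d6e572e746 related bug fixes and optimizations
1. deduplicate identical option values in the config file
2. config items are case-sensitive
3. log a warning when a nonexistent disk is configured
4. some spelling errors
5. inconsistent validation behavior for different missing sections in the avg_block_io.ini config file
6. abnormal parsing of the common.disk and common.stage option parameters in the avg_block_io.ini config file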
2024-09-25 11:32:18 +08:00
shixuantong
6e33378214 optimize the handling of cat-cli error msg in cpu_sentry 2024-09-23 16:53:02 +08:00
贺有志
ec6f42737a add ai threshold slow io detection plugin
Signed-off-by: 贺有志 <1037617413@qq.com>
2024-09-23 14:36:47 +08:00
zhuofeng
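4e1a7951ff fix related bugs
1. setting disk to default causes a configuration problem when the environment itself has more than 10 disks
2. incorrect status displayed after calling the inspection task stop interface
3. refactor related code and change log levels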
2024-09-20 14:24:05 +08:00
zhuofeng
ee153a027b add collect module and avg_block_io plugin to sysSentry 2024-09-14 10:35:09 +08:00