Merged
38 commits
7481ca8
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 11, 2025
4b54c18
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 11, 2025
638668f
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 11, 2025
0acfbe5
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 11, 2025
92cfeed
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 11, 2025
edc2722
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 11, 2025
a7edf5c
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 11, 2025
f140f6e
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 11, 2025
cf0570b
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 15, 2025
a53f9c2
Merge branch 'master' into master
youjie23 Oct 15, 2025
d4ad7c0
Merge branch 'master' into master
wu-sheng Oct 15, 2025
5829a48
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 18, 2025
9b10401
Merge branch 'master' of github.com:youjie23/skywalking
youjie23 Oct 18, 2025
602262d
Merge branch 'master' into master
youjie23 Oct 18, 2025
4688cf7
merge master
youjie23 Oct 25, 2025
f97ad0c
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Oct 25, 2025
239439c
Merge branch 'master' into master
youjie23 Oct 27, 2025
587b2aa
chore(e2e): set allowed times to <=0 for endless trigger simulation
youjie23 Oct 30, 2025
e8b6200
Merge branch 'master' of github.com:youjie23/skywalking
youjie23 Oct 30, 2025
6b1f926
Merge branch 'master' into master
wu-sheng Oct 30, 2025
88d2c85
Merge branch 'master' into master
wu-sheng Oct 31, 2025
783ac8b
Merge branch 'master' into master
wu-sheng Nov 1, 2025
c4da5d2
chore:add logs for troubleshooting
youjie23 Nov 6, 2025
c080b31
Merge branch 'master' of github.com:youjie23/skywalking
youjie23 Nov 6, 2025
c6a8d83
chore:add logs for troubleshooting
youjie23 Nov 6, 2025
9c8651c
Revert "chore:add logs for troubleshooting"
youjie23 Nov 6, 2025
7c2b0f5
chore: remove the commented-out code
youjie23 Nov 6, 2025
4dcff48
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Nov 9, 2025
5307baf
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Nov 9, 2025
6ff7817
Merge branch 'master' into master
youjie23 Nov 10, 2025
ca113a5
enhance the alarm kernel with recovered status notification capabilit…
youjie23 Nov 12, 2025
f65414b
Merge branch 'master' into master
youjie23 Nov 12, 2025
4c1e2c6
fix Copilot review and CI fail
youjie23 Nov 12, 2025
06a96e8
Merge branch 'master' into master
youjie23 Nov 13, 2025
37cc68a
Merge branch 'master' into master
youjie23 Nov 13, 2025
3b8e9c5
Sync UI
youjie23 Nov 16, 2025
eb77ce3
Merge branch 'master' of github.com:youjie23/skywalking
youjie23 Nov 16, 2025
dad19b8
docs:update changes.md and backend-alarm.md
youjie23 Nov 17, 2025
10 changes: 5 additions & 5 deletions .github/workflows/skywalking.yaml
@@ -780,7 +780,7 @@ jobs:
if: matrix.test.docker != null
run: docker build -t ${{ matrix.test.docker.name }} -f ${{ matrix.test.docker.base }}/${{ matrix.test.docker.file }} ${{ matrix.test.docker.base }}
- name: ${{ matrix.test.name }}
- uses: apache/skywalking-infra-e2e@cf589b4a0b9f8e6f436f78e9cfd94a1ee5494180
+ uses: apache/skywalking-infra-e2e@01b80d98a38154f4f80d9cdb128b9d81727f2b80
with:
e2e-file: $GITHUB_WORKSPACE/${{ matrix.test.config }}
- if: ${{ failure() }}
@@ -844,7 +844,7 @@ jobs:
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: ${{ matrix.test.name }}
- uses: apache/skywalking-infra-e2e@cf589b4a0b9f8e6f436f78e9cfd94a1ee5494180
+ uses: apache/skywalking-infra-e2e@01b80d98a38154f4f80d9cdb128b9d81727f2b80
env:
ISTIO_VERSION: ${{ matrix.versions.istio }}
KUBERNETES_VERSION: ${{ matrix.versions.kubernetes }}
@@ -905,7 +905,7 @@ jobs:
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: ${{ matrix.test.name }}
- uses: apache/skywalking-infra-e2e@cf589b4a0b9f8e6f436f78e9cfd94a1ee5494180
+ uses: apache/skywalking-infra-e2e@01b80d98a38154f4f80d9cdb128b9d81727f2b80
env:
ISTIO_VERSION: ${{ matrix.versions.istio }}
KUBERNETES_VERSION: ${{ matrix.versions.kubernetes }}
@@ -968,7 +968,7 @@ jobs:
shell: bash
run: ./mvnw -B -q -f test/e2e-v2/java-test-service/pom.xml clean package
- name: Java version ${{ matrix.java-version }}
- uses: apache/skywalking-infra-e2e@cf589b4a0b9f8e6f436f78e9cfd94a1ee5494180
+ uses: apache/skywalking-infra-e2e@01b80d98a38154f4f80d9cdb128b9d81727f2b80
env:
SW_AGENT_JDK_VERSION: ${{ matrix.java-version }}
with:
@@ -1064,7 +1064,7 @@ jobs:
# fi
# docker compose -f ${BANYANDB_DATA_GENERATE_ROOT}/docker-compose.yml down -v
# - name: ${{ matrix.test.name }}
- # uses: apache/skywalking-infra-e2e@cf589b4a0b9f8e6f436f78e9cfd94a1ee5494180
+ # uses: apache/skywalking-infra-e2e@01b80d98a38154f4f80d9cdb128b9d81727f2b80
# with:
# e2e-file: $GITHUB_WORKSPACE/${{ matrix.test.config }}
# - if: ${{ failure() }}
37 changes: 34 additions & 3 deletions dist-material/alarm-settings.yml
@@ -23,6 +23,8 @@ rules:
expression: sum(service_resp_time > 1000) >= 3
period: 10
silence-period: 5
+ # Number of periods to wait before considering the alarm recovered, default as 0.
+ recovery-observation-period: 3
message: Response time of service {name} is more than 1000ms in 3 minutes of last 10 minutes.
# service_resp_time_rule:
# expression: avg(service_resp_time) > 1000
@@ -35,16 +37,20 @@ rules:
period: 10
# How many times of checks, the alarm keeps silence after alarm triggered, default as same as period.
silence-period: 3
+ # Number of periods to wait before considering the alarm recovered, default as 0.
+ recovery-observation-period: 2
message: Successful rate of service {name} is lower than 80% in 2 minutes of last 10 minutes
service_resp_time_percentile_rule:
expression: sum(service_percentile{p='50,75,90,95,99'} > 1000) >= 3
period: 10
silence-period: 5
+ recovery-observation-period: 3
message: Percentile response time of service {name} alarm in 3 minutes of last 10 minutes, due to more than one condition of p50 > 1000, p75 > 1000, p90 > 1000, p95 > 1000, p99 > 1000
service_instance_resp_time_rule:
expression: sum(service_instance_resp_time > 1000) >= 2
period: 10
silence-period: 5
+ recovery-observation-period: 2
message: Response time of service instance {name} is more than 1000ms in 2 minutes of last 10 minutes
database_access_resp_time_rule:
expression: sum(database_access_resp_time > 1000) >= 2
@@ -63,11 +69,36 @@ rules:
# silence-period: 5
# message: Response time of endpoint {name} is more than 1000ms in 2 minutes of last 10 minutes


  #hooks:
  #  webhook:
  #    default:
  #      is-default: true
  #      urls:
- #        - http://127.0.0.1/notify/
- #        - http://127.0.0.1/go-wechat/
-
+ #        - http://127.0.0.1/default/alarm
+ #      recovery-urls:
+ #        - http://127.0.0.1/default/alarm-recovery
+ #    custom1:
+ #      urls:
+ #        - http://127.0.0.1/custom1/alarm
+ #      recovery-urls:
+ #        - http://127.0.0.1/custom1/alarm-recovery
+ #  wechat:
+ #    default:
+ #      is-default: true
+ #      text-template: |-
+ #        {
+ #          "msgtype": "text",
+ #          "text": {
+ #            "content": "Apache SkyWalking Alarm: \n %s."
+ #          }
+ #        }
+ #      recovery-text-template: |-
+ #        {
+ #          "msgtype": "text",
+ #          "text": {
+ #            "content": "Apache SkyWalking Alarm Recovered: \n %s."
+ #          }
+ #        }
+ #      webhooks:
+ #        - https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=dummy_key
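
The `recovery-observation-period` comment in this file is terse, so here is a minimal sketch of one way to read it, written in Python purely for illustration. It assumes the setting counts consecutive checks in which the rule expression no longer matches, and that the recovery notification fires on the first healthy check after that many have been observed (so `0` recovers immediately). `RecoveryTracker`, its fields, and the event names are hypothetical and are not taken from the SkyWalking code base; `silence-period` is deliberately ignored.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTracker:
    """Hypothetical model of one rule/entity pair; not SkyWalking's implementation."""

    observation_period: int = 0  # assumed meaning of recovery-observation-period
    firing: bool = False         # True once an alarm has been raised
    healthy_checks: int = 0      # consecutive checks where the expression did not match

    def check(self, condition_met: bool) -> str:
        """Feed one evaluation result and return the resulting event name."""
        if condition_met:
            # Any unhealthy check restarts the observation window.
            self.healthy_checks = 0
            if not self.firing:
                self.firing = True
                return "FIRING"
            return "NONE"
        if not self.firing:
            return "NONE"
        self.healthy_checks += 1
        if self.healthy_checks > self.observation_period:
            # Enough healthy checks observed: report recovery and reset.
            self.firing = False
            self.healthy_checks = 0
            return "RECOVERED"
        return "OBSERVING"


if __name__ == "__main__":
    tracker = RecoveryTracker(observation_period=2)
    for result in [True, False, False, True, False, False, False]:
        print(result, tracker.check(result))
```

Under this reading, one unhealthy check in the middle of the window resets the count, which looks like the flapping protection the setting is aimed at; the real kernel may count periods differently.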
38 changes: 34 additions & 4 deletions dist-material/config-examples/alarm-settings.yml
@@ -23,6 +23,8 @@ rules:
period: 10
# How many times of checks, the alarm keeps silence after alarm triggered, default as same as period.
silence-period: 10
+ # Number of periods to wait before considering the alarm recovered, default as 0.
+ recovery-observation-period: 3
message: Successful rate of endpoint {name} is lower than 75%
tags:
level: WARNING
@@ -43,7 +45,35 @@ rules:
silence-period: 5
message: Response time of service instance {name} is more than 1000ms in 2 minutes of last 10 minutes

- #webhooks:
- #  - http://127.0.0.1/notify/
- #  - http://127.0.0.1/go-wechat/
-
+ #hooks:
+ #  webhook:
+ #    default:
+ #      is-default: true
+ #      urls:
+ #        - http://127.0.0.1/default/alarm
+ #      recovery-urls:
+ #        - http://127.0.0.1/default/alarm-recovery
+ #    custom1:
+ #      urls:
+ #        - http://127.0.0.1/custom1/alarm
+ #      recovery-urls:
+ #        - http://127.0.0.1/custom1/alarm-recovery
+ #  wechat:
+ #    default:
+ #      is-default: true
+ #      text-template: |-
+ #        {
+ #          "msgtype": "text",
+ #          "text": {
+ #            "content": "Apache SkyWalking Alarm: \n %s."
+ #          }
+ #        }
+ #      recovery-text-template: |-
+ #        {
+ #          "msgtype": "text",
+ #          "text": {
+ #            "content": "Apache SkyWalking Alarm Recovered: \n %s."
+ #          }
+ #        }
+ #      webhooks:
+ #        - https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=dummy_key
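
The `hooks` example in this file pairs every alarm target with a recovery target. As a rough sketch of how a sender might choose between them, assuming `recovery-urls` and `recovery-text-template` are used only for recovered alarms and that `%s` is a positional placeholder for the alarm message, the selection could look like the following Python. The dictionaries mirror the commented-out YAML; `target_urls` and `render_wechat` are hypothetical helper names, not SkyWalking APIs.

```python
import json
from typing import Dict, List

# Hypothetical in-memory copy of the commented-out `hooks.webhook.default` entry.
WEBHOOK_DEFAULT: Dict[str, object] = {
    "is-default": True,
    "urls": ["http://127.0.0.1/default/alarm"],
    "recovery-urls": ["http://127.0.0.1/default/alarm-recovery"],
}

# Raw strings keep the literal backslash-n that the YAML block scalars contain.
WECHAT_DEFAULT: Dict[str, str] = {
    "text-template": r'{"msgtype": "text", "text": {"content": "Apache SkyWalking Alarm: \n %s."}}',
    "recovery-text-template": r'{"msgtype": "text", "text": {"content": "Apache SkyWalking Alarm Recovered: \n %s."}}',
}


def target_urls(hook: Dict[str, object], recovered: bool) -> List[str]:
    """Pick the URL list for the current alarm state (assumed routing rule)."""
    key = "recovery-urls" if recovered else "urls"
    return list(hook.get(key, []))


def render_wechat(hook: Dict[str, str], message: str, recovered: bool) -> str:
    """Fill the single %s placeholder of the matching template."""
    key = "recovery-text-template" if recovered else "text-template"
    return hook[key] % message


if __name__ == "__main__":
    print(target_urls(WEBHOOK_DEFAULT, recovered=True))
    print(json.loads(render_wechat(WECHAT_DEFAULT, "service-a", recovered=True)))
```

The sketch does not JSON-escape the message, so a message containing quotes would produce an invalid body; a real sender would need to handle that.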
3 changes: 3 additions & 0 deletions docs/en/changes/changes.md
@@ -5,8 +5,11 @@
#### OAP Server

* KubernetesCoordinator: make self instance return real pod IP address instead of `127.0.0.1`.
+ * Enhance the alarm kernel with recovered status notification capability.

#### UI
Suggested change
- #### UI
+ #### UI
+ * Fix the missing icon in new native trace view.

According to apache/skywalking-booster-ui@3092725...6eaf7fe, this submodule update includes two commits.

+ * Fix the missing icon in new native trace view.
+ * Enhance the alert page to show the recovery time of resolved alerts.

#### Documentation
