© 2012. Site24x7 is a trademark of ZOHO Corp. All other company and product names may be trademarks of the respective companies with which they are associated.
Root Cause Analysis (RCA) is one of the recent significant releases from Site24x7. In this post we will see how to interpret an RCA report and how to put RCA to best use in identifying and addressing performance and network issues.
RCA is different from a conventional downtime alert. A conventional downtime alert contains details such as the start time of the instance and a traceroute if available, which are sufficient to tell you what happened. However, the critical question of why it happened remains unanswered. You need to know what caused the issue so that you can start working on it immediately.
Root Cause Analysis automatically gathers a wide range of information to reach a definite conclusion about what triggered a downtime. RCA aims to determine the root cause of a specific downtime or performance issue. In short, RCA tries to answer questions such as what went wrong, how it went wrong and why it went wrong.
We will take the example of a simple 'website not reachable' scenario and try to interpret what the report says.
A normal RCA report comprises the following details.
Monitor Details & Location Details: This section shows the current status of your website when polled from the primary and secondary monitoring locations. It includes the downtime details, the duration and the reason for the downtime at each location.

First Check - First check is done from the Primary Location
1. Screenshot: This screenshot shows the exact error returned when the monitoring station tried to connect to your website. It serves as evidence of exactly what happened when the station tried to reach the site. In our example, the site returned a ‘Connection Timed Out’ error.

2. Test: Ping Analysis
Status: Server Unreachable due to timeout in Hop 16
Traceroute analysis can help you identify weak points in your network. In this case, RCA concludes that the timeout happened at Hop 16.
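To illustrate how a conclusion like "timeout in Hop 16" can be read off a traceroute, here is a minimal Python sketch. It is not how Site24x7 implements its analysis; the function name `first_timeout_hop` and the sample output (which uses documentation IP ranges) are made up for illustration, and the parsing assumes Unix-style `traceroute` output where a timed-out probe is printed as `*`.

```python
import re

def first_timeout_hop(traceroute_output: str):
    """Return the number of the first hop where every probe timed out, or None."""
    for line in traceroute_output.splitlines():
        match = re.match(r"\s*(\d+)\s+(.*)$", line)
        if not match:
            continue  # skip the header line and blank lines
        hop, rest = int(match.group(1)), match.group(2)
        probes = rest.split()
        # A hop counts as timed out only if all of its probes are '*'.
        if probes and all(p == "*" for p in probes):
            return hop
    return None

# Made-up sample in the style of Unix traceroute output.
sample = """traceroute to example.com (192.0.2.10), 30 hops max
14  198.51.100.1  20.1 ms  20.3 ms  20.2 ms
15  198.51.100.9  25.4 ms  25.1 ms  25.6 ms
16  * * *
17  * * *
"""

print(first_timeout_hop(sample))  # prints 16
```

Hops after the first timeout usually time out as well, so the first fully timed-out hop is the one worth investigating.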

3. Test: Domain Analysis
Status: Domain resolved properly
This is a complete health check of your Name Servers and Email Servers. In our example, the domain resolved properly.
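The simplest part of such a check, confirming that the domain resolves to an address, can be sketched in Python as follows. This covers only address lookup, not the full name server and email server health check the report performs. The function name `resolve_domain` and the injectable `resolver` parameter are assumptions made for this example so that the check can be tried without network access.

```python
import socket

def resolve_domain(hostname, resolver=socket.getaddrinfo):
    """Return (status, addresses) for a hostname.

    `resolver` defaults to the standard library lookup and can be
    replaced with a stub when testing offline.
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror as exc:
        return f"Domain failed to resolve: {exc}", []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr.
    addresses = sorted({entry[4][0] for entry in results})
    return "Domain resolved properly", addresses

if __name__ == "__main__":
    # Performs a real DNS lookup; the result depends on your network.
    print(resolve_domain("example.com"))
```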

Re-Checks from Secondary location
The same set of tests will be conducted from a secondary location as well to confirm the downtime.
4. Conclusion
This is where the RCA report tells you the probable reason for the down instance, based on the above results. The conclusion reached in this case is “Connection to the server got dropped in the Hop 16”. This clearly points to a network problem. Armed with this conclusion, you can immediately get down to fixing the issue. Maybe you need to contact the hosting provider.
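As a rough illustration of how individual test results can be combined into one conclusion, consider the Python sketch below. The rules and the function name `conclude` are hypothetical and chosen only to mirror the example in this post; the actual logic Site24x7 uses is not described here.

```python
def conclude(dns_ok, timeout_hop, confirmed_from_secondary):
    """Combine individual check results into a one-line probable cause.

    Hypothetical rules for illustration only:
      dns_ok                   -- True if the domain resolved properly
      timeout_hop              -- hop number where traceroute timed out, or None
      confirmed_from_secondary -- True if the re-check also saw the downtime
    """
    if not confirmed_from_secondary:
        return "Downtime not confirmed from the secondary location"
    if not dns_ok:
        return "Domain did not resolve; check the name servers"
    if timeout_hop is not None:
        return f"Connection to the server got dropped in the Hop {timeout_hop}"
    return "No network fault found; check the web server itself"

# The scenario from this post: DNS is fine, traceroute times out at hop 16,
# and the secondary location confirms the downtime.
print(conclude(dns_ok=True, timeout_hop=16, confirmed_from_secondary=True))
```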

Have you already used RCA reports? We would love to hear how RCA helped you identify and resolve your issues. Do let us know in the comments.
| Location | IP |
| Copenhagen-Denmark | 77.66.111.162 |
| Location | Old IP | New IP |
| Rotterdam (Netherlands) | 213.163.84.159, 213.163.84.160, 213.163.84.161, 213.163.84.162 | 134.19.176.30, 134.19.176.80, 134.19.176.81, 134.19.176.82 |
If you have enabled IP restriction on your server, make sure that you add the new IP addresses to the allowed list. This change is necessary to ensure that our monitoring requests continue without any hindrance. If you have any questions, do contact us at support@site24x7.com.
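If you keep your allowed list as plain IP addresses or CIDR ranges, a short Python sketch like the one below can report which monitoring IPs are still missing from it. The new Rotterdam addresses come from the table above; the function name `missing_from_allowlist` and the sample allowlist are made up for illustration, so substitute your own firewall or server configuration.

```python
import ipaddress

def missing_from_allowlist(monitor_ips, allowlist):
    """Return the monitoring IPs not covered by any allowlist entry.

    Allowlist entries may be single addresses or CIDR ranges.
    """
    networks = [ipaddress.ip_network(entry, strict=False) for entry in allowlist]
    return [
        ip for ip in monitor_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]

# New Rotterdam monitoring IPs from the table above.
new_rotterdam_ips = ["134.19.176.30", "134.19.176.80", "134.19.176.81", "134.19.176.82"]

# Hypothetical allowlist that still contains one old IP and lacks one new one.
allowlist = ["134.19.176.80/31", "134.19.176.30", "213.163.84.159"]

print(missing_from_allowlist(new_rotterdam_ips, allowlist))  # prints ['134.19.176.82']
```
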
| Location | Old IP | New IP |
| Chennai-India | 121.244.182.93 | 27.251.30.94 |

| Location | Old IP | New IP |
| São Paulo | 200.170.83.170 | 177.71.177.249 |
If you have enabled IP restriction on your server, make sure that you add the new IP address to the allowed list. This change is necessary to ensure that our monitoring requests continue without any hindrance. If you have any questions, do contact us at support@site24x7.com.