What Are The Steps In The Incident Management Process?

Any IT issue that disrupts business services is known as an incident. It may affect a single user or many users. It may prevent your employees from completing a business deliverable or stop customers from using your product or service.

Every organization should have an incident management team that can identify, categorize, escalate and resolve issues within agreed-upon SLAs. Depending on the available resources, you can handle incident management in-house or explore IT service desk outsourcing.

It is always a good idea to define business processes beforehand. ITIL is a framework that aims to standardize the lifecycle of IT services within businesses. You can refer to ITIL-aligned incident management templates, which are based on ITSM best practices, and adapt them to your unique business requirements.

This article looks at the overall steps involved in incident management and how you can learn from incidents to improve your IT infrastructure.

What are the Steps in the Incident Management Process?

These steps may differ from one organization to another, but the overall incident management process typically includes the five stages below.

1. Incident Logging

Incidents can be identified through several routes. For example, you may get an alert from automated network monitoring. Or an employee or customer may call in to report that a business function is not working.

Outside business hours, it is typically the after-hours emergency call center team that handles the initial steps. An agent will raise a ticket and gather information about the issue. They may search the knowledge base to find out whether it is a known or a new issue.

They will then determine whether the issue is actually an incident and fill in the ticket details accordingly.
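The logging step above can be sketched in a few lines of Python. This is only an illustration, not how any particular ticketing tool works: the `Ticket` fields, the `KNOWN_ISSUES` lookup, and the `log_ticket` function are all hypothetical names, and the knowledge base check is reduced to a simple keyword match.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Hypothetical knowledge base: keywords for known issues mapped to article IDs.
KNOWN_ISSUES = {
    "vpn": "KB-101",
    "password reset": "KB-205",
}


@dataclass
class Ticket:
    reporter: str
    summary: str
    source: str  # for example "monitoring", "phone", or "email"
    is_incident: bool = False
    known_issue: Optional[str] = None
    opened_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def log_ticket(reporter, summary, source, disrupts_service):
    """Raise a ticket, look for a matching known issue, and flag incidents."""
    ticket = Ticket(reporter=reporter, summary=summary, source=source)
    text = summary.lower()
    for keyword, article in KNOWN_ISSUES.items():
        if keyword in text:
            ticket.known_issue = article
            break
    # In this sketch, the agent decides whether a business service is disrupted.
    ticket.is_incident = disrupts_service
    return ticket
```

For instance, `log_ticket("alice", "Cannot connect to VPN from home", "phone", True)` would produce a ticket that is flagged as an incident and linked to the hypothetical article `KB-101`.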

2. Incident Categorization and Assignment

Ticketing software allows you to create categories and subcategories for tickets. This ensures that issues are routed to the correct teams quickly. Without these categories, the service desk may waste valuable time trying to find people responsible for fixing the problem.

These categories can also help managers understand what IT issues users commonly face. In addition, the subcategories can provide visibility at a granular level.

For example, if users cannot connect to the corporate network, network can be the main category and VPN the subcategory. This way, the incident can be assigned straight to the network team.

Similarly, if a file server is down, the category could be reporting, and the subcategory could be server. Because business users will not be able to access, view, or generate reports, the incident could be routed to the reporting team.

The service desk logs, categorizes, and assigns the incident to the team responsible.
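As a rough sketch of how category-based routing could work, the example below maps a category and subcategory pair to a team, using the two examples from this section. The table contents, the `route` function, and the fallback to the service desk are assumptions made for illustration; real ticketing software will have its own configuration for this.

```python
# Hypothetical routing table: (category, subcategory) -> responsible team.
ROUTING = {
    ("network", "vpn"): "network team",
    ("reporting", "server"): "reporting team",
}

# Hypothetical per-category defaults, used when the subcategory is not listed.
CATEGORY_DEFAULTS = {
    "network": "network team",
    "reporting": "reporting team",
}

# Assumption: anything uncategorized stays with the service desk for triage.
FALLBACK_TEAM = "service desk"


def route(category, subcategory):
    """Return the team that should receive a ticket with this categorization."""
    key = (category.strip().lower(), subcategory.strip().lower())
    if key in ROUTING:
        return ROUTING[key]
    return CATEGORY_DEFAULTS.get(key[0], FALLBACK_TEAM)
```

With this table, `route("Network", "VPN")` returns the network team, while an unlisted category such as `route("hardware", "laptop")` falls back to the service desk.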

3. Incident Diagnosis

In a typical IT support model, the service desk team is considered the face of the business and is the first point of contact between users and the company. The L2 support team includes technically adept staff, such as system analysts and programmers.

The L3 support team includes staff with an advanced technical skill set, such as developers, DBAs, and system architects.

When an incident is created, the service desk assigns the ticket to the L2 support team. They may investigate the issue by checking system logs and recent updates.

The L3 support team, including the specialists, will be involved if the incident needs further investigation.

The vendor will also join the incident call if the issue is traced to a third-party tool or application.
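The escalation path described in this stage can be expressed as a small decision function. This is a minimal sketch under simplified assumptions: the function name `teams_to_involve` and its two boolean inputs are hypothetical, and a real process would weigh more factors than these.

```python
def teams_to_involve(needs_further_investigation, third_party_issue):
    """Return the teams that work on an incident after the service desk assigns it."""
    # L2 support always receives the ticket first in this model.
    teams = ["L2 support"]
    # L3 specialists join only when L2 cannot resolve the issue alone.
    if needs_further_investigation:
        teams.append("L3 support")
    # The vendor joins when the fault lies in a third-party tool or application.
    if third_party_issue:
        teams.append("vendor")
    return teams
```

For example, an incident that L2 cannot diagnose and that involves a third-party application would bring in L2 support, L3 support, and the vendor.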

4. Incident Resolution

As the name implies, once the cause of the issue is identified, the responsible team will fix it. The resolution time varies depending on the fix.

For example, if it is a code fix, the changes will be made in the dev environment, tested, and moved to production. If a server is facing performance issues or is down, it may be rebooted. If hardware needs to be replaced, the incident may take longer to resolve.

Once the issue is fixed and business operations are restored, the service desk may verify with the users who reported the problem and then mark the incident as resolved.

5. Incident Closure

Once an incident is resolved, the service desk may schedule a root cause analysis call with all the stakeholders involved.

This call involves identifying the root cause of the issue. For example, why did the incident occur in the first place? Are the system checks in place working? What can be done to prevent such issues from recurring?

You can also discuss how the team handled the incident. Did the resolution meet the defined SLAs? How did the team perform, and how can it perform better in the future?
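Whether a resolution met its SLA is a simple time comparison, sketched below. The priority levels and target durations in `SLA_TARGETS` are hypothetical values chosen for illustration, as is the `sla_report` function; your own SLAs will define the actual targets.

```python
from datetime import datetime, timedelta

# Hypothetical resolution targets per priority; real SLAs will differ.
SLA_TARGETS = {
    "P1": timedelta(hours=4),
    "P2": timedelta(hours=8),
    "P3": timedelta(hours=24),
}


def sla_report(priority, opened_at, resolved_at):
    """Compare the time taken to resolve an incident against its SLA target."""
    target = SLA_TARGETS[priority]
    elapsed = resolved_at - opened_at
    return {
        "elapsed": elapsed,
        "target": target,
        "met": elapsed <= target,
        # Overrun is zero when the incident was resolved within the target.
        "overrun": max(elapsed - target, timedelta(0)),
    }


opened = datetime(2024, 1, 1, 9, 0)
resolved = datetime(2024, 1, 1, 14, 30)
report = sla_report("P1", opened, resolved)
print(report["met"], report["overrun"])  # prints: False 1:30:00
```

A report like this gives the review call a concrete starting point: the team can see at a glance which incidents breached their targets and by how much.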

The post-incident review (PIR) is an opportunity to identify risks and improvement points for both the system and personnel. It is not about placing blame but about understanding how you can do better as a business.

Why is it Necessary to Have an Incident Management Team?

Business disruptions can be expensive. For example, in March 2019, Facebook suffered about 14 hours of downtime, which cost the company an estimated $90 million in lost revenue. The monetary loss varies from one business to another, but for small to medium-sized companies, the relative impact could be far greater.

The incident management team will lead the resolution process from start to end. They will involve the responsible teams and follow up on the diagnosis. In the meantime, they will update the status board and other channels like social media to let users know the company is aware of the issue and working on it.

Regular updates are essential because they give customers transparency and keep them in the loop. Without communication, customers do not know what is happening, so they may remain dissatisfied even after the company resolves the issue.

The team can also document incidents so that business users can draw on those records for reporting purposes.

Summary

According to the ITIL framework, the incident management process generally includes the stages of incident logging, categorization and assignment, diagnosis, resolution, and closure.

Depending on the cause, the service desk, the L2 and L3 support teams, vendors, and business users could be involved in the resolution process.

It is essential to have an incident management team, as its agents can drive the calls, update stakeholders, and ensure the issue is fixed according to SLAs.


