IT Hurdles: If It Ain't Broken

By: admin, December 14, 2022

In many areas of our lives, people live by the common saying: “If it ain’t broke, don’t fix it.” In technology, this stance can be detrimental to our business and personal lives if we don’t pay close attention to the risks that come with it. Running outdated technology may look like a money saver on the surface, but more often than not it’s a money trap waiting to spring, from both a CAPEX and an OPEX perspective.

Years ago, I was working to inventory a number of systems and determine their usage and necessary upgrades. In that work, I came across several increasingly old systems still in use, and one stood out: an AS400 that was over 15 years old. The system was core to the work of about 400 people and critical to doing their jobs, and every person I spoke to about it was quick to tell me two things: they couldn’t work without the system, and it was okay because they paid for support on it.

Every single person involved insisted that we couldn’t touch that system because “they were special,” “it couldn’t be down,” and “they had support, so we didn’t have to worry about it.”

While my team was reviewing the system, I sat down with the owner of the system and called the vendor. They had been paying an excessive amount of money every year for support, so I asked the vendor a simple question: “If the system fails due to hardware failure, will you guarantee that it will be fixed?” There was a pause, and then came the answer: “Our SLA is that we will have a technician on site within 4 hours.” I smiled, waited, and asked the question differently: “Can you guarantee that you will be able to get the system back online?” The answer was again: “Our SLA is that we will have a technician on site within 4 hours.” We had some further discussion, but after the call I looked at the owner of the system, a non-technical person in charge of an important area, and asked him if he understood what had just happened. He had been paying close attention and simply said, “I think we have to look at some additional options.”

We replaced that system with a newer box and worked to replace the software. Using virtualization, we moved the system to a more resilient platform, so that the answer was no longer a technician on site within 4 hours but a system that would keep its 400 workers online even in the event of a disaster.

So why was this a good decision? It’s simple. First, if the system was down for even 1 hour, the 400 affected workers would cost a considerable amount of money. Even at a minimum rate of $10 an hour, which it wasn’t, that’s $4,000 an hour. Had a power outage occurred, it could have become a massive amount in lost-time operating expense that dwarfs any other cost. Second, if the data had been lost, there would have been no alternate operating system or hardware to bring the system back online, and the cost of losing the data could be incalculable. Third, the system, having been out of date for so long, had numerous security issues and could easily have led to a breach of data protected by regulation. That alone can destroy both the credibility and the finances of a business with minimal chance of recovery. Fourth, the system was taking its toll on users and becoming less and less usable, pushing workers to find workarounds to do their jobs that were even more expensive.
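The lost-time arithmetic in the first point can be sketched in a few lines of Python. This is only a rough illustration: the function name `downtime_cost` is made up for this example, the 8-hour outage is an assumed scenario rather than something that happened, and the figure covers idle wages only, not lost revenue or recovery costs.

```python
def downtime_cost(workers: int, hourly_wage: float, hours_down: float) -> float:
    """Wages paid to idle workers during an outage (idle wages only)."""
    return workers * hourly_wage * hours_down


# The figures from the story: 400 workers at a $10/hour floor, down for 1 hour.
one_hour = downtime_cost(400, 10, 1)

# Assumed scenario for illustration: the same outage lasting a full 8-hour day.
full_day = downtime_cost(400, 10, 8)

print(f"1 hour down:  ${one_hour:,.0f}")
print(f"8 hours down: ${full_day:,.0f}")
```

Plugging in your own headcount and a realistic loaded wage usually makes the number much larger than the support contract.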

Of course, there were many more reasons, but how does this affect both small and large companies? As a system ages, we add risk and potential points of failure, including difficulty finding replacements. The bigger the system, that is, the more moving parts it has, the more opportunities there are for problems to arise and the more easily users are affected.

A simple approach can be: HardwareAge + OSAge + Risk + UserImpact + FinancialImpact - DRResiliency < 10.

Why?

Well, as hardware ages, it requires upgrades, but it may also require replacement parts. As parts become less available, the risk to the system grows and repairs become difficult and frustrating. If you virtualize, the virtual platform is part of the same equation, but in that case your hardware age is always 1, since moving to the virtual system is itself the necessary update.

The operating system can become a nightmare as it gets older, because it accumulates more and more security risks. If it has reached end of life and is no longer supported, you are immediately at great risk and need to find a solution. We often forget about the operating system, yet it underlies much of what we do and is the foundation most programs run on.

Risk can be a massive discussion on its own, but since the whole equation is indirectly about risk, in this case let’s treat risk as regulatory or agency risk. Rate it from 0 to 5, where 5 is the most heavily controlled and regulated work, such as anything covered by HIPAA, and 0 is no risk.

User impact and financial impact are subjective, but rate each from 0 to 3, where 0 is no impact and 3 is high impact.

Disaster recovery (DR) resiliency reduces your score, because it lets you get back online quickly with less risk of downtime. This can be accomplished through tools that quickly restore your system. Using a virtual machine and a solution like Datto can get you back online quickly, even in the event of a total loss, reducing overall risk.

This is not a hard and fast rule; it’s something I put together to explain the risks associated with systems in a simple way. A good tech professional would look at this and say it’s a start and that there’s so much more to consider, but it will let you know where to begin. If you get a number greater than 10, it’s definitely time to start talking to someone. If we take the earlier example, we get these numbers:

15 + 16 + 5 + 3 + 3 - 1 = 41

Every point beyond 10 should have been a red flag, and in this case, even counting DR resiliency as 1, the result was still bad.
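If you want to run the same back-of-the-envelope check across several systems, the rule can be sketched as a small Python class. Treat this as one possible reading of the rule, not a finished tool: the name `SystemProfile` and its field names are invented for this example, and because the scale for DR resiliency is not defined above, it is left unchecked here.

```python
from dataclasses import dataclass

THRESHOLD = 10  # above this, it's time to start talking to someone


@dataclass
class SystemProfile:
    hardware_age_years: int  # use 1 for a virtualized system
    os_age_years: int
    regulatory_risk: int     # 0 (no risk) to 5 (heavily regulated, e.g. HIPAA)
    user_impact: int         # 0 (no impact) to 3 (high impact)
    financial_impact: int    # 0 (no impact) to 3 (high impact)
    dr_resiliency: int       # subtracted from the score; scale is an assumption

    def score(self) -> int:
        # Enforce the rating ranges described in the text.
        if not 0 <= self.regulatory_risk <= 5:
            raise ValueError("regulatory_risk must be between 0 and 5")
        for name in ("user_impact", "financial_impact"):
            if not 0 <= getattr(self, name) <= 3:
                raise ValueError(f"{name} must be between 0 and 3")
        return (self.hardware_age_years + self.os_age_years
                + self.regulatory_risk + self.user_impact
                + self.financial_impact - self.dr_resiliency)


# Inputs from the AS400 example.
legacy = SystemProfile(15, 16, 5, 3, 3, 1)
needs_review = legacy.score() > THRESHOLD
print(legacy.score(), "-> review needed" if needs_review else "-> acceptable")
```

The range checks are there mostly to keep the subjective ratings honest when more than one person fills them in.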

This is still just an estimate. It is just as valid to assess supportability and availability without an equation. As the number of people who can support a system decreases, the risk increases rapidly, whether the system is high-risk or not. Several times, I have been put in the position of finding a way into a system that no one knows the password for and no one knows how to fix. If your support is single-threaded, meaning it depends on one person, it’s time to replace the software, hardware, or both.

It is also important to listen carefully to what vendors tell you. Obviously, there’s no guarantee on any system, but when you’re not given an ETA or an escalation path in the event of an outage, you’re courting downtime and the costs that come with it.

Remember, if a system is not critical, won’t cost you time or money when it fails, holds no critical or useful data, and can be gone forever without affecting you or your business, then maybe it’s okay to keep a very old system. I’m sure there are some exceptions as well, where updating a piece of software would cost so much that the update is ruled out. But in the end, if you have these systems and say, “If it ain’t broke, don’t fix it,” maybe those machines should be shut down anyway and replaced with new solutions that really help the business.
