Us vs them: how BAO Systems fights security breaches and cyber threats
21 Jan 2022
By Lisa Spory, Chief Technology Officer at BAO Systems
In the early morning hours of December 10, 2021, before I even had the chance to enjoy my first sip of coffee, I was alerted by engineers on our BAO Systems team that a critical vulnerability was emerging that required immediate action across our more than 400 managed cloud-based instances of DHIS2. A vulnerability in the open-source Apache logging library Log4j exposed Java-based applications to breaches and attacks, and was extremely wide-reaching in its scope. It quickly became clear that this was one of the most serious vulnerabilities ever experienced across the digital world.
BAO engineers immediately on the battlefield
Applications use logs to keep track of what happens in their software, and this vulnerability allows attackers to execute arbitrary code, install malware, or steal data. While it would still be a few hours before specific guidance and patches would be available to the DHIS2 community, engineers at BAO Systems had already sprung into action to secure the server fleet.
Our ability to respond to this particular vulnerability, and future zero-day vulnerabilities, is enabled by our robust critical incident response process, which is based upon the National Institute of Standards and Technology (NIST) framework for incident response. Having a defined process for handling critical incidents is crucial to providing immediate response, and an absolute must for effectively adjusting and adapting to a constantly evolving threat, as we experienced with the Log4Shell vulnerability over the course of several days.
How we kept your DHIS2 instance safe
BAO Systems has developed a critical response process to protect against cyber security threats. In this article, I will highlight a few key aspects of our response process, and how it helped us to keep our hosting customers safe and secure during one of the most dangerous and widespread security vulnerabilities in history.
Preparation
No one wants to be scrambling when presented with a critical, zero-day vulnerability, which is why the first step in the process is adequate preparation. At BAO Systems, this means that we have plans and procedures in place for identifying vulnerabilities, determining who gets notified, who takes action, and how communication is managed both internally and externally. We stay closely connected to the DHIS2 Community and continuously monitor it, along with industry-standard sources of IT news and events. It also means that we have a solid handle on our inventory of IT assets: a list of the assets, their purpose, and their priority (such as production vs development servers).
Additionally, preparation involves implementing good security hygiene and following cloud security best practices to mitigate vulnerabilities in advance, since often the first line of attack for hackers is to exploit “sloppy” security practices. For example, at BAO Systems, we employ principles of least privilege so that if a server were to become compromised during a Remote Code Execution (RCE) attack, such as Log4Shell, the impact and reach of the security breach are limited.
We also implement policies such as blocking all web requests that address a server by IP address instead of by host name, which helped to fend off the early, less sophisticated attacks we witnessed during Log4Shell.
“Early in this incident we immediately observed multiple Russian-based IP attacks on servers based in the AWS US-East-1 region, which were automatically blocked by security preparation measures we already had in place,” says Alan Ivey, BAO Systems Director of Systems Engineering. “It was rewarding to see our proactive measures and security practices offer a first line of defense.”
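To illustrate the idea of rejecting requests addressed to a bare IP address, here is a minimal Python sketch. It is not BAO Systems' actual configuration, which the article does not describe; in practice a rule like this usually lives in the web server or load balancer. The function names `is_ip_literal` and `should_block` are hypothetical.

```python
import ipaddress


def is_ip_literal(host_header: str) -> bool:
    """Return True when a Host header names a bare IP address, not a host name."""
    host = host_header.strip()
    if host.startswith("["):
        # Bracketed IPv6 literal, optionally followed by ":port".
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # "name:port" or "IPv4:port": drop the port.
        host = host.rsplit(":", 1)[0]
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def should_block(host_header: str) -> bool:
    """Hypothetical policy: reject empty Host headers and IP-addressed requests."""
    return not host_header.strip() or is_ip_literal(host_header)
```

Automated scanners often sweep IP ranges without knowing which host names a server answers to, which is why a check of this kind can filter out unsophisticated attack traffic before it reaches the application.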
Detection & Analysis
While proactive preparation techniques can help you be ready, when a critical vulnerability actually hits, it’s immediately “go time.” Having a prepared process and inventory of IT assets was critical in allowing the BAO Systems team to quickly detect servers with impacted software versions and prioritize servers for mitigating actions and patching. As part of the process, the team maintained the inventory status in real time, which became critical as the Log4Shell exploit details evolved and required additional actions.
Over the course of days, the team quickly iterated through multiple configuration changes and patches, which was accelerated by meticulous tracking of patch status in the software inventory. Additionally, while part of the team was busy implementing mitigations and patches, other members of the team were penetration testing patched servers and analyzing the fleet for signs of compromise.
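The inventory-driven triage described in this section can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `Server` record, its fields, and the `triage` function are hypothetical, and the version rule is deliberately simplified. The default threshold of 2.15.0 reflects the first Log4j release that addressed the original Log4Shell report; it is a parameter because later releases raised the bar as the guidance evolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical inventory record; the real inventory schema is not described."""
    name: str
    environment: str       # "production" or "development"
    log4j_version: tuple   # numeric version, e.g. (2, 14, 1)
    patched: bool = False


def needs_action(server: Server, first_fixed: tuple) -> bool:
    """Simplified rule: an unpatched Log4j 2.x older than the first fixed release."""
    version = server.log4j_version
    return not server.patched and version[0] == 2 and version < first_fixed


def triage(inventory, first_fixed=(2, 15, 0)):
    """Return servers needing action, production first, then by name."""
    affected = [s for s in inventory if needs_action(s, first_fixed)]
    return sorted(affected, key=lambda s: (s.environment != "production", s.name))
```

Re-running `triage` with a higher `first_fixed` value shows which servers need another pass when a previously acceptable version is later found to be insufficient.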
Containment, Eradication & Recovery
Our response to the Log4Shell vulnerability was fast, aggressive and abundantly cautious. For example, while the initial belief early in the Log4Shell exploit was that only older versions of Java were impacted, BAO Systems proactively took measures to disable JNDI lookups before the official guidance shifted to recommend doing so. Our ability to quickly roll out configuration changes and software patches across more than 400 servers was accelerated again through proper preparation, in the form of Ansible scripts to quickly (and consistently) deploy changes across the fleet.
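A fleet-wide configuration change of this kind needs to be idempotent, so that applying it twice leaves a server in the same state. The Python sketch below shows that property for one widely published early mitigation, the Log4j system property `log4j2.formatMsgNoLookups=true`. The article does not say which setting BAO Systems applied or how its Ansible scripts were written, so the function `ensure_flag` and the use of a `JAVA_OPTS`-style string are assumptions. The sketch also assumes no option contains embedded spaces.

```python
NO_LOOKUPS_FLAG = "-Dlog4j2.formatMsgNoLookups=true"


def ensure_flag(java_opts: str, flag: str = NO_LOOKUPS_FLAG) -> str:
    """Return java_opts with exactly one copy of flag, replacing any old value."""
    key = flag.split("=", 1)[0]
    # Keep every option except earlier settings of the same property.
    kept = [opt for opt in java_opts.split() if opt.split("=", 1)[0] != key]
    return " ".join(kept + [flag])
```

For example, `ensure_flag("-Xms2g -Xmx4g")` appends the flag, and calling it again on the result changes nothing. Note that this property only takes effect on Log4j 2.10 and later, and it was subsequently judged insufficient on its own, so upgrading the library remained the durable fix.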
Our containment, eradication and recovery strategy was also extremely customer focused. While we knew we had to move quickly as patches became available, we also were keenly aware that we have production DHIS2 users in the field working on critical missions helping real people. In close collaboration with the Systems Engineering team, our BAO Systems Customer Success team engaged immediately with our hosting clients to communicate and solidify patch strategies and timelines, and where necessary, devise alternative plans.
For example, one of our DHIS2 hosting customers was in the middle of an extremely time-sensitive, mission-critical operation while the Log4Shell vulnerability was unfolding. While they were unable to assume the risk of immediately upgrading multiple DHIS2 versions during this critical period, everyone understood that addressing the Log4Shell vulnerability was equally critical and time-sensitive. Our Systems Engineering team collaborated with the Customer Success team and the customer to roll out a direct patch to “surgically” remove the vulnerable JNDI code, protecting the server from the Log4Shell vulnerability without requiring immediate DHIS2 patching and risking the mission.
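One publicly documented way to remove the vulnerable code without upgrading was to delete the `JndiLookup` class from the log4j-core jar. The Python sketch below shows that technique; it is an illustration, not the patch BAO Systems actually deployed, which the article does not detail. The function name `strip_jndi_lookup` is hypothetical, and the sketch assumes the jar is directly accessible on disk and that the application is restarted afterwards.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

JNDI_LOOKUP_CLASS = "org/apache/logging/log4j/core/lookup/JndiLookup.class"


def strip_jndi_lookup(jar_path: Path) -> bool:
    """Rewrite the jar without JndiLookup.class; return True if it was removed."""
    with zipfile.ZipFile(jar_path) as source:
        if JNDI_LOOKUP_CLASS not in source.namelist():
            return False  # Nothing to do: already removed or never present.
        # Build the replacement next to the original so the final swap stays
        # on the same filesystem.
        with tempfile.NamedTemporaryFile(
            delete=False, suffix=".jar", dir=jar_path.parent
        ) as tmp:
            tmp_path = Path(tmp.name)
        with zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as target:
            for item in source.infolist():
                if item.filename != JNDI_LOOKUP_CLASS:
                    target.writestr(item, source.read(item.filename))
    # Keep a backup of the original jar, then swap in the stripped copy.
    shutil.copy2(jar_path, Path(str(jar_path) + ".bak"))
    tmp_path.replace(jar_path)
    return True
```

Because the function returns `False` when the class is already absent, it can be run repeatedly across a fleet without harm, and the `.bak` copy gives a rollback path if the application misbehaves.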
Post-Incident Activity
While proper preparation and quick action enabled the Systems Engineering team to effectively navigate this critical security incident and keep our servers safe, no process is without room for improvement. Understanding the need for continuous improvement, we held a deep-dive retrospective meeting to dissect in detail the activities and process elements that worked successfully, and those that could be improved. The retrospective resulted in a number of action items that we will be rolling out in the coming weeks, such as improved processes to manage communications and improved tagging standards for better IT asset management.
Additionally, longer-term activities already underway, such as continued implementation of Infrastructure as Code (IaC) through containerization, and the adoption of a new tool called RegScale (www.RegScale.com) to manage and track security compliance artifacts and assessments for hosted products, will further enable BAO Systems to remain safe, secure and continuously compliant in the face of ever-emerging security threats.
Do you want to learn more about the technical details and timeline of the Log4Shell vulnerabilities? Check out “Log4Shell: RCE 0-day exploit found in log4j 2, a popular Java logging package” and “Log4Shell log4j vulnerability – cheat-sheet reference guide”.